Why does Python's statistics.mean() behave differently when passed a numpy.ndarray versus a list?

Asked: November 6, 2020
Views: 182
Answers: 3
Status: Solved

Why does statistics.mean behave so strangely? When passed a numpy.ndarray, it appears to return the median:

statistics.mean(np.array([1,4,9])) 
4

When passed a list, it returns the actual mean:

statistics.mean([1,4,9]) 
4.666666666666667

I am using Python 3.7.

Asked November 6, 2020 by ds90

Answer 1

3 Answers

Answer 2

2 votes

Thomas Sablik Points 1854

No, it does not return the median in the first case. It returns the mean as a numpy.int64, because the input is an array of integers that are not built-in Python int objects. The value 4 only happens to coincide with the median of this particular data set.

If you pass non-primitive objects to statistics.mean, the result is converted back to the input data type. In your case, statistics.mean does something equivalent to:

np.int64(sum(np.array([1,4,9]))/len(np.array([1,4,9])))
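
The arithmetic behind both outputs can be sketched with the standard library alone. In this sketch the built-in int stands in for numpy.int64 as the target type; that substitution is an assumption made to keep the example free of numpy, and it relies on both types dropping the fractional part when converting a fraction such as 14/3.

```python
from fractions import Fraction

data = [1, 4, 9]

# statistics.mean sums exactly, so the intermediate mean is a Fraction.
exact_mean = Fraction(sum(data), len(data))  # Fraction(14, 3)

# Converting to float keeps the fractional part (the list case).
as_float = float(exact_mean)

# Converting to an integer type truncates it (int used here as a
# stand-in for numpy.int64, which is an assumption of this sketch).
as_int = int(exact_mean)

print(exact_mean, as_float, as_int)
```

The truncated value is 4, which matches the output shown in the question.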

I am using Python 3.8. Here is the code for mean:

def mean(data):
    """Return the sample arithmetic mean of data.

    >>> mean([1, 2, 3, 4, 4])
    2.8

    >>> from fractions import Fraction as F
    >>> mean([F(3, 7), F(1, 21), F(5, 3), F(1, 3)])
    Fraction(13, 21)

    >>> from decimal import Decimal as D
    >>> mean([D("0.5"), D("0.75"), D("0.625"), D("0.375")])
    Decimal('0.5625')

    If ``data`` is empty, StatisticsError will be raised.
    """
    if iter(data) is data:
        data = list(data)
    n = len(data)
    if n < 1:
        raise StatisticsError('mean requires at least one data point')
    T, total, count = _sum(data)
    assert count == n
    return _convert(total/n, T)

Here is the code for _sum:

def _sum(data, start=0):
    """_sum(data [, start]) -> (type, sum, count)

    Return a high-precision sum of the given numeric data as a fraction,
    together with the type to be converted to and the count of items.

    If optional argument ``start`` is given, it is added to the total.
    If ``data`` is empty, ``start`` (defaulting to 0) is returned.

    Examples
    --------

    >>> _sum([3, 2.25, 4.5, -0.5, 1.0], 0.75)
    (<class 'float'>, Fraction(11, 1), 5)

    Some sources of round-off error will be avoided:

    # Built-in sum returns zero.
    >>> _sum([1e50, 1, -1e50] * 1000)
    (<class 'float'>, Fraction(1000, 1), 3000)

    Fractions and Decimals are also supported:
    >>> from fractions import Fraction as F
    >>> _sum([F(2, 3), F(7, 5), F(1, 4), F(5, 6)])
    (<class 'fractions.Fraction'>, Fraction(63, 20), 4)

    >>> from decimal import Decimal as D
    >>> data = [D("0.1375"), D("0.2108"), D("0.3061"), D("0.0419")]
    >>> _sum(data)
    (<class 'decimal.Decimal'>, Fraction(6963, 10000), 4)

    Mixed types are currently treated as an error, except that int is
    allowed.
    """
    count = 0
    n, d = _exact_ratio(start)
    partials = {d: n}
    partials_get = partials.get
    T = _coerce(int, type(start))
    for typ, values in groupby(data, type):
        T = _coerce(T, typ)  # or raise TypeError
        for n,d in map(_exact_ratio, values):
            count += 1
            partials[d] = partials_get(d, 0) + n
    if None in partials:
        # The sum will be a NAN or INF. We can ignore all the finite
        # partials, and just look at this special one.
        total = partials[None]
        assert not _isfinite(total)
    else:
        # Sum all the partial sums using builtin sum.
        # FIXME is this faster if we sum them in order of the denominator?
        total = sum(Fraction(n, d) for d, n in sorted(partials.items()))
    return (T, total, count)

Here is the code for _convert:

def _convert(value, T):
    """Convert value to given numeric type T."""
    if type(value) is T:
        # This covers the cases where T is Fraction, or where value is
        # a NAN or INF (Decimal or float).
        return value
    if issubclass(T, int) and value.denominator != 1:
        T = float
    try:
        # FIXME: what do we do if this overflows?
        return T(value)
    except TypeError:
        if issubclass(T, Decimal):
            return T(value.numerator)/T(value.denominator)
        else:
            raise

Answered November 6, 2020 by Thomas Sablik (1854 points)

Answer 3

1 vote

programmer365 Points 12651

No, it is not the median. statistics.mean() expects a list; you get a truncated value because you are passing a numpy array of integers. To compute the mean of a numpy array, use np.mean(np.array([1,4,9])).
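
Where np.mean is not an option, one possible workaround is to convert every element to a built-in float before calling statistics.mean, so that the detected type is float rather than an integer type. The following is a minimal sketch; mean_as_float is a hypothetical helper name, not part of either library, and converting to float can lose precision for very large integers.

```python
import statistics


def mean_as_float(values):
    """Hypothetical helper: average any iterable of numbers as built-in floats."""
    # float(x) also accepts numpy integer scalars, so the same call should
    # work on a numpy array; a plain list is used here to stay numpy-free.
    return statistics.mean(float(x) for x in values)


result = mean_as_float([1, 4, 9])
print(result)
```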

Answered November 6, 2020 by programmer365 (12651 points)

Answer 4

1 vote

abc Points 2863

This is due to how statistics.mean is defined. The function relies on a helper, _convert.

For a list, it is called as _convert(Fraction(14, 3), int).

Since int is (trivially) a subclass of int, the code executed is:
```
if issubclass(T, int) and value.denominator != 1:
    T = float
try:
    return T(value)
```
For a numpy array, it is called as _convert(Fraction(14, 3), np.int64), and the code executed is simply:
```
try:
    return T(value)
```
because np.int64 is not a subclass of int.
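
The two code paths can be reproduced without numpy. The sketch below is a simplified copy of the branch logic in _convert, not the library's code; FakeInt64 is a hypothetical stand-in for np.int64, assumed to share the two properties that matter here: it is not a subclass of int, and it truncates when built from a fraction.

```python
from fractions import Fraction


class FakeInt64:
    """Hypothetical stand-in for np.int64: integer-like, NOT a subclass of int."""

    def __init__(self, value):
        self.value = int(value)  # truncates, e.g. Fraction(14, 3) -> 4

    def __repr__(self):
        return f"FakeInt64({self.value})"


def convert_sketch(value, T):
    """Simplified version of the type checks in statistics._convert."""
    if type(value) is T:
        return value
    if issubclass(T, int) and value.denominator != 1:
        T = float  # only reached for int and its subclasses
    return T(value)


list_result = convert_sketch(Fraction(14, 3), int)         # float branch
array_result = convert_sketch(Fraction(14, 3), FakeInt64)  # no float fallback

print(list_result, array_result)
```

With int as the target type, the fallback to float is taken and the fractional part survives; with the stand-in type, the issubclass check fails and the value is truncated.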

Answered November 6, 2020 by abc (2863 points)
