--------------

This module provides functions for calculating mathematical statistics of
- numeric (:class:`Real`-valued) data.
-
- .. note::
-
-    Unless explicitly noted otherwise, these functions support :class:`int`,
-    :class:`float`, :class:`decimal.Decimal` and :class:`fractions.Fraction`.
-    Behaviour with other types (whether in the numeric tower or not) is
-    currently unsupported. Collections with a mix of types are also undefined
-    and implementation-dependent. If your input data consists of mixed types,
-    you may be able to use :func:`map` to ensure a consistent result, for
-    example: ``map(float, input_data)``.
+ numeric (:class:`~numbers.Real`-valued) data.
+
+ The module is not intended to be a competitor to third-party libraries such
+ as `NumPy <https://numpy.org>`_, `SciPy <https://www.scipy.org/>`_, or
+ proprietary full-featured statistics packages aimed at professional
+ statisticians such as Minitab, SAS and Matlab. It is aimed at the level of
+ graphing and scientific calculators.
+
+ Unless explicitly noted, these functions support :class:`int`,
+ :class:`float`, :class:`~decimal.Decimal` and :class:`~fractions.Fraction`.
+ Behaviour with other types (whether in the numeric tower or not) is
+ currently unsupported. Collections with a mix of types are also undefined
+ and implementation-dependent. If your input data consists of mixed types,
+ you may be able to use :func:`map` to ensure a consistent result, for
+ example: ``map(float, input_data)``.
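
As a minimal sketch of the ``map(float, input_data)`` advice, the snippet below coerces a mixed-type collection to floats before averaging. The values in ``input_data`` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from statistics import mean

# Hypothetical mixed-type input: int, float and Fraction together.
input_data = [1, 2.5, Fraction(3, 2)]

# Coerce every element to float first so the result type is predictable.
result = mean(map(float, input_data))

print(type(result).__name__, result)
```

Coercing to ``float`` trades exactness for consistency; coercing to ``Fraction`` or ``Decimal`` instead may suit data that must stay exact.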


Averages and measures of central location
-----------------------------------------
@@ -107,7 +111,7 @@ However, for reading convenience, most of the examples show sorted sequences.
:func:`median` and :func:`mode`.

The sample mean gives an unbiased estimate of the true population mean,
- which means that, taken on average over all the possible samples,
+ so that when taken on average over all the possible samples,
``mean(sample)`` converges on the true mean of the entire population. If
*data* represents the entire population rather than a sample, then
``mean(data)`` is equivalent to calculating the true population mean μ.
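
One way to see the unbiasedness claim concretely is to enumerate every possible sample of a fixed size from a small population and average the sample means. The sketch below assumes a hypothetical five-element population and samples of size three drawn without replacement; exact :class:`~fractions.Fraction` arithmetic avoids rounding noise.

```python
from fractions import Fraction
from itertools import combinations
from statistics import mean

# Hypothetical population, stored as Fractions so all arithmetic is exact.
population = [Fraction(x) for x in (2, 3, 5, 7, 11)]

# Mean of every possible size-3 sample drawn without replacement.
sample_means = [mean(sample) for sample in combinations(population, 3)]

# Averaging over all possible samples recovers the population mean.
average_of_sample_means = mean(sample_means)
population_mean = mean(population)

print(average_of_sample_means == population_mean)
```

Any single sample mean may differ from the population mean; only the average over all samples is guaranteed to match.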
@@ -163,8 +167,16 @@ However, for reading convenience, most of the examples show sorted sequences.
will be equivalent to ``3/(1/a + 1/b + 1/c)``.

The harmonic mean is a type of average, a measure of the central
- location of the data. It is often appropriate when averaging quantities
- which are rates or ratios, for example speeds. For example:
+ location of the data. It is often appropriate when averaging
+ rates or ratios, for example speeds.
+
+ Suppose a car travels 10 km at 40 km/hr, then another 10 km at 60 km/hr.
+ What is the average speed?
+
+ .. doctest::
+
+    >>> harmonic_mean([40, 60])
+    48.0
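
The result for the car example can be cross-checked from first principles, since average speed is total distance divided by total time. The sketch below is an illustrative check, not part of the module; the ``legs`` list is a hypothetical representation of the trip.

```python
from fractions import Fraction
from math import isclose
from statistics import harmonic_mean

# Each leg of the trip as (distance in km, speed in km/hr).
legs = [(10, 40), (10, 60)]

total_distance = sum(distance for distance, _ in legs)
# Time for each leg is distance / speed; Fractions keep the sum exact.
total_time = sum(Fraction(distance, speed) for distance, speed in legs)

average_speed = total_distance / total_time

# The harmonic mean of the speeds agrees because the legs are equally long.
print(isclose(float(average_speed), harmonic_mean([40, 60])))
```

The agreement depends on both legs covering the same distance; for unequal distances a weighted calculation would be needed.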

Suppose an investor purchases an equal value of shares in each of
three companies, with P/E (price/earning) ratios of 2.5, 3 and 10.
@@ -175,9 +187,6 @@ However, for reading convenience, most of the examples show sorted sequences.
>>> harmonic_mean([2.5, 3, 10])  # For an equal investment portfolio.
3.6
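
To illustrate why the harmonic mean fits the portfolio example, the aggregate P/E ratio can be computed directly as total price paid over total earnings. The sketch below assumes one unit of currency invested in each company (a hypothetical amount; any equal amount gives the same ratio) and compares the result with both the harmonic and the arithmetic mean.

```python
from fractions import Fraction
from math import isclose
from statistics import harmonic_mean, mean

pe_ratios = [Fraction(5, 2), Fraction(3), Fraction(10)]
investment = 1  # Hypothetical equal amount invested in each company.

# Earnings attributable to each holding are price / (P/E).
total_earnings = sum(investment / pe for pe in pe_ratios)
aggregate_pe = investment * len(pe_ratios) / total_earnings

arithmetic = mean(pe_ratios)

print(float(aggregate_pe), float(arithmetic))
print(isclose(float(aggregate_pe), harmonic_mean([2.5, 3, 10])))
```

The arithmetic mean of the three ratios comes out near 5.167, noticeably higher than the aggregate ratio that the harmonic mean reproduces.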

- Using the arithmetic mean would give an average of about 5.167, which
- is well over the aggregate P/E ratio.
-
:exc:`StatisticsError` is raised if *data* is empty, or any element
is less than zero.

@@ -190,9 +199,9 @@ However, for reading convenience, most of the examples show sorted sequences.
middle two" method. If *data* is empty, :exc:`StatisticsError` is raised.
*data* can be a sequence or iterator.

- The median is a robust measure of central location, and is less affected by
- the presence of outliers in your data. When the number of data points is
- odd, the middle data point is returned:
+ The median is a robust measure of central location and is less affected by
+ the presence of outliers. When the number of data points is odd, the
+ middle data point is returned:

.. doctest::

@@ -210,13 +219,10 @@ However, for reading convenience, most of the examples show sorted sequences.
This is suited for when your data is discrete, and you don't mind that the
median may not be an actual data point.

- If your data is ordinal (supports order operations) but not numeric (doesn't
- support addition), you should use :func:`median_low` or :func:`median_high`
+ If the data is ordinal (supports order operations) but not numeric (doesn't
+ support addition), consider using :func:`median_low` or :func:`median_high`
instead.

- .. seealso:: :func:`median_low`, :func:`median_high`, :func:`median_grouped`
-
-
.. function:: median_low(data)

   Return the low median of numeric data. If *data* is empty,
@@ -319,7 +325,7 @@ However, for reading convenience, most of the examples show sorted sequences.
desired instead, use ``min(multimode(data))`` or ``max(multimode(data))``.
If the input *data* is empty, :exc:`StatisticsError` is raised.

- ``mode`` assumes discrete data, and returns a single value. This is the
+ ``mode`` assumes discrete data and returns a single value. This is the
standard treatment of the mode as commonly taught in schools:

.. doctest::
@@ -522,7 +528,7 @@ However, for reading convenience, most of the examples show sorted sequences.
cut-point will evaluate to ``104``.

The *method* for computing quantiles can be varied depending on
- whether the data in *data* includes or excludes the lowest and
+ whether the *data* includes or excludes the lowest and
highest possible values from the population.
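
The effect of the *method* argument can be seen by computing quartiles of the same data both ways. The sketch below uses a small hypothetical data set; ``method='inclusive'`` treats the smallest and largest values as the population extremes, while the default allows for more extreme values outside the sample.

```python
from statistics import quantiles

# Hypothetical, already sorted data set.
data = [1, 2, 3, 4, 5]

# Default method: assumes the population may extend beyond the sample.
exclusive = quantiles(data, n=4)

# Inclusive method: assumes min(data) and max(data) are the population extremes.
inclusive = quantiles(data, n=4, method='inclusive')

print(exclusive)
print(inclusive)
```

For this data the inclusive cut points sit closer to the middle, because no room is left for values beyond the observed extremes.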

The default *method* is "exclusive" and is used for data sampled from