Commit 292a971

Add docs
1 parent 06ff21e commit 292a971

2 files changed: +43 -18 lines changed

Doc/library/statistics.rst

Lines changed: 36 additions & 10 deletions
@@ -648,31 +648,57 @@ However, for reading convenience, most of the examples show sorted sequences.
 
    .. versionadded:: 3.10
 
-.. function:: correlation(x, y, /)
+.. function:: correlation(x, y, /, *, by_rank=False)
 
    Return the `Pearson's correlation coefficient
    <https://en.wikipedia.org/wiki/Pearson_correlation_coefficient>`_
    for two inputs. Pearson's correlation coefficient *r* takes values
-   between -1 and +1. It measures the strength and direction of the linear
-   relationship, where +1 means very strong, positive linear relationship,
-   -1 very strong, negative linear relationship, and 0 no linear relationship.
+   between -1 and +1. It measures the strength and direction of a linear
+   relationship.
+
+   If *by_rank* is true, computes `Spearman's rank correlation coefficient
+   <https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient>`_
+   for two inputs. The data is replaced by ranks. Ties are averaged so that
+   equal values receive the same rank. The resulting coefficient measures the
+   strength of a monotonic relationship.
+
+   Spearman's correlation coefficient is appropriate for ordinal data or for
+   continuous data that doesn't meet the linear proportion requirement for
+   Pearson's correlation coefficient.
 
    Both inputs must be of the same length (no less than two), and need
    not to be constant, otherwise :exc:`StatisticsError` is raised.
 
-   Examples:
+   Example with `Kepler's laws of planetary motion
+   <https://en.wikipedia.org/wiki/Kepler's_laws_of_planetary_motion>`_:
 
    .. doctest::
 
-      >>> x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
-      >>> y = [9, 8, 7, 6, 5, 4, 3, 2, 1]
-      >>> correlation(x, x)
+      >>> # Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune
+      >>> orbital_period = [88, 225, 365, 687, 4331, 10_756, 30_687, 60_190] # days
+      >>> dist_from_sun = [58, 108, 150, 228, 778, 1_400, 2_900, 4_500] # million km
+
+      >>> # Show that a perfect monotonic relationship exists
+      >>> correlation(orbital_period, dist_from_sun, by_rank=True)
+      1.0
+
+      >>> # Observe that a linear relationship is imperfect
+      >>> round(correlation(orbital_period, dist_from_sun), 4)
+      0.9882
+
+      >>> # Demonstrate Kepler's third law: There is a linear correlation
+      >>> # between the square of the orbital period and the cube of the
+      >>> # distance from the sun
+      >>> period_squared = [p * p for p in orbital_period]
+      >>> dist_cubed = [d * d * d for d in dist_from_sun]
+      >>> round(correlation(period_squared, dist_cubed), 4)
       1.0
-      >>> correlation(x, y)
-      -1.0
 
    .. versionadded:: 3.10
 
+   .. versionchanged:: 3.12
+      Added support for Spearman's rank correlation coefficient.
+
 .. function:: linear_regression(x, y, /, *, proportional=False)
 
    Return the slope and intercept of `simple linear regression

Lib/statistics.py

Lines changed: 7 additions & 8 deletions
@@ -1017,10 +1017,8 @@ def correlation(x, y, /, *, by_rank=False):
     """Pearson's correlation coefficient
 
     Return the Pearson's correlation coefficient for two inputs. Pearson's
-    correlation coefficient *r* takes values between -1 and +1. It measures the
-    strength and direction of the linear relationship, where +1 means very
-    strong, positive linear relationship, -1 very strong, negative linear
-    relationship, and 0 no linear relationship.
+    correlation coefficient *r* takes values between -1 and +1. It measures
+    the strength and direction of a linear relationship.
 
     >>> x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
     >>> y = [9, 8, 7, 6, 5, 4, 3, 2, 1]
@@ -1029,12 +1027,13 @@ def correlation(x, y, /, *, by_rank=False):
     >>> correlation(x, y)
     -1.0
 
-    If *by_rank* is true, computes Spearman's correlation coefficient
+    If *by_rank* is true, computes Spearman's rank correlation coefficient
     for two inputs. The data is replaced by ranks. Ties are averaged
-    so that equal values receive the same rank.
+    so that equal values receive the same rank. The resulting coefficient measures
+    the strength of a monotonic relationship.
 
-    Spearman's correlation coefficient is appropriate for ordinal data
-    or for continuous data that doesn't meet the linear proportion
+    Spearman's rank correlation coefficient is appropriate for ordinal
+    data or for continuous data that doesn't meet the linear proportion
     requirement for Pearson's correlation coefficient.
     """
     n = len(x)
