@@ -648,31 +648,57 @@ However, for reading convenience, most of the examples show sorted sequences.
 
    .. versionadded:: 3.10
 
-.. function:: correlation(x, y, /)
+.. function:: correlation(x, y, /, *, by_rank=False)
 
    Return the `Pearson's correlation coefficient
    <https://en.wikipedia.org/wiki/Pearson_correlation_coefficient>`_
    for two inputs. Pearson's correlation coefficient *r* takes values
-   between -1 and +1. It measures the strength and direction of the linear
-   relationship, where +1 means very strong, positive linear relationship,
-   -1 very strong, negative linear relationship, and 0 no linear relationship.
+   between -1 and +1. It measures the strength and direction of a linear
+   relationship.
+
+   If *by_rank* is true, computes `Spearman's rank correlation coefficient
+   <https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient>`_
+   for two inputs. The data is replaced by ranks. Ties are averaged so that
+   equal values receive the same rank. The resulting coefficient measures the
+   strength of a monotonic relationship.
+
+   Spearman's correlation coefficient is appropriate for ordinal data or for
+   continuous data that doesn't meet the linear proportion requirement for
+   Pearson's correlation coefficient.
 
    Both inputs must be of the same length (no less than two), and need
    not to be constant, otherwise :exc:`StatisticsError` is raised.
 
-   Examples:
+   Example with `Kepler's laws of planetary motion
+   <https://en.wikipedia.org/wiki/Kepler's_laws_of_planetary_motion>`_:
 
    .. doctest::
 
-      >>> x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
-      >>> y = [9, 8, 7, 6, 5, 4, 3, 2, 1]
-      >>> correlation(x, x)
+      >>> # Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune
+      >>> orbital_period = [88, 225, 365, 687, 4331, 10_756, 30_687, 60_190]  # days
+      >>> dist_from_sun = [58, 108, 150, 228, 778, 1_400, 2_900, 4_500]  # million km
+
+      >>> # Show that a perfect monotonic relationship exists
+      >>> correlation(orbital_period, dist_from_sun, by_rank=True)
+      1.0
+
+      >>> # Observe that a linear relationship is imperfect
+      >>> round(correlation(orbital_period, dist_from_sun), 4)
+      0.9882
+
+      >>> # Demonstrate Kepler's third law: There is a linear correlation
+      >>> # between the square of the orbital period and the cube of the
+      >>> # distance from the sun
+      >>> period_squared = [p * p for p in orbital_period]
+      >>> dist_cubed = [d * d * d for d in dist_from_sun]
+      >>> round(correlation(period_squared, dist_cubed), 4)
       1.0
-      >>> correlation(x, y)
-      -1.0
 
    .. versionadded:: 3.10
 
+   .. versionchanged:: 3.12
+      Added support for Spearman's rank correlation coefficient.
+
 .. function:: linear_regression(x, y, /, *, proportional=False)
 
    Return the slope and intercept of `simple linear regression