Commit eebcff8

miss-islington authored and rhettinger committed
bpo-36018: Add another example for NormalDist() (GH-18191) (GH-18192)
1 parent eec7636 commit eebcff8

1 file changed: +36 -0 lines changed

Doc/library/statistics.rst

Lines changed: 36 additions & 0 deletions
@@ -772,6 +772,42 @@ Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_:
     >>> quantiles(map(model, X, Y, Z)) # doctest: +SKIP
     [1.4591308524824727, 1.8035946855390597, 2.175091447274739]
 
+Normal distributions can be used to approximate `Binomial
+distributions <http://mathworld.wolfram.com/BinomialDistribution.html>`_
+when the sample size is large and when the probability of a successful
+trial is near 50%.
+
+For example, an open source conference has 750 attendees and two rooms with a
+500 person capacity. There is a talk about Python and another about Ruby.
+In previous conferences, 65% of the attendees preferred to listen to Python
+talks. Assuming the population preferences haven't changed, what is the
+probability that the rooms will stay within their capacity limits?
+
+.. doctest::
+
+    >>> n = 750 # Sample size
+    >>> p = 0.65 # Preference for Python
+    >>> q = 1.0 - p # Preference for Ruby
+    >>> k = 500 # Room capacity
+
+    >>> # Approximation using the cumulative normal distribution
+    >>> from math import sqrt
+    >>> round(NormalDist(mu=n*p, sigma=sqrt(n*p*q)).cdf(k + 0.5), 4)
+    0.8402
+
+    >>> # Solution using the cumulative binomial distribution
+    >>> from math import comb, fsum
+    >>> round(fsum(comb(n, r) * p**r * q**(n-r) for r in range(k+1)), 4)
+    0.8402
+
+    >>> # Approximation using a simulation
+    >>> from random import seed, choices
+    >>> seed(8675309)
+    >>> def trial():
+    ...     return choices(('Python', 'Ruby'), (p, q), k=n).count('Python')
+    >>> mean(trial() <= k for i in range(10_000))
+    0.8398
+
 Normal distributions commonly arise in machine learning problems.
 
 Uncyclopedia has a `nice example of a Naive Bayesian Classifier
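
The added doctest uses `NormalDist` and `mean` without importing them inside the hunk, so it cannot be pasted into a fresh interpreter as-is. A minimal standalone sketch of its first two computations might look like the following. The function names `binomial_cdf_exact` and `binomial_cdf_normal` are hypothetical helpers introduced here for illustration and are not part of the commit. The `k + 0.5` term is the continuity correction that the doctest also applies.

```python
from math import comb, fsum, sqrt
from statistics import NormalDist


def binomial_cdf_exact(n, p, k):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term (hypothetical helper)."""
    q = 1.0 - p
    return fsum(comb(n, r) * p**r * q**(n - r) for r in range(k + 1))


def binomial_cdf_normal(n, p, k):
    """Normal approximation to P(X <= k) with a continuity correction (hypothetical helper)."""
    q = 1.0 - p
    # Match the binomial mean n*p and standard deviation sqrt(n*p*q).
    approximating_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
    # Evaluate at k + 0.5 because a discrete count is being approximated
    # by a continuous distribution.
    return approximating_dist.cdf(k + 0.5)


# Values from the conference example: 750 attendees, 65% prefer Python,
# and a room capacity of 500.
exact = binomial_cdf_exact(750, 0.65, 500)
approx = binomial_cdf_normal(750, 0.65, 500)
print(round(exact, 4), round(approx, 4))
```

Both printed values should agree with the 0.8402 shown in the doctest. Splitting the two calculations into functions makes it easy to try other values of `n`, `p`, and `k` and see where the approximation starts to drift from the exact sum.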
