pandas-docstrings: Specify when random data in examples might be OK #77

Merged
12 changes: 10 additions & 2 deletions pandas/guide/_sources/pandas_docstring.rst.txt
@@ -259,7 +259,7 @@ After the title, each parameter in the signature must be documented, including
The parameters are defined by their name, followed by a space, a colon, another
space, and the type (or types). Note that the space between the name and the
colon is important. Types are not defined for `*args` and `**kwargs`, but must
be defined for all other parameters. After the parameter definition, it is
be defined for all other parameters. After the parameter definition, it is
required to have a line with the parameter description, which is indented, and
can have multiple lines. The description must start with a capital letter, and
finish with a dot.
@@ -840,7 +840,15 @@ be tricky. Here are some attention points:
imported as ``import pandas as pd`` and ``import numpy as np``) and define
all variables you use in the example.

* Try to avoid using random data.
* Try to avoid using random data. However random data might be OK in some
cases, like if the function you are documenting deals with probability
distributions, or if the amount of data needed to make the function result
meaningful is too much, such that creating it manually is very cumbersome.
In those cases, always use a fixed random seed to make the generated examples
predictable. Example::

>>> np.random.seed(42)
>>> df = pd.DataFrame({'normal': np.random.normal(100, 5, 20)})

* If you have a code snippet that wraps multiple lines, you need to use '...'
on the continued lines: ::
11 changes: 10 additions & 1 deletion pandas/guide/pandas_docstring.html
@@ -764,7 +764,16 @@ <h2>About docstrings and standards<a class="headerlink" href="#about-docstrings-
imported as <code class="docutils literal notranslate"><span class="pre">import</span> <span class="pre">pandas</span> <span class="pre">as</span> <span class="pre">pd</span></code> and <code class="docutils literal notranslate"><span class="pre">import</span> <span class="pre">numpy</span> <span class="pre">as</span> <span class="pre">np</span></code>) and define
all variables you use in the example.</p>
</li>
<li><p class="first">Try to avoid using random data.</p>
<li><p class="first">Try to avoid using random data. However random data might be OK in some
cases, like if the function you are documenting deals with probability
distributions, or if the amount of data needed to make the function result
meaningful is too much, such that creating it manually is very cumbersome.
In those cases, always use a fixed random seed to make the generated examples
predictable. Example:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="gp">&gt;&gt;&gt; </span><span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="mi">42</span><span class="p">)</span>
<span class="gp">&gt;&gt;&gt; </span><span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">({</span><span class="s1">&#39;normal&#39;</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">normal</span><span class="p">(</span><span class="mi">100</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">20</span><span class="p">)})</span>
</pre></div>
</div>
</li>
<li><p class="first">If you have a code snippet that wraps multiple lines, you need to use ‘…’
on the continued lines:</p>
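As a rough illustration of why the added guideline asks for a fixed random seed, the sketch below rebuilds the DataFrame from the example twice and checks that both runs agree. It is not part of this PR: `make_example_frame` is a hypothetical helper name introduced only for this illustration, and the seed and distribution parameters are copied from the example in the diff.

```python
import numpy as np
import pandas as pd


def make_example_frame(seed=42):
    """Rebuild the DataFrame from the docstring example (hypothetical helper)."""
    # Seeding NumPy's global generator makes the draws below repeatable.
    np.random.seed(seed)
    return pd.DataFrame({'normal': np.random.normal(100, 5, 20)})


first = make_example_frame()
second = make_example_frame()

# Same seed, same values: the rendered example output stays stable.
print(first.equals(second))

# A different seed gives different values, which is what an unseeded
# example would effectively do on every documentation build.
print(make_example_frame(seed=0).equals(first))
```

Without the `np.random.seed` call, the values shown in a docstring example would change from one run to the next, so the printed output could not be compared against what the code actually produces.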