DOC: Make explicit in pandas IO doc the imports and options #28089

Merged
merged 10 commits into from
Nov 7, 2019
Changes from 5 commits
28 changes: 15 additions & 13 deletions doc/source/user_guide/io.rst
@@ -3,15 +3,6 @@
.. currentmodule:: pandas


{{ header }}

.. ipython:: python
   :suppress:

   clipdf = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': ['p', 'q', 'r']},
                         index=['x', 'y', 'z'])


===============================
IO tools (text, CSV, HDF5, ...)
===============================
@@ -137,7 +128,9 @@ usecols : list-like or callable, default ``None``

.. ipython:: python

   from io import StringIO, BytesIO
   import pandas as pd
Member

I realize I missed the original discussion, but just to be clear: is the assumption here that we only show the import the first time it shows up in one of the rst files, and that the rest of the code blocks use it from there?

Member

I don't think that's an assumption. It is implicitly being proposed, for simplicity and out of inertia from how the header worked, but it's surely open to discussion. I thought about that too, and I was also wondering if it'd make sense to split the long pages we now have in the user guide into shorter pages. I guess that could make things easier if we finally make the code runnable.
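
To make the question concrete, here is a toy model of the behaviour under discussion. It assumes that all code blocks in one rst file are executed in order in a single shared namespace, so an import shown in the first block remains available to later ones. The `blocks` list and the `run_blocks` helper are hypothetical stand-ins for the directive, not part of pandas or Sphinx.

```python
# Hypothetical stand-ins for two consecutive code blocks in one rst file.
blocks = [
    # First block: the only place where the imports are shown.
    "from io import StringIO\nimport csv",
    # Later block: relies on the names imported above.
    "rows = list(csv.reader(StringIO('a,b\\n1,2')))",
]


def run_blocks(blocks):
    """Execute each block in order, sharing one namespace between them."""
    namespace = {}
    for source in blocks:
        exec(source, namespace)
    return namespace


ns = run_blocks(blocks)
print(ns["rows"])  # [['a', 'b'], ['1', '2']]
```

Running only the second block with `run_blocks(blocks[1:])` raises a `NameError`, which is the cost of showing each import just once: a reader who copies a later block on its own has to find the import further up the page.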

   pd.options.display.max_rows = 15
Member

I think you can actually remove this line.

I quickly checked, and the few long dataframes shown on this page are all longer than 60 rows. So with the current default of showing only the first/last 5 rows when truncated, this option has no effect (its only effect would be that dataframes longer than 15 but shorter than 60 rows would also be truncated, but there are no such dataframes in this file).
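
The reasoning above can be checked with a small sketch. It assumes the pandas defaults of `display.max_rows = 60` and `display.min_rows = 10`, and it detects truncation by looking for the `[N rows x M columns]` footer that pandas prints only for truncated frames. The `is_truncated` helper is hypothetical and exists only for this illustration.

```python
import pandas as pd


def is_truncated(n_rows, max_rows):
    """Return True if a frame with n_rows rows is shortened in its repr."""
    df = pd.DataFrame({"a": range(n_rows)})
    with pd.option_context(
        "display.max_rows", max_rows,
        "display.min_rows", 10,
        "display.show_dimensions", "truncate",
    ):
        # The "[N rows x M columns]" footer appears only when rows are cut.
        return "rows x" in repr(df)


# Compare the default limit (60) with the limit set in the diff (15).
for n_rows in (10, 30, 100):
    print(n_rows, is_truncated(n_rows, 60), is_truncated(n_rows, 15))
```

Only the 30-row frame behaves differently under the two settings, which matches the comment: the option matters solely for frames between 15 and 60 rows long.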

Contributor Author

Made the changes in commit d3036f3.

   from io import StringIO
   data = ('col1,col2,col3\n'
           'a,b,1\n'
           'a,b,2\n'
@@ -363,6 +356,8 @@ columns:

.. ipython:: python

   import numpy as np
   np.set_printoptions(precision=4, suppress=True)
   data = ('a,b,c,d\n'
           '1,2,3,4\n'
           '5,6,7,8\n'
@@ -447,7 +442,6 @@ worth trying.
   :suppress:

   import os

   os.remove('foo.csv')

.. _io.categorical:
@@ -757,6 +751,7 @@ result in byte strings being decoded to unicode in the result:

.. ipython:: python

   from io import BytesIO
   data = (b'word,length\n'
           b'Tr\xc3\xa4umen,7\n'
           b'Gr\xc3\xbc\xc3\x9fe,5')
@@ -1579,6 +1574,7 @@ class of the csv module. For this, you have to specify ``sep=None``.
.. ipython:: python
   :suppress:

   np.random.seed(123456)
   df = pd.DataFrame(np.random.randn(10, 4))
   df.to_csv('tmp.sv', sep='|')
   df.to_csv('tmp2.sv', sep=':')
@@ -3266,6 +3262,12 @@ clipboard (CTRL-C on many operating systems):

And then import the data directly to a ``DataFrame`` by calling:

.. ipython:: python
   :suppress:

   clipdf = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': ['p', 'q', 'r']},
                         index=['x', 'y', 'z'])
Member

I'm not sure if this is really needed. Do you mind having a look, @TanyaaCJain?

Contributor Author

Oh yes, I forgot to mention this. Thanks @datapythonista for the reminder! I couldn't find this assignment being used anywhere in the file either. So, should we go ahead and get rid of it?

Member

I have the feeling that the read_clipboard example was an ipython block before, and that we had this clipdf and a to_clipboard call before reading from the clipboard. But since it's now a python block and the code is written manually, I guess we can simply get rid of this.
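
For context, a sketch of the write-then-read round trip that the comment describes. Because `to_clipboard` and `read_clipboard` need a system clipboard, this example swaps in `to_csv` and `read_csv` on in-memory tab-separated text. That substitution is an assumption made for illustration; it is not what the documentation page ran.

```python
from io import StringIO

import pandas as pd

clipdf = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': ['p', 'q', 'r']},
                      index=['x', 'y', 'z'])

# Stand-in for clipdf.to_clipboard(): serialize to tab-separated text.
text = clipdf.to_csv(sep='\t')

# Stand-in for pd.read_clipboard(): parse that text back into a frame.
roundtrip = pd.read_csv(StringIO(text), sep='\t', index_col=0)
print(roundtrip)
```

With a manually written `code-block`, nothing is executed at build time, so the setup frame is no longer needed to produce the displayed output.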


.. code-block:: python

   >>> clipdf = pd.read_clipboard()
@@ -5562,10 +5564,10 @@ Given the next test set:

.. code-block:: python

   from numpy.random import randn
   import os

   sz = 1000000
   df = pd.DataFrame({'A': randn(sz), 'B': [1] * sz})
   df = pd.DataFrame({'A': np.random.randn(sz), 'B': [1] * sz})


   def test_sql_write(df):