DOC: Make explicit in pandas IO doc the imports and options #28089
Changes from 5 commits
5e0fbe3
3591894
035e35b
18a9b7c
1ff8b3f
d3036f3
37d420a
f80331e
faec275
6487a68
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,15 +3,6 @@ | |
.. currentmodule:: pandas | ||
|
||
|
||
{{ header }} | ||
|
||
.. ipython:: python | ||
:suppress: | ||
|
||
clipdf = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': ['p', 'q', 'r']}, | ||
index=['x', 'y', 'z']) | ||
|
||
|
||
=============================== | ||
IO tools (text, CSV, HDF5, ...) | ||
=============================== | ||
|
@@ -137,7 +128,9 @@ usecols : list-like or callable, default ``None`` | |
|
||
.. ipython:: python | ||
|
||
from io import StringIO, BytesIO | ||
import pandas as pd | ||
pd.options.display.max_rows = 15 | ||
Review comment: I think you can actually remove this line. I quickly checked, and the few longer dataframes shown on this page are all long enough to be truncated anyway. With the current default of showing only the first and last 5 rows when truncated, this option has no effect here. It would only cause dataframes longer than 15 but shorter than 60 rows to be truncated as well, and no such dataframes are present in this file.

Reply: Made the changes in commit d3036f3.
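The interaction between `display.max_rows` and truncation described in this comment can be checked with a short sketch. It assumes the documented pandas defaults (`display.max_rows = 60`, `display.min_rows = 10`), which are set explicitly below so the result does not depend on local configuration. The 30-row frame is illustrative and does not come from the page under review.

```python
import pandas as pd

# Illustrative frame: longer than 15 rows but shorter than 60.
df = pd.DataFrame({"a": range(30)})

# Assumed defaults, set explicitly: frames up to 60 rows are printed in full.
with pd.option_context("display.max_rows", 60, "display.min_rows", 10):
    full = repr(df)

# With max_rows lowered to 15, the same frame is truncated to min_rows rows.
with pd.option_context("display.max_rows", 15, "display.min_rows", 10):
    truncated = repr(df)

print(len(full.splitlines()))
print(len(truncated.splitlines()))
```

The second line count should be smaller than the first, which is the only case in which the removed option would have changed the rendered output.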
||
from io import StringIO | ||
data = ('col1,col2,col3\n' | ||
'a,b,1\n' | ||
'a,b,2\n' | ||
|
@@ -363,6 +356,8 @@ columns: | |
|
||
.. ipython:: python | ||
|
||
import numpy as np | ||
np.set_printoptions(precision=4, suppress=True) | ||
data = ('a,b,c,d\n' | ||
'1,2,3,4\n' | ||
'5,6,7,8\n' | ||
|
@@ -447,7 +442,6 @@ worth trying. | |
:suppress: | ||
|
||
import os | ||
|
||
os.remove('foo.csv') | ||
|
||
.. _io.categorical: | ||
|
@@ -757,6 +751,7 @@ result in byte strings being decoded to unicode in the result: | |
|
||
.. ipython:: python | ||
|
||
from io import BytesIO | ||
data = (b'word,length\n' | ||
b'Tr\xc3\xa4umen,7\n' | ||
b'Gr\xc3\xbc\xc3\x9fe,5') | ||
|
@@ -1579,6 +1574,7 @@ class of the csv module. For this, you have to specify ``sep=None``. | |
.. ipython:: python | ||
:suppress: | ||
|
||
np.random.seed(123456) | ||
df = pd.DataFrame(np.random.randn(10, 4)) | ||
df.to_csv('tmp.sv', sep='|') | ||
df.to_csv('tmp2.sv', sep=':') | ||
|
@@ -3266,6 +3262,12 @@ clipboard (CTRL-C on many operating systems): | |
|
||
And then import the data directly to a ``DataFrame`` by calling: | ||
|
||
.. ipython:: python | ||
:suppress: | ||
|
||
clipdf = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': ['p', 'q', 'r']}, | ||
index=['x', 'y', 'z']) | ||
Review comment: I'm not sure if this is really needed. Do you mind having a look, @TanyaaCJain?

Reply: Oh yes, I missed mentioning this. Thanks @datapythonista for the reminder! I couldn't find this assignment being used anywhere in the file either. So, should we go ahead and get rid of it?

Reply: I have the feeling that the
||
|
||
.. code-block:: python | ||
|
||
>>> clipdf = pd.read_clipboard() | ||
|
@@ -5562,10 +5564,10 @@ Given the next test set: | |
|
||
.. code-block:: python | ||
|
||
from numpy.random import randn | ||
import os | ||
|
||
sz = 1000000 | ||
df = pd.DataFrame({'A': randn(sz), 'B': [1] * sz}) | ||
df = pd.DataFrame({'A': np.random.randn(sz), 'B': [1] * sz}) | ||
|
||
|
||
def test_sql_write(df): | ||
|
Review comment: I realize I missed the original discussion, but just to be clear: is the assumption here that we only show the import the first time it appears in one of the rst files, and that the rest of the code blocks rely on it from there?

Reply: I don't think that's an assumption. It is implicitly being proposed, for simplicity and out of inertia from how the header worked, but it's certainly open to discussion. I thought about that too, and I was also wondering whether it would make sense to split the long pages we now have in the user guide into shorter pages. I guess that could make things easier if we eventually make the code runnable.
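A self-contained snippet of the kind discussed here could look like the following sketch, in which every name the example uses is imported inside the block itself rather than inherited from an earlier block or a hidden header. The CSV text is illustrative sample data, not a quotation of the documentation.

```python
from io import StringIO

import pandas as pd

# Illustrative CSV text; any small table would do.
data = ("col1,col2,col3\n"
        "a,b,1\n"
        "c,d,2\n")

# StringIO wraps the string so read_csv can treat it as a file-like object.
df = pd.read_csv(StringIO(data))
print(df)
```

Written this way, a reader can copy the block into a fresh interpreter and run it without scrolling back for the imports, at the cost of some repetition across blocks.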