DOC: read_excel skiprows documentation matches read_csv (#36435) #36437

Merged
merged 3 commits into from
Sep 18, 2020
8 changes: 6 additions & 2 deletions pandas/io/excel/_base.py
@@ -120,8 +120,12 @@
     Values to consider as True.
 false_values : list, default None
     Values to consider as False.
-skiprows : list-like
-    Rows to skip at the beginning (0-indexed).
+skiprows : list-like, int, or callable, optional
+    Line numbers to skip (0-indexed) or number of lines to skip (int) at the
+    start of the file. If callable, the callable function will be evaluated
+    against the row indices, returning True if the row should be skipped and
+    False otherwise. An example of a valid callable argument would be ``lambda
+    x: x in [0, 2]``.
 nrows : int, default None
     Number of rows to parse.
 na_values : scalar, str, list-like, or dict, default None
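The updated docstring describes three accepted forms for skiprows: a list-like of row indices, an int count of leading rows, and a callable evaluated against each row index. As a rough illustration only, the sketch below models those documented semantics in plain Python. The names `is_skipped` and `kept_rows` are hypothetical helpers, not part of pandas, and this is not a description of how pandas implements the option internally.

```python
from typing import Callable, Iterable, List, Union

SkipRows = Union[int, Iterable[int], Callable[[int], bool], None]


def is_skipped(skiprows: SkipRows, row_index: int) -> bool:
    """Model of the documented skiprows semantics (hypothetical helper)."""
    if skiprows is None:
        return False
    if callable(skiprows):
        # Callable form: evaluated against the 0-indexed row number.
        return bool(skiprows(row_index))
    if isinstance(skiprows, int):
        # Int form: skip that many rows at the start of the file.
        return row_index < skiprows
    # List-like form: explicit 0-indexed row numbers to skip.
    return row_index in set(skiprows)


def kept_rows(skiprows: SkipRows, n_rows: int) -> List[int]:
    """Return the row indices that survive for a file with n_rows rows."""
    return [i for i in range(n_rows) if not is_skipped(skiprows, i)]


print(kept_rows([0, 2], 6))                 # [1, 3, 4, 5]
print(kept_rows(lambda x: x in [0, 2], 6))  # [1, 3, 4, 5]
print(kept_rows(3, 6))                      # [3, 4, 5]
```

Under this model the list-like and callable forms above select the same rows, which is the equivalence the new test asserts.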
27 changes: 26 additions & 1 deletion pandas/tests/io/excel/test_readers.py
@@ -894,7 +894,7 @@ def test_read_excel_bool_header_arg(self, read_ext):
             with pytest.raises(TypeError, match=msg):
                 pd.read_excel("test1" + read_ext, header=arg)

-    def test_read_excel_skiprows_list(self, read_ext):
+    def test_read_excel_skiprows(self, read_ext):
         # GH 4903
         if pd.read_excel.keywords["engine"] == "pyxlsb":
             pytest.xfail("Sheets containing datetimes not supported by pyxlsb")
@@ -920,6 +920,31 @@ def test_read_excel_skiprows_list(self, read_ext):
         )
         tm.assert_frame_equal(actual, expected)

+        # GH36435
+        actual = pd.read_excel(
+            "testskiprows" + read_ext,
+            sheet_name="skiprows_list",
+            skiprows=lambda x: x in [0, 2],
+        )
+        tm.assert_frame_equal(actual, expected)
+
+        actual = pd.read_excel(
+            "testskiprows" + read_ext,
+            sheet_name="skiprows_list",
+            skiprows=3,
+            names=["a", "b", "c", "d"],
+        )
+        expected = DataFrame(
+            [
+                # [1, 2.5, pd.Timestamp("2015-01-01"), True],
+                [2, 3.5, pd.Timestamp("2015-01-02"), False],
+                [3, 4.5, pd.Timestamp("2015-01-03"), False],
+                [4, 5.5, pd.Timestamp("2015-01-04"), True],
+            ],
+            columns=["a", "b", "c", "d"],
+        )
+        tm.assert_frame_equal(actual, expected)
+
     def test_read_excel_nrows(self, read_ext):
         # GH 16645
         num_rows_to_pull = 5
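The added test cases can be approximated without the Excel fixture. The sketch below uses pd.read_csv on an in-memory string, on the assumption (suggested by the PR title) that skiprows behaves the same way there as in read_excel. The sample data is invented for illustration and is not the content of the testskiprows file. In the int case, header=0 is passed explicitly because read_csv otherwise treats names as implying no header row, whereas read_excel defaults to header=0.

```python
import io

import pandas as pd

# Invented sample: rows 0 and 2 are junk, row 1 is the real header.
RAW = "\n".join(
    [
        "junk,,,",
        "a,b,c,d",
        "more junk,,,",
        "1,2.5,x,True",
        "2,3.5,y,False",
    ]
)

# List-like form: skip rows 0 and 2, so row 1 becomes the header.
by_list = pd.read_csv(io.StringIO(RAW), skiprows=[0, 2])

# Callable form: the same selection expressed as a predicate on the row index.
by_callable = pd.read_csv(io.StringIO(RAW), skiprows=lambda x: x in [0, 2])

# Int form: skip the first three rows. The next row is consumed as the
# header (header=0) and its labels are replaced by names.
by_int = pd.read_csv(
    io.StringIO(RAW), skiprows=3, header=0, names=["a", "b", "c", "d"]
)

print(by_list.equals(by_callable))
print(by_int)
```

With skiprows=3, the first data row is used up as the header before names replaces it, which appears to be why the first row is commented out in the expected frame of the new test.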