BUG: Fix problem with SparseDataFrame not persisting to csv #19441

Merged: 10 commits merged into pandas-dev:master on Feb 1, 2018

Conversation

@hexgnu
Contributor

hexgnu commented Jan 29, 2018
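
The PR description itself is not captured in this excerpt. For context, here is a minimal reproduction of the underlying report (GH 19384), sketched on the assumption of a pandas version from before this fix; SparseDataFrame was later removed from pandas entirely.

import pandas as pd

# On affected versions, writing a SparseDataFrame to CSV raised
# "IndexError: too many indices for array"; after this fix it produces the
# same output as the equivalent dense DataFrame.
sdf = pd.SparseDataFrame({'a': [1, 2]})
sdf.to_csv('sparse.csv')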

@@ -704,7 +704,9 @@ def to_native_types(self, slicer=None, na_rep='nan', quoting=None,
                        **kwargs):
        """ convert to our native types format, slicing if desired """

        values = self.values
        # values = self.values
Contributor

you can remove the comment
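
The hunk above only shows a line being commented out, and the merged change itself is not part of this excerpt. As a rough sketch of the underlying idea (the reported IndexError points at 2-D slicing of 1-D sparse values, so the formatter needs a dense ndarray first), here is an illustration using the modern pd.arrays.SparseArray API; the reshape and slice mimic what values[:, slicer] does in to_native_types and are not the actual patch.

import numpy as np
import pandas as pd

# A sparse column is backed by a 1-D SparseArray, so 2-D indexing such as
# values[:, slicer] fails with "too many indices for array".
sparse_values = pd.arrays.SparseArray([1.0, np.nan, 3.0])

# Densifying first yields a plain ndarray that can be reshaped and sliced
# the way the CSV formatting path expects.
dense_values = np.asarray(sparse_values)
print(dense_values.reshape(1, -1)[:, :2])  # [[ 1. nan]]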

@@ -253,3 +253,12 @@ def test_to_csv_string_array_utf8(self):
            df.to_csv(path, encoding='utf-8')
            with open(path, 'r') as f:
                assert f.read() == expected_utf8

    def test_to_csv_sparse_dataframe(self):
        sdf = pd.SparseDataFrame({'a': [1, 2]})
Contributor

add a comment here for the issue number. I would rather have this test in pandas/tests/sparse/frame/test_to_csv.py

can you add several variations, e.g. using fill value and with some nulls.
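
As an illustration of what such variations could cover (a hedged sketch only: these are not the exact cases that were merged, and they assume a pandas version that still ships SparseDataFrame):

import numpy as np
import pandas as pd

# A sparse frame with NaN holes and the default fill value...
with_nulls = pd.SparseDataFrame({'a': [1.0, np.nan, 2.0]})
# ...and one where the "missing" entries are an explicit fill value of 0.
with_fill = pd.SparseDataFrame({'a': [0, 1, 0, 2]}, default_fill_value=0)

# Both should round-trip through to_csv without raising.
print(with_nulls.to_csv())
print(with_fill.to_csv())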

@jreback added the IO CSV and Sparse labels Jan 30, 2018
@pep8speaks

pep8speaks commented Jan 30, 2018

Hello @hexgnu! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on February 01, 2018 at 13:26 Hours UTC

@hexgnu
Contributor Author

hexgnu commented Jan 30, 2018

K updated thanks @jreback

@TomAugspurger
Contributor

@hexgnu do the tests you've written pass for you locally?

@hexgnu
Contributor Author

hexgnu commented Jan 30, 2018

oi sorry, should pass now.

@TomAugspurger
Contributor

Still one lint failure: https://travis-ci.org/pandas-dev/pandas/jobs/335405859#L3035

I'm looking into the (unrelated) S3 failures later this morning.

@codecov

codecov bot commented Jan 31, 2018

Codecov Report

Merging #19441 into master will increase coverage by 0.05%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #19441      +/-   ##
==========================================
+ Coverage   91.62%   91.67%   +0.05%     
==========================================
  Files         150      148       -2     
  Lines       48724    48553     -171     
==========================================
- Hits        44642    44513     -129     
+ Misses       4082     4040      -42
Flag Coverage Δ
#multiple 90.04% <100%> (+0.05%) ⬆️
#single 41.72% <100%> (-0.03%) ⬇️
Impacted Files Coverage Δ
pandas/core/internals.py 95.47% <100%> (ø) ⬆️
pandas/plotting/_converter.py 65.22% <0%> (-1.74%) ⬇️
pandas/core/frame.py 97.42% <0%> (-0.16%) ⬇️
pandas/core/indexes/multi.py 95.06% <0%> (-0.09%) ⬇️
pandas/io/parsers.py 95.49% <0%> (ø) ⬆️
pandas/core/resample.py 96.43% <0%> (ø) ⬆️
pandas/plotting/_core.py 82.41% <0%> (ø) ⬆️
pandas/io/formats/format.py 96.24% <0%> (ø) ⬆️
pandas/core/dtypes/generic.py 100% <0%> (ø) ⬆️
... and 10 more

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 238499a...87946d2. Read the comment docs.

@hexgnu
Contributor Author

hexgnu commented Feb 1, 2018

Ok for real this time it passed @jreback @TomAugspurger thanks 😄

def test_to_csv_sparse_dataframe():
    fill_values = [np.nan, 0, None, 1]

    for fill_value in fill_values:
Contributor

can you parameterize
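
A sketch of how the loop could be turned into a pytest parametrization. The round-trip assertion and file name are illustrative rather than the exact test that was merged, and it assumes a pandas version that still provides SparseDataFrame.

import numpy as np
import pandas as pd
import pytest

import pandas.util.testing as tm


@pytest.mark.parametrize('fill_value', [np.nan, 0, None, 1])
def test_to_csv_sparse_dataframe(fill_value):
    # GH 19384: SparseDataFrame.to_csv raised "too many indices for array"
    sdf = pd.SparseDataFrame({'a': [fill_value, 1, 2]},
                             default_fill_value=fill_value)

    with tm.ensure_clean('sparse.csv') as path:
        sdf.to_csv(path, index=False)  # raised IndexError before the fix
        with open(path) as fh:
            # the sparse frame should write the same CSV as its dense form
            assert fh.read() == sdf.to_dense().to_csv(index=False)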

@jreback added this to the 0.23.0 milestone Feb 1, 2018


def test_to_csv_sparse_dataframe():
    fill_values = [np.nan, 0, None, 1]
Contributor

can you add the issue number here as well

@@ -497,7 +497,7 @@ I/O
- Bug in :func:`DataFrame.to_parquet` where an exception was raised if the write destination is S3 (:issue:`19134`)
- :class:`Interval` now supported in :func:`DataFrame.to_excel` for all Excel file types (:issue:`19242`)
- :class:`Timedelta` now supported in :func:`DataFrame.to_excel` for xls file type (:issue:`19242`, :issue:`9155`)
-
- Bug in :class:`SparseDataFrame.to_csv` where too many indices for values (:issue:`19384`)
Contributor

move to sparse, and don't need this kind of detail, just say was raising.

@jreback
Contributor

jreback commented Feb 1, 2018

a lint error. ping when pushed and green.

@jreback
Contributor

jreback commented Feb 1, 2018

great. ping on green.

@TomAugspurger merged commit 78ba063 into pandas-dev:master Feb 1, 2018
@TomAugspurger
Contributor

Thanks @hexgnu!

harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018
…ev#19441)

* BUG: Fix problem with SparseDataFrame not persisting to csv

* FIX: Remove comment and move test with more coverage

* FIX: Flake8 issues cleanup

* Fix failing test due to blank lines

* FIX: linting errors on whitespace

* Use parametrize on test

* Move bug description to sparse header

* Add GH issue to test

* Fix linting error
Labels
IO CSV (read_csv, to_csv), Sparse (Sparse Data Type)
Development

Successfully merging this pull request may close these issues.

SparseDataFrame to_csv returns IndexError: too many indices for array
4 participants