REF: io.pytables operate on DataFrames instead of Blocks #29871

jbrockmendel · 2019-11-26T21:53:43Z

Wouldn't be surprised if there is a perf hit here, will run asvs.

jbrockmendel · 2019-11-26T22:50:50Z

asvs seem like noise:

       before           after         ratio
-      10.7±0.1ms      9.26±0.07ms     0.86  io.hdf.HDFStoreDataFrame.time_query_store_table
-         186±3ms          159±1ms     0.86  io.hdf.HDFStoreDataFrame.time_write_store_table_dc
-      7.59±0.2ms      6.34±0.07ms     0.84  io.hdf.HDFStoreDataFrame.time_store_info

       before           after         ratio
+        30.6±2ms         42.6±2ms     1.39  io.hdf.HDF.time_read_hdf('fixed')
-      7.55±0.2ms      6.80±0.01ms     0.90  io.hdf.HDFStoreDataFrame.time_store_info

       before           after         ratio
+        29.3±1ms         41.3±5ms     1.41  io.hdf.HDF.time_read_hdf('fixed')
+     4.00±0.04ms      4.40±0.07ms     1.10  io.hdf.HDFStoreDataFrame.time_read_store_table

pandas/io/pytables.py

jreback

lgtm, minor comments, pls rebase

pandas/io/pytables.py

jbrockmendel · 2019-11-27T21:51:39Z

rebased+green

jreback · 2019-11-29T23:04:28Z

pandas/io/pytables.py

-            block = make_block(values, placement=np.arange(len(cols_)), ndim=2)
-            mgr = BlockManager([block], [cols_, index_])
-            frames.append(DataFrame(mgr))
+            if isinstance(values, np.ndarray):


I would merge this with the line above

I don't understand the suggestion

put this

    if values.ndim == 1 and isinstance(values, np.ndarray):
        values = values.reshape((1, values.shape[0]))

inside this if

the purpose of that check is orthogonal to the purpose of this check. 4341-4348 are logically grouped together

ok, I don't particularly like the fact that we have to construct things like this, but i guess ok
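
For readers following this thread, here is a minimal sketch of the two checks being discussed, kept as separate steps. It assumes block values are laid out as (n_columns, n_rows). The helper name `frame_from_block_values` and the non-ndarray branch are hypothetical illustrations, not the code in pandas/io/pytables.py.

```python
import numpy as np
import pandas as pd


def frame_from_block_values(values, columns, index):
    """Hypothetical helper: build a DataFrame from block-oriented values."""
    # Check 1: promote a 1-D ndarray to a single-column block of shape (1, n_rows).
    if isinstance(values, np.ndarray) and values.ndim == 1:
        values = values.reshape((1, values.shape[0]))

    # Check 2: choose the constructor call based on the container type.
    if isinstance(values, np.ndarray):
        # Assumed block layout is (n_columns, n_rows), so transpose for DataFrame.
        return pd.DataFrame(values.T, columns=columns, index=index)

    # Assumption: anything else (e.g. a Categorical) holds exactly one column.
    return pd.DataFrame({columns[0]: values}, index=index)
```

The point of the sketch is only that the reshape and the type dispatch answer different questions, which is why they can reasonably stay as separate `if` statements.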

jreback

ok to merge like this or handle suggestions as followup.

jreback · 2019-12-01T23:26:02Z

pandas/io/pytables.py

@@ -3100,17 +3099,23 @@ def read(self, start=None, stop=None, **kwargs):
            axes.append(ax)

        items = axes[0]
-        blocks = []
+        dfs = []
+


i think it would be better to use a list comprehension here and make a function for the DataFrame creation (e.g. lines 3106-3110), but can be a followup
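
A rough sketch of what that follow-up could look like, assuming each stored block yields a pair of column labels and a (n_columns, n_rows) array. The names `block_to_frame` and `stored_blocks` are hypothetical stand-ins, not identifiers from pytables.py.

```python
import numpy as np
import pandas as pd


def block_to_frame(block_items, values, index):
    """Hypothetical helper: turn one stored block into one DataFrame."""
    # Stored values are assumed to be shaped (n_columns, n_rows).
    return pd.DataFrame(values.T, columns=block_items, index=index)


index = pd.RangeIndex(3)

# Hypothetical stand-in for the per-block data read back from the store.
stored_blocks = [
    (pd.Index(["a", "b"]), np.arange(6.0).reshape(2, 3)),
    (pd.Index(["c"]), np.array([[10, 20, 30]])),
]

# The explicit append loop becomes a single list comprehension.
dfs = [block_to_frame(items, values, index) for items, values in stored_blocks]
```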

jreback · 2019-12-01T23:26:40Z

pandas/io/pytables.py

+            dfs.append(df)
+
+        if len(dfs) > 0:
+            out = concat(dfs, axis=1)


i actually find this simpler as a chained operation

ok. will change in follow-up (since CI is failing for unrelated reasons right now)
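
For context, a minimal sketch of the column-wise concat in the hunk above. The `if len(dfs) > 0` guard and the `concat(dfs, axis=1)` call come from the diff. The wrapper name `combine_frames` and the empty-frame fallback are assumptions added for illustration.

```python
import pandas as pd


def combine_frames(dfs, items, index):
    """Hypothetical wrapper: join per-block frames side by side."""
    if len(dfs) > 0:
        # axis=1 places each block's columns next to the others,
        # aligning rows on the shared index.
        return pd.concat(dfs, axis=1)

    # Assumed fallback: nothing was read, so return an all-NaN frame
    # that still carries the expected labels.
    return pd.DataFrame(columns=items, index=index)
```

Usage is `combine_frames(dfs, items, index)`, where `dfs` is the list built while reading the blocks.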

jreback · 2019-12-01T23:27:23Z

pandas/io/pytables.py

-            block = make_block(values, placement=np.arange(len(cols_)), ndim=2)
-            mgr = BlockManager([block], [cols_, index_])
-            frames.append(DataFrame(mgr))
+            if isinstance(values, np.ndarray):


ok, I don't particularly like the fact that we have to construct things like this, but i guess ok

jreback · 2019-12-03T13:49:50Z

thanks

REF: io.pytables operate on DataFrames instead of Blocks

90a6b0c

Merge branch 'master' of https://github.com/pandas-dev/pandas into cl…

78e6434

…n-pytables-blocks

gfyoung added the IO Data, Internals, and Refactor labels Nov 27, 2019

gfyoung reviewed Nov 27, 2019

pandas/io/pytables.py (outdated)

jreback approved these changes Nov 27, 2019

pandas/io/pytables.py (outdated)

jreback added this to the 1.0 milestone Nov 27, 2019

jbrockmendel added 2 commits November 27, 2019 08:24

Merge branch 'master' of https://github.com/pandas-dev/pandas into cl…

0420f0e

…n-pytables-blocks

copy=False

c14f4da

jreback requested changes Nov 29, 2019

jbrockmendel added the IO HDF5 label and removed the IO Data label Dec 1, 2019

jreback approved these changes Dec 1, 2019

jreback merged commit 6705b2a into pandas-dev:master Dec 3, 2019

jbrockmendel deleted the cln-pytables-blocks branch December 3, 2019 16:08

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019

REF: io.pytables operate on DataFrames instead of Blocks (pandas-dev#…

da05f07

…29871)

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019

REF: io.pytables operate on DataFrames instead of Blocks (pandas-dev#…

9a9ceed

…29871)
