ENH: HDFStore.flush() to optionally perform fsync (GH5364) #5369
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,6 +10,7 @@ | |
import copy | ||
import itertools | ||
import warnings | ||
import os | ||
|
||
import numpy as np | ||
from pandas import (Series, TimeSeries, DataFrame, Panel, Panel4D, Index, | ||
|
@@ -525,12 +526,30 @@ def is_open(self): | |
return False | ||
return bool(self._handle.isopen) | ||
|
||
- def flush(self): | ||
+ def flush(self, fsync=False): | ||
""" | ||
- Force all buffered modifications to be written to disk | ||
+ Force all buffered modifications to be written to disk. | ||
By default this method requests PyTables to flush, and PyTables in turn | ||
requests the HDF5 library to flush any changes to the operating system. | ||
There is no guarantee the operating system will actually commit writes | ||
to disk. | ||
To request the operating system to write the file to disk, pass | ||
``fsync=True``. The method will then block until the operating system | ||
reports completion, although be aware there might be other caching | ||
layers (eg disk controllers, disks themselves etc) which further delay | ||
durability. | ||
Parameters | ||
---------- | ||
fsync : boolean, invoke fsync for the file handle, default False | ||
Reviewer comment: I don't think you need all of the explanation above. Just put a short summary under fsync:
Then at the end you could add:
|
||
""" | ||
if self._handle is not None: | ||
self._handle.flush() | ||
if fsync: | ||
os.fsync(self._handle.fileno()) | ||
|
||
def get(self, key): | ||
""" | ||
|
@@ -4072,5 +4091,4 @@ def timeit(key, df, fn=None, remove=True, **kwargs): | |
store.close() | ||
|
||
if remove: | ||
- import os | ||
os.remove(fn) |
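
The two stages described in the docstring (flush the library buffers, then ask the operating system to commit) can be illustrated on an ordinary file, without PyTables. This is only a sketch: `write_durably` and `demo` are hypothetical helper names, not part of pandas or of this pull request.

```python
import os
import tempfile


def write_durably(path, payload):
    """Write bytes, flush Python's buffer, then fsync the descriptor."""
    with open(path, "wb") as fh:
        fh.write(payload)
        fh.flush()             # user-space buffer -> operating system
        os.fsync(fh.fileno())  # ask the OS to commit its cache to the device


def demo():
    # Create a scratch file, write to it durably, and read it back.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_durably(path, b"hello")
        with open(path, "rb") as fh:
            return fh.read()
    finally:
        os.remove(path)
```

As the docstring notes, even after `os.fsync` returns, caches in disk controllers or the disks themselves may still delay durability.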
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -466,6 +466,12 @@ def test_flush(self): | |
store['a'] = tm.makeTimeSeries() | ||
store.flush() | ||
|
||
def test_flush_fsync(self): | ||
Reviewer comment: Just combine with the previous flush test. Not very different, especially since we're not mocking anything.
||
|
||
with ensure_clean(self.path) as store: | ||
store['a'] = tm.makeTimeSeries() | ||
store.flush(fsync=True) | ||
|
||
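
For comparison, a test that does mock `os.fsync` could look roughly like the sketch below. `FakeHandle` and `StoreSketch` are hypothetical stand-ins, not pandas classes; `StoreSketch.flush` mirrors the logic in the diff, assuming the `fsync` branch sits inside the `_handle is not None` check (the indentation is not visible in this view).

```python
import os
from unittest import mock


class FakeHandle:
    """Hypothetical stand-in for the PyTables file handle."""

    def __init__(self):
        self.flushed = False

    def flush(self):
        self.flushed = True

    def fileno(self):
        return 42  # arbitrary value; never reaches the real os.fsync


class StoreSketch:
    """Hypothetical class mirroring the flush() logic in the diff."""

    def __init__(self, handle):
        self._handle = handle

    def flush(self, fsync=False):
        if self._handle is not None:
            self._handle.flush()
            if fsync:
                os.fsync(self._handle.fileno())


def check_flush_fsync():
    store = StoreSketch(FakeHandle())
    with mock.patch("os.fsync") as fake_fsync:
        store.flush()                  # default: no fsync requested
        assert not fake_fsync.called
        store.flush(fsync=True)        # fsync requested exactly once
        fake_fsync.assert_called_once_with(42)
    return True
```

Patching `os.fsync` lets the check confirm that the descriptor is passed through only when `fsync=True`, without touching a real file.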
def test_get(self): | ||
|
||
with ensure_clean(self.path) as store: | ||
|
Reviewer comment: Can you shorten this to: