TST: query with timezone aware index & column #34021
Changes from 2 commits
ac8b828
e7dfe1e
61efc12
50b02c5
e6cf243
d7aaf2d
b7817a0
e4eb829
7fcd7a8
986cadd
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
import pandas as pd | ||
import pandas._testing as tm | ||
|
||
|
||
class TestColumnvsIndexTZEquality: | ||
# https://github.com/pandas-dev/pandas/issues/29463 | ||
def check_for_various_tz(self, tz): | ||
As mentioned by @MarcoGorelli, you can just use the |
||
df = pd.DataFrame( | ||
{ | ||
"val": range(10), | ||
"time": pd.date_range(start="2019-01-01", freq="1d", periods=10, tz=tz), | ||
} | ||
) | ||
df_query = df.query('"2019-01-03 00:00:00+00" < time') | ||
l1 = pd.DataFrame(list(df_query["time"])) | ||
If this is the "expected" output, we'll want to construct it with a different method, without query.
@mroeschke,
We want to make sure the "expected" DataFrame takes a path independent of query, so if
So we want to make sure the result of:
gives a result of:
When we create an expected DataFrame, the test passes for a few time zones and fails for others, because the query string always expects the time zone to be "UTC" (query('"2019-01-03 00:00:00+00" < time')), and therefore there is a shape mismatch. For example, it fails for "Asia/Kolkata" but passes for "US/Eastern". I assume we cannot change the query string, because that is what we are testing. In that case, a fixed expected output will have a different shape from the query result in some time zones, which is exactly what is happening now.
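The shape mismatch described in this comment can be reproduced without query, since it comes from comparing local midnights against a fixed UTC cutoff. The following is a minimal sketch under stated assumptions: `rows_after_cutoff` is a hypothetical helper, the cutoff is built with `pd.Timestamp(..., tz="UTC")` rather than the `+00` string, and `freq="D"` is used in place of `"1d"`.

```python
import pandas as pd

# Assumed equivalent of the "+00" cutoff in the query string.
cutoff = pd.Timestamp("2019-01-03", tz="UTC")


def rows_after_cutoff(tz):
    """Hypothetical helper: count the local midnights that fall after the UTC cutoff."""
    times = pd.date_range(start="2019-01-01", freq="D", periods=10, tz=tz)
    # Comparing tz-aware values compares the underlying instants.
    return int((times > cutoff).sum())


counts = {tz: rows_after_cutoff(tz) for tz in ["UTC", "US/Eastern", "Asia/Kolkata"]}
print(counts)
```

Time zones ahead of UTC, such as Asia/Kolkata, reach local midnight on January 3 before the UTC cutoff, so that row is excluded. Zones behind UTC, such as US/Eastern, keep it. A single hard-coded expected frame therefore cannot match every zone.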
Since we're just testing that we can query with a tz offset in the query string when the index is tz-aware, we can be flexible with what the query string is. You can just change the test to be then
@mroeschke, yes, this works. I am now banging my head for not thinking of this earlier. As suggested, I have added the test to the test_query_eval.py file. |
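One way to read the suggestion in this thread, sketched under assumptions: if the cutoff in the query string is earlier than every row, the query keeps all rows in any time zone, so the unfiltered frame can serve as an expected value that is built without query. The helper name `check_query_keeps_all_rows` and the 2018 cutoff are illustrative assumptions, and `freq="D"` replaces `"1d"`.

```python
import pandas as pd
import pandas._testing as tm


def check_query_keeps_all_rows(tz):
    # Hypothetical helper: same frame as in the diff, built with freq="D".
    df = pd.DataFrame(
        {
            "val": range(10),
            "time": pd.date_range(start="2019-01-01", freq="D", periods=10, tz=tz),
        }
    )
    # Assumed cutoff: about a year before the first row, so nothing is filtered out.
    query_string = '"2018-01-03 00:00:00+00" < time'

    # Column case: the expected value is the unfiltered frame itself.
    result = df.query(query_string)
    tm.assert_frame_equal(result, df)

    # Index case, which the diff notes was earlier raising an exception.
    indexed = df.set_index("time")
    result = indexed.query(query_string)
    tm.assert_frame_equal(result, indexed)


for tz in ["UTC", "US/Eastern", "Asia/Kolkata"]:
    check_query_keeps_all_rows(tz)
```

The trade-off is that this checks only that a tz offset in the query string is accepted for both a column and an index. It does not check which rows a cutoff inside the data range would select.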
||
|
||
# # This was earlier raising an exception. | ||
index_query = df.set_index("time").query('"2019-01-03 00:00:00+00" < time') | ||
l2 = pd.DataFrame(list(index_query.index)) | ||
tm.assert_frame_equal(l1, l2) | ||
Can you rename the variables so this reads |
||
|
||
def test_check_column_vs_index_tz_query(self): | ||
tz_list = [ | ||
"Africa/Abidjan", | ||
"Africa/Douala", | ||
"Africa/Mbabane", | ||
"America/Argentina/Catamarca", | ||
"America/Belize", | ||
"America/Curacao", | ||
"America/Guatemala", | ||
"America/Kentucky/Louisville", | ||
"America/Mexico_City", | ||
"America/Port-au-Prince", | ||
"America/Sitka", | ||
"Antarctica/Casey", | ||
"Asia/Ashkhabad", | ||
"Asia/Dubai", | ||
"Asia/Khandyga", | ||
"Asia/Qatar", | ||
"Asia/Tomsk", | ||
"Atlantic/Reykjavik", | ||
"Australia/Queensland", | ||
"Canada/Yukon", | ||
"Etc/GMT+7", | ||
"Etc/UCT", | ||
"Europe/Guernsey", | ||
"Europe/Paris", | ||
"Europe/Vienna", | ||
"Indian/Cocos", | ||
"NZ", | ||
"Pacific/Honolulu", | ||
"Pacific/Samoa", | ||
"US/Eastern", | ||
Thanks @vipulrai91. There's a fixture you can use here,
@MarcoGorelli, thank you for the feedback. As far as I understand, the change has to be.
Got this snippet from test_datetime.py, but what actually is tz_aware_fixture? Thanks
It's a fixture which you can use to test different time zones.
See using pytest. So here, $ pytest pandas/tests/frame/indexing/test_column_vs_index_tz.py should be enough
Thank you for guiding. tz_aware_fixture is a combination of time zones and other object types, also even
|
||
] | ||
|
||
for tz in tz_list: | ||
self.check_for_various_tz(tz) |
Don't need an individual class for this. Writing the test as a function is sufficient.
Don't need a new file for just this test. This test can live in
pandas/tests/frame/test_query_eval.py
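
The two points in this comment, together with the earlier advice to replace the hand-written time zone loop, might look roughly like the sketch below. It assumes `pytest.mark.parametrize` with a hypothetical three-zone list as a stand-in. Inside the pandas test suite, the shared `tz_aware_fixture` discussed in the review would take its place. The test name and the boolean-mask expected value are illustrative, not the code that was merged.

```python
import pandas as pd
import pandas._testing as tm
import pytest

# Hypothetical short list standing in for the 30 zones in the diff.
TIME_ZONES = ["UTC", "US/Eastern", "Asia/Kolkata"]


@pytest.mark.parametrize("tz", TIME_ZONES)
def test_query_tz_aware_index(tz):
    # A plain test function: no class and no dedicated file are needed.
    frame = pd.DataFrame(
        {"val": range(10)},
        index=pd.date_range(
            start="2019-01-01", freq="D", periods=10, tz=tz, name="time"
        ),
    )
    result = frame.query('"2019-01-03 00:00:00+00" < time')

    # Expected value built with a boolean mask, independently of query.
    cutoff = pd.Timestamp("2019-01-03", tz="UTC")
    expected = frame[frame.index > cutoff]
    tm.assert_frame_equal(result, expected)
```

Because the expected frame is computed per time zone from the same cutoff instant, it has the right shape in every zone, and pytest reports each zone as a separate test case.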