Skip to content

BUG: infer_string not inferring string dtype when NA is first value #55655

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 24, 2023

Conversation

phofl
Copy link
Member

@phofl phofl commented Oct 23, 2023

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

@phofl phofl added Strings String extension data type and string data Arrow pyarrow functionality labels Oct 23, 2023
@phofl phofl added this to the 2.1.2 milestone Oct 23, 2023
@phofl phofl requested a review from WillAyd as a code owner October 23, 2023 21:49
Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lithomas1 was looking at this function recently. Might want to review as well

@@ -2665,6 +2665,8 @@ def maybe_convert_objects(ndarray[object] objects,
else:
seen.object_ = True
break
elif val is C_NA:
seen.object_ = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think missing a break after this? See it in all the other branches at least

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No this is missing on purpose. We don't know which dtype we have with NA and infer-string set, so have to do another pass

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK. Does continue work here too? Assuming no difference with the current code setup, but continue would be more future-proof in case of a refactor, while also signaling intent

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sure

@lithomas1 lithomas1 merged commit 7959578 into pandas-dev:main Oct 24, 2023
@lithomas1
Copy link
Member

thanks @phofl.

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Oct 24, 2023
@phofl phofl deleted the na_infer_String branch October 24, 2023 16:33
phofl added a commit that referenced this pull request Oct 24, 2023
…tring dtype when NA is first value) (#55666)

Backport PR #55655: BUG: infer_string not inferring string dtype when NA is first value

Co-authored-by: Patrick Hoefler <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Arrow pyarrow functionality Strings String extension data type and string data
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants