BUG/CLN: Clean float / complex string formatting #36799
Changes from 17 commits
@@ -1473,9 +1473,10 @@ def format_values_with(float_format):

             if self.fixed_width:
                 if is_complex:
-                    result = _trim_zeros_complex(values, self.decimal, na_rep)
+                    result = _trim_zeros_complex(values, self.decimal)
+                    result = _post_process_complex(result)
                 else:
-                    result = _trim_zeros_float(values, self.decimal, na_rep)
+                    result = _trim_zeros_float(values, self.decimal)
                 return np.asarray(result, dtype="object")

             return values

@@ -1855,29 +1856,45 @@ def just(x):
     return result


-def _trim_zeros_complex(
-    str_complexes: np.ndarray, decimal: str = ".", na_rep: str = "NaN"
-) -> List[str]:
+def _trim_zeros_complex(str_complexes: np.ndarray, decimal: str = ".") -> List[str]:
     """
     Separates the real and imaginary parts from the complex number, and
     executes the _trim_zeros_float method on each of those.
     """
     return [
-        "".join(_trim_zeros_float(re.split(r"([j+-])", x), decimal, na_rep))
+        "".join(_trim_zeros_float(re.split(r"([j+-])", x), decimal))
         for x in str_complexes
     ]


+def _post_process_complex(complex_strings: List[str]) -> List[str]:
+    """
+    Zero pad complex number strings produced by _trim_zeros_complex.
+    """
+    lengths = [len(s) for s in complex_strings]
+    max_length = max(lengths)
+    padded = [
+        s[: -((k - 1) // 2 + 1)]  # real part
+        + (max_length - k) // 2 * "0"
+        + s[-((k - 1) // 2 + 1) : -((k - 1) // 2)]  # + / -
+        + s[-((k - 1) // 2) : -1]  # imaginary part
+        + (max_length - k) // 2 * "0"
+        + s[-1]
+        for s, k in zip(complex_strings, lengths)
+    ]
Review comments on `_post_process_complex`:

Would it be safer to split the real and imaginary parts via +/- and then process the integer and fractional parts by splitting on the dot? That way you would not need to rely on the symmetry of the original string.

You mean trim zeros after splitting into fractional and non-fractional parts? I think the trimming has to be done with the decimal there. (I realize this helper is very confusing, and there's likely a better way to do this.)

Right, I mean trimming zeros after splitting into fractional and non-fractional parts. Since a dot character always splits a float number, there is no risk of introducing a bug, IMHO (even if there is no dot at all).

I think you would then have to keep track of which parts are fractional and trim only those? However, this part is not doing any actual trimming; it's correcting for the fact that the previous function is now trimming "too much." (It trims the real and imaginary parts of each complex number independently, so they aren't aligned afterwards. Rather than rewrite the other function, I found it easier to do this post-processing.)
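The reviewer's suggestion in this thread could be sketched roughly as follows. This is not the PR's implementation: `pad_complex_strings` and `_COMPLEX_RE` are hypothetical names, the decimal mark is assumed to be ".", and NaN, infinities, and exponent notation are deliberately not handled. The idea is to parse each string into explicit real and imaginary parts and right-pad the fractional digits to a common width, instead of slicing by string length.

```python
import re
from typing import List

# Hypothetical helper, not part of the PR. Assumes "." as the decimal mark
# and inputs shaped like "1.05+1.00j" (optional leading spaces and sign).
_COMPLEX_RE = re.compile(r"^(\s*-?)(\d+)(?:\.(\d*))?([+-])(\d+)(?:\.(\d*))?j$")


def _join(int_part: str, frac: str, width: int) -> str:
    """Rebuild a number, right-padding its fractional digits with zeros."""
    frac = frac.ljust(width, "0")
    return f"{int_part}.{frac}" if frac else int_part


def pad_complex_strings(strings: List[str]) -> List[str]:
    """Pad real and imaginary parts to one common number of decimals."""
    parsed = []
    for s in strings:
        m = _COMPLEX_RE.match(s)
        if m is None:
            raise ValueError(f"unsupported complex string: {s!r}")
        # Missing fractional groups become "" rather than None.
        parsed.append(m.groups(default=""))

    # One shared width, so every element ends up with the same decimals.
    width = max(max(len(p[2]), len(p[5])) for p in parsed)

    return [
        f"{lead}{_join(r_int, r_frac, width)}{sign}{_join(i_int, i_frac, width)}j"
        for lead, r_int, r_frac, sign, i_int, i_frac in parsed
    ]
```

Because the parts are captured explicitly, this variant does not depend on the real and imaginary parts having equal length, which is the symmetry the slicing arithmetic relies on.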
+    return padded
+
+
 def _trim_zeros_float(
-    str_floats: Union[np.ndarray, List[str]], decimal: str = ".", na_rep: str = "NaN"
+    str_floats: Union[np.ndarray, List[str]], decimal: str = "."
 ) -> List[str]:
     """
     Trims zeros, leaving just one before the decimal points if need be.
     """
     trimmed = str_floats

     def _is_number(x):
-        return x != na_rep and not x.endswith("inf")
+        return re.match(fr"\s*-?[0-9]+(\{decimal}[0-9]*)?", x) is not None
Review comment on `_is_number`:

Can you compile this regex and put it on the class or in a module-level variable?

     def _cond(values):
         finite = [x for x in values if _is_number(x)]

@@ -3432,3 +3432,10 @@ def test_format_remove_leading_space_dataframe(input_array, expected):
     # GH: 24980
     df = pd.DataFrame(input_array).to_string(index=False)
     assert df == expected
+
+
+def test_to_string_complex_number_trims_zeros():
+    s = pd.Series([1.000000 + 1.000000j, 1.0 + 1.0j, 1.05 + 1.0j])
Review comments on the test:

Why should these have two decimal zeros and not one, like ordinary floats? Where did you get the expected output from?

There's a 1.05 in the last element; the expected output is similar to what happens for floats:

    [ins] In [2]: s = pd.Series([1.0, 1.0000, 1.05])
    [ins] In [3]: s
    Out[3]:
    0    1.00
    1    1.00
    2    1.05
    dtype: float64
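The float behavior described in this reply can be illustrated with a simplified sketch. It is not the pandas `_trim_zeros_float` function: `trim_common_zeros` is a hypothetical name, and the sketch assumes every input is a plain decimal string, whereas the real helper first filters out non-numeric entries. A trailing zero is dropped only while every value still has one to spare, so a single 1.05 keeps two decimals on all rows.

```python
from typing import List


def trim_common_zeros(str_floats: List[str], decimal: str = ".") -> List[str]:
    """Strip trailing zeros shared by all values, keeping at least one decimal."""
    trimmed = list(str_floats)

    def can_trim(values: List[str]) -> bool:
        # Every value must contain the decimal mark, end in "0", and keep
        # at least one digit after the decimal mark once the zero is removed.
        return bool(values) and all(
            decimal in v and v.endswith("0") and not v.endswith(decimal + "0")
            for v in values
        )

    while can_trim(trimmed):
        trimmed = [v[:-1] for v in trimmed]
    return trimmed
```

For example, `trim_common_zeros(["1.000000", "1.000000", "1.050000"])` stops at two decimals because "1.05" no longer ends in a zero.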
+    result = s.to_string()
+    expected = "0    1.00+1.00j\n1    1.00+1.00j\n2    1.05+1.00j"
+    assert result == expected