Commit b36c1ae

ENH: Add decimal parameter to read_excel (#44317)
1 parent 4c03128 commit b36c1ae

8 files changed: 26 additions, 0 deletions

doc/source/whatsnew/v1.4.0.rst

Lines changed: 1 addition & 0 deletions
@@ -182,6 +182,7 @@ Other enhancements
 - Added :meth:`.ExponentialMovingWindow.sum` (:issue:`13297`)
 - :meth:`Series.str.split` now supports a ``regex`` argument that explicitly specifies whether the pattern is a regular expression. Default is ``None`` (:issue:`43563`, :issue:`32835`, :issue:`25549`)
 - :meth:`DataFrame.dropna` now accepts a single label as ``subset`` along with array-like (:issue:`41021`)
+- :meth:`read_excel` now accepts a ``decimal`` argument that allows the user to specify the decimal point when parsing string columns to numeric (:issue:`14403`)
 - :meth:`.GroupBy.mean` now supports `Numba <http://numba.pydata.org/>`_ execution with the ``engine`` keyword (:issue:`43731`)
 
 .. ---------------------------------------------------------------------------

pandas/io/excel/_base.py

Lines changed: 12 additions & 0 deletions
@@ -234,6 +234,14 @@
     this parameter is only necessary for columns stored as TEXT in Excel,
     any numeric columns will automatically be parsed, regardless of display
     format.
+decimal : str, default '.'
+    Character to recognize as decimal point for parsing string columns to numeric
+    (e.g. use ',' for European data). Note that this parameter is only necessary
+    for columns stored as TEXT in Excel; any numeric columns will automatically
+    be parsed, regardless of display format.
+
+    .. versionadded:: 1.4.0
+
 comment : str, default None
     Comments out remainder of line. Pass a character or characters to this
     argument to indicate comments in the input file. Any data between the
@@ -356,6 +364,7 @@ def read_excel(
     parse_dates=False,
     date_parser=None,
     thousands=None,
+    decimal=".",
     comment=None,
     skipfooter=0,
     convert_float=None,
@@ -394,6 +403,7 @@ def read_excel(
             parse_dates=parse_dates,
             date_parser=date_parser,
             thousands=thousands,
+            decimal=decimal,
             comment=comment,
             skipfooter=skipfooter,
             convert_float=convert_float,
@@ -498,6 +508,7 @@ def parse(
         parse_dates=False,
         date_parser=None,
         thousands=None,
+        decimal=".",
         comment=None,
         skipfooter=0,
         convert_float=None,
@@ -624,6 +635,7 @@ def parse(
             parse_dates=parse_dates,
             date_parser=date_parser,
             thousands=thousands,
+            decimal=decimal,
             comment=comment,
             skipfooter=skipfooter,
             usecols=usecols,
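
The four hunks above thread the new keyword through each layer with the same default, so callers who omit it see no change in behaviour. The pattern can be sketched roughly as follows. Every name here (`toy_parser`, `ToyExcelFile`, `toy_read_excel`) is a hypothetical stand-in for illustration, not a pandas API, and the real functions take many more arguments.

```python
# Hypothetical stand-ins only; none of these names exist in pandas.


def toy_parser(rows, thousands=None, decimal="."):
    """Innermost layer: the only place that would interpret ``decimal``."""
    return {"rows": rows, "thousands": thousands, "decimal": decimal}


class ToyExcelFile:
    """Stands in for the object whose ``parse`` method gains the keyword."""

    def __init__(self, rows):
        self.rows = rows

    def parse(self, thousands=None, decimal="."):
        # Forward the keyword unchanged to the layer below.
        return toy_parser(self.rows, thousands=thousands, decimal=decimal)


def toy_read_excel(rows, thousands=None, decimal="."):
    # Same default at every layer, so omitting the argument changes nothing.
    return ToyExcelFile(rows).parse(thousands=thousands, decimal=decimal)


settings = toy_read_excel([["1,5"]], decimal=",")
```

Keeping the default identical at each level means the top-level signature documents the effective behaviour, and only the innermost layer needs to know what the character means.
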
4.3 KB
Binary file not shown.
33.5 KB
Binary file not shown.
15.7 KB
Binary file not shown.
17.5 KB
Binary file not shown.
17.5 KB
Binary file not shown.

pandas/tests/io/excel/test_readers.py

Lines changed: 13 additions & 0 deletions
@@ -1289,6 +1289,19 @@ def test_ignore_chartsheets_by_int(self, request, read_ext):
         ):
             pd.read_excel("chartsheet" + read_ext, sheet_name=1)
 
+    def test_euro_decimal_format(self, request, read_ext):
+        # copied from read_csv
+        result = pd.read_excel("test_decimal" + read_ext, decimal=",", skiprows=1)
+        expected = DataFrame(
+            [
+                [1, 1521.1541, 187101.9543, "ABC", "poi", 4.738797819],
+                [2, 121.12, 14897.76, "DEF", "uyt", 0.377320872],
+                [3, 878.158, 108013.434, "GHI", "rez", 2.735694704],
+            ],
+            columns=["Id", "Number1", "Number2", "Text1", "Text2", "Number3"],
+        )
+        tm.assert_frame_equal(result, expected)
+
 
 class TestExcelFileRead:
     @pytest.fixture(autouse=True)
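
The new test expects text cells written with a comma as the decimal point to come back as floats, while non-numeric text is left alone. The sketch below illustrates that conversion in plain Python. It is not how pandas implements it: `convert_text_cell` is a hypothetical helper, and the raw strings in `raw_row` are an assumption about what the binary test files contain, since their contents are not shown in this commit view.

```python
def convert_text_cell(cell, decimal=".", thousands=None):
    """Return a float when a text cell looks numeric, else the cell unchanged."""
    if not isinstance(cell, str):
        return cell  # already numeric in the sheet: nothing to convert
    text = cell.strip()
    if thousands:
        text = text.replace(thousands, "")
    if decimal != ".":
        if "." in text:
            # Design choice of this sketch: a stray "." is treated as
            # non-numeric rather than guessed at.
            return cell
        text = text.replace(decimal, ".")
    try:
        return float(text)
    except ValueError:
        return cell  # ordinary text such as "ABC" stays a string


# Assumed raw cell values, modelled on the first expected row of the test.
raw_row = [1, "1521,1541", "187101,9543", "ABC", "poi", "4,738797819"]
converted = [convert_text_cell(c, decimal=",") for c in raw_row]
```

With the default `decimal="."`, a cell such as `"1521,1541"` would stay a string, which is the situation the new argument addresses.
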
