Commit 9e36b6e

bpo-41048: mimetypes should read the rule file using UTF-8, not the locale encoding (GH-20998)
(cherry picked from commit 7f569c9)
Co-authored-by: Srinivas Reddy Thatiparthy (శ్రీనివాస్ రెడ్డి తాటిపర్తి) <[email protected]>
1 parent 02134da commit 9e36b6e

4 files changed: +16 -1 lines changed


Lib/mimetypes.py

Lines changed: 1 addition & 1 deletion
@@ -372,7 +372,7 @@ def init(files=None):

 def read_mime_types(file):
     try:
-        f = open(file)
+        f = open(file, encoding='utf-8')
     except OSError:
         return None
     with f:
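
The effect of the one-line change can be illustrated outside of `mimetypes`. The sketch below is not part of the commit; `read_rules` and the temporary file name are hypothetical helpers. It writes a UTF-8 rule file and reads it back twice: once with an explicit `utf-8` encoding, and once with `ascii`, which stands in for a non-UTF-8 locale encoding that a bare `open(file)` might pick up.

```python
import os
import tempfile


def read_rules(path, encoding):
    """Hypothetical helper: return the file's text decoded with `encoding`."""
    with open(path, encoding=encoding) as f:
        return f.read()


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "sample.mimetype")

    # Write a rule line containing a non-ASCII character as UTF-8 bytes.
    with open(path, "w", encoding="utf-8") as f:
        f.write("application/no-mans-land Fran\u00e7ais\n")

    # Explicit UTF-8 decoding round-trips the text.
    utf8_text = read_rules(path, "utf-8")

    # 'ascii' simulates a locale whose preferred encoding cannot
    # decode the file; the read fails instead of returning text.
    try:
        read_rules(path, "ascii")
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True
```

Passing `encoding` explicitly makes the result independent of the environment the interpreter runs in, which appears to be the motivation for the change above.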

Lib/test/test_mimetypes.py

Lines changed: 12 additions & 0 deletions
@@ -67,6 +67,18 @@ def test_read_mime_types(self):
             mime_dict = mimetypes.read_mime_types(file)
             eq(mime_dict[".pyunit"], "x-application/x-unittest")

+        # bpo-41048: read_mime_types should read the rule file with 'utf-8' encoding.
+        # Not with locale encoding. _bootlocale has been imported because io.open(...)
+        # uses it.
+        with support.temp_dir() as directory:
+            data = "application/no-mans-land Fran\u00E7ais"
+            file = pathlib.Path(directory, "sample.mimetype")
+            file.write_text(data, encoding='utf-8')
+            import _bootlocale
+            with support.swap_attr(_bootlocale, 'getpreferredencoding', lambda do_setlocale=True: 'ASCII'):
+                mime_dict = mimetypes.read_mime_types(file)
+            eq(mime_dict[".Français"], "application/no-mans-land")
+
     def test_non_standard_types(self):
         eq = self.assertEqual
         # First try strict
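
Outside the CPython test suite, the same behaviour can be checked with only the standard library. This is a rough sketch, assuming an interpreter that already includes this fix; it does not patch the locale as the test above does, so on a UTF-8 locale it would also pass without the fix. The name `rule_file` is illustrative.

```python
import mimetypes
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as directory:
    rule_file = pathlib.Path(directory, "sample.mimetype")

    # One rule line: a MIME type followed by a non-ASCII extension.
    rule_file.write_text("application/no-mans-land Fran\u00e7ais\n",
                         encoding="utf-8")

    # Returns a dict mapping extensions (with leading dot) to MIME types,
    # or None if the file cannot be opened.
    mime_dict = mimetypes.read_mime_types(rule_file)

no_mans_land = mime_dict[".Fran\u00e7ais"]
```

The returned dictionary also contains the module's default mappings, so the lookup targets the one extension defined by the sample file.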

Misc/ACKS

Lines changed: 1 addition & 0 deletions
@@ -1705,6 +1705,7 @@ Mikhail Terekhov
 Victor Terrón
 Pablo Galindo
 Richard M. Tew
+Srinivas Reddy Thatiparthy
 Tobias Thelen
 Christian Theune
 Févry Thibault
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+:func:`mimetypes.read_mime_types` function reads the rule file using UTF-8 encoding, not the locale encoding.
+Patch by Srinivas Reddy Thatiparthy.
