Commit 279657b

[3.7] bpo-29571: Fix test_re.test_locale_flag() (GH-12178)

Use locale.getpreferredencoding() rather than locale.getlocale() to get the locale encoding. With some locales, locale.getlocale() returns the wrong encoding. For example, on Fedora 29, locale.getlocale() returns ISO-8859-1 encoding for the "en_IN" locale, whereas locale.getpreferredencoding() reports the correct encoding: UTF-8.

On Windows, set temporarily the LC_CTYPE locale to the user preferred encoding to ensure that it uses the ANSI code page, to be consistent with locale.getpreferredencoding().

1 parent bf35cc2 commit 279657b
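
To see the discrepancy described in the commit message on a given machine, the two calls can be compared side by side. This is a minimal sketch, not part of the commit: the helper name `locale_encodings` is hypothetical, and the values it returns depend entirely on the platform and the active locale.

```python
import locale


def locale_encodings():
    """Return (encoding reported by getlocale(), encoding reported by
    getpreferredencoding(False)) for the current LC_CTYPE locale.

    Hypothetical helper for illustration only.
    """
    try:
        # getlocale() derives the encoding from the locale *name*,
        # which is where a wrong guess can come from.
        _, from_getlocale = locale.getlocale(locale.LC_CTYPE)
    except ValueError:
        # Some locale names cannot be parsed by getlocale() at all.
        from_getlocale = None

    # do_setlocale=False: report the encoding without changing the locale.
    preferred = locale.getpreferredencoding(False)
    return from_getlocale, preferred


if __name__ == "__main__":
    from_getlocale, preferred = locale_encodings()
    print("getlocale():                ", from_getlocale)
    print("getpreferredencoding(False):", preferred)
```

If the two values differ, as the commit message reports for "en_IN" on Fedora 29, the second one is the value the fixed test relies on.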

2 files changed: +18 -2 lines changed

Lib/test/test_re.py

Lines changed: 12 additions & 2 deletions
@@ -1516,8 +1516,18 @@ def test_ascii_and_unicode_flag(self):
         self.assertRaises(re.error, re.compile, r'(?au)\w')
 
     def test_locale_flag(self):
-        import locale
-        _, enc = locale.getlocale(locale.LC_CTYPE)
+        # On Windows, Python 3.7 doesn't call setlocale(LC_CTYPE, "") at
+        # startup and so the LC_CTYPE locale uses Latin1 encoding by default,
+        # whereas getpreferredencoding() returns the ANSI code page. Set
+        # temporarily the LC_CTYPE locale to the user preferred encoding to
+        # ensure that it uses the ANSI code page.
+        oldloc = locale.setlocale(locale.LC_CTYPE, None)
+        locale.setlocale(locale.LC_CTYPE, "")
+        self.addCleanup(locale.setlocale, locale.LC_CTYPE, oldloc)
+
+        # Get the current locale encoding
+        enc = locale.getpreferredencoding(False)
+
         # Search non-ASCII letter
         for i in range(128, 256):
             try:

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+Fix ``test_re.test_locale_flag()``: use ``locale.getpreferredencoding()``
+rather than ``locale.getlocale()`` to get the locale encoding. With some
+locales, ``locale.getlocale()`` returns the wrong encoding. On Windows, set
+temporarily the ``LC_CTYPE`` locale to the user preferred encoding to ensure
+that it uses the ANSI code page, to be consistent with
+``locale.getpreferredencoding()``.
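
The save, set, and restore sequence that the test performs with `addCleanup` can also be written as a context manager for code that runs outside `unittest`. This is a sketch under that assumption; the name `user_ctype_locale` is hypothetical and does not appear in the commit.

```python
import contextlib
import locale


@contextlib.contextmanager
def user_ctype_locale():
    """Temporarily set LC_CTYPE to the user's preferred locale.

    Hypothetical helper mirroring the steps in the patched test:
    query the current locale, switch to "", and restore on exit.
    """
    # Passing None only queries the current setting; nothing changes.
    oldloc = locale.setlocale(locale.LC_CTYPE, None)
    try:
        # "" selects the user's default locale from the environment.
        # This may raise locale.Error if that locale is unavailable.
        locale.setlocale(locale.LC_CTYPE, "")
        # Report the encoding without touching the locale again.
        yield locale.getpreferredencoding(False)
    finally:
        # Restore the original setting, as addCleanup does in the test.
        locale.setlocale(locale.LC_CTYPE, oldloc)


if __name__ == "__main__":
    with user_ctype_locale() as enc:
        print("locale encoding inside the block:", enc)
```

Inside a `unittest.TestCase`, `addCleanup` as used in the commit achieves the same restoration without an extra level of indentation.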