Commit f385542

[Tooling/Inclusion] Modify the Python script to open the C++ reference with UTF-8 encoding. (#121341)
This will prevent the following error on systems with a default encoding other than UTF-8:

```
UnicodeDecodeError: 'gbk' codec can't decode byte 0xb6 in position 12958: illegal multibyte sequence
```
1 parent 9abcca5 commit f385542

1 file changed, +2 -2 lines changed

clang/tools/include-mapping/cppreference_parser.py

Lines changed: 2 additions & 2 deletions
@@ -139,7 +139,7 @@ def _ParseIndexPage(index_page_html):
 
 
 def _ReadSymbolPage(path, name, qual_name):
-    with open(path) as f:
+    with open(path, encoding="utf-8") as f:
         return _ParseSymbolPage(f.read(), name, qual_name)
 
 

@@ -156,7 +156,7 @@ def _GetSymbols(pool, root_dir, index_page_name, namespace, variants_to_accept):
     # contains the defined header.
     # 2. Parse the symbol page to get the defined header.
     index_page_path = os.path.join(root_dir, index_page_name)
-    with open(index_page_path, "r") as f:
+    with open(index_page_path, "r", encoding="utf-8") as f:
         # Read each symbol page in parallel.
         results = [] # (symbol_name, promise of [header...])
         for symbol_name, symbol_page_path, variant in _ParseIndexPage(f.read()):
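
To illustrate why the explicit `encoding="utf-8"` matters, here is a minimal standalone sketch, not part of the patch. It assumes a page that contains a non-ASCII character and simulates a GBK-default system by passing `encoding="gbk"` explicitly. The helper names `read_text` and `demo`, the file name `page.html`, and the sample text are all hypothetical.

```python
import os
import tempfile


def read_text(path, encoding=None):
    """Read a whole file. encoding=None defers to the platform default,
    which is what a bare open(path) did before this change."""
    with open(path, encoding=encoding) as f:
        return f.read()


def demo():
    # Hypothetical page content with one non-ASCII character (U+2192).
    text = "std::move \u2192 <utility>\n"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "page.html")
        # Write the bytes as UTF-8, without newline translation.
        with open(path, "w", encoding="utf-8", newline="") as f:
            f.write(text)

        # Simulate a system whose default encoding is GBK.
        try:
            read_text(path, encoding="gbk")
            gbk_failed = False
        except UnicodeDecodeError:
            gbk_failed = True

        # An explicit UTF-8 read does not depend on the locale.
        utf8_text = read_text(path, encoding="utf-8")
    return gbk_failed, utf8_text


if __name__ == "__main__":
    print(demo())
```

Passing the encoding at each `open()` call keeps the script's behavior the same whatever locale it runs under, which is the approach the patch takes.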
