Commit eed7fb1

libstdc++: Support link chains in std::chrono::tzdb::locate_zone [PR114770]

Since 2022 the TZif format defined in the zic(8) man page has said that links can refer to other links, rather than only referring to a zone. This isn't supported by the C++20 spec, which assumes that the target() for a chrono::time_zone_link always names a chrono::time_zone, not another chrono::time_zone_link.

This hasn't been a problem until now, because there are no entries in the tzdata file that chain links together. However, Debian Sid has changed the target of the Asia/Chungking link from the Asia/Shanghai zone to the Asia/Chongqing link, creating a link chain. The libstdc++ code is unable to handle this, so chrono::locate_zone("Asia/Chungking") will fail with the tzdata.zi file from Debian Sid.

It seems likely that the C++ spec will need a change to allow link chains, so that the original structure of the IANA database can be fully represented by chrono::tzdb. The alternative would be for chrono::tzdb to flatten all chains when loading the data, so that a link's target is always a zone, but this means throwing away information present in the tzdata.zi input file.

In anticipation of a change to the spec, this commit adds support for chained links to libstdc++. When a name is found to be a link, we try to find its target in the list of zones as before, but now if the target isn't the name of a zone we don't fail. Instead we look for another link with that name, and keep doing that until we reach the end of the chain of links, and then look up the last target as a zone.

This new logic would get stuck in a loop if the tzdata.zi file is buggy and defines a link chain that contains a cycle, e.g. two links that refer to each other. To deal with that unlikely case, we use the tortoise and hare algorithm to detect cycles in link chains, and throw an exception if we detect a cycle. Cycles in links should never happen, and it is expected that link chains will be short (if they occur at all) and so the code is optimized for short chains without cycles. Longer chains (four or more links) and cycles will do more work, but won't fail to resolve a chain or get stuck in a loop.

The new test file checks various forms of broken links and cycles.

Also add a new check in the testsuite that every element in the get_tzdb().zones and get_tzdb().links sequences can be successfully found using locate_zone.

libstdc++-v3/ChangeLog:

	PR libstdc++/114770
	* src/c++20/tzdb.cc (do_locate_zone): Support links that have
	another link as their target.
	* testsuite/std/time/tzdb/1.cc: Check that all zones and links
	can be found by locate_zone.
	* testsuite/std/time/tzdb/links.cc: New test.

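The chain-following and cycle-detection strategy described in the commit message can be illustrated outside libstdc++ with a minimal sketch. Everything here is hypothetical and only for illustration: the `resolve` function, the use of `std::set` for zones and `std::map` (link name to target) for links. It is not the code in tzdb.cc, which searches sorted vectors and orders its checks differently, but it follows the same idea: zones take precedence at every step, the hare advances two links per round, the tortoise one, and a meeting of the two means the chain contains a cycle.

```cpp
#include <map>
#include <optional>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the tzdb contents:
//   zones: the set of zone names
//   links: link name -> target name (the target may be a zone or another link)
using Zones = std::set<std::string>;
using Links = std::map<std::string, std::string>;

// Returns the name of the zone that `name` resolves to, or std::nullopt if
// the name (or the end of its link chain) cannot be found.
// Throws std::runtime_error if the link chain contains a cycle.
std::optional<std::string>
resolve(const Zones& zones, const Links& links, const std::string& name)
{
  // Zones are searched first, as locate_zone requires.
  if (zones.count(name))
    return name;

  auto tortoise = links.find(name);
  if (tortoise == links.end())
    return std::nullopt;

  auto hare = tortoise;
  for (;;)
    {
      // The hare follows two links per round, stopping as soon as
      // a target turns out to be a zone or does not exist at all.
      for (int step = 0; step < 2; ++step)
        {
          const std::string& target = hare->second;
          if (zones.count(target))
            return target;
          hare = links.find(target);
          if (hare == links.end())
            return std::nullopt; // chain does not end at a zone
        }

      // The tortoise follows one link per round. Its target is known to be
      // a link, because the hare has already passed through it.
      tortoise = links.find(tortoise->second);

      if (hare == tortoise)
        throw std::runtime_error("link cycle: " + name);
    }
}
```

With `links` containing `L2 -> L1 -> A_Zone`, `resolve(zones, links, "L2")` yields `A_Zone`; with two links that name each other as targets, it throws instead of looping forever.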
1 parent e8f0540 commit eed7fb1

3 files changed: +280 -4 lines changed

libstdc++-v3/src/c++20/tzdb.cc

Lines changed: 53 additions & 4 deletions
@@ -1599,7 +1599,7 @@ namespace std::chrono
     const time_zone*
     do_locate_zone(const vector<time_zone>& zones,
                    const vector<time_zone_link>& links,
-                   string_view tz_name) noexcept
+                   string_view tz_name)
     {
       // Lambda mangling changed between -fabi-version=2 and -fabi-version=18
       auto search = []<class Vec>(const Vec& v, string_view name) {
@@ -1610,13 +1610,62 @@ namespace std::chrono
         return ptr;
       };
 
+      // Search zones first.
       if (auto tz = search(zones, tz_name))
         return tz;
 
+      // Search links second.
       if (auto tz_l = search(links, tz_name))
-        return search(zones, tz_l->target());
+        {
+          // Handle the common case of a link that has a zone as the target.
+          if (auto tz = search(zones, tz_l->target())) [[likely]]
+            return tz;
+
+          // Either tz_l->target() doesn't exist, or we have a chain of links.
+          // Use Floyd's cycle-finding algorithm to avoid infinite loops,
+          // at the cost of extra lookups. In the common case we expect a
+          // chain of links to be short so the loop won't run many times.
+          // In particular, the duplicate lookups to move the tortoise
+          // never happen unless the chain has four or more links.
+          // When a chain contains a cycle we do multiple duplicate lookups,
+          // but that case should never happen with correct tzdata.zi,
+          // so there's no need to optimize cycle detection.
+
+          const time_zone_link* tortoise = tz_l;
+          const time_zone_link* hare = search(links, tz_l->target());
+          while (hare)
+            {
+              // Chains should be short, so first check if it ends here:
+              if (auto tz = search(zones, hare->target())) [[likely]]
+                return tz;
+
+              // Otherwise follow the chain:
+              hare = search(links, hare->target());
+              if (!hare)
+                break;
+
+              // Again, first check if the chain ends at a zone here:
+              if (auto tz = search(zones, hare->target())) [[likely]]
+                return tz;
+
+              // Follow the chain again:
+              hare = search(links, hare->target());
+
+              if (hare == tortoise)
+                {
+                  string_view err = "std::chrono::tzdb: link cycle: ";
+                  string str;
+                  str.reserve(err.size() + tz_name.size());
+                  str += err;
+                  str += tz_name;
+                  __throw_runtime_error(str.c_str());
+                }
+              // Plod along the chain one step:
+              tortoise = search(links, tortoise->target());
+            }
+        }
 
-      return nullptr;
+      return nullptr; // not found
     }
   } // namespace

@@ -1626,7 +1675,7 @@ namespace std::chrono
   {
     if (auto tz = do_locate_zone(zones, links, tz_name))
       return tz;
-    string_view err = "tzdb: cannot locate zone: ";
+    string_view err = "std::chrono::tzdb: cannot locate zone: ";
     string str;
     str.reserve(err.size() + tz_name.size());
     str += err;
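
For contrast with the lookup-time chain following in the diff above, the commit message mentions an alternative it does not adopt: flattening all chains when the data is loaded, so that every link's target is a zone. A rough sketch of that idea follows. The `flatten_links` function and the container types are hypothetical, and dropping links that never reach a zone is an assumption made here for simplicity, not something the commit specifies. Instead of the tortoise and hare algorithm, this sketch guards against cycles by bounding the number of hops by the number of links, since a chain without a cycle can visit each link at most once.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical helper: rewrite every link so that its target is a zone.
//   zones: the set of zone names
//   links: link name -> target name (a zone or another link)
// Links whose chain never reaches a zone (missing target or cycle) are
// dropped here; that policy is an assumption of this sketch.
std::map<std::string, std::string>
flatten_links(const std::set<std::string>& zones,
              const std::map<std::string, std::string>& links)
{
  std::map<std::string, std::string> flat;
  for (const auto& [name, first_target] : links)
    {
      std::string target = first_target;
      std::size_t hops = 0;
      // Follow the chain until a zone is reached, the chain breaks,
      // or more hops were taken than there are links (i.e. a cycle).
      while (!zones.count(target) && hops++ < links.size())
        {
          auto next = links.find(target);
          if (next == links.end())
            break; // target is neither a zone nor a link
          target = next->second;
        }
      if (zones.count(target))
        flat.emplace(name, target);
    }
  return flat;
}
```

As the commit message notes, the cost of this approach is that the intermediate step is lost: after flattening, Asia/Chungking would point directly at Asia/Shanghai and the fact that it was defined via Asia/Chongqing is no longer recoverable.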

libstdc++-v3/testsuite/std/time/tzdb/1.cc

Lines changed: 12 additions & 0 deletions
@@ -47,6 +47,18 @@ test_locate()
   VERIFY( db.locate_zone(db.current_zone()->name()) == db.current_zone() );
 }
 
+void
+test_all_zones()
+{
+  const tzdb& db = get_tzdb();
+
+  for (const auto& zone : db.zones)
+    VERIFY( locate_zone(zone.name())->name() == zone.name() );
+
+  for (const auto& link : db.links)
+    VERIFY( locate_zone(link.name()) == locate_zone(link.target()) );
+}
+
 int main()
 {
   test_version();

libstdc++-v3/testsuite/std/time/tzdb/links.cc

Lines changed: 215 additions & 0 deletions
@@ -0,0 +1,215 @@
+// { dg-do run { target c++20 } }
+// { dg-require-effective-target tzdb }
+// { dg-require-effective-target cxx11_abi }
+// { dg-xfail-run-if "no weak override on AIX" { powerpc-ibm-aix* } }
+
+#include <chrono>
+#include <fstream>
+#include <testsuite_hooks.h>
+
+static bool override_used = false;
+
+namespace __gnu_cxx
+{
+  const char* zoneinfo_dir_override() {
+    override_used = true;
+    return "./";
+  }
+}
+
+using namespace std::chrono;
+
+void
+test_link_chains()
+{
+  std::ofstream("tzdata.zi") << R"(# version test_1
+Link Greenwich G_M_T
+Link Etc/GMT Greenwich
+Zone Etc/GMT 0 - GMT
+Zone A_Zone 1 - ZON
+Link A_Zone L1
+Link L1 L2
+Link L2 L3
+Link L3 L4
+Link L4 L5
+Link L5 L6
+Link L3 L7
+)";
+
+  const auto& db = reload_tzdb();
+  VERIFY( override_used ); // If this fails then XFAIL for the target.
+  VERIFY( db.version == "test_1" );
+
+  // Simple case of a link with a zone as its target.
+  VERIFY( locate_zone("Greenwich")->name() == "Etc/GMT" );
+  // Chains of links, where the target may be another link.
+  VERIFY( locate_zone("G_M_T")->name() == "Etc/GMT" );
+  VERIFY( locate_zone("L1")->name() == "A_Zone" );
+  VERIFY( locate_zone("L2")->name() == "A_Zone" );
+  VERIFY( locate_zone("L3")->name() == "A_Zone" );
+  VERIFY( locate_zone("L4")->name() == "A_Zone" );
+  VERIFY( locate_zone("L5")->name() == "A_Zone" );
+  VERIFY( locate_zone("L6")->name() == "A_Zone" );
+  VERIFY( locate_zone("L7")->name() == "A_Zone" );
+}
56+
void
57+
test_bad_links()
58+
{
59+
// The zic(8) man page says
60+
// > the behavior is unspecified if multiple zone or link lines
61+
// > define the same name"
62+
// For libstdc++ the expected behaviour is described and tested below.
63+
std::ofstream("tzdata.zi") << R"(# version test_2
64+
Zone A_Zone 1 - ZA
65+
Zone B_Zone 2 - ZB
66+
Link A_Zone B_Zone
67+
Link B_Zone C_Link
68+
Link C_Link D_Link
69+
Link D_Link E_Link
70+
)";
71+
72+
const auto& db2 = reload_tzdb();
73+
VERIFY( override_used ); // If this fails then XFAIL for the target.
74+
VERIFY( db2.version == "test_2" );
75+
76+
// The standard requires locate_zone(name) to search for a zone first,
77+
// so this finds the zone B_Zone, not the link that points to zone A_Zone.
78+
VERIFY( locate_zone("B_Zone")->name() == "B_Zone" );
79+
// And libstdc++ does the same at every step when following chained links:
80+
VERIFY( locate_zone("C_Link")->name() == "B_Zone" );
81+
VERIFY( locate_zone("D_Link")->name() == "B_Zone" );
82+
VERIFY( locate_zone("E_Link")->name() == "B_Zone" );
83+
84+
// The zic(8) man page says
85+
// > the behavior is unspecified if a chain of one or more links
86+
// > does not terminate in a Zone name.
87+
// For libstdc++ we throw std::runtime_error if locate_zone finds an
88+
// unterminated chain, including the case of a chain that includes a cycle.
89+
std::ofstream("tzdata.zi") << R"(# version test_3
90+
Zone A_Zone 1 - ZON
91+
Link A_Zone GoodLink
92+
Link No_Zone BadLink
93+
Link LinkSelf LinkSelf
94+
Link LinkSelf Link1
95+
Link Link1 Link2
96+
Link Cycle2_A Cycle2_B
97+
Link Cycle2_B Cycle2_A
98+
Link Cycle3_A Cycle3_B
99+
Link Cycle3_B Cycle3_C
100+
Link Cycle3_C Cycle3_A
101+
Link Cycle3_C Cycle3_D
102+
Link Cycle4_A Cycle4_B
103+
Link Cycle4_B Cycle4_C
104+
Link Cycle4_C Cycle4_D
105+
Link Cycle4_D Cycle4_A
106+
)";
107+
108+
const auto& db3 = reload_tzdb();
109+
VERIFY( db3.version == "test_3" );
110+
111+
// Lookup for valid links should still work even if other links are bad.
112+
VERIFY( locate_zone("GoodLink")->name() == "A_Zone" );
113+
114+
#if __cpp_exceptions
115+
try {
116+
locate_zone("BadLink");
117+
VERIFY( false );
118+
} catch (const std::runtime_error& e) {
119+
std::string_view what(e.what());
120+
VERIFY( what.ends_with("cannot locate zone: BadLink") );
121+
}
122+
123+
// LinkSelf forms a link cycle with itself.
124+
try {
125+
locate_zone("LinkSelf");
126+
VERIFY( false );
127+
} catch (const std::runtime_error& e) {
128+
std::string_view what(e.what());
129+
VERIFY( what.ends_with("link cycle: LinkSelf") );
130+
}
131+
132+
// Any chain that leads to LinkSelf reaches a cycle.
133+
try {
134+
locate_zone("Link1");
135+
VERIFY( false );
136+
} catch (const std::runtime_error& e) {
137+
std::string_view what(e.what());
138+
VERIFY( what.ends_with("link cycle: Link1") );
139+
}
140+
141+
try {
142+
locate_zone("Link2");
143+
VERIFY( false );
144+
} catch (const std::runtime_error& e) {
145+
std::string_view what(e.what());
146+
VERIFY( what.ends_with("link cycle: Link2") );
147+
}
148+
149+
// Cycle2_A and Cycle2_B form a cycle of length two.
150+
try {
151+
locate_zone("Cycle2_A");
152+
VERIFY( false );
153+
} catch (const std::runtime_error& e) {
154+
std::string_view what(e.what());
155+
VERIFY( what.ends_with("link cycle: Cycle2_A") );
156+
}
157+
158+
try {
159+
locate_zone("Cycle2_B");
160+
VERIFY( false );
161+
} catch (const std::runtime_error& e) {
162+
std::string_view what(e.what());
163+
VERIFY( what.ends_with("link cycle: Cycle2_B") );
164+
}
165+
166+
// Cycle3_A, Cycle3_B and Cycle3_C form a cycle of length three.
167+
try {
168+
locate_zone("Cycle3_A");
169+
VERIFY( false );
170+
} catch (const std::runtime_error& e) {
171+
std::string_view what(e.what());
172+
VERIFY( what.ends_with("link cycle: Cycle3_A") );
173+
}
174+
175+
try {
176+
locate_zone("Cycle3_B");
177+
VERIFY( false );
178+
} catch (const std::runtime_error& e) {
179+
std::string_view what(e.what());
180+
VERIFY( what.ends_with("link cycle: Cycle3_B") );
181+
}
182+
183+
try {
184+
locate_zone("Cycle3_C");
185+
VERIFY( false );
186+
} catch (const std::runtime_error& e) {
187+
std::string_view what(e.what());
188+
VERIFY( what.ends_with("link cycle: Cycle3_C") );
189+
}
190+
191+
// Cycle3_D isn't part of the cycle, but it leads to it.
192+
try {
193+
locate_zone("Cycle3_D");
194+
VERIFY( false );
195+
} catch (const std::runtime_error& e) {
196+
std::string_view what(e.what());
197+
VERIFY( what.ends_with("link cycle: Cycle3_D") );
198+
}
199+
200+
// Cycle4_* links form a cycle of length four.
201+
try {
202+
locate_zone("Cycle4_A");
203+
VERIFY( false );
204+
} catch (const std::runtime_error& e) {
205+
std::string_view what(e.what());
206+
VERIFY( what.ends_with("link cycle: Cycle4_A") );
207+
}
208+
#endif
209+
}
210+
211+
int main()
212+
{
213+
test_link_chains();
214+
test_bad_links();
215+
}
