Skip to content

[libc] Implement strftime #122556

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 14, 2025
Merged

Conversation

michaelrj-google
Copy link
Contributor

@michaelrj-google michaelrj-google commented Jan 11, 2025

Implements the posix-specified strftime conversions for the default
locale, along with comprehensive unit tests. This reuses a lot of design
from printf, as well as the printf writer.

Roughly based on #111305, but with major rewrites.

michaelrj-google added a commit to michaelrj-google/llvm-project that referenced this pull request Feb 6, 2025
In the process of adding strftime (llvm#122556) I wrote this utility class
to simplify reading from a struct tm. It provides helper functions that
return basically everything needed by strftime. It's not tested
directly, but it is thoroughly exercised by the strftime tests.
michaelrj-google added a commit that referenced this pull request Feb 11, 2025
In the process of adding strftime (#122556) I wrote this utility class
to simplify reading from a struct tm. It provides helper functions that
return basically everything needed by strftime. It's not tested
directly, but it is thoroughly exercised by the strftime tests.
@michaelrj-google michaelrj-google marked this pull request as ready for review February 11, 2025 22:42
@llvmbot llvmbot added the libc label Feb 11, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 11, 2025

@llvm/pr-subscribers-libc

Author: Michael Jones (michaelrj-google)

Changes

Implements the posix-specified strftime conversions for the default
locale, along with comprehensive unit tests. This reuses a lot of design
from printf, as well as the printf writer.

Roughly based on #111305, but with major rewrites.


Patch is 131.94 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/122556.diff

18 Files Affected:

  • (modified) libc/config/linux/x86_64/entrypoints.txt (+1)
  • (modified) libc/docs/dev/undefined_behavior.rst (+37)
  • (modified) libc/include/time.yaml (+19)
  • (modified) libc/src/time/CMakeLists.txt (+15)
  • (added) libc/src/time/strftime.cpp (+31)
  • (added) libc/src/time/strftime.h (+23)
  • (added) libc/src/time/strftime_core/CMakeLists.txt (+51)
  • (added) libc/src/time/strftime_core/composite_converter.h (+349)
  • (added) libc/src/time/strftime_core/converter.cpp (+95)
  • (added) libc/src/time/strftime_core/converter.h (+29)
  • (added) libc/src/time/strftime_core/core_structs.h (+53)
  • (added) libc/src/time/strftime_core/num_converter.h (+198)
  • (added) libc/src/time/strftime_core/parser.h (+110)
  • (added) libc/src/time/strftime_core/str_converter.h (+78)
  • (added) libc/src/time/strftime_core/strftime_main.cpp (+41)
  • (added) libc/src/time/strftime_core/strftime_main.h (+26)
  • (modified) libc/test/src/time/CMakeLists.txt (+11)
  • (added) libc/test/src/time/strftime_test.cpp (+2329)
diff --git a/libc/config/linux/x86_64/entrypoints.txt b/libc/config/linux/x86_64/entrypoints.txt
index 0a942516db6c3..fbac0ad14c24f 100644
--- a/libc/config/linux/x86_64/entrypoints.txt
+++ b/libc/config/linux/x86_64/entrypoints.txt
@@ -1115,6 +1115,7 @@ if(LLVM_LIBC_FULL_BUILD)
     libc.src.time.gmtime_r
     libc.src.time.mktime
     libc.src.time.nanosleep
+    libc.src.time.strftime
     libc.src.time.time
     libc.src.time.timespec_get
 
diff --git a/libc/docs/dev/undefined_behavior.rst b/libc/docs/dev/undefined_behavior.rst
index 60fda51e86452..b252832e6715c 100644
--- a/libc/docs/dev/undefined_behavior.rst
+++ b/libc/docs/dev/undefined_behavior.rst
@@ -106,3 +106,40 @@ uninitialized spinlock and invalid spinlock is left undefined. We follow the rec
 POSIX.1-2024, where EINVAL is returned if the spinlock is invalid (here we only check for null pointers) or
 EBUSY is returned if the spinlock is currently locked. The lock is poisoned after a successful destroy. That is,
 subsequent operations on the lock object without any reinitialization will return EINVAL.
+
+Strftime
+--------
+In the C Standard, it provides a list of modifiers, and the conversions these
+are valid on. It also says that a modifier on an unspecified conversion is
+undefined. For LLVM-libc, the conversion is treated as if the modifier isn't
+there.
+
+If a struct tm with values out of the normal range is passed, the standard says
+the result is undefined. For LLVM-libc, the result may be either the normalized
+value (e.g. weekday % 7) or the actual, out of range value. For any numeric
+conversion where the result is just printing a value out of the struct
+(e.g. "%w" prints the day of the week), no normalization occurs ("%w" on a
+tm_wday of 32 prints "32"). For any numeric conversion where the value is
+calculated (e.g. "%u" prints the day of the week, starting on monday), the
+value is normalized (e.g. "%u" on a tm_wday of 32 prints "4"). For conversions
+that result in strings, passing an out of range value will result in "?".
+
+Posix adds padding support to strftime, but says "the default padding character
+is unspecified." For LLVM-libc, the default padding character is ' ' (space)
+for all string-type conversions and '0' for integer-type conversions. Composite
+conversions pass the padding to the first (leftmost) conversion. In practice
+this is always a numeric conversion, so it pads with '0'. For the purposes of
+padding, composite conversions also assume the non-leading conversions have
+valid inputs and output their expected number of characters. For %c this means
+that the padding will be off if the year is outside of the range -999 to 9999.
+
+The %e conversion is padded with spaces by default, but pads with 0s if the '0'
+flag is set.
+
+Posix also adds flags and a minimum field width, but leaves unspecified what
+happens for most combinations of these. For LLVM-libc:
+An unspecified minimum field width defaults to 0.
+More specific flags take precedence over less specific flags (i.e. '+' takes precedence over '0')
+Any conversion with a minimum width is padded with the padding character until it is at least as long as the minimum width.
+Modifiers are applied, then the result is padded if necessary.
+Any composite conversion will pass along all flags to the component conversions.
diff --git a/libc/include/time.yaml b/libc/include/time.yaml
index b71b9ab72075b..41eb318a9dbdc 100644
--- a/libc/include/time.yaml
+++ b/libc/include/time.yaml
@@ -91,6 +91,25 @@ functions:
     arguments:
       - type: const struct timespec *
       - type: struct timespec *
+  - name: strftime
+    standard:
+      - stdc
+    return_type: size_t
+    arguments:
+      - type: char *__restrict
+      - type: size_t
+      - type: const char *__restrict
+      - type: const struct tm *__restrict
+  - name: strftime_l
+    standard:
+      - stdc
+    return_type: size_t
+    arguments:
+      - type: char *__restrict
+      - type: size_t
+      - type: const char *__restrict
+      - type: const struct tm *__restrict
+      - type: locale_t
   - name: time
     standard:
       - stdc
diff --git a/libc/src/time/CMakeLists.txt b/libc/src/time/CMakeLists.txt
index dd28aa67280b7..8332e8ab66f97 100644
--- a/libc/src/time/CMakeLists.txt
+++ b/libc/src/time/CMakeLists.txt
@@ -135,6 +135,21 @@ add_entrypoint_object(
     libc.hdr.types.struct_tm
 )
 
+add_subdirectory(strftime_core) #TODO: Move to top
+
+add_entrypoint_object(
+  strftime
+  SRCS
+    strftime.cpp
+  HDRS
+    strftime.h
+  DEPENDS
+    libc.hdr.types.size_t
+    libc.hdr.types.struct_tm
+    libc.src.stdio.printf_core.writer
+    libc.src.time.strftime_core.strftime_main
+)
+
 add_entrypoint_object(
   time
   SRCS
diff --git a/libc/src/time/strftime.cpp b/libc/src/time/strftime.cpp
new file mode 100644
index 0000000000000..f72bedb6fbf72
--- /dev/null
+++ b/libc/src/time/strftime.cpp
@@ -0,0 +1,31 @@
+//===-- Implementation of strftime function -------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "src/time/strftime.h"
+#include "hdr/types/size_t.h"
+#include "hdr/types/struct_tm.h"
+#include "src/__support/common.h"
+#include "src/__support/macros/config.h"
+#include "src/stdio/printf_core/writer.h"
+#include "src/time/strftime_core/strftime_main.h"
+
+namespace LIBC_NAMESPACE_DECL {
+
+LLVM_LIBC_FUNCTION(size_t, strftime,
+                   (char *__restrict buffer, size_t buffsz,
+                    const char *__restrict format, const struct tm *timeptr)) {
+
+  printf_core::WriteBuffer wb(buffer, (buffsz > 0 ? buffsz - 1 : 0));
+  printf_core::Writer writer(&wb);
+  int ret = strftime_core::strftime_main(&writer, format, timeptr);
+  if (buffsz > 0) // if the buffsz is 0 the buffer may be a null pointer.
+    wb.buff[wb.buff_cur] = '\0';
+  return (ret < 0 || static_cast<size_t>(ret) > buffsz) ? 0 : ret;
+}
+
+} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/time/strftime.h b/libc/src/time/strftime.h
new file mode 100644
index 0000000000000..d8f562859c395
--- /dev/null
+++ b/libc/src/time/strftime.h
@@ -0,0 +1,23 @@
+//===-- Implementation header of strftime -----------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC_TIME_STRFTIME_H
+#define LLVM_LIBC_SRC_TIME_STRFTIME_H
+
+#include "hdr/types/size_t.h"
+#include "hdr/types/struct_tm.h"
+#include "src/__support/macros/config.h"
+
+namespace LIBC_NAMESPACE_DECL {
+
+size_t strftime(char *__restrict, size_t max, const char *__restrict format,
+                const struct tm *timeptr);
+
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LLVM_LIBC_SRC_TIME_STRFTIME_H
diff --git a/libc/src/time/strftime_core/CMakeLists.txt b/libc/src/time/strftime_core/CMakeLists.txt
new file mode 100644
index 0000000000000..a12a26b2aee0f
--- /dev/null
+++ b/libc/src/time/strftime_core/CMakeLists.txt
@@ -0,0 +1,51 @@
+add_header_library(
+  core_structs
+  HDRS
+    core_structs.h
+  DEPENDS
+    libc.src.__support.CPP.string_view
+    libc.hdr.types.struct_tm
+)
+
+add_header_library(
+  parser
+  HDRS
+    parser.h
+  DEPENDS
+    .core_structs
+    libc.src.__support.CPP.string_view
+    libc.src.__support.ctype_utils
+    libc.src.__support.str_to_integer
+)
+
+add_object_library(
+  converter
+  SRCS
+    converter.cpp
+  HDRS
+    converter.h
+    num_converter.h
+    str_converter.h
+    composite_converter.h
+  DEPENDS
+    .core_structs
+    libc.src.time.time_utils
+    libc.src.time.time_constants
+    libc.src.stdio.printf_core.writer
+    libc.src.__support.CPP.string_view
+    libc.src.__support.integer_to_string
+)
+
+add_object_library(
+  strftime_main
+  SRCS
+    strftime_main.cpp
+  HDRS
+    strftime_main.h
+  DEPENDS
+    .core_structs
+    .parser
+    .converter
+    libc.src.stdio.printf_core.writer
+    libc.hdr.types.struct_tm
+)
diff --git a/libc/src/time/strftime_core/composite_converter.h b/libc/src/time/strftime_core/composite_converter.h
new file mode 100644
index 0000000000000..86ba90cae2e03
--- /dev/null
+++ b/libc/src/time/strftime_core/composite_converter.h
@@ -0,0 +1,349 @@
+//===-- Composite converter for strftime ------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See htto_conv.times://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC_STDIO_STRFTIME_CORE_COMPOSITE_CONVERTER_H
+#define LLVM_LIBC_SRC_STDIO_STRFTIME_CORE_COMPOSITE_CONVERTER_H
+
+#include "src/__support/CPP/string_view.h"
+#include "src/__support/macros/config.h"
+#include "src/stdio/printf_core/writer.h"
+#include "src/time/strftime_core/core_structs.h"
+#include "src/time/strftime_core/num_converter.h"
+#include "src/time/strftime_core/str_converter.h"
+#include "src/time/time_constants.h"
+#include "src/time/time_utils.h"
+
+namespace LIBC_NAMESPACE_DECL {
+namespace strftime_core {
+
+LIBC_INLINE int convert_date_us(printf_core::Writer *writer,
+                                const FormatSection &to_conv,
+                                const tm *timeptr) {
+  // format is %m/%d/%y (month/day/year)
+  // we only pad the first conversion, and we assume all the other values are in
+  // their valid ranges.
+  const size_t trailing_conv_len = 1 + 2 + 1 + 2; // sizeof("/01/02")
+  IntFormatSection year_conv;
+  IntFormatSection mon_conv;
+  IntFormatSection mday_conv;
+
+  {
+    FormatSection raw_mon_conv = to_conv;
+    raw_mon_conv.conv_name = 'm';
+
+    const int requested_padding = to_conv.min_width - trailing_conv_len;
+    // a negative padding will be treated as the default
+    raw_mon_conv.min_width = requested_padding;
+
+    mon_conv = get_int_format(raw_mon_conv, timeptr);
+
+    // If the user set the padding, but it's below the width of the trailing
+    // conversions, then there should be no padding.
+    if (to_conv.min_width > 0 && requested_padding < 0)
+      mon_conv.pad_to_len = 0;
+  }
+  {
+    FormatSection raw_mday_conv = to_conv;
+    raw_mday_conv.conv_name = 'd';
+    raw_mday_conv.min_width = 0;
+
+    mday_conv = get_int_format(raw_mday_conv, timeptr);
+  }
+  {
+    FormatSection raw_year_conv = to_conv;
+    raw_year_conv.conv_name = 'y';
+    raw_year_conv.min_width = 0;
+
+    year_conv = get_int_format(raw_year_conv, timeptr);
+  }
+
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, mon_conv));
+  RET_IF_RESULT_NEGATIVE(writer->write('/'));
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, mday_conv));
+  RET_IF_RESULT_NEGATIVE(writer->write('/'));
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, year_conv));
+
+  return WRITE_OK;
+}
+
+LIBC_INLINE int convert_date_iso(printf_core::Writer *writer,
+                                 const FormatSection &to_conv,
+                                 const tm *timeptr) {
+  // format is "%Y-%m-%d" (year-month-day)
+  // we only pad the first conversion, and we assume all the other values are in
+  // their valid ranges.
+  const size_t trailing_conv_len = 1 + 2 + 1 + 2; // sizeof("-01-02")
+  IntFormatSection year_conv;
+  IntFormatSection mon_conv;
+  IntFormatSection mday_conv;
+
+  {
+    FormatSection raw_year_conv = to_conv;
+    raw_year_conv.conv_name = 'Y';
+
+    const int requested_padding = to_conv.min_width - trailing_conv_len;
+    // a negative padding will be treated as the default
+    raw_year_conv.min_width = requested_padding;
+
+    year_conv = get_int_format(raw_year_conv, timeptr);
+
+    // If the user set the padding, but it's below the width of the trailing
+    // conversions, then there should be no padding.
+    if (to_conv.min_width > 0 && requested_padding < 0)
+      year_conv.pad_to_len = 0;
+  }
+  {
+    FormatSection raw_mon_conv = to_conv;
+    raw_mon_conv.conv_name = 'm';
+    raw_mon_conv.min_width = 0;
+
+    mon_conv = get_int_format(raw_mon_conv, timeptr);
+  }
+  {
+    FormatSection raw_mday_conv = to_conv;
+    raw_mday_conv.conv_name = 'd';
+    raw_mday_conv.min_width = 0;
+
+    mday_conv = get_int_format(raw_mday_conv, timeptr);
+  }
+
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, year_conv));
+  RET_IF_RESULT_NEGATIVE(writer->write('-'));
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, mon_conv));
+  RET_IF_RESULT_NEGATIVE(writer->write('-'));
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, mday_conv));
+
+  return WRITE_OK;
+}
+
+LIBC_INLINE int convert_time_am_pm(printf_core::Writer *writer,
+                                   const FormatSection &to_conv,
+                                   const tm *timeptr) {
+  // format is "%I:%M:%S %p" (hour:minute:second AM/PM)
+  // we only pad the first conversion, and we assume all the other values are in
+  // their valid ranges.
+  const size_t trailing_conv_len = 1 + 2 + 1 + 2 + 1 + 2; // sizeof(":01:02 AM")
+  IntFormatSection hour_conv;
+  IntFormatSection min_conv;
+  IntFormatSection sec_conv;
+
+  const time_utils::TMReader time_reader(timeptr);
+
+  {
+    FormatSection raw_hour_conv = to_conv;
+    raw_hour_conv.conv_name = 'I';
+
+    const int requested_padding = to_conv.min_width - trailing_conv_len;
+    // a negative padding will be treated as the default
+    raw_hour_conv.min_width = requested_padding;
+
+    hour_conv = get_int_format(raw_hour_conv, timeptr);
+
+    // If the user set the padding, but it's below the width of the trailing
+    // conversions, then there should be no padding.
+    if (to_conv.min_width > 0 && requested_padding < 0)
+      hour_conv.pad_to_len = 0;
+  }
+  {
+    FormatSection raw_min_conv = to_conv;
+    raw_min_conv.conv_name = 'M';
+    raw_min_conv.min_width = 0;
+
+    min_conv = get_int_format(raw_min_conv, timeptr);
+  }
+  {
+    FormatSection raw_sec_conv = to_conv;
+    raw_sec_conv.conv_name = 'S';
+    raw_sec_conv.min_width = 0;
+
+    sec_conv = get_int_format(raw_sec_conv, timeptr);
+  }
+
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, hour_conv));
+  RET_IF_RESULT_NEGATIVE(writer->write(':'));
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, min_conv));
+  RET_IF_RESULT_NEGATIVE(writer->write(':'));
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, sec_conv));
+  RET_IF_RESULT_NEGATIVE(writer->write(' '));
+  RET_IF_RESULT_NEGATIVE(writer->write(time_reader.get_am_pm()));
+
+  return WRITE_OK;
+}
+
+LIBC_INLINE int convert_time_minute(printf_core::Writer *writer,
+                                    const FormatSection &to_conv,
+                                    const tm *timeptr) {
+  // format is "%H:%M" (hour:minute)
+  // we only pad the first conversion, and we assume all the other values are in
+  // their valid ranges.
+  const size_t trailing_conv_len = 1 + 2; // sizeof(":01")
+  IntFormatSection hour_conv;
+  IntFormatSection min_conv;
+
+  {
+    FormatSection raw_hour_conv = to_conv;
+    raw_hour_conv.conv_name = 'H';
+
+    const int requested_padding = to_conv.min_width - trailing_conv_len;
+    // a negative padding will be treated as the default
+    raw_hour_conv.min_width = requested_padding;
+
+    hour_conv = get_int_format(raw_hour_conv, timeptr);
+
+    // If the user set the padding, but it's below the width of the trailing
+    // conversions, then there should be no padding.
+    if (to_conv.min_width > 0 && requested_padding < 0)
+      hour_conv.pad_to_len = 0;
+  }
+  {
+    FormatSection raw_min_conv = to_conv;
+    raw_min_conv.conv_name = 'M';
+    raw_min_conv.min_width = 0;
+
+    min_conv = get_int_format(raw_min_conv, timeptr);
+  }
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, hour_conv));
+  RET_IF_RESULT_NEGATIVE(writer->write(':'));
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, min_conv));
+
+  return WRITE_OK;
+}
+
+LIBC_INLINE int convert_time_second(printf_core::Writer *writer,
+                                    const FormatSection &to_conv,
+                                    const tm *timeptr) {
+  // format is "%H:%M:%S" (hour:minute:second)
+  // we only pad the first conversion, and we assume all the other values are in
+  // their valid ranges.
+  const size_t trailing_conv_len = 1 + 2 + 1 + 2; // sizeof(":01:02")
+  IntFormatSection hour_conv;
+  IntFormatSection min_conv;
+  IntFormatSection sec_conv;
+
+  {
+    FormatSection raw_hour_conv = to_conv;
+    raw_hour_conv.conv_name = 'H';
+
+    const int requested_padding = to_conv.min_width - trailing_conv_len;
+    // a negative padding will be treated as the default
+    raw_hour_conv.min_width = requested_padding;
+
+    hour_conv = get_int_format(raw_hour_conv, timeptr);
+
+    // If the user set the padding, but it's below the width of the trailing
+    // conversions, then there should be no padding.
+    if (to_conv.min_width > 0 && requested_padding < 0)
+      hour_conv.pad_to_len = 0;
+  }
+  {
+    FormatSection raw_min_conv = to_conv;
+    raw_min_conv.conv_name = 'M';
+    raw_min_conv.min_width = 0;
+
+    min_conv = get_int_format(raw_min_conv, timeptr);
+  }
+  {
+    FormatSection raw_sec_conv = to_conv;
+    raw_sec_conv.conv_name = 'S';
+    raw_sec_conv.min_width = 0;
+
+    sec_conv = get_int_format(raw_sec_conv, timeptr);
+  }
+
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, hour_conv));
+  RET_IF_RESULT_NEGATIVE(writer->write(':'));
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, min_conv));
+  RET_IF_RESULT_NEGATIVE(writer->write(':'));
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, sec_conv));
+
+  return WRITE_OK;
+}
+
+LIBC_INLINE int convert_full_date_time(printf_core::Writer *writer,
+                                       const FormatSection &to_conv,
+                                       const tm *timeptr) {
+  const time_utils::TMReader time_reader(timeptr);
+  // format is "%a %b %e %T %Y" (weekday month mday [time] year)
+  // we only pad the first conversion, and we assume all the other values are in
+  // their valid ranges.
+  // sizeof("Sun Jan 12 03:45:06 2025")
+  const size_t full_conv_len = 3 + 1 + 3 + 1 + 2 + 1 + 8 + 1 + 4;
+  // use the full conv lent because
+  const int requested_padding = to_conv.min_width - full_conv_len;
+
+  cpp::string_view wday_str = unwrap_opt(time_reader.get_weekday_short_name());
+  cpp::string_view month_str = unwrap_opt(time_reader.get_month_short_name());
+  IntFormatSection mday_conv;
+  FormatSection raw_time_conv = to_conv;
+  IntFormatSection year_conv;
+
+  {
+    FormatSection raw_mday_conv = to_conv;
+    raw_mday_conv.conv_name = 'e';
+    raw_mday_conv.min_width = 0;
+
+    mday_conv = get_int_format(raw_mday_conv, timeptr);
+  }
+  {
+    FormatSection raw_year_conv = to_conv;
+    raw_year_conv.conv_name = 'Y';
+    raw_year_conv.min_width = 0;
+
+    year_conv = get_int_format(raw_year_conv, timeptr);
+  }
+  {
+    raw_time_conv.conv_name = 'T';
+    raw_time_conv.min_width = 0;
+  }
+
+  if (requested_padding > 0)
+    RET_IF_RESULT_NEGATIVE(writer->write(' ', requested_padding));
+  RET_IF_RESULT_NEGATIVE(writer->write(wday_str));
+  RET_IF_RESULT_NEGATIVE(writer->write(' '));
+  RET_IF_RESULT_NEGATIVE(writer->write(month_str));
+  RET_IF_RESULT_NEGATIVE(writer->write(' '));
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, mday_conv));
+  RET_IF_RESULT_NEGATIVE(writer->write(' '));
+  RET_IF_RESULT_NEGATIVE(convert_time_second(writer, raw_time_conv, timeptr));
+  RET_IF_RESULT_NEGATIVE(writer->write(' '));
+  RET_IF_RESULT_NEGATIVE(write_padded_int(writer, year_conv));
+
+  return WRITE_OK;
+}
+
+LIBC_INLINE int convert_composite(printf_core::Writer *writer,
+                                  const FormatSection &to_conv,
+                                  const tm *timeptr) {
+  switch (to_conv.conv_name) {
+  case 'c': // locale specified date and time
+            // in default locale Equivalent to %a %b %e %T %Y.
+    return convert_full_date_time(writer, to_conv, timeptr);
+  case 'D': // %m/%d/%y (month/day/y...
[truncated]

Copy link
Member

@nickdesaulniers nickdesaulniers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those tests are pretty thorough!

Comment on lines 51 to 55
FormatSection raw_mday_conv = to_conv;
raw_mday_conv.conv_name = 'd';
raw_mday_conv.min_width = 0;

mday_conv = get_int_format(raw_mday_conv, timeptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to avoid making so many copies of to_conv?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other option would be to have one copy that gets modified for each conversion. I think that'd be a bit less clean since they'd all share a name, but it might be slightly more memory efficient.

Comment on lines 100 to 113
{
FormatSection raw_mon_conv = to_conv;
raw_mon_conv.conv_name = 'm';
raw_mon_conv.min_width = 0;

mon_conv = get_int_format(raw_mon_conv, timeptr);
}
{
FormatSection raw_mday_conv = to_conv;
raw_mday_conv.conv_name = 'd';
raw_mday_conv.min_width = 0;

mday_conv = get_int_format(raw_mday_conv, timeptr);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it could be 2 calls to a new helper function. Could probably reuse that helper throughout this code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, that cleans up a lot of the rest of it. Done

__builtin_trap(); // this should be unreachable, but trap if you hit it.
}

int spaces = to_conv.min_width - static_cast<int>(str.size());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to check here if str is equivalent to OUT_OF_BOUNDS_STR due to the use of unwrap_opt above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, it's a valid string with a size.


void clear_buff() {
// TODO: builtin_memset?
for (size_t i = 0; i < BUFF_LEN; ++i)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just call memset; it MUST always be available, regardless of freestanding mode or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, but if we use the proper name for memset we have to include the header. That's why I left this as a todo.

Comment on lines 64 to 65
for (size_t str_cur = 0; str_cur < str.size(); ++i, ++str_cur)
buff[i] = str[str_cur];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for (size_t str_cur = 0; str_cur < str.size(); ++i, ++str_cur)
buff[i] = str[str_cur];
for (size_t str_cur = 0, e = str.size(); str_cur < e; ++i, ++str_cur)
buff[i] = str[str_cur];

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've applied it, but I'm not sure if this optimization is necessary. Does the compiler not do this itself?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://llvm.org/docs/CodingStandards.html#don-t-evaluate-end-every-time-through-a-loop is about calls to end, but I think it applies to calls to size for the same reasons. You'll find that pattern VERY COMMON throughout the use of LLVM.

For this simple loop body, the code is likely very similar either way (since the definitions of size and operator[] are defined in headers), but for loop bodies containing function calls where we can't see the definition from this TU, we can't assume such functions don't already have non-const references that alias to something we might otherwise try to hoist out of a loop (such as a vector's size).

Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
In the process of adding strftime (llvm#122556) I wrote this utility class
to simplify reading from a struct tm. It provides helper functions that
return basically everything needed by strftime. It's not tested
directly, but it is thoroughly exercised by the strftime tests.
Implements the posix-specified strftime conversions for the default
locale, along with comprehensive unit tests. This reuses a lot of design
from printf, as well as the printf writer.

Roughly based on llvm#111305, but with major rewrites.
flovent pushed a commit to flovent/llvm-project that referenced this pull request Feb 13, 2025
In the process of adding strftime (llvm#122556) I wrote this utility class
to simplify reading from a struct tm. It provides helper functions that
return basically everything needed by strftime. It's not tested
directly, but it is thoroughly exercised by the strftime tests.
Copy link
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, I'll test it on the GPU after this lands. Then maybe I can enable streams on the GPU build despite my better judgement.

joaosaffran pushed a commit to joaosaffran/llvm-project that referenced this pull request Feb 14, 2025
In the process of adding strftime (llvm#122556) I wrote this utility class
to simplify reading from a struct tm. It provides helper functions that
return basically everything needed by strftime. It's not tested
directly, but it is thoroughly exercised by the strftime tests.
@michaelrj-google michaelrj-google merged commit 398f865 into llvm:main Feb 14, 2025
14 checks passed
@michaelrj-google michaelrj-google deleted the libcStrftime branch February 14, 2025 23:57
@llvm-ci
Copy link
Collaborator

llvm-ci commented Feb 15, 2025

LLVM Buildbot has detected a new failure on builder libc-x86_64-debian-gcc-fullbuild-dbg running on libc-x86_64-debian-fullbuild while building libc at step 4 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/131/builds/16056

Here is the relevant piece of the build log for the reference
Step 4 (annotate) failure: 'python ../llvm-zorg/zorg/buildbot/builders/annotated/libc-linux.py ...' (failure)
...
[62/88] Generating header sys/types.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/sys/types.yaml
[63/86] Generating header uchar.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/uchar.yaml
[64/86] Generating header termios.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/termios.yaml
[65/75] Generating header sys/wait.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/sys/wait.yaml
[66/72] Generating header math.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/math.yaml
[67/72] Building CXX object compiler-rt/lib/scudo/standalone/CMakeFiles/RTScudoStandalone.x86_64.dir/mem_map_linux.cpp.o
[68/72] Building CXX object compiler-rt/lib/scudo/standalone/CMakeFiles/RTScudoStandalone.x86_64.dir/linux.cpp.o
[69/72] Building CXX object libc/src/time/CMakeFiles/libc.src.time.strftime.dir/strftime.cpp.o
[70/72] Building CXX object libc/src/time/strftime_core/CMakeFiles/libc.src.time.strftime_core.strftime_main.dir/strftime_main.cpp.o
[71/72] Building CXX object libc/src/time/strftime_core/CMakeFiles/libc.src.time.strftime_core.converter.dir/converter.cpp.o
FAILED: libc/src/time/strftime_core/CMakeFiles/libc.src.time.strftime_core.converter.dir/converter.cpp.o 
/usr/bin/g++ -DLIBC_NAMESPACE=__llvm_libc_20_0_0_git -D_DEBUG -I/home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/libc -isystem /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/build/libc/include -fvisibility-inlines-hidden -Werror=date-time -fno-lifetime-dse -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -Wimplicit-fallthrough -Wno-nonnull -Wno-class-memaccess -Wno-redundant-move -Wno-pessimizing-move -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wno-misleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -g -DLIBC_QSORT_IMPL=LIBC_QSORT_QUICK_SORT -DLIBC_ADD_NULL_CHECKS -fpie -ffreestanding -DLIBC_FULL_BUILD -isystem/usr/lib/gcc/x86_64-linux-gnu/12//include -nostdinc -idirafter/usr/include -fno-builtin -fno-exceptions -fno-lax-vector-conversions -fno-unwind-tables -fno-asynchronous-unwind-tables -fno-rtti -ftrivial-auto-var-init=pattern -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -Wall -Wextra -Werror -Wconversion -Wno-sign-conversion -Wdeprecated -fext-numeric-literals -Wno-pedantic -Wimplicit-fallthrough -Wwrite-strings -Wextra-semi -std=gnu++17 -MD -MT libc/src/time/strftime_core/CMakeFiles/libc.src.time.strftime_core.converter.dir/converter.cpp.o -MF libc/src/time/strftime_core/CMakeFiles/libc.src.time.strftime_core.converter.dir/converter.cpp.o.d -o libc/src/time/strftime_core/CMakeFiles/libc.src.time.strftime_core.converter.dir/converter.cpp.o -c /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/libc/src/time/strftime_core/converter.cpp
In file included from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/libc/src/time/strftime_core/converter.cpp:14:
/home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/libc/src/time/strftime_core/composite_converter.h: In function ‘int __llvm_libc_20_0_0_git::strftime_core::convert_full_date_time(__llvm_libc_20_0_0_git::printf_core::Writer*, const FormatSection&, const tm*)’:
/home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/libc/src/time/strftime_core/composite_converter.h:178:51: error: conversion from ‘long unsigned int’ to ‘int’ may change value [-Werror=conversion]
  178 |   const int requested_padding = to_conv.min_width - full_conv_len;
      |                                 ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
ninja: build stopped: subcommand failed.
['ninja', 'libc'] exited with return code 1.
The build step threw an exception...
Traceback (most recent call last):
  File "/home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/build/../llvm-zorg/zorg/buildbot/builders/annotated/libc-linux.py", line 164, in step
    yield
  File "/home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/build/../llvm-zorg/zorg/buildbot/builders/annotated/libc-linux.py", line 126, in main
    run_command(['ninja', 'libc'])
  File "/home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/build/../llvm-zorg/zorg/buildbot/builders/annotated/libc-linux.py", line 179, in run_command
    util.report_run_cmd(cmd, cwd=directory)
  File "/home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-zorg/zorg/buildbot/builders/annotated/util.py", line 49, in report_run_cmd
    subprocess.check_call(cmd, shell=shell, *args, **kwargs)
  File "/usr/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ninja', 'libc']' returned non-zero exit status 1.
@@@STEP_FAILURE@@@
@@@BUILD_STEP build libc-startup@@@
Running: ninja libc-startup
[1/19] Generating header errno.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/errno.yaml
[2/19] Generating header stdint.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/stdint.yaml
[3/18] Generating header fcntl.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/fcntl.yaml
[4/18] Generating header float.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/float.yaml
[5/18] Generating header limits.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/limits.yaml
[6/18] Generating header link.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/link.yaml
[7/18] Generating header sys/syscall.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/sys/syscall.yaml
[8/18] Generating header sys/auxv.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/sys/auxv.yaml
[9/18] Generating header sys/prctl.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/sys/prctl.yaml
[10/18] Generating header sys/time.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/sys/time.yaml
[11/18] Generating header time.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/time.yaml
[12/18] Generating header sys/mman.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/sys/mman.yaml
[13/17] Generating header stdlib.h from /home/llvm-libc-buildbot/buildbot-worker/libc-x86_64-debian-fullbuild/libc-x86_64-debian-gcc-fullbuild-dbg/llvm-project/runtimes/../libc/include/stdlib.yaml

sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
In the process of adding strftime (llvm#122556) I wrote this utility class
to simplify reading from a struct tm. It provides helper functions that
return basically everything needed by strftime. It's not tested
directly, but it is thoroughly exercised by the strftime tests.
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
Implements the posix-specified strftime conversions for the default
locale, along with comprehensive unit tests. This reuses a lot of design
from printf, as well as the printf writer.

Roughly based on llvm#111305, but with major rewrites.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants