Commit graph

23659 commits

Author SHA1 Message Date
Fredrik Roubert
02a1bfc59f ICU-22520 Refactor CheckedArrayByteSink & u_terminateChars into helper.
The repeated sequence of allocating a CheckedArrayByteSink, calling some
function that writes into this, then checking for overflow and returning
through u_terminateChars() can all be moved into a single shared helper
function.
2024-03-05 20:09:54 +01:00
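The pattern described in commit 02a1bfc59f (bounded sink, overflow check, terminate) can be sketched without any ICU dependency. Everything below is a hypothetical stand-in: `BoundedSink`, `Status`, `Result` and `writeTerminated` are names invented for illustration and are not the ICU classes or the helper the commit actually adds.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a bounded byte sink; not the ICU class.
class BoundedSink {
public:
    BoundedSink(char* buffer, std::size_t capacity)
        : buffer_(buffer), capacity_(capacity) {
        assert(buffer != nullptr || capacity == 0);
    }

    void append(std::string_view bytes) {
        for (char c : bytes) {
            if (length_ < capacity_) buffer_[length_] = c;
            ++length_;  // Keep counting so the caller learns the needed size.
        }
    }

    std::size_t length() const { return length_; }
    bool overflowed() const { return length_ > capacity_; }

private:
    char* buffer_;
    std::size_t capacity_;
    std::size_t length_ = 0;
};

enum class Status { Ok, NotTerminated, BufferOverflow };

struct Result {
    std::size_t length;  // Number of bytes the writer produced.
    Status status;
};

// The shared helper: create the sink, run the writer, check for overflow,
// then NUL-terminate if there is room for the terminator.
template <typename Writer>
Result writeTerminated(char* buffer, std::size_t capacity, Writer&& writer) {
    BoundedSink sink(buffer, capacity);
    writer(sink);
    if (sink.overflowed()) return {sink.length(), Status::BufferOverflow};
    if (sink.length() == capacity) return {sink.length(), Status::NotTerminated};
    buffer[sink.length()] = '\0';
    return {sink.length(), Status::Ok};
}
```

A caller then passes only the part that differs between call sites, for example `writeTerminated(buf, sizeof buf, [](BoundedSink& s) { s.append("en_US"); })`.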
Rich Gillam
c610d7f986 ICU-22534 Promote (almost) all @draft ICU 73 APIs to @stable ICU 73 2024-03-04 18:05:29 -08:00
Fredrik Roubert
232362bf17 ICU-22520 Use operator* instead of calling std::optional::value().
There's a subtle difference between these two ways of accessing the
value of an optional and that is that the value() method can throw an
exception if there isn't any value, but operator* won't do that (it's
just undefined behavior if there isn't any value).

ICU4C code never tries to access any optional value without first
checking that it exists, but the ability of the value() method to throw
an exception in case there wasn't any such check first is the reason why
std::exception symbols previously could show up in debug builds.

This reverts the changes that were made to dependencies.txt by
commit dc70b5a056.
2024-03-04 23:40:15 +01:00
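The difference described in commit 232362bf17 can be illustrated with a small standalone sketch. `findRegion`, `unwrap` and `regionOrDefault` are hypothetical names used only here; they are not ICU functions.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical lookup used only for illustration.
std::optional<std::string> findRegion(const std::string& language) {
    if (language == "sv") return "SE";
    return std::nullopt;
}

// Precondition: the caller has already checked that `region` holds a value.
const std::string& unwrap(const std::optional<std::string>& region) {
    assert(region.has_value());  // Debug-only check, no exception involved.
    // operator* never throws. region.value() would instead throw
    // std::bad_optional_access when empty, which pulls exception symbols
    // into the binary even if that path is never taken.
    return *region;
}

std::string regionOrDefault(const std::string& language) {
    std::optional<std::string> region = findRegion(language);
    if (!region.has_value()) return "ZZ";
    return unwrap(region);  // Safe: existence was checked on the line above.
}
```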
Frank Tang
73744ea41f ICU-22633 Fix overflow caused by large AM PM value
Fix https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=66771
2024-03-04 13:48:24 -08:00
Frank Tang
37526240e1 ICU-22274 Mark known issue for 3 timezones for EnvTest
tz2024a changes "Asia/Qostanay" and "Asia/Almaty", but the test machines have
not yet updated their zoneinfo to 2024a, so we mark them as known issues.

extern long timezone; in <time.h> (see man tzset in a Linux shell)
has the wrong value when TZ=America/Scoresbysund
2024-03-04 11:06:39 -08:00
Shane F. Carr
71b9b88200 ICU-22319 Fix number range semanticallyEquivalent
See #2385
2024-03-04 08:23:00 -08:00
Fredrik Roubert
929cd9bb4f ICU-22520 Standardize return on error for all locale functions.
· No function should do anything if an error has already occurred.
· On error, a value of 0, nullptr, {}, etc., should be returned.
· Values shouldn't have overloaded meanings (eg. index or found).
· Values that are never used should not be returned at all.
2024-02-29 20:42:03 +01:00
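The conventions listed in commit 929cd9bb4f can be sketched as follows. This is a minimal illustration, assuming a hypothetical `Status` enum in place of ICU's UErrorCode and a hypothetical `languageLength` function; neither is taken from the ICU sources.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical status type standing in for ICU's UErrorCode.
enum class Status { Ok, IllegalArgument };

constexpr bool failed(Status s) { return s != Status::Ok; }

// Returns the length of the leading language subtag.
// The return value has exactly one meaning (a length); on error it is 0.
int languageLength(std::string_view localeID, Status& status) {
    if (failed(status)) return 0;  // An earlier error occurred: do nothing.
    if (localeID.empty()) {
        status = Status::IllegalArgument;
        return 0;                  // On error, return 0.
    }
    std::size_t end = localeID.find_first_of("_-");
    if (end == std::string_view::npos) end = localeID.size();
    return static_cast<int>(end);
}
```

Because every function checks the incoming status first, a caller can chain several calls and inspect the status once at the end.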
David Carlier
35353f2d7f ICU-22671 format_date should use C++ nullptr instead of 0 for udat_open/DateFormat::create
Issue filed: https://unicode-org.atlassian.net/browse/ICU-22671
2024-02-29 20:02:20 +01:00
Fredrik Roubert
137b4c9e47 ICU-22556 Update configure files from configure.ac using autoreconf. 2024-02-29 19:43:43 +01:00
Jordan Williams
75ff7952b9 ICU-22556 Prefer cc and c++ compilers
When building icu4c, it defaults to clang instead of gcc when the default compiler, cc / c++, is a symlink to gcc / g++.
This is not the expected behavior when building C and C++ code.
It appears that this behavior was put in place originally for supporting C++11, which hopefully is no longer such a concern.
This PR adjusts the configure.ac for icu4c to prefer the cc and c++ compilers first.
2024-02-29 19:43:43 +01:00
Frank Tang
0563859d8c ICU-22679 Optimize calendar code for edge cases
See #2853
2024-02-28 17:08:24 -08:00
DraganBesevic
a1925abf4f ICU-22534 CLDR 45 alpha2 integration to ICU 2024-02-28 08:28:08 -08:00
Craig
d271d3f269 ICU-21952 fix draft version of withoutLocale to ICU 75 2024-02-27 17:03:46 -08:00
Markus Scherer
d1fa15bc1f ICU-22571 add Aran script code variant 2024-02-27 14:23:59 -08:00
Frank Tang
ec800e7407 ICU-22633 Return error if era is out of range 2024-02-27 10:56:28 -08:00
Fredrik Roubert
314f03eeaf ICU-22532 Don't dereference nullptr (-Wtautological-undefined-compare). 2024-02-27 14:11:38 +01:00
Rūdolfs Mazurs
394341edba ICU-22646 Update collation test for Latvian locale
This test is also relevant to issues ICU-12765, ICU-13508 and ICU-20532.
2024-02-26 14:09:40 -08:00
Rahul Pandey
3c82e6857c ICU-22676 Undefine move32 since it is interpreted as a system call with MSVC ARM64 2024-02-26 08:55:31 -08:00
Frank Tang
b24b251bca ICU-22633 Fix more int overflow issues in calendar 2024-02-13 17:24:18 -08:00
Fredrik Roubert
939f08f274 ICU-22520 Use C++ function signatures for internal C++ functions.
Some of this code was originally written as C code and some of this code
was originally written as C++ code but made to resemble the then already
existing code that had once been C code. Changing it all to normal C++
now will make it easier and safer to work with going forward.

· Use unnamed namespace instead of static.
· Use reference instead of non-nullable pointer.
· Use bool instead of UBool.
· Use constexpr for static data.
· Use U_EXPORT instead of U_CAPI or U_CFUNC.
· Use the default calling convention instead of U_EXPORT2.
2024-02-12 21:44:06 +01:00
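Several of the conventions listed in commit 939f08f274 can be shown in a few lines. This is only a sketch with hypothetical names (`kSeparators`, `isSeparator`, `countSeparators`); the export macro and calling convention items are ICU-specific and are not shown.

```cpp
#include <cassert>
#include <string_view>

namespace {  // Unnamed namespace instead of `static`.

// constexpr for static data.
constexpr char kSeparators[] = {'_', '-'};

// bool instead of UBool.
bool isSeparator(char c) {
    for (char s : kSeparators) {
        if (c == s) return true;
    }
    return false;
}

// A reference instead of a non-nullable pointer: the caller cannot pass
// null, so the function needs no null check.
void countSeparators(std::string_view localeID, int& count) {
    count = 0;
    for (char c : localeID) {
        if (isSeparator(c)) ++count;
    }
}

}  // namespace
```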
Fredrik Roubert
69c8e12642 ICU-22520 Remove local custom code for parsing variant subtags.
Now that the parseTagString() helper function is just a wrapper over
ulocimp_getSubtags() it can be replaced by calling that function
directly instead and letting it handle variant subtags as well.
2024-02-09 20:26:09 +01:00
Fredrik Roubert
61fdbe0d06 ICU-22520 Refactor code to remove the use of goto for error handling.
This is to facilitate further refactoring of the locale code; goto
doesn't play well with C++ memory handling.
2024-02-09 18:47:22 +01:00
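One reason goto fits poorly with C++, as commit 61fdbe0d06 notes, is that a jump may not bypass the initialization of an object with a non-trivial constructor. Early returns combined with RAII avoid the problem. The sketch below uses a hypothetical `buildTag` function and is not taken from the ICU sources.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical helper: early returns plus RAII replace a `goto cleanup`
// label. Every exit path releases `buffer` automatically.
bool buildTag(const std::string& language, const std::string& region,
              std::string& out) {
    auto buffer = std::make_unique<std::string>();
    if (language.empty()) return false;  // No cleanup code needed here.
    *buffer += language;
    if (!region.empty()) {
        *buffer += '-';
        *buffer += region;
    }
    out = std::move(*buffer);
    return true;
}
```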
Peter Edberg
2c16b037cf ICU-22557 Add kxv_IN to build-icu-data.xml, update generate stubs 2024-02-09 09:40:52 -08:00
Frank Tang
abcb80fd53 ICU-22615 Test TimeZoneNames API will not assert with non ASCII.
Add tests and return an error when the ID is non-ASCII
2024-02-08 23:37:14 -08:00
yumaoka
cd251ee62e ICU-22659 tzdata2024a updates in ICU repo 2024-02-08 15:00:39 -05:00
Fredrik Roubert
63ae786bf7 ICU-22520 Refactor function macros into inline functions.
This is to facilitate further refactoring of the locale code.
2024-02-08 14:24:48 +01:00
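The kind of change commit 63ae786bf7 describes can be sketched as follows. The macro in the comment and the functions `isAlpha` and `countLeadingAlpha` are hypothetical examples, not the actual ICU macros.

```cpp
#include <cassert>

// Before (function-like macro; evaluates its argument more than once):
//   #define IS_ALPHA(c) (((c) >= 'a' && (c) <= 'z') || ((c) >= 'A' && (c) <= 'Z'))

// After: a typed inline function that evaluates its argument exactly once.
constexpr bool isAlpha(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Compile-time checks are possible because the function is constexpr.
static_assert(isAlpha('q') && !isAlpha('_'));

// With the macro, `IS_ALPHA(*p++)` would advance `p` several times per
// test. With the function, the side effect happens once per call.
inline int countLeadingAlpha(const char* p) {
    int n = 0;
    while (isAlpha(*p++)) ++n;
    return n;
}
```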
Robin Leroy
ba1208e49b ICU-22518 Add a flag to export the output of the reference implementation from the old segmentation monkey tests 2024-02-08 04:54:33 +01:00
Fredrik Roubert
699555a5bd ICU-22520 Use a ByteSink append buffer instead of a local CharString.
These functions that eventually write their output to a ByteSink need a
small temporary buffer for processing the subtag they're about to write
and currently use a local CharString object to provide this buffer,
which then gets written to the ByteSink and discarded.

This intermediate step is unnecessary as a ByteSink can provide an
append buffer which can be used instead, eliminating the need to
allocate a local temporary buffer and to copy the data around.

This approach also makes it natural to split the processing into two
steps, first calculating the length of the subtag, then processing it,
which makes it possible to return early when no output is requested.
2024-02-08 00:38:09 +01:00
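The two-step approach described in commit 699555a5bd (measure the subtag, then process it directly into the sink's own buffer) can be sketched as below. `StringSink` is a hypothetical, simplified sink and `appendLanguage` is a hypothetical function; ICU's ByteSink interface differs in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical, simplified sink that can lend out space at its own end.
class StringSink {
public:
    // Returns writable space for `n` bytes directly after the committed data.
    char* getAppendBuffer(std::size_t n) {
        data_.resize(used_ + n);
        return data_.data() + used_;
    }

    void append(const char* bytes, std::size_t n) {
        if (bytes == data_.data() + used_) {
            data_.resize(used_ + n);  // Already in place: just commit.
        } else {
            data_.resize(used_);
            data_.append(bytes, n);   // External bytes: copy them in.
        }
        used_ += n;
    }

    std::string_view view() const { return std::string_view(data_.data(), used_); }

private:
    std::string data_;
    std::size_t used_ = 0;
};

constexpr char toLowerAscii(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Writes the leading subtag of `id`, lowercased, to `sink` and returns its
// length. A null `sink` means that no output is requested.
std::size_t appendLanguage(std::string_view id, StringSink* sink) {
    // Step 1: calculate the length of the subtag.
    std::size_t length = 0;
    while (length < id.size() && id[length] != '_' && id[length] != '-') ++length;
    if (sink == nullptr || length == 0) return length;  // Early return.

    // Step 2: process the subtag directly into the sink's buffer.
    char* buffer = sink->getAppendBuffer(length);
    for (std::size_t i = 0; i < length; ++i) buffer[i] = toLowerAscii(id[i]);
    sink->append(buffer, length);
    return length;
}
```

No temporary string is allocated per subtag, and the bytes are written once rather than written and then copied.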
Fredrik Roubert
a210fc8351 ICU-22651 Add a docstring for LocalOpenPointer. 2024-02-07 21:47:13 +01:00
Frank Yung-Fong Tang
0b66fada30 ICU-22633 Fix integer overflow inside Calendar code
See #2806
2024-02-07 10:58:41 -08:00
Fredrik Roubert
a6efa924ad ICU-22520 Let ulocimp_getRegionForSupplementalData() return CharString. 2024-02-07 14:27:40 +01:00
Fredrik Roubert
56509e88bf ICU-22651 Move LocalOpenPointer into an internal nested namespace. 2024-02-07 14:27:17 +01:00
Peter Edberg
43ab3d1de8 ICU-22583 BRS 75rc CLDR 45-alpha0 to ICU main part 4 (fix to get new unitPrefixes data) 2024-02-06 18:07:44 -08:00
Peter Edberg
12cbf73e39 ICU-22583 BRS 75rc CLDR 45-alpha0 to ICU main part 3 (source and test code changes) 2024-02-06 18:07:44 -08:00
Peter Edberg
c7245a36df ICU-22583 BRS 75rc CLDR 45-alpha0 to ICU main part 2 (data generated or copied from CLDR) 2024-02-06 18:07:44 -08:00
Fredrik Roubert
d28e12b1f2 ICU-22520 Replace char arrays with icu::CharString. 2024-02-06 19:53:53 +01:00
Fredrik Roubert
930b4d9ab9 ICU-22520 Add convenience wrappers for calling ulocimp_getSubtags().
These wrappers that call ulocimp_getSubtags() to get only one particular
subtag and then return that as icu::CharString will be convenient for
replacing code that currently calls the uloc_get*() functions writing
into a fixed size buffer.
2024-02-06 19:53:53 +01:00
Fredrik Roubert
835b009314 ICU-22520 Make ulocimp_get*() internal to ulocimp_getSubtags().
These functions now no longer have any other callers so they can be made
internal to the compilation unit of ulocimp_getSubtags(), thus bringing
them back to how they originally were intended to be used (and making
the comment above them true once again).

This also makes it possible to remove the temporary icu::CharString
objects that previously were returned to callers and instead write
directly to icu::ByteSink, making the code both simpler and less
wasteful (which is also how this was once intended).
2024-02-06 13:12:55 +01:00
Fredrik Roubert
1b768edbdf ICU-22520 Update all users of ulocimp_get*() to ulocimp_getSubtags().
This simplifies the code by removing the need for finding the positions
of the subtags; all that logic is now in one single place.
2024-02-06 13:12:55 +01:00
Fredrik Roubert
dc70b5a056 ICU-22520 Move all localeID parsing logic into new ulocimp_getSubtags().
The logic for parsing a localeID string into its constituent subtags is
currently repeated over and over again in each one of the uloc_get*()
functions, so that calling all these functions one after the other in
order to get all the subtags does the parsing all over again from the
beginning for each function call.

In order to avoid having to do this parsing over and over again, a lot
of code instead has its own copy of the parsing logic in order to call
the underlying ulocimp_get*() functions directly for lower runtime cost
at the price of increased code complexity and repetition.

This new ulocimp_getSubtags() function, which writes natively to
icu::ByteSink and has a convenience wrapper to write to icu::CharString,
removes the repeated code from the uloc_get*() functions and makes it
possible to update all code that calls the ulocimp_get*() functions.
2024-02-06 13:12:55 +01:00
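The idea behind commit dc70b5a056, parsing the locale ID once and handing back all subtags together, can be sketched as follows. This is a deliberately simplified illustration: `Subtags` and `parseLocaleID` are hypothetical names, the field-length rules are an assumption made for the sketch, and keywords after `@` are ignored. It is not the logic of ulocimp_getSubtags().

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

struct Subtags {
    std::string language, script, region, variant;
};

// Splits the ID once, then assigns the fields in order.
// Assumed rules (sketch only): a 4-character field after the language is a
// script, a following 2- or 3-character field is a region, and everything
// left over is the variant.
Subtags parseLocaleID(std::string_view id) {
    id = id.substr(0, id.find('@'));  // Keywords are out of scope here.

    std::vector<std::string_view> fields;
    std::size_t start = 0;
    while (true) {
        std::size_t end = id.find_first_of("_-", start);
        if (end == std::string_view::npos) {
            fields.push_back(id.substr(start));
            break;
        }
        fields.push_back(id.substr(start, end - start));
        start = end + 1;
    }

    Subtags result;
    std::size_t i = 0;
    if (i < fields.size()) result.language = fields[i++];
    if (i < fields.size() && fields[i].size() == 4) result.script = fields[i++];
    if (i < fields.size() && (fields[i].size() == 2 || fields[i].size() == 3)) {
        result.region = fields[i++];
    }
    for (; i < fields.size(); ++i) {
        if (!result.variant.empty()) result.variant += '_';
        result.variant += fields[i];
    }
    return result;
}
```

A caller that needs several subtags pays for one pass over the string instead of one pass per subtag.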
Fredrik Roubert
678d5c1273 ICU-22520 Replace use of ulocimp_forLanguageTag() in uloc_getVariant().
Originally added by commit 24055f8585
for ICU-7882, converting any language tag with a BCP-47 extension into a
legacy Unicode locale ID was a simple way to make the existing code keep
working unchanged also with BCP-47 extensions.

But the only thing that uloc_getVariant() needs is being able to find
out where variants end and extensions begin, for which converting the
entire language tag is unnecessary; it's much more straightforward to
instead just check for the -t-, -u- or -x- marker that indicates the
start of a BCP-47 extension.
2024-02-06 13:12:55 +01:00
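The marker check described in commit 678d5c1273 can be sketched in a few lines. `findExtensionStart` is a hypothetical name, and the sketch assumes a lowercase singleton and a tag that does not itself begin with the marker; it is not the code in uloc_getVariant().

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Returns the index of the '-' that starts the first "-t-", "-u-" or "-x-"
// marker, or tag.size() if the tag has no such marker.
std::size_t findExtensionStart(std::string_view tag) {
    for (std::size_t i = 0; i + 2 < tag.size(); ++i) {
        if (tag[i] == '-' && tag[i + 2] == '-') {
            char c = tag[i + 1];
            if (c == 't' || c == 'u' || c == 'x') return i;
        }
    }
    return tag.size();
}
```

Everything before the returned index can then be scanned for variants, with no need to convert the whole tag first.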
Fredrik Roubert
6fa113eaa8 ICU-22651 Refactor U_DEFINE_LOCAL_OPEN_POINTER into a template. 2024-02-05 14:15:15 +01:00
Frank Tang
b8271577b6 ICU-22649 Fix possible leakage by using LocalUResourceBundlePointer 2024-02-02 10:24:21 -08:00
Fredrik Roubert
6562a7df85 ICU-22627 Delete obsolete test case letest/api/ScriptTest. 2024-02-02 15:55:27 +01:00
Frank Tang
9515e82741 ICU-22633 Fix Integer-overflow in icu_75::Calendar::add
See #2805
2024-02-01 13:49:41 -08:00
Fredrik Roubert
ae9cc8cbd1 ICU-22520 Replace char arrays with icu::CharString. 2024-01-30 12:04:53 +01:00
Fredrik Roubert
1b0f5e41c5 ICU-22520 Switch to using CharString for calling uloc_setKeywordValue(). 2024-01-30 12:04:53 +01:00
Fredrik Roubert
340806bf9a ICU-22520 Add a ulocimp_setKeywordValue() that writes to icu::ByteSink. 2024-01-30 12:04:53 +01:00
Frank Tang
e1415d1282 ICU-22635 Avoid integer-overflow for invalid large UChar32 2024-01-29 11:57:12 -08:00
Frank Tang
8f80c62aa2 ICU-22638 Fix cast overflow issue 2024-01-25 12:11:56 -08:00