Update spec tests to current version from message-format-wg
- Update parser for changed name-start grammar rule
- Validate number literals in :number implementation (since parser no longer does this)
- Disallow setting the `select` option of `:number`/`:integer` from a variable
See https://github.com/unicode-org/message-format-wg/pull/1016
As part of this, un-skip tests where the `bad-option` error is
expected, and implement validation of digit size options
(pending PR https://github.com/unicode-org/icu/pull/2973 is intended
to do this more fully)
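
A minimal sketch of the kind of number-literal check that `:number` now has
to perform itself. It assumes the number-literal production from the MF2
spec (optional minus sign, integer part without leading zeros, optional
fraction, optional exponent). `isNumberLiteral` is a hypothetical helper
name, not the function used in ICU4C.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not the ICU4C implementation).
// Returns true if `s` matches the assumed MF2 number-literal grammar:
//   ["-"] ("0" / ([1-9] *DIGIT)) ["." 1*DIGIT] [("e" / "E") ["-" / "+"] 1*DIGIT]
bool isNumberLiteral(std::string_view s) {
    std::size_t i = 0;
    const std::size_t n = s.size();
    auto isDigit = [&](std::size_t k) {
        return k < n && s[k] >= '0' && s[k] <= '9';
    };

    // Optional leading minus sign (a leading '+' is not allowed).
    if (i < n && s[i] == '-') ++i;

    // Integer part: a single '0', or a nonzero digit followed by digits.
    if (!isDigit(i)) return false;
    if (s[i] == '0') {
        ++i;
    } else {
        while (isDigit(i)) ++i;
    }

    // Optional fraction: '.' followed by at least one digit.
    if (i < n && s[i] == '.') {
        ++i;
        if (!isDigit(i)) return false;
        while (isDigit(i)) ++i;
    }

    // Optional exponent: 'e' or 'E', optional sign, at least one digit.
    if (i < n && (s[i] == 'e' || s[i] == 'E')) {
        ++i;
        if (i < n && (s[i] == '-' || s[i] == '+')) ++i;
        if (!isDigit(i)) return false;
        while (isDigit(i)) ++i;
    }

    // Anything left over means the operand is not a number literal.
    return i == n;
}
```

Under these assumptions, `"-12.5e+3"` is accepted, while `"01"`, `"1."`,
`".5"` and `"+1"` are rejected; the function would then report a bad
operand rather than relying on the parser to have caught it.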
This updates the MF2 spec tests to 943479b602 with the following exceptions:
- functions/currency.json and functions/math.json are omitted because these are not yet implemented
- bidi.json will be handled in a future PR
- u-options.json will be handled in a future PR
Changes include:
* `:integer` now returns a value encapsulating the rounded numeric value of the argument, rather than
the value itself.
* Fallbacks are handled according to the current spec.
* Fallback values are not passed into functions.
* Characters inside literal fallbacks are properly escaped.
* The test runner skips null values properly.
* The test runner handles boolean `expErrors` in defaultTestProperties.
* `:string` normalizes its input, and normalizeNFC() has been refactored so it can be called there.
Implement :test:format, :test:select, and :test:function, which are
required by the new `pattern-selection.json` tests.
Change the internal value representation in the formatter in order to
support some of the test cases (binding the results of selectors to a
variable).
Until now, the implementation of the UCollator predicates has been using
UnicodeString and StringPiece as convenient wrappers for converting from
standard C++ data types to ICU4C data types.
But that doesn't work when the client uses ICU4C built without
U_SHOW_CPLUSPLUS_API, so the implementation now performs these
conversions directly instead.
(It's a bit more code, but does just the same thing in the end.)
Returning a const-qualified prvalue doesn't do anything useful, but it does
turn an assignment such as `v = rb.getLocale();` from a move-assignment
into a copy-assignment (because it's forbidden to move-from a const value,
even if it's a const prvalue). Each affected site was diagnosed mechanically
by my fork of Clang. E.g.:
warning: 'const' type qualifier on return type is a bad idea [-Wqual-class-return-type]
391 | const Locale ResourceBundle::getLocale(ULocDataLocaleType type, UErrorCode &status) const
| ^~~~~
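
The effect can be reproduced outside ICU with a small self-contained
sketch. `Tracked` is a hypothetical stand-in for a class such as Locale; it
only records which assignment operator was selected.

```cpp
// Hypothetical stand-in type (not an ICU class): remembers which
// assignment operator ran last.
struct Tracked {
    enum class Op { None, Copy, Move };
    Op lastAssign = Op::None;

    Tracked() = default;
    Tracked(const Tracked&) = default;
    Tracked(Tracked&&) = default;
    Tracked& operator=(const Tracked&) { lastAssign = Op::Copy; return *this; }
    Tracked& operator=(Tracked&&) noexcept { lastAssign = Op::Move; return *this; }
};

const Tracked makeConst() { return Tracked(); }  // const-qualified return type
Tracked makePlain() { return Tracked(); }        // unqualified return type

Tracked::Op assignFromConst() {
    Tracked v;
    // A const prvalue cannot bind to Tracked&&, so overload resolution
    // falls back to the copy-assignment operator.
    v = makeConst();
    return v.lastAssign;
}

Tracked::Op assignFromPlain() {
    Tracked v;
    // A non-const prvalue binds to Tracked&&, so move-assignment is chosen.
    v = makePlain();
    return v.lastAssign;
}
```

Here `assignFromConst()` yields `Op::Copy` and `assignFromPlain()` yields
`Op::Move`, which is the difference that dropping the `const` qualifier
from the return type recovers.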
This partially reverts commit 3527b3d320.
Making LocalPointer header-only, with a different namespace when compiling internally,
turned out to be problematic.
Matching PR #883 in the message-format-wg repo.
Also move spec tests for unsupported statements and expressions into new files
to serve as syntax error tests.
Add MessageFormatter::Builder::setErrorHandlingBehavior() method
and a new enum type MessageFormatter::UMFErrorHandlingBehavior
to denote strict or best-effort behavior.
The reason for adding a single method that takes an enum is to allow
for the possibility of more error handling modes in the future.
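
A rough sketch of the design choice, using stand-in types rather than the
real ICU classes. The names `ErrorHandlingBehavior`, `BestEffort`, `Strict`
and `FormatterBuilder` are hypothetical, and the best-effort default is an
assumption made for illustration.

```cpp
namespace sketch {

// Hypothetical stand-in for the error handling enum. Further modes can be
// appended later without changing the setter's signature, which a bool
// parameter would not allow.
enum class ErrorHandlingBehavior { BestEffort, Strict };

// Hypothetical stand-in for a formatter builder.
class FormatterBuilder {
public:
    // Single setter taking the enum; returns *this so calls can be chained.
    FormatterBuilder& setErrorHandlingBehavior(ErrorHandlingBehavior behavior) {
        behavior_ = behavior;
        return *this;
    }

    ErrorHandlingBehavior errorHandlingBehavior() const { return behavior_; }

private:
    // Assumed default for this sketch only.
    ErrorHandlingBehavior behavior_ = ErrorHandlingBehavior::BestEffort;
};

}  // namespace sketch
```

A caller would opt in to strict behavior with something like
`builder.setErrorHandlingBehavior(sketch::ErrorHandlingBehavior::Strict)`.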
Co-authored-by: Markus Scherer <markus.icu@gmail.com>
This also updates the spec tests from the current version of the MFWG
repository and removes some duplicate tests.
Spec tests now reflect the message-format-wg repo as of
5612f3b050
It also updates both the ICU4C and ICU4J parsers to follow the
current test schema in the conformance repository.
This includes adding code to both parsers to allow `src` to be
either a single string or an array of strings (per
https://github.com/unicode-org/conformance/pull/255 ),
and eliminating `srcs` in tests.
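
As an illustration of the `src` handling, here is a minimal sketch that is
independent of any JSON library. The `Src` alias and `messageSource`
function are hypothetical names, and joining the array elements with no
separator is an assumption about the schema, not something stated above.

```cpp
#include <string>
#include <variant>
#include <vector>

// Hypothetical model of the `src` field after JSON decoding: either one
// string or an array of strings.
using Src = std::variant<std::string, std::vector<std::string>>;

// Returns the message source text. An array is assumed to be concatenated
// in order with no separator.
std::string messageSource(const Src& src) {
    if (const auto* single = std::get_if<std::string>(&src)) {
        return *single;
    }
    std::string joined;
    for (const std::string& part : std::get<std::vector<std::string>>(src)) {
        joined += part;
    }
    return joined;
}
```

With this shape the test runner needs only one code path after decoding,
so a separate `srcs` field is no longer required.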
It also includes other changes to make updated spec tests pass:
ICU4C: Allow trailing whitespace for complex messages, due to spec change
ICU4C: Parse number literals correctly in Number::format
ICU4J: Allow trailing whitespace after complex body, per spec change
ICU4C: Fix bug that was assuming an .input variable can't have a reserved annotation
ICU4C: Fix bug where unsupported '.i' was parsed as an '.input'
ICU4C/ICU4J: Handle markup with space after the initial left curly brace
ICU4C: Check for duplicate variant errors
ICU4C/ICU4J: Handle leading whitespace in complex messages
ICU4J: Treat whitespace after .input keyword as optional
ICU4J: Don't format unannotated number literals as numbers
Windows code commonly uses the _MSC_VER preprocessor macro to detect
that it's being compiled for Windows, so compilers other than the real
MSVC compiler commonly define this macro too, in order to be able to
compile such code.
It's also not possible to use _MSC_VER to determine whether the C++
standard library implementation used is the Microsoft STL.
Clang will however refuse to instantiate a template with a forward
declared type, so the code that currently does this needs to be moved to
after the type has been properly defined. That in turn makes MSVC warn
that those templates aren't instantiated, so those warnings need to be
disabled. But then the disabling of warning C4661 no longer works
(for some unknown reason); this can be resolved by properly
deleting the non-existent operators instead of disabling the warning.
Previously, there were separate overrides for the options and
attributes parsing methods in the parser that were used in different
contexts. (Options can appear in Operator and Markup, while attributes
can appear in Expression and Markup.)
This is a refactoring that eliminates this duplicated code.
To enable it, a builder is added for the internal OptionMap type.
Separately, this patch also explicitly deletes copy constructors
and copy assignment operators for all Builder classes; a bug in an
earlier version of this patch caused me to notice this hadn't been
done. Also explicitly deletes move constructors/assignment operators
with the exception of OptionMap::Builder (OptionMap is non-public,
so that shouldn't cause confusion).
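
The pattern can be sketched with two stand-in classes. `PublicBuilder` and
`InternalMapBuilder` are hypothetical names, not the actual ICU classes;
the first mirrors a Builder with all copy and move operations deleted, the
second mirrors the internal exception that stays movable.

```cpp
#include <type_traits>

// Hypothetical public-facing builder: neither copyable nor movable.
class PublicBuilder {
public:
    PublicBuilder() = default;
    PublicBuilder(const PublicBuilder&) = delete;
    PublicBuilder& operator=(const PublicBuilder&) = delete;
    PublicBuilder(PublicBuilder&&) = delete;
    PublicBuilder& operator=(PublicBuilder&&) = delete;
};

// Hypothetical internal builder: not copyable, but still movable.
class InternalMapBuilder {
public:
    InternalMapBuilder() = default;
    InternalMapBuilder(const InternalMapBuilder&) = delete;
    InternalMapBuilder& operator=(const InternalMapBuilder&) = delete;
    InternalMapBuilder(InternalMapBuilder&&) = default;
    InternalMapBuilder& operator=(InternalMapBuilder&&) = default;
};

// Compile-time checks that the intended operations are (un)available.
static_assert(!std::is_copy_constructible_v<PublicBuilder>);
static_assert(!std::is_move_constructible_v<PublicBuilder>);
static_assert(!std::is_copy_assignable_v<InternalMapBuilder>);
static_assert(std::is_move_constructible_v<InternalMapBuilder>);
```

Spelling the deletions out explicitly turns an accidental copy of a
builder into a compile error instead of a silently duplicated state.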
The implementation was keeping a cache of FormatterFactory
objects so that subsequent calls to the same formatter re-use the
same object.
The problem is that this is unsafe, because
`MFFunctionRegistry::getFormatter()` returns a non-const `FormatterFactory*`;
so if the caller deleted the resulting pointer, the formatter cache
would contain a dangling pointer.
This optimization was added because of an ICU4J test that checked for
the presence of the optimization. However, for separate reasons
(making `adoptFormatter()` actually adopt its argument), this test
was already removed.
The caching could be re-added later if that optimization is needed,
but for now, remove it (also, no tests were checking for its presence).
substitution is using a DecimalFormat and its owning rule also has a modulus substitution. Took out a redundant
call to floor(). Added a hack to allow the caller to change the rounding behavior with setRoundingMode().
Added appropriate unit tests. Added additional documentation of the behavior to the API docs.