This is the Java port of ICU-21043 (for C++).
This PR fixes:
ICU-21043 Erroneous date display in indian calendar of all dates prior to 0001-01-01.
ICU-21044 Hebrew Calendar calculation is incorrect when the year < 1
ICU-21045 Erroneous date display in islamic and islamic-rgsa calendars of all dates prior to 0622-07-18.
ICU-21046 Erroneous date display in islamic-umalqura calendar of all dates prior to -195366-07-23.
The problem in IndianCalendar is
ICU-21043: the Gregorian/Julian conversion is wrong. Switch to using the
calculation functions in the Calendar class.
The problem in HebrewCalendar is
ICU-21044: the use of the built-in / operator is wrong when the year or month can be < 1, because it truncates toward zero instead of rounding down.
The problem in IslamicCalendar is
ICU-21045: the % arithmetic on negative year and month values is wrong.
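The difference between truncating and floored arithmetic on negative values can be sketched as follows. This is only an illustration: the class FloorMathDemo and the helper splitMonths are hypothetical, and the standard Math.floorDiv / Math.floorMod methods are used as stand-ins, which may not be the helpers the actual patch relies on.

```java
class FloorMathDemo {
    // Hypothetical helper (not from the patch): split a zero-based month
    // count into {year, monthOfYear} using floored arithmetic, so that
    // month counts below zero still yield a month in the range 0..11.
    static int[] splitMonths(int months) {
        int year = Math.floorDiv(months, 12);   // rounds toward negative infinity
        int month = Math.floorMod(months, 12);  // result is in 0..11 for a positive divisor
        return new int[] { year, month };
    }

    public static void main(String[] args) {
        // Built-in operators truncate toward zero, which is wrong here.
        System.out.println(-1 / 12);  // prints 0
        System.out.println(-1 % 12);  // prints -1

        // Floored arithmetic places month -1 in the last month of year -1.
        int[] ym = splitMonths(-1);
        System.out.println(ym[0] + " " + ym[1]);  // prints -1 11
    }
}
```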
Also add tests that exhaustively cover 8000 years for every calendar. In quick
mode, only 2.5 years are tested.
Reduce the number of dates tested in quick mode.
The internal API VersionInfo.javaVersion() maps the Java version number to 4 integer fields. Each field must be at most 255. However, recent OpenJDK 8 updates exceed this range.
Luckily, we have only one reference in our code base that checks the Java version. CharsetUTF16 uses maxBytePerChar = 4 for Java 5 and older, and maxBytePerChar = 2 for newer Java versions. Because we no longer support the Java 5 runtime, we don't need this conditional check.
We don't have any other uses of VersionInfo.javaVersion(). Java's version range is not something we can control, so I decided to delete the internal-use-only API completely.
Change the mapping from rule number to boundary position to use a simple array
instead of a linear search lookup map.
Look-ahead rules have a preceding context, a boundary position, and a following context.
In the implementation, when the preceding context matches, the potential boundary
position is saved. Then, if the following context proves to match, the saved boundary is
returned as an actual boundary.
Look-ahead rules are numbered, and the implementation maintains a map from
rule number to the tentative saved boundary position.
In an earlier improvement to the rule builder, the rule numbering was changed from
the original sparse numbering to a contiguous sequence, in anticipation of
changing the mapping from number to position to use a simple array.
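A minimal sketch of an array-backed mapping from rule number to saved boundary position is shown here. The class and method names are hypothetical, and the sketch assumes zero-based, contiguous rule numbers and uses -1 to mean "no position saved"; the real implementation may differ on all of these points.

```java
import java.util.Arrays;

class LookAheadSlots {
    // One slot per look-ahead rule, indexed directly by rule number.
    // Assumption: rule numbers are contiguous and start at 0.
    private final int[] positions;

    LookAheadSlots(int ruleCount) {
        positions = new int[ruleCount];
        Arrays.fill(positions, -1);  // -1 marks "nothing saved yet"
    }

    // Called when a rule's preceding context has matched.
    void save(int ruleNumber, int boundaryPosition) {
        positions[ruleNumber] = boundaryPosition;
    }

    // Called when the following context has matched; constant-time lookup.
    int get(int ruleNumber) {
        return positions[ruleNumber];
    }

    // Clear all tentative positions before starting a new match.
    void reset() {
        Arrays.fill(positions, -1);
    }
}
```

Because the rule number is the array index, both saving and retrieving a tentative boundary are single array accesses, with no search through previously saved entries.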
For identifying text that needs to be handled by a word dictionary for Break Iteration,
change from using a bit in the character category to sorting all dictionary categories
together, and recording the boundary between the non-dictionary and dictionary ranges.
This is internal to the implementation. It does not affect behavior.
It does increase the number of character categories that can be handled using a
compact 8-bit Trie, from 127 to 255.
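The range-based test can be sketched roughly as below. The class name, field name, and constructor are hypothetical and chosen only for illustration; the sketch assumes that dictionary categories occupy the upper end of the category numbering.

```java
class DictCategoryRange {
    // Assumption: categories are sorted so that all dictionary categories
    // come last. Categories below this value are non-dictionary.
    private final int dictCategoriesStart;

    DictCategoryRange(int dictCategoriesStart) {
        this.dictCategoriesStart = dictCategoriesStart;
    }

    // A single comparison replaces testing a flag bit in the category,
    // so no bit of the stored category value is reserved as a flag.
    boolean isDictionaryCategory(int category) {
        return category >= dictCategoriesStart;
    }
}
```

With no bit reserved as a flag, the whole stored value is available for numbering categories, which is what allows more categories to fit in the compact Trie.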
- Check non-lenient rules before calling lenient parsing
- Remove logKnownIssue 9503 from test code
- Adjust TestAllLocales test on ICU4C
- Add lenient checks on ICU4J
Fix the issue identified by Coverity.
The problem was in code handling the mapping from the table build time
representation of a set of status values for an RBBI rule to the corresponding
status data as saved in a binary RBBI rule file.
The problem was benign: the RBBI data built by the incorrect code
would still operate correctly, although it might not match, byte for byte,
that built by ICU4C. (The problem was in Java only.)