Commit graph

170 commits

Author SHA1 Message Date
Markus Scherer
d8659b476d ICU-22404 new properties IDS_Unary_Operator, ID_Compat_Math_*, NFKC_SCF 2023-09-16 14:41:51 -07:00
Elango Cheran
2e45e6ec0e ICU-22404 Unicode 15.1 beta data files & API constants
See #2492

Co-authored-by: Andy Heninger <andy.heninger@gmail.com>
Co-authored-by: Robin Leroy <egg.robin.leroy@gmail.com>
2023-07-13 19:26:14 -07:00
Markus Scherer
b6dcc95d3c ICU-21833 remove redundant void parameter lists
See #2351
2023-03-02 09:31:57 -08:00
Fredrik Roubert
a3cbe80909 ICU-21833 Replace U_OVERRIDE with override everywhere. 2023-02-22 18:28:07 +01:00
Fredrik Roubert
2de88f9d9c ICU-21833 Replace UChar with char16_t in all C++ code. 2023-02-06 19:27:44 +01:00
Fredrik Roubert
2e0d30cfcf ICU-21833 Replace NULL with nullptr in all C++ code. 2023-02-03 20:20:38 +01:00
Fredrik Roubert
030fa1a479 ICU-21148 Consistently use standard lowercase true/false everywhere.
This is the normal standard way in C, C++ as well as Java and there's no
longer any reason for ICU to be different. The various internal macros
providing custom boolean constants can all be deleted and code as well
as documentation can be updated to use lowercase true/false everywhere.
2022-09-07 20:56:33 +02:00
Henri Sivonen
3cefbd55c7 ICU-22028 Export collation and normalization data for ICU4X 2022-06-28 08:37:32 -07:00
Markus Scherer
c5d0fff5a0 ICU-21980 parse multiple @missing lines 2022-06-02 21:29:24 +00:00
Markus Scherer
e1be738ccb ICU-21980 Unicode 15 pre-beta data files, new prop values 2022-05-25 18:23:11 +00:00
gnrunge
f37a5e0090 ICU-21796 Rename bazel build files from BUILD to BUILD.bazel. This can
prevent conflicts when ICU users have their own BUILD files already.
2021-12-16 06:55:09 -08:00
Markus Scherer
75ac80bd68 ICU-21580 change site.icu-project.org to icu.unicode.org etc 2021-10-21 15:54:42 -07:00
Markus Scherer
f9beb616a8 ICU-21652 add emoji properties of strings
- 7 new properties: API constants & property names
- u_stringHasBinaryProperty(s, property) & UCharacter.hasBinaryProperty(s, property)
- two additional source data files
- new genprops part for writing new binary data file uemoji.icu
- data for existing emoji properties moved from uprops.icu (hardcoded in C++) to uemoji.icu (always loaded)
- new EmojiProps implementation
2021-09-08 12:15:50 -07:00
Markus Scherer
41aa7159ea ICU-21635 Unicode 14 data files 20210820, line break LB30b.2
See #1807
2021-08-23 22:11:49 +00:00
Markus Scherer
d4c92ebcfc ICU-21635 Unicode 14 beta 2021-06-21 22:26:15 +00:00
Erik Torres
3f043c7693 ICU-21555 Fix typos from G to L
See #1737
2021-06-07 16:09:09 -07:00
Elango Cheran
227c729b0e ICU-21117 Use Bazel to automate generation of Unicode data files 2021-03-24 10:39:38 -07:00
gnrunge
d0096a84e7 ICU-21243 Migrates preparseucd.py script to Python 3. Python 3 changes
the order of elements in an iterator from Python 2 with the result
that the generated data in ppucd.txt changes with respect to the selection
of a property value used to compact the output when there is a
property with equal count of the two most frequent values. This
change doesn't change the validity of the generated ppucd.txt file.

While at it, also migrated script parsescriptmetadata.py to Python 3.
2020-12-01 13:02:52 -08:00
Markus Scherer
f62693aa02 ICU-13416 change Armenian (hy) uppercase/titlecase of և ligature ech-yiwn 2020-08-30 18:19:10 -07:00
Markus Scherer
a7e378d587 ICU-20893 Unicode 13 beta
See PR #915, see changes.txt
- Unicode 13 beta data as of 2019-nov-21
- uprops.icu format version 7.7 with more bits for Script/Script_Extensions
- more bits in spoof checker ScriptSet
- root line break rules adjusted for UAX 14 changes, from Andy
- line break tailorings not yet in sync with root
2019-11-21 17:35:53 -08:00
Markus Scherer
0565894534 ICU-20497 Unicode 12.1 2019-04-04 10:23:24 -07:00
Markus Scherer
d2e3a8847d ICU-20111 move text layout properties data into a new ulayout.icu data file 2019-02-14 08:30:57 -08:00
Markus Scherer
ea7c030961 ICU-20203 update ICU to Unicode 12 beta
- data as of 2018-nov-26
- API constants for new blocks & scripts
- sync RBBIMonkeyTest.java test data with C++
2018-11-28 23:13:07 +01:00
Markus Scherer
d2ec8987a7 ICU-8966 ICU-12850 add API/data/code for text layout properties InPC, InSC, vo (#92)
ICU-8966: Indic_Positional_Category & Indic_Syllabic_Category

ICU-12850: Vertical_Orientation
2018-09-27 14:27:39 -07:00
Fredrik Roubert
12e2a72747 ICU-20062 Set the Python -B flag to inhibit the writing of .pyc files.
This will prevent littering the source tree with spurious .pyc files.
The potential faster execution when re-running a script that has an
up-to-date .pyc file is negligible.
2018-09-27 14:27:38 -07:00
Markus Scherer
ebca759ea1 ICU-13630 Unicode 11 update from near-final data 20180521
X-SVN-Rev: 41426
2018-05-22 01:56:20 +00:00
Markus Scherer
a4e66ded6d ICU-13630 switch from IdnaTest.txt to IdnaTestV2.txt new in Unicode 11 see Unicode PRI 375
X-SVN-Rev: 41294
2018-04-30 03:17:11 +00:00
Markus Scherer
03303a6cb6 ICU-13630 Unicode 11 beta data apr02 (security apr03), fix ICU4C tests except RBBI
X-SVN-Rev: 41191
2018-04-03 23:09:49 +00:00
Markus Scherer
af6a771267 ICU-13630 implement, test, use emoji property Extended_Pictographic
X-SVN-Rev: 41094
2018-03-12 05:53:02 +00:00
Markus Scherer
b3aec18a3c ICU-13630 ucase.icu formatVersion 4: more compressible exceptions, and more room for future exceptions growth
X-SVN-Rev: 41093
2018-03-12 00:15:40 +00:00
Markus Scherer
1752b5c8c9 ICU-13630 Unicode 11 beta data mar06, API constants for new property values
X-SVN-Rev: 41092
2018-03-09 23:53:02 +00:00
Markus Scherer
cf4cb10c3d ICU-13462 fix Script_Extensions for 5 characters: data generator needs to revert them from block scx to sc (merged from maint-60 r40667)
X-SVN-Rev: 40699
2017-12-05 20:53:14 +00:00
Yoshito Umaoka
1870215131 ICU-13358 Fixed cpyscan problems. Enhanced cpyscan.pl to use online version of cpyskip.txt by default. Added the new Unicode copyright comment in many tools files.
X-SVN-Rev: 40527
2017-10-03 02:32:50 +00:00
Markus Scherer
acf2b4cc82 ICU-13186 stop prepending UTF-8 BOM to some Unicode files
X-SVN-Rev: 40149
2017-06-02 22:52:19 +00:00
Markus Scherer
b2ead3e2e1 ICU-8130 UTS 46 conformance test using Unicode IdnaTest.txt
X-SVN-Rev: 40130
2017-05-23 04:44:58 +00:00
Markus Scherer
20bee936b1 ICU-12985 ppucd.txt more readable unassigned ranges; block compaction by size savings not value plurality reduces clutter
X-SVN-Rev: 40096
2017-05-02 22:53:28 +00:00
Markus Scherer
761c994436 ICU-12985 pre-parse VerticalOrientation.txt
X-SVN-Rev: 40086
2017-04-28 20:29:22 +00:00
Markus Scherer
eb57bf7c90 ICU-12985 implement the binary Prepended_Concatenation_Mark property
X-SVN-Rev: 40084
2017-04-27 21:11:01 +00:00
Markus Scherer
6ce7f348a3 ICU-12985 implement the binary Emoji_Component property for emoji 5
X-SVN-Rev: 40082
2017-04-26 23:58:36 +00:00
Markus Scherer
edce2be62c ICU-12985 Unicode 10 data 20170418, new property values, adjust tools & tests
X-SVN-Rev: 40079
2017-04-26 21:17:13 +00:00
Markus Scherer
1982037316 ICU-12900 change ppucd.txt for copyright scanner patterns
X-SVN-Rev: 39921
2017-03-23 17:30:41 +00:00
Markus Scherer
466a569c58 ICU-12900 mostly still Unicode 9.0 but Unicode 10 beta (20170322) segmentation & bidi data and draft emoji 5.0 (also 20170322)
X-SVN-Rev: 39915
2017-03-23 02:14:00 +00:00
Markus Scherer
798f5235dd ICU-12526 genuca: add new script sample characters, more readable error output
X-SVN-Rev: 38716
2016-05-06 23:19:36 +00:00
Markus Scherer
8d3a176d4f ICU-12526 ignore inline comments in script metadata
X-SVN-Rev: 38709
2016-05-05 23:53:32 +00:00
Markus Scherer
3e5578f3bf ICU-12526 uprops.icu formatVersion 7.3: support new fraction numeric values like 3/80; ppucd.txt mostly no block compression for String/Misc properties; minor bug fixes
X-SVN-Rev: 38706
2016-05-05 22:51:18 +00:00
Markus Scherer
dbebd188e7 ICU-12526 initial Unicode 9 data
X-SVN-Rev: 38698
2016-05-04 23:54:37 +00:00
Markus Scherer
e70b98d3f6 ICU-11764 8 new script codes for Unicode 9 & CLDR 29
X-SVN-Rev: 38607
2016-04-08 22:23:14 +00:00
Markus Scherer
0390f4c86c ICU-11802 add 4 Emoji properties from emoji-data.txt 2.0
X-SVN-Rev: 38182
2016-01-21 04:34:33 +00:00
Markus Scherer
f39f59f3a9 ICU-11574 genuca new script sample characters
X-SVN-Rev: 37381
2015-04-22 23:07:33 +00:00
Markus Scherer
f917158366 ICU-11574 fix Cherokee case folding data: add scf->self mappings for characters that do not have scf (map to self) but have slc (ICU code: scf falls back to slc)
X-SVN-Rev: 37374
2015-04-21 23:55:18 +00:00