115 commits

Author SHA1 Message Date
Fredrik Roubert
0178a07a26 ICU-22793 Clang-Tidy: google-readability-casting
https://releases.llvm.org/17.0.1/tools/clang/tools/extra/docs/clang-tidy/checks/google/readability-casting.html
2024-07-04 22:32:12 +02:00
Frank Tang
e1415d1282 ICU-22635 Avoid integer-overflow for invalid large UChar32 2024-01-29 11:57:12 -08:00
Markus Scherer
d8659b476d ICU-22404 new properties IDS_Unary_Operator, ID_Compat_Math_*, NFKC_SCF 2023-09-16 14:41:51 -07:00
Markus Scherer
b6dcc95d3c ICU-21833 remove redundant void parameter lists
See #2351
2023-03-02 09:31:57 -08:00
Fredrik Roubert
2de88f9d9c ICU-21833 Replace UChar with char16_t in all C++ code. 2023-02-06 19:27:44 +01:00
Fredrik Roubert
030fa1a479 ICU-21148 Consistently use standard lowercase true/false everywhere.
This is the normal standard way in C, C++ as well as Java and there's no
longer any reason for ICU to be different. The various internal macros
providing custom boolean constants can all be deleted and code as well
as documentation can be updated to use lowercase true/false everywhere.
2022-09-07 20:56:33 +02:00
Markus Scherer
fc12cf095c ICU-21279 decompose (NFD/NFKD) UTF-8 with Edits
See #1518
2021-01-07 15:38:16 -08:00
Markus Scherer
524748c6bf ICU-20984 StringPiece & ByteSink overloads for char8_t* 2020-03-16 10:49:21 -07:00
Laurent Stacul
3b58179396 ICU-20972 Fix invalid conversion from const char8_t* to const char* (C++20) 2020-02-20 13:09:18 -08:00
Markus Scherer
d3315d98ef ICU-20783 use C++ covariant return types 2019-08-23 11:45:36 -07:00
Jeff Genovy
33d7868d45 ICU-20351 Warning cleanup changes for ICU4C under MSVC. 2019-01-16 16:43:02 -08:00
Markus Scherer
fe3eb3ed5c ICU-13530 add UCPTrie/CodePointTrie, switch normalization to use it (#48)
* ICU-13530 copy C/C++ files UTrie2 -> UTrie3

X-SVN-Rev: 40754

* ICU-13530 UTrie3 new files copied from UTrie2: rename types/functions/macros

X-SVN-Rev: 40755

* ICU-13530 debug-print building each UTrie2

X-SVN-Rev: 40756

* ICU-13530 remove two-byte-UTF-8 errorValue block; move highValue from end of data array into header; add errorValue to header

X-SVN-Rev: 40762

* ICU-13530 UTrie3 U16_NEXT/PREV: errorValue for unpaired surrogates

X-SVN-Rev: 40763

* ICU-13530 no more separate values for lead surrogate code units

X-SVN-Rev: 40764

* ICU-13530 change from 11:5 trie bits to 10:6 for simpler UTF-8 code

X-SVN-Rev: 40766

* ICU-13530 UTrie2 build UTrie3 as well, print sizes

X-SVN-Rev: 40767

* ICU-13530 debug-print countSame, sumOverlaps, countInitial

X-SVN-Rev: 40768

* ICU-13530 debug-print whether trie is for CanonIterData

X-SVN-Rev: 40769

* ICU-13530 no index-shift for BMP data, no separate index-2 for 2-byte UTF-8; builder changes incomplete

X-SVN-Rev: 40777

* ICU-13530 remove errorValue and highStart from UNewTrie3

X-SVN-Rev: 40778

* ICU-13530 rewrite UTrie3 builder code

X-SVN-Rev: 40783

* ICU-13530 UTrie3 bug fixes

X-SVN-Rev: 40788

* ICU-13530 fully re-inline _UTRIE3_U8_NEXT()

X-SVN-Rev: 40790

* ICU-13530 find most common all-same data block for dataNullBlock and initialValue

X-SVN-Rev: 40792

* ICU-13530 UTrie3 iterator functions take start and return the end of a range, rather than callback call for each range

X-SVN-Rev: 40800

* ICU-13530 mask off unused data value bits before building a UTrie3 with values less than 32 bits wide

X-SVN-Rev: 40803

* ICU-13530 split utrie3builder.h out of utrie3.h

X-SVN-Rev: 40804

* ICU-13530 separate types UTrie3 vs. UTrie3Builder, implement builder as wrapper over C++ class Trie3Builder in .cpp

X-SVN-Rev: 40809

* ICU-13530 function to make a UTrie3Builder from a UTrie3

X-SVN-Rev: 40810

* ICU-13530 debug-print some data; some cleanup

X-SVN-Rev: 40865

* ICU-13530 BMP 10:6 but supplementary 10:6:4

X-SVN-Rev: 40984

* ICU-13530 move errorValue & highValue to the end of the data table, minimal padding to 4 bytes

X-SVN-Rev: 41011

* ICU-13530 index-1 table gap of index-2 null blocks

X-SVN-Rev: 41018

* ICU-13530 test with more than 128k compacted data

X-SVN-Rev: 41034

* ICU-13530 supplementary bits 11:5:4 saves a little space

X-SVN-Rev: 41039

* ICU-13530 supplementary bits 6:5:5:4 instead of gap: about same size but simpler

X-SVN-Rev: 41050

* ICU-13530 remove unnecessary utrie3_clone(built trie)

X-SVN-Rev: 41058

* ICU-13530 remove unnecessary UTrie3StringIterator

X-SVN-Rev: 41059

* ICU-13530 back to UTRIE3_GET...() macros *returning* data values

X-SVN-Rev: 41060

* ICU-13530 fast vs. small

X-SVN-Rev: 41066

* ICU-13530 always load NFC data, add simple normalization performance test

X-SVN-Rev: 41110

* ICU-13530 change normalization main trie to UTrie3 with special values for lead surrogates; forbid non-inert surrogate code *points* because unable to store values different from code *units*; runtime code work around that for code point lookup and iteration; adjust UTS 46 for normalization no longer mapping unpaired surrogates to U+FFFD

X-SVN-Rev: 41122

* ICU-13530 simplenormperf bug fix and NFC base line

X-SVN-Rev: 41126

* ICU-13530 move normalization getRange skipping lead surrogates to API getRangeSkipLead()

X-SVN-Rev: 41182

* ICU-13530 switch CanonIterData and gennorm2 Norms to UTrie3

X-SVN-Rev: 41183

* ICU-13530 remove unused overwrite parameter from setRange()

X-SVN-Rev: 41184

* ICU-13530 getRange skip lead -> fixed surrogates

X-SVN-Rev: 41219

* ICU-13530 minor cleanup

X-SVN-Rev: 41221

* ICU-13530 UTS 46 code map unpaired surrogates to U+FFFD before normalization

X-SVN-Rev: 41224

* ICU-13530 minor internal-docs cleanup

X-SVN-Rev: 41225

* ICU-13530 rename UTrie3 to UCPTrie, and other name changes

X-SVN-Rev: 41226

* ICU-13530 add 8-bit data option; add type-any & valueBits-any for fromBinary(); macros consistently source type then data width

X-SVN-Rev: 41234

* ICU-13530 scrub the API docs for the proposal

X-SVN-Rev: 41319

* ICU-13530 tag internal definitions as such, or move them to an internal header

X-SVN-Rev: 41320

* ICU-13530 Java API skeleton

X-SVN-Rev: 41326

* ICU-13530 API feedback: ValueWidth, MutableCodePointTrie, base CodePointMap, ...

X-SVN-Rev: 41382

* ICU-13530 add UCPTrie valueWidth field and padding, and combine data pointers into a union

X-SVN-Rev: 41408

* ICU-13530 switch some macros to using dataAccess parameter: separate index vs. data lookups, no macro variant for each value width

X-SVN-Rev: 41409

* ICU-13530 StringIterator is no longer a java.util.Iterator (bad fit)

X-SVN-Rev: 41455

* ICU-13530 CodePointTrie.java code complete

X-SVN-Rev: 41518

* ICU-13530 finish Java port incl test; keep C++ parallel

* ICU-13530 adjust API for feedback: rename HandleValue to FilterValue, change getRange+getRangeFixedSurr(bool allSurr) to enum RangeOption+getRange(enum option); change remaining C macros to use dataAccess for 16/32/8-bit value widths; fix/clarify some API docs

* ICU-13530 add javadoc

* ICU-13530 document UCPTrie binary data format

* ICU-13530 update .nrm formatVersion 3->4, document change in surrogate handling with new trie

* ICU-13530 re-hardcode NFC data

* move trie swapper code into new file; add new files to Windows project files; turn off trie debugging

* ICU-13530 minor cleanup

* ICU-13530 test more range starts; fix a C test leak

* ICU-13530 regenerate Java data from scratch

* ICU-13530 review feedback changes: API docs typos, more @internal, C++11 field initializers, fix potential leak in MutableCodePointTrie::fromUCPTrie()

* ICU-13530 rename interface FilterValue to ValueFilter
2018-09-27 14:27:38 -07:00
Shane Carr
1fe1497d88 ICU-13661 Renaming logIfFailureAndReset to errIfFailureAndReset.
X-SVN-Rev: 41362
2018-05-08 23:55:47 +00:00
Markus Scherer
2f87cf4c46 ICU-10524 normalization one-way mapping with trailing ccc>1 has no compose-boundary-after
X-SVN-Rev: 40355
2017-08-25 22:46:12 +00:00
Markus Scherer
9a3a03c417 ICU-13270 icu::Edits add numberOfChanges(); Edits::Iterator add findDestinationIndex(), destinationIndexFromSourceIndex(), sourceIndexFromDestinationIndex()
X-SVN-Rev: 40286
2017-07-24 22:43:53 +00:00
Markus Scherer
aa6d5e3e76 ICU-13271 add Normalizer2::isNormalizedUTF8()
X-SVN-Rev: 40280
2017-07-20 22:08:30 +00:00
Markus Scherer
09b77193dc ICU-13269 add StringByteSink(dest, initialAppendCapacity) constructor
X-SVN-Rev: 40277
2017-07-20 19:56:45 +00:00
Markus Scherer
e6748afd82 ICU-13197 improved normalization data structure and code; .nrm formatVersion 3; merged from branches/markus/normv3 except for cherry-picks from trunk to there
X-SVN-Rev: 40265
2017-07-14 22:38:40 +00:00
Markus Scherer
06a03303cb ICU-13234 collect string & character options bits in new stringoptions.h
X-SVN-Rev: 40162
2017-06-08 20:35:40 +00:00
Markus Scherer
3975adb564 ICU-13234 rename UCASEMAP_OMIT_UNCHANGED_TEXT to U_OMIT_UNCHANGED_TEXT
X-SVN-Rev: 40161
2017-06-08 19:36:34 +00:00
Markus Scherer
f3b00dc8ff ICU-13197 test Normalizer2::normalizeUTF8() with Edits
X-SVN-Rev: 40148
2017-06-02 21:19:33 +00:00
Andy Heninger
242e02c388 ICU-12764 icu4c utf-8 source files, update Copyright notices.
X-SVN-Rev: 39583
2017-01-20 00:20:31 +00:00
Michael Ow
61607c2773 ICU-12564 Update copyright notice in trunk
X-SVN-Rev: 38848
2016-06-15 18:58:17 +00:00
Yoshito Umaoka
00ca13e126 ICU-12564 Reverted r38761 and r38762, because we want to prepend the Unicode copyright for existing source files, instead of replacing copyright comments.
X-SVN-Rev: 38776
2016-05-31 21:45:07 +00:00
Michael Ow
c9f199a30f ICU-12564 Update copyright notice in ICU4C
X-SVN-Rev: 38761
2016-05-26 22:32:17 +00:00
Fredrik Roubert
7f4b8d106b ICU-12012 Replace all sizeof p / sizeof *p with UPRV_LENGTHOF().
R=markus.icu@gmail.com

Review URL: https://codereview.appspot.com/285520043 .

X-SVN-Rev: 38337
2016-02-23 10:40:09 +00:00
Markus Scherer
0f78abc7ee ICU-9644 re-hardcode some normalization data: nfc.nrm
X-SVN-Rev: 36384
2014-09-08 03:05:56 +00:00
Steven R. Loomis
7594250cc5 ICU-7653 changed uprv_lengthof to UPRV_LENGTHOF, also added apidoc
X-SVN-Rev: 36275
2014-08-28 22:13:45 +00:00
Tom Zhang
ee1f29b584 ICU-7653 move LENGTHOF(array) to common, internal header
X-SVN-Rev: 36265
2014-08-28 14:55:34 +00:00
Markus Scherer
3a86b119b0 ICU-8246 add Normalizer2::getNFCInstance(), getNFKDInstance(), ...
X-SVN-Rev: 30994
2011-12-01 00:43:35 +00:00
Markus Scherer
bed105857f ICU-8804 Normalizer2::composePair(a, b) with separation of minYesNo extraData into combines-forward vs. not
X-SVN-Rev: 30982
2011-11-27 20:29:38 +00:00
Markus Scherer
03748b07f1 ICU-8804 Normalizer2::getRawDecomposition(c) with added data in .nrm formatVersion 2
X-SVN-Rev: 30980
2011-11-27 04:44:13 +00:00
Markus Scherer
e31ce99b84 ICU-8575 option for not including utf headers by default; replace uses of deprecated utf_old.h macros
X-SVN-Rev: 30430
2011-07-27 05:53:56 +00:00
Markus Scherer
4abbf7161a ICU-8606 add Normalizer2.getCombiningClass(c)
X-SVN-Rev: 30263
2011-06-30 23:22:17 +00:00
Michael Ow
b21e2734dd ICU-8146 Add check for data loading failure in cintltst and intltest.
X-SVN-Rev: 29025
2010-11-11 05:37:40 +00:00
Michael Ow
2333b126c1 ICU-6845 Improve the code coverage in ICU4C.
X-SVN-Rev: 28790
2010-10-12 16:38:38 +00:00
Andy Heninger
4997366db1 ICU-7987 Fix memory leak in intltest normalize
X-SVN-Rev: 28704
2010-09-27 18:47:37 +00:00
Markus Scherer
b5e1330176 ICU-7264 merge Unicode 6.0 into trunk from branches/markus/uni60 -r 28339:28657
X-SVN-Rev: 28661
2010-09-21 00:12:49 +00:00
Michael Ow
0e8df2ba58 ICU-7784 Set some test failure errors as data loading error appropriate.
X-SVN-Rev: 28305
2010-07-14 16:09:03 +00:00
Markus Scherer
9cbd929fca ICU-7736 test and fix FilteredNormalizer2::getDecomposition()
X-SVN-Rev: 28163
2010-06-09 06:08:43 +00:00
Markus Scherer
82160e104c ICU-7736 add Normalizer2::getDecomposition(c)
X-SVN-Rev: 28161
2010-06-08 23:32:11 +00:00
Markus Scherer
0acda636e4 ICU-7722 build canonical-iterator data from nfc.nrm (port Java code to C++)
X-SVN-Rev: 28117
2010-06-01 06:10:26 +00:00
Steven R. Loomis
a1ea70071b ICU-7708 compiler warnings for 4.5.1 (batch 1)
X-SVN-Rev: 28103
2010-05-25 22:17:12 +00:00
Michael Ow
43ab52b074 ICU-7650 Fix uconfig test errors in i18n library and test codes.
X-SVN-Rev: 28037
2010-05-07 07:28:47 +00:00
Michael Ow
0763686c6c ICU-7370 Log data errors to ensure that intltest and cintltst passes without data.
X-SVN-Rev: 27649
2010-02-24 16:17:03 +00:00
Markus Scherer
049b68b40b ICU-7273 simplify caching code and add custom FCC test
X-SVN-Rev: 27593
2010-02-18 18:33:00 +00:00
Markus Scherer
81234fecdb ICU-7273 add loading of custom data, with caching, test data and test code
X-SVN-Rev: 27578
2010-02-16 23:43:22 +00:00
Markus Scherer
8ddbd1394c ICU-7273 merge in Normalizer2 API & code, and ICU-5785 UnicodeSet::span(UnicodeString) and ICU-7296 tempSubString()/retainBetween(); merge -r 26971:27150 branches/markus/norm2
X-SVN-Rev: 27155
2010-01-06 23:50:03 +00:00
Markus Scherer
66b63f9c48 ICU-7084 Unicode 5.2: merge -r 26464:26890 branches/markus/uni52 into trunk, and a little cleanup (C++)
X-SVN-Rev: 26898
2009-11-13 19:25:21 +00:00
Andy Heninger
71bf003171 ICU-5696 Unicode 5.1 Update
X-SVN-Rev: 23761
2008-04-04 22:47:43 +00:00