56 commits

Author SHA1 Message Date
Fredrik Roubert
0178a07a26 ICU-22793 Clang-Tidy: google-readability-casting
https://releases.llvm.org/17.0.1/tools/clang/tools/extra/docs/clang-tidy/checks/google/readability-casting.html
2024-07-04 22:32:12 +02:00
Markus Scherer
94ef2757a3 ICU-22707 gennorm2 & C++ norm2impl support MaybeNo
- .nrm formatVersion 5
- updated data format doc & design doc
2024-04-29 17:00:55 -07:00
Frank Tang
e1415d1282 ICU-22635 Avoid integer-overflow for invalid large UChar32
2024-01-29 11:57:12 -08:00
Frank Yung-Fong Tang
80414a247b ICU-22224 Enable UBSAN and fix breakage
See #2324
2023-02-27 17:31:49 -08:00
Fredrik Roubert
2de88f9d9c ICU-21833 Replace UChar with char16_t in all C++ code.
2023-02-06 19:27:44 +01:00
Fredrik Roubert
2e0d30cfcf ICU-21833 Replace NULL with nullptr in all C++ code.
2023-02-03 20:20:38 +01:00
Fredrik Roubert
030fa1a479 ICU-21148 Consistently use standard lowercase true/false everywhere.
This is the normal standard way in C, C++ as well as Java and there's no
longer any reason for ICU to be different. The various internal macros
providing custom boolean constants can all be deleted and code as well
as documentation can be updated to use lowercase true/false everywhere.
2022-09-07 20:56:33 +02:00
Andy Heninger
fa30c0eeb4 ICU-21763 UVector cleanup, continued.
Revise uses of UVector in the next batch of files to better handle memory
allocation failures.  This is one of an ongoing series of commits to address
similar problems with UVector usage throughout ICU.

The changes primarily involve switching uses of UVector::addElementX() to the
new adoptElement() or addElement() functions, as appropriate, and using
LocalPointers for tracking memory ownership.
2021-11-30 09:12:16 -08:00
Peter Edberg
e69f337f3c ICU-21669 UPRV_UNREACHABLE > UPRV_UNREACHABLE_EXIT/ASSERT, update usages
2021-09-14 15:22:52 -07:00
Andy Heninger
c26aebe802 ICU-21662 Rename UVector::addElement().
This is the first step towards improving the error handling and out-of-memory
behavior of UVector::addElement(). A followup PR will add back a new addElement()
with corrected error handling, then additional followups will switch call sites
from the original (renamed) function to the new addElement().

This commit includes no logic or behavior changes; it only renames the existing functions.
2021-07-28 15:36:50 -07:00
Markus Scherer
fc12cf095c ICU-21279 decompose (NFD/NFKD) UTF-8 with Edits
See #1518
2021-01-07 15:38:16 -08:00
Jeff Genovy
afaff40164 ICU-20907 Disable optimization on Windows when building for ARM64 with Visual Studio versions below 16.4.
2019-11-27 15:35:58 -08:00
Jeff Genovy
5c8960e59e ICU-20074 Revise UPRV_UNREACHABLE macro to always call abort().
Moved the macro from platform.h to uassert.h.
Removed any "unreachable" code that previously occurred after the UPRV_UNREACHABLE macro is used.
Changes based on review from Andy.

Co-authored-by: Daniel Ju <daju@microsoft.com>
2019-01-24 18:50:04 -08:00
Daniel Ju
7453181fff ICU-20074 Define UPRV_UNREACHABLE macro for unreachable code
Replaced occurrences of U_ASSERT(FALSE) with new UPRV_UNREACHABLE macro.
2019-01-14 14:16:26 -08:00
Markus Scherer
82f0f480d4 ICU-20086 C++ sets & maps for Unicode properties (#93)
also create ucpmap.h from renamed parts of ucptrie.h
2018-09-27 14:27:39 -07:00
Markus Scherer
fe3eb3ed5c ICU-13530 add UCPTrie/CodePointTrie, switch normalization to use it (#48)
* ICU-13530 copy C/C++ files UTrie2 -> UTrie3

X-SVN-Rev: 40754

* ICU-13530 UTrie3 new files copied from UTrie2: rename types/functions/macros

X-SVN-Rev: 40755

* ICU-13530 debug-print building each UTrie2

X-SVN-Rev: 40756

* ICU-13530 remove two-byte-UTF-8 errorValue block; move highValue from end of data array into header; add errorValue to header

X-SVN-Rev: 40762

* ICU-13530 UTrie3 U16_NEXT/PREV: errorValue for unpaired surrogates

X-SVN-Rev: 40763

* ICU-13530 no more separate values for lead surrogate code units

X-SVN-Rev: 40764

* ICU-13530 change from 11:5 trie bits to 10:6 for simpler UTF-8 code

X-SVN-Rev: 40766

* ICU-13530 UTrie2 build UTrie3 as well, print sizes

X-SVN-Rev: 40767

* ICU-13530 debug-print countSame, sumOverlaps, countInitial

X-SVN-Rev: 40768

* ICU-13530 debug-print whether trie is for CanonIterData

X-SVN-Rev: 40769

* ICU-13530 no index-shift for BMP data, no separate index-2 for 2-byte UTF-8; builder changes incomplete

X-SVN-Rev: 40777

* ICU-13530 remove errorValue and highStart from UNewTrie3

X-SVN-Rev: 40778

* ICU-13530 rewrite UTrie3 builder code

X-SVN-Rev: 40783

* ICU-13530 UTrie3 bug fixes

X-SVN-Rev: 40788

* ICU-13530 fully re-inline _UTRIE3_U8_NEXT()

X-SVN-Rev: 40790

* ICU-13530 find most common all-same data block for dataNullBlock and initialValue

X-SVN-Rev: 40792

* ICU-13530 UTrie3 iterator functions take start and return the end of a range, rather than callback call for each range

X-SVN-Rev: 40800

* ICU-13530 mask off unused data value bits before building a UTrie3 with values less than 32 bits wide

X-SVN-Rev: 40803

* ICU-13530 split utrie3builder.h out of utrie3.h

X-SVN-Rev: 40804

* ICU-13530 separate types UTrie3 vs. UTrie3Builder, implement builder as wrapper over C++ class Trie3Builder in .cpp

X-SVN-Rev: 40809

* ICU-13530 function to make a UTrie3Builder from a UTrie3

X-SVN-Rev: 40810

* ICU-13530 debug-print some data; some cleanup

X-SVN-Rev: 40865

* ICU-13530 BMP 10:6 but supplementary 10:6:4

X-SVN-Rev: 40984

* ICU-13530 move errorValue & highValue to the end of the data table, minimal padding to 4 bytes

X-SVN-Rev: 41011

* ICU-13530 index-1 table gap of index-2 null blocks

X-SVN-Rev: 41018

* ICU-13530 test with more than 128k compacted data

X-SVN-Rev: 41034

* ICU-13530 supplementary bits 11:5:4 saves a little space

X-SVN-Rev: 41039

* ICU-13530 supplementary bits 6:5:5:4 instead of gap: about same size but simpler

X-SVN-Rev: 41050

* ICU-13530 remove unnecessary utrie3_clone(built trie)

X-SVN-Rev: 41058

* ICU-13530 remove unnecessary UTrie3StringIterator

X-SVN-Rev: 41059

* ICU-13530 back to UTRIE3_GET...() macros *returning* data values

X-SVN-Rev: 41060

* ICU-13530 fast vs. small

X-SVN-Rev: 41066

* ICU-13530 always load NFC data, add simple normalization performance test

X-SVN-Rev: 41110

* ICU-13530 change normalization main trie to UTrie3 with special values for lead surrogates; forbid non-inert surrogate code *points* because unable to store values different from code *units*; runtime code work around that for code point lookup and iteration; adjust UTS 46 for normalization no longer mapping unpaired surrogates to U+FFFD

X-SVN-Rev: 41122

* ICU-13530 simplenormperf bug fix and NFC base line

X-SVN-Rev: 41126

* ICU-13530 move normalization getRange skipping lead surrogates to API getRangeSkipLead()

X-SVN-Rev: 41182

* ICU-13530 switch CanonIterData and gennorm2 Norms to UTrie3

X-SVN-Rev: 41183

* ICU-13530 remove unused overwrite parameter from setRange()

X-SVN-Rev: 41184

* ICU-13530 getRange skip lead -> fixed surrogates

X-SVN-Rev: 41219

* ICU-13530 minor cleanup

X-SVN-Rev: 41221

* ICU-13530 UTS 46 code map unpaired surrogates to U+FFFD before normalization

X-SVN-Rev: 41224

* ICU-13530 minor internal-docs cleanup

X-SVN-Rev: 41225

* ICU-13530 rename UTrie3 to UCPTrie, and other name changes

X-SVN-Rev: 41226

* ICU-13530 add 8-bit data option; add type-any & valueBits-any for fromBinary(); macros consistently source type then data width

X-SVN-Rev: 41234

* ICU-13530 scrub the API docs for the proposal

X-SVN-Rev: 41319

* ICU-13530 tag internal definitions as such, or move them to an internal header

X-SVN-Rev: 41320

* ICU-13530 Java API skeleton

X-SVN-Rev: 41326

* ICU-13530 API feedback: ValueWidth, MutableCodePointTrie, base CodePointMap, ...

X-SVN-Rev: 41382

* ICU-13530 add UCPTrie valueWidth field and padding, and combine data pointers into a union

X-SVN-Rev: 41408

* ICU-13530 switch some macros to using dataAccess parameter: separate index vs. data lookups, no macro variant for each value width

X-SVN-Rev: 41409

* ICU-13530 StringIterator is no longer a java.util.Iterator (bad fit)

X-SVN-Rev: 41455

* ICU-13530 CodePointTrie.java code complete

X-SVN-Rev: 41518

* ICU-13530 finish Java port incl test; keep C++ parallel

* ICU-13530 adjust API for feedback: rename HandleValue to FilterValue, change getRange+getRangeFixedSurr(bool allSurr) to enum RangeOption+getRange(enum option); change remaining C macros to use dataAccess for 16/32/8-bit value widths; fix/clarify some API docs

* ICU-13530 add javadoc

* ICU-13530 document UCPTrie binary data format

* ICU-13530 update .nrm formatVersion 3->4, document change in surrogate handling with new trie

* ICU-13530 re-hardcode NFC data

* move trie swapper code into new file; add new files to Windows project files; turn off trie debugging

* ICU-13530 minor cleanup

* ICU-13530 test more range starts; fix a C test leak

* ICU-13530 regenerate Java data from scratch

* ICU-13530 review feedback changes: API docs typos, more @internal, C++11 field initializers, fix potential leak in MutableCodePointTrie::fromUCPTrie()

* ICU-13530 rename interface FilterValue to ValueFilter
2018-09-27 14:27:38 -07:00
Daniel Ju
b13c951348 ICU-20043 ICU-13214 ICU-13764 MSVC W3 and W4 warning cleanup (#53)
Cleaned up all of the MSVC W3 warnings and most of the W4 warnings in the common and i18n projects.
2018-09-27 14:27:38 -07:00
Markus Scherer
68ef77118b ICU-13203 CaseMap UTF-8 add StringPiece->ByteSink overloads; change implementation to that and change array->array versions into wrappers
X-SVN-Rev: 40425
2017-09-18 21:45:11 +00:00
Markus Scherer
e6748afd82 ICU-13197 improved normalization data structure and code; .nrm formatVersion 3; merged from branches/markus/normv3 except for cherry-picks from trunk to there
X-SVN-Rev: 40265
2017-07-14 22:38:40 +00:00
Markus Scherer
06a03303cb ICU-13234 collect string & character options bits in new stringoptions.h
X-SVN-Rev: 40162
2017-06-08 20:35:40 +00:00
Markus Scherer
3975adb564 ICU-13234 rename UCASEMAP_OMIT_UNCHANGED_TEXT to U_OMIT_UNCHANGED_TEXT
X-SVN-Rev: 40161
2017-06-08 19:36:34 +00:00
Markus Scherer
e05c15a02c ICU-13197 fix indexesLength check while loading data, more readable duplicate elimination of noNo mappings
X-SVN-Rev: 40157
2017-06-07 18:22:44 +00:00
Markus Scherer
8dcca5dc76 ICU-13197 Normalizer2::normalizeUTF8(StringPiece->ByteSink/Edits) compose=direct UTF-8, else via UTF-16/no edits
X-SVN-Rev: 40147
2017-05-31 18:15:45 +00:00
Andy Heninger
04448b004f ICU-12764 UTF-8 source files, update file encoding comments.
X-SVN-Rev: 39641
2017-02-03 18:57:23 +00:00
Andy Heninger
242e02c388 ICU-12764 icu4c utf-8 source files, update Copyright notices.
X-SVN-Rev: 39583
2017-01-20 00:20:31 +00:00
Michael Ow
61607c2773 ICU-12564 Update copyright notice in trunk
X-SVN-Rev: 38848
2016-06-15 18:58:17 +00:00
Yoshito Umaoka
00ca13e126 ICU-12564 Reverted r38761 and r38762, because we want to prepend the Unicode copyright for existing source files, instead of replacing copyright comments.
X-SVN-Rev: 38776
2016-05-31 21:45:07 +00:00
Michael Ow
c9f199a30f ICU-12564 Update copyright notice in ICU4C
X-SVN-Rev: 38761
2016-05-26 22:32:17 +00:00
Markus Scherer
0f78abc7ee ICU-9644 re-hardcode some normalization data: nfc.nrm
X-SVN-Rev: 36384
2014-09-08 03:05:56 +00:00
Markus Scherer
e977c057a9 ICU-9101 merge branches/markus/collv2@35225 into the trunk
X-SVN-Rev: 35227
2014-02-25 21:21:49 +00:00
Andy Heninger
8e9b5e0b7e ICU-10301 Remove singleton classes, convert existing usage to UInitOnce
X-SVN-Rev: 34032
2013-08-12 03:35:22 +00:00
Andy Heninger
ae87a3acc2 ICU-10051 Mutexes: introduce UInitOnce; remove UMTX_CHECK; replace all uses of UMTX_CHECK. All the directories this time.
X-SVN-Rev: 33788
2013-06-01 03:37:16 +00:00
Michael Ow
0ca13b73b0 ICU-9292 Merge BEAM warning fixes from branch into trunk
X-SVN-Rev: 31792
2012-05-03 05:50:26 +00:00
Markus Scherer
6390003c87 ICU-9008 some more U_SIGNED_RIGHT_SHIFT_IS_ARITHMETIC fixes; include putilimp.h where that macro is tested
X-SVN-Rev: 31188
2012-01-10 07:15:25 +00:00
Markus Scherer
b19d1bd16a ICU-8915 flag & test for whether signed integer right shift is Arithmetic Shift Right
X-SVN-Rev: 30999
2011-12-01 06:04:35 +00:00
Markus Scherer
bf5ef2ad0e ICU-8804 cast from uint16_t* to UChar* (different types on some platforms)
X-SVN-Rev: 30989
2011-11-29 22:54:44 +00:00
Markus Scherer
524fd241c5 ICU-8942 use smaller/simpler FCD data rather than building an FCD trie
X-SVN-Rev: 30985
2011-11-28 22:59:49 +00:00
Markus Scherer
bed105857f ICU-8804 Normalizer2::composePair(a, b) with separation of minYesNo extraData into combines-forward vs. not
X-SVN-Rev: 30982
2011-11-27 20:29:38 +00:00
Markus Scherer
03748b07f1 ICU-8804 Normalizer2::getRawDecomposition(c) with added data in .nrm formatVersion 2
X-SVN-Rev: 30980
2011-11-27 04:44:13 +00:00
Yoshito Umaoka
e9503bdade ICU-8909 Fixed various warnings reported by a source code analysis tool.
X-SVN-Rev: 30958
2011-11-14 19:32:51 +00:00
Markus Scherer
e31ce99b84 ICU-8575 option for not including utf headers by default; replace uses of deprecated utf_old.h macros
X-SVN-Rev: 30430
2011-07-27 05:53:56 +00:00
Markus Scherer
9f7d74001c ICU-8605 document & test ICU4C dependencies, remove cycles, reduce some deps; merged from branches/markus/depstest -r 30155:30193
X-SVN-Rev: 30194
2011-06-03 05:23:57 +00:00
Markus Scherer
af43054b4e ICU-7848 normalize-append will never modify first-suffix if second begins with a has-boundary-before character; leave safeMiddle empty in that case
X-SVN-Rev: 30020
2011-05-04 13:44:04 +00:00
Markus Scherer
56b28bd292 ICU-7848 make normalize-append restore the middle string section (the relevant suffix of the first string) when something goes wrong (especially C buffer overflow)
X-SVN-Rev: 30014
2011-05-04 05:50:20 +00:00
Markus Scherer
0acda636e4 ICU-7722 build canonical-iterator data from nfc.nrm (port Java code to C++)
X-SVN-Rev: 28117
2010-06-01 06:10:26 +00:00
Markus Scherer
77543b3e58 ICU-7703 fix unorm_normalize(src, length=-1) bug and allow src=NULL if length=0
X-SVN-Rev: 28115
2010-05-30 23:00:52 +00:00
Steven R. Loomis
a1ea70071b ICU-7708 compiler warnings for 4.5.1 (batch 1)
X-SVN-Rev: 28103
2010-05-25 22:17:12 +00:00
Markus Scherer
d928bb24e1 ICU-7273 minor internal doc fixes from code review
X-SVN-Rev: 27663
2010-02-24 23:57:40 +00:00
Markus Scherer
7a3a89e61f ICU-7273 remove now-unused unorm.icu, and small changes parallel with Java
X-SVN-Rev: 27562
2010-02-13 23:15:05 +00:00
Markus Scherer
7f34717f2e ICU-7273 minor C++ changes parallel with Java
X-SVN-Rev: 27494
2010-02-04 23:57:28 +00:00