3 commits

Author SHA1 Message Date
Mihai Nita
2fa8a0908c ICU-22773 Migrate the CLDR conversion tool to Maven 2024-12-09 13:15:13 -08:00
Frank Yung-Fong Tang
2a72af07ac ICU-21569 LSTM Part 3 Add Java implementation
See #1706
2021-05-08 21:15:44 -07:00
Frank Tang
9a2177c575 ICU-21569 Add GA to test LSTM configuration
1. Add GA to test BreakIterator under LSTM configuration (remove the Thai
and Burmese dictionaries and include Thai and Burmese LSTM)
2. Add LSTMDataName for the purpose of testing.
3. Add file-based test code to check that BreakIterator matches the results
from the test file generated by the Python code in
https://github.com/unicode-org/lstm_word_segmentation/blob/master/segment_text.py
4. Fix an LSTMBreakEngine::divideUpDictionaryRange bug: the return value
should contain only the number of words found, even when the passed-in
foundBreaks already contains some data.
5. Change the cintltest TestSwapData from testing thaidict to laodict so
it will not break when thaidict is filtered out under the LSTM
configuration.
2021-04-30 20:02:09 -07:00