ICU-21310 Consolidate ICU4C and ICU4J Readmes via the User Guide

Elango Cheran 2021-02-04 23:11:30 -08:00 committed by Elango
parent 0ad4614a04
commit 35fe8534f2
40 changed files with 633 additions and 3362 deletions

.gitignore

@ -52,6 +52,7 @@ Debug/
Generated[!!-~]Files/
Release/
__pycache__/
_site/
arm/
arm64/
bin/

@ -1,7 +1,7 @@
GEM
remote: https://rubygems.org/
specs:
activesupport (6.0.3.4)
activesupport (6.0.3.5)
concurrent-ruby (~> 1.0, >= 1.0.2)
i18n (>= 0.7, < 2)
minitest (~> 5.1)
@ -16,7 +16,7 @@ GEM
colorator (1.1.0)
commonmarker (0.17.13)
ruby-enum (~> 0.5)
concurrent-ruby (1.1.7)
concurrent-ruby (1.1.8)
dnsruby (1.61.5)
simpleidn (~> 0.1)
em-websocket (0.5.2)
@ -26,10 +26,12 @@ GEM
ffi (>= 1.3.0)
eventmachine (1.2.7)
execjs (2.7.0)
faraday (1.1.0)
faraday (1.3.0)
faraday-net_http (~> 1.0)
multipart-post (>= 1.2, < 3)
ruby2_keywords
ffi (1.13.1)
faraday-net_http (1.0.1)
ffi (1.15.0)
forwardable-extended (2.6.0)
gemoji (3.0.1)
github-pages (207)
@ -202,34 +204,36 @@ GEM
kramdown-parser-gfm (1.1.0)
kramdown (~> 2.0)
liquid (4.0.3)
listen (3.3.0)
listen (3.4.1)
rb-fsevent (~> 0.10, >= 0.10.3)
rb-inotify (~> 0.9, >= 0.9.10)
mercenary (0.3.6)
mini_portile2 (2.4.0)
mini_portile2 (2.5.0)
minima (2.5.1)
jekyll (>= 3.5, < 5.0)
jekyll-feed (~> 0.9)
jekyll-seo-tag (~> 2.1)
minitest (5.14.2)
minitest (5.14.4)
multipart-post (2.1.1)
nokogiri (1.10.10)
mini_portile2 (~> 2.4.0)
octokit (4.19.0)
nokogiri (1.11.1)
mini_portile2 (~> 2.5.0)
racc (~> 1.4)
octokit (4.20.0)
faraday (>= 0.9)
sawyer (~> 0.8.0, >= 0.5.3)
pathutil (0.16.2)
forwardable-extended (~> 2.6)
public_suffix (3.1.1)
rake (13.0.1)
racc (1.5.2)
rake (13.0.3)
rb-fsevent (0.10.4)
rb-inotify (0.10.1)
ffi (~> 1.0)
rexml (3.2.4)
rouge (3.19.0)
ruby-enum (0.8.0)
ruby-enum (0.9.0)
i18n
ruby2_keywords (0.0.2)
ruby2_keywords (0.0.4)
rubyzip (2.3.0)
safe_yaml (1.0.5)
sass (3.7.4)
@ -240,20 +244,20 @@ GEM
sawyer (0.8.2)
addressable (>= 2.3.5)
faraday (> 0.8, < 2.0)
simpleidn (0.1.1)
simpleidn (0.2.1)
unf (~> 0.1.4)
terminal-table (1.8.0)
unicode-display_width (~> 1.1, >= 1.1.1)
thread_safe (0.3.6)
typhoeus (1.4.0)
ethon (>= 0.9.0)
tzinfo (1.2.8)
tzinfo (1.2.9)
thread_safe (~> 0.1)
unf (0.1.4)
unf_ext
unf_ext (0.0.7.7)
unicode-display_width (1.7.0)
zeitwerk (2.4.1)
zeitwerk (2.4.2)
PLATFORMS
ruby

@ -1,7 +1,7 @@
---
layout: default
title: Boundary Analysis
nav_order: 10
nav_order: 12
has_children: true
---
<!--

@ -1,7 +1,7 @@
---
layout: default
title: Collation
nav_order: 9
nav_order: 11
has_children: true
---
<!--

@ -1,7 +1,7 @@
---
layout: default
title: Conversion
nav_order: 4
nav_order: 6
has_children: true
---
<!--

@ -1,7 +1,7 @@
---
layout: default
title: Date/Time
nav_order: 6
nav_order: 8
has_children: true
---
<!--

@ -2,7 +2,7 @@
layout: default
title: Coding Guidelines
nav_order: 1
parent: Misc
parent: Contributors
---
<!--
© 2020 and later: Unicode, Inc. and others.
@ -1726,8 +1726,7 @@ must be set to 0 (default).
### Building cintltst
To compile this test suite using Microsoft Visual C++ (MSVC), follow the
instructions in `icu4c/source/readme.html#HowToInstall` for building the `allC`
workspace. This builds the libraries as well as the `cintltst` executable.
instructions in [How To Build And Install On Windows](../icu4c/build#how-to-build-and-install-on-windows). This builds the libraries as well as the `cintltst` executable.
### Executing cintltst

@ -2,7 +2,7 @@
layout: default
title: Contributions
nav_order: 4
parent: Misc
parent: Contributors
---
<!--
© 2020 and later: Unicode, Inc. and others.

@ -2,7 +2,7 @@
layout: default
title: User Guide Editing
nav_order: 5
parent: Misc
parent: Contributors
---
<!--
© 2020 and later: Unicode, Inc. and others.
@ -159,6 +159,7 @@ To activate the version of Ruby:
```bash
rbenv init # OR: eval "$(rbenv init -)"
rbenv shell <version-num>
rbenv versions # verify the specified version is in use
```
To install [Bundler](https://bundler.io/):
@ -183,6 +184,7 @@ instance. Then use Bundler to execute the Jekyll server.
```bash
rbenv init # OR: eval "$(rbenv init -)"
rbenv shell <version-num>
cd <ICU>/docs # change to User Guide docs root directory
bundle update
bundle exec jekyll server
```

@ -1,7 +1,7 @@
---
layout: default
title: Misc
nav_order: 15
title: Contributors
nav_order: 17
has_children: true
---
<!--
@ -9,9 +9,9 @@ has_children: true
License & terms of use: http://www.unicode.org/copyright.html
-->
# Development
# Contributors
Top-level page for topics for ICU developers. See the subpages listed below for
Top-level page for topics for contributors to ICU. See the subpages listed below for
details:
[Coding Guidelines](codingguidelines.md)

@ -2,7 +2,7 @@
layout: default
title: Custom ICU4C Synchronization
nav_order: 3
parent: Misc
parent: Contributors
---
<!--
© 2020 and later: Unicode, Inc. and others.

@ -2,7 +2,7 @@
layout: default
title: Synchronization
nav_order: 2
parent: Misc
parent: Contributors
---
<!--
© 2020 and later: Unicode, Inc. and others.

@ -1,7 +1,7 @@
---
layout: default
title: Formatting
nav_order: 7
nav_order: 9
has_children: true
---
<!--

@ -1,6 +1,7 @@
---
layout: default
title: Glossary
parent: ICU
nav_order: 9000
---
<!--

@ -24,19 +24,19 @@ License & terms of use: http://www.unicode.org/copyright.html
ICU builds and installs as relatively standard libraries. For details about
building, installing and porting see the [ICU4C
readme](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/main/icu4c/readme.html) and the
[ICU4J readme](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/main/icu4j/readme.html).
readme](../icu4c/) and the
[ICU4J readme](../icu4j/).
In addition, ICU4C installs several scripts and makefile fragments that help
build other code using ICU.
For C++, note that there are [Recommended Build
Options](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/main/icu4c/readme.html#RecBuild)
Options](icu4c/build#recommended-build-options)
(both for normal use and for ICU as system-level libraries) which are not
default simply for compatibility with older ICU-using code.
Starting with ICU 49, the ICU4C readme has a short section about
Starting with ICU 49, the ICU4C Readme has a short section about
[User-Configurable
Settings](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/main/icu4c/readme.html#UserConfig).
Settings](icu4c/build#user-configurable-settings).
## C++ Makefiles
@ -50,7 +50,7 @@ This table shows the package names used within pkg-config.
|**Package**|**Contents**|
|------|--------------------|
|icu-uc|Common (uc) and Data (dt/data) libraries|
|icu-i18n|Internationalization (in/i18n) library|icu-le [Layout Engine](layoutengine/index.md)|
|icu-i18n|Internationalization (in/i18n) library|
|icu-le|[Layout Engine](../layoutengine/index.md)|
|icu-lx|Paragraph Layout|
|icu-io|[Ustdio](io/ustdio.md)/[iostream](io/ustream.md) library (icuio)
@ -154,13 +154,13 @@ ICU C++ APIs are normally defined in a versioned namespace, for example
"icu_50". There is a stable "icu" alias which should be used instead. (Entry
point versioning is only to allow for multiple ICU versions linked into one
program. [It is optional and should be off for system
libraries.](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/main/icu4c/readme.html#RecBuild))
libraries.](icu4c/build#recommended-build-options))
By default, and only for backward compatibility, the ICU headers contain a line
`using namespace icu_50;` which makes all ICU APIs visible in/with the global
namespace (and potentially collide with non-ICU APIs there). One of the
[Recommended Build
Options](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/main/icu4c/readme.html#RecBuild)
Options](icu4c/build#recommended-build-options)
is to turn this off.
To write forward declarations, use

@ -140,7 +140,8 @@ following table shows how word lengths can differ among languages.
|English|German|Cyrillic-Serbian|
|--------|--------|-------------|
|cut|ausschneiden|исеци|
|copy|kopieren|копирајpasteeinfügenзалепи|
|copy|kopieren|копирај|
|paste|einfügen|залепи|
The description of the UI, especially user-visible pieces of text, must be kept
together and not embedded in the program's executable code. ICU provides the
@ -193,9 +194,9 @@ message is translated into a new language. ICU provides
[ChoiceFormat](format_parse/messages/index.md) (§) to help with these
occurrences.
> :point_right: **Note**: *There also might be situations where parts of the sentence change when other
> :point_right: **Note**: There also might be situations where parts of the sentence change when other
parts of the sentence also change (selecting between singular and plural nouns
that go after a number is the most common example). *
that go after a number is the most common example).
#### Measuring Units

@ -0,0 +1,33 @@
---
layout: default
title: Release Info
nav_order: 9
parent: ICU
---
<!--
© 2020 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
-->
# Release Info
{: .no_toc }
## Contents
{: .no_toc .text-delta }
1. TOC
{:toc}
---
## What Is New In The Current Release?
See the [ICU download page](http://site.icu-project.org/download/) to find the subpage for the current release.
The subpage for the current release will contain information on changes since the last release, bug fixes, known issues, changes to supported platforms and build environments, and migration issues for existing applications migrating from previous ICU releases. The page will also include an API Change Report, both for ICU4C and ICU4J, for a complete list of APIs added, removed, or changed in this release.
Changes in previous releases can also be found on the main [ICU download page](http://site.icu-project.org/download) in its version-specific subpages.
## License Information
The ICU projects (ICU4C and ICU4J) are hosted by the [Unicode Consortium](http://www.unicode.org/). The ICU binary and source files are distributed under the [UNICODE DATA FILES AND SOFTWARE LICENSE](http://www.unicode.org/copyright.html). The full text of the license and the third-party software licenses are available in the [LICENSE](https://github.com/unicode-org/icu/blob/main/icu4j/main/shared/licenses/LICENSE) file included in this package.

@ -1,15 +1,15 @@
---
layout: default
title: ICU4C Readme
nav_order: 8
parent: ICU
title: Building ICU4C
nav_order: 2
parent: ICU4C
---
<!--
© 2020 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
-->
# ICU4C Readme
# Building ICU4C
{: .no_toc }
## Contents
@ -20,306 +20,7 @@ License & terms of use: http://www.unicode.org/copyright.html
---
## Introduction
Today's software market is a global one in which it is desirable to develop and maintain one application (single source/single binary) that supports a wide variety of languages. The International Components for Unicode (ICU) libraries provide robust and full-featured Unicode services on a wide variety of platforms to help this design goal. The ICU libraries provide support for:
* The latest version of the Unicode standard
* Character set conversions with support for over 220 codepages
* Locale data for more than 300 locales
* Language sensitive text collation (sorting) and searching based on the Unicode Collation Algorithm (=ISO 14651)
* Regular expression matching and Unicode sets
* Transformations for normalization, upper/lowercase, script transliterations (50+ pairs)
* Resource bundles for storing and accessing localized information
* Date/Number/Message formatting and parsing of culture specific input/output formats
* Calendar specific date and time manipulation
* Text boundary analysis for finding characters, word and sentence boundaries
ICU has a sister project ICU4J that extends the internationalization capabilities of Java to a level similar to ICU. The ICU C/C++ project is also called ICU4C when a distinction is necessary.
## Getting started
This document describes how to build and install ICU on your machine. For other information about ICU please see the following table of links.
The ICU homepage also links to related information about writing internationalized software.
**Here are some useful links regarding ICU and internationalization in general.**
| ICU, ICU4C & ICU4J Homepage | <http://icu-project.org/> |
| FAQ - Frequently Asked Questions about ICU | <https://unicode-org.github.io/icu/userguide/icufaq/> |
| ICU User's Guide | <https://unicode-org.github.io/icu/> |
| How To Use ICU | <https://unicode-org.github.io/icu/userguide/howtouseicu.html> |
| Download ICU Releases | <http://site.icu-project.org/download> |
| ICU4C API Documentation Online | <http://icu-project.org/apiref/icu4c/> |
| Online ICU Demos | <http://demo.icu-project.org/icu-bin/icudemos> |
| Contacts and Bug Reports/Feature Requests | <http://site.icu-project.org/contacts> |
**Important:** Please make sure you understand the [Copyright and License Information](http://source.icu-project.org/repos/icu/trunk/icu4c/LICENSE).
## What Is New In The Current Release?
See the [ICU download page](http://site.icu-project.org/download/) to find the subpage for the current release, including any other changes, bug fixes, known issues, changes to supported platforms and build environments, and migration issues for existing applications migrating from previous ICU releases.
The subpage for the current release will also include an API Change Report, both for ICU4C and ICU4J, for a complete list of APIs added, removed, or changed in this release.
The list of API changes since the previous ICU4C release is available [here](https://htmlpreview.github.io/?https://raw.githubusercontent.com/unicode-org/icu/main/icu4c/APIChangeReport.html).
Changes in previous releases can also be found on the main [ICU download page](http://site.icu-project.org/download) in its version-specific subpages.
## How To Download the Source Code
There are two ways to download ICU releases:
* **Official Release Snapshot:**
If you want to use ICU (as opposed to developing it), you should download an official packaged version of the ICU source code. These versions are tested more thoroughly than day-to-day development builds of the system, and they are packaged in zip and tar files for convenient download. These packaged files can be found at [http://site.icu-project.org/download](http://site.icu-project.org/download).
The packaged snapshots are named `icu-nnnn.zip` or `icu-nnnn.tgz`, where `nnnn` is the version number. The .zip file is used for Windows platforms, while the .tgz file is preferred on most other platforms.
Please unzip this file.
> :point_right: **Note**: There may be additional commits on the `maint-*` branch for a particular version that are not included in the prepackaged download files.
* **GitHub Source Repository:**
If you are interested in developing features, patches, or bug fixes for ICU, you should probably be working with the latest version of the ICU source code. You will need to clone and checkout the code from our GitHub repository to ensure that you have the most recent version of all of the files. See our [source repository](http://site.icu-project.org/repository) for details.
## ICU Source Code Organization
In the descriptions below, `<ICU>` is the full path name of the ICU directory (the top level directory from the distribution archives) in your file system. You can also view the [ICU Architectural Design](design.md) section of the User's Guide to see which libraries you need for your software product. You need at least the data (`[lib]icudt`) and the common (`[lib]icuuc`) libraries in order to use ICU.
**The following files describe the code drop.**
| File | Description |
|-------------|----------------------------------------------------------------|
| readme.html | Describes the International Components for Unicode (this file) |
| LICENSE | Contains the text of the ICU license |
**The following directories contain source code and data files.**
<table>
<tr>
<th scope="col">Directory</th>
<th scope="col">Description</th>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>common</b>/</td>
<td>The core Unicode and support functionality, such as resource bundles,
character properties, locales, codepage conversion, normalization,
Unicode properties, Locale, and UnicodeString.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>i18n</b>/</td>
<td>Modules in i18n are generally the more data-driven, that is to say
resource bundle driven, components. These deal with higher-level
internationalization issues such as formatting, collation, text break
analysis, and transliteration.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>layoutex</b>/</td>
<td>Contains the ICU paragraph layout engine.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>io</b>/</td>
<td>Contains the ICU I/O library.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>data</b>/</td>
<td>
<p>This directory contains the source data in text format, which is
compiled into binary form during the ICU build process. It contains
several subdirectories, in which the data files are grouped by
function. Note that the build process must be run again after any
changes are made to this directory.</p>
<p>If some of the following directories are missing, it's probably
because you got an official download. If you need the data source files
for customization, then please download the complete ICU source code from <a
href="http://site.icu-project.org/repository">the ICU repository</a>.</p>
<ul>
<li><b>in/</b> A directory that contains a pre-built data library for
ICU. A standard source code package will contain this file without
several of the following directories. This is to simplify the build
process for the majority of users and to reduce platform porting
issues.</li>
<li><b>brkitr/</b> Data files for character, word, sentence, title
casing and line boundary analysis.</li>
<li><b>coll/</b> Data for collation tailorings. The makefile
<b>colfiles.mk</b> contains the list of resource bundle files.</li>
<li><b>locales/</b> These .txt files contain ICU language and
culture-specific localization data. Two special bundles are
<b>root</b>, which is the fallback data and parent of other bundles,
and <b>index</b>, which contains a list of installed bundles. The
makefile <b>resfiles.mk</b> contains the list of resource bundle
files. Some of the locale data is split out into the type-specific
directories curr, lang, region, unit, and zone, described below.</li>
<li><b>curr/</b> Locale data for currency symbols and names (including
plural forms), with its own makefile <b>resfiles.mk</b>.</li>
<li><b>lang/</b> Locale data for names of languages, scripts, and locale
key names and values, with its own makefile <b>resfiles.mk</b>.</li>
<li><b>region/</b> Locale data for names of regions, with its own
makefile <b>resfiles.mk</b>.</li>
<li><b>unit/</b> Locale data for measurement unit patterns and names,
with its own makefile <b>resfiles.mk</b>.</li>
<li><b>zone/</b> Locale data for time zone names, with its own
makefile <b>resfiles.mk</b>.</li>
<li><b>mappings/</b> Here are the code page converter tables. These
.ucm files contain mappings to and from Unicode. These are compiled
into .cnv files. <b>convrtrs.txt</b> is the alias mapping table from
various converter name formats to ICU internal format and vice versa.
It produces cnvalias.icu. The makefiles <b>ucmfiles.mk,
ucmcore.mk,</b> and <b>ucmebcdic.mk</b> contain the list of
converters to be built.</li>
<li><b>translit/</b> This directory contains transliterator rules as
resource bundles, a makefile <b>trnsfiles.mk</b> containing the list
of installed system transliteration files, as well as the special
bundle <b>translit_index</b> which lists the system transliterator
aliases.</li>
<li><b>unidata/</b> This directory contains the Unicode data files.
Please see <a href=
"http://www.unicode.org/">http://www.unicode.org/</a> for more
information.</li>
<li><b>misc/</b> The misc directory contains other data files which
did not fit into the above categories, including time zone
information, region-specific data, and other data derived from CLDR
supplemental data.</li>
<li><b>out/</b> This directory contains the assembled memory mapped
files.</li>
<li><b>out/build/</b> This directory contains intermediate (compiled)
files, such as .cnv, .res, etc.</li>
</ul>
<p>If you are creating a special ICU build, you can set the ICU_DATA
environment variable to the out/ or the out/build/ directories, but
this is generally discouraged because most people set it incorrectly.
You can view the <a href=
"https://unicode-org.github.io/icu/userguide/icudata">ICU Data
Management</a> section of the ICU User's Guide for details.</p>
</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/test/<b>intltest</b>/</td>
<td>A test suite including all C++ APIs. For information about running
the test suite, see the build instructions specific to your platform
later in this document.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/test/<b>cintltst</b>/</td>
<td>A test suite written in C, including all C APIs. For information
about running the test suite, see the build instructions specific to your
platform later in this document.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/test/<b>iotest</b>/</td>
<td>A test suite written in C and C++ to test the icuio library. For
information about running the test suite, see the build instructions
specific to your platform later in this document.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/test/<b>testdata</b>/</td>
<td>Source text files for data, which are read by the tests. It contains
the subdirectories <b>out/build/</b> which is used for intermediate
files, and <b>out/</b> which contains <b>testdata.dat.</b></td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>tools</b>/</td>
<td>Tools for generating the data files. Data files are generated by
invoking <i>&lt;ICU&gt;</i>/source/data/build/makedata.bat on Win32 or
<i>&lt;ICU&gt;</i>/source/make on UNIX.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>samples</b>/</td>
<td>Various sample programs that use ICU</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>extra</b>/</td>
<td>Non-supported API additions. Currently, it contains the 'uconv' tool
to perform codepage conversion on files.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/<b>packaging</b>/</td>
<td>This directory contains scripts and tools for packaging the final
ICU build for various release platforms.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>config</b>/</td>
<td>Contains helper makefiles for platform specific build commands. Used
by 'configure'.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>allinone</b>/</td>
<td>Contains top-level ICU workspace and project files, for instance to
build all of ICU under one MSVC project.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/<b>include</b>/</td>
<td>Contains the headers needed for developing software that uses ICU on
Windows.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/<b>lib</b>/</td>
<td>Contains the import libraries for linking ICU into your Windows
application.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/<b>bin</b>/</td>
<td>Contains the libraries and executables for using ICU on Windows.</td>
</tr>
</table>
## How To Build And Install ICU
### Recommended Build Options
## Recommended Build Options
Depending on the platform and the type of installation, we recommend a small number of modifications and build options. Note that C99 compatibility is now required.
@ -413,7 +114,7 @@ Depending on the platform and the type of installation, we recommend a small num
```
> :point_right: **Note**: this example shows a relative path to `runConfigureICU`. If you experience difficulty, try using an absolute path to `runConfigureICU` instead.
#### ICU as a System-Level Library
### ICU as a System-Level Library
If ICU is installed as a system-level library, there are further opportunities and restrictions to consider. For details, see the _Using ICU as an Operating System Level Library_ section of the [User Guide ICU Architectural Design](https://unicode-org.github.io/icu/userguide/design) chapter.
@ -425,13 +126,13 @@ If ICU is installed as a system-level library, there are further opportunities a
`runConfigureICU Linux --disable-renaming`
The public header files from this configuration must be installed for applications to include and get the correct entry point names.
### User-Configurable Settings
## User-Configurable Settings
ICU4C can be customized via a number of user-configurable settings. Many of them are controlled by preprocessor macros which are defined in the `source/common/unicode/uconfig.h` header file. Some turn off parts of ICU, for example conversion or collation, trading off a smaller library for reduced functionality. Other settings are recommended (see previous section) but their default values are set for better source code compatibility.
In order to change such user-configurable settings, you can either modify the `uconfig.h` header file by adding a specific `#define ...` for one or more of the macros before they are first tested, or set the compiler's preprocessor flags (`CPPFLAGS`) to include an equivalent `-D` macro definition.
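As a minimal sketch of the `CPPFLAGS` route described above, the snippet below turns off collation through the `UCONFIG_NO_COLLATION` switch from `uconfig.h`. It assumes a Linux host and that it is run from `<ICU>/source`; the choice of macro and platform name is illustrative only, and the guard simply reports what would happen when `runConfigureICU` is not in the current directory.

```bash
# Assumption: run from <ICU>/source on Linux; pick the platform name that fits your host.
# UCONFIG_NO_COLLATION=1 is an illustrative choice of a uconfig.h switch.
CPPFLAGS="-DUCONFIG_NO_COLLATION=1"
export CPPFLAGS

if [ -x ./runConfigureICU ]; then
    # The exported CPPFLAGS is picked up by the configure step.
    ./runConfigureICU Linux
else
    echo "runConfigureICU not found here; would configure with CPPFLAGS=$CPPFLAGS"
fi
```

Setting the macro on the command line keeps `uconfig.h` unmodified, which may be easier to maintain across ICU upgrades. Code that is compiled against the resulting headers generally needs the same `-D` definition.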
### How To Build And Install On Windows
## How To Build And Install On Windows
Building International Components for Unicode requires:
@ -485,7 +186,7 @@ The steps are:
3. Run the I/O test suite, `iotest`. To do this: set the active startup project to "iotest", and press Ctrl+F5 to run it. Make sure that it passes without any errors.
8. You are now able to develop applications with ICU by using the libraries and tools in `<ICU>\bin\`. The headers are in `<ICU>\include\` and the link libraries are in `<ICU>\lib\`. To install the ICU runtime on a machine, or ship it with your application, copy the needed components from `<ICU>\bin\` to a location on the system PATH or to your application directory.
#### Building with other versions of Visual Studio
### Building with other versions of Visual Studio
The particular version of the MSVC compiler tool-set (and thus the corresponding version of Visual Studio) that is used to compile ICU is determined by the `PlatformToolset` property. This property is stored in two different shared files that are used to set common configuration settings amongst the various ICU `*.vcxproj` project files. For the non-UWP projects, this setting is in the shared file called `Build.Windows.ProjectConfiguration.props` located in the `allinone` directory. For the UWP projects, this setting is in the shared file called `Build.Windows.UWP.ProjectConfiguration.props`, also located in the `allinone` directory.
@ -495,14 +196,14 @@ In order to build the non-UWP projects with Visual Studio 2015 you will need to
> :point_right: **Note**: Using older versions of the MSVC compiler is generally not recommended due to the improved support for the C++11 standard in newer versions of the compiler.
#### Re-targeting the Windows 10 SDK for the UWP projects
### Re-targeting the Windows 10 SDK for the UWP projects
If the version of the Windows 10 SDK that you have installed does not match the version used by the UWP projects, then you will need to "retarget" them to use the version of the SDK that you have installed instead. There are two ways to do this:
* In Visual Studio you can right-click on the UWP projects in the 'Solution Explorer' and select the option 'Retarget Projects' from the context menu. This will open up a window where you can select the SDK version to target from a drop-down list of the various SDKs that are installed on the machine.
* Alternatively, you can manually edit the shared file called `Build.Windows.UWP.ProjectConfiguration.props` which is located in the `allinone` directory. You will need to change the value of the `WindowsTargetPlatformVersion` property to the version of the SDK that you would like to use instead.
#### Using MSBUILD At The Command Line
### Using MSBUILD At The Command Line
You can build ICU from the command line instead of using the Visual Studio GUI. Assuming that you have properly installed Visual Studio to support command line building, you should have a shortcut for the "Developer Command Prompt" listed in the Start Menu. (For Visual Studio 2017 you will need to install the "Desktop development with C++" option).
@ -528,7 +229,7 @@ You can build ICU from the command line instead of using the Visual Studio GUI.
devenv.com source\allinone\allinone.sln /build "Release|x64"
```
#### Skipping the UWP Projects on the Command Line
### Skipping the UWP Projects on the Command Line
You can skip (or omit) building the UWP projects on the command line by passing the argument '`SkipUWP=true`' to either MSBUILD or devenv.
@ -544,25 +245,25 @@ You can skip (or omit) building the UWP projects on the command line by passing
You can also use Cygwin with the MSVC compiler to build ICU, and you can refer to the [How To Build And Install On Windows with Cygwin](#how-to-build-and-install-on-windows-with-cygwin) section for more details.
#### Setting Active Platform
### Setting Active Platform
Even though you are able to select "x64" as the active platform, if your operating system is not a 64 bit version of Windows, the build will fail. To set the active platform, two different possibilities are:
* Choose the "Build" menu, select "Configuration Manager...", and select "Win32" or "x64" for the Active Solution Platform.
* Another way is to select the desired platform from the "Solution Platforms" dropdown menu on the standard toolbar. It will say "Win32" or "x64" in the dropdown list.
#### Setting Active Configuration
### Setting Active Configuration
To set the active configuration, two different possibilities are:
* Choose the "Build" menu, select "Configuration Manager...", and select "Release" or "Debug" for the Active Solution Configuration.
* Another way is to select the desired build configuration from the "Solution Configurations" dropdown menu on the standard toolbar. It will say "Release" or "Debug" in the dropdown list.
#### Batch Configuration
### Batch Configuration
If you want to build the Win32 and x64 platforms and Debug and Release configurations at the same time, choose "Build" menu, and select "Batch Build...". Click the "Select All" button, and then click the "Rebuild" button.
### How To Build And Install On Windows with Cygwin
## How To Build And Install On Windows with Cygwin
Building International Components for Unicode with this configuration requires:
@ -589,7 +290,7 @@ There are two ways you can build ICU with Cygwin. You can build with gcc or Micr
7. Optionally, type `make check` to run the test suite, which verifies that ICU functions correctly (See [testing note](#running-the-tests-from-the-command-line) below).
8. Type `make install` to install ICU. If you used the `--prefix=` option on `configure` or `runConfigureICU`, ICU will be installed to the directory you specified. (See [installation note](#installing-icu) below).
#### Configuring ICU on Windows
### Configuring ICU on Windows
Ensure that the order of the PATH is MSVC, Cygwin, and then other PATHs. The configure script needs certain tools in Cygwin (e.g. grep).
@ -603,7 +304,7 @@ In addition to the Unix [configuration note](#configuring-icu) the following con
* `--enable-static` (Requires that U_STATIC_IMPLEMENTATION be defined in user code that links against ICU's static libraries.)
* `--with-data-packaging=files` (The pkgdata tool currently does not work in this mode. Manual packaging is required to use this mode.)
### How To Build And Install On UNIX
## How To Build And Install On UNIX
Building International Components for Unicode on UNIX requires:
@ -645,21 +346,21 @@ gmake install
```
to install ICU. If you used the `--prefix=` option on `configure` or `runConfigureICU`, ICU will be installed to the directory you specified. (See [installation note](#installing-icu) below).
#### Configuring ICU
### Configuring ICU
Type `"./runConfigureICU --help"` for help on how to run it and a list of supported platforms. You may also want to type `"./configure --help"` to print the available configure options that you may want to give `runConfigureICU`. If you are not using the `runConfigureICU` script, or your platform is not supported by the script, you may need to set your `CC`, `CXX`, `CFLAGS` and `CXXFLAGS` environment variables, and type `"./configure"`. HP-UX users, please see this [note regarding HP-UX multithreaded build issues](#using-icu-in-a-multithreaded-environment-on-hp-ux) with newer compilers. Solaris users, please see this [note regarding Solaris multithreaded build issues](#linking-on-solaris).
ICU is built with strict compiler warnings enabled by default. If this causes excessive numbers of warnings on your platform, use the `--disable-strict` option to configure to reduce the warning level.
#### Running The Tests From The Command Line
### Running The Tests From The Command Line
You may have to set certain variables if you wish to run test programs individually, that is, apart from "gmake check". The environment variable **ICU_DATA** can be set to the full pathname of the data directory to indicate where the locale data files and conversion mapping tables are when you are not using the shared library (e.g. by using the .dat archive or the individual data files). The trailing "/" is required after the directory name (e.g. `$Root/source/data/out/` will work, but the value `$Root/source/data/out` is not acceptable). You do not need to set **ICU_DATA** if the complete shared data library is in your library path.
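As a minimal sketch of the trailing-slash requirement described above, the following shell snippet sets **ICU_DATA** so that it always ends in "/". The `ICU_ROOT` variable, its default value, and the `with_trailing_slash` helper are assumptions made for illustration; they are not part of ICU.

```shell
# Hypothetical location of the ICU4C tree; adjust for your own checkout.
ICU_ROOT="${ICU_ROOT:-/path/to/icu4c}"

# Print the given path, appending "/" only if it does not already end in one.
with_trailing_slash() {
    case "$1" in
        */) printf '%s\n' "$1" ;;
        *)  printf '%s/\n' "$1" ;;
    esac
}

# ICU_DATA must name the data directory and end with "/".
ICU_DATA="$(with_trailing_slash "$ICU_ROOT/source/data/out")"
export ICU_DATA
echo "ICU_DATA=$ICU_DATA"
```

Normalizing the value in one place avoids the easy mistake of exporting the directory name without the final "/".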
#### Installing ICU
### Installing ICU
Some platforms use package management tools to control the installation and uninstallation of files on the system, as well as the integrity of the system configuration. You may want to check if ICU can be packaged for your package management tools by looking into the `packaging` directory. (Please note that if you are using a snapshot of ICU from Git, it is probable that the packaging scripts or related files are not up to date with the contents of ICU at this time, so use them with caution).
### How To Build And Install On z/OS (OS/390)
## How To Build And Install On z/OS (OS/390)
You can install ICU on z/OS or OS/390 (the previous name of z/OS), but IBM tests only the z/OS installation. You install ICU in a z/OS UNIX system services file system such as HFS or zFS. On this platform, it is important that you understand a few details:
@ -683,7 +384,7 @@ export _CEE_RUNOPTS="HEAPPOOLS(ON),HEAP(4M,1M,ANY,FREE,0K,4080)"
* The rest of the instructions for building and testing ICU on z/OS with UNIX System Services are the same as the [How To Build And Install On UNIX](#how-to-build-and-install-on-unix) section.
#### z/OS (Batch/PDS) support outside the UNIX system services environment
### z/OS (Batch/PDS) support outside the UNIX system services environment
By default, ICU builds its libraries into the UNIX file system (HFS). In addition, there is a z/OS specific environment variable (OS390BATCH) to build some libraries into the z/OS native file system. This is useful, for example, when your application is externalized via Job Control Language (JCL).
@ -745,7 +446,7 @@ Secondary cylinders : 3
Data set name type : PDS
```
### How To Build And Install On The IBM i Family (IBM i, i5/OS OS/400)
## How To Build And Install On The IBM i Family (IBM i, i5/OS OS/400)
Before you start building ICU, ICU requires the following:
@ -804,7 +505,7 @@ gmake check
```
(The `QIBM_MULTI_THREADED=Y` flag will be automatically applied to intltest - you can look at the [iSeries Information Center](https://www.ibm.com/support/knowledgecenter/ssw_ibm_i_73/rzahw/rzahwceeco.htm) for more details regarding the running of multiple threads on IBM i.)
### How To Cross Compile ICU
## How To Cross Compile ICU
This section will explain how to build ICU on one platform, but to produce binaries intended to run on another. This is commonly known as a cross compile.
@ -837,148 +538,6 @@ gnumake
> :point_right: **Note**: `--with-cross-build` takes an absolute path.
5. Tests and testdata can be built with `gnumake tests`.
## How To Package ICU
There are many ways that a person can package ICU with their software products. Usually only the libraries need to be considered for packaging.
On UNIX, you should use `gmake install` to make it easier to develop and package ICU. The bin, lib and include directories are needed to develop applications that use ICU. These directories will be created relative to the `--prefix=dir` configure option (See the [UNIX build instructions](#how-to-build-and-install-on-unix)). When ICU is built on Windows, a similar directory structure is built.
When changes have been made to the standard ICU distribution, it is recommended that at least one of the following guidelines be followed for special packaging.
1. Add a suffix name to the library names. This can be done with the `--with-library-suffix` configure option.
2. The installation script should install the ICU libraries into the application's directory.
Following these guidelines prevents other applications that use a standard ICU distribution from conflicting with any libraries that you need. On operating systems that do not have a standard C++ ABI (name mangling) for compilers, it is recommended to do this special packaging anyway. More details on customizing ICU are available in the [User's Guide](https://unicode-org.github.io/icu/userguide/). The [ICU Source Code Organization](#SourceCode) section of this readme.html gives a more complete description of the libraries.
ICU has several libraries for you to use. Here is an example of libraries that are frequently packaged.
| Library Name | Windows Filename | Linux Filename | Comment |
|-------------------------------------|------------------|------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Data Library | icudtXYl.dll | libicudata.so.XY.Z | Data required by the Common and I18n libraries. There are many ways to package and [customize this data](https://unicode-org.github.io/icu/userguide/icudata), but by default this is all you need. |
| Common Library | icuucXY.dll | libicuuc.so.XY.Z | Base library required by all other ICU libraries. |
| Internationalization (i18n) Library | icuinXY.dll | libicui18n.so.XY.Z | A library that contains many locale based internationalization (i18n) functions. |
| Layout Extensions Engine | iculxXY.dll | libiculx.so.XY.Z | An optional engine for doing paragraph layout that uses parts of ICU. HarfBuzz is required. |
| ICU I/O (Unicode stdio) Library | icuioXY.dll | libicuio.so.XY.Z | An optional library that provides a stdio like API with Unicode support. |
| Tool Utility Library | icutuXY.dll | libicutu.so.XY.Z | An internal library that contains internal APIs that are only used by ICU's tools. If you do not use ICU's tools, you do not need this library. |
Normally only the above ICU libraries need to be considered for packaging. The versionless symbolic links to these libraries are only needed for easier development. The _X_, _Y_ and _Z_ parts of the name are the version numbers of ICU. For example, ICU 2.0.2 would have the name libicuuc.so.20.2 for the common library. The exact format of the library names can vary between platforms due to how each platform handles library versioning.
## Important Notes About Using ICU
### Using ICU in a Multithreaded Environment
Some versions of ICU require calling the `u_init()` function from `uclean.h` to ensure that ICU is initialized properly. In those ICU versions, `u_init()` must be called before ICU is used from multiple threads. There is no harm in calling `u_init()` in a single-threaded application, on a single-CPU machine, or in other cases where `u_init()` is not required.
In addition to ensuring thread safety, `u_init()` also attempts to load at least one ICU data file. Assuming that all data files are packaged together (or are in the same folder in files mode), a failure code from `u_init()` usually means that the data cannot be found. In this case, the data may not be installed properly, or the application may have failed to call `udata_setCommonData()` or `u_setDataDirectory()` which specify to ICU where it can find its data.
Since `u_init()` will load only one or two data files, it cannot guarantee that all of the data that an application needs is available. It cannot check for all data files because the set of files is customizable, and some ICU services work without loading any data at all. An application should always check for error codes when opening ICU service objects (using `ucnv_open()`, `ucol_open()`, C++ constructors, etc.).
#### ICU 3.4 and later
ICU 3.4 self-initializes properly for multi-threaded use. It achieves this without performance penalty by hardcoding the core Unicode properties data, at the cost of some flexibility. (For details see Jitterbug 4497.)
`u_init()` can be used to check for data loading. It tries to load the converter alias table (`cnvalias.icu`).
#### ICU 2.6..3.2
These ICU versions require a call to `u_init()` before multi-threaded use. The services that are directly affected are those that don't have a service object and need to be fast: normalization and character properties.
`u_init()` loads and initializes the data files for normalization and character properties (`unorm.icu` and `uprops.icu`) and can therefore also be used to check for data loading.
#### ICU 2.4 and earlier
ICU 2.4 and earlier versions were not prepared for multithreaded use on multi-CPU platforms where the CPUs implement weak memory coherency. These CPUs include: Power4, Power5, Alpha, Itanium. `u_init()` was not defined yet.
#### Using ICU in a Multithreaded Environment on HP-UX
When ICU is built with aCC on HP-UX, the [`-AA`](http://h21007.www2.hp.com/portal/site/dspp/menuitem.863c3e4cbcdc3f3515b49c108973a801?ciid=eb08b3f1eee02110b3f1eee02110275d6e10RCRD) compiler flag is used. It is required in order to use the latest `<iostream>` API in a thread safe manner. This compiler flag affects the version of the C++ library being used. Your applications will also need to be compiled with `-AA` in order to use ICU.
#### Using ICU in a Multithreaded Environment on Solaris
##### Linking on Solaris
In order to avoid synchronization and threading issues, developers are **advised** to strictly follow the compiling and linking guidelines for multithreaded applications, specified in the following Sun Solaris document available from Oracle. Most notably, pay strict attention to the following statements from Sun:
> To use libthread, specify `-lthread` before `-lc` on the ld command line, or last on the cc command line.
>
> To use libpthread, specify `-lpthread` before `-lc` on the ld command line, or last on the cc command line.
Failure to do this may cause spurious lock conflicts, recursive mutex failure, and deadlock.
Source: "_Multithreaded Programming Guide, Compiling and Debugging_", Sun Microsystems, 2002
[https://docs.oracle.com/cd/E19683-01/806-6867/compile-74765/index.html](https://docs.oracle.com/cd/E19683-01/806-6867/compile-74765/index.html)
Note, a version of that chapter from a 2008 document update covering both Solaris 9 and Solaris 10 is available here:
[http://docs.oracle.com/cd/E19253-01/816-5137/compile-94179/index.html](http://docs.oracle.com/cd/E19253-01/816-5137/compile-94179/index.html)
### Windows Platform
If you are building on the Windows platform, it is important that you understand a few of the following build details.
#### DLL directories and the PATH setting
As delivered, the International Components for Unicode build as several DLLs, which are placed in the `<ICU>\bin64` directory. You must add this directory to the PATH environment variable in your system, or any executables you build will not be able to access International Components for Unicode libraries. Alternatively, you can copy the DLL files into a directory already in your PATH, but we do not recommend this: you can end up with multiple copies of the DLLs and load the wrong one.
#### Changing your PATH
##### Windows 2000/XP and above
Use the System Icon in the Control Panel. Pick the "Advanced" tab. Select the "Environment Variables..." button. Select the variable `PATH` in the lower box, and select the lower "Edit..." button. In the "Variable Value" box, append the string `;<ICU>\bin64` to the end of the path string. If there is nothing there, just type in `<ICU>\bin64`. Click the Set button, then the OK button.
> :point_right: **Note**: When packaging a Windows application for distribution and installation on user systems, copies of the ICU DLLs should be included with the application, and installed for exclusive use by the application. This is the only way to ensure that your application is running with the same version of ICU, built with exactly the same options, that you developed and tested with. Refer to Microsoft's guidelines on the usage of DLLs, or search for the phrase "DLL hell" on [msdn.microsoft.com](http://msdn.microsoft.com/).
### UNIX Type Platform
If you are building on a UNIX platform, and if you are installing ICU in a non-standard location, you may need to add the location of your ICU libraries to your `LD_LIBRARY_PATH` or `LIBPATH` environment variable (or the equivalent runtime library path environment variable for your system). The ICU libraries may not link or load properly without doing this.
> :point_right: **Note**: If you do not want to have to set this variable, you may instead use the `--enable-rpath` option at configuration time. This option will instruct the linker to always look for the libraries where they are installed. You will need to use the appropriate linker options when linking your own applications and libraries against ICU, too. Please refer to your system's linker manual for information about runtime paths. The use of rpath also means that when building a new version of ICU you should not have an older version installed in the same place as the new version's installation directory, as the older libraries will be used during the build instead of the new ones, likely leading to an incorrectly built ICU. This is the proper behavior of rpath.
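
For a non-standard install location, the runtime library path can be extended along the following lines. This is only a sketch: the `ICU_PREFIX` variable, its `/opt/icu` default, and the `prepend_path` helper are hypothetical, and the name of the variable to set (`LD_LIBRARY_PATH`, `LIBPATH`, or another) depends on your system.

```shell
# Hypothetical install prefix, i.e. the directory given to configure via --prefix=.
ICU_PREFIX="${ICU_PREFIX:-/opt/icu}"

# Prepend a directory to a colon-separated path list.
# If the existing list is empty, return just the directory so that no
# empty entry (a stray leading or trailing ":") ends up in the result.
prepend_path() {
    if [ -n "$2" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

LD_LIBRARY_PATH="$(prepend_path "$ICU_PREFIX/lib" "${LD_LIBRARY_PATH:-}")"
export LD_LIBRARY_PATH
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

Putting the ICU directory first makes the newly installed libraries take precedence over any other copies on the path. The empty-entry check matters because some dynamic loaders treat an empty path element as the current directory.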
## Platform Dependencies
### Porting To A New Platform
If you are using ICU's Makefiles to build ICU on a new platform, there are a few places where you will need to add or modify some files. If you need more help, you can always ask the [icu-support mailing list](http://site.icu-project.org/contacts). Once you have finished porting ICU to a new platform, it is recommended that you contribute your changes back to ICU via the icu-support mailing list. This will make it easier for everyone to benefit from your work.
#### Data For a New Platform
For some people, it may not be necessary to completely build ICU. Most of the makefiles and build targets are for tools that are used for building ICU's data, and an application's data (when an application uses ICU resource bundles for its data).
Data files can be built on a different platform when both platforms share the same endianness and the same charset family. This assertion does not include platform dependent DLLs/shared/static libraries. For details see the User Guide [ICU Data](https://unicode-org.github.io/icu/userguide/icudata) chapter.
ICU 3.6 removes the requirement that ICU be completely built in the native operating environment. It adds the icupkg tool which can be run on any platform to turn binary ICU data files from any one of the three formats into any one of the other data formats. This allows an application to use ICU data built anywhere for any other target platform.
**WARNING!** Building ICU without running the tests is not recommended. The tests verify that ICU is safe to use. It is recommended that you try to completely port and test ICU before using the libraries for your own application.
#### Adapting Makefiles For a New Platform
Try to follow the build steps from the [UNIX](#how-to-build-and-install-on-unix) build instructions. If the configure script fails, then you will need to modify some files. Here are the usual steps for porting to a new platform:
1. Create an mh file in `<ICU>/source/config/`. You can use mh-linux or a similar mh file as your base configuration.
2. Modify `<ICU>/source/aclocal.m4` to recognize your platform's mh file.
3. Modify `<ICU>/source/configure.in` to properly set your **platform** C Macro define.
4. Run [autoconf](http://www.gnu.org/software/autoconf/) in `<ICU>/source/` without any options. The autoconf tool is standard on most Linux systems.
5. If you have any optimization options that you want to normally use, you can modify `<ICU>/source/runConfigureICU` to specify those options for your platform.
6. Build and test ICU on your platform. It is very important that you run the tests. If you don't run the tests, there is no guarantee that you have properly ported ICU.
### Platform Dependent Implementations
The platform dependencies have been mostly isolated into the following files in the common library. This information can be useful if you are porting ICU to a new platform.
* **unicode/platform.h.in** (autoconf'ed platforms)
**unicode/p_XXXX_.h** (others: pwin32.h, ppalmos.h, ..): Platform-dependent typedefs and defines:
* Generic types like `UBool`, `int8_t`, `int16_t`, `int32_t`, `int64_t`, `uint64_t` etc.
* `U_EXPORT` and `U_IMPORT` for specifying dynamic library import and export
* String handling support for the `char16_t` and `wchar_t` types.
* **unicode/putil.h, putil.c**: platform-dependent implementations of various functions:
* `uprv_isNaN`, `uprv_isInfinite`, `uprv_getNaN` and `uprv_getInfinity` for handling special floating point values.
* `uprv_tzset`, `uprv_timezone`, `uprv_tzname` and `time` for getting platform specific time and time zone information.
* `u_getDataDirectory` for getting the default data directory.
* `uprv_getDefaultLocaleID` for getting the default locale setting.
* `uprv_getDefaultCodepage` for getting the default codepage encoding.
* **umutex.h, umutex.c**: Code for doing synchronization in multithreaded applications. If you wish to use International Components for Unicode in a multithreaded application, you must provide a synchronization primitive that the classes can use to protect their global data against simultaneous modifications. We already supply working implementations for many platforms that ICU builds on.
* **umapfile.h, umapfile.c**: functions for mapping or otherwise reading or loading files into memory. All access by ICU to data from files makes use of these functions.
* Using platform-specific `#ifdef` macros is highly discouraged outside of the scope of these files. When the source code gets updated in the future, these `#ifdef`s can cause testing problems for your platform.
* * *

View file

@ -1,8 +1,8 @@
---
layout: default
title: ICU FAQ
nav_order: 6
parent: Misc
title: ICU4C FAQ
nav_order: 1
parent: ICU4C
---
<!--
© 2020 and later: Unicode, Inc. and others.
@ -42,7 +42,7 @@ versions of ICU, but we will assist in building other versions from source.
**Why don't you provide project files for my MSVC version (MSVC 2008, etc)?**
You can use the Cygwin build environment to build ICU from source against the
MSVC compiler. See the ICU4C Readme.
MSVC compiler. See the [Building ICU4C](./icu4c/build) page.
#### How do I install the binary versions of ICU?
@ -69,7 +69,7 @@ MSVC compiler. See the ICU4C Readme.
#### Can you help me build ICU4C for ...
We can try ... make sure you read the latest "readme" and also the [ICU
We can try ... make sure you read the [Building ICU4C](./icu4c/build) section and also the [ICU
Data](../icudata.md) section. You might also try [searching the icu-support
archives](http://site.icu-project.org/contacts), and then posting a question
there. Additionally, sites such as
@ -146,7 +146,7 @@ upgrade-friendly.
#### How do I build ICU?
See the readme.html that is included with ICU.
See the [Building ICU4C](./icu4c/build) section.
#### How do I get 32- or 64-bit versions of the ICU libraries?

View file

@ -0,0 +1,447 @@
---
layout: default
title: ICU4C
nav_order: 3
has_children: true
---
<!--
© 2020 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
-->
# ICU4C Readme
{: .no_toc }
## Contents
{: .no_toc .text-delta }
1. TOC
{:toc}
---
## Introduction
Today's software market is a global one in which it is desirable to develop and maintain one application (single source/single binary) that supports a wide variety of languages. The International Components for Unicode (ICU) libraries provide robust and full-featured Unicode services on a wide variety of platforms to help this design goal. The ICU libraries provide support for:
* The latest version of the Unicode standard
* Character set conversions with support for over 220 codepages
* Locale data for more than 300 locales
* Language sensitive text collation (sorting) and searching based on the Unicode Collation Algorithm (=ISO 14651)
* Regular expression matching and Unicode sets
* Transformations for normalization, upper/lowercase, script transliterations (50+ pairs)
* Resource bundles for storing and accessing localized information
* Date/Number/Message formatting and parsing of culture specific input/output formats
* Calendar specific date and time manipulation
* Text boundary analysis for finding characters, word and sentence boundaries
ICU has a sister project ICU4J that extends the internationalization capabilities of Java to a level similar to ICU. The ICU C/C++ project is also called ICU4C when a distinction is necessary.
## Getting started
This document describes how to build and install ICU on your machine. For other information about ICU please see the following table of links.
The ICU homepage also links to related information about writing internationalized software.
**Here are some useful links regarding ICU and internationalization in general.**
| ICU, ICU4C & ICU4J Homepage | <http://site.icu-project.org/> |
| ICU FAQ - Frequently Asked Questions about ICU | <https://unicode-org.github.io/icu/userguide/icu4c/faq> |
| ICU4J FAQ - Frequently Asked Questions about ICU4J | <https://unicode-org.github.io/icu/userguide/icu4j/faq> |
| ICU User's Guide | <https://unicode-org.github.io/icu/> |
| How To Use ICU | <https://unicode-org.github.io/icu/userguide/howtouseicu> |
| Download ICU Releases | <http://site.icu-project.org/download> |
| ICU4C API Documentation Online | <http://icu-project.org/apiref/icu4c/> |
| Online ICU Demos | <http://demo.icu-project.org/icu-bin/icudemos> |
| Contacts and Bug Reports/Feature Requests | <http://site.icu-project.org/contacts> |
**Important:** Please make sure you understand the [Copyright and License Information](https://github.com/unicode-org/icu/blob/main/icu4c/LICENSE).
## What Is New In The Current Release?
See the [ICU download page](http://site.icu-project.org/download/) to find the subpage for the current release, including any other changes, bug fixes, known issues, changes to supported platforms and build environments, and migration issues for existing applications migrating from previous ICU releases.
The subpage for the current release will also include an API Change Report, both for ICU4C and ICU4J, for a complete list of APIs added, removed, or changed in this release.
The list of API changes since the previous ICU4C release is available [here](https://htmlpreview.github.io/?https://raw.githubusercontent.com/unicode-org/icu/main/icu4c/APIChangeReport.html).
Changes in previous releases can also be found on the main [ICU download page](http://site.icu-project.org/download) in its version-specific subpages.
## How To Download the Source Code
There are two ways to download ICU releases:
* **Official Release Snapshot:**
If you want to use ICU (as opposed to developing it), you should download an official packaged version of the ICU source code. These versions are tested more thoroughly than day-to-day development builds of the system, and they are packaged in zip and tar files for convenient download. These packaged files can be found at [http://site.icu-project.org/download](http://site.icu-project.org/download).
The packaged snapshots are named `icu-nnnn.zip` or `icu-nnnn.tgz`, where `nnnn` is the version number. The .zip file is used for Windows platforms, while the .tgz file is preferred on most other platforms.
Please unzip this file.
> :point_right: **Note**: There may be additional commits on the `maint-*` branch for a particular version that are not included in the prepackaged download files.
* **GitHub Source Repository:**
If you are interested in developing features, patches, or bug fixes for ICU, you should probably be working with the latest version of the ICU source code. You will need to clone and checkout the code from our GitHub repository to ensure that you have the most recent version of all of the files. See our [source repository](http://site.icu-project.org/repository) for details.
## ICU Source Code Organization
In the descriptions below, `<ICU>` is the full path name of the ICU4C directory (the top level directory from the distribution archives) in your file system. You can also view the [ICU Architectural Design](design.md) section of the User's Guide to see which libraries you need for your software product. You need at least the data (`[lib]icudt`) and the common (`[lib]icuuc`) libraries in order to use ICU.
**The following files describe the code drop.**
| File | Description |
|-------------|----------------------------------------------------------------|
| LICENSE | Contains the text of the ICU license |
**The following directories contain source code and data files.**
<table>
<tr>
<th scope="col">Directory</th>
<th scope="col">Description</th>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>common</b>/</td>
<td>The core Unicode and support functionality, such as resource bundles,
character properties, locales, codepage conversion, normalization,
Unicode properties, Locale, and UnicodeString.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>i18n</b>/</td>
<td>Modules in i18n are generally the more data-driven, that is to say
resource bundle driven, components. These deal with higher-level
internationalization issues such as formatting, collation, text break
analysis, and transliteration.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>layoutex</b>/</td>
<td>Contains the ICU paragraph layout engine.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>io</b>/</td>
<td>Contains the ICU I/O library.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>data</b>/</td>
<td>
<p>This directory contains the source data in text format, which is
compiled into binary form during the ICU build process. It contains
several subdirectories, in which the data files are grouped by
function. Note that the build process must be run again after any
changes are made to this directory.</p>
<p>If some of the following directories are missing, it's probably
because you got an official download. If you need the data source files
for customization, then please download the complete ICU source code from <a
href="http://site.icu-project.org/repository">the ICU repository</a>.</p>
<ul>
<li><b>in/</b> A directory that contains a pre-built data library for
ICU. A standard source code package will contain this file without
several of the following directories. This is to simplify the build
process for the majority of users and to reduce platform porting
issues.</li>
<li><b>brkitr/</b> Data files for character, word, sentence, title
casing and line boundary analysis.</li>
<li><b>coll/</b> Data for collation tailorings. The makefile
<b>colfiles.mk</b> contains the list of resource bundle files.</li>
<li><b>locales/</b> These .txt files contain ICU language and
culture-specific localization data. Two special bundles are
<b>root</b>, which is the fallback data and parent of other bundles,
and <b>index</b>, which contains a list of installed bundles. The
makefile <b>resfiles.mk</b> contains the list of resource bundle
files. Some of the locale data is split out into the type-specific
directories curr, lang, region, unit, and zone, described below.</li>
<li><b>curr/</b> Locale data for currency symbols and names (including
plural forms), with its own makefile <b>resfiles.mk</b>.</li>
<li><b>lang/</b> Locale data for names of languages, scripts, and locale
key names and values, with its own makefile <b>resfiles.mk</b>.</li>
<li><b>region/</b> Locale data for names of regions, with its own
makefile <b>resfiles.mk</b>.</li>
<li><b>unit/</b> Locale data for measurement unit patterns and names,
with its own makefile <b>resfiles.mk</b>.</li>
<li><b>zone/</b> Locale data for time zone names, with its own
makefile <b>resfiles.mk</b>.</li>
<li><b>mappings/</b> Here are the code page converter tables. These
.ucm files contain mappings to and from Unicode. These are compiled
into .cnv files. <b>convrtrs.txt</b> is the alias mapping table from
various converter name formats to ICU internal format and vice versa.
It produces cnvalias.icu. The makefiles <b>ucmfiles.mk,
ucmcore.mk,</b> and <b>ucmebcdic.mk</b> contain the list of
converters to be built.</li>
<li><b>translit/</b> This directory contains transliterator rules as
resource bundles, a makefile <b>trnsfiles.mk</b> containing the list
of installed system transliteration files, as well as the special
bundle <b>translit_index</b> which lists the system transliterator
aliases.</li>
<li><b>unidata/</b> This directory contains the Unicode data files.
Please see <a href=
"http://www.unicode.org/">http://www.unicode.org/</a> for more
information.</li>
<li><b>misc/</b> The misc directory contains other data files which
did not fit into the above categories, including time zone
information, region-specific data, and other data derived from CLDR
supplemental data.</li>
<li><b>out/</b> This directory contains the assembled memory mapped
files.</li>
<li><b>out/build/</b> This directory contains intermediate (compiled)
files, such as .cnv, .res, etc.</li>
</ul>
<p>If you are creating a special ICU build, you can set the ICU_DATA
environment variable to the out/ or the out/build/ directories, but
this is generally discouraged because most people set it incorrectly.
You can view the <a href=
"https://unicode-org.github.io/icu/userguide/icudata">ICU Data
Management</a> section of the ICU User's Guide for details.</p>
</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/test/<b>intltest</b>/</td>
<td>A test suite including all C++ APIs. For information about running
the test suite, see the build instructions specific to your platform
later in this document.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/test/<b>cintltst</b>/</td>
<td>A test suite written in C, including all C APIs. For information
about running the test suite, see the build instructions specific to your
platform later in this document.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/test/<b>iotest</b>/</td>
<td>A test suite written in C and C++ to test the icuio library. For
information about running the test suite, see the build instructions
specific to your platform later in this document.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/test/<b>testdata</b>/</td>
<td>Source text files for data, which are read by the tests. It contains
the subdirectories <b>out/build/</b> which is used for intermediate
files, and <b>out/</b> which contains <b>testdata.dat.</b></td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>tools</b>/</td>
<td>Tools for generating the data files. Data files are generated by
invoking <i>&lt;ICU&gt;</i>/source/data/build/makedata.bat on Win32 or
<i>&lt;ICU&gt;</i>/source/make on UNIX.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>samples</b>/</td>
<td>Various sample programs that use ICU.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>extra</b>/</td>
<td>Non-supported API additions. Currently, it contains the 'uconv' tool
to perform codepage conversion on files.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/<b>packaging</b>/</td>
<td>This directory contains scripts and tools for packaging the final
ICU build for various release platforms.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>config</b>/</td>
<td>Contains helper makefiles for platform specific build commands. Used
by 'configure'.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/source/<b>allinone</b>/</td>
<td>Contains top-level ICU workspace and project files, for instance to
build all of ICU under one MSVC project.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/<b>include</b>/</td>
<td>Contains the headers needed for developing software that uses ICU on
Windows.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/<b>lib</b>/</td>
<td>Contains the import libraries for linking ICU into your Windows
application.</td>
</tr>
<tr>
<td><i>&lt;ICU&gt;</i>/<b>bin</b>/</td>
<td>Contains the libraries and executables for using ICU on Windows.</td>
</tr>
</table>
## How To Build And Install ICU
See the page on [building ICU4C](./build).
## How To Package ICU
See the page on [packaging ICU4C](./packaging).
## Important Notes About Using ICU
### Using ICU in a Multithreaded Environment
Some versions of ICU require calling the `u_init()` function from `uclean.h` to ensure that ICU is initialized properly. In those ICU versions, `u_init()` must be called before ICU is used from multiple threads. There is no harm in calling `u_init()` in a single-threaded application, on a single-CPU machine, or in other cases where `u_init()` is not required.
In addition to ensuring thread safety, `u_init()` also attempts to load at least one ICU data file. Assuming that all data files are packaged together (or are in the same folder in files mode), a failure code from `u_init()` usually means that the data cannot be found. In this case, the data may not be installed properly, or the application may have failed to call `udata_setCommonData()` or `u_setDataDirectory()` which specify to ICU where it can find its data.
Since `u_init()` will load only one or two data files, it cannot guarantee that all of the data that an application needs is available. It cannot check for all data files because the set of files is customizable, and some ICU services work without loading any data at all. An application should always check for error codes when opening ICU service objects (using `ucnv_open()`, `ucol_open()`, C++ constructors, etc.).
#### ICU 3.4 and later
ICU 3.4 self-initializes properly for multi-threaded use. It achieves this without performance penalty by hardcoding the core Unicode properties data, at the cost of some flexibility. (For details see Jitterbug 4497.)
`u_init()` can be used to check for data loading. It tries to load the converter alias table (`cnvalias.icu`).
#### ICU 2.6..3.2
These ICU versions require a call to `u_init()` before multi-threaded use. The services that are directly affected are those that don't have a service object and need to be fast: normalization and character properties.
`u_init()` loads and initializes the data files for normalization and character properties (`unorm.icu` and `uprops.icu`) and can therefore also be used to check for data loading.
#### ICU 2.4 and earlier
ICU 2.4 and earlier versions were not prepared for multithreaded use on multi-CPU platforms where the CPUs implement weak memory coherency. These CPUs include: Power4, Power5, Alpha, Itanium. `u_init()` was not defined yet.
#### Using ICU in a Multithreaded Environment on HP-UX
When ICU is built with aCC on HP-UX, the [`-AA`](http://h21007.www2.hp.com/portal/site/dspp/menuitem.863c3e4cbcdc3f3515b49c108973a801?ciid=eb08b3f1eee02110b3f1eee02110275d6e10RCRD) compiler flag is used. It is required in order to use the latest `<iostream>` API in a thread safe manner. This compiler flag affects the version of the C++ library being used. Your applications will also need to be compiled with `-AA` in order to use ICU.
#### Using ICU in a Multithreaded Environment on Solaris
##### Linking on Solaris
In order to avoid synchronization and threading issues, developers are **advised** to strictly follow the compiling and linking guidelines for multithreaded applications, specified in the following Sun Solaris document available from Oracle. Most notably, pay strict attention to the following statements from Sun:
> To use libthread, specify `-lthread` before `-lc` on the ld command line, or last on the cc command line.
>
> To use libpthread, specify `-lpthread` before `-lc` on the ld command line, or last on the cc command line.
Failure to do this may cause spurious lock conflicts, recursive mutex failure, and deadlock.
Source: "_Multithreaded Programming Guide, Compiling and Debugging_", Sun Microsystems, 2002
[https://docs.oracle.com/cd/E19683-01/806-6867/compile-74765/index.html](https://docs.oracle.com/cd/E19683-01/806-6867/compile-74765/index.html)
Note, a version of that chapter from a 2008 document update covering both Solaris 9 and Solaris 10 is available here:
[http://docs.oracle.com/cd/E19253-01/816-5137/compile-94179/index.html](http://docs.oracle.com/cd/E19253-01/816-5137/compile-94179/index.html)
### Windows Platform
If you are building on the Windows platform, it is important that you understand a few of the following build details.
#### DLL directories and the PATH setting
As delivered, the International Components for Unicode build as several DLLs, which are placed in the `<ICU>\bin64` directory. You must add this directory to the PATH environment variable in your system, or any executables you build will not be able to access International Components for Unicode libraries. Alternatively, you can copy the DLL files into a directory already in your PATH, but we do not recommend this: you can end up with multiple copies of a DLL and load the wrong one.
#### Changing your PATH
##### Windows 2000/XP and above
Use the System Icon in the Control Panel. Pick the "Advanced" tab. Select the "Environment Variables..." button. Select the variable `PATH` in the lower box, and select the lower "Edit..." button. In the "Variable Value" box, append the string `;<ICU>\bin64` to the end of the path string. If there is nothing there, just type in `<ICU>\bin64`. Click the Set button, then the OK button.
> :point_right: **Note**: When packaging a Windows application for distribution and installation on user systems, copies of the ICU DLLs should be included with the application, and installed for exclusive use by the application. This is the only way to ensure that your application is running with the same version of ICU, built with exactly the same options, that you developed and tested with. Refer to Microsoft's guidelines on the usage of DLLs, or search for the phrase "DLL hell" on [msdn.microsoft.com](http://msdn.microsoft.com/).
### UNIX Type Platform
If you are building on a UNIX platform, and if you are installing ICU in a non-standard location, you may need to add the location of your ICU libraries to your `LD_LIBRARY_PATH` or `LIBPATH` environment variable (or the equivalent runtime library path environment variable for your system). The ICU libraries may not link or load properly without doing this.
> :point_right: **Note**: If you do not want to have to set this variable, you may instead use the `--enable-rpath` option at configuration time. This option will instruct the linker to always look for the libraries where they are installed. You will need to use the appropriate linker options when linking your own applications and libraries against ICU, too. Please refer to your system's linker manual for information about runtime paths. The use of rpath also means that when building a new version of ICU you should not have an older version installed in the same place as the new version's installation directory, as the older libraries will be used during the build instead of the new ones, likely leading to an incorrectly built ICU. This is the proper behavior of rpath.
## Platform Dependencies
### Porting To A New Platform
If you are using ICU's Makefiles to build ICU on a new platform, there are a few places where you will need to add or modify some files. If you need more help, you can always ask the [icu-support mailing list](http://site.icu-project.org/contacts). Once you have finished porting ICU to a new platform, it is recommended that you contribute your changes back to ICU via the icu-support mailing list. This will make it easier for everyone to benefit from your work.
#### Data For a New Platform
For some people, it may not be necessary to completely build ICU. Most of the makefiles and build targets are for tools that are used for building ICU's data, and an application's data (when an application uses ICU resource bundles for its data).
Data files can be built on a different platform when both platforms share the same endianness and the same charset family. This assertion does not include platform dependent DLLs/shared/static libraries. For details see the User Guide [ICU Data](https://unicode-org.github.io/icu/userguide/icudata) chapter.
ICU 3.6 removes the requirement that ICU be completely built in the native operating environment. It adds the icupkg tool, which can be run on any platform to turn binary ICU data files from any one of the three formats into any one of the other data formats. This allows an application to use ICU data built on one platform on any other target platform.
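To make the "same endianness and same charset family" condition concrete, here is a small C sketch that classifies the host into one of the three data formats. It assumes the three formats are identified by the type letters `l` (little-endian, ASCII), `b` (big-endian, ASCII) and `e` (big-endian, EBCDIC), matching the trailing letter in data library names such as `icudtXYl`. The function names are hypothetical and not part of ICU.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 when the host stores the least significant byte first. */
int host_is_little_endian(void) {
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* inspect the first byte in memory */
    return first == 1;
}

/* Returns 1 on an ASCII-family host, 0 otherwise
 * (on EBCDIC hosts 'A' is 0xC1 rather than 0x41). */
int host_is_ascii_family(void) {
    return 'A' == 0x41;
}

/* Hypothetical helper: maps the two properties to a data type letter.
 * Returns '?' for little-endian EBCDIC, which has no letter in the
 * assumed scheme. */
char icu_data_type_letter(int little_endian, int ascii_family) {
    if (ascii_family) {
        return little_endian ? 'l' : 'b';
    }
    return little_endian ? '?' : 'e';
}
```

Two platforms for which `icu_data_type_letter()` gives the same letter can share binary data files directly; otherwise the files would first be converted, for example with icupkg.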
**WARNING!** Building ICU without running the tests is not recommended. The tests verify that ICU is safe to use. It is recommended that you try to completely port and test ICU before using the libraries for your own application.
#### Adapting Makefiles For a New Platform
Try to follow the build steps from the [UNIX](#how-to-build-and-install-on-unix) build instructions. If the configure script fails, then you will need to modify some files. Here are the usual steps for porting to a new platform:
1. Create an mh file in `<ICU>/source/config/`. You can use mh-linux or a similar mh file as your base configuration.
2. Modify `<ICU>/source/aclocal.m4` to recognize your platform's mh file.
3. Modify `<ICU>/source/configure.in` to properly set your **platform** C Macro define.
4. Run [autoconf](http://www.gnu.org/software/autoconf/) in `<ICU>/source/` without any options. The autoconf tool is standard on most Linux systems.
5. If you have any optimization options that you want to normally use, you can modify `<ICU>/source/runConfigureICU` to specify those options for your platform.
6. Build and test ICU on your platform. It is very important that you run the tests. If you don't run the tests, there is no guarantee that you have properly ported ICU.
### Platform Dependent Implementations
The platform dependencies have been mostly isolated into the following files in the common library. This information can be useful if you are porting ICU to a new platform.
* **unicode/platform.h.in** (autoconf'ed platforms)
**unicode/p_XXXX_.h** (others: pwin32.h, ppalmos.h, ..): Platform-dependent typedefs and defines:
* Generic types like `UBool`, `int8_t`, `int16_t`, `int32_t`, `int64_t`, `uint64_t` etc.
* `U_EXPORT` and `U_IMPORT` for specifying dynamic library import and export
* String handling support for the `char16_t` and `wchar_t` types.
* **unicode/putil.h, putil.c**: implementations of various platform-dependent functions:
* `uprv_isNaN`, `uprv_isInfinite`, `uprv_getNaN` and `uprv_getInfinity` for handling special floating point values.
* `uprv_tzset`, `uprv_timezone`, `uprv_tzname` and `time` for getting platform specific time and time zone information.
* `u_getDataDirectory` for getting the default data directory.
* `uprv_getDefaultLocaleID` for getting the default locale setting.
* `uprv_getDefaultCodepage` for getting the default codepage encoding.
* **umutex.h, umutex.c**: Code for doing synchronization in multithreaded applications. If you wish to use International Components for Unicode in a multithreaded application, you must provide a synchronization primitive that the classes can use to protect their global data against simultaneous modifications. We already supply working implementations for many platforms that ICU builds on.
* **umapfile.h, umapfile.c**: functions for mapping or otherwise reading or loading files into memory. All access by ICU to data from files makes use of these functions.
* Using platform-specific `#ifdef` macros is highly discouraged outside of the scope of these files. When the source code gets updated in the future, these `#ifdef`s can cause testing problems for your platform.
* * *
Copyright © 2016 and later: Unicode, Inc. and others. License & terms of use: [http://www.unicode.org/copyright.html](http://www.unicode.org/copyright.html)
Copyright © 1997-2016 International Business Machines Corporation and others. All Rights Reserved.

View file

@ -2,7 +2,7 @@
layout: default
title: Packaging ICU4C
nav_order: 3
parent: ICU Data
parent: ICU4C
---
<!--
© 2020 and later: Unicode, Inc. and others.
@ -26,6 +26,37 @@ This chapter describes, for the advanced user, how to package ICU4C for
distribution, whether alone, as part of an application, or as part of the
operating system.
There are many ways that a person can package ICU with their software products. Usually only the libraries need to be considered for packaging.
On UNIX, you should use `gmake install` to make it easier to develop and package ICU. The bin, lib and include directories are needed to develop applications that use ICU. These directories will be created relative to the `--prefix=dir` configure option (see the [UNIX build instructions](#how-to-build-and-install-on-unix)). When ICU is built on Windows, a similar directory structure is built.
When changes have been made to the standard ICU distribution, it is recommended that at least one of the following guidelines be followed for special packaging.
1. Add a suffix name to the library names. This can be done with the `--with-library-suffix` configure option.
2. The installation script should install the ICU libraries into the application's directory.
Following these guidelines prevents other applications that use a standard ICU
distribution from conflicting with any libraries that you need. On operating systems
that do not have a standard C++ ABI (name mangling) for compilers, it is recommended to
do this special packaging anyway. More details on customizing ICU are available in the
[User's Guide](https://unicode-org.github.io/icu/userguide/).
The [ICU Source Code Organization](./index/#icu-source-code-organization) section of
the ICU4C Readme gives a more complete description of the libraries.
ICU has several libraries for you to use. Here is an example of libraries that are frequently packaged.
| Library Name | Windows Filename | Linux Filename | Comment |
|-------------------------------------|------------------|------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Data Library | icudtXYl.dll | libicudata.so.XY.Z | Data required by the Common and I18n libraries. There are many ways to package and [customize this data](https://unicode-org.github.io/icu/userguide/icudata), but by default this is all you need. |
| Common Library | icuucXY.dll | libicuuc.so.XY.Z | Base library required by all other ICU libraries. |
| Internationalization (i18n) Library | icuinXY.dll | libicui18n.so.XY.Z | A library that contains many locale based internationalization (i18n) functions. |
| Layout Extensions Engine | iculxXY.dll | libiculx.so.XY.Z | An optional engine for doing paragraph layout that uses parts of ICU. HarfBuzz is required. |
| ICU I/O (Unicode stdio) Library | icuioXY.dll | libicuio.so.XY.Z | An optional library that provides a stdio like API with Unicode support. |
| Tool Utility Library | icutuXY.dll | libicutu.so.XY.Z | An internal library that contains internal APIs that are only used by ICU's tools. If you do not use ICU's tools, you do not need this library. |
Normally only the above ICU libraries need to be considered for packaging. The versionless symbolic links to these libraries are only needed for easier development. The _X_, _Y_ and _Z_ parts of the name are the version numbers of ICU. For example, ICU 2.0.2 would have the name libicuuc.so.20.2 for the common library. The exact format of the library names can vary between platforms due to how each platform handles library versioning.
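The Linux naming pattern from the table can be sketched in C as follows. This only reproduces the `lib<name>.so.XY.Z` pattern and the ICU 2.0.2 example given above. The function name `icu_posix_library_name` is hypothetical, and real file names may differ by platform and ICU version.

```c
#include <stdio.h>

/* Hypothetical helper: formats "lib<base>.so.XY.Z" into buf.
 * Returns the length snprintf() would have written, so a result
 * greater than or equal to size means the name was truncated. */
int icu_posix_library_name(char *buf, size_t size,
                           const char *base, int x, int y, int z) {
    return snprintf(buf, size, "lib%s.so.%d%d.%d", base, x, y, z);
}
```

For instance, calling it with the base name `icuuc` and the version parts 2, 0 and 2 yields `libicuuc.so.20.2`, the common library name quoted in the text.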
## Making ICU Smaller
The ICU project is intended to provide everything an application might need in
@ -118,11 +149,9 @@ data to be installed and removed without rebuilding ICU. For details, see the
## ICU Versions
(This section assumes the reader is familiar with ICU version numbers (§) as
(This section assumes the reader is familiar with [ICU version numbers](../design#version-numbers-in-icu) as
covered in the [Design](../design.md) chapter, and filename conventions for
libraries in the
[ReadMe](https://github.com/unicode-org/icu/blob/main/icu4c/readme.html#HowToPackage)
.)
libraries as described above.)
### POSIX Library Names

View file

@ -2,7 +2,7 @@
layout: default
title: Plug-ins
nav_order: 4
parent: ICU Data
parent: ICU4C
---
<!--
© 2020 and later: Unicode, Inc. and others.

View file

@ -1,8 +1,8 @@
---
layout: default
title: ICU4J FAQ
nav_order: 7
parent: Misc
nav_order: 1
parent: ICU4J
---
<!--
© 2020 and later: Unicode, Inc. and others.
@ -94,9 +94,8 @@ generate a change report page by following steps.
1. Download [ICU4J 64 source package
archive](http://site.icu-project.org/download/64#TOC-ICU4J-Download)
from the ICU 64 download page and extract files to your local system.
2. Set up ICU4J build environment as explained in
[readme.html](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/main/icu4j/readme.html)
included in the root directory of the ICU4J source package archive.
2. Set up ICU4J build environment as explained in the
[ICU4J Readme](./index).
3. Edit
[build.properties](https://github.com/unicode-org/icu/blob/main/icu4j/build.properties)
in the root directory and change the property value api.report.prev.version

View file

@ -1,8 +1,8 @@
---
layout: default
title: ICU4J Readme
nav_order: 8
parent: ICU
title: ICU4J
nav_order: 4
has_children: true
---
<!--
© 2020 and later: Unicode, Inc. and others.
@ -53,20 +53,6 @@ ICU4J is an add-on to the regular JRE that provides:
> :point_right: **Note:** We continue to provide assistance to Java, and in some cases, ICU4J support has been rolled into a later release of Java. For example, BCP47 language tag support including Unicode locale extensions is now in Java 7\. However, the most current and complete version is always found in ICU4J.
## What Is New In The Current Release?
See the [ICU download page](http://site.icu-project.org/download/) to find the subpage for the current release, including any other changes, bug fixes, known issues, changes to supported platforms and build environments, and migration issues for existing applications migrating from previous ICU releases.
The subpage for the current release will also include an API Change Report, both for ICU4C and ICU4J, for a complete list of APIs added, removed, or changed in this release.
The list of API changes since the previous ICU4J release is available [here](https://htmlpreview.github.io/?https://raw.githubusercontent.com/unicode-org/icu/main/icu4j/APIChangeReport.html).
Changes in previous releases can also be found on the main [ICU download page](http://site.icu-project.org/download) in its version-specific subpages.
## License Information
The ICU projects (ICU4C and ICU4J) are hosted by the [Unicode Consortium](http://www.unicode.org/). The ICU binary and source files are distributed under the [UNICODE DATA FILES AND SOFTWARE LICENSE](http://www.unicode.org/copyright.html). The full copy of the license and third party software licenses are available in [LICENSE](https://github.com/unicode-org/icu/blob/main/icu4j/main/shared/licenses/LICENSE) file included in this package.
## Platform Dependencies
The minimum Java runtime version supported by ICU4J 68 is version 7\. Java runtime version 6 is not supported.
@ -115,8 +101,7 @@ Below, all directory paths are relative to the directory where the ICU4J source
| Path | Description |
|------------------------------|-----------------------------------------------------------------------------------------|
| readme.html | A description of ICU4J (International Components for Unicode for Java) |
| build.html | The main Ant build file for ICU4J. See [How to Install and Build](#how-to-install-and-build) for more information |
| build.xml | The main Ant build file for ICU4J. See [How to Install and Build](#how-to-install-and-build) for more information |
| main/shared/licenses/LICENSE | ICU license |
### ICU4J runtime class files
@ -339,7 +324,7 @@ For more information, read the Ant documentation and the **build.xml** file.
> :point_right: **Note**: **Eclipse users:** See the ICU4J site for information on [how to configure Eclipse](http://site.icu-project.org/setup/eclipse) to build and develop ICU4J on Eclipse IDE.
> :point_right: **Note**: To install and configure ICU4J Locale Service Provider, please refer to the user guide page [ICU4J Locale Service Provider](https://unicode-org.github.io/icu/userguide/icu4j-locale-service-provider).
> :point_right: **Note**: To install and configure ICU4J Locale Service Provider, please refer to the user guide page [ICU4J Locale Service Provider](./locale-service-provider).
## Trying Out ICU4J
@ -412,7 +397,7 @@ The files in `icudata.jar` get extracted to `com/ibm/icu/impl/data` in the build
### Building ICU4J Resources from ICU4C
ICU4J data is built by ICU4C tools. Please see [`icu4j-readme.txt`](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/icu4j-readme.txt) in `icu4c/source/data` for the procedures.
ICU4J data is built by ICU4C tools. Please see [ICU Data Build Tool](../icu_data/buildtool) for the procedures.
#### Generating Data from CLDR
@ -426,7 +411,7 @@ ICU4J data is built by ICU4C tools. Please see [`icu4j-readme.txt`](https://gith
4. Follow the instructions in [`icu4c/source/data/cldr-icu-readme.txt`](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/cldr-icu-readme.txt)
5. Rebuild ICU4C with the newly generated data.
6. Run ICU4C tests to verify that the new data is good.
7. Build ICU4J data from ICU4C data by following the procedures in [`icu4j/source/data/icu4j-readme.txt`](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/icu4j-readme.txt)
7. Build ICU4J data from ICU4C data by following the procedures in [ICU Data Build Tool](../icu_data/buildtool)
8. cd to `icu4j` dir
9. Build and test icu4j
@ -446,8 +431,6 @@ Your comments are important to making ICU4J successful. We are committed to inve
To submit comments, request features and report bugs, please see [ICU bug database information](http://site.icu-project.org/bugs) or contact us through the [ICU Support mailing list](http://site.icu-project.org/contacts). While we are not able to respond individually to each comment, we do review all comments.
## Thank you for your interest in ICU4J!
* * *
© 2016 and later: Unicode, Inc. and others.

View file

@ -1,8 +1,8 @@
---
layout: default
title: ICU4J Locale Service Provider
nav_order: 7
parent: ICU
nav_order: 2
parent: ICU4J
---
<!--
© 2020 and later: Unicode, Inc. and others.

View file

@ -75,8 +75,8 @@ print messages when errors are found in your config file.
$ pip3 install --user hjson jsonschema
To build ICU4J with custom data, you must first build ICU4C with custom data
and then generate the JAR file. For more information, read
[icu4j-readme.txt](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/icu4j-readme.txt).
and then generate the JAR file. For more information on building ICU4J, read the
[ICU4J Readme](../icu4j/).
### Locale Slicing

View file

@ -1,7 +1,7 @@
---
layout: default
title: ICU Data
nav_order: 13
nav_order: 15
has_children: true
---
<!--
@ -1127,15 +1127,14 @@ corresponding resource files already in that directory.
1. [ICU4C](http://icu-project.org/download/)
2. Compilers and tools required for [building ICU4C](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/main/icu4c/readme.html#HowToBuild).
2. Compilers and tools required for [building ICU4C](../icu4c/build).
3. J2SE SDK version 5 or above
#### Procedure
1. Download and build ICU4C on a Windows or Linux machine. For instructions on downloading and building ICU4C, please click
[here](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/main/icu4c/readme.html#HowToBuild).
[here](../icu4c/build).
2. Follow the remaining instructions in
[*$icu4c_root*/source/data/icu4j-readme.txt](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/icu4j-readme.txt).
*$icu4c_root* is the root directory of ICU4C source package.
the [ICU4J Readme](../icu4j/).

View file

@ -1,7 +1,7 @@
---
layout: default
title: IO
nav_order: 11
nav_order: 13
has_children: true
---
<!--

View file

@ -1,7 +1,7 @@
---
layout: default
title: Layout Engine
nav_order: 12
nav_order: 14
has_children: true
---
<!--

View file

@ -1,7 +1,7 @@
---
layout: default
title: Locales and Resources
nav_order: 5
nav_order: 7
has_children: true
---
<!--

View file

@ -1,7 +1,7 @@
---
layout: default
title: Chars and Strings
nav_order: 3
nav_order: 5
has_children: true
---
<!--

View file

@ -1,7 +1,7 @@
---
layout: default
title: Transforms
nav_order: 8
nav_order: 10
has_children: true
---
<!--

View file

@ -1,7 +1,7 @@
---
layout: default
title: Use From...
nav_order: 14
nav_order: 16
has_children: true
---
<!--

File diff suppressed because it is too large Load diff

View file

@ -13,915 +13,15 @@ h3.doc { text-decoration: underline }
</head>
<body style="background-color: rgb(255, 255, 255);" lang="EN-US"
link="#0000ff" vlink="#800080">
<h1>International Components for Unicode for Java (ICU4J)</h1>
<h2>Read Me for ICU4J 69.1</h2>
(Last Update: 2021-Mar-16)
<hr size="2" width="100%">
<p>
<!-- <b>Note:</b> This is a major release of ICU4J. It contains bug fixes and adds implementations
of inherited API and introduces new API or functionality. -->
<!-- <b>Note:</b> This is a preview release of ICU4J 69.
The contents of this document may not reflect the recent changes done
for ICU 69 development. It is not recommended for production use. -->
<!-- <b>Note:</b> This is a development milestone of ICU4J 69.
The contents of this document may not reflect the recent changes done
for ICU 69 development. It is not recommended for production use. -->
<b>Note:</b> This is a release candidate of ICU4J 69.
The contents of this document may not reflect the recent changes done
for ICU 69 development. This release candidate is intended for those
wishing to verify ICU 69 integration before final release. It is not
recommended for production use.
</p>
<p>For the most recent release, see the <a
href="http://www.icu-project.org/download/"> ICU4J
download site</a>. </p>
<h2 class="doc">Contents</h2>
<ul type="disc">
<li><a href="#introduction">Introduction to ICU4J</a></li>
<li><a href="#changes">Changes In This Release</a></li>
<li><a href="#license">License Information</a></li>
<li><a href="#PlatformDependencies">Platform Dependencies</a></li>
<li><a href="#download">How to Download ICU4J</a></li>
<li><a href="#WhatContain">The Structure and Contents of ICU4J</a></li>
<li><a href="#API">Where to Get Documentation</a></li>
<li><a href="#HowToInstallJavac">How to Install and Build</a></li>
<li><a href="#HowToModularize">How to modularize ICU4J</a></li>
<li><a href="#tryingout">Trying Out ICU4J</a></li>
<li><a href="#resources">ICU4J Resource Information</a></li>
<li><a href="#timezone">About ICU4J Time Zone</a></li>
<li><a href="#WhereToFindMore">Where to Find More Information</a></li>
<li><a href="#SubmittingComments">Submitting Comments, Requesting
Features and Reporting Bugs</a></li>
</ul>
<h2 class="doc"><a name="introduction"></a>Introduction to ICU4J</h2>
<p>The International Components for Unicode (ICU) library provides
robust and
full-featured Unicode services on a wide variety of platforms. ICU
supports the
most current version of the Unicode standard, including support for
supplementary characters (needed for GB 18030 repertoire support).</p>
<p>Java provides a strong foundation for global programs, and IBM and
the
ICU team played a key role in providing globalization technology to
Java. But because of its long release schedule, Java cannot always keep
up with evolving standards. The ICU team continues to extend Java's
Unicode and internationalization support, focusing on improving
performance,
keeping current with the Unicode standard, and providing richer APIs,
while
remaining as compatible as possible with the original Java text and
internationalization API design.</p>
<p>ICU4J is an add-on to the regular JRE that provides:
</p>
<ul>
<li><a
href="https://unicode-org.github.io/icu/userguide/collation"><b>Collation</b></a>
&#8211; rule-based, up-to-date Unicode Collation Algorithm (UCA) sorting order<br>
&nbsp;&nbsp;&nbsp;&nbsp;For fast multilingual string comparison; faster
and more complete than
the J2SE implementation</li>
<li><a href="https://unicode-org.github.io/icu/userguide/conversion/detection"><b>Charset
Detection</b></a> &#8211; Recognition of various single and multibyte charsets<br>
&nbsp;&nbsp;&nbsp;&nbsp;Useful for recognizing untagged text data</li>
<li><a
href="https://unicode-org.github.io/icu/userguide/strings/unicodeset"><b>UnicodeSet</b></a>
&#8211; standard set operations optimized for sets of Unicode characters<br>
&nbsp;&nbsp;&nbsp;&nbsp;UnicodeSets can be built from string patterns
using any Unicode properties.</li>
<li><a href="https://unicode-org.github.io/icu/userguide/transforms"><b>Transforms</b></a>
&#8211; a flexible mechanism for Unicode text conversions<br>
&nbsp;&nbsp;&nbsp;&nbsp;Including Full/Halfwidth conversions,
Normalization, Case conversions, Hex
conversions, and transliterations between scripts (50+ pairs)</li>
<li><a
href="https://unicode-org.github.io/icu/userguide/transforms/normalization"><b>Unicode
Normalization</b></a> &#8211; NFC, NFD, NFKD, NFKC<br>
&nbsp;&nbsp;&nbsp;&nbsp;For canonical text representations, needed for
XML and the net</li>
<li><a
href="https://unicode-org.github.io/icu/userguide/datetime/calendar"><b>International
Calendars</b></a> &#8211; Arabic, Buddhist, Chinese, Hebrew, Japanese, Ethiopic, Islamic, Coptic and other calendars<br>
&nbsp;&nbsp;&nbsp;&nbsp;Required for correct presentation of dates in
certain countries</li>
<li><a
href="https://unicode-org.github.io/icu/userguide/format_parse/datetime"><b>Date
Format
Enhancements</b></a> &#8211; Date/time pattern generator, Relative date formatting, etc.<br>
&nbsp;&nbsp;&nbsp;&nbsp;Enhancements to the normal Java date
formatting.</li>
<li><a
href="https://unicode-org.github.io/icu/userguide/format_parse/numbers"><b>Number
Format
Enhancements</b></a> &#8211; Scientific Notation, Spelled-out, Compact decimal format, etc.<br>
&nbsp;&nbsp;&nbsp;&nbsp;Enhancements to the normal Java number
formatting. The spell-out format is
used for checks and similar documents</li>
<li><a
href="https://unicode-org.github.io/icu/userguide/boundaryanalysis"><b>Enhanced
Word-Break Detection</b></a> &#8211; Rule-based, supports Thai, Khmer, Chinese, etc.<br>
&nbsp;&nbsp;&nbsp;&nbsp;Required for correct support of Thai</li>
<li><a
href="https://unicode-org.github.io/icu/userguide/conversion/compression"><b>Unicode
Text
Compression</b></a> &#8211; Standard compression of Unicode text<br>
&nbsp;&nbsp;&nbsp;&nbsp;Suitable for large numbers of small fields,
where LZW and similar schemes
do not apply</li>
<li><a
href="https://unicode-org.github.io/icu/userguide/conversion"><b>Charset Conversion</b></a> &#8211; Conversion to and from different charsets.<br>
&nbsp;&nbsp;&nbsp;&nbsp;Plugs into Java CharsetProvider Service Provider Interface (SPI)</li>
<p>This readme has moved to the <a href="https://unicode-org.github.io/icu/userguide/icu/icu4j-readme/">ICU4J Readme</a>
section in the <a href="https://unicode-org.github.io/icu/">ICU User Guide</a>.</p>
</ul>
<blockquote>
<p><b>Note:</b> We continue to provide assistance to Java, and in some
cases, ICU4J support has been rolled into a later release of Java. For
example, BCP47 language tag support including Unicode locale extensions
is now in Java 7. However, the most current and complete version is always
found in ICU4J.</p>
</blockquote>
<hr />
<p> Copyright &copy; 2016 and later: Unicode, Inc. and others. License &amp; terms of use:
<a href="http://www.unicode.org/copyright.html">http://www.unicode.org/copyright.html</a><br/>
Copyright &copy; 1997-2016 International Business Machines Corporation and others.
All Rights Reserved.</p>
<h2 class="doc"><a name="changes"></a>Changes In This Release</h2>
<p>See the <a href="http://site.icu-project.org/download/69">ICU 69 download page</a>
for more information about changes in this release.</p>
<p>The list of API changes since the previous ICU4J release is available
<a href="APIChangeReport.html">here</a>.</p>
<h2 class="doc"><a name="license"></a>License Information</h2>
<p>
The ICU projects (ICU4C and ICU4J) are hosted by the
<a href="http://www.unicode.org/">Unicode Consortium</a>. The ICU binary
and source files are distributed under the
<a href="http://www.unicode.org/copyright.html">UNICODE DATA FILES
AND SOFTWARE LICENSE</a>. The full text of the license and the third-party
software licenses are available in the <a href="./main/shared/licenses/LICENSE">LICENSE</a>
file included in this package.
</p>
<h2 class="doc"><a name="PlatformDependencies"></a>Platform Dependencies</h2>
<p>
The minimum Java runtime version supported by ICU4J 69 is version 7. Java runtime version 6 is not supported.
</p>
<p>
ICU4J versions 63 and later depend on Java SE 7 functionality. Therefore, ICU4J only runs on
JRE version 7 or later. ICU4J 69 is tested on JRE 7, 8, 9, 10 and 11.
</p>
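<p>The version requirement above can be checked at startup. The following is a minimal
sketch that uses only the JDK; the class <code>JavaVersionCheck</code> and its methods are
hypothetical names chosen for illustration and are not part of ICU4J.</p>

```java
// Hypothetical helper, not part of ICU4J: compares the running Java
// version against the minimum stated above (Java 7).
class JavaVersionCheck {

    // Converts a java.specification.version value such as "1.7", "1.8",
    // "9" or "11" into its feature number (7, 8, 9, 11).
    static int featureVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            // Old scheme: "1.7" means Java 7.
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    // True when the given specification version is at least 'minimum'.
    static boolean meetsMinimum(String specVersion, int minimum) {
        return featureVersion(specVersion) >= minimum;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        System.out.println("Java " + spec + " meets the ICU4J 69 minimum: "
                + meetsMinimum(spec, 7));
    }
}
```

<p>The sketch handles both the older <code>1.x</code> numbering and the single-number
scheme used since Java 9.</p>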
<h2 class="doc"><a name="download"></a>How to Download ICU4J</h2>
<p>There are a few different ways to download the ICU4J releases.
</p>
<ul type="disc">
<li><b>Official Release:</b><br>
If you want to use ICU4J (as opposed to developing it), your best bet
is to download an official, packaged version of the ICU4J library files.
These versions are tested more thoroughly than day-to-day development
builds, and they are packaged in jar files for convenient download.
<ul>
<li><a href="http://www.icu-project.org/download/">ICU Download page</a>.</li>
<li>Maven repository:
<pre>
&lt;dependency&gt;
&lt;groupId&gt;com.ibm.icu&lt;/groupId&gt;
&lt;artifactId&gt;icu4j&lt;/artifactId&gt;
&lt;version&gt;69.1&lt;/version&gt;
&lt;/dependency&gt;
&lt;dependency&gt;
&lt;groupId&gt;com.ibm.icu&lt;/groupId&gt;
&lt;artifactId&gt;icu4j-charset&lt;/artifactId&gt;
&lt;version&gt;69.1&lt;/version&gt;
&lt;/dependency&gt;
&lt;dependency&gt;
&lt;groupId&gt;com.ibm.icu&lt;/groupId&gt;
&lt;artifactId&gt;icu4j-localespi&lt;/artifactId&gt;
&lt;version&gt;69.1&lt;/version&gt;
&lt;/dependency&gt;
</pre>
</ul>
</ul>
<ul type="disc">
<li><b>GitHub Source Repository:</b><br>
If you are interested in developing features, patches, or bug fixes for
ICU4J, you should probably be working with the latest version of the
ICU4J source code. You will need to clone and check out the code from our GitHub
repository to ensure that you have the most recent version of all of
the files. There are several ways to do this. Please follow the
directions that are contained on the <a
href="http://www.icu-project.org/repository/">Source
Repository page</a> for details.
</li>
</ul>
<p>For more details on how to download ICU4J directly from the web
site, please see the ICU download page at <a
href="http://www.icu-project.org/download/">http://www.icu-project.org/download/</a>
</p>
<h2 class="doc"><a name="WhatContain"></a>The Structure and Contents of
ICU4J</h2>
<p>Below, all directory paths are relative to the directory where the
ICU4J source archive is extracted.
</p>
<p><b>Information and build files:</b></p>
<table border="1">
<tr>
<th>Path</th>
<th>Description</th>
</tr>
<tr>
<td>readme.html</td>
<td>A description of ICU4J (International Components for Unicode for Java)</td>
</tr>
<tr>
<td>build.xml</td>
<td>The main Ant build file for ICU4J. See <a href="#HowToInstallJavac">How to Install
and Build</a> for more information</td>
</tr>
<tr>
<td>main/shared/licenses/LICENSE</td>
<td>ICU license</td>
</tr>
</table>
<p><b>ICU4J runtime class files:</b></p>
<table border="1">
<tr>
<th>Path</th>
<th>Sub-component Name</th>
<th>Build Dependencies</th>
<th>Public API Packages</th>
<th>Description</th>
</tr>
<tr>
<td>main/classes/charset</td>
<td>icu4j-charset</td>
<td>icu4j-core</td>
<td>com.ibm.icu.charset</td>
<td>Implementation of <code>java.nio.charset.spi.CharsetProvider</code>.
This sub-component is shipped as icu4j-charset.jar along with
ICU charset converter data files.</td>
</tr>
<tr>
<td>main/classes/collate</td>
<td>icu4j-collate</td>
<td>icu4j-core</td>
<td>com.ibm.icu.text<br>
com.ibm.icu.util</td>
<td>Collator APIs and implementation. Also includes some public API classes
that depend on Collator.
This sub-component is packaged as a part of icu4j.jar.</td>
</tr>
<tr>
<td>main/classes/core</td>
<td>icu4j-core</td>
<td>n/a</td>
<td>com.ibm.icu.lang<br>
com.ibm.icu.math<br>
com.ibm.icu.text<br>
com.ibm.icu.util</td>
<td>ICU core API classes and implementation.
This sub-component is packaged as a part of icu4j.jar.</td>
</tr>
<tr>
<td>main/classes/currdata</td>
<td>icu4j-currdata</td>
<td>icu4j-core</td>
<td>n/a</td>
<td>No public API classes. Provides access to currency display data.
This sub-component is packaged as a part of icu4j.jar.</td>
</tr>
<tr>
<td>main/classes/langdata</td>
<td>icu4j-langdata</td>
<td>icu4j-core</td>
<td>n/a</td>
<td>No public API classes. Provides access to language display data.
This sub-component is packaged as a part of icu4j.jar.</td>
</tr>
<tr>
<td>main/classes/localespi</td>
<td>icu4j-localespi</td>
<td>icu4j-core<br>
icu4j-collate<br>
</td>
<td>n/a</td>
<td>Implementation of various locale-sensitive service providers defined
in <code>java.text.spi</code> and <code>java.util.spi</code> in J2SE 6.0
or later Java releases.
This sub-component is shipped as icu4j-localespi.jar.</td>
</tr>
<tr>
<td>main/classes/regiondata</td>
<td>icu4j-regiondata</td>
<td>icu4j-core</td>
<td>n/a</td>
<td>No public API classes. Provides access to region display data.
This sub-component is packaged as a part of icu4j.jar.</td>
</tr>
<tr>
<td>main/classes/translit</td>
<td>icu4j-translit</td>
<td>icu4j-core</td>
<td>com.ibm.icu.text</td>
<td>Transliterator APIs and implementation.
This sub-component is packaged as a part of icu4j.jar.</td>
</tr>
</table>
<p><b>ICU4J unit test files:</b></p>
<table border="1">
<tr>
<th>Path</th>
<th>Sub-component Name</th>
<th>Runtime Dependencies</th>
<th>Description</th>
</tr>
<tr>
<td>main/tests/charset</td>
<td>icu4j-charset-tests</td>
<td>icu4j-charset<br>
icu4j-core<br>
icu4j-test-framework</td>
<td>Test suite for charset sub-component.</td>
</tr>
<tr>
<td>main/tests/collate</td>
<td>icu4j-collate-tests</td>
<td>icu4j-collate<br>
icu4j-core<br>
icu4j-test-framework</td>
<td>Test suite for collate sub-component.</td>
</tr>
<tr>
<td>main/tests/core</td>
<td>icu4j-core-tests</td>
<td>icu4j-core<br>
icu4j-currdata<br>
icu4j-langdata<br>
icu4j-regiondata<br>
icu4j-test-framework</td>
<td>Test suite for core sub-component.</td>
</tr>
<tr>
<td>main/tests/framework</td>
<td>icu4j-test-framework</td>
<td>icu4j-core</td>
<td>Common ICU4J unit test framework and utilities.</td>
</tr>
<tr>
<td>main/tests/localespi</td>
<td>icu4j-localespi-tests</td>
<td>icu4j-core<br>
icu4j-collate<br>
icu4j-currdata<br>
icu4j-langdata<br>
icu4j-localespi<br>
icu4j-regiondata<br>
icu4j-test-framework</td>
<td>Test suite for localespi sub-component.</td>
</tr>
<tr>
<td>main/tests/packaging</td>
<td>icu4j-packaging-tests</td>
<td>icu4j-core<br>
icu4j-test-framework</td>
<td>Test suite for sub-component packaging.</td>
</tr>
<tr>
<td>main/tests/translit</td>
<td>icu4j-translit-tests</td>
<td>icu4j-core<br>
icu4j-translit<br>
icu4j-test-framework</td>
<td>Test suite for translit sub-component.</td>
</tr>
</table>
<p><b>Others:</b></p>
<table border="1">
<tr>
<th>Path</th>
<th>Description</th>
</tr>
<tr>
<td>main/shared</td>
<td>Files shared by ICU4J sub-components under the <code>main</code> directory including:
<ul>
<li>ICU4J runtime data archive (icudata.jar).</li>
<li>ICU4J unit test data archive (testdata.jar).</li>
<li>Shared Ant build script and configuration files.</li>
<li>License files.</li>
</ul>
</td>
</tr>
<tr>
<td>demos</td>
<td>ICU4J demo programs.</td>
</tr>
<tr>
<td>perf-tests</td>
<td>ICU4J performance test files.</td>
</tr>
<tr>
<td>tools</td>
<td>ICU4J tools including:
<ul>
<li>Custom JavaDoc taglets used for generating ICU4J API references.</li>
<li>API report tool and data.</li>
<li>Other independent utilities used for ICU4J development.</li>
</ul>
</td>
</tr>
<tr>
<td>lib</td>
<td>Folder used for downloading dependency libraries.<br>
<b>Note:</b> ICU4J runtime libraries do not depend on any external libraries other
than the JDK. These dependencies are for testing (such as JUnit).</td>
</tr>
</table>
<h2 class="doc"><a name="API"></a>Where to get Documentation</h2>
<p>The <a href="https://unicode-org.github.io/icu/userguide/">ICU user's
guide</a> contains lots of general information about ICU, in its C,
C++, and Java incarnations.</p>
<p>The complete API documentation for ICU4J (javadoc) is available on
the ICU4J web site, and can be built from the sources:
</p>
<ul>
<li><a href="http://www.icu-project.org/apiref/icu4j/">Index
to all ICU4J API</a></li>
<li><a href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/CharsetDetector.html">Charset Detector</a> &#8211; Detection of charset from a byte stream</li>
<li>International Calendars &#8211;
<a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/util/BuddhistCalendar.html">Buddhist</a>,
<a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/util/ChineseCalendar.html">Chinese</a>,
<a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/util/CopticCalendar.html">Coptic</a>,
<a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/util/EthiopicCalendar.html">Ethiopic</a>,
<a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/util/GregorianCalendar.html">Gregorian</a>,
<a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/util/HebrewCalendar.html">Hebrew</a>,
<a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/util/IndianCalendar.html">Indian</a>,
<a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/util/IslamicCalendar.html">Islamic</a>,
<a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/util/JapaneseCalendar.html">Japanese</a>,
Persian, Dangi.</li>
<li>Time Zone Enhancements &#8211;
<a href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/util/BasicTimeZone.html">Time zone transition and rule detection</a>,
<a href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/util/VTimeZone.html">iCalendar VTIMEZONE formatting and parsing</a>,
<a href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/util/RuleBasedTimeZone.html">Custom time zones constructed by user defined rules</a>.
<li>Date Format Enhancements &#8211; <a href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/DateTimePatternGenerator.html">Date/Time Pattern Generator</a>,
<a href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/DateIntervalFormat.html">Date Interval Format</a>,
<a href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/DurationFormat.html">Duration Format</a>.
<li><a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/Normalizer.html">Unicode
Normalization</a> &#8211; Canonical text representation for W3C.</li>
<li><a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/NumberFormat.html">Number
Format Enhancements</a> &#8211; Scientific Notation, Spelled out.</li>
<li><a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/BreakIterator.html">Enhanced
word-break detection</a> &#8211; Rule-based, supports Thai</li>
<li><a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/Transliterator.html">Transliteration</a>
&#8211; A general framework for converting text from one format to another,
e.g. Cyrillic to Latin, or Hex to Unicode. </li>
<li>Unicode Text <a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeCompressor.html">Compression</a>
&amp; <a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeDecompressor.html">Decompression</a>
&#8211; 2:1 compression on English Unicode text.</li>
<li>Collation &#8211; <a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/RuleBasedCollator.html">Rule-based
sorting</a>, <a
href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/StringSearch.html">Efficient
multi-lingual searching</a>,
<a href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/AlphabeticIndex.html">Alphabetic indexing</a></li>
</ul>
<h2 class="doc"><a name="HowToInstallJavac"></a>How to Install and Build</h2>
<p>
To install ICU4J, simply place the pre-built jar file <strong>icu4j.jar</strong>
on your Java CLASSPATH. If you need Charset API support please also place
<strong>icu4j-charset.jar</strong> on your class path along with <strong>icu4j.jar</strong>.
</p>
<p>
To build ICU4J, you will need JDK 7 or later (JDK 8 is the reference environment for this release)
and Apache Ant version 1.9 or later. It's recommended to install both the JDK and Ant
somewhere <em>outside</em> the ICU4J directory. For example, on Linux you might install
these in <code>/usr/local</code>.</p>
<ul>
<li>Install JDK 8.</li>
<li>Install the <a href="http://ant.apache.org/"><strong>Apache Ant</strong></a>
1.9 or later.
<li>Set environment variables JAVA_HOME, ANT_HOME and PATH, for example:
<pre>
set JAVA_HOME=C:\jdk1.8.0
set ANT_HOME=C:\apache-ant
set PATH=%JAVA_HOME%\bin;%ANT_HOME%\bin;%PATH%</pre>
</li>
</ul>
<p>Once the JDK and Ant are configured, run the desired target defined in
<strong>build.xml</strong>. The default target is "jar", which compiles the ICU4J library
class files and creates the ICU4J jar files. For example:</p>
<blockquote>
<pre>C:\icu4j>ant
Buildfile: C:\icu4j\build.xml
info:
[echo] ----- Build Environment Information -------------------
[echo] Java Home: C:\jdk1.8.0\jre
[echo] Java Version: 1.8.0_181
[echo] Ant Home: C:\apache-ant
[echo] Ant Version: Apache Ant(TM) version 1.10.1 compiled on February 2 2017
[echo] OS: Windows 10
[echo] OS Version: 10.0
[echo] OS Arch: amd64
[echo] Host: ICUDEV
[echo] -------------------------------------------------------
core:
@compile:
[echo] build-local: ../../shared/../../build-local.properties
[echo] --- java compiler arguments ------------------------
[echo] source dir: C:\icu4j\main\classes\core/src
[echo] output dir: C:\icu4j\main\classes\core/out/bin
[echo] bootclasspath:
[echo] classpath:
[echo] source: 1.7
[echo] target: 1.7
[echo] debug: on
[echo] encoding: UTF-8
[echo] compiler arg: -Xlint:all,-deprecation,-dep-ann,-options,-overrides
[echo] ----------------------------------------------------
[mkdir] Created dir: C:\icu4j\main\classes\core\out\bin
[javac] Compiling 470 source files to C:\icu4j\main\classes\core\out\bin
[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
compile:
@copy:
[copy] Copying 24 files to C:\icu4j\main\classes\core\out\bin
set-icuconfig-datapath:
copy-data:
[unjar] Expanding: C:\icu4j\main\shared\data\icudata.jar into C:\icu4j\main\
classes\core\out\bin
[unjar] Expanding: C:\icu4j\main\shared\data\icutzdata.jar into C:\icu4j\mai
n\classes\core\out\bin
...
...
...
localespi:
@compile:
[echo] build-local: ../../shared/../../build-local.properties
[echo] --- java compiler arguments ------------------------
[echo] source dir: C:\icu4j\main\classes\localespi/src
[echo] output dir: C:\icu4j\main\classes\localespi/out/bin
[echo] bootclasspath:
[echo] classpath: C:\icu4j\main\classes\core\out\lib\icu4j-core.jar;C:
\icu4j\main\classes\collate\out\lib\icu4j-collate.jar
[echo] source: 1.7
[echo] target: 1.7
[echo] debug: on
[echo] encoding: UTF-8
[echo] compiler arg: -Xlint:all,-deprecation,-dep-ann,-options
[echo] ----------------------------------------------------
[mkdir] Created dir: C:\icu4j\main\classes\localespi\out\bin
[javac] Compiling 22 source files to C:\icu4j\main\classes\localespi\out\bin
compile:
@copy:
[copy] Copying 11 files to C:\icu4j\main\classes\localespi\out\bin
copy:
@jar:
[mkdir] Created dir: C:\icu4j\main\classes\localespi\out\lib
[copy] Copying 1 file to C:\icu4j\main\classes\localespi\out
[jar] Building jar: C:\icu4j\main\classes\localespi\out\lib\icu4j-localesp
i.jar
jar:
@src-jar:
[jar] Building jar: C:\icu4j\main\classes\localespi\out\lib\icu4j-localesp
i-src.jar
src-jar:
build:
jar:
[copy] Copying 1 file to C:\icu4j
[copy] Copying 1 file to C:\icu4j
BUILD SUCCESSFUL
Total time: 30 seconds</pre>
</blockquote>
<I>Note: The above output is an example. The numbers are likely to be different with the current version of ICU4J.</I>
<p>The following are some targets that you can provide to <b>ant</b>.
For more targets run <code>ant -projecthelp</code> or see the build.xml file.</p>
<table border="1">
<tr>
<th>jar (default)</th>
<td>Create ICU4J runtime library jar archives (<code>icu4j.jar</code>,
<code>icu4j-charset.jar</code> and <code>icu4j-localespi.jar</code>)
in the root ICU4J directory.</td>
</tr>
<tr>
<th>check</th>
<td>Build all ICU4J runtime library classes and corresponding unit test cases,
then run the tests.</td>
</tr>
<tr>
<th>clean</th>
<td>Remove all build output files.</td>
</tr>
<tr>
<th>main</th>
<td>Build all ICU4J runtime library sub-components (under the directory
<code>main/classes</code>).</td>
</tr>
<tr>
<th>tests</th>
<td>Build all ICU4J unit test sub-components (under the directory <code>main/tests</code>)
and their dependencies.</td>
</tr>
<tr>
<th>tools</th>
<td>Build the tools.</td>
</tr>
<tr>
<th>docs</th>
<td>Run javadoc over the ICU4J runtime library files, generating an HTML documentation
tree in the subdirectory <code>doc</code>.</td>
</tr>
<tr>
<th>jarDocs</th>
<td>Create ICU4J doc jar archive (<code>icu4jdocs.jar</code>) containing API reference
docs in the root ICU4J directory. </td>
</tr>
<tr>
<th>jarDemos</th>
<td>Create ICU4J demo jar archive (<code>icu4jdemos.jar</code>) in the root ICU4J
directory.</td>
</tr>
</table>
<p>For more information, read the Ant documentation and the <strong>build.xml</strong>
file.</p>
<p><b>Note:</b> If you get an OutOfMemoryError when you are running <tt>"ant check"</tt>,
you can set the heap size of the JVM by setting the environment variable JVM_OPTIONS
to the appropriate Java options.</p>
<p><b>Eclipse users:</b> See the ICU4J site for information on <a
href="http://site.icu-project.org/setup/eclipse">how to configure Eclipse</a>
to build and develop ICU4J in the Eclipse IDE.</p>
<p><b>Note:</b> To install and configure the ICU4J Locale Service Provider, please refer to the user guide
page <a href="https://unicode-org.github.io/icu/userguide/icu4j-locale-service-provider">ICU4J Locale
Service Provider</a>.</p>
<h2 class="doc"><a name="tryingout"></a>Trying Out ICU4J</h2>
<p><strong>Note:</strong> the demos provided with ICU4J are for the
most part undocumented. This list can show you where to look, but
you'll have to experiment a bit. The demos are <strong>unsupported</strong>
and may change or disappear without notice.</p>
<p>The icu4j.jar file contains only the ICU4J runtime library classes, not the
demo classes, so unless you build ICU4J there is little to try out.
</p>
<h3 class="doc">Charset</h3>
To try out the <strong>Charset</strong> package, build <strong>icu4j.jar</strong> and
<strong>icu4j-charset.jar</strong> using the 'jar' target.
You can use the charsets by placing these files on your classpath.
<blockquote><tt>java -cp $icu4j_root/icu4j.jar:$icu4j_root/icu4j-charset.jar &lt;your program&gt;</tt></blockquote>
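<p>Because the charset sub-component is an implementation of
<code>java.nio.charset.spi.CharsetProvider</code>, charsets are looked up through the
standard <code>java.nio.charset.Charset</code> API. The following is a minimal sketch using
only JDK calls; the class <code>CharsetLookup</code> is a hypothetical name, and which
additional charset names resolve once <strong>icu4j-charset.jar</strong> is on the classpath
depends on the converter data shipped in that jar.</p>

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of ICU4J: asks the JDK which charset
// implementation (if any) answers to a given name.
class CharsetLookup {

    // Returns the canonical name of the charset, or null when no installed
    // provider supports the name (or the name is not a legal charset name).
    static String resolve(String name) {
        try {
            return Charset.isSupported(name) ? Charset.forName(name).name() : null;
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException for malformed names.
            return null;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "UTF-8";
        String canonical = resolve(name);
        if (canonical == null) {
            System.out.println(name + " is not supported on this classpath");
        } else {
            // The implementing class shows which provider supplied the charset.
            System.out.println(name + " resolves to " + canonical + " ("
                    + Charset.forName(name).getClass().getName() + ")");
        }
    }
}
```

<p>Running the class with and without <strong>icu4j-charset.jar</strong> on the classpath
is one way to see which names the jar adds.</p>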
<h3 class="doc">Other demos</h3>
<p>The other demo programs are <strong>not supported</strong> and
exist only to let you experiment with the ICU4J classes. First, build ICU4J using <tt>ant&nbsp;jarDemos</tt>.
Then launch the demos as below:</p>
<blockquote><tt>java -jar $icu4j_root/icu4jdemos.jar</tt></blockquote>
<h2 class="doc"><a name="resources">ICU4J Resource Information</a></h2>
Starting with release 2.1, ICU4J includes its own
resource information
which is completely independent of the JRE resource information. (Note: in
ICU4J 2.8 to 3.4, time zone information depends on the underlying JRE.)
The ICU4J resource information is equivalent to the information in ICU4C and
many resources are, in fact, the same binary files that ICU4C uses.
<p>
By default the ICU4J distribution includes all of the standard resource
information. It is located under the directory com/ibm/icu/impl/data.
Depending on the service, the data is in different locations and in
different formats. <strong>Note:</strong> This will continue to change
from release to release, so clients should not depend on the exact
organization
of the data in ICU4J.</p>
<ul>
<li>The primary <b>locale data</b> is under the directory <tt>icudt69b</tt>,
as a set of <tt>".res"</tt> files whose names are the locale identifiers.
Locale naming is documented in the <code>com.ibm.icu.util.ULocale</code>
class, and the use of these names in searching for resources is documented
in <code>com.ibm.icu.util.UResourceBundle</code>.</li>
<li>The <b>break iterator data</b> is under the directory <tt>icudt69b/brkitr</tt>,
as a set of <tt>".res"</tt>, <tt>".brk"</tt> and <tt>".dict"</tt> files.</li>
<li>The <b>collation data</b> is under the directory <tt>icudt69b/coll</tt>,
as a set of <tt>".res"</tt> files.</li>
<li>The <b>currency display name data</b> is under the directory <tt>icudt69b/curr</tt>,
as a set of <tt>".res"</tt> files.</li>
<li>The <b>language display name data</b> is under the directory <tt>icudt69b/lang</tt>,
as a set of <tt>".res"</tt> files.</li>
<li>The <b>rule-based number format data</b> is under the directory
<tt>icudt69b/rbnf</tt>, as a set of <tt>".res"</tt> files.
<li>The <b>region display name data</b> is under the directory <tt>icudt69b/region</tt>,
as a set of <tt>".res"</tt> files.</li>
<li>The <b>rule-based transliterator data</b> is under the directory
<tt>icudt69b/translit</tt>, as a set of <tt>".res"</tt> files.</li>
<li>The <b>measurement unit data</b> is under the directory <tt>icudt69b/unit</tt>,
as a set of <tt>".res"</tt> files.</li>
<li>The <b>time zone display name data</b> is under the directory
<tt>icudt69b/zone</tt>, as a set of <tt>".res"</tt> files.</li>
<li>The <b>character property data</b> and default <b>Unicode Collation Algorithm
(UCA) data</b> are found under the directory <tt>icudt69b</tt>, as a set of
<tt>".icu"</tt> files. </li>
<li>The <b>normalization data</b> is found under the directory <tt>icudt69b</tt>,
as a set of <tt>".nrm"</tt> files. </li>
<li>The <b>character set converter data</b> is under the directory
<tt>icudt69b</tt>, as a set of <tt>".cnv"</tt> files. These files are
currently included only in icu4j-charset.jar.</li>
<li>The <b>time zone rule data</b> is under the directory
<tt>icudt69b</tt>, as <tt>zoneinfo64.res</tt>.</li>
<li>The <b>holiday data</b> is under the directory <tt>icudt69b</tt>,
as a set of <tt>".class"</tt> files, named <tt>"HolidayBundle_"</tt>
followed by the locale ID.</li>
</ul>
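<p>The layout listed above can be probed from the classpath with plain JDK calls. The
following is a minimal sketch, assuming the <tt>icudt69b</tt> directory name of this release;
the class <code>IcuDataProbe</code> and the file names passed to it are illustrative only,
and, as noted above, clients should not depend on the exact organization of the data.</p>

```java
// Hypothetical helper, not part of ICU4J: builds classpath resource names
// following the layout described above and checks whether they are present.
class IcuDataProbe {

    // Data root for this release; the directory name changes between releases.
    static final String DATA_ROOT = "com/ibm/icu/impl/data/icudt69b";

    // Builds a resource name; subDir may be null or empty for top-level files.
    static String resourcePath(String subDir, String fileName) {
        StringBuilder sb = new StringBuilder(DATA_ROOT);
        if (subDir != null && !subDir.isEmpty()) {
            sb.append('/').append(subDir);
        }
        return sb.append('/').append(fileName).toString();
    }

    // True when the resource is visible to the system class loader.
    static boolean isPresent(String path) {
        return ClassLoader.getSystemResource(path) != null;
    }

    public static void main(String[] args) {
        String[] paths = {
            resourcePath(null, "zoneinfo64.res"), // time zone rule data
            resourcePath("coll", "root.res"),     // illustrative collation file name
        };
        for (String path : paths) {
            System.out.println(path + " present: " + isPresent(path));
        }
    }
}
```

<p>Without <strong>icu4j.jar</strong> on the classpath, every probe simply reports that the
resource is absent.</p>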
<p>
Some of the data files alias or otherwise reference data from other
data files. One reason for this is that some locale names have
changed. For example, <tt>he_IL</tt> used to be <tt>iw_IL</tt>. In
order to support both names but not duplicate the data, one of the
resource files refers to the other file's data. In other cases, a
file may alias a portion of another file's data in order to save
space. Currently ICU4J provides no tool for revealing these
dependencies.</p>
<blockquote><strong>Note:</strong> Java's <code>Locale</code> class
silently converts the language code <tt>"he"</tt> to <tt>"iw"</tt>
when you construct the Locale (for versions of Java through Java 5). Thus
Java cannot be used to locate resources that use the <tt>"he"</tt>
language code. ICU, on the other hand, does not perform this
conversion in ULocale, and instead uses aliasing in the locale data to
represent the same set of data under different locale
ids.</blockquote>
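<p>Whether a given Java runtime performs this conversion can be checked directly. The
following is a small sketch using only the JDK; it goes through
<code>Locale.forLanguageTag</code> (available since Java 7) rather than the constructor,
which is an assumption made for this example, and the class name
<code>HebrewLocaleCheck</code> is hypothetical. The value printed depends on the Java
version in use.</p>

```java
import java.util.Locale;

// Hypothetical helper, not part of ICU4J: reports the language code that
// java.util.Locale stores for a requested language.
class HebrewLocaleCheck {

    // Returns the language code as Java's Locale reports it.
    static String javaLanguageCode(String language) {
        return Locale.forLanguageTag(language).getLanguage();
    }

    public static void main(String[] args) {
        // Prints either the requested code or the older code Java maps it to.
        System.out.println("requested he, Locale reports: " + javaLanguageCode("he"));
    }
}
```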
<p>
Resource files that use locale ids form a hierarchy, with up to four
levels: a root, language, region (country), and variant. Searches for
locale data attempt to match as far down the hierarchy as possible,
for example, <tt>"he_IL"</tt> will match <tt>he_IL</tt>, but
<tt>"he_US"</tt> will match <tt>he</tt> (since there is no <tt>US</tt>
variant for he), and <tt>"xx_YY"</tt> will match root (the
default fallback locale) since there is no <tt>xx</tt> language code
in the locale hierarchy. Again, see
<code>java.util.ResourceBundle</code> for more information.
</p>
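<p>The matching described above can be sketched in a few lines of plain Java. This is a
simplified illustration only: it truncates the locale id at each underscore and ends at
root, and it ignores aliasing and everything else the real lookup does. The class
<code>LocaleFallback</code> and its methods are hypothetical names, not ICU4J API.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of ICU4J: a simplified model of the
// locale hierarchy search described above.
class LocaleFallback {

    // Lists candidate names from most to least specific, ending in "root".
    static List<String> candidates(String localeId) {
        List<String> out = new ArrayList<String>();
        String current = localeId;
        while (!current.isEmpty()) {
            out.add(current);
            int cut = current.lastIndexOf('_');
            current = cut < 0 ? "" : current.substring(0, cut);
        }
        out.add("root");
        return out;
    }

    // Returns the first candidate that exists in 'available', or null if none does.
    static String resolve(String localeId, Set<String> available) {
        for (String candidate : candidates(localeId)) {
            if (available.contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> available = new HashSet<String>(Arrays.asList("root", "he", "he_IL"));
        for (String id : new String[] {"he_IL", "he_US", "xx_YY"}) {
            System.out.println(id + " matches " + resolve(id, available));
        }
    }
}
```

<p>With only <tt>root</tt>, <tt>he</tt> and <tt>he_IL</tt> available, the three ids resolve
to <tt>he_IL</tt>, <tt>he</tt> and <tt>root</tt> respectively, as in the description
above.</p>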
<p>
<strong>Currently ICU4J provides no tool for revealing these
dependencies</strong> between data files, so trimming the data
directly in the ICU4J project is a hit-or-miss affair. The key point
when you remove data is to make sure to remove all dependencies on
that data as well. For example, if you remove <tt>he.res</tt>, you
need to remove <tt>he_IL.res</tt>, since it is lower in the hierarchy,
and you must remove <tt>iw.res</tt>, since it references <tt>he.res</tt>, and
<tt>iw_IL.res</tt>, since it depends on it (and also references
<tt>he_IL.res</tt>).
</p>
<p>
Unfortunately, the jar tool in the JDK provides no way to remove items
from a jar file. Thus you have to extract the resources, remove the
ones you don't want, and then create a new jar file with the remaining
resources. See the jar tool information for how to do this. Before
're-jarring' the files, be sure to thoroughly test your application with
the remaining resources, making sure each required resource is
present.
</p>
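<p>As an alternative to extracting and re-packing by hand, the same result can be
approximated with the <code>java.util.jar</code> classes in the JDK. The following is a
minimal sketch, not an ICU4J tool: the class <code>JarTrimmer</code> is a hypothetical name,
the caller must supply the complete list of entry names to drop (including every dependent
file, as described above), and the sketch does not preserve entry timestamps or
signatures.</p>

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.Set;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical helper, not part of ICU4J: writes a copy of a jar file that
// leaves out the named entries.
class JarTrimmer {

    // Copies all entries except those in 'excluded'; returns the number kept.
    static int copyWithout(File source, File target, Set<String> excluded)
            throws IOException {
        int kept = 0;
        byte[] buffer = new byte[8192];
        try (JarFile in = new JarFile(source);
             JarOutputStream out = new JarOutputStream(new FileOutputStream(target))) {
            Enumeration<JarEntry> entries = in.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (excluded.contains(entry.getName())) {
                    continue; // drop this resource
                }
                // A fresh entry lets the output stream recompute sizes and checksums.
                out.putNextEntry(new JarEntry(entry.getName()));
                try (InputStream data = in.getInputStream(entry)) {
                    int n;
                    while ((n = data.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                }
                out.closeEntry();
                kept++;
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JarTrimmer <source.jar> <target.jar> <entry-to-drop>...
        Set<String> excluded =
                new HashSet<String>(Arrays.asList(args).subList(2, args.length));
        int kept = copyWithout(new File(args[0]), new File(args[1]), excluded);
        System.out.println("Kept " + kept + " entries");
    }
}
```

<p>Entry names are full paths inside the jar, so each file to drop must be given with its
directory prefix.</p>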
<h3 class="doc">Using additional resource files with ICU4J</h3>
<blockquote>
<table cellpadding="3" frame="border" rules="none" width="50%">
<tbody>
<tr>
<td><b><font color="red" size="+1">Warning:</font> Resource
file formats can change across releases of ICU4J!</b></td>
</tr>
<tr>
<td>The format of ICU4J resources is not part of the API.
Clients who develop their own resources for use with ICU4J should be
prepared to
regenerate them when they move to new releases of ICU4J.</td>
</tr>
</tbody>
</table>
</blockquote>
<p>
We are still developing ICU4J's resource mechanism. Currently it
is not possible to mix ICU's new binary <tt>.res</tt>
resources
with traditional Java-style <tt>.class</tt> or <tt>.txt</tt>
resources. We might
allow for this in a future release, but since the resource data and
format is not formally
supported, you run the risk of incompatibilities with future releases
of ICU4J.
</p>
<p>
Resource data in ICU4J is checked in to the repository as a jar file
containing the resource binaries, <tt>$icu4j_root/main/shared/data/icudata.jar</tt>.
This means that inspecting the contents of these resources is difficult.
They currently are compiled from ICU4C <tt>.txt</tt> file data. You
can view the contents of the ICU4C text resource files to understand
the contents of the ICU4J resources.
</p>
<p>
The files in <tt>icudata.jar</tt> get extracted to <tt>com/ibm/icu/impl/data</tt>
in the build output directory by some build targets.
</p>
<h3 class="doc"><a name="resourcesICU4C">Building ICU4J Resources from ICU4C</a></h3>
ICU4J data is built by ICU4C tools. Please see "icu4j-readme.txt" in icu4c/source/data for the procedures.
<h5> Generating Data from CLDR </h5>
<I> Note: This procedure assumes that all 3 sources are present</I>
<ol>
<li>Checkout or download CLDR version 'release-39'</li>
<li>Checkout ICU with tag 'release-69-1'</li>
<li>cd to icu4c/source/data directory</li>
<li>Follow the instructions in icu4c/source/data/cldr-icu-readme.txt</li>
<li>Rebuild ICU4C with the newly generated data.</li>
<li>Run ICU4C tests to verify that the new data is good.</li>
<li>Build ICU4J data from ICU4C data by following the procedures in icu4c/source/data/icu4j-readme.txt</li>
<li>cd to icu4j dir</li>
<li>Build and test icu4j</li>
</ol>
<h2 class="doc"><a name="timezone"></a>About ICU4J Time Zone</h2>
<p>The ICU4J library includes the latest time zone data, as of the release date.
However, time zone data is frequently updated in response
to changes made by local governments around the world. If you need to update
the time zone data, please refer to the ICU user guide topic
<a href="https://unicode-org.github.io/icu/userguide/datetime/timezone#updating-the-time-zone-data">Updating the Time Zone Data</a>.</p>
<p>You can optionally configure ICU4J date and time
service classes to use the underlying JDK TimeZone implementation (see the ICU4J API reference
<a href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/util/TimeZone.html">TimeZone</a>
for the details). When this configuration is enabled, ICU's own time zone data
won't be used and you have to get time zone data patches from the JRE vendor.</p>
<h2 class="doc"><a name="WhereToFindMore"></a>Where to Find More
Information</h2>
<p><a href="http://www.icu-project.org/">http://www.icu-project.org/</a>
is the home page of the International Components for Unicode development project.</p>
<h2 class="doc"><a name="SubmittingComments"></a>Submitting Comments,
Requesting Features and
Reporting Bugs</h2>
<p>Your comments are important to making ICU4J successful. We are
committed to investigating any bug reports or suggestions,
and will use your feedback to help plan future releases.</p>
<p>To submit comments, request features and report bugs,
please see <a href="http://www.icu-project.org/bugs.html">ICU bug database
information</a> or contact us through the <a
href="http://www.icu-project.org/contacts.html">ICU Support
mailing list</a>. While we are not able to respond individually to each comment, we do
review all comments.</p>
<br>
<br>
<h2>Thank you for your interest in ICU4J!</h2>
<br>
<hr align="center" size="2" width="100%">
<p><I><font size="-1">© 2016 and later: Unicode, Inc. and others.<br>
License &amp; terms of use: <a href="http://www.unicode.org/copyright.html">http://www.unicode.org/copyright.html</a>
</font></I></p>
</body>
</html>