From db89fd9197affd44560b04849bdf45374e3a9b69 Mon Sep 17 00:00:00 2001
From: George Rhoten
Date: Tue, 5 Feb 2002 04:19:24 +0000
Subject: [PATCH] ICU-1627 Undo the damage from the last edit (lose some of the changes too). Mozilla creates non-standard HTML.

X-SVN-Rev: 7563
---
 icu4c/readme.html | 2883 +++++++++++++++++++++++----------------------
 1 file changed, 1455 insertions(+), 1428 deletions(-)

diff --git a/icu4c/readme.html b/icu4c/readme.html
index f65c00ed190..c3107fd414c 100644
--- a/icu4c/readme.html
+++ b/icu4c/readme.html
@@ -1,12 +1,18 @@
- ReadMe for ICU
+ ReadMe for ICU

-International Components for Unicode
-ICU 2.0 ReadMe

-Version: 2002-Jan-11 -
Copyright © 1997-2002 International Business Machines Corporation -and others. All Rights Reserved. -
-
-

-Table of Contents

- - - -
-

-Introduction

-Today's software market is a global one in which it is desirable to develop -and maintain one application (single source/single binary) that supports -a wide variety of languages. The International Components for Unicode (C/C++) -provides tools to help write platform-independent applications that are -internationalized and localized, with support for: - -ICU has a sister project ICU4J -that extends the internationalization capabilities of Java to a level similar -to ICU. The ICU C/C++ project is also called ICU4C when a distinction is -necessary. -

-Getting -started

-This document describes how to build and install ICU on your machine. For -other information about ICU please see the following table of links. -
The ICU homepage also links to related information about writing internationalized -software. -
  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Here are some useful links regarding ICU and internationalization -in general. 
ICU Homepage: http://oss.software.ibm.com/icu/
ICU4J Homepage: http://oss.software.ibm.com/icu4j/
FAQ - Frequently Asked Questions about ICU: http://oss.software.ibm.com/icu/userguide/icufaq.html
ICU User's Guide: http://oss.software.ibm.com/icu/userguide/
Download ICU Releases: http://oss.software.ibm.com/icu/download/
API Documentation Online: http://oss.software.ibm.com/icu/apiref/
Online ICU Demos: http://oss.software.ibm.com/icu/demo/
Contacts & Bug Reports/Feature Requests: http://oss.software.ibm.com/icu/archives/
- -

Important: Please make sure you understand the Copyright -and License Information. -

-What -is new in this release?

-The following list concentrates on changes that affect existing applications -migrating from previous ICU releases. For more news about this release, -see the ICU 2.0 -download page. -

-Support for Unicode 3.1.1

-ICU 2.0 has been upgraded to support Unicode -3.1.1, which includes the addition of 44,946 new encoded characters. -These characters cover several historic scripts, several sets of symbols, -and a very large collection of additional CJK ideographs. -

As part of this upgrade, a number of ICU services have been reviewed -and improved with regard to handling supplementary characters (surrogate -pairs). In particular, normalization has been revamped to support supplementary -characters and to deliver higher performance. -

-Euro transition

-Locale data for countries that are switching their national currencies -to the Euro is updated to use the Euro symbol and appropriate currency -formatting. The old data is available in _PREEURO locale variants. The -_EURO variant selector can still be used to unambiguously get Euro currency -symbol formatting. For some time around the transition, software should -explicitly specify _PREEURO and _EURO variants to make sure to get the -intended currency format. -

For more on this topic see the developerWorks -article "Are you really ready for the Euro?". -

-API changes

-Functions that take C-style string input arguments with const UChar *src -and int32_t srcLength now consistently treat srcLength==-1 to mean that -the input string is NUL-terminated, and then compute srcLength=u_strlen(src) themselves. -

Functions that take C-style string output arguments with UChar *dest -and int32_t destCapacity now handle NUL-termination of the output string -consistently. If the output length is equal to destCapacity, then dest -is filled with the output string, without a terminating NUL, and a warning code is set. For details -about string handling see the User's -Guide Strings chapter. -
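The two string conventions described above can be sketched in plain C. This is a minimal illustration, not ICU code: DemoUChar, DemoStatus, demo_strlen and demo_copy are hypothetical names invented for this sketch (ICU itself uses UChar, UErrorCode and u_strlen), and the overflow branch is an assumption about how a function would behave when the output does not fit at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for ICU's UChar (one 16-bit code unit); an assumption of this sketch. */
typedef uint16_t DemoUChar;

/* Hypothetical status codes; ICU reports these conditions through UErrorCode. */
typedef enum {
    DEMO_OK = 0,
    DEMO_NOT_TERMINATED_WARNING, /* output fills dest exactly, no room for NUL */
    DEMO_BUFFER_OVERFLOW_ERROR   /* output does not fit in dest at all */
} DemoStatus;

/* Counts code units up to the terminating NUL, in the manner of u_strlen(). */
static int32_t demo_strlen(const DemoUChar *s) {
    int32_t n = 0;
    while (s[n] != 0) {
        ++n;
    }
    return n;
}

/* Copies src into dest and returns the full output length.
   srcLength == -1 means "src is NUL-terminated, measure it here". */
static int32_t demo_copy(DemoUChar *dest, int32_t destCapacity,
                         const DemoUChar *src, int32_t srcLength,
                         DemoStatus *status) {
    int32_t i;
    assert(src != NULL && status != NULL);
    if (srcLength == -1) {
        srcLength = demo_strlen(src);
    }
    if (srcLength > destCapacity) {
        /* Nothing is written; the caller learns the capacity it needs. */
        *status = DEMO_BUFFER_OVERFLOW_ERROR;
        return srcLength;
    }
    for (i = 0; i < srcLength; ++i) {
        dest[i] = src[i];
    }
    if (srcLength < destCapacity) {
        dest[srcLength] = 0;                   /* room left: NUL-terminate */
        *status = DEMO_OK;
    } else {
        *status = DEMO_NOT_TERMINATED_WARNING; /* exact fit: no NUL written */
    }
    return srcLength;
}
```

A caller that receives the warning status still has the complete output in dest, but must use the returned length rather than rely on a terminating NUL.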

Some APIs have been deprecated for a long time (more than a year) -and have been removed now. -
Some other APIs have been marked as deprecated because they -are replaced by improved APIs; the newly deprecated APIs will be available -for another year. In particular, the C++ classes UnicodeConverter, Unicode, -and BiDi are deprecated in favor of the equally powerful C APIs. -
A few draft APIs have changed, especially for transliteration. -

APIs that take a rules or pattern string (for collation, transliteration, -message formats, etc.) now also take a UParseError structure that -is filled with useful debugging information when a rule syntax error is -detected. This makes it easier to find problems in large rule sets. As a result, -the signatures of some functions have changed. The old signatures will -be available for about a year by #defining a constant. See the affected header -files for details. -

The C++ Normalizer class had a partially broken model for iterative -normalization; this is redone in a more consistent way. See the Normalizer -API documentation for details. -

-Memory and resource cleanup

-ICU is carefully tested for memory leaks. Some memory is held in internal -caches that do not normally get released during normal operation. These -are not leaks because ICU continues to use them as necessary. -

For memory leak testing and for a small number of applications -it can be useful to release all the memory that a library has allocated. -ICU 2.0 supports this with a new function u_cleanup() -that may be called after an application has released all ICU objects. u_cleanup() -will then release all of ICU's internal memory. The ICU libraries can then -even be unloaded cleanly without shutting down the process. -

-ICU versioning - C++ namespaces

-Beginning with ICU 2.0, multiple releases of ICU can be used in the same -process. Together with an arbitrary number of post-2.0 releases, one pre-2.0 -release can be loaded and active. -

This is achieved by renaming all library exports to include a release -number suffix. Each global function and each class is renamed in this way -using a header file with #defines. For C++, if the compiler supports namespaces, -all ICU C++ classes are defined in the "icu" namespace. If the compiler -does not support namespaces, then the classes are renamed instead. This -change also reduces the chance of naming collisions with other libraries. -
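The #define-based renaming described above can be sketched in a few lines of C. This is only an illustration under assumptions: demo_version, the _2_0 suffix and the helper macros are hypothetical names chosen for this sketch, and the actual names and suffixes are defined by ICU's own headers.

```c
#include <assert.h>
#include <string.h>

/* Renaming header (hypothetical): callers keep writing demo_version(),
   while the symbol that is actually exported carries a release suffix. */
#define demo_version demo_version_2_0

/* Because of the #define above, this defines demo_version_2_0. */
const char *demo_version(void) {
    return "2.0";
}

/* Two-step stringification so the macro is expanded before it is quoted. */
#define DEMO_STR2(x) #x
#define DEMO_STR(x) DEMO_STR2(x)

/* Returns the name under which demo_version is really exported. */
const char *demo_exported_name(void) {
    return DEMO_STR(demo_version);
}

/* Checks that the suffixed symbol exists and carries the release suffix. */
void demo_self_check(void) {
    assert(strcmp(demo_exported_name(), "demo_version_2_0") == 0);
    assert(strcmp(demo_version_2_0(), "2.0") == 0);
}
```

Since each release would use a different suffix, two releases built this way export disjoint symbol names, which is what allows them to coexist in one process.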

For details see the User's -Guide Design Chapter. -

-Data loading changed

-ICU data loading is simplified for most users. By default, the ICU build -creates a DLL/shared library that is linked directly with the common library -([lib]icuuc). By placing all ICU libraries including the data -library into the same folder, ICU should start up and find its data immediately. -Dynamic loading of data from DLLs/shared libraries is not supported any -more. -

Before ICU 2.0, ICU did not itself link directly with its data library, -but some ICU applications did (like the Xerces XML parser) and called udata_setCommonData(). -This is not necessary any more in the default case. -
On the other hand, this same technique can now be used to efficiently -load application data (e.g., for its own localization). An application -can build a data DLL/library of its own, link it, and call the new API -udata_setAppData(). -

For details on finding and loading ICU data and on options for portable, -common data files etc. see the User's -Guide ICU Data Chapter. -

-Collation improvements

-The performance of Japanese Katakana collation is improved, and the Japanese -collation is changed for conformance with the JIS X 4061 standard. The -improvement is in the handling of the length and iteration marks, making -the processing of regular letters faster. -

The JIS X 4061 standard specifies a 5-level sorting algorithm. Sorting -with all five levels according to JIS is achieved in ICU 2.0 with the "identical" -strength. The fifth level distinguishes regular character codes from compatibility -variants. -

There is special code to handle the fourth (quaternary) level of the -JIS standard, which distinguishes between Hiragana and Katakana letters. -In ICU 2.0 string comparisons (like ucol_strcoll), when using the "shifted" -option, this is slow because it generates complete sort keys for both strings. -This is not an issue if the "shifted" option is not used, or if the string -comparison is done with fewer levels. -

Quaternary strength, without the "shifted" option, is the default for -Japanese collation in ICU 2.0. -

Three-level sorting (tertiary strength) and lower, if sufficient, -is faster even with "shifted" on (for string comparisons, much -faster in this case). -

-License Change (for ICU 1.8.1 and up)

-The ICU projects (ICU4C and ICU4J) have changed their licenses from the -IPL (IBM Public License) to the X license. The X license is a non-viral -and recommended free software license that is compatible with the GNU GPL -license. This is effective starting with release 1.8.1 of ICU4C and release -1.3.1 of ICU4J. All previous ICU releases will continue to utilize the -IPL. New ICU releases will adopt the X license. The users of previous releases -of ICU will need to accept the terms and conditions of the X license in -order to adopt the new ICU releases. -

The main effect of the change is to provide GPL compatibility. The X -license is listed as GPL compatible, see the gnu page at http://www.gnu.org/philosophy/license-list.html#GPLCompatibleLicenses. -

The text of the X license is available at http://www.x.org/terms.htm. -The IBM version contains the essential text of the license, omitting the -X-specific trademarks and copyright notices. -

For more details please see the press -announcement and the Project -FAQ. -

-Transliterator improvements

-The transliterator service has undergone an extensive overhaul, in both -the rule-based engine and the built-in system rules. For a complete description -see the User's -Guide chapter on transliteration. - - -

-UnicodeSet Improvements

- - - -
-

-How -to Download the Source Code

-There are two ways to download ICU releases: - - -

-ICU -Source Code Organization

-In the descriptions below, <ICU> is the full path name -of the icu directory - the top level directory from the distribution archives -- in your file system. -
  - - - - - - - - - - - - - - -
The following files describe the code drop. 
readme.html: Describes the International Components for Unicode (this file)
license.html: Contains the text of the ICU license
- -
  -
  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +

Version: 2002-Feb-04
+ Copyright © 1997-2002 International Business Machines Corporation and + others. All Rights Reserved.

+ +
- - +

Table of Contents

- - + - +
  • Getting started
  • - - +
  • What is new in this release?
  • - - +
  • How to Download the Source Code
  • - - +
  • ICU Source Code Organization
  • - - +
  • + How to Build And Install ICU -
  • - + - +
  • Windows
  • - - +
  • Unix
  • - - +
  • OS/390 (zSeries)
  • - - +
  • OS/400 (iSeries)
  • + + - - +
  • + Important Notes About Using ICU -
  • - -
    The following directories contain source code and data files. 
    <ICU>/source/common/: The core Unicode and support functionality, such as resource bundles, -character properties, locales, codepage conversion, normalization, Unicode -properties, Locale, and UnicodeString.
    <ICU>/source/i18n/: Modules in i18n are generally the more data-driven, that is to say -resource bundle driven, components. These deal with higher level internationalization -issues such as formatting, collation, text break analysis, and transliteration.
    <ICU>/source/data: This directory contains the source data in text format which is compiled -into binary form during the ICU build process. It contains several subdirectories, -in which the data files are grouped by function. Note that the build process -must be run again after any changes are made to this directory. -
  • -brkitr/  Data files for character, word, sentence, and -line boundary analysis.
  • - -
  • -locales/  These .txt files contain ICU language and culture-specific -localization data. Two special bundles are root, which is the fallback -data and parent of other bundles, and index which contains a list -of installed bundles. The makefile resfiles.mk contains the list -of resource bundle files.
  • - -
  • -mappings/   Here are the code page converter tables, -.ucm files containing mappings to and from Unicode. These are compiled -into .cnv files. convrtrs.txt is the alias mapping table from various -converter name formats to ICU internal format and vice versa. It produces -cnvalias.dat. The makefiles which contain the list of converters to be -built are ucmfiles.mk, ucmcore.mk, and ucmebcdic.mk.
  • - -
  • -translit/   This directory contains Transliterator rules -as resource bundles, a makefile trnsfiles.mk containing the list -of installed system transliteration files, as well as the special bundle -translit_index -which lists the system transliterator aliases.
  • - -
  • -unidata/ This directory contains the Unicode data files. Please -see http://www.unicode.org/ for more -information.
  • - -
  • -misc/  The misc directory contains other data files which did -not fit into the above categories. Currently it only contains timezone.txt, -a generated file which is compiled into tz.dat, and containing time zone -information.
  • - -
  • -out/ This directory contains the assembled memory mapped files.
  • - -
  • -out/build This directory contains intermediate (compiled) files, -such as .cnv, .res, etc.
  • -
    <ICU>/source/test/intltest/: A test suite including all C++ APIs. For information about running -the test suite, see the User's Guide.
    <ICU>/source/test/cintltst/: A test suite written in C, including all C APIs. For information about -running the test suite, see the User's Guide.
    <ICU>/source/test/testdata: Source text files for data which is read by the tests. It contains -the subdirectories out/build/ which is used for intermediate files, -and out/ which contains the files test1.cnv through test4.cnv, -and testdata.dat.  Note that the tests call u_setDataDirectory("<ICU>/source/test/testdata/lib"), -so that ICU will load these files as if they were part of the ICU data -package, for testing purposes. This was formerly accomplished by setting -the ICU_DATA environment variable to point at these files. ICU_DATA should -not be set under normal circumstances.
    <ICU>/source/tools: Tools for generating the data files. Data files are generated by invoking -<ICU>/source/data/build/makedata.bat -on Win32 or <ICU>/source/make on Unix.
    <ICU>/source/samples: Various sample programs that use ICU
    <ICU>/source/extra: Unsupported API additions. Currently, it contains the 'ustdio' file -I/O library.
    <ICU>/source/layout: Contains the ICU layout engine (not a rasterizer).
    <ICU>/packaging -
    <ICU>/debian
    These directories contain scripts and tools for packaging the final -ICU build for various release platforms.
    <ICU>/source/config: Contains helper makefiles for platform-specific build commands. Used -by 'configure'.
    <ICU>/source/allinone: Contains top-level ICU project files, for instance to build all of -ICU under one MSVC project.
    - -

    -How -To Build And Install ICU

    +