ICU-9867 Added a note about pattern V semantics change in readme.html

X-SVN-Rev: 33170
Yoshito Umaoka 2013-02-11 20:12:57 +00:00
parent a5c74c721c
commit 3fcfaf6848


@ -5,7 +5,7 @@
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="Content-Style-Type" content="text/css2">
<title>ReadMe for ICU4J</title>
<meta name="COPYRIGHT" content="Copyright 2000-2012, International Business Machines Corporation and others. All Rights Reserved.">
<meta name="COPYRIGHT" content="Copyright 2000-2013, International Business Machines Corporation and others. All Rights Reserved.">
<style type="text/css">
h3.doc { background: #CCCCFF }
h4.doc { text-decoration: underline }
@ -14,8 +14,8 @@ h4.doc { text-decoration: underline }
<body style="background-color: rgb(255, 255, 255);" lang="EN-US"
link="#0000ff" vlink="#800080">
<h2>International Components for Unicode for Java (ICU4J)</h2>
<h3>Read Me for ICU4J 50 (50.1)</h3>
(Last Update: 2012-Nov-02)
<h3>Read Me for ICU4J 51</h3>
(Last Update: 2013-Feb-11)
<hr size="2" width="100%">
<p><b>Note:</b> This is a major release of ICU4J. It contains bug fixes and adds implementations
@ -125,47 +125,34 @@ found in ICU4J.</p>
<h3 class="doc"><a name="changes"></a>Changes In This Release</h3>
<p>See the <a href="http://sites.google.com/site/icusite/download/50">ICU 50 download page</a>
<p>See the <a href="http://sites.google.com/site/icusite/download/51">ICU 51 download page</a>
about new features in this release.
The list of API changes since the previous ICU4J release is available
<a href="http://source.icu-project.org/repos/icu/icu4j/tags/release-50-1/APIChangeReport.html">here</a>.</p>
<h4><code>Collator.ReorderCodes.DEFAULT</code> value</h4>
<p>We changed the numeric value of
<code>static final Collator.ReorderCodes.DEFAULT</code> from +1 to -1
so that it does not collide with a valid UScript code (<code>UScript.INHERITED</code>),
and to make it match <code>UScript.INVALID_CODE</code> and C/C++ <code>UCOL_REORDER_CODE_DEFAULT</code>,
as had been intended.
Programs using <code>Collator.ReorderCodes.DEFAULT</code> must be recompiled.</p>
<h4>Date format pattern "V"</h4>
<p>The date format pattern "V" was introduced in ICU 3.8 (inherited from CLDR 1.5) as
a variation of pattern "z" to support time zone abbreviations such as "PST".
Pattern "z" uses a time zone abbreviation only when it is commonly used in a locale.
Pattern "V" differed slightly from pattern "z": it produced a time zone abbreviation
even if the abbreviation was not commonly used in the locale. For example, the time
zone abbreviation "AEST" for Australian Eastern Standard Time might not be well recognized
by people in the United States. For that zone, pattern "z" does not use "AEST" (it falls back to
the UTC offset format "GMT+10:00"), while pattern "V" used to print "AEST".
In CLDR 21, the data used to determine whether an abbreviation is commonly used was removed entirely (CLDR
ticket <a href="http://unicode.org/cldr/trac/ticket/4052">#4052</a>), so there has been no difference
between patterns "z" and "V" since ICU 49 (based on the CLDR 21 specification).</p>
<h4><code>class DictionaryBasedBreakIterator</code> has been removed.</h4>
<p>In CLDR 23, the CLDR technical committee decided to reuse the semantically deprecated
pattern "V" for a different purpose. With the new specification, the date format pattern
"V" is used for short time zone IDs, such as "uslax" for zone America/Los_Angeles. ICU 51
implements the new specification. Existing ICU users whose custom date format
patterns contain pattern "V" are advised to change it to pattern "z".</p>
<p>The functionality of the <code>DictionaryBasedBreakIterator</code>
was moved into the base <code>RuleBasedBreakIterator</code>, and improved.
In particular, Java RBBI now handles multiple built-in dictionaries,
selecting them by character script, as in C++.
As a result, creating a BreakIterator for a particular language or dictionary is
unnecessary and obsolete.</p>
<p>Note that the existing pattern "VVVV" for a time zone's generic location name is not
affected by the new specification and continues to work the same as in
previous ICU releases.</p>
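<p>The migration suggested above can be sketched as a small pattern rewrite. This is only
an illustration, not part of the ICU4J API: the class <code>PatternVMigration</code> and the method
<code>replaceSingleV</code> are hypothetical names. The sketch assumes the usual date pattern quoting
rules (text between single quotes is literal) and changes only a lone "V", leaving "VVVV" and
any other run of "V" letters untouched.</p>

```java
public class PatternVMigration {

    /**
     * Hypothetical helper (not an ICU4J method): returns a copy of the given
     * date format pattern in which every unquoted, single-letter "V" field is
     * replaced by "z". Runs of two or more "V" (for example "VVVV") and text
     * inside single quotes are copied unchanged.
     */
    static String replaceSingleV(String pattern) {
        StringBuilder out = new StringBuilder(pattern.length());
        boolean inQuote = false;
        int i = 0;
        while (i < pattern.length()) {
            char c = pattern.charAt(i);
            if (c == '\'') {
                // A quote toggles literal mode; a doubled quote toggles twice,
                // which leaves the mode unchanged, as intended.
                inQuote = !inQuote;
                out.append(c);
                i++;
            } else if (c == 'V' && !inQuote) {
                // Measure the whole run of consecutive 'V' letters.
                int start = i;
                while (i < pattern.length() && pattern.charAt(i) == 'V') {
                    i++;
                }
                if (i - start == 1) {
                    out.append('z');                 // lone "V" becomes "z"
                } else {
                    out.append(pattern, start, i);   // e.g. "VVVV" stays as is
                }
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceSingleV("yyyy-MM-dd HH:mm V"));    // yyyy-MM-dd HH:mm z
        System.out.println(replaceSingleV("yyyy-MM-dd HH:mm VVVV")); // yyyy-MM-dd HH:mm VVVV
    }
}
```

<p>The rewritten string can then be passed to whatever code previously received the
pattern containing "V".</p>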
<p>The dictionary data structure was changed, so ICU 50 would not be able to
handle an old <code>InputStream dictionaryStream</code>.
(If we wanted to formally keep a <code>class DictionaryBasedBreakIterator</code>
it would be an empty class, and its one public constructor would
throw an <code>UnsupportedOperationException</code>.)</p>
<p>The DBBI class was not usable by itself:
A dictionary alone is not sufficient for dictionary-based breaking.
For each dictionary there must be specific code with the specific matching algorithm
(e.g., longest match vs. word-frequency-based word sequence probability)
and with a script-specific test for whether a dictionary word was found at a syllable boundary
(or whatever the appropriate criterion is).</p>
<p>RBBI and its DBBI subclasses had an "intimate connection" and were not effectively separable.</p>
<p>In C++, DBBI was not public API and was removed several years ago.</p>
<p>If any users have subclasses of <code>class DictionaryBasedBreakIterator</code>,
they will need to be changed to subclass <code>RuleBasedBreakIterator</code> directly.</p>
<h3 class="doc"><a name="license"></a>License Information</h3>
<p>
@ -1058,7 +1045,7 @@ review all comments.</p>
<h2>Thank you for your interest in ICU4J!</h2>
<br>
<hr align="center" size="2" width="100%">
<p><I><font size="-1">Copyright &copy; 2002-2012 International Business
<p><I><font size="-1">Copyright &copy; 2002-2013 International Business
Machines Corporation and others. All Rights
Reserved.<br>
4400 North First Street, San Jos&eacute;, CA 95193, USA