ICU-986 add new API to ChoiceFormat using closures array

X-SVN-Rev: 4931
2025-04-17 02:37:25 +00:00 · 2001-06-11 17:21:09 +00:00 · 2001-06-11 17:21:09 +00:00 · 15fd2afcdb
commit 15fd2afcdb
parent df4993cedb
1 changed files with 283 additions and 79 deletions
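
Note on the change: the documentation removed by this commit built the "greater than 1" range with ChoiceFormat::nextDouble(1), while the new documentation uses a limit of 1 with a TRUE closure. The sketch below suggests why the two are equivalent for ordinary double values. It is an illustration only: std::nextafter is assumed here as a stand-in for ChoiceFormat::nextDouble, and both function names are hypothetical, not ICU API.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Old idiom (sketch): inclusive lower bound placed at the next
// representable double above `limit`.
bool inUpperRangeNextDouble(double limit, double x) {
    const double next =
        std::nextafter(limit, std::numeric_limits<double>::infinity());
    return x >= next;
}

// New idiom (sketch): exclusive lower bound at `limit` itself,
// which is what a TRUE closure expresses.
bool inUpperRangeClosure(double limit, double x) {
    return x > limit;
}
```

With a limit of 1.0, both functions reject 1.0 and accept every larger double, so the closure form states the same boundary without depending on floating-point adjacency.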
--- a/icu4c/source/i18n/unicode/choicfmt.h
+++ b/icu4c/source/i18n/unicode/choicfmt.h
@@ -29,106 +29,212 @@
 #include "unicode/fieldpos.h"
 #include "unicode/format.h"

-
 /**
- * A ChoiceFormat allows you to attach a format to a range of numbers.
- * It is generally used in a MessageFormat for doing things like plurals.
- * The choice is specified with an ascending list of doubles, where each item
- * specifies a half-open interval up to the next item:
+ * <p><code>ChoiceFormat</code> converts between ranges of numeric values
+ * and string names for those ranges. A <code>ChoiceFormat</code> splits
+ * the real number line <code>-Inf</code> to <code>+Inf</code> into two
+ * or more contiguous ranges. Each range is mapped to a
+ * string. <code>ChoiceFormat</code> is generally used in a
+ * <code>MessageFormat</code> for displaying grammatically correct
+ * plurals such as &quot;There are 2 files.&quot;</p>
+ * 
+ * <p>There are two methods of defining a <code>ChoiceFormat</code>; both
+ * are equivalent.  The first is by using a string pattern. This is the
+ * preferred method in most cases.  The second method is through direct
+ * specification of the arrays that make up the
+ * <code>ChoiceFormat</code>.</p>
+ * 
+ * <p><strong>Patterns</strong></p>
+ * 
+ * <p>In most cases, the preferred way to define a
+ * <code>ChoiceFormat</code> is with a pattern. Here is an example of a
+ * <code>ChoiceFormat</code> pattern:</p>
+ * 
+ * <pre>    0#are no files|1#is one file|1&lt;are many files</pre>
+ * 
+ * <p>The pattern consists of a number of <em>range specifiers</em>
+ * separated by vertical bars U+007C (<code>|</code>). There is no
+ * vertical bar after the last range.  Each range specifier is of the
+ * form <em>number separator string</em>.</p>
+ * 
+ * <p><em>Number</em> is a floating point number that can be parsed by a
+ * default <code>NumberFormat</code> for the US locale. It gives the
+ * lower limit of this range. The lower limit is either inclusive or
+ * exclusive, depending on the <em>separator</em>. (The upper limit is
+ * given by the lower limit of the next range.)  The Unicode infinity
+ * sign U+221E is recognized for positive infinity. It may be preceded by
+ * '<code>-</code>' (U+002D) to indicate negative infinity.</p>
+ * 
+ * <p><em>String</em> is the format string for this range, with special
+ * characters enclosed in single quotes (<code>'The #
+ * sign'</code>). Single quotes themselves are indicated by two single
+ * quotes in a row (<code>'o''clock'</code>).</p>
+ * 
+ * <p><em>Separator</em> is one of the following single characters:</p>
+ * 
+ * <ul>
+ *   <li>U+0023 (<code>#</code>) indicates that the lower limit given by
+ *     <em>number</em> is inclusive.  That is, the limit value belongs to
+ *     this range.  Another way of saying this is that the corresponding
+ *     closure is <code>FALSE</code>.  The Unicode less than or equals
+ *     sign U+2264 may be used in place of <code>#</code>.</li>
+ *   <li>U+003C (<code>&lt;</code>) indicates that the lower limit given
+ *     by <em>number</em> is exclusive.  This means that the limit
+ *     belongs to the prior range.  Another way of saying this is
+ *     that the corresponding closure is <code>TRUE</code>.</li>
+ * </ul>
+ * 
+ * <p>See below for more information about closures.</p>
+ * 
+ * <p><strong>Arrays</strong></p>
+ * 
+ * <p>A <code>ChoiceFormat</code> defining <code>n</code> intervals
+ * (<code>n</code> &gt;= 2) is specified by three arrays of
+ * <code>n</code> items:</p>
+ * 
+ * <ul>
+ *   <li><code>double limits[]</code> gives the start of each
+ *     interval. This must be a non-decreasing list of values, none of
+ *     which may be <code>NaN</code>.</li>
+ *   <li><code>UBool closures[]</code> determines whether each limit
+ *     value is contained in the interval below it or in the interval
+ *     above it. If <code>closures[i]</code> is <code>FALSE</code>, then
+ *     <code>limits[i]</code> is a member of interval
+ *     <code>i</code>. Otherwise it is a member of interval
+ *     <code>i-1</code>. If no closures array is specified, this is
+ *     equivalent to having all closures be <code>FALSE</code>. Closures
+ *     allow one to specify half-open, open, or closed intervals.</li>
+ *   <li><code>UnicodeString formats[]</code> gives the string label
+ *     associated with each interval.</li>
+ * </ul>
+ * 
+ * <p><strong>Formatting and Parsing</strong></p>
+ * 
+ * <p>During formatting, a number is converted to a
+ * string. <code>ChoiceFormat</code> accomplishes this by mapping the
+ * number to an interval using the following rule. Given a number
+ * <code>X</code> and an index value <code>j</code> in the range
+ * <code>0..n-1</code>, where <code>n</code> is the number of ranges:</p>
+ * 
+ * <blockquote><code>X</code> matches <code>j</code> if and only if
+ * <code>limits[j] &lt;= X &lt; limits[j+1]</code>
+ * </blockquote>
+ * 
+ * <p>(This assumes that all closures are <code>FALSE</code>.  If some
+ * closures are <code>TRUE</code> then the relations must be changed to
+ * <code>&lt;=</code> or <code>&lt;</code> as appropriate.) If there is
+ * no match, then either the first or last index is used, depending on
+ * whether the number is too low or too high. Once a number is mapped to
+ * an interval <code>j</code>, the string <code>formats[j]</code> is
+ * output.</p>
+ * 
+ * <p>During parsing, a string is converted to a
+ * number. <code>ChoiceFormat</code> finds the element
+ * <code>formats[j]</code> equal to the string, and returns
+ * <code>limits[j]</code> as the parsed value.</p>
+ * 
+ * <p><strong>Notes</strong></p>
+ * 
+ * <p>The first limit value does not define a range boundary. For
+ * example, in the pattern &quot;<code>1.0#a|2.0#b</code>&quot;, the
+ * intervals are [-Inf, 2.0) and [2.0, +Inf].  It appears that the first
+ * interval should be [1.0, 2.0).  However, since all values that are too
+ * small are mapped to range zero, the first interval is effectively
+ * [-Inf, 2.0).  The first limit value <em>is</em> still used during
+ * parsing, though: in this example, <code>parse(&quot;a&quot;)</code>
+ * returns 1.0.</p>
+ * 
+ * <p>There are no gaps between intervals and the entire number line is
+ * covered.  A <code>ChoiceFormat</code> maps <em>all</em> possible
+ * double values to a finite set of intervals.</p>
+ * 
+ * <p>The non-number <code>NaN</code> is mapped to interval zero during
+ * formatting.</p>
+ * 
+ * <p><strong>Examples</strong></p>
+ * 
+ * <p>Here is an example of two arrays that map the numbers
+ * <code>1..7</code> to the English day of the week abbreviations
+ * <code>Sun..Sat</code>. No closures array is given; this is the same as
+ * specifying all closures to be <code>FALSE</code>.</p>
+ * 
+ * <pre>    {1,2,3,4,5,6,7},
+ *     {&quot;Sun&quot;,&quot;Mon&quot;,&quot;Tue&quot;,&quot;Wed&quot;,&quot;Thu&quot;,&quot;Fri&quot;,&quot;Sat&quot;}</pre>
+ * 
+ * <p>Here is an example that maps the ranges [-Inf, 1), [1, 1], and (1,
+ * +Inf] to three strings. That is, the number line is split into three
+ * ranges: x &lt; 1.0, x = 1.0, and x &gt; 1.0.</p>
+ * 
+ * <pre>    {0, 1, 1},
+ *     {FALSE, FALSE, TRUE},
+ *     {&quot;no files&quot;, &quot;one file&quot;, &quot;many files&quot;}</pre>
+ * 
+ * <p>Here is a simple example that shows formatting and parsing: </p>
+ * 
 * <pre>
 * \code
- *     X matches j if and only if limit[j] <= X < limit[j+1]
- * \endcode
- * </pre>
- * If there is no match, then either the first or last index is used, depending
- * on whether the number is too low or too high.  The length of the array of
- * formats must be the same as the length of the array of limits.
- * For example,
- * <pre>
- * \code
- *      {1,2,3,4,5,6,7},
- *           {"Sun","Mon","Tue","Wed","Thur","Fri","Sat"}
- *      {0, 1, ChoiceFormat::nextDouble(1)},
- *           {"no files", "one file", "many files"}
- *  \endcode
- * </pre>
- * (nextDouble can be used to get the next higher double, to make the half-open
- * interval.)
- * <P>
- * Here is a simple example that shows formatting and parsing:
- * <pre>
- * \code
- *   void SimpleChoiceExample( void )
- *   {
+ *   #include &lt;unicode/choicfmt.h&gt;
+ *   #include &lt;unicode/unistr.h&gt;
+ *   #include &lt;iostream.h&gt;
+ *   
+ *   int main(int argc, char *argv[]) {
 *       double limits[] = {1,2,3,4,5,6,7};
- *       UnicodeString monthNames[] = {"Sun","Mon","Tue","Wed","Thur","Fri","Sat"};
- *       ChoiceFormat* form = new ChoiceFormat(limits, monthNames, 7 );
- *       ParsePosition* status = new ParsePosition(0);
+ *       UnicodeString dayNames[] = {
+ *           &quot;Sun&quot;,&quot;Mon&quot;,&quot;Tue&quot;,&quot;Wed&quot;,&quot;Thu&quot;,&quot;Fri&quot;,&quot;Sat&quot;};
+ *       ChoiceFormat fmt(limits, dayNames, 7);
 *       UnicodeString str;
- *       FieldPosition f1(0), f2(0);
- *       for (double i = 0.0; i <= 8.0; ++i) {
- *           status->setIndex(0);
- *           Formattable parseResult;
- *           str.remove();
- *           cout << i << " -> " << form->format(i,str, f1) 
- *                     << " -> " << parseResult << endl;
+ *       char buf[256];
+ *       for (double x = 1.0; x &lt;= 8.0; x += 1.0) {
+ *           fmt.format(x, str);
+ *           buf[str.extract(0, str.length(), buf, 256, &quot;&quot;)] = 0;
+ *           str.truncate(0);
+ *           cout &lt;&lt; x &lt;&lt; &quot; -&gt; &quot;
+ *                &lt;&lt; buf &lt;&lt; endl;
 *       }
- *       delete form;
- *       delete status;
- *       cout << endl;
+ *       cout &lt;&lt; endl;
+ *       return 0;
 *   }
 * \endcode
 * </pre>
- * Here is a more complex example, with a pattern format.
+ * 
+ * <p>Here is a more complex example using a <code>ChoiceFormat</code>
+ * constructed from a pattern together with a
+ * <code>MessageFormat</code>.</p>
+ * 
 * <pre>
 * \code
- *   void ComplexChoiceExample( void )
- *   {
+ *   #include &lt;unicode/choicfmt.h&gt;
+ *   #include &lt;unicode/msgfmt.h&gt;
+ *   #include &lt;unicode/unistr.h&gt;
+ *   #include &lt;iostream.h&gt;
+ * 
+ *   int main(int argc, char *argv[]) {
+ *       UErrorCode status = U_ZERO_ERROR;
 *       double filelimits[] = {0,1,2};
- *       UnicodeString filepart[] = {"are no files","is one file","are {2} files"};
+ *       UnicodeString filepart[] =
+ *           {&quot;are no files&quot;,&quot;is one file&quot;,&quot;are {0} files&quot;};
 *       ChoiceFormat* fileform = new ChoiceFormat(filelimits, filepart, 3 );
- *       UErrorCode success = U_ZERO_ERROR;
- *       const Format* testFormats[] = { fileform, NULL, NumberFormat::createInstance(success) };
- *       MessageFormat* pattform = new MessageFormat("There {0} on {1}", success );
- *       pattform->setFormats( testFormats, 3 );
- *       Formattable testArgs[] = {0L, "Disk_A", 0L};
+ *       Format* testFormats[] =
+ *           {fileform, NULL, NumberFormat::createInstance(status)};
+ *       MessageFormat pattform(&quot;There {0} on {1}&quot;, status );
+ *       pattform.adoptFormats(testFormats, 3);
+ *       Formattable testArgs[] = {0L, &quot;Disk A&quot;};
 *       FieldPosition fp(0);
 *       UnicodeString str;
- *       for (int32_t i = 0; i < 4; ++i) {
+ *       char buf[256];
+ *       for (int32_t i = 0; i &lt; 4; ++i) {
 *           Formattable fInt(i);
 *           testArgs[0] = fInt;
- *           testArgs[2] = testArgs[0];
- *           str.remove();
- *           pattform->format(testArgs, 3, str, fp, success );
- *           cout << "Output for i=" << i << " : " << str << endl;
+ *           pattform.format(testArgs, 2, str, fp, status );
+ *           buf[str.extract(0, str.length(), buf, 256, &quot;&quot;)] = 0;
+ *           str.truncate(0);
+ *           cout &lt;&lt; &quot;Output for i=&quot; &lt;&lt; i &lt;&lt; &quot; : &quot; &lt;&lt; buf &lt;&lt; endl;
 *       }
- *       delete pattform;
- *       cout << endl;
+ *       cout &lt;&lt; endl;
+ *       return 0;
 *   }
 * \endcode
 * </pre>
- * ChoiceFormat objects may be converted to and from patterns.  The
- * syntax of these patterns is [TODO fill in this section with detail].
- * Here is an example of a ChoiceFormat pattern:
- * <P>
- * You can either do this programmatically, as in the above example,
- * or by using a pattern (see ChoiceFormat for more information) as in:
- * <pre>
- * \code
- *        "0#are no files|1#is one file|1&lt;are many files"
- * \endcode
- * </pre>
- * Here the notation is:
- * <pre>
- *  \code
- *        <number> "#"  Specifies a limit value.
- *        <number> "<"  Specifies a limit of nextDouble(<number>).
- *        <number> ">"  Specifies a limit of previousDouble(<number>).
- *  \endcode
- * </pre>
- * Each limit value is followed by a string, which is terminated by
- * a vertical bar character ("|"), except for the last string, which
- * is terminated by the end of the string.
 */
 class U_I18N_API ChoiceFormat: public NumberFormat {
 public:
@@ -159,6 +265,27 @@ public:
                 const UnicodeString* formats,
                 int32_t count );

+    /**
+     * Construct a new ChoiceFormat with the given limits and formats.
+     * Copy the limits and formats (instead of adopting them).  By
+     * default, each limit in the array specifies the inclusive lower
+     * bound of its range, and the exclusive upper bound of the previous
+     * range.  However, if the closures element corresponding to a
+     * limit is TRUE, then the limit is the exclusive lower bound of its
+     * range, and the inclusive upper bound of the previous range.
+     * @param limits Array of limit values
+     * @param closures Array of booleans specifying whether each
+     * element of 'limits' is open or closed.  If FALSE, then the
+     * corresponding limit is a member of the range above it.  If TRUE,
+     * then the limit belongs to the range below it.
+     * @param formats Array of formats
+     * @param count Size of 'limits', 'closures', and 'formats' arrays
+     */
+    ChoiceFormat(const double* limits,
+                 const UBool* closures,
+                 const UnicodeString* formats,
+                 int32_t count);
+
    /**
     * Copy constructor.
     * @stable
@@ -225,6 +352,20 @@ public:
                              UnicodeString* formatsToAdopt,
                              int32_t count );  

+    /**
+     * Set the choices to be used in formatting.  The arrays are adopted
+     * and should not be deleted by the caller.  See class description
+     * for documentation of the limits, closures, and formats arrays.
+     * @param limitsToAdopt Array of limits to adopt
+     * @param closuresToAdopt Array of limit booleans to adopt
+     * @param formatsToAdopt Array of format strings to adopt
+     * @param count The size of the above arrays
+     */
+    virtual void adoptChoices(double* limitsToAdopt,
+                              UBool* closuresToAdopt,
+                              UnicodeString* formatsToAdopt,
+                              int32_t count);
+    
    /**
     * Set the choices to be used in formatting.
     *
@@ -240,12 +381,33 @@ public:
    virtual void setChoices(const double* limitsToCopy,
                            const UnicodeString* formatsToCopy,
                            int32_t count );    
+
+    /**
+     * Set the choices to be used in formatting.  See class description
+     * for documentation of the limits, closures, and formats arrays.
+     * @param limits Array of limits
+     * @param closures Array of limit booleans
+     * @param formats Array of format strings
+     * @param count The size of the above arrays
+     */
+    virtual void setChoices(const double* limits,
+                            const UBool* closures,
+                            const UnicodeString* formats,
+                            int32_t count);
+
    /**
     * Get the limits passed in the constructor.
     * @return the limits.
     * @stable
     */
    virtual const double* getLimits(int32_t& count) const;
+    
+    /**
+     * Get the limit booleans passed in the constructor.  The caller
+     * must not delete the result.
+     * @return the closures
+     */
+    virtual const UBool* getClosures(int32_t& count) const;

    /**
     * Get the formats passed in the constructor.
@@ -424,10 +586,52 @@ private:
     */
    static UnicodeString& dtos(double value, UnicodeString& string, UErrorCode& status);

+    static UMTX fgMutex;
    static NumberFormat* fgNumberFormat;
    static char fgClassID;

+    static const UnicodeString fgPositiveInfinity;
+    static const UnicodeString fgNegativeInfinity;
+
+    /**
+     * Each ChoiceFormat divides the range -Inf..+Inf into fCount
+     * intervals.  The intervals are:
+     *
+     *         0: fChoiceLimits[0]..fChoiceLimits[1]
+     *         1: fChoiceLimits[1]..fChoiceLimits[2]
+     *        ...
+     *  fCount-2: fChoiceLimits[fCount-2]..fChoiceLimits[fCount-1]
+     *  fCount-1: fChoiceLimits[fCount-1]..+Inf
+     *
+     * Interval 0 is special; during formatting (mapping numbers to
+     * strings), it also contains all numbers less than
+     * fChoiceLimits[0], as well as NaN values.
+     *
+     * Interval i maps to and from string fChoiceFormats[i].  When
+     * parsing (mapping strings to numbers), intervals map to
+     * their lower limit, that is, interval i maps to fChoiceLimits[i].
+     *
+     * The intervals may be closed, half open, or open.  This affects
+     * formatting but does not affect parsing.  Interval i is affected
+     * by fClosures[i] and fClosures[i+1].  If fClosures[i]
+     * is FALSE, then the value fChoiceLimits[i] is in interval i.
+     * That is, intervals i-1 and i are:
+     *
+     *  i-1:                 ... x < fChoiceLimits[i]
+     *    i: fChoiceLimits[i] <= x ...
+     *
+     * If fClosures[i] is TRUE, then the value fChoiceLimits[i] is
+     * in interval i-1.  That is, intervals i-1 and i are:
+     *
+     *  i-1:                ... x <= fChoiceLimits[i]
+     *    i: fChoiceLimits[i] < x ...
+     *
+     * Because of the nature of interval 0, fClosures[0] has no
+     * effect.
+
+     */
    double*         fChoiceLimits;
+    UBool*          fClosures;
    UnicodeString*  fChoiceFormats;
    int32_t         fCount;
 };
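
The interval rule documented in the class comment above (limits, closures, interval zero as the catch-all for small values and NaN) can be sketched in standalone C++. This is a minimal illustration under stated assumptions, not the ICU implementation: choiceIndex is a hypothetical helper, and std::vector stands in for the raw arrays the real API takes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an ICU function): returns the index of the
// interval that x maps to during formatting.
//   closures[i] == false: limits[i] belongs to interval i   (test x >= limits[i])
//   closures[i] == true:  limits[i] belongs to interval i-1 (test x >  limits[i])
// limits is assumed to be non-decreasing and free of NaN.
std::size_t choiceIndex(const std::vector<double>& limits,
                        const std::vector<bool>& closures,
                        double x) {
    assert(!limits.empty() && limits.size() == closures.size());
    if (std::isnan(x)) {
        return 0;  // NaN maps to interval zero
    }
    std::size_t index = 0;  // anything not past limits[1] stays in interval zero
    for (std::size_t i = 1; i < limits.size(); ++i) {
        const bool pastLimit = closures[i] ? (x > limits[i]) : (x >= limits[i]);
        if (!pastLimit) {
            break;  // limits are sorted, so no later interval can match
        }
        index = i;
    }
    return index;
}
```

With limits {0, 1, 1} and closures {false, false, true}, the value 1.0 lands in interval 1 ("one file") and anything greater lands in interval 2 ("many files"). The loop starts at index 1, so closures[0] is never consulted, which matches the note that fClosures[0] has no effect.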