ICU-3650 Update gencnval and alias table documentation

X-SVN-Rev: 14691
George Rhoten 2004-03-12 06:44:55 +00:00
parent 2546c71963
commit 123c565384
3 changed files with 20 additions and 152 deletions


@ -19,14 +19,12 @@ subdir = tools/gencnval
SECTION = 1
MANX_FILES = $(TARGET:$(EXEEXT)=).$(SECTION)
MAN5_FILES = convrtrs.txt.5
GENERATED_MAN_FILES = $(TARGET).$(SECTION) convrtrs.txt.5
GENERATED_MAN_FILES = $(TARGET).$(SECTION)
ALL_MAN_FILES = $(MANX_FILES) $(MAN5_FILES)
ALL_MAN_FILES = $(MANX_FILES)
ICUDATADIR=$(top_builddir)/data/
CONVRTRSFILE=$(top_srcdir)/../data/convrtrs.txt
## Extra files to remove for 'make clean'
CLEANFILES = *~ $(GENERATED_MAN_FILES) $(DEPS) $(RES_FILES) $(TEST_FILES)
@ -45,7 +43,7 @@ DEPS = $(OBJECTS:.o=.d)
## List of phony targets
.PHONY : all all-local install install-local clean clean-local \
distclean distclean-local dist dist-local check \
check-local build-data install-man install-man5 install-manx
check-local build-data install-man install-manx
## Clear suffix list
.SUFFIXES :
@ -84,10 +82,7 @@ $(TARGET) : $(OBJECTS)
$(LINK.cc) $(OUTOPT)$@ $^ $(LIBS)
# man page
install-man: install-man5 install-manx
install-man5: $(MAN5_FILES)
$(MKINSTALLDIRS) $(DESTDIR)$(mandir)/man5
$(INSTALL_DATA) $? $(DESTDIR)$(mandir)/man5
install-man: install-manx
install-manx: $(MANX_FILES)
$(MKINSTALLDIRS) $(DESTDIR)$(mandir)/man$(SECTION)
$(INSTALL_DATA) $? $(DESTDIR)$(mandir)/man$(SECTION)
@ -95,9 +90,6 @@ install-manx: $(MANX_FILES)
%.$(SECTION): $(srcdir)/%.$(SECTION).in
cd $(top_builddir) \
&& CONFIG_FILES=$(subdir)/$@ CONFIG_HEADERS= $(SHELL) ./config.status
%.5: $(srcdir)/%.5.in
cd $(top_builddir) \
&& CONFIG_FILES=$(subdir)/$@ CONFIG_HEADERS= $(SHELL) ./config.status
# only on linux probably ?
#$(TARGET).ps: $(TARGET).$(SECTION)


@ -1,131 +0,0 @@
.\" Hey, Emacs! This is -*-nroff-*- you know...
.\"
.\" convrtrs.txt.5: manual page for the convrtrs.txt file
.\"
.\" Copyright (C) 2000-2002 IBM, Inc. and others.
.\"
.\" Manual page by Yves Arrouye <yves@realnames.com>.
.\"
.TH CONVRTRS.TXT 5 "22 July 2002" "ICU MANPAGE" "ICU @VERSION@ Manual"
.SH NAME
.B convrtrs.txt
\- ICU converters aliases file
.br
.B cnvalias.icu
\- binary ICU converters aliases file
.SH DESCRIPTION
The file
.B convrtrs.txt
lists the names of the converters that ICU can handle, along with
their known aliases. ICU can open a converter given either its real name or
any of its aliases.
.B convrtrs.txt
is read by
.BR gencnval (1)
in order to generate the binary data that ICU uses to represent the converters
aliases information.
.PP
Each converter and its aliases are described on separate lines; fields
on each line are separated by white space. The order of records in
.B convrtrs.txt
is significant: if a given name appears multiple times, the last one prevails.
Names of converters and aliases are compared without considering case; the
characters dash (U+002D HYPHEN-MINUS), underscore (U+005F LOW LINE), and
space (U+0020 SPACE) are also ignored during comparison
(even though spaces cannot be used in
.B convrtrs.txt
since white space is significant as a field delimiter).
Thus the names
.BR UTF-8 ,
.BR utf_8 ,
and
.BR "Utf 8"
are equivalent converters names.
.PP
The format of
.B convrtrs.txt
can be described by the following BNF grammar:
.PP
.RS
.nf
converters ::= tags { converter }
converter ::= name [ tags ] { alias }
alias ::= name [ tags ]
tags ::= '{' { tag } '}'
tag ::= standard{*}
comment ::= '#' \fIanything\fP
.fi
.RE
.PP
Line continuation and comment syntax are similar to the GNU make syntax.
Any lines beginning with whitespace (e.g. U+0020 SPACE or U+0009 HORIZONTAL
TABULATION) are presumed to be a continuation of the previous line.
.PP
The file must start with a list of recognized tags. These tags are used to
get the correct converter implementation based on the defined standard tag.
For instance, Shift-JIS on an IBM platform may be different from Shift-JIS
on a Windows platform.
.PP
A
.I name
can use any character other than white space and the '{' and '#' delimiters.
In practice, names are usually restricted to the set of uppercase and
lowercase latin letters plus arabic digits, the dash, the underscore,
and the colon characters. It is recommended to follow this convention
when naming new converters or their aliases.
.PP
A
.I comment
starts with the pound character '#' and ends with the current
line. Comments are ignored.
.PP
The
.I name
of a given
.I converter
must match its algorithmic name if the converter is algorithmic, or
its file name if the converter is table-driven. The table for the
converter
.B ibm-912
for example, is expected to be in the
.B ibm-912.cnv
file.
An
.I alias
has no such restriction, as aliases are just arbitrary names
associated to a given converter.
.PP
The presence of a
.I tag
after a converter or alias name means that this name is associated to
a given standard set of names. Two well-known such standards are the
.B MIME
and
.B IANA
registries of names. The default ICU
.B convrtrs.txt
file already uses these tags.
These tags must be declared at the beginning of the file.
Names appropriate for a given standard can be retrieved
programmatically by using the ucnv_getStandardName() or the
ucnv_openStandardNames() function. The asterisk (U+002A) is
used to note which standard name is the default, and the
preceding alias is returned by ucnv_getStandardName(). A standard
tag may have multiple aliases recognized by the same standard for
the same converter name.
.SH CAVEATS
The
.B convrtrs.txt
file is not directly read by ICU. It must be transformed into a binary
file by
.BR gencnval (1)
first. Also, depending on the way ICU was packaged, even the resulting
.B cnvalias.icu
file may not be read by ICU. Please refer to the ICU manual for more
information on which files are effectively read by ICU at runtime, and
how to produce them.
.SH COPYRIGHT
Copyright (C) 2000-2002 IBM, Inc. and others.
.SH SEE ALSO
.BR gencnval (1),
.BR pkgdata (1)
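
Note on the name-matching rule in the removed convrtrs.txt.5 page: the page says converter names and aliases are compared without regard to case, and that dash, underscore, and space are ignored. The following is a minimal sketch of that rule as the page states it, not ICU's actual implementation. The function names normalize_converter_name and same_converter_name are hypothetical, and ICU's real comparison may apply additional rules not described here.

```python
def normalize_converter_name(name: str) -> str:
    """Fold case and drop '-', '_' and ' ' (hypothetical helper, not an ICU API)."""
    ignored = {"-", "_", " "}
    return "".join(ch for ch in name.lower() if ch not in ignored)


def same_converter_name(a: str, b: str) -> bool:
    """Return True when two names are equivalent under the rule described above."""
    return normalize_converter_name(a) == normalize_converter_name(b)


if __name__ == "__main__":
    # The three spellings given as examples in the man page text.
    for candidate in ("UTF-8", "utf_8", "Utf 8"):
        print(candidate, "->", normalize_converter_name(candidate))
```

Under this sketch the three spellings from the page (UTF-8, utf_8, "Utf 8") all reduce to the same key, which is why they name the same converter.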


@ -2,11 +2,12 @@
.\"
.\" gencnval.1: manual page for the gencnval utility
.\"
.\" Copyright (C) 2000 IBM, Inc. and others.
.\" Copyright (C) 2000-2004 IBM, Inc. and others.
.\"
.\" Manual page by Yves Arrouye <yves@realnames.com>.
.\" Manual page by George Rhoten
.\"
.TH GENCNVAL 1 "16 April 2002" "ICU MANPAGE" "ICU @VERSION@ Manual"
.TH GENCNVAL 1 "11 March 2004" "ICU MANPAGE" "ICU @VERSION@ Manual"
.SH NAME
.B gencnval
\- compile the converters aliases file
@ -16,6 +17,9 @@
.BR "\-h\fP, \fB\-?\fP, \fB\-\-help"
]
[
.BR "\-v\fP, \fB\-\-verbose"
]
[
.BR "\-c\fP, \fB\-\-copyright"
]
[
@ -32,7 +36,7 @@
converts the ICU aliases file
.I converterfile
into the binary file
.BR cnvalias.dat .
.BR cnvalias.icu .
This binary file can then be read directly by ICU, or used by
.BR pkgdata (1)
for incorporation into a larger archive or library.
@ -47,6 +51,10 @@ file is used.
.BR "\-h\fP, \fB\-?\fP, \fB\-\-help"
Print help about usage and exit.
.TP
.BR "\-v\fP, \fB\-\-verbose"
Display verbose output. This information can include information about
conflicting aliases and the converters the aliases resolve to.
.TP
.BR "\-c\fP, \fB\-\-copyright"
Include a copyright notice in the binary data.
.TP
@ -71,14 +79,13 @@ important to make sure that it is present if
.B ICU_DATA
is set.
.SH FILES
.TP \w'\fB@thesysconfdir@/@PACKAGE@/convrtrs.txt'u+3n
.B @thesysconfdir@/@PACKAGE@/convrtrs.txt
Description of ICU's converters and their aliases.
.TP \w'\fB@PACKAGE@/source/data/mappings/convrtrs.txt'u+3n
.B @PACKAGE@/source/data/mappings/convrtrs.txt
Description of ICU's converters and their aliases. This data file is not
normally installed, and it is available as a part of ICU source code.
.SH VERSION
@VERSION@
.SH COPYRIGHT
Copyright (C) 2000-2002 IBM, Inc. and others.
Copyright (C) 2000-2004 IBM, Inc. and others.
.SH SEE ALSO
.BR convrtrs.txt (5)
.br
.BR pkgdata (1)
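
Note on the input format that gencnval compiles: the convrtrs.txt.5 page removed by this commit described a leading tag list, one converter per logical line followed by its aliases, optional { ... } tag groups after any name, '#' comments, and continuation lines that begin with white space. The sketch below reads text in that shape under those stated rules only. It is an illustration, not gencnval's parser. The names logical_lines and parse_alias_table are hypothetical, and the sample input is made up for the example rather than copied from ICU's convrtrs.txt.

```python
import re

# A token is either a whole "{ ... }" tag group or a name. Per the page, a
# name may use any character except white space, '{' and '#'.
TOKEN = re.compile(r"\{[^}]*\}|[^\s{#]+")


def logical_lines(text):
    """Strip '#' comments and fold continuation lines into the previous line."""
    lines = []
    for raw in text.splitlines():
        body = raw.split("#", 1)[0].rstrip()
        if not body.strip():
            continue  # blank or comment-only line
        if raw[0] in " \t" and lines:
            lines[-1] += " " + body.strip()  # leading white space: continuation
        else:
            lines.append(body.strip())
    return lines


def parse_alias_table(text):
    """Return (declared_tags, converters) for text in the described format."""
    lines = logical_lines(text)
    if not lines or not lines[0].startswith("{"):
        raise ValueError("file must start with the list of recognized tags")
    declared = lines[0].strip("{} \t").split()
    converters = {}
    for line in lines[1:]:
        names = []  # list of (name, [tags]); the first entry is the converter
        for tok in TOKEN.findall(line):
            if tok.startswith("{"):
                if not names:
                    raise ValueError("tag group without a preceding name")
                names[-1][1].extend(tok[1:-1].split())
            else:
                names.append((tok, []))
        (name, tags), aliases = names[0], names[1:]
        # A later record for the same converter name replaces the earlier one.
        converters[name] = {"tags": tags, "aliases": aliases}
    return declared, converters


# Made-up sample input for illustration only.
SAMPLE = (
    "{ IANA MIME }  # declared tags\n"
    "UTF-8 { IANA* MIME* } utf8\n"
    "    unicode-1-1-utf-8 { MIME }\n"
    "ibm-912 # no aliases\n"
)

if __name__ == "__main__":
    declared, converters = parse_alias_table(SAMPLE)
    print(declared)
    print(converters)
```

The sketch keeps the asterisk attached to the tag text (for example "IANA*") instead of interpreting it, since the page only says the asterisk marks which standard name is the default.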