ICU-8112 moved by srl, incorrectly filed under #7548: spoof api docs cleanup

X-SVN-Rev: 27695
Andy Heninger 2010-02-26 01:18:21 +00:00
parent ac780b9f96
commit cae10acab5

@ -56,7 +56,7 @@ U_NAMESPACE_USE
* from these Unicode documents.
*
* The tests available on identifiers fall into two general categories:
- * -# Single identier tests. Check whether an identifier is
+ * -# Single identifier tests. Check whether an identifier is
* potentially confusable with any other string, or is suspicious
* for other reasons.
* -# Two identifier tests. Check whether two specific identifiers are confusable.
@ -70,7 +70,7 @@ U_NAMESPACE_USE
* -# Perform the checks using the pre-configured USpoofChecker. The results indicate
* which (if any) of the selected tests have identified possible problems with the identifier.
* Results are reported as a set of USpoofChecks flags; this mirrors the form in which
- * the set of tests to perform was originally specified tothe USpoofChecker.
+ * the set of tests to perform was originally specified to the USpoofChecker.
*
* A USpoofChecker may be used repeatedly to perform checks on any number of identifiers.
*
@ -88,19 +88,19 @@ U_NAMESPACE_USE
* When testing whether pairs of identifiers are confusable, with the uspoof_areConfusable()
* family of functions, the relevant tests are
*
- * -# USPOOF_SINGLE_SCRIPT_CONFUSABLE: All of the characters from the two idenifiers are
+ * -# USPOOF_SINGLE_SCRIPT_CONFUSABLE: All of the characters from the two identifiers are
* from a single script, and the two identifiers are visually confusable.
* -# USPOOF_MIXED_SCRIPT_CONFUSABLE: At least one of the identifiers contains characters
* from more than one script, and the two identifiers are visually confusable.
- * -# USPOOF_WHOLE_SCRIPT_CONFUSABLE: Each of the two idenifiers is of a single script, but
- * the the two identifiers are from different scripts, and they are visually confusable.
+ * -# USPOOF_WHOLE_SCRIPT_CONFUSABLE: Each of the two identifiers is of a single script, but
+ * the two identifiers are from different scripts, and they are visually confusable.
*
* The safest approach is to enable all three of these checks as a group.
*
* USPOOF_ANY_CASE is a modifier for the above tests. If the identifiers being checked can
* be of mixed case and are used in a case-sensitive manner, this option should be specified.
*
- * If the identiers being checked are used in a case-insensitive manner, and if they are
+ * If the identifiers being checked are used in a case-insensitive manner, and if they are
* displayed to users in lower-case form only, the USPOOF_ANY_CASE option should not be
 * specified. Confusability issues involving upper case letters will not be reported.
*
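The flag guidance in the hunk above (enable the three confusable tests as a group, and add USPOOF_ANY_CASE only for case-sensitive identifiers) can be sketched as plain bitmask handling. This is a minimal illustration, not ICU code: the `SK_` constants, `confusable_checks()` and `is_confusable()` are hypothetical names, the constants are local copies so the sketch compiles without ICU, and the values 1 and 4 do not appear in this diff and are assumed from ICU's header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the USpoofChecks bit values, so the sketch builds
 * without ICU. The values 2 and 8 appear in this diff; 1 and 4 are
 * assumptions. Real code should include <unicode/uspoof.h> instead. */
enum {
    SK_SINGLE_SCRIPT_CONFUSABLE = 1,
    SK_MIXED_SCRIPT_CONFUSABLE  = 2,
    SK_WHOLE_SCRIPT_CONFUSABLE  = 4,
    SK_ANY_CASE                 = 8
};

/* Build the set of checks for comparing two identifiers:
 * all three confusable tests, plus the any-case modifier when the
 * identifiers are mixed case and used case-sensitively. */
int32_t confusable_checks(bool case_sensitive) {
    int32_t checks = SK_SINGLE_SCRIPT_CONFUSABLE
                   | SK_MIXED_SCRIPT_CONFUSABLE
                   | SK_WHOLE_SCRIPT_CONFUSABLE;
    if (case_sensitive) {
        checks |= SK_ANY_CASE;
    }
    return checks;
}

/* True if a result mask reports any of the three confusable tests.
 * SK_ANY_CASE is only a modifier, so it is not treated as a finding. */
bool is_confusable(int32_t result) {
    const int32_t findings = SK_SINGLE_SCRIPT_CONFUSABLE
                           | SK_MIXED_SCRIPT_CONFUSABLE
                           | SK_WHOLE_SCRIPT_CONFUSABLE;
    return (result & findings) != 0;
}
```

In a real program the mask would presumably be passed to uspoof_setChecks(), and the value returned by the uspoof_areConfusable() family would be tested the way `is_confusable()` does here.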
@ -108,10 +108,10 @@ U_NAMESPACE_USE
* the relevant tests are:
*
* -# USPOOF_MIXED_SCRIPT_CONFUSABLE: the identifier contains characters from multiple
- * scripts, and there exists an identier of a single script that is visually confusable.
+ * scripts, and there exists an identifier of a single script that is visually confusable.
* -# USPOOF_WHOLE_SCRIPT_CONFUSABLE: the identifier consists of characters from a single
* script, and there exists a visually confusable identifier.
- * The visally confusable identifier also consists of characters from a single script.
+ * The visually confusable identifier also consists of characters from a single script.
* but not the same script as the identifier being checked.
* -# USPOOF_ANY_CASE: modifies the mixed script and whole script confusables tests. If
 * specified, the checks will consider confusable characters of any case. If this flag is not
@ -121,7 +121,7 @@ U_NAMESPACE_USE
* This is not a test for confusable identifiers
* -# USPOOF_INVISIBLE: check an identifier for the presence of invisible characters,
* such as zero-width spaces, or character sequences that are
- * likely not to display, such as multiple occurences of the same
+ * likely not to display, such as multiple occurrences of the same
* non-spacing mark. This check does not test the input string as a whole
* for conformance to any particular syntax for identifiers.
* -# USPOOF_CHAR_LIMIT: check that an identifier contains only characters from a specified set
@ -129,10 +129,23 @@ U_NAMESPACE_USE
* uspoof_setAllowedLocales().
*
* Note on Scripts:
- * Characters from the Unicode Scripts "Common" and "Inherited" are ignored when consdering
+ * Characters from the Unicode Scripts "Common" and "Inherited" are ignored when considering
* the script of an identifier. Common characters include digits and symbols that
* are normally used with text from more than one script.
*
+ * Identifier Skeletons: A skeleton is a transformation of an identifier, such that
+ * all identifiers that are confusable with each other have the same skeleton.
+ * Using skeletons, it is possible to build a dictionary data structure for
+ * a set of identifiers, and then quickly test whether a new identifier is
+ * confusable with an identifier already in the set. The uspoof_getSkeleton()
+ * family of functions will produce the skeleton from an identifier.
+ *
+ * Note that skeletons are not guaranteed to be stable between versions
+ * of Unicode or ICU, so an application should not rely on creating a permanent,
+ * or difficult to update, database of skeletons. Instabilities result from
+ * identifying new pairs or sequences of characters that are visually
+ * confusable, and thus must be mapped to the same skeleton character(s).
+ *
*/
struct USpoofChecker;
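The skeleton dictionary described in the comment added above can be sketched as follows. This is an illustration of the lookup idea only, under loud assumptions: `toy_skeleton()` is a hypothetical stand-in for uspoof_getSkeleton() that folds a handful of ASCII look-alikes, not ICU's confusables data, and `SkeletonSet` with its two functions is an invented fixed-size container where a real application would more likely use a hash table.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_IDS 64   /* capacity of the toy set */
#define MAX_LEN 32   /* longer identifiers are truncated in this sketch */

/* Hypothetical stand-in for uspoof_getSkeleton(): maps a few ASCII
 * look-alikes to one representative. NOT ICU's real mapping.
 * cap must be at least 1. */
void toy_skeleton(const char *id, char *dest, size_t cap) {
    size_t i = 0;
    for (; id[i] != '\0' && i + 1 < cap; i++) {
        char c = id[i];
        if (c == '0' || c == 'O') c = 'o';
        else if (c == '1' || c == 'I') c = 'l';
        dest[i] = c;
    }
    dest[i] = '\0';
}

/* The set stores skeletons, not the original identifiers. */
typedef struct {
    char skeletons[MAX_IDS][MAX_LEN];
    size_t count;
} SkeletonSet;

/* True if some registered identifier has the same skeleton as id. */
bool skeleton_set_contains(const SkeletonSet *set, const char *id) {
    char sk[MAX_LEN];
    toy_skeleton(id, sk, sizeof sk);
    for (size_t i = 0; i < set->count; i++) {
        if (strcmp(set->skeletons[i], sk) == 0) return true;
    }
    return false;
}

/* Register id; returns false if it collides with an existing
 * skeleton or the set is full. */
bool skeleton_set_add(SkeletonSet *set, const char *id) {
    if (set->count >= MAX_IDS || skeleton_set_contains(set, id)) {
        return false;
    }
    toy_skeleton(id, set->skeletons[set->count], MAX_LEN);
    set->count++;
    return true;
}
```

Because skeletons are not guaranteed to be stable between Unicode or ICU versions, a set like this is best treated as a cache that is rebuilt from the original identifiers after an upgrade, rather than as permanent storage.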
@ -156,9 +169,9 @@ typedef enum USpoofChecks {
/** Mixed script confusable test.
* When checking a single identifier, report a problem if
* the identifier contains multiple scripts, and
- * is confusable with some other identifer in a single script
+ * is confusable with some other identifier in a single script
* When testing whether two identifiers are confusable, report that they are if
- * the two IDs are visually confusable, and
+ * the two IDs are visually confusable,
* and at least one contains characters from more than one script.
*/
USPOOF_MIXED_SCRIPT_CONFUSABLE = 2,
@ -167,7 +180,7 @@ typedef enum USpoofChecks {
* When checking a single identifier, report a problem if
* The identifier is of a single script, and
* there exists a confusable identifier in another script.
- * When testing whether two identfiers are confusable, report that they are if
+ * When testing whether two identifiers are confusable, report that they are if
* each is of a single script,
* the scripts of the two identifiers are different, and
* the identifiers are visually confusable.
@ -177,20 +190,20 @@ typedef enum USpoofChecks {
/** Any Case Modifier for confusable identifier tests.
If specified, consider all characters, of any case, when looking for confusables.
If USPOOF_ANY_CASE is not specified, identifiers being checked are assumed to have been
- case folded. Upper case conusable characters will not be checked.
+ case folded. Upper case confusable characters will not be checked.
Selects between Lower Case Confusable and
Any Case Confusable. */
USPOOF_ANY_CASE = 8,
- /** Check that an identifer contains only characters from a
+ /** Check that an identifier contains only characters from a
* single script (plus chars from the common and inherited scripts.)
 * Applies to checks of a single identifier only.
*/
USPOOF_SINGLE_SCRIPT = 16,
- /** Check an identifier for the presence of invisble characters,
+ /** Check an identifier for the presence of invisible characters,
* such as zero-width spaces, or character sequences that are
- * likely not to display, such as multiple occurences of the same
+ * likely not to display, such as multiple occurrences of the same
* non-spacing mark. This check does not test the input string as a whole
* for conformance to any particular syntax for identifiers.
*/
@ -223,7 +236,7 @@ uspoof_open(UErrorCode *status);
/**
 * Open a Spoof checker from its serialized form, stored in 32-bit-aligned memory.
* Inverse of uspoof_serialize().
- * The memory containing the serailized data must remain valid and unchanged
+ * The memory containing the serialized data must remain valid and unchanged
* as long as the spoof checker, or any cloned copies of the spoof checker,
* are in use. Ownership of the memory remains with the caller.
* The spoof checker (and any clones) must be closed prior to deleting the
@ -260,7 +273,7 @@ uspoof_openFromSerialized(const void *data, int32_t length, int32_t *pActualLeng
* input string is zero terminated.
* @param confusablesWholeScript
* a pointer to the whole script confusables definitions,
- * as found in the file xonfusablesWholeScript.txt from unicode.org.
+ * as found in the file confusablesWholeScript.txt from unicode.org.
* @param confusablesWholeScriptLen The length of the whole script confusables text, or
* -1 if the input string is zero terminated.
* @param errType In the event of an error in the input, indicates
@ -432,7 +445,7 @@ uspoof_getAllowedLocales(USpoofChecker *sc, UErrorCode *status);
*
* @param sc The USpoofChecker
* @param chars A Unicode Set containing the list of
- * charcters that are permitted. Ownership of the set
+ * characters that are permitted. Ownership of the set
* remains with the caller. The incoming set is cloned by
* this function, so there are no restrictions on modifying
* or deleting the USet after calling this function.
@ -479,7 +492,7 @@ uspoof_getAllowedChars(const USpoofChecker *sc, UErrorCode *status);
*
* @param sc The USpoofChecker
* @param chars A Unicode Set containing the list of
- * charcters that are permitted. Ownership of the set
+ * characters that are permitted. Ownership of the set
* remains with the caller. The incoming set is cloned by
* this function, so there are no restrictions on modifying
* or deleting the USet after calling this function.
@ -517,7 +530,7 @@ uspoof_getAllowedUnicodeSet(const USpoofChecker *sc, UErrorCode *status);
/**
* Check the specified string for possible security issues.
- * The text to be checked will typically be an indentifier of some sort.
+ * The text to be checked will typically be an identifier of some sort.
* The set of checks to be performed is specified with uspoof_setChecks().
*
* @param sc The USpoofChecker
@ -533,7 +546,7 @@ uspoof_getAllowedUnicodeSet(const USpoofChecker *sc, UErrorCode *status);
* is not needed.
* If the string passes the requested checks the
* parameter value will not be set.
- * @param status The error code, set if an error occured while attempting to
+ * @param status The error code, set if an error occurred while attempting to
* perform the check.
* Spoofing or security issues detected with the input string are
* not reported here, but through the function's return value.
@ -552,7 +565,7 @@ uspoof_check(const USpoofChecker *sc,
/**
* Check the specified string for possible security issues.
- * The text to be checked will typically be an indentifier of some sort.
+ * The text to be checked will typically be an identifier of some sort.
* The set of checks to be performed is specified with uspoof_setChecks().
*
* @param sc The USpoofChecker
@ -566,7 +579,7 @@ uspoof_check(const USpoofChecker *sc,
* is not needed.
* If the string passes the requested checks the
* parameter value will not be set.
- * @param status The error code, set if an error occured while attempting to
+ * @param status The error code, set if an error occurred while attempting to
* perform the check.
* Spoofing or security issues detected with the input string are
* not reported here, but through the function's return value.
@ -588,7 +601,7 @@ uspoof_checkUTF8(const USpoofChecker *sc,
#if U_SHOW_CPLUSPLUS_API
/**
* Check the specified string for possible security issues.
- * The text to be checked will typically be an indentifier of some sort.
+ * The text to be checked will typically be an identifier of some sort.
* The set of checks to be performed is specified with uspoof_setChecks().
*
* @param sc The USpoofChecker
@ -600,7 +613,7 @@ uspoof_checkUTF8(const USpoofChecker *sc,
* is not needed.
* If the string passes the requested checks the
* parameter value will not be set.
- * @param status The error code, set if an error occured while attempting to
+ * @param status The error code, set if an error occurred while attempting to
* perform the check.
* Spoofing or security issues detected with the input string are
* not reported here, but through the function's return value.
@ -649,7 +662,7 @@ uspoof_checkUnicodeString(const USpoofChecker *sc,
* @param length2 The length of the second string, expressed in
* 16 bit UTF-16 code units, or -1 if the string is
* zero terminated.
- * @param status The error code, set if an error occured while attempting to
+ * @param status The error code, set if an error occurred while attempting to
* perform the check.
* Confusability of the strings is not reported here,
* but through this function's return value.
@ -682,7 +695,7 @@ uspoof_areConfusable(const USpoofChecker *sc,
 * confusability. The strings are in UTF-8 format.
* @param length2 The length of the second string in bytes, or -1
* if the string is zero terminated.
- * @param status The error code, set if an error occured while attempting to
+ * @param status The error code, set if an error occurred while attempting to
* perform the check.
* Confusability of the strings is not reported here,
* but through this function's return value.
@ -713,7 +726,7 @@ uspoof_areConfusableUTF8(const USpoofChecker *sc,
* confusability. The strings are in UTF-8 format.
* @param s2 The second of the two strings to be compared for
 * confusability. The strings are in UTF-8 format.
- * @param status The error code, set if an error occured while attempting to
+ * @param status The error code, set if an error occurred while attempting to
* perform the check.
* Confusability of the strings is not reported here,
* but through this function's return value.
@ -755,7 +768,7 @@ uspoof_areConfusableUnicodeString(const USpoofChecker *sc,
* @param destCapacity The length of the output buffer, in 16 bit units.
* The destCapacity may be zero, in which case the function will
* return the actual length of the skeleton.
- * @param status The error code, set if an error occured while attempting to
+ * @param status The error code, set if an error occurred while attempting to
* perform the check.
* @return The length of the skeleton string. The returned length
* is always that of the complete skeleton, even when the
@ -794,7 +807,7 @@ uspoof_getSkeleton(const USpoofChecker *sc,
* @param destCapacity The length of the output buffer, in bytes.
* The destCapacity may be zero, in which case the function will
* return the actual length of the skeleton.
- * @param status The error code, set if an error occured while attempting to
+ * @param status The error code, set if an error occurred while attempting to
* perform the check. Possible Errors include U_INVALID_CHAR_FOUND
* for invalid UTF-8 sequences, and
* U_BUFFER_OVERFLOW_ERROR if the destination buffer is too small
@ -835,7 +848,7 @@ uspoof_getSkeletonUTF8(const USpoofChecker *sc,
* @param destCapacity The length of the output buffer, in bytes.
* The destCapacity may be zero, in which case the function will
* return the actual length of the skeleton.
- * @param status The error code, set if an error occured while attempting to
+ * @param status The error code, set if an error occurred while attempting to
* perform the check.
* @return A reference to the destination (skeleton) string.
*