XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (534 page)


, ass, or se (among others). Few surprises here.

Java also allows a collation to perform decomposition of combined characters. For example, the character ç can be decomposed into two characters, the letter c and a nonspacing cedilla. The advantage of doing this is that Unicode allows two ways of representing a word such as garçon, using either six codepoints or seven, and normalizing the text so that it uses only one of these forms gives better results when matching strings. For collating, Java chooses to use the decomposed form in which the accents are represented separately. (For more information on normalization, see the entry for the normalize-unicode() function on page 847.)
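The six-versus-seven codepoint distinction can be checked directly in Java. The following is a minimal sketch, not taken from the book: the class name CedillaForms and its helper methods are invented for illustration, and it uses java.text.Normalizer with the NFD form to stand in for the decomposition step described above.

```java
import java.text.Normalizer;

final class CedillaForms {
    // Precomposed form: ç is the single codepoint U+00E7.
    static final String COMPOSED = "gar\u00E7on";
    // Decomposed form: the letter c followed by the nonspacing cedilla U+0327.
    static final String DECOMPOSED = "garc\u0327on";

    // Counts Unicode codepoints rather than UTF-16 code units.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Canonical decomposition (NFD): accents become separate nonspacing characters.
    static String decompose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        System.out.println(codePoints(COMPOSED));        // 6
        System.out.println(codePoints(DECOMPOSED));      // 7
        // A plain codepoint comparison treats the two spellings as different.
        System.out.println(COMPOSED.equals(DECOMPOSED)); // false
        // After normalizing both to the decomposed form, they match.
        System.out.println(decompose(COMPOSED).equals(decompose(DECOMPOSED))); // true
    }
}
```

In XPath the corresponding step would be a call to normalize-unicode() on both strings before comparing them.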

Under such a collation, the string garçon is represented as seven collation units, the same as the collation units for the string garc,on, in which the cedilla is represented by a separate nonspacing character. The effect of this is that the result of contains("garçon", $t) is true when $t is any of ç, rç, or ço, and also when it is c or rc.
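The matching behavior described here can be approximated in Java without a real collator. The sketch below is an assumption-laden illustration rather than the mechanism an XSLT processor uses: it decomposes both strings with java.text.Normalizer (NFD) and then does an ordinary substring test, which models only the decomposition aspect of the collation and ignores collation strength and locale rules. The class name DecomposedContains and its methods are hypothetical.

```java
import java.text.Normalizer;

final class DecomposedContains {
    // Canonical decomposition: ç (U+00E7) becomes c followed by U+0327.
    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    // Substring test on the decomposed forms of both strings.
    // This stands in for matching on collation units; it is not a collator.
    static boolean containsDecomposed(String text, String search) {
        return nfd(text).contains(nfd(search));
    }

    public static void main(String[] args) {
        String text = "gar\u00E7on";
        String[] probes = { "\u00E7", "r\u00E7", "\u00E7o", "c", "rc" };
        for (String t : probes) {
            // Each probe matches once both sides are decomposed.
            System.out.println(t + " -> " + containsDecomposed(text, t)); // true
        }
        // Without decomposition, the plain letter c is not found in the precomposed string.
        System.out.println(text.contains("c")); // false
    }
}
```

The last line shows why decomposition matters for the c and rc cases: in the precomposed string there is no separate codepoint for the letter c, so a codepoint-by-codepoint search cannot find it.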
