XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition (535 page)


, but not (and here's the surprise) when it is co.

I've written garc,on to illustrate that the c and the cedilla are two separate Unicode codepoints. But of course the cedilla is actually a nonspacing character, so in real life this string of seven codepoints would appear on the page as garçon.
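To make the two forms concrete, here is a minimal sketch using the standard java.text.Normalizer class. The class name CedillaForms and its helper methods are invented for this illustration; only Normalizer and the String methods belong to the Java library.

```java
import java.text.Normalizer;

// Hypothetical helper class, invented for this illustration.
class CedillaForms {

    // Canonical decomposition: the c-cedilla becomes c followed by U+0327.
    static String decompose(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    // Number of Unicode codepoints (not UTF-16 code units) in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String composed = "gar\u00E7on";            // single codepoint U+00E7
        String decomposed = decompose(composed);

        System.out.println(codePoints(composed));   // 6
        System.out.println(codePoints(decomposed)); // 7
        // The decomposed form is c followed by the combining cedilla.
        System.out.println(decomposed.equals("garc\u0327on")); // true
    }
}
```

Both strings render identically on the page; only the underlying codepoint sequence differs.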

Java could instead have standardized on the composed form of the character, but the accent-blind matching would then not work: contains("garçon", "c") would be false.
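The effect of the choice of form can be seen even with plain codepoint matching. The sketch below is only an analogy: it uses String.contains after normalization, not a collator, and the class ContainsByForm and its method are invented for the example.

```java
import java.text.Normalizer;

// Hypothetical helper class, invented for this illustration.
class ContainsByForm {

    // Normalize both operands to the same form, then do a plain
    // codepoint-level substring test. No collation is involved here.
    static boolean containsIn(String s, String sub, Normalizer.Form form) {
        return Normalizer.normalize(s, form)
                         .contains(Normalizer.normalize(sub, form));
    }

    public static void main(String[] args) {
        String word = "gar\u00E7on";

        // Composed form: the c-cedilla is one codepoint, so no bare "c" exists.
        System.out.println(containsIn(word, "c", Normalizer.Form.NFC)); // false

        // Decomposed form: the bare "c" is followed by a combining cedilla.
        System.out.println(containsIn(word, "c", Normalizer.Form.NFD)); // true
    }
}
```

A real accent-blind match would go through a collator that treats the cedilla as ignorable; the point here is simply that a bare c is only there to be found in the decomposed form.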

Now let's look at a case where a pair of characters represents a single collation unit. Here we turn back to Spanish, where in older publications ch collates after c and ll collates after l. We can set this up in Java by defining a RuleBasedCollator using a rule that defines c < ch < d and l < ll < m. (Modern Spanish practice follows the English collating rules, so I had to set up these rules myself.)
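A rule set along these lines might look like the following sketch. The rule string is my own reconstruction, not the one used in the book: it covers only the unaccented lowercase letters, and the class TraditionalSpanish and its methods are invented for the example.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Arrays;

// Hypothetical helper class, invented for this illustration.
class TraditionalSpanish {

    // Assumed rule set: lowercase letters only, with "ch" and "ll"
    // declared as single collation units after "c" and "l".
    static final String RULES =
        "< a < b < c < ch < d < e < f < g < h < i < j < k < l < ll < m "
      + "< n < o < p < q < r < s < t < u < v < w < x < y < z";

    static RuleBasedCollator collator() {
        try {
            return new RuleBasedCollator(RULES);
        } catch (ParseException e) {
            // The rule string is a constant, so a parse failure is a bug.
            throw new IllegalStateException(e);
        }
    }

    // Returns a sorted copy, leaving the caller's array untouched.
    static String[] sorted(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy, collator());
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
            sorted("luz", "chico", "llama", "dama", "cuna")));
        // [cuna, chico, dama, luz, llama]
    }
}
```

Under ordinary codepoint ordering, chico would sort before cuna and llama before luz; with the rules above, each two-letter pair is compared as one unit and so lands after every word that begins with the single letter.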
