XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th Edition

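[Stylesheet listing: the XML markup was lost in extraction. The surviving fragments, in order, are:]

    .*,\s*[A-Z]{2},\s*USA\s*$')">
    regex="(.*),\s*([A-Z]{{2}}),\s*USA\s*$">
    USA
    Error: string "
    does not match regex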

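The effect of these rules is that we end up with event records of the form:

[Element markup lost in extraction. The surviving text values, with their original indentation, are:]

      principal
   28 JUL 1929
         USA
         NY
         Long Island
         Southampton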

The names “Long Island” and “Southampton” are classified as levels 6 and 7 because we don't know enough about them to classify them more accurately: levels up to 5 have reserved meanings, whereas 6 and above are available for arbitrary purposes. The ordering of levels is significant: higher levels are intended to represent a finer granularity of place name, which is why we have reversed the order of the original components of the name.
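The splitting and reversal described above can be sketched roughly as follows. This is a minimal illustration, not the book's listing: the regular expression is the one that survives in the text, but the output element names (PlaceName, PlacePart), the Level and Type attributes, and the level numbers chosen for country and state are assumptions made for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- PLAC is assumed to be the input place element. PlaceName, PlacePart,
       Level and Type are assumed names for the output vocabulary. -->
  <xsl:template match="PLAC">
    <PlaceName>
      <!-- regex is an attribute value template, so the quantifier {2}
           has to be written with doubled braces. -->
      <xsl:analyze-string select="string(.)"
                          regex="(.*),\s*([A-Z]{{2}}),\s*USA\s*$">
        <xsl:matching-substring>
          <PlacePart Level="1" Type="country">USA</PlacePart>
          <PlacePart Level="2" Type="state">
            <xsl:value-of select="regex-group(2)"/>
          </PlacePart>
          <!-- Reverse the remaining components so that the coarsest comes
               first, then number them from level 6 upward. -->
          <xsl:for-each select="reverse(tokenize(regex-group(1), '\s*,\s*'))">
            <PlacePart Level="{position() + 5}">
              <xsl:value-of select="normalize-space(.)"/>
            </PlacePart>
          </xsl:for-each>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:message>Error: string "<xsl:value-of select="."/>" does not match regex</xsl:message>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </PlaceName>
  </xsl:template>

</xsl:stylesheet>
```

For an input of "Southampton, Long Island, NY, USA", the first captured group is "Southampton, Long Island", so after reversal "Long Island" is numbered 6 and "Southampton" is numbered 7, in line with the levels described above.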

Debugging the Stylesheet

This completes the presentation of the stylesheet used to convert the data from GEDCOM 5.5 to 6.0 format. I'd like to add some notes, however, from my experience of developing this stylesheet. The vast majority of my errors in coding this stylesheet, unless they were basic XSLT or XPath errors, were detected as a result of the on-the-fly validation of the result document against its schema. These errors included:

  • Leaving out required attributes
  • Misspelling element names (for example, ExternalId for ExternalID)
  • Generating elements in the wrong order
  • Placing an element at the wrong level of nesting
  • Generating an invalid value for an attribute
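For readers who have not seen it, on-the-fly validation of the result document is typically requested along the following lines. This is a hedged sketch that needs a schema-aware XSLT 2.0 processor. The schema file name gedcom6.xsd is a placeholder, and the example assumes the target vocabulary is in no namespace.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- gedcom6.xsd is a placeholder name for the schema of the target format.
       With no namespace attribute, a no-namespace schema is imported. -->
  <xsl:import-schema schema-location="gedcom6.xsd"/>

  <xsl:template match="/">
    <!-- validation="strict" asks the processor to validate the result
         document against the imported schema as it is constructed. -->
    <xsl:result-document validation="strict">
      <xsl:apply-templates/>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```

With this in place, errors of the kinds listed above are reported as validation failures rather than passing silently into the output file.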

In the case of Saxon, a few of these errors are detected at stylesheet compile time, but most are reported while executing the stylesheet, and in nearly all cases the error message identifies exactly where the stylesheet is wrong. For example, if the code in the initial template is changed to read:


   


then the transformation fails with the message:

Validation error on line 27 of ged55-to-6.xsl:

  XTTE1510: Required attribute @Target is missing

  (See http://www.w3.org/TR/xmlschema-1/#cvc-complex-type clause 4)

This process caught quite a few basic XSLT coding errors. For example, I originally wrote:


  

    

  


in which the curly braces around @REF have been omitted. This resulted in the error message:

Validation error on line 64 of ged55-to-6.xsl:

  The value ‘@REF’ is not a valid NCName

The error message arises because in the absence of curly braces, the system has tried to use @REF as the literal value of the Ref attribute, and this is not allowed because the attribute is defined in the schema to have type IDREF, which is a subtype of NCName. An NCName cannot contain an @ character.
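The difference between the two forms can be illustrated with a small sketch. The element name IndividualRef and the two mode names are hypothetical and used only for this example. Ref and @REF are the names used in the text.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- IndividualRef and the mode names are hypothetical. -->
  <xsl:template match="*[@REF]" mode="without-braces">
    <!-- No braces: the Ref attribute receives the literal string @REF,
         which is not a valid NCName and so fails validation as an IDREF. -->
    <IndividualRef Ref="@REF"/>
  </xsl:template>

  <xsl:template match="*[@REF]" mode="with-braces">
    <!-- With braces, the content is evaluated as an XPath expression,
         so Ref receives the value of the context node's REF attribute. -->
    <IndividualRef Ref="{@REF}"/>
  </xsl:template>

</xsl:stylesheet>
```

Without schema validation, the first form would simply produce a wrong attribute value in the output, which is much harder to notice.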

Similarly, errors in the picture of the format-date() function call were picked up because they resulted in a string that did not match the picture defined in the schema for the StandardDate type.
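As an illustration of what such a picture looks like, here is a sketch that formats a date in the day, abbreviated month, year shape seen in the earlier event record (28 JUL 1929). The parameter name, the Date element, and the named template are assumptions made for the example. The exact picture used in the book's stylesheet is not shown in this section.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- $date is a hypothetical parameter standing in for a parsed date. -->
  <xsl:param name="date" as="xs:date" select="xs:date('1929-07-28')"/>

  <!-- Date is a hypothetical output element name. -->
  <xsl:template name="main">
    <Date>
      <!-- [D01]      day of month as two digits
           [MN,*-3]   month name in upper case, at most three characters
           [Y0001]    year as four digits -->
      <xsl:value-of select="format-date($date, '[D01] [MN,*-3] [Y0001]')"/>
    </Date>
  </xsl:template>

</xsl:stylesheet>
```

A slip in the picture, such as writing [MNn,*-3] (which gives title case) where the schema expects upper case, produces a string that a pattern facet on the target type would reject.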

However, schema validation of the result tree will not pick up all errors. I had some trouble, for example, getting the regular expression for matching place names right, but the errors simply resulted in the output file containing an empty element, which is allowed by the schema.

Displaying the Family Tree Data

What we want to do now is to write a stylesheet that displays the data in a GEDCOM file in HTML format. We want the display to look something like the following screenshot (see Figure 19-1).
