[tei-council] Schematron embedded in RelaxNG

Martin Holmes mholmes at uvic.ca
Mon Sep 24 12:53:49 EDT 2012


Hi all,

During the FTF I was talking to Sebastian about the fact that our 
Schematron constraints are embedded directly into RelaxNG files; he 
believed that this meant that normal validation would cause them to be 
triggered, but I couldn't remember having been caught by a Schematron 
constraint when validating against a RelaxNG schema, even though I know 
I've violated them. So I've just started testing this.

The RelaxNG schema for tei_all:

<http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng>

does indeed contain our handful of existing Schematron rules; if you 
look at the element definition for <lg>, you'll see the constraint we 
recently added, which says that an <lg> must contain at least one child 
<l>, <lg> or <gap>.

However, when I create a test document violating this constraint and one 
of the others, and validate it against this schema in Oxygen 14, nothing 
happens. (The test document is below.)
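
For orientation, an embedded constraint of the kind described above for <lg> might look roughly like the sketch below. This is a simplified illustration, not a copy of what is in tei_all.rng: the pattern id, the name in the RelaxNG ref and the exact XPath test are assumptions. The point it shows is that the Schematron markup sits inside the RelaxNG definition as foreign-namespace annotation, which a RelaxNG validator without Schematron support is free to ignore.

```xml
<!-- Sketch only: simplified, not copied from tei_all.rng -->
<define name="lg" xmlns="http://relaxng.org/ns/structure/1.0"
        xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <element name="lg" ns="http://www.tei-c.org/ns/1.0">
    <!-- Foreign-namespace annotation: plain RelaxNG validation skips these elements -->
    <sch:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
    <sch:pattern id="lg-constraint-sketch">
      <sch:rule context="tei:lg">
        <!-- Hypothetical test expressing "at least one child l, lg or gap" -->
        <sch:assert test="tei:l or tei:lg or tei:gap">An lg must contain at least one child l, lg or gap.</sch:assert>
      </sch:rule>
    </sch:pattern>
    <!-- Hypothetical name standing in for the real content model -->
    <ref name="lg.content.sketch"/>
  </element>
</define>
```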

So I went digging to try to figure out whether it was realistic to 
expect Schematron validation as part of RelaxNG validation. According to 
the Jing home page, Jing only has "experimental" support for Schematron, 
and it's Schematron 1.5, not ISO Schematron. The documentation also says 
that "Jing's implementation is not based on the reference Schematron 1.5 
implementation. It is implemented partly in XSLT and partly in Java. 
This implementation requires that the Schematron elements be properly 
namespaced using the namespace URI http://www.ascc.net/xml/schematron." 
However, our schema uses the ISO Schematron namespace, 
"http://purl.oclc.org/dsdl/schematron". Our schemas often do define a 
prefix for the old namespace, but they use the new one instead -- as 
they should, I think, since we're using ISO Schematron.
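
The mismatch comes down to which namespace URI the embedded elements are bound to. A minimal sketch follows; the prefixes "s" and "sch" are chosen here for illustration and are not necessarily the ones used in tei_all.rng.

```xml
<!-- Fragment only: a real grammar would also need start and define elements -->
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:s="http://www.ascc.net/xml/schematron"
         xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <!-- s:* elements are in the Schematron 1.5 namespace, the one Jing's documentation names -->
  <!-- sch:* elements are in the ISO Schematron namespace, the one our constraints use -->
</grammar>
```

Declaring a prefix for the old namespace has no effect on validation unless elements in the schema actually use it.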

Looking at the other available RelaxNG validators, the only one that 
promises the possibility of Schematron support is the Sun MSV validator, 
which seems to have disappeared from the web with the Oracle take-over.

So we don't seem to have a working validation mechanism for Schematron 
embedded in RelaxNG schemas, unless I've missed some key configuration 
option in Oxygen or at the command line.

So the next thing I thought I'd do was to generate a Schematron schema 
from Roma based on tei_all. I went to the latest and greatest Roma, here:

<http://tei.oucs.ox.ac.uk/Roma/>

chose to reduce a schema from the maximum possible schema, and made no 
changes. I generated an ISO Schematron schema from that, and got a file 
with no constraints in it at all:

<?xml version="1.0" encoding="utf-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron"
         xmlns:oxdoc="http://www.oxygenxml.com/ns/doc/xsl"
         queryBinding="xslt2">
    <title>ISO Schematron rules</title>
 
<!--namespaces:--><!--keys:--><!--patterns:--><!--constraints:--></schema>

So that doesn't work, for some reason. Instead, I downloaded 
tei_all.odd, and ran this:

roma --isoschematron tei_all.odd .

to get an ISO Schematron file. Validating my test file against that 
produced the expected errors.
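
For comparison with the empty file above, a standalone ISO Schematron schema that actually carries a constraint would have roughly the following shape. This is a hand-written sketch of the <lg> rule, not the output of roma: the pattern id and the XPath test are assumptions.

```xml
<?xml version="1.0" encoding="utf-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <title>ISO Schematron rules</title>
  <!-- Bind the tei prefix so the XPath expressions below can match TEI elements -->
  <ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <pattern id="lg-constraint-sketch">
    <rule context="tei:lg">
      <!-- Hypothetical test: fails when an lg has no child l, lg or gap -->
      <assert test="tei:l or tei:lg or tei:gap">An lg must contain at least one child l, lg or gap.</assert>
    </rule>
  </pattern>
</schema>
```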

So unless I'm missing something, it's unlikely that most people are 
getting any benefit from our Schematron constraints, unless they're 
aware that they have to manually generate a Schematron schema and link 
it into their XML files. If this is the case, I think we should be 
documenting this process somewhere.
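
As a sketch of what such documentation might show, a document can point at both schemas with two xml-model processing instructions, one per schema language. The file name tei_all.isosch is an assumption about what the generated Schematron file is called and where it is stored; adjust the href to suit.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng"
            type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="tei_all.isosch"
            type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- document content as usual -->
</TEI>
```

With both instructions present, an editor that honours xml-model should run the RelaxNG and the Schematron validation separately.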


This is my test doc, which violates two of our Schematron constraints:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model 
href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng" 
type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
     <teiHeader>
         <fileDesc>
             <titleStmt>
                 <title>Testing Schematron constraints in RelaxNG</title>
             </titleStmt>
             <publicationStmt><p>Public test document</p></publicationStmt>
             <sourceDesc><p>Born digital</p></sourceDesc>
         </fileDesc>
     </teiHeader>
     <text>
         <body>
             <div>
                 <lg>
                     <stage>Enter James stage left, looking shifty.</stage>
                     <figure><figDesc>Picture of some lines of poetry.</figDesc></figure>
                 </lg>
                 <p><ref target="http://www.uvic.ca/" cRef="UVic">UVic</ref></p>
             </div>
         </body>
     </text>
</TEI>

Cheers,
Martin
-- 
Martin Holmes
mholmes at uvic.ca
UVic Humanities Computing and Media Centre

