[tei-council] Schematron embedded in RelaxNG

James Cummings James.Cummings at it.ox.ac.uk
Mon Sep 24 13:56:23 EDT 2012


Martin,

When I start a new XML document, then manually type in a root 
node, then associate a schema with it (and check the 'use 
schematron' checkbox), then correct my TEI until I have a valid 
document... the top looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-model 
href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng" 
type="application/xml" 
schematypens="http://purl.oclc.org/dsdl/schematron"?>
<?xml-model 
href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng" 
type="application/xml" 
schematypens="http://relaxng.org/ns/structure/1.0"?>

i.e. has both a schematron xml-model and a relaxng one.

If I do this I get an error if I don't have an <l> inside my <lg> 
(or a nested <lg>). However, this is just a vague schema-based 
error from Jing, not a Schematron message, because it says:
"E [Jing] element "lg" incomplete; expected element "addSpan", 
"alt", "altGrp", "anchor", "argument", "byline", "camera", 
"caption", "cb", "certainty", "damageSpan", "dateline", 
"delSpan", "desc", "docAuthor", "docDate", "epigraph", "fLib", 
"figure", "fs", "fvLib", "fw", "gap", "gb", "head", "incident", 
"index", "interp", "interpGrp", "join", "joinGrp", "kinesic", 
"l", "label", "lb", "lg", "link", "linkGrp", "listTranspose", 
"meeting", "metamark", "milestone", "move", "notatedMusic", 
"note", "opener", "pause", "pb", "precision", "respons", 
"salute", "shift", "sound", "space", "span", "spanGrp", "stage", 
"substJoin", "tech", "timeline", "view", "vocal", "witDetail" or 
"writing"

Although the only thing that makes the error go away is an lg or 
an l, it isn't a very useful error message.
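
For comparison, a Schematron assertion for the same condition could 
report a much more targeted message. What follows is only a minimal 
sketch, assuming the constraint is as Martin describes it below (an 
<lg> must contain at least one child <l>, <lg> or <gap>). It is not 
the text of the actual TEI rule, and the message wording is invented 
for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative standalone ISO Schematron schema; not the real TEI constraint. -->
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <title>Sketch of an lg content constraint</title>
  <!-- Bind the tei: prefix used in the XPath expressions below. -->
  <ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
  <pattern>
    <!-- The rule fires once for every lg element in the TEI namespace. -->
    <rule context="tei:lg">
      <!-- The assertion fails when none of the three child elements is present. -->
      <assert test="tei:l or tei:lg or tei:gap">
        An lg element must contain at least one child l, lg or gap.
      </assert>
    </rule>
  </pattern>
</schema>
```

A Schematron processor validating against a schema like this would 
report the single assertion message for an offending <lg> rather 
than the full list of elements the content model permits.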

Is there another schematron constraint I can test instead?


-James

On 24/09/12 17:53, Martin Holmes wrote:
> Hi all,
>
> During the FTF I was talking to Sebastian about the fact that our
> Schematron constraints are embedded directly into RelaxNG files; he
> believed that this meant that normal validation would cause them to be
> triggered, but I couldn't remember having been caught by a Schematron
> constraint when validating against a RelaxNG schema, even though I know
> I've violated them. So I've just started testing this.
>
> The RelaxNG schema for tei_all:
>
> <http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng>
>
> does indeed contain our handful of existing Schematron rules; if you
> look at the element definition for <lg>, you'll see the constraint we
> recently added, which says that an <lg> must contain at least one child
> <l>, <lg> or <gap>.
>
> However, when I create a test document violating this constraint and one
> of the others, and validate it against this schema in Oxygen 14, nothing
> happens. (The test document is below.)
>
> So I went digging to try to figure out whether it was realistic to
> expect Schematron validation as part of RelaxNG validation. According to
> the Jing home page, Jing only has "experimental" support for Schematron,
> and it's Schematron 1.5, not ISO Schematron. The documentation also says
> that "Jing's implementation is not based on the reference Schematron 1.5
> implementation. It is implemented partly in XSLT and partly in Java.
> This implementation requires that the Schematron elements be properly
> namespaced using the namespace URI http://www.ascc.net/xml/schematron."
> However, our schema uses the ISO Schematron namespace,
> "http://purl.oclc.org/dsdl/schematron". They often do define a prefix
> for the old namespace, but they use the new one instead -- as they
> should, I think, since we're using ISO Schematron.
>
> Looking at the other available RelaxNG validators, the only one that
> promises the possibility of Schematron support is the Sun MSV validator,
> which seems to have disappeared from the web with the Oracle take-over.
>
> So we don't seem to have a working validation mechanism for Schematron
> embedded in RelaxNG schemas, unless I've missed some key configuration
> option in Oxygen or at the command line.
>
> So the next thing I thought I'd do was to generate a Schematron schema
> from Roma based on tei_all. I went to the latest and greatest Roma, here:
>
> <http://tei.oucs.ox.ac.uk/Roma/>
>
> chose to reduce a schema from the maximum possible schema, and made no
> changes. I generated an ISO Schematron schema from that, and got a file
> with no constraints in it at all:
>
> <?xml version="1.0" encoding="utf-8"?>
> <schema xmlns="http://purl.oclc.org/dsdl/schematron"
>           xmlns:oxdoc="http://www.oxygenxml.com/ns/doc/xsl"
>           queryBinding="xslt2">
>      <title>ISO Schematron rules</title>
>
> <!--namespaces:--><!--keys:--><!--patterns:--><!--constraints:--></schema>
>
> So that doesn't work, for some reason. Instead, I downloaded
> tei_all.odd, and ran this:
>
> roma --isoschematron tei_all.odd .
>
> to get an ISO Schematron file. Validating my test file against that
> produced the expected errors.
>
> So unless I'm missing something, it's unlikely that most people are
> getting any benefit from our Schematron constraints, unless they're
> aware that they have to manually generate a schematron schema and link
> it into their XML files. If this is the case, I think we should be
> documenting this process somewhere.
>
>
> This is my test doc, which violates two of our Schematron constraints:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <?xml-model
> href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng"
> type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
> <TEI xmlns="http://www.tei-c.org/ns/1.0">
>       <teiHeader>
>           <fileDesc>
>               <titleStmt>
>                   <title>Testing Schematron constraints in RelaxNG</title>
>               </titleStmt>
>               <publicationStmt><p>Public test document</p></publicationStmt>
>               <sourceDesc><p>Born digital</p></sourceDesc>
>           </fileDesc>
>       </teiHeader>
>       <text>
>           <body>
>               <div>
>                   <lg>
>                       <stage>Enter James stage left, looking shifty.</stage>
>                       <figure><figDesc>Picture of some lines of
> poetry.</figDesc></figure>
>                   </lg>
>                   <p><ref target="http://www.uvic.ca/"
> cRef="UVic">UVic</ref></p>
>               </div>
>           </body>
>       </text>
> </TEI>
>
> Cheers,
> Martin
>


-- 
Dr James Cummings, researchsupport at it.ox.ac.uk
Research Support, IT Services, University of Oxford

