[tei-council] Schematron embedded in RelaxNG
Martin Holmes
mholmes at uvic.ca
Mon Sep 24 12:53:49 EDT 2012
Hi all,
During the FTF I was talking to Sebastian about the fact that our
Schematron constraints are embedded directly into RelaxNG files; he
believed that this meant that normal validation would cause them to be
triggered, but I couldn't remember having been caught by a Schematron
constraint when validating against a RelaxNG schema, even though I know
I've violated them. So I've just started testing this.
The RelaxNG schema for tei_all:
<http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng>
does indeed contain our handful of existing Schematron rules; if you
look at the element definition for <lg>, you'll see the constraint we
recently added, which says that an <lg> must contain at least one child
<l>, <lg> or <gap>.
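For anyone who hasn't looked inside the file, a constraint of this kind sits in the RelaxNG element definition roughly as sketched below. This is an illustration, not a copy of what is in tei_all.rng: the pattern id, the "sch" prefix, the XPath in the test, the message wording and the placeholder content model (the "lgContent" reference) are all my assumptions.

```xml
<define name="lg"
        xmlns="http://relaxng.org/ns/structure/1.0"
        xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <element name="lg">
    <!-- Hypothetical placeholder: the real content model is much longer. -->
    <ref name="lgContent"/>
    <!-- ISO Schematron rule carried as a foreign-namespace element.
         A RelaxNG validator treats it as an annotation unless it has
         explicit support for embedded Schematron. -->
    <sch:pattern id="lg-child-constraint-example">
      <!-- Assumes the "tei" prefix is bound to http://www.tei-c.org/ns/1.0
           by an sch:ns declaration elsewhere in the schema. -->
      <sch:rule context="tei:lg">
        <sch:assert test="tei:l or tei:lg or tei:gap">
          An lg element must contain at least one child l, lg or gap.
        </sch:assert>
      </sch:rule>
    </sch:pattern>
  </element>
</define>
```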
However, when I create a test document violating this constraint and one
of the others, and validate it against this schema in Oxygen 14, nothing
happens. (The test document is below.)
So I went digging to try to figure out whether it was realistic to
expect Schematron validation as part of RelaxNG validation. According to
the Jing home page, Jing only has "experimental" support for Schematron,
and it's Schematron 1.5, not ISO Schematron. The documentation also says
that "Jing's implementation is not based on the reference Schematron 1.5
implementation. It is implemented partly in XSLT and partly in Java.
This implementation requires that the Schematron elements be properly
namespaced using the namespace URI http://www.ascc.net/xml/schematron."
However, our schema uses the ISO Schematron namespace,
"http://purl.oclc.org/dsdl/schematron". Our schemas often do define a
prefix for the old namespace, but they use the new one instead -- as they
should, I think, since we're using ISO Schematron.
Looking at the other available RelaxNG validators, the only one that
promises the possibility of Schematron support is the Sun MSV validator,
which seems to have disappeared from the web with the Oracle take-over.
So we don't seem to have a working validation mechanism for Schematron
embedded in RelaxNG schemas, unless I've missed some key configuration
option in Oxygen or at the command line.
So the next thing I thought I'd do was to generate a Schematron schema
from Roma based on tei_all. I went to the latest and greatest Roma, here:
<http://tei.oucs.ox.ac.uk/Roma/>
chose to reduce a schema from the maximum possible schema, and made no
changes. I generated an ISO Schematron schema from that, and got a file
with no constraints in it at all:
<?xml version="1.0" encoding="utf-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron"
xmlns:oxdoc="http://www.oxygenxml.com/ns/doc/xsl"
queryBinding="xslt2">
<title>ISO Schematron rules</title>
<!--namespaces:--><!--keys:--><!--patterns:--><!--constraints:--></schema>
So that doesn't work, for some reason. Instead, I downloaded
tei_all.odd, and ran this:
roma --isoschematron tei_all.odd .
to get an ISO Schematron file. Validating my test file against that
produced the expected errors.
So unless I'm missing something, it's unlikely that most people are
getting any benefit from our Schematron constraints, unless they're
aware that they have to manually generate a Schematron schema and link
it into their XML files. If this is the case, I think we should be
documenting this process somewhere.
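As a sketch of what such documentation might show: a document can point at both schemas with two xml-model processing instructions, one per schema language. The RelaxNG URL is the one used in this message; the Schematron filename "tei_all.isosch" is an assumption, standing in for whatever file the roma command above actually produces.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- RelaxNG schema: structural validation. -->
<?xml-model
  href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng"
  type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<!-- ISO Schematron schema generated separately from the ODD.
     The filename is hypothetical; use the file that roma writes out. -->
<?xml-model
  href="tei_all.isosch"
  type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- teiHeader and text as usual -->
</TEI>
```

A processor that honours xml-model associations can then apply both schemas, with the second instruction's schematypens value identifying it as ISO Schematron.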
This is my test doc, which violates two of our Schematron constraints:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model
  href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_all.rng"
  type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Testing Schematron constraints in RelaxNG</title>
      </titleStmt>
      <publicationStmt><p>Public test document</p></publicationStmt>
      <sourceDesc><p>Born digital</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div>
        <lg>
          <stage>Enter James stage left, looking shifty.</stage>
          <figure><figDesc>Picture of some lines of
            poetry.</figDesc></figure>
        </lg>
        <p><ref target="http://www.uvic.ca/"
          cRef="UVic">UVic</ref></p>
      </div>
    </body>
  </text>
</TEI>
Cheers,
Martin
--
Martin Holmes
mholmes at uvic.ca
UVic Humanities Computing and Media Centre