[tei-council] Schematron rules
John A. Walsh
jawalsh at indiana.edu
Tue Sep 9 06:49:36 EDT 2008
Sebastian,
I think you make a good argument for option 2 as the most elegant solution.
I'm not too bothered that <content> and <constraint> do "substantially
the same thing." If we had one schema language that could "do it all"
we wouldn't need both, but we don't have such a language, so <content>
describes the content of the element, and <constraint> provides
*additional* constraints for the content of the element. It seems it
might make sense, conceptually, for <constraint> to be a child of
<content>.
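
To make this concrete, here is a rough sketch of how option 2 might look
for a single element. It is an illustration only: the choice of <div>,
the particular rule, the "tei" prefix, and the binding of s: to the ISO
Schematron namespace are all my assumptions, not anything we have agreed.

```xml
<elementSpec ident="div" mode="change"
             xmlns="http://www.tei-c.org/ns/1.0"
             xmlns:rng="http://relaxng.org/ns/structure/1.0"
             xmlns:s="http://purl.oclc.org/dsdl/schematron">
  <!-- <content> describes the content of the element, as it does now -->
  <content>
    <rng:optional>
      <rng:ref name="head"/>
    </rng:optional>
    <rng:oneOrMore>
      <rng:ref name="p"/>
    </rng:oneOrMore>
  </content>
  <!-- <constraint> holds the additional rule that RELAX NG cannot express;
       assumes the tei prefix is declared elsewhere with s:ns -->
  <constraint>
    <s:pattern>
      <s:rule context="tei:div">
        <s:assert test="not(@type = 'chapter') or tei:head">
          A div of type "chapter" must have a head.
        </s:assert>
      </s:rule>
    </s:pattern>
  </constraint>
</elementSpec>
```

If <constraint> were instead a child of <content>, the same <s:pattern>
would simply sit after the RELAX NG part inside <content>.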
John
--
| John A. Walsh
| Assistant Professor, School of Library and Information Science
| Indiana University, 1320 East Tenth Street, Bloomington, IN 47405
| www: <http://www.slis.indiana.edu/faculty/jawalsh/>
| Voice:812-856-0707 Fax:812-856-2062 <mailto:jawalsh at indiana.edu>
On Sep 9, 2008, at 5:54 AM, Sebastian Rahtz wrote:
> Schematron rules in ODD
> -----------------------
>
> Currently, we have no formal view on how constraints
> must be expressed in ODD. We have a general element
> <content> whose content model is "text" by default,
> but which should be changed to "any XML", and that is
> as far as we go. For the purposes of TEI P5, we
> redefine <content> to be "<valList> or any RELAX NG element,
> followed by any Schematron element", allowing us to
> write Schematron rules to extend the RELAX NG rules. This
> means that the TEI uses a conformant extension of itself.
>
> Problems
> --------
> There are two problems:
>
> 1. Schematron rules can only be expressed in the context of an
> element, which is rather counter to the spirit of it. Where do
> we say "all ID attributes must be at least 8 characters long"?
>
> 2. A common requirement in a project ODD would be to
> *add* Schematron rules, but this at present means duplicating
> the whole of <content> in "replace" mode, since the components
> are not identifiable.
>
> Solutions
> ---------
>
> There are three possibilities for fixing this:
>
> 1. allow Schematron rules to occur at the end of any
> <classSpec>, <elementSpec> or <macroSpec> (and even
> <schemaSpec>), just sitting there in their own namespace
>
> 2. allow Schematron rules in
> <classSpec>, <elementSpec> or <macroSpec> (and even
> <schemaSpec>) inside a new element <constraint>,
> alongside <content> in <elementSpec>, and added to
> the other *Spec.
>
> 3. separate the Schematron entirely from the *Spec
> and say that the whole thing must be maintained
> separately and cannot be tied to a particular
> element. It could be dropped in under <schemaSpec>.
>
> The first choice is inelegant, though conformant
> (since the added elements would be in their own namespace),
> and relatively easy to implement (a small change to the
> current setup). It would mean no
> change to the ODD language. The ODD processing tools
> we have would be adapted in an ad hoc way.
>
> The second choice would mean a change to ODD, as it
> would add a new element with no other purpose
> at present, and no default content model other than
> "anyXML". It would be fairly easy to implement and
> support in e.g. Roma. The main argument against this
> is that it is an ad hoc extension to ODD, with two
> elements <content> and <constraint> doing substantially
> the same thing.
>
> The third choice is simple to implement, but allows for no extension
> of ODD building on ODD, or granularity in the rules.
>
>
> Conclusion
> ----------
>
> The disadvantages of the first and third proposals
> seem to me to outweigh the issues of the second.
> I therefore propose that we add a <constraint> element, with
> a content model of "any XML", in the following places
>
> 1. as a sibling of <content> in elementSpec
> 2. as a sibling of <datatype> in attDef
> 3. as a child of <classSpec>
> 4. as a child of <schemaSpec>
>
> For TEI itself, we would constrain the "any XML"
> as follows:
>
> - if the parent is <elementSpec> or <schemaSpec> allow <s:pattern>
> - if the parent is <schemaSpec> allow <s:ns> as well
> - if the parent is <classSpec> allow <s:assert>,
> and generate <s:pattern> for each member of the class.
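>
> As an illustration only, the ID-length rule mentioned under Problems
> might be written at the <schemaSpec> level roughly as follows. The
> ident value, the "tei" prefix, and the binding of s: to the ISO
> Schematron namespace are assumptions made for this sketch.
>
> ```xml
> <schemaSpec ident="myProject" start="TEI"
>             xmlns="http://www.tei-c.org/ns/1.0"
>             xmlns:s="http://purl.oclc.org/dsdl/schematron">
>   <constraint>
>     <!-- bind the tei prefix for use in rule contexts and tests -->
>     <s:ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
>     <s:pattern>
>       <!-- matches any element carrying xml:id, whatever its name -->
>       <s:rule context="*[@xml:id]">
>         <s:assert test="string-length(@xml:id) &gt;= 8">
>           An xml:id value must be at least 8 characters long.
>         </s:assert>
>       </s:rule>
>     </s:pattern>
>   </constraint>
> </schemaSpec>
> ```
>
> Because the rule lives under <schemaSpec> rather than an <elementSpec>,
> it is not tied to any one element.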
>
> --
> Sebastian Rahtz
> Information Manager, Oxford University Computing Services
> 13 Banbury Road, Oxford OX2 6NN. Phone +44 1865 283431
>
> _______________________________________________
> tei-council mailing list
> tei-council at lists.village.Virginia.EDU
> http://lists.village.Virginia.EDU/mailman/listinfo/tei-council