[tei-council] Chapter 16 - Linking, Segmentation, and Alignment

Lou Burnard lou.burnard at oucs.ox.ac.uk
Thu Jan 31 12:32:11 EST 2008


Brett Zamir wrote:
> *16.1.1 Pointers and Links*
>
> For the lines, "...elements pointed to or linked by this *simple
> method* must be identifiable using the global <att>xml:id</att>
> attribute. This implies that they *must be present in the same
> document*, and that they must bear unique <att>xml:id</att> values."
> While I see that the chapter later covers external references (as the 
> next sentence also indicates), I don't see how any "simple method" had 
> been introduced or implied by the preceding. If "simple method" was 
> meant to refer to the type of pointers which only consist of the hash 
> mark, I think that ought to be made specific, or otherwise those 
> familiar with the fact that pointers can point to external documents 
> might be led to think that <link> doesn't allow external references.
Yes, this para has been inadequately updated. I've revised it somewhat.
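
For illustration only, the distinction at issue might be sketched as 
follows. The ids and the example.org URL are made up, and <ptr> is used 
here simply because it is the shortest way to show a pointer value:

```xml
<!-- Hypothetical ids and URL, for illustration only -->
<p xml:id="p1">First paragraph.</p>
<p xml:id="p2">Second paragraph.</p>

<!-- Bare-fragment form: the target must be in the same document
     and must carry a unique xml:id -->
<ptr target="#p1"/>

<!-- Full URI reference: the target may be in another document -->
<ptr target="http://www.example.org/other.xml#p7"/>
```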


>
> *16.2.4.3 left(pointer) and right(pointer)*
> For this line, "Because most pointer schemes return nodes or ranges 
> rather than points, the following description lists the behavior of 
> left() and right() for all three types of possible location that might 
> result from interpreting its argument.", I am confused, especially 
> since the list that follows has four types including a point.
>
I'm confused too. I don't see what the "Because" is doing there, so I've 
removed it. As you can probably tell this chapter needs quite some 
stylistic improvement...



> *16.2.4.4 range(pointer1, pointer2)*
>
> Should the list here also include the case of a pointer resolving to a 
> node set?
Probably, but I am not at all sure what it would mean!

>
> *16.2.4.5 string-range(pointer, offset [, length])*
> I made a few changes here. If I am wrong here, I believe the use of 
> string-range in 16.4.2 Alignment of Parallel Texts needs to be changed 
> (more than the slight amount I changed it).

Your changes look OK to me.
>
> *16.2.5 Canonical References*
> In the perhaps impossible event that someone would need to find the 
> text, %24 , there would be no way to do it?
>
You mean because there's no escape mechanism? I assume you'd do it by 
replacing the % by whatever its decimal value is. But I really cannot 
take this problem very seriously...

> *16.3 Blocks, Segments, and Anchors*
> For the line, "it may be more useful to use the <gi>s</gi> element for 
> this purpose, since this means that the <gi>seg</gi> element can then 
> be used to mark both features within s-units and segments composed of 
> s-units", is this really true? Couldn't <seg> be used for encoding a 
> level higher than, lower than, and at the sentence level? Perhaps this 
> should instead include an example of how <s> would free one up to use 
> the @type attribute in a more detailed way?
The point is that <s> is available for end-to-end segmentation only, and 
once you've done that, you can use <seg> to group or fragment your <s> 
units more or less ad lib. You could do it all with <seg>, of course, 
but <s> is specially provided for this purpose.
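
A rough sketch of that division of labour, with invented text, ids and 
type values (none of this is taken from the chapter):

```xml
<!-- Invented text, ids and type values, for illustration only -->
<p>
  <!-- <seg> grouping two consecutive s-units -->
  <seg type="group" xml:id="g1">
    <s xml:id="s1">The cat, <seg type="phrase">tired of waiting</seg>,
      went out.</s>
    <s xml:id="s2">It did not come back.</s>
  </seg>
  <!-- every remaining stretch of text is still inside some <s> -->
  <s xml:id="s3">Nobody noticed.</s>
</p>
```

Here <s> carries the end-to-end segmentation, and <seg> is free to mark 
both a unit larger than a sentence (g1) and one smaller than a sentence 
(the phrase inside s1).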


>
> Since the examples list <seg> being used for a level higher than <s>, 
> why does the definition at 
> http://tei.oucs.ox.ac.uk/P5/Guidelines-web/en/html/ref-seg.html state 
> it contains "any arbitrary *phrase-level* unit of text"?
Hmm, yes, this is a mistake. Its content model permits inter-level 
elements as well. Corrected.

>
> For the line, "The <gi>seg</gi> element has the same content as a 
> paragraph in prose: it can therefore be used to group together 
> consecutive sequences of <ident type="class">model.inter</ident> class 
> elements, such as lists, quotations, notes, stage directions, etc.", 
> if <seg> can include inter elements, doesn't that make <seg> a "chunk" 
> element according to 
> http://tei.oucs.ox.ac.uk/P5/Guidelines-web/en/html/ST.html which means 
> it can appear within divisions and the like?
>
No. It's a member of model.segLike, which is a subclass of model.phrase. 
The fact that it can contain inter level elements doesn't affect where 
it can itself occur.


> *16.4.2 Alignment of Parallel Texts*
> Sorry, but I don't follow this line, "Note that use of the <gi>ab</gi> 
> element allows us to mark up the orthographic sentences in both 
> languages independently of the alignment". If <s> can work at being a 
> sentence for either language, why can't <p>?
<p> has additional semantics -- it marks a paragraph -- so it wouldn't be 
appropriate here. I think the point is that the alignment can be done at 
the <ab> level when, as here, the sentences are not in the same order, 
but I agree it's not clear.
>
> *16.6 Identical Elements and Virtual Copies*
> Might reference be made here of how @copyOf compares to XInclude, as 
> per the information in section 16.9?
Possibly, but I have to admit that I am not sure what that comparison 
would reveal...

>
> *16.8 Alternation*
>
> 1) Might it be mentioned here that @exclude does not need (if it 
> indeed doesn't, as would seem logical) to be on both elements, as all 
> of the examples seem to have it?
>
What do you mean by "both" elements? It is just coincidence that there 
are two elements in each example here -- the exclude attribute points to 
(at least one) other element which its parent excludes.
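
To show that there is nothing special about the number two, here is a 
sketch with three alternatives. The ids and the third reading are 
invented for the purpose; each exclude value lists more than one pointer, 
separated by white space:

```xml
<!-- Invented ids and readings; each reading excludes the other two -->
<u xml:id="u.a" exclude="#u.b #u.c">we fun</u>
<u xml:id="u.b" exclude="#u.a #u.c">we had fun</u>
<u xml:id="u.c" exclude="#u.a #u.b">we have fun</u>
```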

> 2) For the examples connected by "This is interpreted to mean that 
> either the first or the third <gi>u</gi> element tag appears, and is 
> thus equivalent to just the alternation of those two tags", I'm 
> confused as to why one wouldn't wish to just have the simpler example 
> in the first place. My apologies if this is a dumb question...

Dumb questions are good ("why has the emperor no clothes?" eg) but in 
this particular case, we are trying to show the full generality of the 
mechanism, which may sometimes involve going all round the houses. As here.

>
> 3) For the example, <seg exclude="#lee2" xml:id="we2" type="word">We</seg>
> ...are both @exclude and <alt/> used in the example since there is no 
> @mode="excl" set on the first two <alt/>'s, whereas if they had them, 
> there would be no need for the @exclude attributes? Why are the first 
> two <alt/>'s present at all?
>
The first two <alt>s are there to recap the knowledge already discussed 
-- that the word is either Lee or We (and yes, you could do this with 
@mode=excl, but that would be inconsistent with the presentation earlier, 
I think). The other two <alt>s are there to show the further 
consequences of one or other of the first two <alt> pair members being 
selected. But this particular example has always made my head hurt....
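
For comparison, the @mode=excl variant mentioned above might look 
something like this. The ids follow the example under discussion, but 
the weights are made up, and the attribute is written as "targets" only 
because that is how it is referred to in this thread:

```xml
<!-- Sketch only: the weights are invented for illustration -->
<seg xml:id="we2" type="word">We</seg>
<seg xml:id="lee2" type="word">Lee</seg>

<!-- One <alt> stating that exactly one of the two readings occurs,
     in place of an exclude attribute on each <seg> -->
<alt targets="#we2 #lee2" mode="excl" weights="0.5 0.5"/>
```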



> Given that it was several pages back, I think the line from the 
> reference for <alt/> at 
> http://tei.oucs.ox.ac.uk/P5/Guidelines-web/en/html/ref-alt.html , "If 
> mode is incl each weight states the probability that the corresponding 
> alternative occurs given that at least one of the other alternatives 
> occurs." might bear repeating here for the sake of this and the next 
> example, especially since this line was only present within the 
> definition and not the main flow of the tutorial.
Good idea. Have put it in. My head still hurts tho.

>
> I've been really trying to grapple with this and the following 
> example, but I haven't fully grasped it. If, for example, the weight 
> is .5 for the <alt/> with targets #we2 and #fun2, doesn't this imply 
> that the probability of fun being present when "we" occurs is 50%
> and not 40% as the text states? (though I understand why it states 40% 
> also)...
See above...

>
> *16.9.2 Overview of XInclude*
> 1) I guess the line in note 66, "The version on which this text is 
> based on is the W3C Recommendation dated 20 December 2004" is still 
> current?
>
Yes. I've removed one of the "on"s though.


> 2) I deleted the sentence "Fallback content or resources can be 
> specified in case of a failure to fetch the requested resource." 
> because it seems redundant with the discussion a paragraph later of 
> the <xi:fallback> element.

Fine by me.
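
For anyone following along, the fallback mechanism in question looks 
roughly like this. The href value and the fallback text are invented; 
the namespace is the one defined by the XInclude Recommendation:

```xml
<!-- Hypothetical file name and fallback text -->
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
            href="chapter2.xml">
  <!-- Used only if chapter2.xml cannot be fetched -->
  <xi:fallback>
    <p>Chapter 2 could not be retrieved.</p>
  </xi:fallback>
</xi:include>
```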

>
> 3) I changed the reference to IURI to IRI, but left the comment asking 
> whether it should be explained since IRI is only mentioned once and is 
> not mentioned elsewhere in the docs.
>
I've corrected it back to URI -- it may be that we need to talk about 
IRIs as distinct from URIs somewhere, but in this context it is simply 
confusing.

>
> *16.9.5 Including Text or XML Fragments*
>
> Not sure if you wanted to introduce what the components within range() 
> mean when having a pattern like in the docs:  
> range(/1/2/1.0,/1/2/11.1) .  Also, range-to() wasn't introduced earlier.
>

All of this section needs rewriting for clarity and comprehensiveness, 
in my view. But not by tomorrow, I'm afraid.



