[tei-council] word-dividing

Dot Porter dot.porter at gmail.com
Wed Jul 1 07:53:55 EDT 2009


Would/could this not apply as well to <pb> and <cb>?

Dot

On Wed, Jul 1, 2009 at 12:51 PM, Lou Burnard<lou.burnard at oucs.ox.ac.uk> wrote:
> After much head scratching here in Oxford, we've decided on "nobreak".
>
> I added a couple more examples and a bit more discussion, taking
> examples from some real projects too. Affected are the definition for
> <lb> and the discussion of milestones in CO.
>
> Daniel Paul O'Donnell wrote:
>> I think "word-dividing" in this case means "splitting individual words
>> atwain" rather than "demarcating their boundaries" ;)
>>
>> In my edition of Cædmon's Hymn I needed to encode space and lb
>> explicitly in a similar way, i.e. indicating whether each fell within
>> a word or between words: the stylesheets (such as they were in those
>> days) handled them differently depending on the value of @type (which
>> I'd made universal). White space wouldn't have done it for me, because
>> I was reformatting the data with and without the word-internal spaces
>> and line breaks depending on the view the user selected.
>>
>> -dan
>>
>> Lou Burnard wrote:
>>> Gabriel BODARD wrote:
>>>
>>>> Lou Burnard wrote:
>>>>
>>>
>>>
>>>>>> (9) lb: should we add an example of the usage of
>>>>>> lb/type=word-dividing, which currently sits a little uncomfortably
>>>>>> in the note? I suggest "Cae<lb type="worddiv"/>sari".
>>>>>>
>>>>> Don't know what note you're referring to. Don't see the point of
>>>>> the @type attribute. Haven't done anything.
>>>>>
>>>> This was discussed some months ago, and is the reason @type was
>>>> allowed on <lb> in the first place. There is currently a note at the
>>>> bottom of LB that says: "The type attribute may be used to
>>>> characterize the linebreak in any respect, for example as
>>>> word-breaking or not." We have literally thousands of examples of
>>>> this in EpiDoc files, where words are not always tagged explicitly
>>>> and it's the only way we can be sure to tokenize correctly. I just
>>>> thought an example would help to clarify the use-case.
>>>>
>>>> (If people feel strongly that [e.g.] "wordDividing" would be a
>>>> better recommended value than "worddiv", I'm happy to make that part
>>>> of our P5 upgrade script.)
>>>>
>>>>
>>>
>>> I don't mind adding examples, but this one confuses me. Isn't the
>>> point that the <lb/> in your example does NOT divide the word? So
>>> both "wordDividing" and "worddiv" seem exactly the opposite of what
>>> you want here. How about "nowordbreak" or "nwb"?
>>>
>>> I know I lost this argument last time, but I still think in practice
>>> I'd deal with this by putting in whitespace where the <lb> coincided
>>> with a word boundary and leaving it out where it didn't!
>>>
>>>> Best,
>>>>
>>>> G
>>>>
>>>>
>>>
>>> _______________________________________________
>>> tei-council mailing list
>>> tei-council at lists.village.Virginia.EDU
>>> http://lists.village.Virginia.EDU/mailman/listinfo/tei-council
>>>
>>
>



-- 
*~*~*~*~*~*~*~*~*~*~*
Dot Porter (MA, MSLS)          Metadata Manager
Digital Humanities Observatory (RIA), Regus House, 28-32 Upper
Pembroke Street, Dublin 2, Ireland
-- A Project of the Royal Irish Academy --
Phone: +353 1 234 2444        Fax: +353 1 234 2400
http://dho.ie          Email: dot.porter at gmail.com
*~*~*~*~*~*~*~*~*~*~*
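
For anyone picking this thread up later, here is a minimal sketch of the tokenizing use-case described in the quoted discussion: an <lb> counts as a word boundary unless it carries type="nobreak". It is an illustration only, not how EpiDoc or any project stylesheet actually does it. The names tokenize and nobreak_value are hypothetical, the markup is assumed to have no TEI namespace, the sample line is invented apart from the "Cae ... sari" split quoted above, and only whitespace after a "nobreak" <lb> is dropped (whitespace before it is assumed absent).

```python
import xml.etree.ElementTree as ET


def tokenize(xml_text, nobreak_value="nobreak"):
    """Split transcribed text into words, honouring <lb type="nobreak"/>.

    Hypothetical helper: a plain <lb/> is treated as a word boundary,
    while an <lb> whose @type equals nobreak_value joins the text on
    either side of it.
    """
    root = ET.fromstring(xml_text)
    pieces = []

    def walk(elem):
        is_lb = elem.tag == "lb"  # assumes un-namespaced elements
        joins = is_lb and elem.get("type") == nobreak_value
        if is_lb and not joins:
            # An ordinary line break separates words.
            pieces.append(" ")
        if elem.text:
            pieces.append(elem.text)
        for child in elem:
            walk(child)
        if elem.tail:
            # After a "nobreak" <lb>, drop the source newline/indentation
            # so the two halves of the word are rejoined.
            pieces.append(elem.tail.lstrip() if joins else elem.tail)

    walk(root)
    return "".join(pieces).split()


# Invented sample line; the first <lb> falls inside a word.
sample = '<ab>Cae<lb type="nobreak"/>\nsari<lb/>Augusto</ab>'
print(tokenize(sample))  # ['Caesari', 'Augusto']
```

The same test on @type could in principle be applied to other milestone elements, but that is a separate decision from anything this sketch shows.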

