[tei-council] Prefacing ids with "tei_"

Piotr Bański bansp at o2.pl
Tue Apr 24 07:03:04 EDT 2012


1. I agree that a positive message "please add to your filters a
positive exception for 'tei_.+'" is sociologically better than the
negative message "don't reject 'msad' because we happen to use it". It
has a better chance of being successful, even if only as a courtesy from
one open-content developer to another.

2. Prefacing all ids with "tei_" does seem more consistent; at least
it doesn't raise user questions such as "why do you prefix here and not
there", which may, psychologically, contribute to the impression that
the TEI is more complex than it really is (and it is quite complex anyway).

3. Since (as I understand this) we're talking about an internal change
targeting the web Guidelines and related resources (also the kind of
output that should not be worked on directly), the results may be tested
and quickly fixed if necessary; and "being overprotective", which I
didn't like before, is restricted to giving an example of good practice
that is not imposed on the individual encoder.


So, despite my earlier nitpicks and Sebastian's remarks on why it is not
necessary from the point of view of the system to prefix all IDs, I tend
to agree that 2 and 3 above may swing the balance towards Martin's
uniform solution.
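
To make the uniform solution concrete, here is a minimal sketch, assuming
the prefix is simply prepended to the identifier variable from the
textstructure.xsl code quoted below. The idPrefix parameter and the
stripped-down template around it are hypothetical, for illustration only;
they are not taken from Sebastian's stylesheets, and the link text is a
placeholder for the real appN template.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="tei">

  <!-- Hypothetical parameter (an assumption, not in the existing code):
       one place to define the prefix applied to every id in the output. -->
  <xsl:param name="idPrefix">tei_</xsl:param>

  <xsl:template match="tei:app">
    <!-- Same fallback chain as the quoted code (@xml:id, then @n, then a
         running number), with the prefix emitted first in all three cases. -->
    <xsl:variable name="identifier">
      <xsl:value-of select="$idPrefix"/>
      <xsl:text>App</xsl:text>
      <xsl:choose>
        <xsl:when test="@xml:id">
          <xsl:value-of select="@xml:id"/>
        </xsl:when>
        <xsl:when test="@n">
          <xsl:value-of select="@n"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:number count="tei:app" level="any"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:variable>

    <a class="notelink" href="#{$identifier}">
      <!-- Placeholder link text; the real stylesheet calls appN here. -->
      <sup><xsl:text>*</xsl:text></sup>
    </a>
  </xsl:template>

</xsl:stylesheet>
```

With this sketch, an app element carrying n="3" and no xml:id would link
to #tei_App3. Ids produced by generate-id() could presumably be wrapped
the same way, e.g. concat($idPrefix, generate-id()).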

  P.


On 23/04/12 23:43, Martin Holmes wrote:
> On 12-04-23 02:15 PM, Piotr Bański wrote:
>> Hi Martin,
>>
>> This may be a classic case of my missing the point, but something
>> doesn't click in my mind, so I'd better comment on that, just in case:
>>
>> On 23/04/12 20:48, Martin Holmes wrote:
>>> I'm planning to make a significant number of changes to Sebastian's
>>> stylesheets to implement our decision to preface all ids in the web
>>> output with "tei_", in an effort to avoid the problems caused by AdBlock
>>> Plus. Before I do, having looked at the code, there are a couple of
>>> things I wanted to get some feedback on:
>>>
>>> 1. Generated ids. There are some places in which ids in the output are
>>> generated using the XPath generate-id() function:
>> [..]
>>> These result in ids that look like random sequences of characters. Do we
>>> need to preface these with "tei_"? My instinct says yes -- after all,
>>> such a random sequence is perfectly likely to end up with content which
>>> might trigger an AdBlock filter, so we might as well protect it in the
>>> normal way.
>>
>> I understood that "msad" triggered the false positive, and the (arguable
>> and, IIRC, argued) decision was for the TEI to adjust to AdBlock.
> 
> Yes, the false positive was triggered by "msad".
> 
>> But if the problem was the matching of the entire string "msad" only (as
>> opposed to "tei_msad"), then I am led to conclude that a substring match
>> does not trigger the false positive. If that is true, then
>> generate-id() can't within reasonable probability generate an
>> offensive string, and only that string (without extra characters).
> 
> It is possible to use regular expressions in Adblock Plus lists, so 
> partial matches may trigger a block. If this happens, one advantage of 
> our tei_ prefix will be that we can ask filter list writers to refine 
> their regexp to exclude our ids, using the prefix, while still remaining 
> able to block their original target. I think given that option, they'll 
> be more likely to respond to bug reports from us.
> 
>> As to your second question, I'd say the prefix introduces a layer of
>> safety indeed. But I'm not sure to what extent we as the Council need to
>> bother about someone using silly @n atts as the basis for a silly algorithm
>> of @n->@id. We might end up being overprotective here.
> 
> True. On consideration, I think we should add the prefix in all cases, 
> for the sake of consistency.
> 
> Cheers,
> Martin
> 
>> Best,
>>
>>    P.
>>
>>
>>
>>>
>>> 2. @n attributes. There's one place where the @n attribute can be used
>>> to create an id attribute in the output (in textstructure.xsl, code
>>> below). Should this also be prefaced by "tei_"? I'm not sure about this,
>>> because depending on the contents of the @n, the result might be
>>> puzzling. On the other hand, I don't think that @n can be relied upon to
>>> work as an id attribute anyway, can it?
>>>
>>> <xsl:variable name="identifier">
>>>   <xsl:text>App</xsl:text>
>>>   <xsl:choose>
>>>     <xsl:when test="@xml:id">
>>>       <xsl:value-of select="@xml:id"/>
>>>     </xsl:when>
>>>     <xsl:when test="@n">
>>>       <xsl:value-of select="@n"/>
>>>     </xsl:when>
>>>     <xsl:otherwise>
>>>       <xsl:number count="tei:app" level="any"/>
>>>     </xsl:otherwise>
>>>   </xsl:choose>
>>> </xsl:variable>
>>>
>>> <xsl:choose>
>>>   <xsl:when test="$footnoteFile='true'">
>>>     <a class="notelink" href="{$masterFile}-notes.html#{$identifier}">
>>>       <sup>
>>>         <xsl:call-template name="appN"/>
>>>       </sup>
>>>     </a>
>>>   </xsl:when>
>>>   <xsl:otherwise>
>>>     <a class="notelink" href="#{$identifier}">
>>>       <sup>
>>>         <xsl:call-template name="appN"/>
>>>       </sup>
>>>     </a>
>>>   </xsl:otherwise>
>>> </xsl:choose>
>>>
>>> Cheers,
>>> Martin
>>
> 


