[tei-council] Prefacing ids with "tei_"

Martin Holmes mholmes at uvic.ca
Mon Apr 23 17:43:46 EDT 2012


On 12-04-23 02:15 PM, Piotr Bański wrote:
> Hi Martin,
>
> This may be a classic case of my missing the point, but something
> doesn't click in my mind, so I'd better comment on that, just in case:
>
> On 23/04/12 20:48, Martin Holmes wrote:
>> I'm planning to make a significant number of changes to Sebastian's
>> stylesheets to implement our decision to preface all ids in the web
>> output with "tei_", in an effort to avoid the problems caused by AdBlock
>> Plus. Before I do, having looked at the code, there are a couple of
>> things I wanted to get some feedback on:
>>
>> 1. Generated ids. There are some places in which ids in the output are
>> generated using the XPath generate-id() function:
> [..]
>> These result in ids that look like random sequences of characters. Do we
>> need to preface these with "tei_"? My instinct says yes -- after all,
>> such a random sequence is perfectly likely to end up with content which
>> might trigger an AdBlock filter, so we might as well protect it in the
>> normal way.
>
> I understood that "msad" triggered the false positive, and the (arguable
> and, IIRC, argued) decision was for the TEI to adjust to AdBlock.

Yes, the false positive was triggered by "msad".

> But if the problem was the matching of the entire string "msad" only (as
> opposed to "tei_msad"), then I am led to conclude that a substring match
> does not trigger the false positive. If that is true, then
> generate-id() can't within reasonable probability generate an
> offensive string, and only that string (without extra characters).

It is possible to use regular expressions in Adblock Plus lists, so 
partial matches may trigger a block. If this happens, one advantage of 
our tei_ prefix will be that we can ask filter list writers to refine 
their regexp to exclude our ids, using the prefix, while still remaining 
able to block their original target. I think given that option, they'll 
be more likely to respond to bug reports from us.
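
To illustrate the point about partial matches, here is a rough Python
sketch. It is not Adblock Plus filter syntax; the two patterns and the
sample ids are made up for the example. The first pattern matches "msad"
anywhere in an id, the second is the kind of refinement a list writer
could make to skip ids that start with "tei_".

```python
import re

# Hypothetical patterns for illustration only; real filter lists
# use their own syntax and may be written quite differently.
BROAD = re.compile(r"msad")               # partial match anywhere in the id
REFINED = re.compile(r"^(?!tei_).*msad")  # same target, but skips ids starting with "tei_"


def is_blocked(pattern, element_id):
    """Return True if the pattern matches somewhere in the id."""
    return pattern.search(element_id) is not None


# Made-up sample ids: a bare name, a generated-looking id, and prefixed versions.
sample_ids = ["msad", "d1e4msad7", "tei_msad", "tei_d1e4msad7"]

for element_id in sample_ids:
    print(element_id, is_blocked(BROAD, element_id), is_blocked(REFINED, element_id))
```

With the broad pattern every sample id is caught, prefixed or not; with
the refined one only the unprefixed ids are.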

> As to your second question, I'd say the prefix introduces a layer of
> safety indeed. But I'm not sure to what extent we as the Council need to
> bother about someone using silly @n atts as the basis for a silly algorithm
> of @n->@id. We might end up being overprotective here.

True. On consideration, I think we should add the prefix in all cases, 
for the sake of consistency.
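
As a rough sketch of what "all cases" would mean for the identifier
built in the textstructure.xsl fragment quoted below, here is the same
fallback order written in Python. The function name and the placement of
the prefix in front of "App" are my assumptions; the real change would be
made in the XSLT.

```python
from typing import Optional

PREFIX = "tei_"  # the prefix we agreed on for ids in the web output


def app_identifier(xml_id: Optional[str], n: Optional[str], position: int) -> str:
    """Hypothetical helper mirroring the xsl:choose fallback order.

    Uses @xml:id if present, else @n, else a running number, and always
    adds the prefix, whichever branch supplied the value.
    """
    if xml_id is not None:
        suffix = xml_id
    elif n is not None:
        suffix = n
    else:
        suffix = str(position)
    return f"{PREFIX}App{suffix}"


print(app_identifier("msad", None, 1))
print(app_identifier(None, "12", 2))
print(app_identifier(None, None, 3))
```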

Cheers,
Martin

> Best,
>
>    P.
>
>
>
>>
>> 2. @n attributes. There's one place where the @n attribute can be used
>> to create an id attribute in the output (in textstructure.xsl, code
>> below). Should this also be prefaced by "tei_"? I'm not sure about this,
>> because depending on the contents of the @n, the result might be
>> puzzling. On the other hand, I don't think that @n can be relied upon to
>> work as an id attribute anyway, can it?
>>
>> <xsl:variable name="identifier">
>>   <xsl:text>App</xsl:text>
>>   <xsl:choose>
>>     <xsl:when test="@xml:id">
>>       <xsl:value-of select="@xml:id"/>
>>     </xsl:when>
>>     <xsl:when test="@n">
>>       <xsl:value-of select="@n"/>
>>     </xsl:when>
>>     <xsl:otherwise>
>>       <xsl:number count="tei:app" level="any"/>
>>     </xsl:otherwise>
>>   </xsl:choose>
>> </xsl:variable>
>>
>> <xsl:choose>
>>   <xsl:when test="$footnoteFile='true'">
>>     <a class="notelink" href="{$masterFile}-notes.html#{$identifier}">
>>       <sup>
>>         <xsl:call-template name="appN"/>
>>       </sup>
>>     </a>
>>   </xsl:when>
>>   <xsl:otherwise>
>>     <a class="notelink" href="#{$identifier}">
>>       <sup>
>>         <xsl:call-template name="appN"/>
>>       </sup>
>>     </a>
>>   </xsl:otherwise>
>> </xsl:choose>
>>
>> Cheers,
>> Martin
>

-- 
Martin Holmes
University of Victoria Humanities Computing and Media Centre
(mholmes at uvic.ca)

