dhcs minutes: 10/17

From: Andrea K. Laue (akl3s@cms.mail.virginia.edu)
Date: Thu Oct 18 2001 - 09:44:07 EDT

Next message: Andrea K. Laue: "dhcs: stand-off markup"

Previous message: andrea laue: "dhcs reminder: meeting 10/17"
Next in thread: Jerome McGann: "Re: dhcs minutes: 10/17"

Business:

AL: Sowa, Bringsjord and Kirschenbaum have all confirmed. Trevor Harris
supposedly will. We can invite one more person. Who?

Proposed Speakers / Topic

Kathy Ball -- Georgetown, nlp <http://www.georgetown.edu/cball/cball.html>
Niels Finnemann -- history of computer/social impacts
Lev Manovich -- new media aesthetics, <http://www.manovich.net/index.html>
Martha Blodgett -- information communities in library context, UVa
Kathy Ryall -- Mitsubishi Lab, interface,
<http://www.merl.com/people/ryall/>
Jim French -- navigation and information retrieval, UVa
<http://www.cs.virginia.edu/~french/>
Dave Luebke -- graphics, UVa
<http://www.cs.virginia.edu/brochure/profs/luebke.html>
Dave Brogan -- graphics, UVa
<http://www.cs.virginia.edu/brochure/profs/brogan.html>
Aldona Towner -- is there an nlp person at IBM?
Ben Shneiderman -- information visualization
<http://www.cs.umd.edu/~ben/>
Bill Buxton -- interface <http://www.billbuxton.com/>
David Noble -- electronic communities, coming here anyway in November,
maybe just sit-in

Topic: Data Structures
Leader(s): John Unsworth and Daniel Pitti

JU: What we're not talking about today: the very basic data structures
that computers use to manage data--lists, arrays, hash tables, etc.;
character sets (?).

DP: Usually use the term structured information rather than data
structures.

AL: To what extent do we want to step back from actual implementations--to
talk about classification systems, ontologies, logic, etc. before we talk
about DTDs and databases?

JM: Historicization is important. We need to talk about classification
systems from Aristotle on . . . It's a misnomer to state that we're
talking about structured information; really, we're talking about
re-structured information.

JD: How are we to understand the development of database structures in the
history of set theory? Relational databases first appeared in the
late '60s and early '70s. Why is this?

JU: The structures we're talking about assume important things about the
data: hierarchy, atomic units, relations.

GR: Would you consider a character set a data structure?

DP: Computers weren't really interesting to humanists until computers were
able to name things--to specify their own semantics. To name things and
to specify structures. Two technologies most frequently used--databases
and markup languages.

Not talking about visually or aurally structured information here, just
text.

Markup. The _Gentle Introduction_ seems quite ancient now. Written about
SGML without any knowledge of XML; talks about many things which have now
been eliminated in XML.

Email between Allen Renear and Jerry McGann.

Encoding is an interpretation. A DTD could not be written to capture all
aspects of a text.

Important distinction: procedural vs. declarative markup

The _Gentle_ doesn't delve into the philosophical issues.

Primary assumption that differentiates markup from relational databases:
hierarchy. Markup assumes that text is inherently hierarchical.

JM: Could you say something about standoff markup?

JD: What kinds of information are appropriate for markup, for databases,
and for a third category of structures?

RD: Let's take a step back and talk about theories of representation. It's
just as important to talk about what's not represented in any structure.
Let's look at a wider range of knowledge representation, theories of this
beyond applications in the computer.

DP: The introduction of computers forces us to concentrate on structures
that are processable.

What types of information best fit our two tools, databases and markup?

Database:
info. w/ repeating patterns

Markup Language:
ordered information--is the order of the objects important?

WM: What is the "is" of text? Renear says that "text is." How well
accepted is that?

WM: Are we relativists or positivists? Are we theorists or practitioners?

AL: I would like to suggest that we first attempt to be theoreticians.
And then second, practitioners.

JU: I suggest that we work first as practitioners and second as
theoreticians. We should start with constraints.

TH: Fundamental conflict: representing things for what they are vs.
representing things for a purpose.

JU: Ask students to model their families as XML, object-oriented database,
relational database.
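
A minimal sketch of what this exercise might look like, using Python's
standard library. Everything here is an assumption made for illustration:
the family (Ann, Bob, Cy), the table and column names, and the use of a
plain Python class as a stand-in for an object-oriented database. The same
question ("who are Ann's children?") is answered from each model.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Hypothetical family, invented for this sketch.

# 1. Markup: nesting encodes parent/child, and document order is preserved.
FAMILY_XML = """
<family name="Example">
  <person name="Ann">
    <person name="Bob"/>
    <person name="Cy"/>
  </person>
</family>
"""

def children_from_xml(parent_name):
    root = ET.fromstring(FAMILY_XML)
    for person in root.iter("person"):
        if person.get("name") == parent_name:
            # Direct child elements only, in document order.
            return [child.get("name") for child in person.findall("person")]
    return []

# 2. Relational: flat tables, with the relationship stored as its own rows.
def children_from_sql(parent_name):
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
    con.execute(
        "CREATE TABLE parent_of ("
        "parent_id INTEGER REFERENCES person(id), "
        "child_id INTEGER REFERENCES person(id))"
    )
    con.executemany("INSERT INTO person VALUES (?, ?)",
                    [(1, "Ann"), (2, "Bob"), (3, "Cy")])
    con.executemany("INSERT INTO parent_of VALUES (?, ?)", [(1, 2), (1, 3)])
    rows = con.execute(
        "SELECT c.name FROM person p "
        "JOIN parent_of r ON r.parent_id = p.id "
        "JOIN person c ON c.id = r.child_id "
        "WHERE p.name = ? ORDER BY c.id",
        (parent_name,),
    ).fetchall()
    con.close()
    return [name for (name,) in rows]

# 3. Objects: each person holds direct references to other persons.
@dataclass
class Person:
    name: str
    children: list = field(default_factory=list)

def children_from_objects(parent_name):
    people = {"Ann": Person("Ann", [Person("Bob"), Person("Cy")])}
    parent = people.get(parent_name)
    return [child.name for child in parent.children] if parent else []
```

One thing the exercise tends to expose: the nested XML gives each person
exactly one parent element, so a second parent needs an ID reference or a
duplicated element, while the parent_of table simply takes another row.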

JD: The classification aspect is one thing we're talking about here, but
aren't we also talking about metalanguages? How do we talk about the
languages that we are using?

JU: What do you mean by metalanguage? Grammars? Semantics?

JD: We should be aware of what it means to talk about a system of
representation as a system of representation. These tools assume that
we're making a system of representations about a system of
representations. Should we talk about meta structures and what they
"mean"?

Certain ways of assuming that you enter data into a database or a markup
language. How do we describe the way we use this?

JU: Rules of integrity. DTD--you must parse a document according to rules
of integrity. Relational database--relational algebra enforces rules of
integrity. Object-oriented databases don't enforce rules of integrity.

What's the syntax here? And where are the semantics of the syntax
expressed?

In object-oriented systems, perhaps you hide the syntax in the methods.
You don't declare at the beginning very many rules. There's no way to
tell if the objects are internally consistent.
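
A rough sketch of the contrast, assuming Python's standard library. Two
caveats: ElementTree checks only well-formedness, not validity against a
DTD (that needs a validating parser), and the plain Book class is only an
analogy for an object store that declares no rules up front. The table
names and the Book class are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

def xml_accepts(doc):
    """Return True if the parser accepts the document as well-formed."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False

def db_accepts_orphan():
    """Try to insert a book whose author does not exist."""
    con = sqlite3.connect(":memory:")
    con.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    con.execute("CREATE TABLE author (id INTEGER PRIMARY KEY)")
    con.execute(
        "CREATE TABLE book ("
        "id INTEGER PRIMARY KEY, "
        "author_id INTEGER NOT NULL REFERENCES author(id))"
    )
    try:
        con.execute("INSERT INTO book VALUES (1, 99)")  # no author 99
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        con.close()

class Book:
    """A plain object: nothing is declared, so nothing is checked."""
    def __init__(self, author):
        self.author = author

def object_accepts_orphan():
    book = Book(author=None)  # silently accepted
    return book.author is None
```

The parser and the database refuse the inconsistent input at the point of
entry; the object is created without complaint, and any consistency check
would have to live in its methods.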

GR: Process of structuring. How do you go about trying to find or force
structure on data? There is a moment, when you're still trying to discover
the structure, when exploratory markup is very helpful. Markup languages
like CoCa. (TACT uses CoCa.) Developed by Susan Hockey. Used in
linguistics.

TH: CoCa was a late '70's markup language that was developed on a
particular computer.

JM: Has a brief description of it in her book.

GR: Key philosophical thing. "1" is true of "act" until you get to "2."

<act 1> xxx xxx xxx xxx <act 2> xxx xxx xxx
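
A minimal sketch of that reading, in Python. It assumes tags of the form
<name value> exactly as in the line above; the function name and the
output shape are invented for illustration and are not taken from CoCa or
TACT. Each tag sets a variable, and the value stays in force for every
following word until another tag changes it.

```python
import re

# Matches tags like <act 1>: a variable name, whitespace, then a value.
TAG = re.compile(r"<(\w+)\s+([^<>]+?)>")

def annotate(text):
    """Pair each word with the variable values in force where it occurs."""
    state = {}
    annotated = []
    pos = 0
    for match in TAG.finditer(text):
        # Words before this tag still carry the previous state.
        for word in text[pos:match.start()].split():
            annotated.append((word, dict(state)))
        # The tag changes one variable; nothing is opened or closed.
        state[match.group(1)] = match.group(2).strip()
        pos = match.end()
    for word in text[pos:].split():
        annotated.append((word, dict(state)))
    return annotated

for word, state in annotate("<act 1> xxx xxx <act 2> xxx"):
    print(word, state)
```

Because each variable is tracked independently, a second variable (say
<scene 2>) can change at any point without having to nest inside act.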

SR: Suspicious of distinction between procedural and descriptive.

JU: Procedures are, in fact, descriptive. But only implicitly
descriptive.

SR: XML is a procedural language that doesn't have its procedures defined
yet.

DP: Much of what people actually mark up is implicitly procedural. You
want something to happen to the text.

TH: You do this with a purpose. You want to call a procedure at the last
minute, maybe, but you do have an intention.

DP: 125 different occasions for which italics were used in the OED.
Should we make 125 tags?

SR: Almost the same as saying that typography is not semantic. Or layout
is not semantic. That's just wrong.

DP: Maybe distinction should be noun vs. verb.

GR: This method--CoCa--describes a very humanities-oriented method of
processing the text. XML encourages another, a distanced evaluation of
the entire structure before entering into it. CoCa describes a linear
method of reading. It's very hard to do document analysis in a
hierarchical manner. CoCa is serial.

TDBSGML -- with a lookup table, you can take a TACT database and convert
it to a TEI Lite document.

JU: Two questions. milestone vs. continuation of truth until stated
otherwise

GR: Instead of "closing" tags, you change the value of the variable to
"off" or "stop."

Good exploratory markup language doesn't enforce hierarchy from the
beginning.

JU: Is this pointing to a difference between a deductive and an inductive
approach? The inductive approach would use exploratory markup.

GR: How do we do document analysis as a practice?

JD: Pedagogical methods. CoCa might be a very good pedagogical tool.
Doesn't require a pre-existing DTD.

JU: This points to how comfortable we are about some positivist notions of
text. We laugh about turning metaphor "off," but yet we have a sense of
metaphor as a bounded thing.

What different models or tools might be more appropriate at different
moments of analysis, different places in the process?

Or maybe have students mark up a text, feed it to "Fred" or another agent
that will return the implicit DTD, and see how they structured it.
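
A very rough sketch of the idea in Python. This is not Fred's algorithm;
it only records which element names occur inside which, and prints
permissive DTD-style declarations. The function names and the sample
document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def implicit_structure(doc):
    """Map each element name to the child element names seen inside it."""
    root = ET.fromstring(doc)
    model = {}
    for parent in root.iter():
        kids = model.setdefault(parent.tag, set())
        for child in parent:
            kids.add(child.tag)
    return {tag: sorted(kids) for tag, kids in model.items()}

def as_dtd_like(model):
    """Render the model as loose declarations (mixed content everywhere)."""
    lines = []
    for tag in sorted(model):
        # Order and repetition are ignored: any observed child, any number.
        content = " | ".join(["#PCDATA"] + model[tag])
        suffix = "*" if model[tag] else ""
        lines.append(f"<!ELEMENT {tag} ({content}){suffix}>")
    return "\n".join(lines)

sample = "<play><act><scene/><scene/></act><act><speech/></act></play>"
print(as_dtd_like(implicit_structure(sample)))
```

Even this crude summary shows students what they committed to: here, that
an act may contain scenes in one place and bare speeches in another.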

RD: Look at theories of text. Then have students try to mark up texts
according to these different theories.

JU: In February we'll have Paul Eggert visit and talk about
JIT--just-in-time markup.

Email URL about this.
<http://idun.itsc.adfa.edu.au/ASEC/PWB_REPORT/choice.html>


This archive was generated by hypermail 2b30 : Thu Oct 18 2001 - 09:44:14 EDT