Good morning all,
In our final meeting of the semester, we will discuss a "straw man"
curriculum proposed by Geoff and Worthy as well as the topics of
digitization and sampling. Below you'll find the proposed
curriculum. Soon Geoffrey will post a message detailing the readings
available on toolkit.
best,
Andrea
________________________________________
Knowledge Representation: A Straw Curriculum (for burning)
1. The context of this straw curriculum
In our discussions so far we have looked at a number of topics that
could be covered in the KR course for the Digital Humanities M.A.
program. Many of these topics are so interesting that they could
consume the whole year, for which reason we decided to devote the Dec.
12th meeting to reviewing the course as a whole. The questions we can
consider are:
1.1 What are the objectives of the KR course within the whole
program? What absolutely needs to be covered in the course and what
might be covered in other courses or not at all?
1.2 Of the topics that can be covered, what is the best order? What
is the logic of the order?
1.3 What should be the relationship between hands-on training in the
course and theoretical/historical discussion? Should there be a
training component and if so how should it connect to the content of
the course?
1.4 What are the implications of the design of the KR course on the
other courses (the Design course and the Software Engineering course)?
1.5 What remaining issues are there that have come out of the
discussion that need to be kept in mind?
1.6 How can the work of the KR seminar be appropriately gathered and
made available to those who teach the course or take it?
What follows are possible answers to these questions:
2. Proposed objectives
One of the objectives of the program is to produce students who can
design, implement, and report on a humanities computing project of
the sort that IATH has pioneered. To do this they need to be aware
of the variety of interesting hc projects; they
need to have the theoretical background to understand and discuss
them; they need to be aware of the tools and techniques typically
used in such projects; they should be able to use a subset of those
tools and techniques appropriately; they need to be able to design a
project and work with others to see it through; they need to be able
to talk and write about a project; and they need to have a sense of
the hc community in which they will participate.
In other words, the program is project-oriented, and the courses should
be designed to lead students to the point where they can do a thesis
project that uses humanities computing in their field. The logic of
the courses, order of courses, and order of topics within the KR
course could be structured around the development cycle of an ideal
hc project. In this model the courses would work like this:
2.1 The Design course would (among other things) expose the students
to a variety of projects so that they have a sense of what can be
done. The course would give them a sense of the outcomes, since
well-designed projects have to keep outcomes in mind. When looking at the
case studies the students would review not just the final product,
but also how the project developed and what problems were encountered
in order to give them a sense of realistic project cycles.
2.2 The KR course would then be designed to take the students through
one ideal project in great detail. The order of topics would follow
the order of activities that one does in a project. (More on this
later.)
2.3 The Software Engineering course would deal explicitly with the
design, management and development of projects, reflecting back on the
projects studied in the Design course and the ideal project followed
in the KR course, in order to prepare students to do a
digital humanities project successfully for their thesis.
In effect this curriculum is designed to take students through a
project cycle repeatedly before they get to their thesis. In the
first course they look at a number of projects; in the KR course they
go through the steps, looking at the issues in detail; and in the SE
course they reflect on the process.
3. The Knowledge Representation Course
The KR course would follow (or run in tandem with) a design or case
studies course where students are exposed to a representative sample
of humanities computing projects - both their design and their
implementation history. The KR course would recapitulate the order of
implementation of an ideal project. For each topic there would be
readings that provide the theoretical, historical and technical
background needed. Certain topics would be accompanied by technical
training.
The order of topics could be:
3.1 Knowledge Representation - An introduction to thinking about what
one can represent digitally about a phenomenon of interest to
humanities research. Students would think about what the objects
of study are, how they are related, and how the knowledge can be
represented digitally. Students would write project proposals and
specifications that describe the purpose and outcomes of a digital
humanities project.
3.2 Digitization - How to represent the artefacts of interest to the
Humanities on a computer in digital form. Students would learn about
digitizing texts, images, audio and video materials for use in a
project. Special attention would be paid to the issue of digitizing
texts including OCR and data entry. Students would scan a work of
interest to produce digital images and use OCR to produce an ASCII
e-text.
3.3 Markup and Enrichment - How to enrich digitized materials with
knowledge about the materials. How to add markup to enrich the
digitized materials. Special attention to TEI markup of electronic
texts. Students would mark up the e-text produced in 3.2 following the
TEI guidelines.
3.4 Data Structures - How to structure collections of digital
representations using different data structures. How to use databases
to organize information. Students would design a database to manage
the scanned images and textual information.
3.5 Transformation of Data - How to transform structured data using
databases and style sheets. Students would use their databases to
manipulate the information and XSL as an alternative way to transform
XML texts.
3.6 Programming - How to write programs to process information and
analyze it. What can we learn from digital information by processing
it? Students would write CGI programs to access the databases they
created or programs to process the e-texts they created. They would
learn a language like Python or Ruby.
3.7 Interface Design - How to create effective interfaces to digital
research information. Students would learn to use Flash as an
interface design tool that can display XML passed from a database or
server-side program.
3.8 Testing, Maintenance and Documentation - How to test a digital
humanities system. How to make sure it can be maintained and how to
document it so that it can be used and maintained. Students would
write reports and documentation for users as part of the interface.
It is possible that by the end of this course students will have
digitized texts, entered the texts into a database, designed an
interface in XHTML or Flash, written CGI programs to connect the
interface to the database/e-texts and written a report documenting
the process.
I have left out of this outline the broader intellectual issues that
would be discussed in the context of these topics. For example, in
the programming module the students should be reading about the
history of programming, logic and algorithms, types of programming
languages and so on.
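
As a rough illustration of the "programs to process the e-texts"
mentioned in 3.6, a minimal Python sketch might look like the
following. The sample fragment, its element names, and the function
name word_frequencies are assumptions made for illustration only; they
are not part of the proposal, and Ruby would serve equally well.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical TEI-like fragment standing in for the marked-up e-text
# produced in 3.2 and 3.3. The content is invented for illustration.
SAMPLE = """<text><body>
<head>A Sample Poem</head>
<l>The rose is red, the violet blue,</l>
<l>The rose is sweet, and so are you.</l>
</body></text>"""


def word_frequencies(xml_source):
    """Return a Counter of lower-cased words found in the text content."""
    root = ET.fromstring(xml_source)
    # itertext() yields the character data of every element, skipping tags.
    plain = " ".join(root.itertext())
    words = re.findall(r"[a-z']+", plain.lower())
    return Counter(words)


if __name__ == "__main__":
    for word, count in word_frequencies(SAMPLE).most_common(3):
        print(word, count)
```

The sketch separates the markup from the text before counting, which is
the simplest case of using the structure added in 3.3 to ask a question
of the material.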
4. Technical Training
We have discussed the question of what the students need to know
about programming and other technical subjects. In this model the
technical subjects covered in associated workshops would be:
4.1 Digitizing (and OCR) - 1 day
4.2 XML Markup with TEI - 5 days
4.3 Databases with MySQL - 3 days
4.4 XSL and CSS - 2 days
4.5 Programming with Ruby - 10 days
4.6 Flash - 5 days
This assumes they know XHTML, are comfortable with computers in
general and know enough Unix to get around.
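
To make the overall workshop commitment easier to see, the day counts
in 4.1 to 4.6 can be tallied with a short Python sketch. The figures
are copied from the list above; the names WORKSHOP_DAYS, total_days and
share are hypothetical and used only for this illustration.

```python
# Workshop lengths in days, copied from items 4.1 to 4.6.
WORKSHOP_DAYS = {
    "Digitizing (and OCR)": 1,
    "XML Markup with TEI": 5,
    "Databases with MySQL": 3,
    "XSL and CSS": 2,
    "Programming with Ruby": 10,
    "Flash": 5,
}


def total_days(workshops):
    """Sum the length of all workshops, in days."""
    return sum(workshops.values())


def share(workshops, name):
    """Fraction of the total workshop time taken by one workshop."""
    return workshops[name] / total_days(workshops)


if __name__ == "__main__":
    print("Total workshop days:", total_days(WORKSHOP_DAYS))
    print("Share given to programming:",
          round(share(WORKSHOP_DAYS, "Programming with Ruby"), 2))
```

On these figures the workshops come to 26 days in all, of which the
programming workshop accounts for 10.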
On the question of programming, the students need to know the
following:
4.7 The language of programming (including the variety of programming
languages). This is so that they can interact with programmers.
4.8 What programs and databases can do. This can be taught by example
(here are cases of things that particular combinations of programs
and data can do for humanists), or it can be taught by teaching
students to program.
While it would seem that we can actually get students to the point
where they can participate in a "Management of Large Design Projects"
course by doing only 4.7 and 4.8, without in-depth programming, I
would like to offer other reasons for teaching programming in
sufficient depth:
4.9 Without sufficient programming experience, students could do
interesting projects only if they had the money to hire a programmer.
We want to produce students who can, if unfunded, still implement projects.
This recognizes the reality that part of the culture of the
humanities is to be self-sufficient.
4.10 Programming skill would give these students a valuable asset and
an alternative career path. A responsible program should, wherever
possible, maximize the career paths open to graduates.
4.11 Programming is actually one of the few things that a typical
humanities graduate student could not learn on their own. What we
really are teaching when we teach programming is how to learn to
program so that they can keep on learning if they need to. A humanist
has learned how to learn things like markup languages, history of
computing, digitization, and transformation. They typically don't
know how to learn something like programming. By teaching them
programming once, we prepare them to learn new computing skills of a
different sort as they need to.
Other Technical Skills
In addition to the workshops that accompany the course, workshops on
other topics could be arranged. They might cover:
Photoshop for Digital Image Manipulation, Setting Up a Linux Box as a
DH Server, and Digital Audio and Video.
5. What is missing from this curriculum?
The outline of topics above does not deal with a number of topics in
the original list, but many of the topics not covered could be woven
in if there is time. The following topics might be difficult to
integrate either because of the logic of the proposal or because they
would take too long to do:
5.1 What is a computer? A discussion of computers, OSes, and so on.
This could go between 3.1 and 3.2.
5.2 Multimedia Manipulation and Design - the module on interface
would not cover image processing, visualization, and digital video,
though it could introduce them.
5.3 MOOs and Internet Communities - A MOO could be set up as an
extension of the course to create a community of students, alums, and
faculty.
5.4 Natural Language Processing and AI
5.5 Games and Instructional Technology
5.6 Time Modelling
5.7 Print as KR
6. Implications for other courses
We discussed the Software Engineering course. Worthy Martin suggested
it could be called the "Management of Large Design Projects" (MLD) course.
As such it would nicely follow the KR course (setting aside the
programming issue) in that KR would take them through a typical
project slowly so that they could make mistakes on the way which
would become fodder for the MLD course. In the MLD course they would
sit back and think about how projects should go, especially when the
projects are big.
On the issue of programming, this proposal is a compromise between
the let-them-learn-it-on-their-own view and a full semester
programming course. The idea is to introduce MySQL, Ruby and Flash in
appropriate doses in the KR course. Java could be taught in a
non-credit course in the summer or the Fall of the second year.
We discussed the possibility of an MLD course that could be attractive
to both CS and DH students. The idea would be that students would be
formed into teams with the DH students acting as clients, content
experts or managers/designers and the CS students acting as
programmers and software engineers. The course could thus benefit CS
by providing appropriate interdisciplinary experience while also
giving the DH students a chance to actually practice working with
computer scientists.
_____________
--
This archive was generated by hypermail 2b30 : Sat Dec 08 2001 - 11:33:20 EST