Re: June 11 Meeting--Reading

From: Daniel Pitti (dpitti@virginia.edu)
Date: Mon Jun 10 2002 - 14:36:34 EDT

    Melinda,

    Thanks for the update. A quick look suggests that the body of the paper
    (preceding the appendices) is about 5 pages longer than the August 2001
    draft. I cannot determine without a close reading if there are any major
    changes. But I'll try to find time before the meeting tomorrow to do so.

    Daniel

    At 01:26 PM 6/10/2002 -0400, you wrote:
    >For those of you who, like me, are just getting to this reading, I've
    >found that RLG has put up the final report of the ATDR at
    >http://www.rlg.org/longterm/repositories.pdf . I presume this document is
    >the one we should be looking at rather than the draft. See you all tomorrow.
    >
    >Melinda
    >
    >--On Thursday, May 16, 2002 2:55 PM -0400 Daniel Pitti
    ><dpitti@virginia.edu> wrote:
    >
    >>All,
    >>
    >>After reading a lot of the latest literature on digital repositories, I have
    >>made some progress in getting us moving once again on developing draft
    >>digital library policies.
    >>
    >>In preparation for our meeting in June, I would like all of you to read
    >>carefully Attributes of a Trusted Digital Repository (ATDR), an RLG-OCLC
    >>report: http://www.rlg.org/longterm/attributes01.pdf
    >>
    >>This joint RLG/OCLC report is inspired by the "Reference Model for an Open
    >>Archival Information System (OAIS)," [not to be confused with OAI] a
    >>framework for digital repositories developed by the space community (NASA and
    >>others). Though OAIS was developed by the space community, it has been well
    >>received by the archive, library, and museum communities, and with support
    >>from them, is nearing approval as an ISO standard. OAIS is very dense, but if
    >>you are feeling ambitious or suffering from insomnia, you will find the latest
    >>draft (July 2001) at http://www.ccsds.org/documents/pdf/CCSDS-650.0-R-2.pdf
    >>
    >>The ATDR uses OAIS terminology (which is quickly becoming the standard
    >>terminology for discussing digital repositories). It defines the OAIS terms,
    >>at least minimally, and so you can read it without reading OAIS.
    >>
    >>In addition to the readings, I have also drafted an outline organized around
    >>the statements of Repository Responsibilities in the ATDR report, in
    >>particular the "responsibility relies on" lists given under each major
    >>responsibility. I have tried to organize these around the three principal
    >>areas of responsibility and activity outlined in OAIS, which are submission,
    >>archiving, and dissemination. In front of these is simply a list of "policy
    >>areas" given in ATDR (I). The policy areas overlap with the submission,
    >>archiving, dissemination categories (II-IV). I nevertheless included this
    >>section because I wanted to make sure that we did not overlook anything in
    >>the "policy areas" that might not come up under the lists.
    >>
    >>My intention is that we go over each responsibility and what the
    >>responsibility relies on, and then begin to develop draft policies for each.
    >>As you can see, I am trying to approach this systematically. I think the
    >>first thing we will notice is that while the categories in the ATDR report
    >>are useful, they are insufficiently detailed. And so developing them will be
    >>the first order of business.
    >>
    >>After the "Organization of Digital Collecting Policy" below is a list of
    >>working assumptions that seem to me to provide a context and some guidance in
    >>our deliberations. These are, of course, open for debate, and, in fact,
    >>should be debated. And added to as well.
    >>
    >>That's all for now,
    >>Daniel
    >>
    >>Organization of Digital Collecting Policy
    >>
    >>Trusted Digital Repository
    >>I. Policy
    >>
    >>Follows documented policies and procedures that ensure the information is
    >>preserved against all reasonable contingencies and enables the information to
    >>be disseminated as authenticated copies of the original or as traceable to
    >>the original.
    >>
    >>A. Policies for collections development (e.g., selection
    >>and retention) that link to technical procedures about how and at what level
    >>materials are preserved and how access is provided both short and long term.
    >>
    >>B. Policies for access control to ensure all parties are protected,
    >>including authentication of users and disseminated materials.
    >>
    >>C. Policies for storage of materials, including service-level agreements
    >>with external suppliers.
    >>
    >>D. Policies that define the repository's designated community and
    >>describe its knowledge base.
    >>
    >>E. A rigorous system for updating policies and procedures in accordance
    >>with changes in technology and in the repository's designated community.
    >>
    >>F. Explicit links between these policies and procedures, allowing for
    >>easy application across heterogeneous collections.
    >>
    >>
    >>
    >>II. SIP/submission information package/intake or receipt
    >>
    >>A. Works closely with the repository's designated community to advocate
    >>the use of good and (where possible) standard practice in the creation of
    >>digital resources; this may include an outreach program for potential
    >>depositors.
    >>
    >>B. Negotiates for and accepts appropriate information from information
    >>producers and rights holders.
    >>
    >>a. Well-documented and agreed-on policies
    >>about what is selected for deposit, including, where appropriate, specific
    >>required formats.
    >>
    >>b. Effective procedures and workflows for obtaining copyright clearance
    >>for both short-term and immediate access, as necessary, and preservation.
    >>
    >>c. A comprehensive metadata specification and agreed-on standards for
    >>its implementation. This is critical for federated or networked repositories
    >>and includes standards for the provision of rights metadata from content
    >>providers and for representing technical metadata.
    >>
    >>d. Procedures and systems for ensuring the authenticity of submitted
    >>materials.
    >>
    >>e. Initial assessment of the completeness of the submission.
    >>
    >>f. Effective record keeping of all transactions, including ongoing
    >>relationships, with content providers.
    >>
    >>III. AIP/archive information package/care and feeding
    >>Obtains sufficient control of the information provided to support long-term
    >>preservation:
    >>
    >>A. Detailed analysis of an object or class of objects to assess its
    >>significant properties. Analysis should be automated as much as possible and
    >>informed by the collections management policy, rights clearances, the
    >>designated community's knowledge base, and policy restrictions on specific
    >>file formats.
    >>
    >>B. Verification and creation of bibliographic and technical metadata and
    >>documentation to support the long-term preservation of the digital object
    >>according to its significant properties and underlying technology or abstract
    >>form, with monitoring and updating of metadata as necessary to reflect
    >>changes in technology or access arrangements. This involves understanding how
    >>strategies for continuing access, such as migration and emulation, influence
    >>the creation of preservation metadata.
    >>
    >>C. A robust system of unique identification.
    >>
    >>D. A reliable method for encapsulating the digital object with its
    >>metadata in the archive.
    >>
    >>E. A reliable archival storage facility, including an ongoing program of
    >>media refreshment; a program of monitoring media; geographically distributed
    >>backup systems; routine authenticity and integrity checking of the stored
    >>object; disaster preparedness, response, and recovery policies and
    >>procedures; and security.
    >>
    >>IV. DIP/dissemination information package/delivery
    >>A. Determines, either by itself or with others, the users that make up
    >>its designated community, which should be able to understand the information
    >>provided. Analysis and documentation of the repository's designated
    >>community; for federated or cooperating repositories, a shared understanding
    >>of the designated community.
    >>
    >>B. Ensures that the information to be preserved is "independently
    >>understandable" to the designated community; that is, that the community can
    >>understand the information without needing the assistance of experts.
    >>
    >>a. Well-maintained and documented technical metadata that is kept aligned with
    >>the knowledge base of the designated community and with changing
    >>technologies.
    >>
    >>b. A "technology watch" to manage the risk as technology evolves and to
    >>provide continuing access and updated methods of access as necessary, such as
    >>new migrations or emulators.
    >>
    >>C. Makes the preserved information available to the designated
    >>community.
    >>a. A system for discovery of resources.
    >>
    >>b. Appropriate mechanisms for authentication of the digital materials.
    >>
    >>c. Access control mechanisms in accordance with licenses and laws, and
    >>an "access rights watch."
    >>
    >>d. Mechanisms for managing electronic commerce.
    >>
    >>e. User support programs.
    >>
    >>---------------------------------------------------------------------
    >>---------------------------------------------------------------------
    >>
    >>Assumption: (from TDR/RLG/OCLC)
    >>Preservation requires active management that begins at creation ... [p.18]
    >>----------------------------
    >>Assumption:
    >>
    >>Digital collecting or long-term preservation and access of digital resources
    >>(hereafter referred to as digital preservation and access: DPA) is and will
    >>be a responsibility shared with other repositories (libraries, archives,
    >>museums, and related non-profit and for-profit organizations and
    >>institutions). This assumption is based on three interrelated assumptions:
    >>
    >> 1) the long-term preservation and access of digital resources will
    >> be expensive, requiring that the burden of remembering the digital
    >> cultural artifacts deemed worth remembering be shared. No one
    >> repository will be able to effectively remember all worth
    >> remembering.
    >>
    >> 2) the authenticity and reliability of the "remembered" (the
    >> accuracy of our memory) and our judgement with respect to what is to
    >> be remembered will necessarily be subjected to evaluation. Such
    >> evaluation will necessarily be conducted by an authoritative body or
    >> bodies that arise from the cultural heritage repository communities,
    >> and, while users will rely in large part on the evaluation of the
    >> authoritative bodies, users will also, ultimately, be the arbiters
    >> of the quality of our memory, and the extent to which we can be
    >> trusted.
    >>
    >> 3) remembering, especially shared remembering, will necessarily
    >> require a large number of hardware, software, communication, and
    >> intellectual and procedural standards. In that standards are
    >> necessarily the product of communities sharing common interests and
    >> objectives, digital collecting will necessarily involve
    >> participating in the development of standards and mastering them.
    >>
    >>Assumption:
    >>
    >>In order to take responsibility for DPA, the repository must "control" that
    >>which is collected. In other words, the repository must have control over the
    >>files (both content and, if necessary, software) in order to be able to
    >>manage the DPA. Therefore access to digital content that is licensed, or
    >>licensed access software, cannot be "collected." As a long-term strategy, the
    >>repository needs to work with other repositories and with licensed content
    >>providers on a strategy for the development of "DPA-friendly" content, and for
    >>arrangements for transfer of control of such content to a trusted repository.
    >>(See e-journal Mellon project at www.clir.org/diglib/preserve/ejp.htm)
    >>
    >>Assumption:
    >>
    >>There are no existing, proven methods for DPA. There are several competing
    >>theoretical models that are being tested. No one of these may emerge as THE
    >>method, and a combination of methods may well emerge, with different
    >>production methods, technology and standards (or lack thereof), and
    >>publication content and functional objectives and different known and
    >>anticipated user requirements being taken into account.
    >>
    >>Assumption:
    >>
    >>The mutability of the technology, the growing interdependency of the various
    >>participants in scholarly communication (creators, producers-publishers,
    >>repositories, and users) and the lack of an effective political
    >>infrastructure to promote and develop cooperation and collaboration among
    >>them lead to economic uncertainty, but also to the need to develop policy that
    >>reflects both what is known and understood, and what is uncertain and
    >>changing.
    >>
    >>Assumption:
    >>
    >>The Reference Model for an Open Archival Information System (OAIS), originally
    >>developed by the space research community, has gained wide international
    >>acceptance as a "framework" for DPA. OAIS is currently being considered by
    >>the International Organization for Standardization (ISO), largely at the urging of the
    >>international archive and library communities. Virginia policy, as a member
    >>of the international library community, will work within the broad framework
    >>of OAIS, and will work within and participate in the ongoing international
    >>application of OAIS to the cultural heritage repository communities. At the
    >>level of repository administration, Virginia needs in particular to
    >>follow the OCLC/RLG report "Attributes of a Trusted Digital Repository."
    >>
    >>Assumption:
    >>
    >>Inspired in part by OAIS, there are several metadata initiatives which need
    >>to be followed. Some of these initiatives deal only with semantics, but
    >>others with both semantics and syntax. In the semantic category, particular
    >>attention needs to be paid to two reports by the OCLC/RLG Working Group on
    >>Preservation Metadata, "A Recommendation for Preservation Description
    >>Information," and "A Recommendation for Content Information." These two
    >>metadata initiatives are addressing essential OAIS requirements.
    >>
    >>Addressing descriptive data: Metadata Object Description Schema (MODS), an
    >>initiative led by the Library of Congress.
    >>
    >>METS ...
    >>
    >>Assumption:
    >>
    >>The current DL literature reflects two implicit (and sometimes almost
    >>explicit) assumptions: 1) for large scale collections, digital publications
    >>collected will be relatively simple (or discrete or close to it: one file, or
    >>at most only a "few" files), and either created in or migrated to a handful
    >>of representations (or formats). It is assumed that large, complex
    >>publications, with many interrelations between objects and/or many
    >>significant functional properties will be too expensive for archives/libraries
    >>to collect. It is assumed that the complexity of collecting the complex will
    >>be best addressed by emulation. SDS does not share this assumption. SDS, with
    >>its emphasis on behaviors, is "banking" on declarative standards (such as XSL
    >>and XQuery) as making the replication of behaviors over time affordable. This
    >>will need (humble) justification and argument.
    >>
    >>-----------
    >>
    >
    >Melinda Baumann
    >Director, Digital Library Production Services
    >University of Virginia Library
    >PO Box 400155
    >Charlottesville VA 22904-4155
    >baumann@virginia.edu (434) 243-8785



    This archive was generated by hypermail 2b30 : Mon Jun 10 2002 - 14:36:41 EDT