Re: June 11 Meeting--Reading

From: Melinda Baumann (mjb7q@cms.mail.virginia.edu)
Date: Mon Jun 10 2002 - 13:26:03 EDT

    For those of you who, like me, are just getting to this reading, I've found
    that RLG has put up the final report of the ATDR at
    http://www.rlg.org/longterm/repositories.pdf . I presume this document is the
    one we should be looking at rather than the draft. See you all tomorrow.

    Melinda

    --On Thursday, May 16, 2002 2:55 PM -0400 Daniel Pitti <dpitti@virginia.edu>
    wrote:

    > All,
    >
    > After reading a lot of the latest literature on digital repositories, I have
    > made some progress in getting us moving once again on developing draft
    > digital library policies.
    >
    > In preparation for our meeting in June, I would like all of you to read
    > carefully Attributes of a Trusted Digital Repository (ATDR), an RLG-OCLC
    > report: http://www.rlg.org/longterm/attributes01.pdf
    >
    > This joint RLG/OCLC report is inspired by the "Reference Model for an Open
    > Archival Information System (OAIS)," [not to be confused with OAI] a
    > framework for digital repositories developed by the space community (NASA and
    > others). Though OAIS was developed by the space community, it has been well
    > received by the archive, library, and museum communities, and with support
    > from them, is nearing approval as an ISO standard. OAIS is very dense, but if
    > you are feeling ambitious or suffering from insomnia, you will find the latest
    > draft (July 2001) at http://www.ccsds.org/documents/pdf/CCSDS-650.0-R-2.pdf
    >
    > The ATDR uses OAIS terminology (which is quickly becoming the standard
    > terminology for discussing digital repositories). It defines the OAIS terms,
    > at least minimally, and so you can read it without reading OAIS.
    >
    > In addition to the readings, I have also drafted an outline organized around
    > the statements of Repository Responsibilities in the ATDR report, in
    > particular the "responsibility relies on" lists given under each major
    > responsibility. I have tried to organize these around the three principal
    > areas of responsibility and activity outlined in OAIS, which are submission,
    > archiving, and dissemination. In front of these is simply a list of "policy
    > areas" given in ATDR (I). The policy areas overlap with the submission,
    > archiving, dissemination categories (II-IV). I nevertheless included this
    > section because I wanted to make sure that we did not overlook anything in
    > the "policy areas" that might not come up under the lists.
    >
    > My intention is that we go over each responsibility and what the
    > responsibility relies on, and then begin to develop draft policies for each.
    > As you can see, I am trying to approach this systematically. I think the
    > first thing we will notice is that while the categories in the ATDR report
    > are useful, they are insufficiently detailed. And so developing them will be
    > the first order of business.
    >
    > After the "Organization of Digital Collecting Policy" below is a list of
    > working assumptions that seem to me to provide a context and some guidance in
    > our deliberations. These are, of course, open for debate, and, in fact,
    > should be debated. And added to as well.
    >
    > That's all for now,
    > Daniel
    >
    > Organization of Digital Collecting Policy
    >
    > Trusted Digital Repository
    > I. Policy
    >
    > Follows documented policies and procedures that ensure the information is
    > preserved against all reasonable contingencies and enables the information to
    > be disseminated as authenticated copies of the original or as traceable to
    > the original.
    >
    > A. Policies for collections development (e.g., selection and retention)
    > that link to technical procedures about how and at what level materials are
    > preserved and how access is provided both short and long term.
    >
    > B. Policies for access control to ensure all parties are protected,
    > including authentication of users and disseminated materials.
    >
    > C. Policies for storage of materials, including service-level agreements
    > with external suppliers.
    >
    > D. Policies that define the repository's designated community and
    > describe its knowledge base.
    >
    > E. A rigorous system for updating policies and procedures in accordance
    > with changes in technology and in the repository's designated community.
    >
    > F. Explicit links between these policies and procedures, allowing for
    > easy application across heterogeneous collections.
    >
    >
    >
    > II. SIP/submission information package/intake or receipt
    >
    > A. Works closely with the repository's designated community to advocate
    > the use of good and (where possible) standard practice in the creation of
    > digital resources; this may include an outreach program for potential
    > depositors.
    >
    > B. Negotiates for and accepts appropriate information from information
    > producers and rights holders.
    >
    > a. Well-documented and agreed-on policies about what is selected for
    > deposit, including, where appropriate, specific required formats.
    >
    > b. Effective procedures and workflows for obtaining copyright clearance
    > for both short-term and immediate access, as necessary, and preservation.
    >
    > c. A comprehensive metadata specification and agreed-on standards for
    > its implementation. This is critical for federated or networked repositories
    > and includes standards for the provision of rights metadata from content
    > providers and for representing technical metadata.
    >
    > d. Procedures and systems for ensuring the authenticity of submitted
    > materials.
    >
    > e. Initial assessment of the completeness of the submission.
    >
    > f. Effective record keeping of all transactions, including ongoing
    > relationships, with content providers.
    >
    > III. AIP/archival information package/care and feeding
    > Obtains sufficient control of the information provided to support long-term
    > preservation:
    >
    > A. Detailed analysis of an object or class of objects to assess its
    > significant properties. Analysis should be automated as much as possible and
    > informed by the collections management policy, rights clearances, the
    > designated community's knowledge base, and policy restrictions on specific
    > file formats.
    >
    > B. Verification and creation of bibliographic and technical metadata and
    > documentation to support the long-term preservation of the digital object
    > according to its significant properties and underlying technology or abstract
    > form, with monitoring and updating of metadata as necessary to reflect
    > changes in technology or access arrangements. This involves understanding how
    > strategies for continuing access, such as migration and emulation, influence
    > the creation of preservation metadata.
    >
    > C. A robust system of unique identification.
    >
    > D. A reliable method for encapsulating the digital object with its
    > metadata in the archive.
    >
    > E. A reliable archival storage facility, including an ongoing program of
    > media refreshment; a program of monitoring media; geographically distributed
    > backup systems; routine authenticity and integrity checking of the stored
    > object; disaster preparedness, response, and recovery policies and
    > procedures; and security.
    >
    > IV. DIP/dissemination information package/delivery
    > A. Determines, either by itself or with others, the users that make up
    > its designated community, which should be able to understand the information
    > provided. Analysis and documentation of the repository's designated
    > community; for federated or cooperating repositories, a shared understanding
    > of the designated community.
    >
    > B. Ensures that the information to be preserved is "independently
    > understandable" to the designated community; that is, that the community can
    > understand the information without needing the assistance of experts.
    >
    > a. Well-maintained and documented technical metadata that is kept aligned
    > with the knowledge base of the designated community and with changing
    > technologies.
    >
    > b. A "technology watch" to manage the risk as technology evolves and to
    > provide continuing access and updated methods of access as necessary, such as
    > new migrations or emulators.
    >
    > C. Makes the preserved information available to the designated community.
    > a. A system for discovery of resources.
    >
    > b. Appropriate mechanisms for authentication of the digital materials.
    >
    > c. Access control mechanisms in accordance with licenses and laws, and
    > an "access rights watch."
    >
    > d. Mechanisms for managing electronic commerce.
    >
    > e. User support programs.
    >
    > ---------------------------------------------------------------------
    > ---------------------------------------------------------------------
    >
    > Assumption: (from TDR/RLG/OCLC)
    > Preservation requires active management that begins at creation ... [p.18]
    > ----------------------------
    > Assumption:
    >
    > Digital collecting or long-term preservation and access of digital resources
    > (hereafter referred to as digital preservation and access: DPA) is and will
    > be a responsibility shared with other repositories (libraries, archives,
    > museums, and related non-profit and for-profit organizations and
    > institutions). This assumption is based on three interrelated assumptions:
    >
    > 1) the long-term preservation and access of digital resources will
    > be expensive, requiring that the burden of remembering the digital
    > cultural artifacts deemed worth remembering be shared. No one
    > repository will be able to effectively remember all worth
    > remembering.
    >
    > 2) the authenticity and reliability of the "remembered" (the
    > accuracy of our memory) and our judgement with respect to what is to
    > be remembered will necessarily be subjected to evaluation. Such
    > evaluation will necessarily be conducted by an authoritative body or
    > bodies that arise from the cultural heritage repository communities,
    > and, while users will rely in large part on the evaluation of the
    > authoritative bodies, users will also, ultimately, be the arbiters
    > of the quality of our memory, and the extent to which we can be
    > trusted.
    >
    > 3) remembering, especially shared remembering, will necessarily
    > require a large number of hardware, software, communication, and
    > intellectual and procedural standards. In that standards are
    > necessarily the product of communities sharing common interests and
    > objectives, digital collecting will necessarily involve
    > participating in the development of standards and mastering them.
    >
    > Assumption:
    >
    > In order to take responsibility for DPA, the repository must "control" that
    > which is collected. In other words, the repository must have control over the
    > files (both content and, if necessary, software) in order to be able to
    > manage the DPA. Therefore access to digital content that is licensed, or
    > licensed access software, cannot be "collected." As a long-term strategy, the
    > repository needs to work with other repositories and with licensed content
    > providers on a strategy for the development of "DPA-friendly" content, and for
    > arrangements for transfer of control of such content to a trusted repository.
    > (See e-journal Mellon project at www.clir.org/diglib/preserve/ejp.htm)
    >
    > Assumption:
    >
    > There are no existing, proven methods for DPA. There are several competing
    > theoretical models that are being tested. No one of these may emerge as THE
    > method, and a combination of methods may well emerge, with different
    > production methods, technology and standards (or lack thereof), and
    > publication content and functional objectives and different known and
    > anticipated user requirements being taken into account.
    >
    > Assumption:
    >
    > The mutability of the technology, the growing interdependency of the various
    > participants in scholarly communication (creators, producers-publishers,
    > repositories, and users) and the lack of an effective political
    > infrastructure to promote and develop cooperation and collaboration among
    > them lead to economic uncertainty, but also the need to develop policy that
    > reflects both what is known and understood, and what is uncertain and
    > changing.
    >
    > Assumption:
    >
    > The Reference Model for an Open Archival Information System (OAIS), originally
    > developed by the space research community, has gained wide international
    > acceptance as a "framework" for DPA. OAIS is currently being considered by
    > the International Organization for Standardization (ISO), largely at the
    > urging of the international archive and library communities. Virginia, as a
    > member of the international library community, will work within the broad
    > framework of OAIS, and will work within and participate in the ongoing
    > international application of OAIS to the cultural heritage repository
    > communities. At the level of repository administration, Virginia needs in
    > particular to follow the OCLC/RLG report "Attributes of a Trusted Digital
    > Repository."
    >
    > Assumption:
    >
    > Inspired in part by OAIS, there are several metadata initiatives which need
    > to be followed. Some of these initiatives deal only with semantics, but
    > others with both semantics and syntax. In the semantic category, particular
    > attention needs to be paid to two reports by the OCLC/RLG Working Group on
    > Preservation Metadata, "A Recommendation for Preservation Description
    > Information," and "A Recommendation for Content Information." These two
    > metadata initiatives are addressing essential OAIS requirements.
    >
    > Addressing descriptive data: Metadata Object Description Schema (MODS), an
    > initiative led by the Library of Congress.
    >
    > METS ...
    >
    > Assumption:
    >
    > The current DL literature reflects two implicit (and sometimes almost
    > explicit) assumptions: 1) for large scale collections, digital publications
    > collected will be relatively simple (or discrete or close to it: one file, or
    > at most only a "few" files), and either created in or migrated to a handful
    > of representations (or formats). It is assumed that large, complex
    > publications, with many interrelations between objects and/or many
    > significant functional properties will be too expensive for archives/libraries
    > to collect. It is assumed that the complexity of collecting the complex will
    > be best addressed by emulation. SDS does not share this assumption. SDS, with
    > its emphasis on behaviors, is "banking" on declarative standards (such as XSL
    > and XQuery) as making the replication of behaviors over time affordable. This
    > will need (humble) justification and argument.
    >
    > -----------

    Melinda Baumann
    Director, Digital Library Production Services
    University of Virginia Library
    PO Box 400155
    Charlottesville VA 22904-4155
    baumann@virginia.edu (434) 243-8785


