EAD ENCODING GUIDELINES

 FOR THE NATIONAL GALLERY OF THE SPOKEN WORD PROJECT

 

The following pages describe a prototype markup scheme based on recommendations found in the EAD Application Guidelines and the EAD Tag Library, recommendations presented at the Society of American Archivists EAD Workshop taught by Kris Kiesling and Michael Fox, the EAD Cookbook by Michael Fox, and decisions made specifically for the NGSW project.  I have included the reasons behind the individual encoding recommendations and instructions for applying the individual tags used.  The template with which you will work and a sample completed finding aid, encoded according to these guidelines, are also included.

 

In addition, you are encouraged to learn about EAD in general.  Three volumes published by the Society of American Archivists make a good start: the EAD Application Guidelines (chapter 3 in particular), the EAD 1.0 Tag Library, and Encoded Archival Description: Context, Theory and Case Studies.  These sources are located in both DSC offices.  Additional information may be found in the EAD Cookbook  (available from Michael Fox, Minnesota Historical Society) and on the official EAD web pages (maintained by the Library of Congress) <http://www.loc.gov/ead/> and the EAD Help web pages (maintained by the EAD Working Group) <http://jefferson.village.virginia.edu/ead/>.

 

 

The EAD Encoding Template

 

EAD offers great flexibility in how one might encode a finding aid.  The particular markup choices described here are designed to facilitate the transport of data between systems, the sharing of data through union catalogs, the reuse of data for different purposes, and navigation within inventories.  The following principles serve as the basis for the encoding template used in this project.  A text version of the template is located at http://www.lib.msu.edu/digital/vincent/Ead/Ead/VVLTemplate.htm

 

1.      The NGSW EAD encoding template is based on the conventions of archival arrangement and description, particularly collection-level arrangement and multi-level description, and adheres to standard library bibliographic practices.  Accordingly, the selection and structure of the template elements are based on descriptive standards, such as Archives, Personal Papers and Manuscripts (APPM), Anglo-American Cataloguing Rules, the Rules for Archival Description (RAD), and the General International Standard for Archival Description (ISAD(G)).

 

2.      The EAD-encoded finding aids created for the VVL materials will use indexing terms drawn from standardized vocabularies, such as the Library of Congress Subject Headings and the Getty Art & Architecture Thesaurus, in controlled elements to improve retrieval across metadata systems and facilitate conversion to MARC records for the MSU OPAC.

 

3.      The description embodied in the EAD-encoded finding aid will convey the multi-level, hierarchical nature of the materials themselves.  This approach reflects a fundamental and intrinsic characteristic of archival materials and their interrelationships.  The finding aids function as a resource discovery tool for the original sound recordings.  Description will proceed from the general to the specific, and reflect the hierarchical organization of the materials.  The creation of EAD-encoded archival finding aids is viewed as a practical and flexible solution to the problem of establishing intellectual and physical control over large bodies of materials.

 

4.      The EAD markup in this project will use XML syntax.  As web browsers develop the capacity for displaying XML encoded documents directly, the NGSW finding aids will be viewable directly from the XML, without the extra step of XML to HTML conversion.  Internet Explorer 5.0 and the beta version of Netscape 6.0 look  promising as XML display mechanisms.  We may, however, in the short term, need to convert our XML documents to HTML either on demand (on-the-fly) or in batch mode.

 

5.      The EAD template used in this project contains the set of core data elements for finding aids as recommended by the EAD Working Group of the Society of American Archivists.  These elements include the four elements listed in Appendix A of the EAD Application Guidelines:  the top-level Descriptive Identification, Biography or History, Scope and Contents, and Controlled Access tags.  Administrative Information elements such as access and use restrictions are also included for the entire collection and for individual items, as appropriate.  The template also includes elements needed to identify the electronic file that carries the encoded finding aid, as well as elements required by the EAD DTD, particularly in the EAD Header.

 

6.      The template contains certain data elements to facilitate display, internal navigation, data reuse, and the future migration of data to new systems.  For example, the template specifies ID attributes to facilitate internal hyperlinks between a Table of Contents generated for display purposes and the body of the finding aid.

 

7.      To make the display document self-explanatory, the template also specifies display labels using <head> elements for blocks of text and groups of related elements, default values for LABEL attributes, and explanatory text at specific points.  The explanatory text assists remote users who might not readily understand the purpose of elements such as Controlled Access terms.

 

8.      The template specifies encoding analogs for the appropriate MARC21 fields to facilitate reuse of the information in catalog records (MAGIC) and for the creation of Dublin Core metadata in the HTML output.  These encoding analogs may be revised after further discussion with Innovative Interfaces, Inc. 

 

9.      No significance is implied by the sequence in which elements appear in the templates except insofar as the EAD DTD enforces a given order.  The EAD DTD requires that the <eadheader> sub-elements follow a particular order and that the <did> element come first within <archdesc>.  However, one of the advantages of EAD encoding is that with the use of the stylesheets written in the XSL Transformation (XSLT) language, the arrangement of information upon output to the web or to paper is largely independent of the order of data in the source document. 

 

 

Template Details

 

This section describes the encoding elements we will use in the order they appear in the template and the rationale behind them.  These choices and data values are assumed by and incorporated into the template loaded onto our XMetal software.  Arbitrary values are assigned to ID attributes only to simplify the creation of a table of contents and other internal hyper-links.  Standard text is included in the template for certain elements, particularly <accessrestrict>,  <userestrict>, <scopecontent>, and <head> elements. 

 

Document Declaration

 

XML declaration

Every finding aid begins with an XML document declaration <?xml version="1.0"?> which specifies that XML syntax will be used for the EAD tags.  All finding aids will therefore end with the .xml extension.

 

Document Type Declaration (DTD)

Every finding aid must declare its document type, which in our case is EAD.  The specific declaration is <!DOCTYPE ead PUBLIC "-//Society of American Archivists//DTD ead.dtd (Encoded Archival Description (EAD) Version 1.0)//EN" "ead.dtd">, which specifies the version of EAD being used and the name of the DTD file.
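
Taken together, the two declarations open every finding aid.  The lines below simply restate the declarations given above as they would appear at the top of a file; the closing comment is illustrative only and is not part of the template.

```xml
<?xml version="1.0"?>
<!DOCTYPE ead PUBLIC "-//Society of American Archivists//DTD ead.dtd (Encoded Archival Description (EAD) Version 1.0)//EN" "ead.dtd">
<!-- The <ead> element and the rest of the finding aid follow. -->
```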

 

Header

 

EAD <ead>

An EAD-encoded document MUST start with this tag.

The RELATEDENCODING attribute is “MARC21”. 

 

EAD Header <eadheader>

This is also a required tag.

The AUDIENCE  attribute is set as “external”. 

The LANGENCODING attribute is set as “ISO 639-2”.

The FINDAIDSTATUS attribute is set as “EDITED-PARTIAL-DRAFT”.

The ID attribute is set as “a0”.

 

EAD Identifier  <eadid>

The SYSTEMID attribute is set as a coded value for the repository to ensure that the <eadid> number assigned is unique among the worldwide universe of EAD records.

We use the MSU Library’s OCLC code, EEM.

The SOURCE attribute is set as “OCLC”.

The TYPE attribute is set as “file”.

The ENCODINGANALOG attribute is set as “850”.

Record an identification number or name for the EAD record. [Ignore at present]
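
As a rough sketch, the opening of the header might look like the fragment below once the attribute values described above are in place.  Element and attribute names are written in lowercase because XML is case-sensitive; writing the FINDAIDSTATUS value in lowercase is an assumption about the XML version of the DTD, and the bracketed text is a placeholder.  The template loaded into XMetaL remains the authority for the exact values.

```xml
<ead relatedencoding="MARC21">
  <eadheader audience="external" langencoding="ISO 639-2"
             findaidstatus="edited-partial-draft" id="a0">
    <!-- Attribute values below are copied from the guidelines above -->
    <eadid systemid="EEM" source="OCLC" type="file" encodinganalog="850">[identifier, not yet assigned]</eadid>
    <!-- <filedesc> and <profiledesc> follow here -->
  </eadheader>
  <!-- <archdesc> follows here -->
</ead>
```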

 

File Description <filedesc>

Information recorded here refers to the EAD record itself, NOT to the audio recordings being described.

 

Title Statement <titlestmt>

Each EAD record requires a title. Our formula for the title is:

<titlestmt><titleproper>Main Speaker’s Name, <date>birth date – death date</date>: </titleproper><subtitle>An Inventory of Spoken Word Audio Recordings in the Vincent Voice Library, Michigan State University.</subtitle></titlestmt>

The formula creates a statement that contains five critical informational elements -- the name of the creator of the materials, his/her birth and death dates, the type of document (an inventory), the nature of the records, and the repository’s name.  This data is used as the <title> element in the HTML output for display in the browser and serves as metadata for web search engines. 

 

The Subtitle <subtitle> element is a set statement with the following syntax -- An Inventory of [characterization of the materials] from the [name of the repository].
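
For illustration only, here is the title formula filled in for an invented speaker.  The name and the dates are hypothetical and do not describe an actual VVL collection.

```xml
<titlestmt>
  <!-- Hypothetical speaker name and life dates -->
  <titleproper>William Smith, <date>1900 – 1980</date>: </titleproper>
  <subtitle>An Inventory of Spoken Word Audio Recordings in the Vincent Voice Library, Michigan State University.</subtitle>
</titlestmt>
```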

 

Publication Statement  <publicationstmt>

The name of the repository is set in the Publisher <publisher> element to indicate which institution is responsible for the creation of the finding aid.

 

Profile Description <profiledesc>

Information recorded here refers to the creation of the EAD finding aid.

The AUDIENCE attribute is set as “internal” since this information is used for tracking the history of the inventory.

 

Creation <creation>

Record the name of the individual(s) responsible for the encoding. 

Record the date encoding began in the Date <date> element.

Record the name of the individual responsible for the editing. [Ignore unless editing]

Record the date of that editing. [Ignore unless editing]

 

Language Usage  <langusage>

A statement about the language of the finding aid is set in the <langusage> tag, with the name of the language set in the Language <language> element.

The AUDIENCE attribute is set to “external”.
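
A minimal sketch of the Profile Description follows.  The sentence wording inside <creation> and <langusage> is an assumption, as is the choice of English as the language of the finding aid; the bracketed text marks values you supply.

```xml
<profiledesc audience="internal">
  <!-- Wording is illustrative; replace the bracketed placeholders -->
  <creation>Encoded by [name of encoder], <date>[date encoding began]</date>.</creation>
  <langusage audience="external">Finding aid written in <language>English</language>.</langusage>
</profiledesc>
```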

 

THIS ENDS THE <EADHEADER> SECTION

 

Collection Level Description

 

Archival Description <archdesc>

Information within the <archdesc> tags refers to the materials being described, first the collection as a whole, then the individual items.

 

The required LEVEL attribute is set as “collection”, the organizational level of the entire body of described materials.

The LANGMATERIAL attribute is set as the language(s) of the materials in the collection as appropriate.   The default value in the template is “eng” for English.

The TYPE attribute is set as “inventory”.

 

Content appears in at least the following four elements:  Descriptive Identification <did>, Biography or History <bioghist>, Scope and Content <scopecontent>, and Controlled Access <controlaccess>, as recommended in Appendix A of the EAD Application Guidelines.

 

Descriptive Identification <did>

The EAD DTD requires that the Descriptive Identification <did> element come first. 

The ID attribute is set as “a1”.

A Head element <head> is set as “Collection Summary”.

 

Origination <origination>

This tag contains information about the creator of the materials being described.

Record the name of the Main Speaker in <persname> as “Last Name, First Name”.  Note that this is the ONLY time you enter the main speaker’s name in reverse order.

The ENCODINGANALOG attribute for <persname> is set as “100$a”.

The LABEL attribute is set as “Main Speaker:”

 

Unit Identification <unitid>

Record the assigned identification number of the collection, which will serve as the MAGIC call number. [Ignore at present]

The ENCODINGANALOG attribute is set as “099”.

The LABEL attribute is set as “Identification:”

The COUNTRYCODE attribute is set as found in ISO 3166 as “US”.

The REPOSITORYCODE attribute is set as EEM, the same value found in the SYSTEMID attribute in <eadid>.

 

Unit Title <unittitle>

Record the name of the Main Speaker as “First Name Last Name” in <unittitle>.

Do not include separating punctuation between the Unit Title and the Unit Date.  The style sheets will supply a comma where appropriate.

The ENCODINGANALOG attribute is set as “245$a”.

The LABEL attribute is set as “Title:”

 

Unit Date <unitdate>

Record the date span of the materials in <unitdate> as “19XX – 19XX”.  

The TYPE attribute is set as “inclusive”.

The ENCODINGANALOG attribute is set as “245$f”.

The LABEL attribute is set as “Dates:”

 

Physical Description <physdesc>

Record the total number of audio recordings in <physdesc>.

The ENCODINGANALOG is set as “300$a”.

The LABEL attribute is set as “Quantity:”

 

Repository <repository>

The name of the repository is set in <repository> even if its display is to be suppressed.

The ENCODINGANALOG is set as “852”.

The LABEL attribute is set as “Repository:”.

We may include a pointer to the repository’s address and/or contact information at a later date.
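
Put together, the collection-level <did> might look roughly like the sketch below.  The speaker name is hypothetical, the bracketed text marks values to be supplied, and the repository wording is borrowed from the title formula.  Placing the “Main Speaker:” label on <origination> is an assumption, since the text does not say which element carries it.

```xml
<archdesc level="collection" langmaterial="eng" type="inventory">
  <did id="a1">
    <head>Collection Summary</head>
    <!-- Assumption: the LABEL attribute sits on <origination> -->
    <origination label="Main Speaker:">
      <persname encodinganalog="100$a">Smith, William</persname>
    </origination>
    <unitid encodinganalog="099" label="Identification:" countrycode="US" repositorycode="EEM">[call number, not yet assigned]</unitid>
    <!-- No punctuation between the title and the date; the style sheets supply it -->
    <unittitle encodinganalog="245$a" label="Title:">William Smith</unittitle>
    <unitdate type="inclusive" encodinganalog="245$f" label="Dates:">19XX – 19XX</unitdate>
    <physdesc encodinganalog="300$a" label="Quantity:">[number of audio recordings]</physdesc>
    <repository encodinganalog="852" label="Repository:">Vincent Voice Library, Michigan State University</repository>
  </did>
  <!-- <bioghist>, <scopecontent>, <controlaccess>, <admininfo>, and <dsc> follow -->
</archdesc>
```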

 

Biography or History <bioghist>

This element contains contextual information about the recordings in the form of a chronology of the Main Speaker’s life.

The ID attribute is set as “a2”.

Record the Main Speaker’s name (First Name Last Name) in the <head> element.

The body of the element is encoded as a chronology list <chronlist> in the form of dates <date></date> and events <event></event>. You may delete or add dates/events as appropriate. Record the significant dates and events in the Main Speaker’s life.  Take this information from a standard source, such as Who Was Who in America, or the Biographical Directory of the United States Congress http://bioguide.congress.gov/.

 

Record the birth and death years of the Main Speaker, if known.  This information is needed to make decisions concerning the copyright status of the materials.  The ENCODINGANALOG attribute is set as 100$d to indicate that this date information will go in the d subfield of the MARC 100 field.
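
The chronology can be sketched as follows.  Each date and event pair is wrapped in a <chronitem> element, which the DTD requires inside <chronlist>.  The speaker name is hypothetical and the bracketed text marks values to be supplied; the 100$d encoding analog is left out because the text above does not say which element carries it.

```xml
<bioghist id="a2">
  <head>William Smith</head>
  <chronlist>
    <!-- One <chronitem> per significant date; add or delete as needed -->
    <chronitem>
      <date>[year of birth]</date>
      <event>[birth and place of birth]</event>
    </chronitem>
    <chronitem>
      <date>[year]</date>
      <event>[significant event]</event>
    </chronitem>
  </chronlist>
</bioghist>
```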

 

Scope and Contents <scopecontent>

This element contains contextual information about the creation, arrangement, and organization of the materials.

 

The ENCODINGANALOG is “520”.

The ID attribute is set as “a3”.

A <head> element is set as “Scope and Contents”.

The body of the element contains <p> tags in which you should describe the provenance of the recordings (i.e., where the items came from) and their arrangement within the collection.

 

Controlled Access Terms <controlaccess>

The ID attribute is set as “a12”.

The <head> element is set as “Index Terms”.  It also includes a sentence (encoded as <p>) describing the purpose of index terms. (“The following terms may be useful for searching the MSU Library online catalog (MAGIC) for related sources.”)

 

Record <persname>, <corpname>, <geogname>, <subject>, <genreform>, and <title> terms in the form determined by an appropriate authority source.

These sub-elements are organized into groups by type of element with set explanatory <head> elements. Each sub-element may be repeated as needed. Delete empty sub-elements that are not needed.

 

The ENCODINGANALOG attributes are set to default values for the appropriate MARC fields to facilitate reuse of the encoded information in catalog records and the creation of Dublin Core metadata in the HTML output.

The appropriate values and authority sources are:

<persname>           (600  as a subject or 700 as a speaker; 600 is the default) (lcnaf)

<corpname>           (610 as a subject or 710 as a speaker; 610 is the default) (lcnaf)

<geogname>          (651) (lcsh)

<subject>               (650) (lcsh)

<genreform>          (655)  (lcsh or aat) 

<title>                    (630 as a subject or 730 as part of a collection)

 

You may need to change the ENCODINGANALOG attribute in accordance with the MARC field number used in the MAGIC record.  (For example, a ship’s name is a <corpname>, but its encoding analog is 611, rather than 610.)

 

The SOURCE attribute is set as “lcnaf” (Library of Congress Name Authority File) or “lcsh” (Library of Congress Subject Headings), whichever is the name of the authority file from which the heading was derived.  However, take these terms EXACTLY AS WRITTEN from the MARC record if one exists.
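
A hedged sketch of the grouped structure described above is shown here.  The group headings (“Persons”, “Subjects”, “Types of Materials”) are assumptions, since the set <head> text comes from the template, and the bracketed text stands in for headings taken from the authority source or the MARC record.

```xml
<controlaccess id="a12">
  <head>Index Terms</head>
  <p>The following terms may be useful for searching the MSU Library online catalog (MAGIC) for related sources.</p>
  <!-- One nested <controlaccess> group per type of term; group heads are hypothetical -->
  <controlaccess>
    <head>Persons</head>
    <persname encodinganalog="600" source="lcnaf">[name heading from LCNAF]</persname>
  </controlaccess>
  <controlaccess>
    <head>Subjects</head>
    <subject encodinganalog="650" source="lcsh">[subject heading from LCSH]</subject>
  </controlaccess>
  <controlaccess>
    <head>Types of Materials</head>
    <genreform encodinganalog="655" source="aat">[genre term from AAT]</genreform>
  </controlaccess>
</controlaccess>
```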

 

Administrative Information <admininfo>

The sub-elements of <admininfo> are described below.  To increase flexibility in authoring, each of these elements is nested inside a separate <admininfo> element rather than all nested within a single <admininfo> parent element.

 

Restrictions on Access <accessrestrict>

The ENCODINGANALOG is set as “506”.

The ID attribute is set as “a14”.

The <head> element is set as “Restrictions on Access”.

The body of the element is set as a text describing the generally applicable access restrictions.  Individual items with access restrictions differing from this statement will have their own <accessrestrict> elements within their item-level container tags.

 

Restrictions on Use <userestrict>

The ENCODINGANALOG is set as “540”.

The ID attribute is set as “a15”.

The <head> element is set as “Restrictions on Use”.

The body of the element is set as text describing the conditions of use of the materials and indicating from whom permission for use should be sought.
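
The collection-level restriction statements, each nested in its own <admininfo> parent as described above, might be encoded along these lines.  The bracketed text stands in for the standard statements supplied by the template.

```xml
<admininfo>
  <accessrestrict encodinganalog="506" id="a14">
    <head>Restrictions on Access</head>
    <p>[standard access statement supplied by the template]</p>
  </accessrestrict>
</admininfo>
<!-- A separate <admininfo> parent for each sub-element -->
<admininfo>
  <userestrict encodinganalog="540" id="a15">
    <head>Restrictions on Use</head>
    <p>[standard use statement supplied by the template]</p>
  </userestrict>
</admininfo>
```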

 

Linguistic Information <note>

For now, record any notes on the linguistic information about the main speaker in the <note> element following the administrative information.  In the future, we need to determine exactly what information needs to be recorded, how it is to be recorded, and what tags would best encode it.  [Ignore at present]

 

THIS ENDS THE COLLECTION-LEVEL DESCRIPTION

 

 

Item Level Description

 

Description of Subordinate Components <dsc>

The ID attribute is set as “a23”.

The TYPE attribute is set as “in-depth” since we are providing item-level description.

The <head> element is set as “List of Recordings”

 

Component <c01>

The template assumes that the Component (First Level) element <c01> represents an item.  For collections requiring both series and item level description, see Lisa.

The LEVEL attribute is set as “item”.

 

For each item, record the appropriate <did> elements:

<unitid>

<unitdate>

<abstract>

<physdesc>

 

Descriptive Identification <did>

The Descriptive Identification <did> element comes after each container <c01> tag. 

 

Unit Identification <unitid>

Record the item’s M number or, if the item has been digitized, its VVL number.

 

Unit Date <unitdate>

Record the date of the item’s original broadcast, or “n.d.” if the date is not known.

 

Abstract <abstract>

Record a brief summary of the individual recording.  Take this information from the Title and Note fields in the MARC record.

If you have notes taken during digitization, they may also be entered here.

 

Physical Description <physdesc>

Record the duration of the recording. [Ignore until digitization]

The LABEL attribute is set as “Duration:”
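
As a sketch, a single item entry in the list of recordings might look like the fragment below.  The bracketed text marks values to be supplied, and the order of the elements inside <did> follows the list given above.

```xml
<dsc id="a23" type="in-depth">
  <head>List of Recordings</head>
  <!-- One <c01> per recording -->
  <c01 level="item">
    <did>
      <unitid>[M number or VVL number]</unitid>
      <unitdate>[date of original broadcast, or n.d.]</unitdate>
      <abstract>[brief summary of the recording]</abstract>
      <physdesc label="Duration:">[duration of the recording]</physdesc>
    </did>
  </c01>
</dsc>
```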

 

Digital Archival Object <dao>

Record the location of the audio file in the <dao> element. [Ignore until digitization]

 

Acoustic Information <note>

For now, record specific acoustic information about the recording in the <note> element following the <dao>.  In the future, we need to determine exactly what information needs to be recorded, how it is to be recorded, and what tags would best encode it.  [Ignore until digitization]

 

Administrative Information <admininfo>

Record administrative information in the following sub-elements, where appropriate.  Delete these tags when not needed.

 

Restrictions on Access <accessrestrict>

The <head> element is set as “Copyright Information”.

Record in the body of the element <p> text describing the item’s copyright status, using the VVL Copyright Guidelines.

If the item’s access restriction differs from the collection-level access restrictions statement, record that exception.

 

Alternative Form Available <altformavail>

The body of the element is set as “Transcript Available”.

The element is followed by a Digital Archival Object <dao> element containing a pointer to the file location of the transcript.

 

Acquisition Information <acqinfo>

Record the name of the donor, if applicable.
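
The item-level administrative elements might be combined roughly as follows.  Grouping them under a single <admininfo>, placing the transcript <dao> directly after it inside <c01>, and using the href attribute for the pointer are all assumptions that should be checked against the template and the DTD.  The bracketed text marks values to be supplied.

```xml
<c01 level="item">
  <did>
    <!-- Other item-level <did> elements omitted for brevity -->
    <unitid>[M number or VVL number]</unitid>
  </did>
  <admininfo>
    <accessrestrict>
      <head>Copyright Information</head>
      <p>[copyright status, per the VVL Copyright Guidelines]</p>
    </accessrestrict>
    <altformavail>
      <p>Transcript Available</p>
    </altformavail>
    <acqinfo>
      <p>[name of donor]</p>
    </acqinfo>
  </admininfo>
  <!-- Assumption: pointer to the transcript file follows the administrative information -->
  <dao href="[location of transcript file]"/>
</c01>
```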

 


Using the XMetaL Authoring Tool

 

XMetaL, a software application for creating EAD-encoded finding aids, is available from SoftQuad Ltd., 161 Eglinton Avenue East, Suite 400, Toronto, ON, Canada M4P 1J5.  It runs on various Windows platforms.  The following tools were created to be compatible with version 1.2 of the software.

 

While XMetaL may create either SGML or XML output, we will use it configured to produce XML files.

 

Installation

The default installation of XMetaL places the application in an XMetaL sub-directory of the Program Files folder on the C: drive.   Review the key features of the software, especially the “Tags On” and “Plain Text” displays, and the use of the “Element List” and “Attribute Inspector” features for markup.

 

The installer program also will suggest that you add version 5.0 of Microsoft Internet Explorer browser.  Ignore this prompt since IE 5.0 is already installed on our computers.

 

When you are asked during the installation process or upon first opening the application if you wish to preserve whitespace, select the option that indicates you do NOT wish to do so.

 

Install the following helper files.

 

Rules

XMetaL uses a binary version of the EAD DTD called a “rules” file for enforcing compliance with the DTD during document creation and final validation.   This file is called

                                    ead.rlx

 

Install this file in the Rules sub-directory of the XMetaL folder.

 

Template

Templates provide a standardized file that can be reused over and over to provide the default markup and “boilerplate” text that you wish to include in every document.  Our template is named

VVL Template.xml.

 

Install this file in the General sub-directory of the Templates directory of the folder in which you have installed XMetaL. 

 

To Create a New Document

Open XMetaL, choose File on the menu bar, and then select New from the pull-down menu.  You will be presented with a list of templates including these files.  When you select the VVL Template, the default tags, attributes and boilerplate text will appear for you to fill in and add to.  Note that XMetaL includes a special feature wherein grayed-out text appears within an element as a prompt to the data entry operator; the prompt disappears when something is entered into that element.

 

To Save a Document

When you finish entering information into the template, name and save the document.  The name of the document should be the last name and first name of the main speaker (e.g., smithwilliam.xml).

 

At the end of each day, please save all your work on a floppy disk as a backup.

 

 

Created by: Lisa Robinson

Last updated: October 1, 2000