Image Description Practices for Digital Archives Projects

Margot Note

July 06, 2018

Formal standards, such as Describing Archives: A Content Standard (DACS), Graphic Materials, and Rules for Archival Description (RAD), have been developed over time for the description of archival materials. While descriptive standards offer consistency, archival repositories employ descriptive systems suited to their holdings, not universal access, and description continues to be idiosyncratic.

For example, ISAD(G): General International Standard Archival Description defines multilevel principles, such as moving from the broad to the specific and linking hierarchical levels, as well as basic descriptive elements. These elements include creator names, titles, dates, administrative history, scope and content, and locations of originals and copies.

Given the absence of universal standards and the subjective nature of description, institutions have many options for describing their holdings to make them accessible.

MARC Records

With the advent of computers, some institutions used MARC records to provide subject indexing for large collections through individual collection-level records. The MARC records point users to a finding aid for a particular collection to obtain more detailed information. Since the finding aids were generally paper-based, and often only available locally at the institution, users would have to view them in person.

Enter the EAD

Item-level MARC cataloging of images, while in some cases desirable, was often neither warranted nor economically feasible. The hierarchical format and electronic access capabilities of the Encoded Archival Description (EAD) finding aid, however, offer the possibility of a more powerful, flexible alternative.

EAD was developed as a way of marking the data contained in finding aids so that they can be searched and displayed online. EAD promised a more sophisticated way not only to produce searchable text but to eventually provide descriptions in an environment that would facilitate sophisticated cross-collection searching.

EADs index image collections by providing access points at the collection or item level, depending on the needs of the institution, collection, and users. As the tools for accessing finding aids become more sophisticated, EADs’ content-specific indexing capabilities make them a powerful resource for standardized, integrated access to primary source collections.
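To make the hierarchy concrete, the sketch below parses a heavily simplified finding aid and lists each described level with its title. The element names (archdesc, did, unittitle, dsc, c01, c02) follow EAD 2002 conventions, but the fragment is not a schema-valid EAD file, and the collection, the titles, and the function name walk_levels are invented for illustration.

```python
import xml.etree.ElementTree as ET

# A simplified, hypothetical finding aid fragment (not a schema-valid EAD file).
EAD_SAMPLE = """
<ead>
  <archdesc level="collection">
    <did><unittitle>Example Family Photographs</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Portraits</unittitle></did>
        <c02 level="item">
          <did><unittitle>Studio portrait, woman seated</unittitle></did>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def walk_levels(element, depth=0):
    """Yield (depth, level, title) for every element that carries a level attribute."""
    level = element.get("level")
    if level is not None:
        title = element.findtext("did/unittitle", default="")
        yield depth, level, title
        depth += 1  # children of a described component sit one level deeper
    for child in element:
        yield from walk_levels(child, depth)


root = ET.fromstring(EAD_SAMPLE)
levels = list(walk_levels(root))
for depth, level, title in levels:
    print("  " * depth + f"{level}: {title}")
```

The same walk works whether an institution stops at the collection level or describes every item, which is the flexibility described above.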

Dublin Core

The Dublin Core Metadata Element Set arose from discussions at a 1995 workshop sponsored by Online Computer Library Center (OCLC) and the National Center for Supercomputing Applications (NCSA). As the workshop was held in Dublin, Ohio, the element set was named the Dublin Core. The continuing development of the Dublin Core and related specifications is managed by the Dublin Core Metadata Initiative (DCMI).

Dublin Core is designed for ease of use and is less expensive to implement than more complex metadata schemes. The core set of 15 data elements describes and facilitates discovery of images, capturing information regarding the title, identifier, creator, contributor, publisher, language, description, subject, coverage, date, type, relation, format, source, and rights of a digital image. None of the elements are mandatory, and all can be repeated and expanded if needed.
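As a rough illustration of how optional, repeatable elements look in practice, the sketch below builds a small Dublin Core description for a single image. The namespace URI is the standard one for the 15-element set; the record wrapper element, the function name build_record, and the sample photograph are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The 15 elements of the core set, as listed above.
DC_ELEMENTS = (
    "title", "identifier", "creator", "contributor", "publisher",
    "language", "description", "subject", "coverage", "date",
    "type", "relation", "format", "source", "rights",
)


def build_record(fields):
    """Build a record element; each value is a list because elements may repeat."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        for value in values:
            ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record


# Hypothetical image: only some elements are supplied, and subject repeats.
photo = {
    "title": ["Main Street, looking north"],
    "identifier": ["photo-0001"],
    "subject": ["Streets", "Storefronts"],
    "type": ["StillImage"],
    "format": ["image/tiff"],
}

record = build_record(photo)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Leaving out creator or date raises no error, and subject appears twice, which mirrors the "none mandatory, all repeatable" rule.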

While Dublin Core metadata serves as a functional framework for exposing metadata, the standard is open to interpretation. Use of the elements may vary among institutions. Its simple nature is not suited to capturing descriptions with a high level of granularity, but it focuses on interoperability and international consensus.

A Most Versatile Format

Dublin Core is especially attractive to cultural heritage institutions because of the number of commercial systems that have adopted and supported it. Dublin Core is also compliant with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), which allows for repository interoperability by enabling institutions to export their records in Dublin Core for inclusion in search services based on metadata systems of varying types. The development of Dublin Core benefited from feedback drawn from an international community of archives, libraries, museums, government agencies, and corporations.
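For readers curious what harvesting involves, the sketch below assembles an OAI-PMH request that asks a repository for its records in Dublin Core. The verb ListRecords and the metadata prefix oai_dc come from the protocol itself; the repository address, the set name, and the function name list_records_url are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode


def list_records_url(base_url, set_spec=None):
    """Return an OAI-PMH ListRecords request URL asking for Dublin Core records."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if set_spec:
        params["set"] = set_spec  # optional: limit the harvest to one set
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint and set name.
url = list_records_url("https://archive.example.org/oai", set_spec="photographs")
print(url)
```

A search service would issue a request like this against each participating repository and merge the Dublin Core records it receives.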

In my consulting practice, I have found Dublin Core to be the most flexible, accessible, and adaptable format for image description. It is usable by both non-experts and description specialists. It is also extensible for richer descriptions. Dublin Core allows just enough elucidation to be helpful to users, without being too laborious for image catalogers.

Keywords

Current practices for structuring image collections include lists, indexes, directories, catalogs, thesauri, taxonomies, ontologies, typologies, metadata, templates, and topic maps. Retrieval systems and digital asset management (DAM) software are based on language, particularly keywords, because words are extractable from documents.

For images, there is no language to extract, only language to apply. Keywords provide content-based access points to images because they label the objects being photographed. To a lesser extent, they can also be concept-based, detailing an image’s features, attributes, and characteristics.

Rather than designing more effective language-based algorithms, retrieval system designers should reinterpret keyword searches based on information-seeking behavior, cognition, and memory. Newer approaches like tagging and algorithmic or heuristic browsing provide more search versatility. Browsing based on both content and concept, and on images alone, remains on the edge of discovery.

Semantic Keywords

Online collections offer structuring ingenuity because digital images can belong to multiple categories simultaneously, whereas physical images cannot. Collections have evolved from mutually exclusive categories, often arranged in hierarchies, to digital images with any number of labels, allowing users to focus on inter-relationships and cognition. With online collections, folksonomy, or social tagging, allows viewers to apply semantic keywords to images, which could cultivate deeper associations between the multiple meanings of the images.
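The difference between one folder per photograph and many labels per image can be sketched in a few lines. The example below is only an illustration: the image identifiers, the tags, and the function names build_tag_index and find_images are invented, and a real DAM system would store this in a database rather than in memory.

```python
from collections import defaultdict

# Hypothetical image identifiers and viewer-supplied tags.
image_tags = {
    "img-001": {"bridge", "fog", "morning"},
    "img-002": {"bridge", "construction"},
    "img-003": {"fog", "harbor"},
}


def build_tag_index(tags_by_image):
    """Map each tag to the set of images that carry it."""
    index = defaultdict(set)
    for image_id, tags in tags_by_image.items():
        for tag in tags:
            index[tag].add(image_id)
    return index


def find_images(index, *tags):
    """Return the images carrying every requested tag."""
    matches = [index.get(tag, set()) for tag in tags]
    return set.intersection(*matches) if matches else set()


index = build_tag_index(image_tags)
print(sorted(index["bridge"]))
print(sorted(find_images(index, "bridge", "fog")))
```

Here img-001 sits under three labels at once, something a physical print filed in a single folder cannot do.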

Contextual Information

Thorough, informative description is a key to improving the representation of historical images. The better the cataloging, the richer the contextualizing information that surrounds the photographs—and the better able users are to appreciate them in their historical context. Digital images require sufficient descriptive data to render them available, understandable, and usable for as long as they have continuing value. The types of information needed to describe digital images will differ from, and may exceed, that needed to describe analog images, but the basic purpose of description remains the same.

Pre-Project Indexing

For digital projects, it is usually assumed that indexing is already completed to an adequate level before digitization, but this has rarely been the case in my experience. Description is often just being applied or considerably improved upon as part of the project. Much of the data required for image records appears as annotations on the original images. Therefore, no matter what the form of the access records, information usually is assembled from various sources. Since description is not the main outcome of a digital project, it is often done in a perfunctory way. Inadequate description does a disservice to the amount of labor and resources that a digital project requires. The complexity of description, along with the work that goes into its production, is often underappreciated. Access records for digital surrogates involve far more than simple digital conversion.

Seeing as Indexing

Digital collections with thumbnail presentation of images and metadata provide access to hidden collections. Systems like this depend on the most sophisticated classification system by far: the human eye. The electronic era holds out the promise of richer descriptive systems that are incorporated into the design of automated applications and implemented as records are created. Until then, our eyes, and how visually literate we are, will assist us in accessing images for researching, learning, and teaching.