A Guide for Archiving Web Pages


Dublin Core Meta Tags

The Dublin Core is a fifteen-element metadata format that provides descriptive, technical, administrative, structural, and preservation information for archival management of web pages and also permits users to identify, locate, and retrieve archived pages. When used consistently and thoroughly, it provides significant benefits to web sites whose pages are to be archived. Like a catalog record, it serves administrative as well as retrieval functions: it can be an effective tool for organizing and maintaining archived sets of pages as well as for serving users' needs to identify, locate, and retrieve them.

Two key objectives of the Dublin Core Initiative are:
Simplicity of creation and maintenance
The Dublin Core element set has been kept as small and simple as possible to allow a non-specialist to create simple descriptive records for information resources easily and inexpensively, while providing for effective retrieval of those resources in the networked environment.
Commonly understood semantics
Discovery of information across the vast commons of the Internet is hindered by differences in terminology and descriptive practices from one field of knowledge to the next. The Dublin Core can help the "digital tourist" -- a non-specialist searcher -- find his or her way by supporting a common set of elements, the semantics of which are universally understood and supported. For example, scientists concerned with locating articles by a particular author, and art scholars interested in works by a particular artist, can agree on the importance of a "creator" element. Such convergence on a common, if slightly more generic, element set increases the visibility and accessibility of all resources, both within a given discipline and beyond.
{source: Dublin Core User Guide}
The following table names and gives a brief description of each element.
Element Element description

Creator

Person or organisation primarily responsible for creating the intellectual content of the resource, e.g. authors in the case of written documents, artists, photographers, etc. in the case of visual resources.

Publisher

The entity (e.g. agency including unit/branch/section) responsible for making the resource available in its present form, such as a publishing house, a university department, or a corporate entity.

Contributor

Person or organisation not specified in a Creator element who has made significant intellectual contributions to the resource but whose contribution is secondary to any person or organisation specified in a Creator element, e.g. editor, transcriber, illustrator.

Rights Management

A rights management statement, or an identifier that links to a rights management statement.

Title

The name given to the resource, usually by the creator or publisher. It can be the same as the resource's own title, or it may be more descriptive.

Subject

The topic of the resource, typically expressed as keywords or phrases that describe the subject or content of the resource. Controlled vocabularies and formal classification schemes are encouraged.

Date

A date associated with the creation or availability of the resource.

Identifier

A string or number used to uniquely identify the resource. Examples for networked resources include URLs, PURLs, and URNs. ISBNs or other formal names can also be used.

Description

A textual description of the content of the resource, including abstracts in the case of document-like objects or content descriptions in the case of visual resources.

Source

The work, either print or electronic, from which this object is derived, if applicable. Source is not applicable if the present resource is in its original form.

Language

The language of the intellectual content of the resource.

Relation

Relationship to other resources, e.g. images in a document, chapters in a book, items in a collection.

Coverage

Spatial locations and temporal duration characteristic of the resource.

Type

The category of the resource, such as home page, novel, poem, working paper, technical report, essay, dictionary.

Format

The data format of the resource, used to identify the software and possibly hardware that might be needed to display or operate the resource, e.g. PostScript, HTML, text, JPEG, XML.
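
To make the table concrete, the following Python sketch shows one way a record using these fifteen elements might be turned into HTML meta tags for a page. It rests on assumptions not stated in this guide: the "DC." name prefix and the schema link line follow a widely used convention for embedding Dublin Core in HTML, the function name dublin_core_meta_tags is hypothetical, and every value in the example record is invented for illustration.

```python
from html import escape

# The fifteen elements from the table above, mapped to the lowercase names
# commonly used in meta tags (assumption: the "DC." prefix convention).
DC_ELEMENTS = {
    "Creator": "creator",
    "Publisher": "publisher",
    "Contributor": "contributor",
    "Rights Management": "rights",
    "Title": "title",
    "Subject": "subject",
    "Date": "date",
    "Identifier": "identifier",
    "Description": "description",
    "Source": "source",
    "Language": "language",
    "Relation": "relation",
    "Coverage": "coverage",
    "Type": "type",
    "Format": "format",
}


def dublin_core_meta_tags(record):
    """Return HTML meta tag lines for the non-empty element values in record."""
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in record.items():
        if element not in DC_ELEMENTS:
            raise KeyError(f"Not a Dublin Core element: {element}")
        # An element may repeat, so accept one string or a list of strings.
        values = [value] if isinstance(value, str) else list(value)
        for item in values:
            if item:
                name = DC_ELEMENTS[element]
                content = escape(item, quote=True)  # keep quotes from breaking the attribute
                lines.append(f'<meta name="DC.{name}" content="{content}">')
    return "\n".join(lines)


# Hypothetical record for an archived page; all values are made up.
example_record = {
    "Title": "Dublin Core Meta Tags",
    "Creator": "Example Archives Unit",
    "Subject": ["metadata", "web archiving"],
    "Date": "2004-01-15",
    "Language": "en",
    "Format": "text/html",
}

print(dublin_core_meta_tags(example_record))
```

The printed lines would be pasted into the head section of the page. Elements that do not apply to a page (Source, for instance, when the page is in its original form) are simply left out of the record.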

The National Library of Australia's Guidelines for the Creation of Content for Resource Discovery Metadata explains where to find the information to enter for each Dublin Core element. For example, it gives this guidance for the Description element:

  1. This element provides a free-text summary that describes the resource. It is the least precise method of searching, but it can be useful for picking up terms not included in the SUBJECT element. It is also often used in the display of a search results list, helping the user to identify whether or not the resource appears relevant.
  2. The DESCRIPTION element should provide objective information about the resource, not an evaluation or review.
  3. Do not simply repeat the title in this element.
  4. Use an abstract or other structured description included within the resource if available (but do not use quotation marks).
  5. Otherwise, provide a brief outline of the content of the resource. It should supply enough information for a user to decide if the item is relevant.
  6. Use the description to highlight any significant aspects of the resource.
  7. Details of the format and date of the original version may also be included.
  8. As some resource discovery systems display only a limited number of lines from the DESCRIPTION element, describe the most significant aspects of the resource in the beginning of the tag. A description of less than 255 characters is recommended to cater for as many systems as possible. Additional text may be useful for retrieval, but may not display.
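
Some of the points in this guidance can be checked mechanically. The Python sketch below is a rough illustration rather than part of the guidelines: the function name check_description is hypothetical, and the quotation-mark test is only an approximation of point 4, since it looks solely for a description wrapped in quotes. Judgments such as objectivity or relevance still need a human reader.

```python
QUOTE_CHARS = {'"', "'"}


def check_description(description, title, max_length=255):
    """Return a list of guideline problems; an empty list means none were found."""
    problems = []
    text = description.strip()
    if not text:
        problems.append("description is empty")
    # Point 3: do not simply repeat the title.
    if text.lower() == title.strip().lower():
        problems.append("description only repeats the title")
    # Point 8: fewer than 255 characters is recommended.
    if len(text) >= max_length:
        problems.append(
            f"description is {len(text)} characters; "
            f"fewer than {max_length} is recommended"
        )
    # Point 4 (approximation): an abstract should not be wrapped in quotation marks.
    if text[:1] in QUOTE_CHARS and text[-1:] in QUOTE_CHARS:
        problems.append("description is wrapped in quotation marks")
    return problems


# Hypothetical description and title, made up for illustration.
for problem in check_description(
    '"Explains how to describe archived web pages with Dublin Core."',
    "Dublin Core Meta Tags",
):
    print(problem)
```

Returning a list of messages instead of a single pass or fail value lets a cataloger see every problem with a description at once.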

Some resources on the Dublin Core meta tags
