APPENDIX E METADATA
Defined simply, ‘metadata’ is ‘data’ about ‘data’. In essence, it is information that describes the characteristics or attributes of an information item or resource. The concept pre-dates electronic information systems. For example, libraries used ‘index cards’ to provide information about books. Information about a book would be recorded on a card and stored alphabetically by author in a filing cabinet. Most libraries used a double filing system, with the index card duplicated and stored alphabetically by title.
In the example shown below, the index card records, amongst other things, the author of the book, its title, its publisher and the year of publication. This is ‘metadata’: it is information about an information item, in this case a book.
The principle applied to the library ‘index card’ to record ‘metadata’ applies in exactly the same way to a piece of information stored in an electronic system. An item of information stored on an electronic system, be it a page on a website or a Word document on a personal computer, will have ‘metadata’ associated with it. This metadata will normally be placed in pre-defined metadata fields, which are effectively containers for specific types of information about the stored item.
Metadata can describe the ‘physical’ attributes of the stored information, such as the size of the file, its format (e.g. Microsoft® Word, Microsoft® Excel, Adobe® Acrobat) and when it was created. In most electronic information systems this type of metadata is added automatically when a file is created, modified or loaded within the system. Metadata can also describe the content of a piece of information in terms of its subject matter (e.g. title, keywords) and so assist in retrieval. Finally, metadata can be used to manage a piece of information (e.g. security access, copyright, retention).
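The automatically recorded ‘physical’ attributes described above can usually be read back from the file system. The following is a minimal Python sketch, not a description of any particular records system: the function name `physical_metadata` and the field names in the returned dictionary are illustrative assumptions, and the format is simply guessed from the file extension.

```python
from datetime import datetime, timezone
from pathlib import Path


def physical_metadata(path):
    """Return a few 'physical' attributes that the file system records automatically.

    The dictionary keys are hypothetical field names chosen for illustration.
    """
    p = Path(path)
    info = p.stat()  # size and timestamps are maintained by the operating system
    return {
        "name": p.name,
        # Assumption: the extension is treated as the format; it is only a hint.
        "format": p.suffix.lstrip(".").lower() or "unknown",
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
    }
```

None of these values is typed in by the user; they are by-products of saving the file. Descriptive metadata such as keywords, by contrast, normally has to be supplied by a person or derived from the content.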
In the example shown below, a summary of the metadata associated with an Adobe® Acrobat PDF document is displayed.
The values that are applied as metadata will vary depending upon the nature of the metadata field that is being populated. Ideally, the values that can be used to populate metadata fields will be controlled, to improve the consistency of the tagging associated with an item of information. For example, dates will always be structured in the same fashion (e.g. DD-MM-YYYY), and the abbreviations used to represent the language a document is written in will be drawn from a pre-defined list or standard (e.g. ISO 639-3: ‘eng’ for English, ‘deu’ for German).
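Controlled values of this kind lend themselves to automatic checking. The sketch below, in Python, assumes a metadata record held as a dictionary with hypothetical `date` and `language` fields; the language list is a two-entry illustrative subset of ISO 639-3, not the full standard.

```python
import re
from datetime import datetime

# Illustrative subset only; a real system would load the full ISO 639-3 list.
ALLOWED_LANGUAGES = {"eng": "English", "deu": "German"}


def is_valid_date(value):
    """Accept only real calendar dates written strictly as DD-MM-YYYY."""
    if not isinstance(value, str) or not re.fullmatch(r"\d{2}-\d{2}-\d{4}", value):
        return False
    try:
        datetime.strptime(value, "%d-%m-%Y")  # rejects dates such as 31-02-2020
    except ValueError:
        return False
    return True


def validate_record(record):
    """Return a list of problems found in a metadata record (empty if none)."""
    problems = []
    if not is_valid_date(record.get("date")):
        problems.append("date must be a valid DD-MM-YYYY value")
    if record.get("language") not in ALLOWED_LANGUAGES:
        problems.append("language must be one of the permitted ISO 639-3 codes")
    return problems
```

A record such as `{"date": "2021-03-05", "language": "en"}` would be reported as having two problems: the date is in the wrong order and ‘en’ is not one of the permitted three-letter codes.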
Not all metadata fields can be managed so easily, particularly where the information is broad in scope. For example, the title of a document, which is natural language metadata, may not fully convey its subject matter. Applying metadata selected from a ‘controlled vocabulary’ or ‘taxonomy’ can compensate for this. Returning to the example of a book, as can be seen from the image above, the title may not describe the content of the book, but assigning genre ‘categories’ as ‘metadata’ tells a potential Amazon customer what kind of book it is.
Because books are categorised by genre, the customer can easily narrow down a search to a particular type of content. Furthermore, in Amazon for example, a book can appear in several different places (e.g. Science Fiction and Fantasy, Horror, Crime, Thrillers and Mystery). With electronic access and management systems, pertinent and accurate metadata about the books is of paramount importance in ensuring that the customer can locate any book they are seeking (and related types of books) within a few clicks of the mouse.
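The effect of tagging from a controlled vocabulary can be sketched in a few lines of Python. This is an illustration only and does not represent how Amazon or any other retailer implements its catalogue: the names `GENRES`, `Book`, `tag` and `browse` are assumptions, as is the reading of ‘Crime, Thrillers and Mystery’ as a single category.

```python
from dataclasses import dataclass

# The controlled vocabulary: only these category labels may be used as tags.
GENRES = frozenset({
    "Science Fiction and Fantasy",
    "Horror",
    "Crime, Thrillers and Mystery",
})


@dataclass(frozen=True)
class Book:
    title: str
    genres: frozenset  # a book may carry several categories at once


def tag(title, genres):
    """Create a Book, rejecting any genre that is not in the controlled vocabulary."""
    unknown = set(genres) - GENRES
    if unknown:
        raise ValueError(f"not in controlled vocabulary: {sorted(unknown)}")
    return Book(title, frozenset(genres))


def browse(catalogue, genre):
    """Return the titles of all books tagged with the given genre."""
    return [book.title for book in catalogue if genre in book.genres]
```

A book tagged with both ‘Horror’ and ‘Science Fiction and Fantasy’ is returned when browsing either category, which is how a single item can appear in several places. A free-text tag such as ‘Spooky’ is rejected, so inconsistent labels cannot fragment the catalogue.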

