Simone Pringle, Enterprise Architect, Information Technology Division
Metadata is frequently defined as "data about data". Though at first it may seem like an almost circular definition, it accurately captures the fact that data does not exist in isolation, but rather is part of an information exchange “context”. Information typically captured in metadata includes a description of data content and structure, as well as the activities and processes for which the data is used. A better understanding of the metadata can help all participants maximize the benefits derived from the direct use of the data.
We rely on metadata in our daily lives without even thinking about it! Take, for instance, driver's licenses. The business context for a driver's license is typically to provide proof that an individual has been authorized by an official entity to operate certain types of motor vehicles, subject to possible constraints such as the use of prescription eyeglasses. There are expectations for the information that a driver's license should contain, such as a recent photo of the driver, the driver's name, date of birth, validity period for the license, name of the issuing entity, etc. This combination of the expected data attributes and the purpose of use is the metadata for the driver's license. The standardization of driver's license metadata facilitates all processes that use the license as a means to assert one's legal right to drive.
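As a rough illustration of the driver's license example, the expected attributes and the purpose of use can be written down as a small data structure and checked against a record. This is only a sketch: the attribute names, the `LICENSE_METADATA` dictionary, and the `conforms` helper are hypothetical and do not come from any licensing standard.

```python
from datetime import date

# Hypothetical metadata for a driver's license: the purpose of use plus
# the expected attributes and their types. All names are illustrative.
LICENSE_METADATA = {
    "purpose": "proof of authorization to operate certain motor vehicles",
    "attributes": {
        "name": str,
        "date_of_birth": date,
        "valid_from": date,
        "valid_until": date,
        "issuing_entity": str,
    },
}


def conforms(record, metadata):
    """Return True if the record has every expected attribute with the expected type."""
    expected = metadata["attributes"]
    return all(
        key in record and isinstance(record[key], expected_type)
        for key, expected_type in expected.items()
    )


# A made-up record used only to exercise the check.
license_record = {
    "name": "A. Driver",
    "date_of_birth": date(1980, 1, 1),
    "valid_from": date(2020, 1, 1),
    "valid_until": date(2025, 1, 1),
    "issuing_entity": "Example Motor Vehicle Agency",
}
```

Any process that accepts a license could run a check like `conforms(license_record, LICENSE_METADATA)` before relying on the record, which is the practical benefit of agreeing on the metadata up front.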
Metadata has long been used in the context of data warehousing and business intelligence (BI) to describe the structure of the data, also known as the schema, and to facilitate data correlation and report generation. More recently, internet-based resources, such as web services and human-readable web pages, have started relying on metadata as an effective means of improving the quality of search results, as well as content management.
Since metadata can be effectively used to describe the content and context of digital information, it makes it possible to create self-describing systems. Consider, for instance, that certain types of data require higher levels of security protection. Documents can be tagged with metadata indicating that access is restricted and that certain security policies apply to the distribution, storage and disposal of these documents. Because the policy is carried in the metadata, the security infrastructure can enforce it automatically without reading the data itself, even when the data is protected (for example, encrypted).
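The security tagging idea can be sketched as follows. The `TaggedDocument` class, the classification labels, and the `POLICIES` table are assumptions made for illustration, not part of any particular security product. The point of the sketch is that the policy decision reads only the metadata and never touches the (possibly encrypted) payload.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaggedDocument:
    payload: bytes  # opaque content, possibly encrypted; never inspected here
    metadata: dict = field(default_factory=dict)


# Hypothetical policy table: classification level -> permitted actions.
POLICIES = {
    "public": {"distribute", "store", "dispose"},
    "restricted": {"store"},
}


def is_allowed(doc, action):
    """Decide from the metadata alone whether an action is permitted."""
    # Untagged documents are treated as restricted (a conservative assumption).
    level = doc.metadata.get("classification", "restricted")
    # Unknown classification levels permit nothing.
    return action in POLICIES.get(level, set())
```

Treating untagged or unrecognized documents as restricted is a design choice in this sketch; a real deployment would define its own default.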
Common applications of metadata include:
- Inclusion of keywords in web pages, improving search engine results;
- Definitions of web services using open standards such as Universal Description, Discovery and Integration (UDDI) and the Web Services Description Language (WSDL), facilitating service discovery and reuse;
- Security level tagging, to enforce and manage security policies for the entire lifecycle of the data, including access, storage and disposal;
- Version tagging, to allow for applications and information content to evolve in support of improved business processes, while continuing to support existing usage;
- Mapping of data elements across information models, enabling system integration.
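The last item in the list, mapping data elements across information models, can be illustrated with a minimal sketch. The field names and the `FIELD_MAP` table are hypothetical; the mapping itself is the metadata, and the translation code stays generic.

```python
# Hypothetical mapping from one system's element names to another's.
FIELD_MAP = {
    "surname": "last_name",
    "dob": "date_of_birth",
}


def translate(record, field_map):
    """Rename the keys of a record according to the mapping.

    Keys without an entry in the mapping are passed through unchanged.
    """
    return {field_map.get(key, key): value for key, value in record.items()}


source_record = {"surname": "Pringle", "dob": "1970-01-01", "city": "Springfield"}
target_record = translate(source_record, FIELD_MAP)
```

Because the mapping is data rather than code, integrating a further system under these assumptions would mean supplying another mapping table, not rewriting `translate`.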
There are a number of standardization efforts underway to support broad adoption of metadata-aware business models. In particular, individuals interested in gaining a better understanding of metadata and current harmonization efforts should consider looking into the Dublin Core Metadata Initiative (DCMI) - http://dublincore.org/ .
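To give a flavor of what Dublin Core looks like in practice, the sketch below renders a small record as HTML `meta` tags, which also ties back to the use of keywords in web pages. The fifteen element names are those of the Dublin Core element set. The `to_meta_tags` helper is hypothetical, and the `DC.` name prefix follows a commonly used convention that should be checked against current DCMI guidance.

```python
from html import escape

# The fifteen elements of the Dublin Core element set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def to_meta_tags(record):
    """Render a dict of Dublin Core elements as HTML meta tags, one per line."""
    tags = []
    for element, value in record.items():
        if element not in DUBLIN_CORE_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        # Escape the value so quotes and angle brackets cannot break the tag.
        content = escape(value, quote=True)
        tags.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(tags)


# Illustrative record describing an article such as this one.
article = {"title": "Metadata", "creator": "Simone Pringle", "language": "en"}
```

Calling `to_meta_tags(article)` yields one tag per element, which could be placed in the `head` of a web page.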
The use of metadata provides a powerful tool to describe, catalog, find, and manage information. It is not surprising that information management experts rely more and more on metadata-enabled tools and processes to do their work. In the words of a colleague, "I never met a data I didn't like"…