11.4 Metadata - providing information on the properties of statistical data


Statistical metadata is commonly defined as data about data and is critical to ensuring the quality, interpretability and usefulness of datasets.

The first and most fundamental purpose of metadata is to help users interpret, understand, and analyse statistical data.

It is vital that an NSO provides sufficient metadata to ensure quality and add value to statistics. Metadata describes the background, purpose, content, collection, processing, and quality of a statistical dataset, along with any related information that a user needs to find, understand and manipulate the data. Simply put, the metadata for a statistical dataset increases the number of people who can successfully use a data source once it is released.

11.4.1 Types of metadata

The two main categories of metadata are reference metadata and structural metadata. Both are described below, following the Eurostat ‘Statistics Explained’ glossary:

  • Structural metadata

    Structural metadata is used to identify, formally describe or retrieve statistical data, such as dimension names, variable names, dictionaries, dataset technical descriptions, dataset locations, keywords for finding data, etc. For example, structural metadata refers to the titles of the variables and dimensions of statistical datasets, as well as the units employed, code lists (e.g., for territorial coding), data formats, potential value ranges, time dimensions, value ranges of flags, classifications used, etc.

  • Reference metadata

    Reference metadata (sometimes called explanatory metadata) describes the contents and the quality of statistical data from a semantic point of view. It includes explanatory texts on the context of the statistical data, methodologies for data collection and data aggregation, as well as quality and dissemination characteristics.

    Metadata may appear alongside the data in the form of graph labels and footnotes or may be compiled as explanatory notes that contain information such as a definition and description of the population, the source of the data, and the methodology used. Metadata can include documentation of definitions, relationships among variables, specifications, procedures, classification schemes, and instructions. It can be used to assist search and navigation on a website and to support post-processing, such as downloading data and statistical tools for analysis. In a dissemination system, metadata can be present at every level of a data collection: footnotes at the individual record level, information at the dimension level, and descriptions at the dataset level.
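The distinction between the two categories, and the different levels at which metadata can be attached, can be illustrated with a minimal Python sketch. It is not drawn from any particular standard or dissemination system: the class names (`Dimension`, `Dataset`), the field names, the region codes and all figures are invented for illustration only. Structural metadata (dimension names, code lists, units) is what allows a raw record to be decoded, while reference metadata carries the explanatory text.

```python
from dataclasses import dataclass, field


@dataclass
class Dimension:
    """Structural metadata for one dimension (hypothetical layout)."""
    name: str                  # machine-readable dimension name
    label: str                 # title shown to users
    code_list: dict[str, str]  # code -> human-readable label
    note: str = ""             # dimension-level footnote


@dataclass
class Dataset:
    """A dataset with its structural and reference metadata."""
    title: str
    unit: str                            # structural: unit of measure
    dimensions: list[Dimension]          # structural: dimensions and code lists
    reference_metadata: dict[str, str]   # explanatory texts (source, methodology)
    observations: list[dict] = field(default_factory=list)

    def describe(self, obs: dict) -> str:
        """Decode one raw record into a readable statement."""
        parts = [
            f"{dim.label}: {dim.code_list[obs[dim.name]]}"
            for dim in self.dimensions
        ]
        text = ", ".join(parts) + f" = {obs['value']} {self.unit}"
        # Record-level metadata: an optional footnote on a single observation.
        if obs.get("footnote"):
            text += f" [{obs['footnote']}]"
        return text


# All values below are invented and carry no statistical meaning.
dataset = Dataset(
    title="Illustrative population counts",
    unit="persons",
    dimensions=[
        Dimension("region", "Region", {"R1": "Region One", "R2": "Region Two"}),
        Dimension("year", "Year", {"2020": "2020"}),
    ],
    reference_metadata={
        "source": "Hypothetical household survey",
        "methodology": "Invented example; no real methodology applies",
    },
    observations=[
        {"region": "R1", "year": "2020", "value": 1200, "footnote": "provisional"},
        {"region": "R2", "year": "2020", "value": 800},
    ],
)

for obs in dataset.observations:
    print(dataset.describe(obs))
```

Without the code lists and the unit, the first record is only `R1, 2020, 1200`; with them it becomes a statement a user can interpret. The reference metadata then tells that user where the figure came from and how far it can be trusted.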

Standards and guidelines for statistical metadata have been developed and are already applied in practice by national and international statistical organizations.

Links to guidelines, best practices and examples:
  • Data Documentation Initiative (DDI).

  • Dublin Core – metadata definitions.

  • Eurostat Terminology on Statistical Metadata.

  • Eurostat metadata guidelines, methodologies and definitions.

  • Metadata: basic concepts and definitions and the role of the Statistical Data and Metadata Exchange.

  • OECD Data and Metadata Reporting and Presentation Handbook.

  • Statistical Data and Metadata Exchange (SDMX) Glossary.

  • Statistics Botswana - the importance of sound metadata management.

  • Statistics Canada - Metadata: An Integral Part of Data Quality Framework.

  • UNECE Guidelines for statistical data on the internet.

  • UNECE Statistical Metadata in a Corporate Context: A guide for managers.