14.5 Managing archives

Data archiving is the process of moving data that are no longer actively used to a separate storage device for long-term retention. Archived data are older data that remain important to the organization or must be retained for future reference or regulatory compliance.

In the Generic Statistical Business Process Model (GSBPM), archiving is part of the process of data and metadata management. This reflects the view that archiving can happen at any stage of the statistical production process and is not necessarily the final stage.

Many archives exist to make data actively available: data and metadata are systematically stored, protected, and made available according to agreed rules. Laws may require content to be kept for specific periods, and internal and external audits may require document retention. Archiving ensures the continued viability and usability of data now and in the future, and provides access to these data within the framework of national legislation while ensuring confidentiality and protecting privacy.

Microdata sets contain information on individual persons, households or business entities collected through censuses, surveys, interviews or administrative recording systems. NSOs and other producers of official statistics generate vast amounts of microdata from surveys and censuses. These data represent a significant investment by an NSO and have considerable value to both existing and future users for the production of national statistics and for research. An NSO should therefore have a policy and mechanisms in place to ensure that microdata and metadata can be archived.

When archiving data, the need to keep information online should be balanced against the cost of doing so: keeping too much old information online consumes valuable storage that could be better used for newer information. It can also increase the number of irrelevant search results returned and add to the effort required to maintain, migrate, and reclassify content. Long-term storage of this material adds to storage costs and to the security burden of safeguarding the confidentiality of the records.

Archives should be indexed and searchable so that files can be easily located and retrieved. A microdata catalogue and a document management system can help to manage the creation, capture, indexing, storage, retrieval, and disposition of the records and information assets of an NSO. Records management addresses the issues of knowing what data the NSO has, where it is stored, how long it should be kept and how secure it is.
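The four records-management questions above (what data exist, where they are stored, how long they should be kept, and how secure they are) can be captured in a simple catalogue record. The following is a minimal Python sketch under stated assumptions, not a description of any particular cataloguing tool: the `ArchiveRecord` fields, the `search` helper and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ArchiveRecord:
    """One catalogue entry; all field names are illustrative assumptions."""
    dataset_id: str
    title: str
    location: str        # where the files are stored
    retain_until: date   # how long the record should be kept
    access_level: str    # e.g. "public", "licensed", "restricted"
    keywords: list[str] = field(default_factory=list)


def search(catalogue: list[ArchiveRecord], term: str) -> list[ArchiveRecord]:
    """Return records whose title or keywords contain the term (case-insensitive)."""
    needle = term.lower()
    return [
        record for record in catalogue
        if needle in record.title.lower()
        or any(needle in keyword.lower() for keyword in record.keywords)
    ]


# Hypothetical sample entries, for illustration only.
catalogue = [
    ArchiveRecord("lfs-2015", "Labour Force Survey 2015", "archive/lfs/2015",
                  date(2040, 12, 31), "licensed", ["employment", "household"]),
    ArchiveRecord("census-2010", "Population Census 2010", "archive/census/2010",
                  date(2110, 12, 31), "restricted", ["population", "household"]),
]

matches = search(catalogue, "household")
print([record.dataset_id for record in matches])
```

A real microdata catalogue would index far richer metadata; the point here is only that each record carries its location, retention date and access level alongside the searchable description.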

Archiving rules for specific statistical processes depend on the general statistical legislation and any archiving policy of the NSO. These rules include consideration of the medium, location of the archive, and the requirement for keeping duplicate copies. They should also consider the conditions (if any) under which data and metadata should be disposed of.
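As an illustration of how such rules might be recorded, the sketch below models one rule per statistical process with the elements listed above: medium, location, duplicate copies, and a disposal condition. It is a hypothetical example in Python; the `ArchivingRule` class, the `may_dispose` function and the retention values are assumptions, since actual rules follow from the statistical legislation and the NSO's own archiving policy.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ArchivingRule:
    """Hypothetical archiving rule for one statistical process."""
    process: str
    medium: str                     # e.g. "disk", "tape"
    location: str                   # primary archive site
    duplicate_copies: int           # additional copies to keep
    retention_years: Optional[int]  # None means keep indefinitely


def may_dispose(rule: ArchivingRule, archived_on: date, today: date) -> bool:
    """True only if the rule sets a retention period and it has fully elapsed."""
    if rule.retention_years is None:
        return False
    expiry_year = archived_on.year + rule.retention_years
    try:
        expiry = archived_on.replace(year=expiry_year)
    except ValueError:
        # Archived on 29 February and the expiry year is not a leap year:
        # use 1 March so that data are never disposed of early.
        expiry = date(expiry_year, 3, 1)
    return today >= expiry


# Illustrative values only.
survey_rule = ArchivingRule("household survey", "disk", "primary data centre", 2, 10)
census_rule = ArchivingRule("population census", "tape", "national archive", 2, None)

print(may_dispose(survey_rule, date(2010, 6, 1), date(2021, 1, 1)))
print(may_dispose(census_rule, date(2010, 6, 1), date(2021, 1, 1)))
```

Treating a missing retention period as "keep indefinitely" is a deliberately conservative design choice: disposal then requires an explicit rule rather than happening by default.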

The Generic Longitudinal Business Process Model (GLBPM) has been derived from the GSBPM and provides a generic model that can serve as the basis for informing discussions across organizations conducting longitudinal data collections, and other data collections repeated across time. The model is intended to serve as a reference model against which implemented processes are mapped, for the purposes of determining where they may be similar to or different from other processes in other organizations. It may also prove useful to those designing new longitudinal studies, providing reminders of steps which may need to be planned.

GLBPM covers the following phases:

  • Evaluating and specifying needs;

  • Design and re-design of data collection instruments;

  • Build and re-build of data collection instruments;

  • Data collection;

  • Processing and analysis;

  • Archive/preserve/curate;

  • Dissemination and discovery;

  • Research and publish;

  • Retrospective evaluation.

Links to guidelines, best practices and examples:
  • The NADA Microdata Cataloguing Tool.

  • UK Office for National Statistics Code of Practice Protocol on Data Management, Documentation and Preservation.

  • ICPSR Guide to Social Science Data Preparation and Archiving.

  • Philippine Statistics Authority – PSA Data Archive.

  • Philippine Statistics Authority Data Archive (PSADA).