10.1 Introduction

10.1.1 What is analysis?

In general terms, data analysis is the process of developing answers to questions through the examination of data. The basic steps in analysis are identifying issues, determining the availability of suitable data, deciding which methods are appropriate for answering the questions of interest, applying the methods, and evaluating, summarising and communicating the results.

This chapter refers to the data analysis conducted by a national statistical office (NSO) immediately prior to, and/or possibly following, the dissemination of statistical outputs based on the data. It is distinct from policy analysis conducted by users. Where the context is clear, it is referred to simply as analysis, as in the Generic Statistical Business Process Model (GSBPM).

More specifically, according to the GSBPM (see Chapter 6.3.1 — Administrative structure and finance of the national statistical office – the GSBPM and Chapter 15.4.3 — Generic Statistical Business Process Model (GSBPM)) the analysis phase of a statistical process includes preparing statistical content (including commentary, technical notes, etc.), and ensuring outputs are ‘fit for purpose’ prior to dissemination to users. The preparation of maps, GIS outputs and geostatistical services may be included to maximise the value of the statistical information and capacity to analyse it. This phase also includes the sub-processes and activities that enable statistical analysts to understand the data and the statistics produced.

10.1.2 Why a national statistical office performs analysis

The primary function of an NSO is to disseminate statistical information (comprising data and explanatory notes) for the benefit of users. In using this information, users, especially the more sophisticated ones, may do a great deal of analysis. In broad terms, then, the NSO produces and disseminates data and the users analyse them. Even so, the NSO itself should undertake a significant amount of statistical analysis in order to better understand the quality of the data and the data production processes. More specifically, the reasons for data analysis are as follows:

User perspective

By analysing the data, the NSO puts itself in the position of a user. It becomes a surrogate user. It comes to understand more about how users may view the data. It finds the ‘stories’ in the data, and it may find errors that have previously escaped notice.

Data coverage

The NSO gets a better feel for the coverage and content of the data, and the limitations of the data in these respects.

Data accuracy

The NSO gets a better understanding of the accuracy and reliability of the data and the limitations of the data in these respects.

Data consistency

The NSO learns more about the internal coherence (consistency) of the data, and the coherence of the data with respect to other datasets, i.e., the ease with which they can be jointly understood and analysed.

Process limitations

The NSO is better positioned to identify the limitations of the process by which the data are generated and how the process could be improved.

Data confidentiality

The NSO must assure itself that confidentiality is preserved when the data are disseminated (as discussed in Chapter 12.8.5 — Confidentiality and disclosure control).

Data adjustments

The NSO may identify seasonal components of sub-annual data and disseminate seasonally adjusted data when appropriate as discussed in Chapter 12.8.4 — Seasonal adjustment and time series analysis.

Data commentary

Finally, the NSO is in a better position to write the commentary and/or explanatory notes that may accompany data when they are disseminated.

In summary, there are considerable benefits to an NSO in analysing data before their dissemination.

10.1.3 What is an analytical framework?

Some users may be interested in a single dataset, for example, a dataset containing the consumer price index. However, many are interested in several datasets relevant to a particular topic or domain, say health inputs and outcomes. Typically, different datasets are produced by different statistical processes. For them to be coherent, i.e., easily analysed jointly, they need to have been produced using common standards for scope, definitions, classifications and units.

In some domains, for example, health, the relevant standards are brought together within a single analytical framework, for example, the System of Health Accounts (described in Chapter 10.9 — Health accounts). In some cases, for example, the System of National Accounts, the framework may span multiple domains.

An analytical framework, which may also be referred to as an integrated framework or integrating framework, may be summarised along the following lines.

  • It is a model relevant to a particular statistical domain that defines the scope, definitions, classifications, units, and relationships between them, for that domain.

  • It aims to guide and facilitate understanding and to support systematic, logical thinking.

  • It ensures that the data are structured in such a way that analysis has tangible outcomes, for example, answers to questions such as ‘what are key priority needs?’

  • Defining an analytical framework requires selecting amongst the possible options. It means deciding which data items are most important and informative, and thereby limiting the information that is analysed.

  • Analysis conducted using a framework is systematic and transparent, and has known coverage. It reduces the possible impacts of selection and procedural biases because multiple analysts are obliged to use the same concepts, definitions and classifications.

  • It provides a basis for review of data outputs. For example, supply-use tables may be used to check consistency and completeness of data being provided to the national accounts.
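
The review role described in the last point can be illustrated with a minimal sketch. It assumes a heavily simplified supply-use setting in which total supply and total use are available for each product and should balance within a tolerance. The product names, the figures, the tolerance and the function name balance_check are all hypothetical and chosen only for illustration; a real supply-use balancing exercise involves far more detail.

```python
# Hypothetical supply and use totals by product (illustrative values only).
supply = {"agriculture": 120.0, "manufacturing": 340.0, "services": 510.0}
use = {"agriculture": 120.0, "manufacturing": 335.0, "services": 510.0}


def balance_check(supply, use, tolerance=0.5):
    """Compare supply and use totals product by product.

    Returns a pair (discrepancies, missing):
    - discrepancies maps each product to supply minus use, for products
      where the absolute gap exceeds the tolerance (a consistency problem);
    - missing lists products present in only one table (a completeness
      problem).
    """
    discrepancies = {}
    missing = []
    for product in sorted(set(supply) | set(use)):
        if product not in supply or product not in use:
            missing.append(product)
            continue
        gap = supply[product] - use[product]
        if abs(gap) > tolerance:
            discrepancies[product] = gap
    return discrepancies, missing


discrepancies, missing = balance_check(supply, use)
print("Products out of balance:", discrepancies)
print("Products missing from one table:", missing)
```

In this sketch, a product flagged as out of balance or missing would be referred back to the area supplying the source data for investigation before the figures are used.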

In the specific context of an NSO, an analytical framework is a model relating all the units, concepts, data items and classifications pertinent to a particular topic or domain. It enables data originating from different sources (surveys, censuses, administrative records, etc.) to be combined and analysed consistently.

Use of an analytical framework, wherever available, is highly recommended.