2.9 Data for official statistics

Official statistics are based on information, mostly numerical – statistical data – that the NSO and other producers collect or acquire from various sources and in various ways. There are three main ways of obtaining data:

  • by direct enquiries – surveys – among individuals, households, businesses and institutions;

  • by the acquisition of administrative data from government and other administrative sources; and

  • by utilisation of other data sources, such as commercial data streams from businesses, geospatial data, data from sensors, and social media data.

The first category is termed survey data and is defined as primary data, as it is obtained for the specific purpose of statistical compilation. Administrative data is collected by administrative authorities for their own operations and then made available to the NSO for statistical purposes. It is therefore defined as secondary data: the primary purpose of its collection is administrative, not statistical. The third category, commercial data streams from businesses and data from other sources, is also secondary data. Most so-called “Big Data” falls within this category.

A traditional method for collecting data for statistical purposes is to obtain it directly from people, businesses and institutions – respondents – through surveys, i.e. by asking the respondents to submit information for specific statistical purposes. Surveys can be total counts, termed censuses, in which the whole of a given population is surveyed, or they can be based on representative samples of that population – sample surveys.

The oldest and best-known censuses are population and housing censuses, which are conducted primarily to obtain information on the population of a country or a given territory: its size and composition, living conditions, gainful activity, work etc. Population and housing censuses may also be carried out to obtain benchmark information for renewing the frames for household surveys. In the economic sector, censuses may be conducted to map the level and composition of economic activity; these are referred to as economic censuses.

Survey-based censuses are large and expensive undertakings. Sample surveys are much lighter and less expensive, and they are therefore the preferred method of surveying, as their results are usually sufficient to gauge developments, trends and situations. Sample surveys may not, however, satisfy the demand for detailed information on small areas or population sub-groups. In such cases, a census may be the only way to provide detailed disaggregation at the required quality level.

Until quite recently, censuses and sample surveys used paper questionnaires for collecting data. In many countries this practice has been reduced, or even discontinued, and replaced by digital questionnaires that are loaded on laptops, tablets or mobile phones, or completed via the Internet. Paper questionnaires require, besides paper and printing, that the information collected is coded, classified and checked for errors before it is entered, often manually, into a digital database. Digital questionnaires carry hardware and software costs, but these are nowadays lower than the cost of paper and printing.

The use of digital questionnaires has also brought significant improvements in survey technology. Digital questionnaires are usually augmented by automatic coding and logical checks, which greatly enhance the quality and consistency of the data. Another important development is that the technology often makes it possible – and feasible – to capture the exact geolocation of the surveyed statistical unit, such as a household, dwelling, plot or establishment. Finally, the data entered in the digital questionnaire is sent over the Internet or the telephone network and uploaded to the database in the NSO. All of this has greatly enhanced the efficiency, quality and richness of surveys. Digital questionnaires and digital means of collecting data have therefore become the preferred method of data collection in censuses and surveys.
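The logical checks built into a digital questionnaire can be illustrated with a minimal Python sketch. The record layout, the field names (`household_size`, `members_employed`, `head_age`) and the plausibility limits are assumptions made up for this illustration; they are not taken from any particular survey instrument or software.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HouseholdRecord:
    # Hypothetical questionnaire fields, chosen only for illustration.
    household_size: int
    members_employed: int
    head_age: int


def check_record(record: HouseholdRecord) -> List[str]:
    """Return the list of failed checks; an empty list means the record passes."""
    errors = []
    # Range checks on individual answers.
    if record.household_size < 1:
        errors.append("household_size must be at least 1")
    if record.members_employed < 0:
        errors.append("members_employed cannot be negative")
    if not 12 <= record.head_age <= 120:  # assumed plausibility limits
        errors.append("head_age outside plausible range")
    # Consistency check across two answers.
    if record.members_employed > record.household_size:
        errors.append("members_employed exceeds household_size")
    return errors


if __name__ == "__main__":
    print(check_record(HouseholdRecord(4, 2, 35)))  # consistent record
    print(check_record(HouseholdRecord(2, 3, 40)))  # inconsistent record
```

Running such checks at the moment of data entry lets the interviewer or respondent correct an inconsistent answer on the spot, instead of leaving it to be found during later editing.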

Although much lighter than censuses, sample surveys are still quite costly. This is particularly so for countries with small populations, because the sample size needed for representative results is large relative to the population. Survey-based censuses are, as mentioned above, always extensive and expensive operations.
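The point about small populations can be made concrete with the textbook sample-size formula for estimating a proportion under simple random sampling, with a finite population correction. The sketch below is a simplification under stated assumptions: a 95% confidence level (`z = 1.96`), a 5% margin of error and the most conservative proportion (`p = 0.5`) are illustrative defaults, and real household surveys use more complex designs that need larger samples.

```python
import math


def required_sample_size(population: int, margin: float = 0.05,
                         z: float = 1.96, p: float = 0.5) -> int:
    """Sample size for a proportion under simple random sampling."""
    if population < 1:
        raise ValueError("population must be a positive integer")
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction reduces n0 when the population is small.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


if __name__ == "__main__":
    for population in (5_000, 50_000, 10_000_000):
        n = required_sample_size(population)
        print(f"N={population:>10,}  n={n}  share={n / population:.4%}")
```

Under these assumptions a population of 5,000 needs about 357 respondents and a population of 10 million about 385. The required sample barely grows with population size, so it is a far larger share of a small population than of a large one.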

For these reasons, NSOs and other producers of official statistics in some countries, particularly in Northern Europe, started a few decades ago to acquire data for statistical purposes by utilising administrative data. As experience of utilising such data grew, and the methods and procedures became established and known, the practice was adopted in many countries. This development has partly been a direct result of cuts in NSO budgets in many countries at the same time as demand for regular and timely statistics has been growing rapidly. Examples of administrative data utilised in many countries are tax data for economic statistics, including national accounts; customs data for foreign trade statistics; social security data for statistics on living conditions; civil registration data on demographic changes; business registration data for establishing and maintaining business registers; and administrative data on migration, education, health, labour, transport and tourism.

Much of the administrative data is used directly for statistical purposes, but it may also be used for creating frames for sample surveys. A good example is the statistical business register (SBR), which is usually based on administrative data on businesses. It may also be based on mixed sources, such as an economic census combined with administrative data. The SBR is a structured database on businesses, maintained on a regular basis and supported by specific software. The NSO uses it to create frames for business surveys and sometimes as a direct source of information on the number and kind of businesses, by location, size, economic activity and more. Another example is the statistical farm register (SFR), which in many countries is based on an agricultural census (often taken every ten years) and administrative data. Yet another example is a household address register, which may be generated from census or administrative information. All such registers may be augmented by additional information collected through sample surveys that use the registers as frames.
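How a register serves as a sampling frame can be sketched in a few lines of Python. The example draws a stratified random sample from a toy business register. The record fields (`id`, `size_class`), the common sampling fraction and the function name `stratified_sample` are assumptions for illustration only; actual SBR structures and sample designs differ between countries.

```python
import math
import random
from collections import defaultdict


def stratified_sample(frame, stratum_key, fraction, seed=0):
    """Draw the same sampling fraction, at least one unit, from each stratum."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for unit in frame:
        strata[unit[stratum_key]].append(unit)
    sample = []
    for units in strata.values():
        k = max(1, math.ceil(len(units) * fraction))
        sample.extend(rng.sample(units, k))
    return sample


if __name__ == "__main__":
    # Toy register: six small and four large businesses (hypothetical data).
    register = (
        [{"id": i, "size_class": "small"} for i in range(6)]
        + [{"id": 100 + i, "size_class": "large"} for i in range(4)]
    )
    for unit in stratified_sample(register, "size_class", fraction=0.5):
        print(unit)
```

Stratifying by a variable already held in the register, such as size class or economic activity, is one common reason a well-maintained register makes business surveys more efficient.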

With the digitisation of economic activities and transactions, possibilities have opened up for capturing very large volumes of data from businesses and other sources. Such data is often termed Big Data. Technically, it can be captured by accessing the databases of firms and institutions by electronic means. In the last few years this has been done in a few countries to obtain very detailed data on inputs, outputs, prices and business transactions, which has been utilised for economic statistics and price statistics, e.g. price indices. Earth observation data from satellite imagery is a more recent source of data on land and land use. It requires Big Data methodology for processing and is a potentially rich source for environment statistics, agricultural statistics, transport statistics, etc. Another very promising Big Data source is mobile phone data, which can be used to compile statistics on mobility and communication, such as transport and tourism statistics. Related developments involve capturing data from social media content and transactions.