12.2 Register-based statistics
12.2.1 Description of Register-based statistics
‘Register-based statistics’ are statistics based either fully on administrative registers or partly on them, complemented with data from other administrative sources and direct surveys. This definition is rather wide and may include almost all statistics production. The focus in this section is on those statistics which have links to a register-based production model.
The term ‘register-based statistical system’ is used in this section even though it is by nature indicative rather than an accurate description of any statistical system. As a pragmatic approach, the handbook Using Administrative and Secondary Sources for Official Statistics: A Handbook of Principles and Practices (🔗) published by the UNECE (2011) defines a register-based statistical system as a system based primarily on administrative data that have been organized into linked statistical registers. This definition recognises that in practice, all NSOs, independently of their data collection policies and production models, need to conduct regular statistical surveys to produce a full range of statistics. However, in a register-based statistical system, the basic data infrastructure is organized around linked statistical registers based on administrative sources. With such an infrastructure, administrative data is considered a main source of information for the compilation of official statistics. Administrative data has been used in modern statistics production for decades, and a remarkable milestone in producing register-based statistics was the Danish population and housing census of 1981 and its underlying register-based production model. The production model was based on the linking of several administrative registers and other administrative data sources. Since then, several NSOs have applied this approach to the production of register-based statistics.
The above-mentioned UNECE Handbook summarises several countries’ experiences in collecting and using administrative data at that time. A common understanding among statisticians of the possibilities and constraints of using administrative data has grown remarkably over the past decade. At the same time, statistical methods to exploit these possibilities and overcome the problems have developed. It has become good international practice to add a specific section on the use of administrative sources to domain-specific guidelines, such as guidelines for population censuses or integrated business statistics, to remind NSOs of the possibilities of administrative data as a source for compiling statistics.
Administrative sources, access and acquisition of administrative data are dealt with in Chapter 9.3 — Administrative sources. This section deals with utilising administrative registers and other administrative data effectively in the statistical production. The focus is on building an administrative and statistical register infrastructure to be used in the production of official statistics - including technical and organizational structures and facilities - which can facilitate the use of register data.
The utilisation of administrative sources varies greatly from country to country. The extent of the development of the basic infrastructure in the direction of intensive utilisation of administrative sources depends on the existence of suitable administrative sources, ease of access to them, general facilitators like common identifiers and, in the end, on government policy and the acceptance of the general public. The development phase is long, and a step-by-step approach is needed.
12.2.2 Registers
Administrative registers
A register can be simply a basic list of all units in the target population and nothing more. In practice, most registers include additional attributes for each unit. In a population register, these attributes are related to natural persons: name, identification code, address, domicile code, sex, and family relations. In a business register, a registered entity may have attributes like name, address, identification code, type of company and activity class.
An administrative register’s function is typically to identify the registered units, keep stock of the population (like natural persons in a population register and businesses in a business register) and keep track of any information changes related to coverage (new units to be added and exiting units removed) and to events (changes in the attributes of the units). Management of these changes is built into the updating system of the register.
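The two kinds of change described above, coverage changes and events, can be sketched as follows. This is a minimal illustration, not a description of any real register system: the class names `RegisteredUnit` and `AdministrativeRegister`, the method names and the sample units are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class RegisteredUnit:
    unit_id: str
    attributes: dict
    registered_on: date
    deregistered_on: Optional[date] = None
    # Each entry: (date of change, attribute, old value, new value)
    history: list = field(default_factory=list)


class AdministrativeRegister:
    """Hypothetical register that keeps stock of units and logs changes."""

    def __init__(self):
        self.units = {}

    def add_unit(self, unit_id, attributes, when):
        # Coverage change: a new unit enters the population.
        if unit_id in self.units:
            raise ValueError(f"unit {unit_id} is already registered")
        self.units[unit_id] = RegisteredUnit(unit_id, dict(attributes), when)

    def remove_unit(self, unit_id, when):
        # Coverage change: the unit exits, but its record is kept with an exit date.
        self.units[unit_id].deregistered_on = when

    def record_event(self, unit_id, attribute, new_value, when):
        # Event: an attribute changes; the old value is preserved in the history.
        unit = self.units[unit_id]
        unit.history.append((when, attribute, unit.attributes.get(attribute), new_value))
        unit.attributes[attribute] = new_value

    def current_stock(self):
        # Units that have been registered and have not exited.
        return sorted(uid for uid, u in self.units.items() if u.deregistered_on is None)


register = AdministrativeRegister()
register.add_unit("P001", {"address": "Main Street 1"}, date(2020, 1, 15))
register.add_unit("P002", {"address": "Mill Road 7"}, date(2020, 3, 2))
register.record_event("P001", "address", "Harbour Lane 4", date(2021, 6, 1))
register.remove_unit("P002", date(2022, 9, 30))
print(register.current_stock())
```

Keeping exited units and old attribute values, rather than overwriting them, is a design choice made here so that earlier states of the register remain recoverable.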
All countries have administrative registers. Administrative registers are increasingly seen as a resource and facilitator in the effective and proper functioning of a society. Governments are giving ever-growing attention to the development and management of administrative registers and to the information management of public administration as a whole, including information systems, information security, operating processes and technical interfaces. As part of the public sector information systems, administrative registers are needed to serve not only government administration in its daily operations but also to help people, businesses, and other organizations understand their lawful rights, access benefits, and meet their obligations in society.
Among the most important administrative registers widely used for statistical purposes are population registers or civil registration and birth and death registration systems, business registers, registers of social security and health care systems, taxation and customs registers and registers from building and property registration systems.
The way the registers and other governmental information systems are organized and developed varies from country to country and depends on the culture, policies and structure of the governmental sector as well as on the legislative framework of a society. The degree of centralisation of administrative registers, the identification systems and the extent to which the registers and the identification systems are regulated by law vary remarkably across the countries.
Identifiers play a vital role in maintaining administrative registers and their use for statistics production, which usually means linking data from various sources. Identification codes (ID codes) should ideally not be changed during the period a unit exists. This is the method usually applied in the countries with a common identification system. The codes are used in administrative data across the whole administration.
Registers and identification codes are usually regulated by national laws in the countries with centralised management of public sector’s information systems. Laws regulate the responsible authorities and their rights and obligations as well as the content, access to and use of these registers. The creation and the use of the personal ID code and the business ID code are also regulated by law.
National legislation on protecting private life and other basic rights, which safeguards the individual’s right to privacy, applies throughout society. For this reason, the number of authorities with access to the population register and the right to handle any information with personal ID codes is very limited. Within this legislative framework, the NSO’s rights to access and use record-level administrative data are guaranteed, and provisions to safeguard statistical confidentiality are included. In contrast to population registers and other administrative registers with private information, in some countries business register information such as ID code, address, place of residence and main activity is public by law and available on the web.
The legislation and organization of the registers, the updating systems, and the identification systems used in them, create the framework in which the national statistical system operates. Indeed, a centralised register-based administrative system with unique identification codes facilitates the acquisition of administrative data for the production of official statistics. When the administrative registers are more decentralised between government and administration entities, and especially if a common identification system used across the administration does not exist, an NSO has to make more effort to acquire data and process them. NSOs have made efforts for matching data from various administrative sources without a common identification system, but these efforts are very laborious, and the results may be quite weak in the context of regular statistical production.
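To illustrate why matching without a common identifier is laborious, the following sketch links two sources on name and birth date only. It is an assumption-laden toy example: the field names, the sample records and the threshold of 0.8 are invented for illustration, and `difflib.SequenceMatcher` from the Python standard library stands in for the more sophisticated probabilistic matching methods used in practice.

```python
from difflib import SequenceMatcher


def normalise(name):
    # Lower-case, drop punctuation and collapse whitespace before comparing.
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def similarity(name_a, name_b):
    return SequenceMatcher(None, normalise(name_a), normalise(name_b)).ratio()


def match_records(source_a, source_b, threshold=0.8):
    """Pair each record in source_a with its most similar record in source_b."""
    matches = []
    for a in source_a:
        # Blocking: only compare records that share the same birth date.
        candidates = [b for b in source_b if b["birth_date"] == a["birth_date"]]
        best, best_score = None, 0.0
        for b in candidates:
            score = similarity(a["name"], b["name"])
            if score > best_score:
                best, best_score = b, score
        if best is not None and best_score >= threshold:
            matches.append((a["row"], best["row"]))
    return matches


# Hypothetical extracts from two administrative sources without a shared ID code.
tax_records = [
    {"row": "t1", "name": "Anna Maria Svensson", "birth_date": "1980-03-14"},
    {"row": "t2", "name": "John Smith", "birth_date": "1975-01-01"},
    {"row": "t3", "name": "Maria Garcia", "birth_date": "1990-05-05"},
]
health_records = [
    {"row": "h1", "name": "Anna M. Svensson", "birth_date": "1980-03-14"},
    {"row": "h2", "name": "Jon Smith", "birth_date": "1975-01-01"},
    {"row": "h3", "name": "Peter Jones", "birth_date": "1990-05-05"},
]

matches = match_records(tax_records, health_records)
print(matches)
```

Even in this small case the result depends on a normalisation rule and an arbitrary threshold, and both false matches and missed matches are possible. With a common identification code none of these choices would be needed.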
Chapter 9.3 — Administrative sources discusses administrative sources and related quality issues in general. Due to the diversity of administrative register practice, data quality assessment frameworks for administrative data are bespoke. The use of administrative registers in statistical production requires that the NSO assess each administrative register’s relevance, suitability, and quality as a data source. The coverage, data content and updating system of the administrative register should fit the statistical purposes and meet the overall data quality requirements. The updating system for an administrative register has a significant impact on its coverage and data content at a certain point in time. Therefore, it is vital for statistical usability that the register contains reliable time references for registrations and updates. Administrative registers and systems to manage them have to be stable over time to meet the continuity requirements for statistics. The data should be available in the form agreed between the producer of data and the NSO.
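The value of reliable time references can be shown with a small sketch that reconstructs the state of a register at a statistical reference date. The record layout (`valid_from`, `valid_to`) and the sample data are assumptions made for this example only.

```python
from datetime import date

# Hypothetical residence records; valid_to is None while the record is current.
residence_records = [
    {"person_id": "P1", "municipality": "A", "valid_from": date(2015, 1, 1), "valid_to": date(2020, 6, 30)},
    {"person_id": "P1", "municipality": "B", "valid_from": date(2020, 7, 1), "valid_to": None},
    {"person_id": "P2", "municipality": "A", "valid_from": date(2019, 3, 1), "valid_to": date(2021, 2, 28)},
    {"person_id": "P3", "municipality": "B", "valid_from": date(2021, 5, 1), "valid_to": None},
]


def state_at(records, reference_date):
    """Return {person_id: municipality} for records valid on the reference date."""
    return {
        r["person_id"]: r["municipality"]
        for r in records
        if r["valid_from"] <= reference_date
        and (r["valid_to"] is None or reference_date <= r["valid_to"])
    }


end_2020 = state_at(residence_records, date(2020, 12, 31))
end_2021 = state_at(residence_records, date(2021, 12, 31))
print(end_2020)
print(end_2021)
```

Without the validity dates, only the latest state would be available, and the population at an earlier reference date could not be derived from the register.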
Statistical registers and register-based approach to production
The NSOs keep statistical registers exclusively for statistical purposes. They are needed to organize the NSO’s data collection activities by providing suitable frames for target populations and sampling. Statistical registers are usually created by processing data from several administrative sources, with or without combining them with survey data. Administrative registers are very seldom directly suitable as statistical registers, but they form an excellent source. A statistical register typically plays the role of a data coordination tool integrating data from several administrative and statistical sources. The data sources are integrated primarily by linking record-level data with common identifiers, occasionally by using matching techniques.
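Linking record-level data with a common identifier, as described above, amounts to a keyed join against the statistical register. The sketch below assumes a hypothetical `person_id` key and invented source names; it also reports records that do not link, since these need to be investigated.

```python
def link_by_id(base_register, sources, id_field="person_id"):
    """Attach variables from each source to the matching base register unit.

    Returns the linked data and a list of (source name, id) pairs that
    could not be found in the base register.
    """
    linked = {uid: dict(attrs) for uid, attrs in base_register.items()}
    unmatched = []
    for source_name, rows in sources:
        for row in rows:
            uid = row[id_field]
            if uid in linked:
                linked[uid].update({k: v for k, v in row.items() if k != id_field})
            else:
                unmatched.append((source_name, uid))
    return linked, unmatched


# Hypothetical statistical population register and two administrative sources.
population_register = {
    "P1": {"sex": "F", "birth_year": 1980},
    "P2": {"sex": "M", "birth_year": 1975},
}
tax_data = [{"person_id": "P1", "income": 42000}, {"person_id": "P9", "income": 100}]
education_data = [
    {"person_id": "P1", "education": "tertiary"},
    {"person_id": "P2", "education": "secondary"},
]

linked, unmatched = link_by_id(
    population_register, [("tax", tax_data), ("education", education_data)]
)
print(linked)
print(unmatched)
```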
The generic model for register-based statistics is discussed in ‘Using Administrative and Secondary Sources for Official Statistics: A Handbook of Principles and Practices’ (🔗) published by UNECE. Chapter 10.3 - Methods and systems of analysis of this Handbook illustrates the links between administrative sources, statistical registers and statistical output.
The importance of statistical registers became obvious from the time when countries started to develop register-based censuses. Based on the available selection of administrative registers, the first step in the development process is to define a priority order in which the administrative registers play a role in the production system and which registers offer the possibility of identifying the units and links between the registers. This leads to selecting the most important ones to be used as a basis for statistical registers needed in the census system. Created statistical base registers, originally for census purposes, allow for implementing a meaningful system of internal links between these registers and the links to other registers and administrative data.
A statistical population register, a statistical register for buildings and dwellings or addresses and a statistical business register are usually defined as statistical base registers. Depending on the country’s administrative registers and identification system, the statistical base registers may also include other statistical registers. The statistical base registers serve as the basis for the register-based approach to statistical production. This is the main feature of the ‘register-based production model’. The updating systems of the underlying administrative registers make it possible to choose the optimal updating dates for the statistical base registers, which are updated regularly, at least annually. If administrative records or administrative registers other than the main administrative source register are used in statistical production, these additional data sets are linked to the corresponding statistical base registers. This ‘centralized’ register-based approach offers a basis for coordinating and integrating the production systems of population and social statistics based on administrative data, survey data or a combination of both.
The statistical business register cannot be based totally on administrative registers because it also contains information on establishments and enterprise groups, which is usually not available in administrative registers. Statistical business registers are dealt with in more detail in Chapter 12.3 - Statistical Business Register. Once the statistical business register with an updating system has been created, it serves as one statistical base register in statistical production and is an essential element in integrating business and economic statistics. The register-based approach to production may decrease the number of direct statistical surveys, but it does not diminish their importance. Many data items and variables are not found in any administrative register or data source. These data items include, for example, details of the activities of people and businesses, and their opinions, expectations and behaviours. In all countries, there is always a great need for regular statistical surveys to produce a full range of official statistics.
Traditionally, statistical registers were created to serve as sampling frames for target populations. With their increased content and regular updating mechanisms, statistical registers are increasingly becoming sources of statistical data in their own right, particularly regarding population and business demography, and data for small areas or small population sub-groups. The role of statistical registers has increased especially since NSOs have become aware of, and started to develop, their common data architecture and data warehouses.
12.2.3 Production of register-based statistics
Main features of register-based statistics
The term ‘register-based statistics’ was first used in the 1980s in the context of register-based population census. At the same time, it was recognised that the same register-based approach used with established register-based production systems is also useful in producing many other statistics.
The basic principle of the register-based production model is that all source data, from administrative sources and direct surveys, use the same identification system and link to the statistical base registers.
The most common way to produce statistics using the register-based system is to combine data from different administrative sources or combine data from administrative sources and direct surveys.
The register approach allows the production of some register-based statistics without any complementary data from direct surveys. These statistics may include population and business demographic statistics, education statistics, crime statistics and housing statistics. For instance, in the register-based production model, demographic variables are produced in the statistical population register and are used in all kinds of population and social statistics.
Register-based population census
The “UN Principles and Recommendations for Population and Housing Censuses” (🔗), published in 2017, offers useful information on census methodology. Its Chapter 4 summarises the methodological questions related to the three data collection methods (full-field enumeration, register-based census, and combined methodologies) and concludes with the advantages and disadvantages of each.
A register-based population census requires a complex data system mainly based on the linking of various administrative registers. The combination of administrative registers and data used varies across countries as do the methods used and the production models applied. Development of a register-based census has usually taken many years, or even decades, depending on the development of suitable administrative registers and infrastructure needed in the system. The missing administrative data have usually been complemented with new or existing data from direct surveys during the development phase.
Typically, a register-based population census requires record-level data from several registers with unified identification codes and an appropriate way to organize them into a register-based census system.
Population registers, registers for buildings and dwellings or addresses and business registers are usually the cornerstones in this system. These contain the links to other registers and provide links to other administrative sources. Depending on the national registration systems, other typical administrative registers used in register-based censuses are registers for taxation, employment, pensions, social welfare, job seekers and students. The register-based census system in four Nordic countries (Denmark, Finland, Norway and Sweden) is thoroughly explained in the publication “Register-based Statistics in the Nordic Countries – Review of Best Practices with Focus on Population and Social Statistics” (🔗). The Dutch paper “The usability of administrative data for register-based censuses” (🔗) gives an illustrated description of the development process towards a fully register-based census in the Netherlands. It explains the country’s registration system and offers a good description of the solutions and methods developed to overcome some of the most important issues.
Administrative registers do not usually cover all required data or provide enough details to allow the production of all statistical variables for the census. To complement the information that is not available from registers, countries use various methods that best suit their respective national circumstances. The methods used in ten different register-based population censuses are described in the document Efficiency in Population Censuses – the situation of the European register-based 2011 Censuses (🔗).
Once the register-based census system has been created, the infrastructure often allows annual production of the main statistical data contained therein. The register-based population census system usually offers a good basis to produce geospatial data in the countries with interlinked registers. One way to organize the production is via the register of buildings in which buildings and dwellings are geolocalised with map coordinates. Using the register-based inter-operability features, the exact location of each statistical unit can be derived. Thus, most variables of the population and housing census can be geo-localised.
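The chain of links described above, from person to dwelling to building to map coordinates, can be sketched as a series of lookups. All identifiers, coordinates and register layouts here are invented for illustration; real building registers and their link structures differ by country.

```python
from collections import Counter

# Hypothetical interlinked registers.
building_coordinates = {"B1": (60.17, 24.94), "B2": (61.50, 23.76)}
dwelling_to_building = {"D1": "B1", "D2": "B1", "D3": "B2"}
person_to_dwelling = {"P1": "D1", "P2": "D2", "P3": "D3", "P4": "D9"}


def locate_person(person_id):
    """Follow person -> dwelling -> building -> coordinates; None if a link is broken."""
    dwelling_id = person_to_dwelling.get(person_id)
    building_id = dwelling_to_building.get(dwelling_id)
    return building_coordinates.get(building_id)


# Population counts per building location, skipping persons who cannot be located.
population_by_location = Counter(
    coords for coords in map(locate_person, person_to_dwelling) if coords is not None
)
print(population_by_location)
```

Person `P4` points to a dwelling that is missing from the dwelling register, so the lookup returns `None`. In production such broken links would be a quality indicator worth monitoring.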
The register-based census system also offers a good basis for developing special services for researchers. These services have currently developed rapidly, containing a large amount of anonymised and interlinked micro-data on persons, households, and businesses. These may be offered to researchers as online services. However, in giving access to census data on a more detailed level (microdata), it is crucial to secure confidentiality and privacy as discussed in Chapter 10.3.1 — Methods of analysis.
Register-based approach in the production of other statistics
Vast amounts of statistics in many countries are based on a combination of data from administrative sources and data from direct surveys. These may typically include annual and short-term economic and business statistics, income statistics and statistics on social conditions of households as well as a growing amount of energy and environment statistics. In the register-based production model, these statistics are linked to statistical base registers so that the domain-specific statistical systems can use administrative data more easily and combine administrative data with survey data. The aim is to increase coherence and consistency in statistics production. The infrastructure developed for register-based statistics offers possibilities for fostering data warehouses and data integration in official statistics production. “A Guide to Data Integration for Official Statistics” (🔗), published by the UNECE, and the “Asia-Pacific Guidelines to Data Integration for Official Statistics”, issued by ESCAP, describe data integration in detail regarding different source data, methods and tools for data linking and matching.
Detailed administrative records can be an excellent source of information to compile more complex statistical frameworks, such as the national accounts or environmental-economic accounts.
A register-based system with a vast amount of administrative and survey data stored and structured in a meaningful way in statistical registers and warehouses may allow for the production of complex statistics without launching new surveys. Further, this system increases the flexibility and agility to respond to new and emerging needs for statistics and indicators, which is of growing importance in view of monitoring progress towards the SDGs. As for statistical surveys, statistical base registers function as sampling frames.
A register-based statistical system also contributes to increasing sample surveys’ quality, making them more consistent with other statistics at the macro-level and serving as additional information in analysing the results. The statistical base registers contain important demographic data for units that can be used to define populations and select samples. Using statistical base registers, samples may also be drawn for desired sub-groups, such as employed persons or students. In the analysis phase, survey results may be compared to data of the desired reference group available in the statistical base registers and other sources linked to them.
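Drawing a sample for a desired sub-group from a statistical base register can be sketched as follows. The register content, the `status` variable and the function name `draw_sample` are hypothetical; simple random sampling with a fixed seed is used only to keep the example reproducible, and real designs are usually stratified.

```python
import random

# Hypothetical statistical population register with one classifying variable.
statistical_register = {
    f"P{i:03d}": {"status": "student" if i % 3 == 0 else "employed"}
    for i in range(1, 31)
}


def draw_sample(register, predicate, n, seed):
    """Simple random sample of unit ids from the sub-group selected by predicate."""
    # Sorting the frame makes the draw independent of dictionary ordering.
    frame = sorted(uid for uid, attrs in register.items() if predicate(attrs))
    rng = random.Random(seed)
    return rng.sample(frame, min(n, len(frame)))


def is_student(attrs):
    return attrs["status"] == "student"


student_sample = draw_sample(statistical_register, is_student, n=4, seed=2024)
print(student_sample)
```

Because the frame is built from the register, the size of the sub-group is known exactly, which is what later allows survey results to be compared with the reference group.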
Statistical registers are also used to mitigate the impact of non-response in sample surveys when register data is used for the imputation of missing or invalid values.
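A minimal sketch of register-based imputation is given below: missing survey values are filled from a linked register and flagged. The variable name `income`, the sample data and the simple rule of copying the register value directly are assumptions for illustration. In practice the register concept may differ from the survey concept and a model-based adjustment may be needed.

```python
def impute_from_register(survey_rows, register, variable, id_field="person_id"):
    """Fill missing survey values from the register and flag imputed records."""
    flag = variable + "_imputed"
    result = []
    for row in survey_rows:
        new_row = dict(row)
        new_row[flag] = False
        if new_row.get(variable) is None:
            register_value = register.get(row[id_field], {}).get(variable)
            if register_value is not None:
                new_row[variable] = register_value
                new_row[flag] = True
        result.append(new_row)
    return result


# Hypothetical survey responses (None marks item non-response) and register data.
survey_responses = [
    {"person_id": "P1", "income": 38000},
    {"person_id": "P2", "income": None},
    {"person_id": "P3", "income": None},
]
income_register = {"P1": {"income": 37500}, "P2": {"income": 51000}}

completed = impute_from_register(survey_responses, income_register, "income")
print(completed)
```

The imputation flag is kept so that users and quality reports can distinguish reported values from values taken from the register. Respondent `P3` remains missing because no register value exists.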
12.2.4 Infrastructure for production of register-based statistics
Coordination and working with data providers
Production of register-based statistics usually requires a formal organization that supports extensive co-operation between register authorities and an NSO. Ideally, the authority and coordination function of the NSO is recognized beyond the national statistical system, for example by administrative data holders, users, and other stakeholders. The organization model of coordination varies from country to country. A good practice has been that the NSO names a senior statistical expert to function as a coordinator for each of the most important producers of administrative registers and other data. Cooperation and coordination, as well as the follow-up, are often managed by a functional coordination unit at the NSO, which also acts as an information hub between the different producers. Concluding formal written agreements (or Memoranda of Understanding, MoUs) between the NSO and administrative data holders is one of the essential tasks of such a functional coordination unit in a register-based statistical system. These agreements help all parties to understand their obligations. Some authorities with many administrative data systems, such as the tax authority, may prefer centralized coordination mechanisms, which are also helpful to the NSO. The designated coordinator at the NSO and his/her counterpart or contact person in the administration prepare the MoU for the data delivery or access, with detailed technical attachments for each separate data set, to be signed at the top management level. The attachments, which contain specific detailed information on each data set including technical details and delivery timetables, need to be updated regularly. The role of the expert(s) in the coordination unit is to manage and update the overall contract system and liaise between the subject matter specialists in the NSO and the holders of administrative data.
NSOs operating in a register-based statistical system also need to actively participate in the development of public sector information systems to safeguard the basis and continuity of the data sources for the production of statistics and eventually find new possibilities for further development.
Therefore, it is a good practice to consult and invite the NSO to participate in any governmental initiatives that may have an impact on the accessibility, inter-operability, scope, content, periodicity, and timeliness of administrative data.
Other aspects of common infrastructure
Within a register-based statistical system, the general infrastructure of an NSO needs to be adapted to the growing flows of incoming administrative data complemented by survey results. There is often a need to re-engineer and integrate the processes and metadata systems, develop methods and quality frameworks, streamline the internal organization and policies, and develop staff capabilities. Complementing or substituting survey data with administrative data for the production of official statistics without adapting the statistical production process rarely works. At a general level, the statistical process does not change, but the use of administrative data in a register-based system impacts every phase of the process.
Wide methodological expertise is needed to utilise administrative data in statistics production, such as data matching and other methods used to integrate administrative sources into the production process. Increasing amounts of administrative data used and linked in the statistical process increase the need for proper disclosure methods and rules. It is also important that the quality frameworks, rules and practices are adapted to this production system. The staff working with register-based statistics may need additional competences and training. Those experts working with administrative registers and other administrative data in the production of domain-specific statistics need to know how to handle both survey data and the administrative data they use. The register-based production system should be reflected in human resource policies and programs. A special function for administrative data, often as part of the data collection unit, may be useful. Its task would be to make the first validation and quality checks for incoming administrative data, review the production processes and acquire or develop software applications. This unit, as mentioned above, would also be responsible for managing the MoUs.
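The first validation and quality checks for an incoming administrative data delivery could look like the sketch below. The three checks (duplicate identifiers, missing values in required fields, identifiers unknown to the statistical base register) are examples chosen for illustration, and the field names and data are hypothetical.

```python
from collections import Counter


def validate_delivery(rows, required_fields, base_register_ids, id_field="unit_id"):
    """Run basic checks on a delivery and return a summary report."""
    id_counts = Counter(row.get(id_field) for row in rows)
    delivered_ids = {row[id_field] for row in rows if row.get(id_field) is not None}
    return {
        "n_records": len(rows),
        # Identifiers that occur more than once in the delivery.
        "duplicate_ids": sorted(
            uid for uid, count in id_counts.items() if uid is not None and count > 1
        ),
        # Number of records with an empty value, per required field.
        "missing_values": {
            name: sum(1 for row in rows if row.get(name) in (None, ""))
            for name in required_fields
        },
        # Identifiers that cannot be linked to the statistical base register.
        "ids_not_in_base_register": sorted(delivered_ids - set(base_register_ids)),
    }


# Hypothetical delivery from a tax authority and ids from the business register.
delivery = [
    {"unit_id": "E1", "turnover": 1200, "activity": "4711"},
    {"unit_id": "E2", "turnover": None, "activity": "5610"},
    {"unit_id": "E2", "turnover": 300, "activity": ""},
    {"unit_id": "E7", "turnover": 50, "activity": "0111"},
]
report = validate_delivery(delivery, ["turnover", "activity"], {"E1", "E2", "E3"})
print(report)
```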
As discussed in Chapter 14 - Data, Information and Knowledge Management, production of statistics needs well-functioning, uniform and standardised metadata systems. In register-based systems, it is important that administrative registers and other administrative data used in production be documented in the metadata systems. Careful documentation of administrative source data, record descriptions, and electronic versions of questionnaires, including instructions, should be stored in the metadata system. All changes should also be documented and stored. Processing rules and the possibility of tracing data to the source are an essential part of this documentation. Over the years, NSOs have developed common databases for metadata used in the whole production process. These may include metadata for statistical units, for classifications with various levels, concepts, variables and characteristics with definitions as well as metadata for technical standards. Standardised metadata repositories of this kind improve the consistency of statistics and streamline the production process. These metadata repositories should be planned and adapted to the multi-source production process.
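As a rough sketch of how processing rules and traceability to the source might be recorded, the example below stores variable-level metadata and follows the derivation chain back to the administrative data sets. The class `VariableMetadata`, its fields and the catalogue entries are hypothetical and far simpler than a real metadata system; the derivation graph is assumed to contain no cycles.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VariableMetadata:
    name: str
    definition: str
    source_dataset: Optional[str] = None  # data set the variable is read from
    derived_from: tuple = ()              # names of input variables, if derived
    processing_rule: str = ""


def trace_to_sources(name, catalogue):
    """Return the sorted source data sets a variable ultimately depends on.

    Assumes the derivation graph is acyclic.
    """
    entry = catalogue[name]
    sources = set()
    if entry.source_dataset:
        sources.add(entry.source_dataset)
    for parent in entry.derived_from:
        sources |= set(trace_to_sources(parent, catalogue))
    return sorted(sources)


# Hypothetical catalogue entries.
catalogue = {
    "wages": VariableMetadata(
        "wages", "Gross annual wages", source_dataset="tax_register"
    ),
    "pension": VariableMetadata(
        "pension", "Annual pension income", source_dataset="pension_register"
    ),
    "total_income": VariableMetadata(
        "total_income",
        "Sum of wages and pension income",
        derived_from=("wages", "pension"),
        processing_rule="wages + pension",
    ),
}

print(trace_to_sources("total_income", catalogue))
```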
Efficiency in Population Censuses – the situation of the European register-based 2011 Censuses (🔗);
A Guide to Data Integration for Official Statistics UNECE (🔗);
Using Administrative and Secondary Sources for Official Statistics. A Handbook of Principles and Practices. UNECE 2011 (🔗);
Principles and Recommendations for Population and Housing Censuses. Revision 3, UN 2017 (🔗);
The usability of administrative data for register-based censuses (🔗);
Register-based statistics in the Nordic countries. Review of best practices with focus on population and social statistics. UNECE 2007 (🔗).