This post aims to describe the ethnicity data available in the IDI and improve understanding about the challenges of using it.

Ethnicity is a measure of cultural affiliation and is different from concepts of ancestry, nationality or citizenship (Statistics New Zealand, 2005a). It is a widely used measure of cultural identity in New Zealand and is important in monitoring the health and well-being of ethnic groups as well as the equity impacts of policies and services.

A large body of research has been devoted to understanding the concept of ethnicity and how it is related to colonisation and to structural, interpersonal, and internalised racism. A conceptual understanding of ethnicity based on the literature is essential for robust study design and interpretation of analytical results for any study that examines ethnic differences within a population. This post does not review this literature, but researchers intending to use ethnicity data are encouraged to read Cormack (2010).

Our aims here are to:

  • describe the ethnicity information currently available in the IDI, and
  • discuss some of the challenges in using ethnicity data in the IDI.

 

How is ethnicity measured in New Zealand?

Stats NZ publishes a statistical standard for ethnicity that specifies how ethnicity information should be collected and aggregated. It was developed to ensure that ethnicity is collected consistently across all survey and administrative data collections within the official statistics system. The standard specifies a range of guidelines for collecting ethnicity, including:

  • ethnicity should be self-identified,
  • individuals should have the opportunity to specify multiple ethnicities (there should be space for at least three, but preferably six), and
  • the prioritising of ethnic responses to one per individual should be discontinued.

(For the full standard, see Statistical Standard for Ethnicity.)

The statistical standard recommends using multiple response measures of ethnicity, rather than single or prioritised ethnic identification measures. This is to ensure that individuals who identify with multiple ethnic groups (more than 15% of the population in the 2023 census, and even higher for children and young adults [Statistics New Zealand, 2025a]) are fairly represented.
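The difference between total response counting and a single ethnicity per person can be seen in a small sketch. The people and responses below are invented for illustration and are not IDI data. Under total response counting, each person is counted once in every group they report, so group counts can add up to more than the number of people.

```python
from collections import Counter

# Hypothetical responses: each person lists every Level 1 group they identify with.
responses = {
    "person_a": ["European"],
    "person_b": ["Māori", "European"],
    "person_c": ["Pacific", "Asian"],
    "person_d": ["Māori"],
}


def total_response_counts(responses):
    """Count each person once in every ethnic group they reported."""
    counts = Counter()
    for groups in responses.values():
        counts.update(set(groups))  # set() guards against a group listed twice
    return counts


counts = total_response_counts(responses)

# People reporting more than one group are kept in full, not collapsed to one group.
multi_ethnic = sum(1 for groups in responses.values() if len(set(groups)) > 1)

print(dict(counts))
print("people:", len(responses), "group memberships:", sum(counts.values()))
print("people with multiple ethnicities:", multi_ethnic)
```

Here four people produce six group memberships, which is why percentages based on total response counts can sum to more than 100.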

Ethnicity can be recorded and output at different levels of detail. The standard ethnicity classification is a hierarchical classification of four levels. Level 1 of the classification has six categories: European; Māori; Pacific; Asian; MELAA (Middle Eastern, Latin American and African); and Other. Level 2 has 21 categories, Level 3 has 36 categories and Level 4 has 180 categories (see Statistical Standard for Ethnicity).
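Because the classification is hierarchical, a detailed code can be rolled up to a coarser level. The sketch below assumes that codes are digit strings in which each finer code begins with the code of its parent category, with one digit at Level 1, two at Level 2, three at Level 3 and five at Level 4. That structure, the helper names and the example code values are assumptions made for illustration and should be checked against the published classification before use.

```python
# Assumed code widths per level: Level 1 = 1 digit, Level 2 = 2, Level 3 = 3, Level 4 = 5.
LEVEL_WIDTH = {1: 1, 2: 2, 3: 3, 4: 5}

# Level 1 labels in the order given in the text; the digit codes are an assumption.
LEVEL1_LABELS = {
    "1": "European",
    "2": "Māori",
    "3": "Pacific",
    "4": "Asian",
    "5": "MELAA",
    "6": "Other",
}


def roll_up(code, level):
    """Return the ancestor of a classification code at a coarser level.

    Assumes every code begins with the code of its parent category.
    """
    code = str(code)
    width = LEVEL_WIDTH[level]
    if len(code) < width:
        raise ValueError(f"code {code!r} is coarser than level {level}")
    return code[:width]


def level1_label(code):
    """Label the Level 1 group of any code; unrecognised prefixes return None."""
    return LEVEL1_LABELS.get(roll_up(code, 1))


# Illustrative Level 4 code only; confirm real codes in the classification.
print(roll_up("31111", 2), level1_label("31111"))
```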

 

Where can I find ethnicity information in IDI?

Personal detail table

The personal detail table is the key source for Level 1 ethnicity information. We recommend that users source their Level 1 ethnicity information from the personal detail table unless they have a specific reason to use one of the other sources.

Ethnicity information from different IDI collections is combined in the personal detail table. The method used is the ‘source ranked ethnicity’ method, in which the different ethnicity data sources (for example health, ACC, MSD) are ranked according to quality, and each individual is assigned the ethnic profile from the single highest ranked data source available. The personal detail table records Level 1 ethnicity in total response format, so an individual can have more than one Level 1 ethnic group recorded. The quality ranking order for the sources has been selected on the basis of how well the ethnicity information in each agency’s dataset agrees with census data. Census ethnicity data is given the highest priority, followed by DIA data, Ministry of Health data, Ministry of Education data, and then others (the full ranking can be found in the ID Commons: https://cdck-file-uploads-us1.s3.dualstack.us-west-2.amazonaws.com/flex019/uploads/idcommons/original/2X/7/7e3c61cd4682baa62dcfe9c924eb2887725e8ea6.pdf).
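The logic of the source ranked method can be sketched in a few lines. This is a simplified illustration and not the code Stats NZ uses. The source keys are hypothetical, and only the first four positions in the ranking come from the description above, with "other" standing in for the remaining sources.

```python
# Assumed ranking, highest priority first. Only the first four positions are
# taken from the text; "other" stands in for the rest of the full ranking.
SOURCE_RANK = ["census", "dia", "moh", "moe", "other"]


def source_ranked_profile(profiles_by_source, rank=SOURCE_RANK):
    """Return (source, profile) from the highest ranked source that has data.

    The whole profile comes from one source; profiles are never merged.
    """
    for source in rank:
        profile = profiles_by_source.get(source)
        if profile:  # skip sources with no recorded ethnicity
            return source, set(profile)
    return None, set()


# Hypothetical person with no census or DIA record.
person = {
    "moe": ["Māori", "Pacific"],
    "moh": ["Māori"],
}
source, profile = source_ranked_profile(person)
print(source, sorted(profile))
```

In this example the health record outranks the education record, so the person is assigned Māori only, even though a lower ranked source also records Pacific. This is the mechanism by which a single-source profile can undercount multiple ethnicities.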

The six Level 1 ethnic groups are labelled snz_ethnicity_grp1_nbr to snz_ethnicity_grp6_nbr and are (in order from grp1 to grp6): European; Māori; Pacific; Asian; MELAA; and Other.
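As a sketch of how these six columns might be turned into readable labels, the example below treats one row of the table as a dictionary. It assumes each column holds 1 when the group is recorded and 0 otherwise. That coding, and the helper name, are assumptions to confirm against the IDI metadata.

```python
# Column names and group order are as described in the text.
LEVEL1_COLUMNS = {
    "snz_ethnicity_grp1_nbr": "European",
    "snz_ethnicity_grp2_nbr": "Māori",
    "snz_ethnicity_grp3_nbr": "Pacific",
    "snz_ethnicity_grp4_nbr": "Asian",
    "snz_ethnicity_grp5_nbr": "MELAA",
    "snz_ethnicity_grp6_nbr": "Other",
}


def level1_groups(row):
    """List every Level 1 group flagged in a row (assumes 1 = recorded)."""
    return [label for column, label in LEVEL1_COLUMNS.items() if row.get(column) == 1]


# Hypothetical row for a person recorded as both European and Māori.
row = {
    "snz_ethnicity_grp1_nbr": 1,
    "snz_ethnicity_grp2_nbr": 1,
    "snz_ethnicity_grp3_nbr": 0,
    "snz_ethnicity_grp4_nbr": 0,
    "snz_ethnicity_grp5_nbr": 0,
    "snz_ethnicity_grp6_nbr": 0,
}
groups = level1_groups(row)
print(groups)
```

Returning a list rather than a single label keeps the total response format intact, so people with more than one group are not forced into one category.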

The ethnicity information in the personal detail table has several strengths:

  • it simplifies the use of ethnicity data in IDI by providing a single ethnic profile for each individual, and
  • the ethnic profiles generated from this method have moderate to high agreement rates with the census.

However, it also has some limitations:

  • Ethnicity information is provided at Level 1 only, so it cannot be used to identify sub-group ethnicities within, for example, the Asian or Pacific groups (eg Indian, Samoan).
  • Ethnicity is assigned from a single source, so it may underestimate the number of people with multiple ethnicities, particularly if the selected source limits the number of ethnicities that can be listed, or if an individual reports different ethnicities in different contexts. Also, the single source may involve agency-recorded ethnicity rather than self-identified ethnicity.
  • The ranking of sources is based on how well they match the census, which may not suit all purposes. For example, researchers may want to give higher priority to sources that match the context in which they are working (eg health for a health context), or to sources that were collected recently (for example by de-prioritising birth record ethnicity for adults).

Further detail about the ethnicity method used in the personal detail table can be found in ID Commons: https://idcommons.discourse.group/t/derived-ethnicity-data-in-the-personal-details-table/451

Administrative population census

The experimental administrative population census (APC) provides ethnicity information for the resident population of New Zealand. It provides ethnicity information at all levels from level 1 to level 4. APC tables are available in the data schema, and ethnicity can be located in the data.apc_constants table.

The method used to derive APC ethnicity is similar to the method used in the personal detail table, with the key difference being that data from the most recent census is not used, and data from historical censuses are only used if there is no ethnicity information available in other administrative data sources. Sources are ranked separately for each level of ethnicity (so, for example, the source used for level 4 ethnicity might be different to the source used for level 1 ethnicity). A ‘source’ variable is included so that users can see where each ethnicity profile was sourced from. For more information on APC methods, see the Stats NZ reports ‘Experimental administrative population census: Data sources, methods, and quality (second iteration)’ and ‘Experimental administrative population census (third iteration): Changes to data, methods, and quality’.

A strength of the APC ethnicity is that it provides ethnicity information at more detailed levels (level 2 and higher). This is useful for users who need more detailed ethnic groupings than the level 1 groupings provided in the personal detail table, such as Asian and Pacific sub-groupings. Level 2 ethnicity is available in APC for 99.6% of the estimated resident population from 2012 to 2022, and proportions of level 1 and 2 ethnicities in the 2023 APC are similar to those in the 2023 census at the national level (Statistics New Zealand, 2025b). At levels 3 and 4 there is more missing data, and many ethnic groups are undercounted in the APC compared with the census.

However, there are also some limitations to note:

  • APC ethnicity deprioritises ethnicity information from census sources. This is likely to decrease the quality of the ethnicity information in APC because the ethnicity information in census is of high quality: it is standardised, self-identified, and not connected to the provision of a service.
  • The sources available for detailed ethnicity levels (3 and 4) are limited, and as a result the ethnicity recorded at these levels may have been collected a long time ago or may not be self-identified (eg DIA births). We recommend users check the source of the ethnicity data for their cohort when using the detailed ethnicity data from APC.
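Checking the source of detailed ethnicity for a cohort can be as simple as tabulating the source variable. The sketch below works on records already extracted as dictionaries. The field name ethnicity_source and the source values are hypothetical, so substitute the actual source variable and codes from the APC metadata.

```python
from collections import Counter


def source_shares(records, source_field="ethnicity_source"):
    """Return the share of cohort records drawn from each source.

    `source_field` is a hypothetical field name; records with no source
    recorded are grouped under "missing".
    """
    counts = Counter(record.get(source_field) or "missing" for record in records)
    total = sum(counts.values())
    return {source: n / total for source, n in counts.items()}


# Hypothetical cohort extract.
cohort = [
    {"snz_uid": 1, "ethnicity_source": "dia_births"},
    {"snz_uid": 2, "ethnicity_source": "dia_births"},
    {"snz_uid": 3, "ethnicity_source": "moh"},
    {"snz_uid": 4, "ethnicity_source": None},
]
shares = source_shares(cohort)
for source, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{source}: {share:.0%}")
```

A cohort of adults whose detailed ethnicity comes mostly from birth records, as in this invented example, would be a prompt to consider how old the information is and whether it was self-identified.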

It is possible to combine APC and census data to improve the quality for users requiring Level 2 to 4 ethnicity information. For example, Level 2 ethnicity information can be taken from census data if available, and if not it can be taken from APC (eg Teng et al. 2024, Satherley & Sporle 2024).
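The census-first approach described above amounts to a simple fallback rule. The sketch below is a generic illustration of that rule, not the code used in the cited studies. The function name and the group labels are hypothetical.

```python
def level2_with_fallback(census_level2, apc_level2):
    """Prefer the census Level 2 profile and fall back to APC when census is missing.

    Returns (profile, source) so the origin of each profile stays visible.
    """
    if census_level2:
        return set(census_level2), "census"
    if apc_level2:
        return set(apc_level2), "apc"
    return set(), None


# Hypothetical people: one with a census response, one without.
with_census = level2_with_fallback(["Samoan", "Tongan"], ["Samoan"])
without_census = level2_with_fallback(None, ["Indian"])
print(with_census)
print(without_census)
```

Keeping the source alongside the profile makes it possible to report how much of a cohort relied on each source and to run sensitivity analyses restricted to census-sourced ethnicity.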

Specific data collections

Ethnicity data is available from a range of data collections in the IDI, including: census (2013, 2018 and 2023); health; education; ACC; births; mortality; MSD; and various surveys. These collections vary in how ethnicity information is collected, the level of detail and quality of the data, and the context in which data were provided. We will not review these in detail here and suggest that users wishing to use agency-specific ethnicity information refer to metadata documents for further information.

Ministry for Ethnic Communities’ code module

In June 2025, the Ministry for Ethnic Communities produced the Ethnic Communities’ ethnicities code module table in the IDI. The module provides more detailed ethnic groupings than are available in the personal detail table (eg some level 2 groups), and some custom groupings that are not provided in other sources (eg Continental European, African+). It uses methods similar to the source ranked ethnicity method used in the personal detail table. For more details see https://idcommons.discourse.group/t/ethnic-communities-ethnicities/3931

 

Issues to consider when working with ethnicity data in IDI

Ethnicity data is complex and there are several issues that researchers should keep in mind when using IDI ethnicity data.

  • Ethnicity reporting is influenced by context. Individuals may give a different response to an ethnicity question depending on the setting, the way they are asked, and the reason they are being asked. For example, if a person is concerned that an agency may discriminate against them because of their ethnicity (or one of their ethnicities), they may choose to misreport or withhold an ethnicity. This is especially true when information is being recorded while an individual is trying to access an essential service (such as healthcare, housing, or education).
  • Timing and change. Ethnic identity can change over time (ethnic mobility). Yet most of the ethnicity data in the IDI is not currently time stamped. Some collections such as Census and surveys were collected on a fixed date and thus have a time stamp, but many other collections do not have a date attached to the collection of the ethnicity information. This means that any changes to ethnicity information will overwrite the old information, making it very difficult to examine change in ethnicity identification over time.
  • Self-identification. In some administrative collections ethnicity data may not be self-identified and is instead filled in by a staff member. This is especially true in emergency situations (eg emergency department admissions) and after death (mortality records). It is also often true for children and adolescents whose parents may fill out the information for them.

 

Summary

The IDI contains ethnicity data from multiple sources and is a valuable tool for analyses that include ethnicity. The choice of which ethnicity data to use will depend on the specific research question being asked and the data available. In general, users needing Level 1 ethnicity information should use the personal detail table as a default. Users needing Level 2 or more detailed information have several options with different costs and benefits. In all cases it is important that researchers understand the broader issues around ethnicity data in New Zealand and consider these in their analyses.

 

References

Bycroft, C, Reid, G, McNally, J, Gleisner, F (2016). Identifying Māori populations using administrative data: A comparison with the census. Available from https://www.stats.govt.nz/research/identifying-maori-populations-using-administrative-data-a-comparison-with-the-census-2

Crengle S, Lay-Yee R, Davis P, Pearson J. 2005. A Comparison of Māori and Non-Māori Patient Visits to Doctors: The National Primary Medical Care Survey (NatMedCa): 2001/02. Report 6. Wellington: Ministry of Health.

Cormack D (2010). The practice and politics of counting: ethnicity data in official statistics in Aotearoa/New Zealand. Wellington: Te Rōpū Rangahau Hauora a Eru Pōmare. Available from http://www.otago.ac.nz/wellington/otago600095.pdf

Cormack D & McLeod M. (2010). Improving and maintaining quality in ethnicity data collections in the health and disability sector. Te Rōpū Rangahau Hauora a Eru Pōmare: Wellington. Available from http://natlib.govt.nz/records/22945073?search%5Bpath%5D=items&search%5Btext%5D=Improving+and+maintaining+quality+in+ethnicity+data+collections+in+the+health+and+disability+sector

Goodyear, RK (Statistics New Zealand) (2009). The differences within, diversity in age structure between and within ethnic groups. Wellington: Statistics New Zealand.

Kukutai, Tahu (Tahatū Consulting), Statistics New Zealand (2008). Ethnic Self-prioritisation of Dual and Multi-ethnic Youth in New Zealand, Wellington: Statistics New Zealand. https://ndhadeliver.natlib.govt.nz/delivery/DeliveryManagerServlet?dps_pid=IE1632044

Reid, G, Bycroft, C, Gleisner, F (2016). Comparison of ethnicity information in administrative data and the census. Available from https://www.stats.govt.nz/research/comparison-of-ethnicity-information-in-administrative-data-and-the-census

Satherley, N., & Sporle, A. (2024). Reporting health outcomes for distinct Pacific populations in New Zealand: Assessing the Integrated Data Infrastructure as a resource for detailed ethnic population outcomes. Available from inzight.co.nz/projects/pacific-health-reporting

Statistics New Zealand (2005a). Statistical standard for ethnicity. Available from www.stats.govt.nz.

Statistics New Zealand (2005b). Ethnicity New Zealand Standard Classification. Available from http://aria.stats.govt.nz/aria/?_ga=2.41804391.472874614.1606854906-850314034.1554417710#ClassificationView:uri=http://stats.govt.nz/cms/ClassificationVersion/YVqOcFHSlguKkT17

Statistics New Zealand (2006). The Impact of Prioritisation on the Interpretation of Ethnicity Data.

Statistics New Zealand (2025a). Aotearoa Data Explorer: 2023 Census – Number of ethnic groups by age. Available from https://explore.data.stats.govt.nz/vis?fs[0]=2023%20Census%2C0%7CEthnicity%252C%20culture%2C%20and%20identity%23CAT_ETHNICITY_CULTURE_AND_IDENTITY%23&fs[1]=Number%20of%20ethnic%20groups%20specified%2C0%7CTotal%20-%20number%20of%20ethnic%20groups%20specif

Statistics New Zealand (2025b). Quality of ethnicity data in the experimental administrative population census (APC): High level summary. Census Transformation. Wellington, New Zealand: Statistics New Zealand.

Teng, A., Underwood, L., & Milne, B. J. (2024). Statistical code from manuscript: How does the level of functional impairment vary in individuals with non-communicable disease and comorbidity? Cross-sectional analysis of linked census and administrative data [Statistical code]. Available from https://hdl.handle.net/10523/22912

 

 

By Andrea Teng, Sheree Gibb, Andrew Sporle, Barry Milne

Version: Original 4 September 2017. Updated 28 November 2025.