Details

This three-day course provides a practical, computer-based introduction to some of the key skills and knowledge needed to undertake data analysis in the Integrated Data Infrastructure (IDI). Participants work within a real Datalab environment with real IDI data. The first day covers induction and confidentiality training with StatsNZ, which enables safe access to the IDI environment, for those who have not completed it previously. If you have previously completed this training, you may skip the first half day of the course and join after lunch.

The remainder of the course covers:

    • Logging in and navigating the IDI environment
    • How to link tables together
    • How to choose and create a base population
    • Adding demographic and other information to your population
    • Understanding missing data in the IDI
    • Removing people from the population due to death or out-migration
    • Finding and using metadata
    • Preparing data for output
    • Using the output checking tool and submitting output.

Prerequisites

To enrol in this course, you must have completed VHIN Introduction to Research in the IDI, either in the same programme or in a previous year of courses.

We believe that a solid understanding of IDI structure, processes, and Māori data sovereignty is essential for working safely with IDI data, and this material is covered in the one-day introductory course. It is not covered in this practical course due to time constraints.

You must also have some experience with writing statistical programming code. The minimum level required is being able to write code to perform basic tasks such as selecting and filtering data, creating new variables, and creating a data set. This experience can be in R, SAS, Stata, SQL, or similar software, but experience with point-and-click analysis only, e.g. SPSS, is not sufficient.

The course requires some basic programming in SQL, but prior SQL experience is not required. If you are new to SQL, you will be sent some short introductory SQL exercises to complete prior to the course.

Audience

This course is appropriate for researchers and analysts who are planning to use IDI data in a Datalab environment. It covers basic skills and is best suited to those with no or very little IDI experience. Supervisors, managers, and others who will not be working directly with IDI data will get more benefit from our theory-based course, VHIN Introduction to Research in the IDI.