Longitudinal and International Study of Adults (LISA)
Detailed information for 2016 (Wave 3)
Every 2 years
The Longitudinal and International Study of Adults collects information from people across Canada about their jobs, education, health and family. The study is interested in how changes in these areas affect people's lives. This study aims to help improve education, employment, training and social services in Canada.
Data release - December 4, 2018
The Longitudinal and International Study of Adults (LISA) is a study that examines changes in Canadian society over time. LISA uses household interviews to collect information from approximately 34,000 Canadians aged 15 and older in more than 11,000 households. LISA aims to improve our understanding of what is happening in the lives of Canadians so we can see what services they require, and what kinds of information they need to support their decision-making about today and the future. LISA results could shed light on:
- Long-term impacts of postsecondary education;
- Transitions in the workplace and across the labour force;
- Impacts of complex issues such as job loss and poor health;
- Standards of living for retirees and changes that may occur over time.
The following LISA Waves currently exist:
- 2012 (Wave 1)
- 2014 (Wave 2)
- 2016 (Wave 3)
Data users include all levels of government, researchers, educators, learning institutions and organizations. These data inform the development of services and help ensure that effective policies and services reach the people who need them most.
- Education, training and learning
- Families, households and housing
- Income, pensions, spending and wealth
Data sources and methodology
The Longitudinal and International Study of Adults (LISA) covers the population living in Canada's ten provinces as of the first Wave of the survey (2011), plus their future children. Excluded from the survey's coverage are those living in Canada's territories, as well as those who at the time of Wave 1 were: living on reserves and other Aboriginal settlements in the provinces; official representatives of foreign countries living in Canada and their families; members of religious and other communal colonies; members of the Canadian Armed Forces stationed outside of Canada; living full-time in institutions, for example, inmates of correctional facilities and chronic care patients living in hospitals and nursing homes; or living in other collective dwellings. Altogether these exclusions represent approximately 2% of the population.
The survey was conducted by a Statistics Canada interviewer via a Computer Assisted Personal Interview (CAPI). With CAPI, as the questions were developed, the associated logical flow into and out of the questions was programmed. This included specifying the type of answer required, the minimum and maximum values, online edits associated with the question, and what to do in case of item non-response.
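As a rough illustration of what programming an answer type, a minimum and maximum value, and a rule for item non-response can look like, here is a minimal Python sketch. The question name, the bounds, and the `NOT_STATED` code are hypothetical and are not taken from the LISA instrument.

```python
from dataclasses import dataclass
from typing import Optional

NOT_STATED = -9  # hypothetical reserved code for item non-response


@dataclass(frozen=True)
class NumericQuestion:
    """A numeric question with the range edit applied at entry time."""
    name: str
    minimum: int
    maximum: int

    def capture(self, raw: Optional[str]) -> int:
        # Blank or missing answer: record item non-response instead of failing.
        if raw is None or raw.strip() == "":
            return NOT_STATED
        value = int(raw)  # raises ValueError for non-numeric input
        if not self.minimum <= value <= self.maximum:
            raise ValueError(
                f"{self.name}: {value} is outside [{self.minimum}, {self.maximum}]"
            )
        return value


# Hypothetical question: usual hours worked per week.
weekly_hours = NumericQuestion("weekly_hours", minimum=0, maximum=168)
print(weekly_hours.capture("40"))  # accepted as entered
print(weekly_hours.capture(""))    # recorded as NOT_STATED
```

In an interactive instrument the `ValueError` would typically be shown to the interviewer so the answer can be confirmed or corrected on the spot.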
The first Wave was conducted from November 2011 to June 2012 and the infrastructure was tested in 2011 by the Income Statistics Division, Special Surveys Division, Collection Planning and Management Division, and the Collection Systems and Infrastructure Division within Statistics Canada.
Content for future Waves consists of core content, rotating content, and theme content. Core content remains stable over time. Rotating content alternates from Wave to Wave. Theme content is only asked in one Wave.
Wave 2 was conducted from January to June 2014 via CAPI, which was tested prior to collection. Content added to Wave 2 was either derived from existing questionnaires or developed with experts. New content was qualitatively tested by Statistics Canada's Questionnaire Design Resource Centre (QDRC) in February and April 2013, in English and French, to ensure respondent understanding of the questions and to identify any errors.
Wave 3 was conducted from January to May 2016 via CAPI, which was tested prior to collection. Content added to Wave 3 was either derived from existing questionnaires or developed with experts. New content was qualitatively tested by Statistics Canada's QDRC in February 2015, in English and French, to ensure respondent understanding of the questions and to identify any errors.
This is a sample survey with a longitudinal design.
The LISA sample has a stratified multi-stage, multi-phase design. The LISA survey frame was constructed from dwellings containing households that responded to the 2011 Census of Population and that were not eligible for the National Household Survey (NHS), which was being run at the same time as LISA. It was necessary to restrict the frame to Census respondent dwellings because the household composition was required by the sampling plan. A total of eight 2011 Census Response Database (RDB) tables were extracted to create the frame. The first step was to create the NHS flag to identify all Census responding dwellings selected for the NHS. Inactive dwellings, dwellings with an invalid questionnaire, and non-respondent dwellings were then excluded. Exclusions from the target population were made next, after which dwellings with only temporary or foreign residents remaining were excluded from the frame. Finally, checks were done to ensure that there were no dwelling duplicates or person duplicates on the file.
In Wave 1, LISA was integrated with the Programme for the International Assessment of Adult Competencies (PIAAC), also known as the International Study of Adults (ISA). The two surveys shared a portion of their samples and the data collection activities were integrated, which impacted the LISA sample design. The target populations of the two surveys differed, in that the ISA target population covered only 16 to 65 year-olds, whereas the LISA target population covered individuals of all ages.
Consequently, the frame was first stratified by eligibility status for ISA. It was then further stratified by province and urban/non-urban status. The urban/non-urban boundaries were defined so that the communities defined as 'urban' were large enough to guarantee an ISA general population sample size of at least 15 dwellings.
In the stratum eligible for ISA, the two surveys partially shared their samples. In the non-urban strata, geographic clusters were selected at the first stage of sampling with probability proportional to their 16 to 65 year-old population. At the second stage, dwellings were selected in two phases. First, a sample of dwellings was selected with probability proportional to the anticipated number of 16 to 65 year-old household members. The ISA sample of individuals was selected from these dwellings. A subsample of these dwellings was then selected using a systematic sampling scheme to constitute the LISA sample. All members of the households residing in this final sample of dwellings became members of the LISA sample and formed the subsample called LISA/ISA. In urban strata, the sample design differed only in that there was no geographical clustering.
In the stratum ineligible for ISA, the provincial and urban/non-urban stratification and the geographic clustering were identical to that described above. The selection of dwellings in this stratum, however, was done in only one phase using simple systematic sampling. Again, all members of the households in the selected dwellings became members of the LISA sample and formed the LISA-only sample.
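The two selection mechanisms named above, selection with probability proportional to size and systematic sampling, can be sketched in Python as follows. This is only an illustration: the source does not say which probability-proportional-to-size algorithm was used (a systematic variant is assumed here), and the function names, sizes, and sample sizes are hypothetical.

```python
import random
from typing import List, Sequence


def systematic_pps(sizes: Sequence[float], n: int, rng: random.Random) -> List[int]:
    """Select n unit indices with probability proportional to size.

    Assumes no unit is larger than the sampling interval, so that no
    unit can be selected more than once.
    """
    interval = sum(sizes) / n
    start = rng.random() * interval
    points = [start + k * interval for k in range(n)]
    selected: List[int] = []
    cumulative, next_point = 0.0, 0
    for index, size in enumerate(sizes):
        cumulative += size
        # A unit is selected when a selection point falls in its size range.
        while next_point < n and points[next_point] < cumulative:
            selected.append(index)
            next_point += 1
    return selected


def systematic_sample(n_units: int, step: int, rng: random.Random) -> List[int]:
    """Select every step-th position after a random start."""
    start = rng.randrange(step)
    return list(range(start, n_units, step))


rng = random.Random(2016)
# Hypothetical anticipated number of 16 to 65 year-olds in eight dwellings.
anticipated_members = [3, 1, 4, 1, 5, 9, 2, 6]
# Phase 1: dwellings selected with probability proportional to size.
phase_one = systematic_pps(anticipated_members, n=3, rng=rng)
# Phase 2: a systematic subsample of the phase-1 dwellings.
phase_two = [phase_one[i] for i in systematic_sample(len(phase_one), step=2, rng=rng)]
print(phase_one, phase_two)
```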
Data collection for this reference period: 2016-01-04 to 2016-05-30
Responding to this survey is voluntary.
Data are collected directly from survey respondents and extracted from administrative files.
The LISA data collection instrument contains five parts, comprising both survey components and administrative data components:
1. The household roster is a survey component collected at the first contact with a LISA responding household. It is a preliminary interview that captures basic demographic information (such as age, sex, and marital status) about all persons residing in the dwelling as well as their relationship to every other household member. Only one person in the household is required to complete the roster.
2. The questionnaire component is the main interview administered to every person aged 15 years or older who is part of the LISA sample. It contains the subject matter questions essential to the LISA dataset.
3. The PIAAC component includes the PIAAC assessments of literacy, numeracy and problem-solving in a technology-rich environment as well as a survey component that examines the skills used at work and in everyday life. This component is available exclusively in 2012 (Wave 1).
4. The income component contains selected income information retrieved from sample members' T1 income tax returns. An imputed income value is included in the event that an income value could not be retrieved for certain respondents. In LISA 2016 (Wave 3), this component includes figures from the 2015 income tax returns.
5. The historical administrative data component provides historical and contemporary information for respondents and their family members about their income (from their tax records in the T1FF), their earnings and employers (from their T4) and their pensions (from the PPIC file). For immigrants, it also includes information from the Longitudinal Immigration Database (IMDB). Additional years of administrative data are matched to LISA on a consistent basis.
SURVEY DATA COMPONENT
Collection method: By a Statistics Canada interviewer via a Computer Assisted Personal Interview. Interviews could also be completed by telephone if the case met at least one of the following conditions: the respondent refused a personal interview; the respondent was not available for another personal visit to finish an interview already started; the interviewer had spent so many hours in the household that further time there would jeopardize complete response; or high travel costs prohibited a return visit to the household (the interviewer must have completed rostering the household). Other telephone interviews were possible, but they needed to be preapproved by the senior interviewer.
Capture method: Responses to survey questions are captured directly by the interviewer at the time of the interview using a computerized questionnaire. The computerized questionnaire reduces processing time and costs associated with data entry, transcription errors and data transmission. The response data are encrypted to ensure confidentiality and sent via modem to the appropriate Statistics Canada Regional Office. From there they are transmitted over a secure line to Ottawa for further processing.
Method of initial contact: Letter
Follow-up method: In-person at door/telephone
Use of proxy reporting: N/A
Language(s) offered to potential respondents: English and French
Average time required to complete interview/survey: 35 minutes per person
ADMINISTRATIVE DATA COMPONENT
Historical administrative data are available only for respondents who consented to linkage with their administrative data. For those who consented, data are linked to the T1 Personal Master File, T4 Summary and Supplementary Files, Pension Plans in Canada (PPIC) files, the T1 Family File (T1FF), and the Longitudinal Immigration Database (IMDB).
T1 Family File (T1FF): 1982 - 2015
T4: 2000 - 2015
Pension Plans in Canada (PPIC): 2000 - 2015
Longitudinal Immigration Database (IMDB): 1980 - 2015
The LISA processing begins by amassing the data from the collection process. To minimize the chance of introducing errors in the data, output data are validated after each processing step.
An initial cleaning of the data is performed to remove duplicate records. At the pre-edit step, data are modified at the variable level. Variables could be dropped, re-coded using standard codes, re-sized, and 'Mark all that apply' variables de-strung. Flow edits are applied that replicate the flow of the questionnaire. Valid skip values are assigned. Certain variables are coded by the Operations and Integration Division (OID).
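A flow edit of the kind described here can be sketched in a few lines of Python. The variable names and the numeric codes for "valid skip" and "not stated" below are hypothetical, chosen only to show the logic of replicating a questionnaire skip pattern.

```python
from typing import Dict, Optional

# Hypothetical codes, chosen only for illustration.
YES, NO = 1, 2
VALID_SKIP = 996   # question legitimately not asked
NOT_STATED = 999   # question asked but no answer recorded


def apply_flow_edit(record: Dict[str, Optional[int]]) -> Dict[str, Optional[int]]:
    """Replicate one skip pattern: hours are asked only of employed persons."""
    edited = dict(record)  # leave the input record untouched
    if edited["employed"] == NO:
        # The questionnaire skips the hours question for this respondent.
        edited["weekly_hours"] = VALID_SKIP
    elif edited["weekly_hours"] is None:
        # The question was in scope but has no answer.
        edited["weekly_hours"] = NOT_STATED
    return edited


print(apply_flow_edit({"employed": NO, "weekly_hours": None}))
print(apply_flow_edit({"employed": YES, "weekly_hours": None}))
```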
During data processing, derived variables are created when the following situations occur:
1. Survey questions that are asked in an easy-to-answer format and therefore require conversion to a common unit. For example, for questions on the timing of specific events, the respondent may provide either the date or their age at the time, whichever is easier. When an age is provided, it is converted to a year.
2. A concept that requires a complex combination of data, such as the variables describing the economic and census family, or those related to labour market status.
3. Open-ended questions that have to be converted to a standard classification system, such as the National Occupational Classification (NOC) and the North American Industry Classification System (NAICS).
4. Questions where respondents select 'Mark all that apply'. In this case, the data file retains a derived variable for each possible response.
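Situations 1 and 4 above lend themselves to a short Python sketch. The conversion rule (birth year plus age), the comma-separated storage format for 'Mark all that apply' answers, and all function and variable names are assumptions made for illustration; the source does not specify them.

```python
from typing import Dict, Optional


def event_year(birth_year: int,
               year: Optional[int] = None,
               age: Optional[int] = None) -> Optional[int]:
    """Return the year of an event reported either as a year or as an age."""
    if year is not None:
        return year
    if age is not None:
        # Assumed conversion: the event happened in the year the
        # respondent reached the reported age.
        return birth_year + age
    return None


def destring_mark_all(response: str, n_options: int) -> Dict[str, int]:
    """Split a 'Mark all that apply' string into one 0/1 variable per option."""
    chosen = {int(code) for code in response.split(",") if code.strip()}
    return {f"opt_{k:02d}": int(k in chosen) for k in range(1, n_options + 1)}


print(event_year(birth_year=1980, age=25))
print(destring_mark_all("1,3", n_options=4))
```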
A certain number of derived variables are made available to researchers in the data file. Data users should consult the 2016 LISA codebooks for a detailed description of the derived variables.
Given the numerous administrative data sources (T1FF, T4, PPIC and IMDB) linked to LISA, there are minimal consistency edits applied to these files during processing, as such a large task is beyond the scope of this project.
For income data, all respondents are matched to the tax data file unless they refuse to have their information linked. Data obtained from the tax data file are complete and do not require imputation. Only in the absence of such data will income figures be imputed. For most of the observations that need imputation, imputation is done for totals.
For income data, donor imputation by the nearest neighbour method is used and performed primarily with Statistics Canada's Canadian Census Edit and Imputation System (CANCEIS). Amounts received through certain government programs, such as universal child care benefit and child tax benefits, are derived from other information (i.e. number of children in the household) using a deductive imputation method.
When missing, earnings variables related to a respondent's current job are imputed by the nearest neighbour method using CANCEIS.
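The idea behind nearest neighbour donor imputation can be shown with a small Python sketch. This is not CANCEIS and does not reproduce its distance function or donor selection rules; the matching variables, the unscaled squared Euclidean distance, and the record layout are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence

Record = Dict[str, Optional[float]]


def nearest_neighbour_impute(records: Sequence[Record],
                             target: str,
                             matching: Sequence[str]) -> List[Record]:
    """Fill missing `target` values from the closest complete record (donor)."""
    donors = [r for r in records if r[target] is not None]
    if not donors:
        raise ValueError("no donor records available")
    completed: List[Record] = []
    for record in records:
        filled = dict(record)  # copy so the input is not modified
        if filled[target] is None:
            # Donor = complete record with the smallest squared distance
            # on the matching variables (unscaled, for simplicity).
            donor = min(
                donors,
                key=lambda d: sum((d[m] - filled[m]) ** 2 for m in matching),
            )
            filled[target] = donor[target]
        completed.append(filled)
    return completed


sample = [
    {"age": 30, "weekly_hours": 40, "earnings": 50000.0},
    {"age": 60, "weekly_hours": 10, "earnings": 20000.0},
    {"age": 32, "weekly_hours": 38, "earnings": None},
]
imputed = nearest_neighbour_impute(sample, "earnings", ["age", "weekly_hours"])
print(imputed[2]["earnings"])
```

A production system would normally scale the matching variables and restrict donors to records that pass the edits; both steps are left out here.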
In order for estimates produced from survey data to be representative of the covered population and not just the sample itself, data users must incorporate the survey weights in their calculations. A survey weight is given to each respondent included in the final sample. This weight corresponds to the number of persons in the entire population that are represented by the respondent.
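In practice, incorporating the weights means multiplying each respondent's value by their weight before summing. A minimal Python sketch, using made-up values and weights rather than LISA data:

```python
from typing import Sequence


def weighted_total(values: Sequence[float], weights: Sequence[float]) -> float:
    """Estimate a population total: each value counts `weight` times."""
    return sum(v * w for v, w in zip(values, weights))


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Estimate a population mean or proportion."""
    return weighted_total(values, weights) / sum(weights)


# Made-up data: 1 = employed, 0 = not employed, with each person's weight.
employed = [1, 0, 1]
weights = [1200.0, 800.0, 1000.0]

print(weighted_total(employed, weights))  # estimated number of employed persons
print(weighted_mean(employed, weights))   # estimated employment rate
```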
The LISA weighting strategy has been updated for the release of LISA 2016 (Wave 3). This new weighting strategy applies to 2016 (Wave 3) data, future Waves of the survey, as well as retroactively for 2014 (Wave 2). None of the changes affect the 2012 (Wave 1) weights, and so they will remain unchanged.
Four different weights were produced at Wave 3 and a calibration was done for each:
- All-waves responding person weight: calibration was performed at a seven-region level (Atlantic provinces, Quebec, Ontario, Manitoba, Saskatchewan, Alberta and British Columbia). Sixteen control totals were used: 12 age by sex control totals, two control totals related to economic entity size and two related to household size.
- All-waves PIAAC/ISA responding person weight: calibration was performed at a four-region level (Atlantic provinces, Quebec, Ontario and Western provinces). Sixteen control totals were used: ten age by sex control totals, two household size control totals and four education level control totals.
- Wave-t enumerated person weight (EPW): calibration was performed at a seven-region level (Atlantic provinces, Quebec, Ontario, Manitoba, Saskatchewan, Alberta and British Columbia). Eighteen control totals were used: 14 age by sex control totals, two control totals related to economic entity size and two related to household size.
- Wave-t responding person weight: calibration was done at the same geographical levels as the EPW, with the same age by sex, economic entity size and household size control groups, but with the addition of 'male 13 or 14 years old', 'female 13 or 14 years old', 'male 11 or 12 years old' and 'female 11 or 12 years old' to reflect the individuals who turned 15 years of age between Waves 1 and 3.
Education levels were used to calibrate the ISA responding person sample in order to have consistency between the estimates computed from that sample and from the full ISA survey sample. In addition, analysis performed with the ISA person sample is likely to concentrate on the use of test scores, which could be highly correlated with education levels. As a result, less detailed age and geographic control totals than for the LISA responding person sample had to be used to enable the use of the education levels. Education levels were not retained as control totals to calibrate the LISA responding person sample because the size of the sample did not support the use of the education control totals along with the more detailed age and geographic control totals.
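To show what calibrating weights to several sets of control totals involves, here is a minimal Python sketch. The source does not say which calibration method LISA uses; raking (iterative proportional fitting) is swapped in here because it is simple to show. The groups, control totals, and starting weights are invented.

```python
from typing import Dict, List, Sequence


def rake(weights: Sequence[float],
         margins: Sequence[Sequence[str]],
         controls: Sequence[Dict[str, float]],
         iterations: int = 50) -> List[float]:
    """Adjust weights so that, within each margin, group sums match controls.

    margins[m][i] is the group of unit i on margin m;
    controls[m][g] is the known population total for group g on margin m.
    """
    adjusted = list(weights)
    for _ in range(iterations):
        for groups, totals in zip(margins, controls):
            # Current weighted size of each group on this margin.
            sums: Dict[str, float] = {}
            for group, weight in zip(groups, adjusted):
                sums[group] = sums.get(group, 0.0) + weight
            # Scale every unit so its group sum hits the control total.
            adjusted = [w * totals[g] / sums[g] for g, w in zip(groups, adjusted)]
    return adjusted


# Invented example: four respondents, two margins.
sex = ["M", "M", "F", "F"]
household_size = ["1", "2+", "1", "2+"]
calibrated = rake(
    weights=[1.0, 1.0, 1.0, 1.0],
    margins=[sex, household_size],
    controls=[{"M": 60.0, "F": 40.0}, {"1": 30.0, "2+": 70.0}],
)
print(calibrated)
```

After calibration, the weights of the two male respondents sum to the male control total, and likewise for every other group on either margin.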
Under the new weighting strategy, there will be four sets of weights: two longitudinal weights and two cross-sectional weights. These apply to certain subgroups identified in LISA:
Enumerated person: All individuals in responding households, including children and non-responding individuals but excluding all temporary sample members (that is, household members that are neither original (Wave 1) sample members nor descendants of the original sample members).
Responding person: Responding individuals in responding Wave-t households.
ISA person: Individuals within the responding Wave 1 LISA households who were selected for, and responded to, the PIAAC/ISA.
LONGITUDINAL WEIGHTS (ALL-WAVES)
For the longitudinal weights, one set applies to all responding individuals (AWRPW) and the other applies to the subsample of Wave 1 PIAAC/ISA respondents (AWIRPW). The names for the longitudinal weights are unchanged under the new weighting strategy: they continue to apply to individuals who respond in each Wave of the survey up to the current Wave, and are designed to make the sample representative of the 2012 Census population living in the ten provinces.
WAVE-SPECIFIC WEIGHTS (WAVE-T)
Wave-t weights are assigned only to persons who are considered respondents at the current Wave. One set applies to all enumerated individuals (EPW) and the other to all responding individuals (RPW). Because these weights are not designed to make the sample representative of the cross-sectional Canadian population in a given Wave-year, the former cross-sectional weights are now named wave-t weights to reflect their application to only a single Wave of the survey. Using these weights makes the sample representative of the 2012 population plus their descendants in a given Wave-year.
Quality assurance measures were implemented at every collection and processing step. Measures included the recruitment of qualified interviewers, training provided to interviewers for specific survey concepts and procedures, observations of interviews to correct questionnaire design problems and instruction misinterpretations, procedures to ensure that data capture and coding errors were minimized, and edit quality checks to verify the processing logic. Data are verified to ensure internal consistency and are also compared to other sources.
Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
Once processing is complete, all personal identifiers are removed from the file. That file will be held indefinitely by the Income Statistics Division.
All personal identifiers required to link to tax files will be retained for the life of the survey's processing, but are only accessible to those directly involved in processing. Once the processing lifecycle is complete, the file is deleted.
Revisions and seasonal adjustment
This methodology does not apply to this survey.
The standard errors for LISA are estimated using the bootstrap method. For each of the four weights, a corresponding set of 1,000 bootstrap weights is also produced which can be used for estimating sampling variance. LISA uses a multi-stage, multi-phase survey design with calibration which means that there is no simple formula that can be used to calculate variance estimates. Therefore, an approximate method was needed. The Rao-Wu bootstrap method (Rao and Wu, 1987) is used because the sample design and calibration need to be taken into account when calculating variance estimates.
The quality of estimates produced with LISA is measured with the coefficient of variation (CV), produced using bootstrap weights. The CV magnitude will depend on the domain of interest and the prevalence of the characteristic.
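The way a set of bootstrap weights yields a CV can be sketched in Python as follows. The sketch assumes one common convention, in which the variance is the average squared deviation of the bootstrap estimates from the full-sample estimate; the data, the two bootstrap replicates (LISA provides 1,000), and the function name are made up.

```python
import math
from typing import Sequence, Tuple


def bootstrap_cv(values: Sequence[float],
                 weights: Sequence[float],
                 bootstrap_weights: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Return the weighted total and its coefficient of variation."""
    estimate = sum(v * w for v, w in zip(values, weights))
    # Recompute the same estimate once with each set of bootstrap weights.
    replicates = [
        sum(v * bw for v, bw in zip(values, replicate))
        for replicate in bootstrap_weights
    ]
    # Assumed convention: deviations are taken from the full-sample estimate.
    variance = sum((r - estimate) ** 2 for r in replicates) / len(replicates)
    return estimate, math.sqrt(variance) / estimate


# Made-up data: 1 = has the characteristic, 0 = does not.
has_degree = [1, 0, 1]
main_weights = [10.0, 10.0, 10.0]
replicate_weights = [[12.0, 10.0, 10.0], [8.0, 10.0, 10.0]]

total, cv = bootstrap_cv(has_degree, main_weights, replicate_weights)
print(total, f"{cv:.1%}")
```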
NATIONAL COEFFICIENTS OF VARIATION FOR KEY VARIABLES, COMPUTED FROM THE LISA SAMPLES (Note 1), WAVE 3
Variable/Coefficient of variation - Using all-waves LISA responding person sample (weight AWRPW):
Number of persons with a university degree (Note 2): 2.08%
Number of employed persons in the reference week: 0.82%
Number of immigrants: 2.82%
Median personal total income before tax: 1.21%
Note 1: When using the full LISA responding person sample or the full LISA enumerated person sample, the domain for the estimates is the population aged 15 years and older.
Note 2: A person has a university degree if EHHL_Q05 in (10, 11, 12, 13, 14).
NATIONAL COEFFICIENTS OF VARIATION FOR KEY VARIABLES, COMPUTED FROM THE ISA PERSON SAMPLE (Note 3), WAVE 3
Variable/Coefficient of variation - Using all-waves ISA responding person sample (weight AWIRPW):
Number of persons with a university degree: 1.87%
Number of employed persons in the reference week: 1.17%
Number of immigrants: 4.02%
Median personal total income before tax: 2.00%
Note 3: When using the full ISA responding person sample, the domain for the estimates is the population 20 to 69 years old in Wave 3.