Longitudinal Survey of Immigrants to Canada (LSIC)

Detailed information for 2005 (Wave 3)







This survey was designed to provide information on how new immigrants adjust to life in Canada and to understand the factors that can help or hinder this adjustment. The data will be used to evaluate the current services available and help improve them.

Data release - April 30, 2007


There exists a growing need for information on recent immigrants to Canada. As part of adapting to life in Canada, many immigrants face challenges such as finding suitable accommodation, learning or becoming more fluent in one or both of Canada's official languages, participating in the labour market or accessing education and training opportunities. The results of this survey will provide indicators of how immigrants are meeting these and other challenges. While integration may take many years, the LSIC is designed to examine the first four years of settlement, a time when newcomers establish economic, social and cultural ties to Canadian society. To this end, the objectives of the survey are two-fold: to study how new immigrants adjust to life in Canada over time; and, to provide information on the factors that can facilitate or hinder this adjustment.

Topics covered in the survey include language proficiency, housing, education, foreign credential recognition, employment, health, values and attitudes, the development and use of social networks, income, and perceptions of settlement in Canada.

Since immigration is a shared jurisdiction between national and provincial departments, the information gathered through the LSIC will benefit many groups, including federal and provincial government departments, immigrant settlement agencies, non-governmental organizations and researchers. Survey results will also play an important role in planning and developing programs that will assist future immigrants settling in Canada.

Note that a public use microdata file will not be created for this survey. Data from the survey may be accessed through Statistics Canada's Research Data Centres (RDC). For more information visit Statistics Canada's Research Data Centres site at http://www.statcan.ca/english/rdc/index.htm.


  • Educational attainment
  • Education, training and skills
  • Immigration and ethnocultural diversity (formerly Ethnic diversity and immigration)
  • Integration of newcomers
  • Labour market and income
  • Mobility and migration
  • Population and demography

Data sources and methodology

Target population

The target population for the survey consists of immigrants who meet all of the following criteria:

- arrived in Canada between October 1, 2000 and September 30, 2001;
- were age 15 years or older at the time of landing;
- landed from abroad, having applied through a Canadian Mission Abroad.

Individuals who applied and landed from within Canada are excluded from the survey. These people may have been in Canada for a considerable length of time before officially "landing" and would therefore likely show quite different integration characteristics from those of recent arrivals. Refugees claiming asylum from within Canada are also excluded from the scope of the survey.

The LSIC population of interest is immigrants in the target population who are still living in Canada at the time of the interview.
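The criteria above can be read as a simple eligibility check. The following is a minimal sketch only: the function and argument names are hypothetical and are not LSIC variable names, and the dates and age threshold are the ones stated in the list above.

```python
from datetime import date

# Landing window and minimum age taken from the criteria listed above.
WINDOW_START = date(2000, 10, 1)
WINDOW_END = date(2001, 9, 30)
MIN_AGE_AT_LANDING = 15


def in_target_population(landing_date, age_at_landing, applied_abroad):
    """Return True when a record meets all three target-population criteria.

    Argument names are hypothetical, chosen for illustration only.
    """
    return (
        WINDOW_START <= landing_date <= WINDOW_END
        and age_at_landing >= MIN_AGE_AT_LANDING
        and applied_abroad
    )


def in_population_of_interest(in_target, living_in_canada_at_interview):
    """The population of interest is the target population still in Canada."""
    return in_target and living_in_canada_at_interview
```

Keeping the two checks separate mirrors the distinction drawn in the text between the target population (fixed at landing) and the population of interest (which depends on the interview date).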

Instrument design

Pilot Test Questionnaire:

The first step in the development of the LSIC was a pilot test, which took place in the spring of 1997. Wherever possible, to ensure comparability, questions for the pilot test were based on existing questions from other Statistics Canada surveys. Members of the LSIC team met with federal departments and a committee of external researchers and academics during the development of the survey instrument. External reviews were conducted by Citizenship and Immigration Canada and Statistics Canada analysts, as well as by internal management. Focus group testing was conducted, as were consultations with an external advisory committee and a project steering committee. Comments from these sources were compiled and reviewed, and the instrument was modified to incorporate them into the final product.

Wave 1:

The pilot test questionnaire was used as a basis for Wave 1. Content was modified and expanded upon based on the results of the pilot test. New content was designed in consultation with other federal departments as well as experts in the field of immigration. Additional focus groups were conducted on the changes prior to Wave 1 collection. The questionnaire was put through a rigorous testing process including modular, integrated and end to end testing.

Wave 2:

The questionnaire for Wave 2 is very similar to that of Wave 1. Some questions were modified in order to improve the quality of the collected data. In addition, questions relating to experiences prior to arrival in Canada in Wave 1 were replaced with questions to obtain more detailed information on experiences after arrival in Canada. Focus groups were conducted on the changes prior to Wave 2 collection. The questionnaire was put through a rigorous testing process including modular, integrated and end to end testing.

Wave 3:

The questionnaire for Wave 3 is very similar to that of Wave 2. Some questions were modified in order to improve the quality of the collected data. Focus groups were conducted on the changes prior to Wave 3 collection. The questionnaire was put through a rigorous testing process including modular, integrated and end to end testing.

Please refer to Chapter 6.0 (Longitudinal Comparability) of the Wave 3 User Guide for detailed information.


Sampling

This is a sample survey with a longitudinal design.

To adequately represent the different immigration patterns in Canada over a one-year period, the sample is made up of 12 cohorts, consisting of 12 independent monthly samples selected over a period of 12 consecutive months. The sampling frame for the LSIC is an administrative database of all landed immigrants to Canada that comes from Citizenship and Immigration Canada. The sample was created using a two-stage stratified sampling method. The first stage involved the selection of Immigrating Units (IU) using a probability proportional to size method. The second stage involved the selection of one IU member within each selected IU. The selected member of the IU is called the longitudinal respondent (LR). Only the LR is followed throughout the survey.
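The two-stage selection described above can be sketched as follows. This is an illustration under stated assumptions, not the LSIC procedure: the text does not say which probability-proportional-to-size variant was used (systematic PPS is assumed here), what the size measure was, or how the member within each IU was chosen (an equal-probability draw is assumed). All function names are hypothetical.

```python
import random


def pps_systematic(sizes, n, rng):
    """Select n unit indices with probability proportional to size.

    Systematic PPS: lay the sizes end to end, then take n equally spaced
    points from a random start. Assumes no size exceeds the sampling
    interval, so no unit can be selected twice.
    """
    total = sum(sizes)
    step = total / n
    start = rng.uniform(0, step)
    points = [start + k * step for k in range(n)]

    selected = []
    cumulative = 0.0
    i = 0
    for index, size in enumerate(sizes):
        cumulative += size
        # Every selection point that falls inside this unit's span picks it.
        while i < n and points[i] < cumulative:
            selected.append(index)
            i += 1
    return selected


def select_longitudinal_respondent(members, rng):
    """Second stage: pick one member of a selected IU.

    An equal-probability draw is an assumption made for this sketch.
    """
    return rng.choice(members)
```

In practice the first stage would be run separately within each stratum (month of landing, intended province of destination and class of immigrant).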

The survey involves a longitudinal design with immigrants being interviewed at three different times: at six months, two years, and four years after landing in Canada. The sample design has been developed using a "funnel-shaped" approach (i.e. a monotonic design). Only immigrants who responded to the Wave 1 interview were therefore traced for the Wave 2 interview, and only those who responded to the Wave 2 interview were traced for the Wave 3 interview.
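The monotonic tracing rule can be expressed as a chain of set intersections. A minimal sketch, with hypothetical unit identifiers and a hypothetical function name:

```python
def funnel(wave1_sample, responded):
    """Return the set of units traced at each wave under a monotone design.

    `responded` maps a wave number to the set of unit ids that responded
    in that wave (hypothetical ids, for illustration only).
    """
    traced = {1: set(wave1_sample)}
    for wave in (2, 3):
        # Only respondents of the previous wave are carried forward.
        traced[wave] = traced[wave - 1] & responded[wave - 1]
    return traced
```

A unit that misses one wave never re-enters the sample, which is why the traced set can only shrink from wave to wave.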

The first stratification variable used was the month of landing in Canada. Within each month, two other stratification variables were used: the intended province of destination and the class of immigrant. The sample was divided into two components - the core and the additional samples. The core sample represents the target population, while the additional samples target specific sub-populations. The determination of the sample size for Wave 1 was based on several sample attrition hypotheses applied to the Wave 3 minimum sample size requirement. The complete sample for Wave 3 is a subset of the Wave 2 sample, consisting solely of immigrants responding in Wave 2.

Please refer to Chapter 7.0 (Sample Selection) of the Wave 3 User Guide for detailed information.

Data sources

Data collection for this reference period: 2004-11-01 to 2005-11-30

Responding to this survey is voluntary.

Data are collected directly from survey respondents.

All of the information for the sampled individuals was collected using a computer-assisted interviewing (CAI) application, in a face-to-face interview or, when that was not possible, by telephone. Interviews were conducted in one of 15 languages, which together cover approximately 93% of the new immigrant population in Canada. For Wave 3, interviews lasted approximately 65 minutes.

The Wave 3 LSIC population of interest consists of immigrants who have been in Canada for four years. For a variety of reasons, new immigrants are a highly mobile population during their first years in Canada. Respondent tracking is therefore necessary.

In each wave, first contact with the selected respondents was established using the address and telephone number provided on the sample file by Head Office. The respondent's identity was verified in two ways: by matching the birth date and by matching the landing date. If the interviewer was unable to locate the respondent, the case was transferred to a designated tracing team in the regional offices for further follow-up.

In the LSIC, proxy interviews are not allowed. The only exception is in the Income module, where the person most knowledgeable (PMK) regarding family income is asked to answer the questions.

Please refer to Chapter 8.0 (Data Collection) of the Wave 3 User Guide for detailed information.


Error detection

The information for the sampled individuals was collected in a face-to-face or telephone interview using a computer-assisted interviewing application. As such, it was possible to build various edits and checks into the questionnaire in order to ensure that high quality information was collected.

Flow pattern edits were automatically built into the CAI system. For example, for questions pertaining to a spouse/partner or child, the CAI system would automatically refer to the relationship information of all household members collected in the Entry Module to determine whether the longitudinal respondent had a spouse/partner or child living with them. Some general consistency edits were included as part of the CAI system, and interviewers were able to "slide back" to previous questions to correct for inconsistencies. Range edits were also built into the CAI system for questions asking for numeric values. If numbers entered were outside the range, the system generated a pop-up window.
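The flow and range edits described above might look like the following in code. This is a hedged sketch: the relationship labels, bounds and function names are hypothetical, and the actual CAI application is not described in enough detail here to reproduce it.

```python
def ask_spouse_questions(household_relationships):
    """Flow edit: route to spouse/partner questions only if one is listed.

    `household_relationships` holds each household member's relationship
    to the longitudinal respondent (labels are hypothetical).
    """
    return any(r in ("spouse", "partner") for r in household_relationships)


def range_edit(value, low, high):
    """Range edit: return None when the value is in range, else a warning.

    The warning text stands in for the pop-up window mentioned above.
    """
    if low <= value <= high:
        return None
    return f"Value {value} is outside the expected range {low} to {high}."
```

Returning a message rather than raising an error reflects the interactive setting: the interviewer is prompted and can correct the entry on the spot.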

The data were processed by applying edit rules to identify missing, invalid or inconsistent data. Each question was examined to verify the presence of a valid code. Non-response values from the CAI system were also recoded to standard non-response codes for refusals, don't know and not stated.

Consistency editing is carried out to verify the relationship between two or more variables. An example of a consistency problem is a personal income for the LR that is higher than the total family income, of which it is only a part. The relationship edit step ensures a clean file and consistency in the relationships among members of the household. For example, some respondents whose spouse had children reported being "unrelated" to the children.
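The income example above can be written as a one-line consistency check. A minimal sketch with hypothetical names; how flagged records were actually resolved is not stated here.

```python
def income_consistency_flag(personal_income, family_income):
    """Flag records where the LR's personal income exceeds family income.

    Missing values (None) are not flagged; they are assumed to be
    handled by the non-response edits instead.
    """
    if personal_income is None or family_income is None:
        return False
    return personal_income > family_income
```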

Please refer to Chapter 9.0 (Data Processing) of the Microdata User Guide for detailed information.


Imputation

For Wave 2 and Wave 3, the mass imputation strategy of Wave 1 could have been repeated. Doing so, however, could have introduced longitudinal inconsistencies. These inconsistencies would have arisen for two reasons: a given longitudinal respondent (LR) could be complete in one wave and partial in another; or, for an LR who is partial in both waves, independent imputation might choose a different donor in each wave. These inconsistencies are of particular concern when imputing roster data (data files containing events, such as employment history), as they are used in the derivation of other variables.

In order to overcome these limitations and to save potential processing time, a longitudinal mass imputation technique was established. The mass imputation at Wave 2 and Wave 3 was longitudinal in the sense that imputation was done simultaneously for data collected at all three waves.

The first step was to identify which modules had to be imputed longitudinally. For this purpose, longitudinal completion codes were generated. At Wave 3, an LR was deemed a longitudinal complete respondent if and only if the LR was a complete respondent in all three waves. Otherwise, the LR was considered a longitudinal partial respondent.

For longitudinal partial non-response in Wave 3, mass imputation for the incomplete modules was carried out using the nearest-neighbour donor technique. For a longitudinal partial respondent for whom more than one module was incomplete, the same donor record was used for all the incomplete modules. Note that only complete and edited records were used as potential donors. To keep consistency within modules, the complete set of variables for a given module of the donor was imputed into the recipient record.
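The completion rule and the single-donor, whole-module imputation described above can be sketched as follows. The distance function and matching variables used in the LSIC are not given here, so squared Euclidean distance on caller-supplied numeric variables is an assumption, as are the record layout and all names.

```python
def longitudinal_status(complete_by_wave):
    """Classify an LR from per-wave completion flags for Waves 1 to 3."""
    return "complete" if all(complete_by_wave) else "partial"


def nearest_donor(recipient, donors, match_vars):
    """Return the complete record closest to the recipient.

    Distance is squared Euclidean on `match_vars` (an assumption).
    """
    def distance(donor):
        return sum((donor[v] - recipient[v]) ** 2 for v in match_vars)

    return min(donors, key=distance)


def impute_modules(recipient, donors, match_vars, incomplete_modules):
    """Copy every incomplete module, whole, from one single donor.

    Records are dicts; each module name maps to a dict of its variables.
    `donors` should contain only complete and edited records.
    """
    donor = nearest_donor(recipient, donors, match_vars)
    imputed = dict(recipient)
    for module in incomplete_modules:
        # The full set of module variables is taken from the same donor.
        imputed[module] = dict(donor[module])
    return imputed
```

Using one donor for all incomplete modules, and copying each module in full, keeps the imputed values consistent both within and across modules.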

Please refer to Chapter 11.0 (Imputation) of the Wave 3 User Guide for detailed information.


Estimation

The first step of the weighting process was to predict for the unresolved units whether they would have been in the population of interest or not. Through models, using the information available on the frame, information collected in Wave 1 and Wave 2 and information on the resolved units in Wave 3, the status of the unresolved units was predicted.

The LSIC weighting strategy is based on a series of cascading adjustments. The final longitudinal weight is obtained by applying various adjustments to the initial weight. Four weights are involved in the weighting process and together compose the final weight: the initial weight, the non-response adjustment weight, the unresolved adjustment weight and, finally, the post-stratification weight.

For the Wave 3 weighting, the initial weight is the Wave 2 weight before post-stratification.

The non-response and unresolved weight adjustment classes were derived using logistic regression models predicting the response probability and the resolution probability, respectively.
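A single step in such a cascade can be sketched as a weighting-class adjustment. The sketch below assumes the classes have already been formed (for example, from the predicted probabilities of a logistic model) and uses the common ratio adjustment within each class; the exact LSIC formula is not given here, and the names are hypothetical.

```python
def class_adjustment(weights, classes, keep):
    """Inflate kept units' weights so each class total is preserved.

    `weights`, `classes` and `keep` are parallel lists. Units with
    keep False get weight 0, and their weight is redistributed over
    the kept units of the same class.
    """
    total, kept = {}, {}
    for w, c, k in zip(weights, classes, keep):
        total[c] = total.get(c, 0.0) + w
        if k:
            kept[c] = kept.get(c, 0.0) + w
    return [
        w * total[c] / kept[c] if k else 0.0
        for w, c, k in zip(weights, classes, keep)
    ]
```

Applying the function once with response indicators and once with resolution indicators, each with its own classes, gives two of the cascading adjustments; the order and details of the real process are described in the User Guide.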

The purpose of post-stratification is to ensure consistency between the estimates produced from the survey and population estimates produced by an independent external source. Since the LSIC Wave 3 final weights give estimates of the Wave 3 population of interest and not the target population and since there is no independent external administrative source on this subject, the post-stratification totals must be estimated. For Wave 1, a post-stratification file was available (in other words, immigrant population sizes in the post-strata were known from an external source); the post-stratification totals for Wave 2 and Wave 3 had to be estimated. For the Wave 3 sample, the population of interest consists of all immigrants in the LSIC who are still in Canada four years after their arrival.
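Post-stratification itself is a ratio adjustment to control totals. A minimal sketch, assuming the control totals (known for Wave 1, estimated for Wave 2 and Wave 3) are supplied by the caller; the names are hypothetical.

```python
def post_stratify(weights, strata, control_totals):
    """Scale weights so each post-stratum sums to its control total.

    `weights` and `strata` are parallel lists; `control_totals` maps
    each post-stratum label to its population total.
    """
    sums = {}
    for w, s in zip(weights, strata):
        sums[s] = sums.get(s, 0.0) + w
    return [w * control_totals[s] / sums[s] for w, s in zip(weights, strata)]
```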

Please refer to Chapter 12.0 (Treatment of Total Non-response and Weighting) of the Wave 3 User Guide for detailed information.

Quality evaluation

For the LSIC, quality assurance measures were implemented at each phase of the data collection and processing cycles to monitor the quality of the data. These measures included precise interviewer training with respect to the survey procedures and questionnaire, observation of interviews to detect questionnaire design problems or misinterpretation of instructions, monitoring of final coding, and coding and edit quality checks to verify the processing logic.

Please refer to Chapter 13.0 (Data Quality and Coverage) of the Microdata User Guide for detailed information.

Disclosure control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

For the LSIC there will not be a public use microdata file (PUMF). The data for the survey may be accessed at one of Statistics Canada's Research Data Centres (RDC).

Direct identifiers (name, address, etc.) and indirect identifiers (combinations of variables identifying a respondent) are suppressed. Prior to release, steps are taken to ensure that there is a minimum number of units in each cell, which protects confidentiality, and that the coefficient of variation is low enough for the data to be published.

Revisions and seasonal adjustment

This methodology does not apply to this survey.

Data accuracy

The variance of an estimate is a good indicator of the quality of the estimate. An estimate with a high variance is considered unreliable. To judge whether a variance is large, a relative measure of variability is used, namely the coefficient of variation (CV). The CV is defined as the ratio of the square root of the variance to the estimate. The square root of the variance is also known as the standard deviation. Unlike the variance, the CV allows the analyst to compare estimates of different magnitudes on the same scale. As a result, it is possible to assess the quality of any estimate with the CV.

Most importantly, variance or the CV is required for statistical tests such as hypothesis tests, which determine if two estimates are statistically different. Consequently, variance or CV calculation is mandatory.

It is almost impossible to derive an exact formula to calculate the variance for the LSIC due to the complex sample design, weight adjustments and post-stratification. A very good way to approximate the true variance is to use a replication method, namely the bootstrap method. This method is known to correctly approximate the true value of the variance. A file containing 1,000 bootstrap weights is available. Variance calculation using the 1,000 bootstrap weights involves calculating the estimate with each of these 1,000 weights and then calculating the variance of the resulting 1,000 estimates.
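The calculation described above can be sketched in a few lines. This is an illustration only: the layout of the bootstrap weight file is not described here, so each set of bootstrap weights is assumed to be a list parallel to the data; dividing by the number of replicates (rather than that number minus one) and centring on the mean of the replicate estimates are also assumptions. The function names are hypothetical.

```python
from math import sqrt


def weighted_total(values, weights):
    """Estimate a population total from values and survey weights."""
    return sum(v * w for v, w in zip(values, weights))


def bootstrap_variance(values, final_weights, bootstrap_weights,
                       estimator=weighted_total):
    """Return (estimate, variance) using replicate bootstrap weights.

    `bootstrap_weights` is a list of weight vectors, one per replicate.
    The variance is the mean squared deviation of the replicate
    estimates around their own mean.
    """
    estimate = estimator(values, final_weights)
    replicates = [estimator(values, bw) for bw in bootstrap_weights]
    b = len(replicates)
    mean_rep = sum(replicates) / b
    variance = sum((r - mean_rep) ** 2 for r in replicates) / b
    return estimate, variance


def coefficient_of_variation(estimate, variance):
    """CV: square root of the variance divided by the estimate."""
    return sqrt(variance) / estimate
```

Passing a different `estimator` (a weighted mean or proportion, for instance) reuses the same replicate logic, which is the main practical appeal of the bootstrap approach for a complex design.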

A number of software packages and tools are capable of producing bootstrap variance estimates. The choice of tool depends on the type of analysis and the level of precision required.

Please refer to Chapter 15.0 (Variance Calculation) of the Wave 3 User Guide for detailed information.

