Longitudinal Survey of Immigrants to Canada (LSIC)

Detailed information for 2001 (Wave 1)







This survey was designed to provide information on how new immigrants adjust to life in Canada and to understand the factors that can help or hinder this adjustment. The data will be used to evaluate the current services available and help improve them.

Data release - September 4, 2003


There exists a growing need for information on recent immigrants to Canada. As part of adapting to life in Canada, many immigrants face challenges such as finding suitable accommodation, learning or becoming more fluent in one or both of Canada's official languages, participating in the labour market or accessing education and training opportunities. The results of this survey will provide indicators of how immigrants are meeting these and other challenges. While integration may take many years, the LSIC is designed to examine the first four years of settlement, a time when newcomers establish economic, social and cultural ties to Canadian society. To this end, the objectives of the survey are two-fold: to study how new immigrants adjust to life in Canada over time; and, to provide information on the factors that can facilitate or hinder this adjustment.

Topics covered in the survey include language proficiency, housing, education, foreign credential recognition, employment, health, values and attitudes, the development and use of social networks, income, and perceptions of settlement in Canada.

Since immigration is a jurisdiction shared between federal and provincial departments, the information gathered through the LSIC will be beneficial to many groups, including federal and provincial government departments, immigrant settlement agencies, non-governmental organizations and researchers. Survey results will also play an important role in planning and developing programs that will assist future immigrants settling in Canada.

Note that a public use microdata file will not be created for this survey. Data from the survey may be accessed through Statistics Canada's Research Data Centres (RDC). For more information visit Statistics Canada's Research Data Centres site at http://www.statcan.ca/english/rdc/index.htm.


Subjects

  • Educational attainment
  • Education, training and skills
  • Immigration and ethnocultural diversity (formerly Ethnic diversity and immigration)
  • Integration of newcomers
  • Labour market and income
  • Mobility and migration
  • Population and demography

Data sources and methodology

Target population

The target population for the survey consists of immigrants who meet all of the following criteria:

- arrived in Canada between October 1, 2000 and September 30, 2001;
- were age 15 years or older at the time of landing;
- landed from abroad, having applied through a Canadian mission abroad.

Individuals who applied and landed from within Canada are excluded from the survey. These people may have been in Canada for a considerable length of time before officially "landing" and would therefore likely demonstrate quite different integration characteristics from those recently arrived in Canada. Refugees claiming asylum from within Canada are also excluded from the scope of the survey.

The LSIC population of interest is immigrants in the target population who are still living in Canada at the time of the interview.

Instrument design

Pilot Test Questionnaire:

The first step in the development of the LSIC was a pilot test, which took place in the spring of 1997. Wherever possible, to ensure comparability, questions for the pilot test were based on existing questions from other Statistics Canada surveys. Members of the LSIC team met with federal departments and a committee of external researchers and academics during the development of the survey instrument. External reviews were conducted by Citizenship and Immigration Canada and Statistics Canada analysts, as well as by internal management. Focus group testing was conducted, as were consultations with an external advisory committee and a project steering committee. The instrument was then modified by compiling and reviewing the comments from these various sources and incorporating them into the final product.

Wave 1:

The pilot test questionnaire was used as a basis for Wave 1. Content was modified and expanded based on the results of the pilot test. New content was designed in consultation with other federal departments as well as experts in the field of immigration. Additional focus groups were conducted on the changes prior to Wave 1 collection. The questionnaire was put through a rigorous testing process including modular, integrative and end-to-end testing.


Sampling

This is a sample survey with a longitudinal design.

To adequately represent the different immigration patterns in Canada over a one-year period, the sample is made up of 12 cohorts, consisting of 12 independent monthly samples selected over a period of 12 consecutive months. The sampling frame for the LSIC is an administrative database of all landed immigrants to Canada that comes from Citizenship and Immigration Canada. The sample was created using a two-stage stratified sampling method. The first stage involved the selection of Immigrating Units (IU) using a probability proportional to size method. The second stage involved the selection of one IU member within each selected IU. The selected member of the IU is called the longitudinal respondent (LR). Only the LR is followed throughout the survey.
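The two-stage selection described above can be sketched roughly as follows. This is only an illustration, not the LSIC procedure: the size measure (number of eligible members in the IU), the systematic form of probability-proportional-to-size selection, the record layout and the function names are all assumptions made for the example.

```python
import random

def pps_systematic(sizes, n, rng):
    """Stage 1: pick n unit indices with probability proportional to size.

    Uses systematic PPS (an assumed variant). A unit whose size exceeds the
    sampling step could be hit more than once.
    """
    total = sum(sizes)
    step = total / n
    start = rng.random() * step              # random start in [0, step)
    points = [start + k * step for k in range(n)]
    chosen, cumulative, i = [], 0.0, 0
    for index, size in enumerate(sizes):
        cumulative += size
        while i < n and points[i] < cumulative:
            chosen.append(index)
            i += 1
    return chosen

def select_longitudinal_respondents(units, n, seed=0):
    """Stage 2: pick one member (the LR) at random within each selected IU."""
    rng = random.Random(seed)
    sizes = [len(unit["members"]) for unit in units]   # assumed size measure
    selected = pps_systematic(sizes, n, rng)
    return [(units[i]["iu_id"], rng.choice(units[i]["members"])) for i in selected]
```

Under the assumptions of this sketch (size equal to the number of members, and no unit larger than the sampling step), every person ends up with the same overall selection probability, because the larger chance of selecting a big IU is offset by the smaller chance of being the one member chosen within it.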

The survey has a longitudinal design in which immigrants are interviewed at three points: six months, two years, and four years after landing in Canada. The sample design follows a "funnel-shaped" (monotonic) approach: only immigrants who responded to the Wave 1 interview were traced for the Wave 2 interview, and only those who respond to the Wave 2 interview will be traced for the Wave 3 interview.

The first stratification variable used was the month of landing in Canada. Within each month, two other stratification variables were used: the intended province of destination and the class of immigrant. The sample was divided into two components - the core and the additional samples. The core sample represents the target population, while the additional samples target specific sub-populations. The determination of the sample size for Wave 1 was based on several sample attrition hypotheses applied to the Wave 3 minimum sample size requirement.
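As a rough illustration of how attrition hypotheses translate a Wave 3 requirement into a Wave 1 sample size, the sketch below divides the required final sample by the product of assumed retention rates. The function name and every number used with it are hypothetical; the actual hypotheses and requirements are documented in the User Guide, not here.

```python
import math

def required_wave1_sample(wave3_minimum, retention_rates):
    """Inflate the Wave 3 minimum by the assumed retention rate of each step.

    retention_rates holds the assumed proportion of units retained at each
    successive step (for example Wave 1 to Wave 2, then Wave 2 to Wave 3).
    """
    retained = 1.0
    for rate in retention_rates:
        retained *= rate
    return math.ceil(wave3_minimum / retained)
```

For example, with a purely hypothetical requirement of 5,000 Wave 3 respondents and 80% retention assumed at each of two steps, `required_wave1_sample(5000, [0.8, 0.8])` gives 7,813.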

Please refer to Chapter 5.0 (Survey Methodology) of the Wave 1 User Guide for detailed information.

Data sources

Data collection for this reference period: 2001-04-01 to 2002-03-31

Responding to this survey is voluntary.

Data are collected directly from survey respondents.

All of the information for the sampled individuals was collected in a face-to-face interview, or by telephone when a face-to-face interview was not possible, using a computer-assisted interviewing (CAI) application. Interviews were conducted in one of 15 languages. The 15 languages selected cover approximately 93% of the immigrant population in Canada. For Wave 1, interviews lasted approximately 90 minutes.

The Wave 1 LSIC target population consists of immigrants who have been in Canada for only six months. For a variety of reasons, new immigrants are a highly mobile population during their first six months in Canada. Respondent tracking is therefore necessary.

The first contact was established with the selected respondents using the address and telephone number provided on the sample file by Head Office. The respondent's identity was verified by matching two items: birth date and landing date. If the interviewer was unable to locate the respondent, the case was transferred to a designated tracing team in the regional offices for further follow-up.

In the LSIC, proxy interviews are not allowed. The only exception is in the Income module, where the person most knowledgeable (PMK) regarding family income is asked to answer the questions.

Please refer to Chapter 6.0 (Data Collection) of the Wave 1 User Guide for detailed information.


Error detection

The information for the sampled individuals was collected in a face-to-face or telephone interview using a computer-assisted interviewing application. As such, it was possible to build various edits and checks into the questionnaire in order to ensure that high quality information was collected.

Flow pattern edits were automatically built into the CAI system. For example, for questions pertaining to a spouse/partner or child, the CAI system would automatically refer to the relationship information of all household members collected in the Entry Module to determine whether the longitudinal respondent had a spouse/partner or child living with them. Some general consistency edits were included as part of the CAI system, and interviewers were able to "slide back" to previous questions to correct for inconsistencies. Range edits were also built into the CAI system for questions asking for numeric values. If numbers entered were outside the range, the system generated a pop-up window.
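The flow and range edits described above might look something like the following in code. This is a minimal sketch only: the CAI application itself is not described in this document, so the function names, the relationship codes and the example range are assumptions.

```python
def has_spouse_or_child(household):
    """Flow edit: decide whether spouse/partner or child questions apply.

    household is a list of member records whose "relationship" field gives
    the relationship to the longitudinal respondent (assumed coding).
    """
    return any(m["relationship"] in {"spouse", "partner", "child"} for m in household)

def range_edit(value, low, high):
    """Range edit: return None if the value is acceptable, else a warning.

    The warning text stands in for the pop-up window shown to the interviewer.
    """
    if low <= value <= high:
        return None
    return f"Value {value} is outside the expected range {low}-{high}; please confirm."
```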

The data were processed by applying edit rules to identify missing, invalid or inconsistent data. Each question was examined to verify the presence of a valid code. Non-response values from the CAI system were also recoded to standard non-response codes for refusals, don't know and not stated.

Consistency editing was carried out to verify the relationship between two or more variables. An example of a consistency problem is a case where the personal income of the LR is higher than the total family income, of which it is only a part. The relationship edit step ensures a clean file and consistency in the relationships among members of the household. For example, some respondents whose spouse had children reported being "unrelated" to the children.
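The income example can be expressed as a simple consistency check. The sketch below assumes hypothetical field names (`personal_income`, `family_income`) and treats records with a missing value as not checkable; it is not the edit rule actually applied to the LSIC file.

```python
def income_consistency_edit(record):
    """Return True when the record passes the edit, False when it fails.

    Fails when the respondent's personal income exceeds the total family
    income, of which it should be only a part.
    """
    personal = record.get("personal_income")
    family = record.get("family_income")
    if personal is None or family is None:
        return True  # cannot be checked here; left to the non-response steps
    return personal <= family
```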

Please refer to Chapter 7.0 (Data Processing) of the Wave 1 User Guide for detailed information.


Imputation

Partial and item non-response are corrected by various imputation techniques. For item non-response, deterministic imputation was performed. Deterministic imputation is the process by which another source of data, covering a similar concept and the exact same respondent, is used to fill in the missing value. For the LSIC, if a respondent did not report information for certain pre-determined variables (e.g. mother tongue, date of birth), the information was imputed from the Field Operations Support System (FOSS). The FOSS is an administrative database of all landed immigrants to Canada that comes from Citizenship and Immigration Canada.
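A minimal sketch of deterministic imputation from an administrative record is shown below. The field names, the record layout and the idea of returning imputation flags are assumptions for illustration; the actual list of variables and the FOSS layout are not given in this document.

```python
def deterministic_impute(survey_record, admin_record, fields):
    """Fill missing survey fields from the same person's administrative record.

    Returns the completed record and a flag per field showing whether the
    value was imputed.
    """
    completed = dict(survey_record)
    flags = {}
    for field in fields:
        missing = completed.get(field) is None
        available = admin_record.get(field) is not None
        if missing and available:
            completed[field] = admin_record[field]
            flags[field] = True
        else:
            flags[field] = False   # reported values are never overwritten
    return completed, flags
```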

For partial non-response, mass imputation for the incomplete modules was carried out using the nearest-neighbour donor technique. For a partial respondent for whom more than one module was incomplete, the same donor record was used for all the incomplete modules. Note that only complete and edited records were used as potential donors. To keep consistency within variables, the complete set of variables for a given module of the donor was imputed into the recipient record. In total, mass imputation to complete partial responses was performed on 5% of all responding records.
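The donor approach can be sketched as follows, under stated assumptions: the matching variables, the squared-difference distance and the record layout are hypothetical, since the document does not specify how the nearest neighbour was defined. The sketch does follow the two rules given above: one donor serves all incomplete modules of a recipient, and each module is copied as a complete set of variables.

```python
def nearest_donor(recipient, donors, match_vars):
    """Pick the complete record closest to the recipient on the matching variables."""
    def distance(donor):
        return sum((recipient[v] - donor[v]) ** 2 for v in match_vars)
    return min(donors, key=distance)

def mass_impute(recipient, donors, match_vars, incomplete_modules):
    """Copy every incomplete module, in full, from a single nearest donor."""
    donor = nearest_donor(recipient, donors, match_vars)
    completed = dict(recipient)
    for module in incomplete_modules:
        completed[module] = dict(donor[module])   # whole module, not field by field
    return completed, donor["id"]
```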

Two imputation techniques were also performed specifically for the Income Module: nearest-neighbour donor imputation for some fields and median imputation for certain identified outliers.

Please refer to Chapter 9.0 (Imputation) of the Wave 1 User Guide for detailed information.


Estimation

The first step of the weighting process was to predict for the unresolved units whether they would have been in the population of interest or not. Through models, using the information available on the frame and from the resolved units, the status of the unresolved units was predicted.

The LSIC weighting strategy is based on a series of cascading adjustments. The final longitudinal weight is obtained by applying various adjustments to the basic initial design weight. Four weights are involved in the weighting process and together compose the final weight: the design weight, the non-response adjustment weight, the resolved adjustment weight and, finally, the post-stratification weight.

At the time of selection, an initial design weight was assigned to the selected person. It is simply the inverse of the probability of selection of the selected immigrants.

The non-response and the unresolved weight adjustment classes were derived using logistic regression models predicting, respectively, the response probability and the resolution probability.
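The cascade of adjustments can be illustrated with the sketch below. It assumes that each adjustment enters as a multiplicative factor and that, within an adjustment class, the factor is the ratio of the weighted count of all units to the weighted count of responding (or resolved) units. Both are common conventions rather than details confirmed by this document, and the function names are hypothetical.

```python
def design_weight(selection_probability):
    """Initial design weight: the inverse of the probability of selection."""
    return 1.0 / selection_probability

def class_adjustment(weights, responded):
    """Adjustment factor for one class: weighted total over weighted respondents."""
    total = sum(weights)
    responding = sum(w for w, r in zip(weights, responded) if r)
    return total / responding

def final_weight(selection_probability, nonresponse_adj, unresolved_adj, poststrat_adj):
    """Apply the adjustments in sequence to the design weight."""
    return (design_weight(selection_probability)
            * nonresponse_adj * unresolved_adj * poststrat_adj)
```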

The purpose of post-stratification is to ensure consistency between the estimates produced from the survey and population estimates produced by an independent external source. The post-stratification file still represents the target population. The file was created with the same definitions and criteria as the survey frame, but with more up-to-date files. For example, it included new units, excluded deaths and/or updated missing or improperly specified variables that were on the survey frame. The post-stratification variables used were: age group, sex, place of birth (collapsed by world area) and class of immigrant.
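A minimal sketch of the post-stratification step follows. It assumes a simple ratio adjustment within each post-stratum and a hypothetical record layout with `stratum` and `weight` fields; the control totals would come from the post-stratification file.

```python
from collections import defaultdict

def post_stratify(records, control_totals):
    """Scale weights so that each post-stratum sums to its control total."""
    weight_sums = defaultdict(float)
    for record in records:
        weight_sums[record["stratum"]] += record["weight"]
    adjusted = []
    for record in records:
        stratum = record["stratum"]
        factor = control_totals[stratum] / weight_sums[stratum]
        adjusted.append({**record, "weight": record["weight"] * factor})
    return adjusted
```

After the adjustment, the weighted count in each post-stratum (for example an age group by sex by place of birth by immigrant class cell) matches the external total, which is what makes the survey estimates consistent with the independent source.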

Please refer to Chapter 10.0 (Treatment of Total Non-response and Weighting) of the Wave 1 User Guide for detailed information.

Quality evaluation

For the LSIC, quality assurance measures were implemented at each phase of the data collection and processing cycles to monitor the quality of the data. These measures included precise interviewer training with respect to the survey procedures and questionnaire, observation of interviews to detect questionnaire design problems or misinterpretation of instructions, monitoring of final coding, and coding and edit quality checks to verify the processing logic.

Please refer to Chapter 11.0 (Data Quality and Coverage) of the Wave 1 User Guide for detailed information.

Disclosure control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

For the LSIC there will not be a public use microdata file (PUMF). The data for the survey may be accessed at one of Statistics Canada's Research Data Centres (RDC).

Direct identifiers (name, address, etc.) and indirect identifiers (combinations of variables that could identify a respondent) are suppressed. Prior to release, steps are taken to ensure that each cell contains a minimum number of units, which protects confidentiality, and that the coefficient of variation is low enough for the data to be published.
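The release checks on tabulated output could be sketched as below. The thresholds are parameters because this document does not state them; the table layout and the function name are likewise assumptions.

```python
def suppress_cells(table, min_count, max_cv):
    """Return the table with unsafe or unreliable cells replaced by None.

    table maps a cell label to (estimate, number of sampled units, CV in %).
    A cell is suppressed if it has too few units or too high a CV.
    """
    released = {}
    for cell, (estimate, n_units, cv) in table.items():
        if n_units < min_count or cv > max_cv:
            released[cell] = None
        else:
            released[cell] = estimate
    return released
```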

Revisions and seasonal adjustment

This methodology does not apply to this survey.

Data accuracy

Two user-friendly tools, both using the bootstrap weights, have been developed to help users calculate the variance and the coefficients of variation (CV) for their estimates. The first is a set of SAS and Stata macros that calculate the variance using the bootstrap weights. The second, which provides approximate coefficients of variation, is the Excel-based CV extraction module (CVEM). This application, developed with Excel macros and accessed through a user-friendly interface, allows users to extract the desired information in two ways: by describing the domain of interest with the nine available variables, or by specifying the size of the domain.
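For readers who want to see the arithmetic behind a bootstrap-weight variance, the sketch below estimates a weighted total, recomputes it with each set of bootstrap weights, and takes the mean squared deviation of the replicates from the full-sample estimate. This is a generic illustration written in Python, not a port of the SAS or Stata macros; the function names and the exact variance formula are assumptions.

```python
import math

def weighted_total(values, weights):
    """Weighted total of a variable."""
    return sum(v * w for v, w in zip(values, weights))

def bootstrap_cv(values, final_weights, bootstrap_weights):
    """Return (estimate, variance, CV in %) using bootstrap replicate weights.

    bootstrap_weights is a list of weight vectors, one per replicate.
    """
    estimate = weighted_total(values, final_weights)
    replicates = [weighted_total(values, bw) for bw in bootstrap_weights]
    variance = sum((r - estimate) ** 2 for r in replicates) / len(replicates)
    cv = 100.0 * math.sqrt(variance) / estimate
    return estimate, variance, cv
```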

Please refer to Chapter 13.0 (Variance Calculation) of the Wave 1 User Guide for detailed information.

