National Longitudinal Survey of Children and Youth (NLSCY)

Detailed information for 2002-2003 (Cycle 5)




Every 2 years

Record number:


Data release - February 21, 2005


The National Longitudinal Survey of Children and Youth (NLSCY) is a long-term study of Canadian children that follows their development and well-being from birth to early adulthood. The NLSCY began in 1994 and is jointly conducted by Statistics Canada and Human Resources and Skills Development Canada (HRSDC), formerly known as Human Resources Development Canada (HRDC).

The study is designed to collect information about factors influencing a child's social, emotional and behavioural development and to monitor the impact of these factors on the child's development over time.

The survey covers a comprehensive range of topics including the health of children, information on their physical development, learning and behaviour as well as data on their social environment (family, friends, schools and communities).

Information from the NLSCY is used by a variety of people at all levels of government, in universities and in policy-making organizations.


  • Child development and behaviour
  • Children and youth
  • Education
  • Education, training and learning
  • Health and well-being (youth)

Data sources and methodology

Target population

The target population comprises the non-institutionalized civilian population (aged 0 to 11 at the time of their selection) in Canada's 10 provinces. The survey excludes children living on Indian reserves or Crown lands, residents of institutions, full-time members of the Canadian Armed Forces, and residents of some remote regions.

Instrument design

All questionnaires were developed in coordination with HRSDC and an expert advisory group. All instruments were tested in focus groups and pilot surveys prior to collection.


This is a sample survey with a longitudinal design.

The NLSCY is a longitudinal survey consisting of several longitudinal and cross-sectional samples. The longitudinal samples are representative of the original longitudinal populations (i.e., the populations at the time of sample selection). Cross-sectional weights are provided when an age cohort can also be considered to be representative of a cross-sectional population.

The longitudinal sample at Cycle 5 consists of three cohorts. The first cohort consists of children aged 0 to 11 at the time of their selection at Cycle 1 in 1994, who are 8-19 at Cycle 5. They will remain in the survey until they reach the age of 25. The second cohort is made up of children aged 0 to 1 at the time of their selection at Cycle 3 in 1998, who are 4-5 at Cycle 5. It is their final cycle in NLSCY. The third cohort consists of children aged 0 to 1 at the time of their selection at Cycle 4 in 2000, who are 2-3 at Cycle 5. These children will be interviewed one more time in Cycle 6.
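The age ranges quoted for each cohort follow from the number of years between selection and Cycle 5. The short sketch below reproduces that arithmetic. It assumes ages simply advance by the difference between the selection year and 2002; the function and dictionary names are illustrative and are not part of any NLSCY product.

```python
def age_range_at_cycle(min_age, max_age, selection_year, cycle_year):
    """Shift a cohort's age range at selection forward to a later cycle."""
    elapsed = cycle_year - selection_year
    return (min_age + elapsed, max_age + elapsed)


CYCLE5_YEAR = 2002

# (minimum age, maximum age, selection year) as stated in the text
cohorts = {
    "Cycle 1 cohort": (0, 11, 1994),
    "Cycle 3 cohort": (0, 1, 1998),
    "Cycle 4 cohort": (0, 1, 2000),
}

ages_at_cycle5 = {
    name: age_range_at_cycle(lo, hi, year, CYCLE5_YEAR)
    for name, (lo, hi, year) in cohorts.items()
}
```

The resulting ranges match the 8-19, 4-5 and 2-3 age groups given in the text.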

For Cycle 5, children aged 0-5 as of December 31, 2002 can be considered representative of the 2002 cross-sectional population. This cross-sectional sample consists of the sample of 0-1 year olds selected in 2002, the 0-1 year olds selected in 2000 and the 0-1 year olds selected in 1998.

Most samples were drawn from the Labour Force Survey's (LFS) sample of respondent households. The exceptions are the one-year-olds sampled in 1998 and the five-year-olds sampled in 2000, who were selected using provincial birth registry data because the LFS did not have enough eligible children to meet the survey's needs.

The sample design for children sampled from birth registry data is as follows: each province was divided into urban and rural strata. A simple random sample was selected in the rural stratum, and a two-stage design was used in the urban strata. At the first stage, a sample of geographic areas was drawn; at the second stage, a sample of children within each selected area was drawn.
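The birth registry design can be illustrated with a small sketch. It assumes equal-probability selection without replacement at every stage, which the description does not state; the function names, sample sizes and data layout are hypothetical.

```python
import random


def simple_random_sample(units, n, rng):
    """Rural stratum: draw n units without replacement."""
    return rng.sample(units, min(n, len(units)))


def two_stage_sample(areas, n_areas, n_children_per_area, rng):
    """Urban stratum: sample areas first, then children within each area.

    `areas` maps an area identifier to the list of children registered there.
    """
    selected_areas = rng.sample(sorted(areas), min(n_areas, len(areas)))
    children = []
    for area in selected_areas:
        pool = areas[area]
        children.extend(rng.sample(pool, min(n_children_per_area, len(pool))))
    return selected_areas, children


rng = random.Random(2002)  # fixed seed so the sketch is reproducible
rural_children = list(range(100))
urban_areas = {"A": [1, 2, 3], "B": [4, 5, 6, 7], "C": [8, 9]}

rural_sample = simple_random_sample(rural_children, 10, rng)
urban_selected_areas, urban_sample = two_stage_sample(urban_areas, 2, 2, rng)
```

In practice the first-stage areas might be drawn with unequal probabilities; that would change the selection step and the resulting design weights but not the two-stage structure.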

At Cycle 5, the sample consists of about 30,800 children and youth.

Data sources

Data collection for this reference period: 2002-09-09 to 2003-06-13

Responding to this survey is voluntary.

Data are collected directly from survey respondents.

Most questionnaires are administered by an interviewer using computer-assisted telephone interviewing (CATI) for children aged 0 to 3, and computer-assisted personal interviewing (CAPI) for older children.

The collection has many components (up to two children per household are surveyed):

1. Child component (for 0-15 year olds): the respondent is the person most knowledgeable (PMK) about the child.

2. Adult component: the respondent is the PMK and, if applicable, the spouse of the PMK.

3. Youth component (for 16-19 year olds): the respondent is the youth.

4. Cognitive tests
- Children aged 4-5 are administered three tests: the PPVT-R (revised Peabody Picture Vocabulary Test), Who Am I? and Number Knowledge.
- A mathematics test is administered to children in Grade 2 and above (for children aged 8-15).
- A cognitive test is given to youth aged 16-17.

5. Self-completed questionnaire (for 10-19 year olds): the respondent is the child or youth.

6. Education component (children in kindergarten): the respondent is the child's teacher who fills out a paper questionnaire.

View the Questionnaire(s) and reporting guide(s).

Error detection

For the household collection, edits are built into the computer application (e.g., range, flow and consistency edits). Edits are also performed after collection, for example during the capture of data from paper questionnaires, and outlier detection is carried out once the cross-sectional and longitudinal weights have been calculated.


Imputation is performed only for missing values in the following items: adult income, youth income, household income, adult labour force, and Motor and Social Development items. Hot deck donor imputation is used.

Missing values for all other questions are coded "Don't know", "Refusal" or "Not Stated" on the final data file. It is up to each data user to deal with partial non-response in a manner that is appropriate to the research being undertaken.

For more details, see the Microdata User Guide.
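As an illustration of the hot deck donor imputation mentioned above, the sketch below fills a missing value with the value reported by a randomly chosen donor in the same imputation class. The choice of classes, the random donor selection and the record layout are assumptions made for this example; the actual NLSCY procedure is described in the Microdata User Guide and may differ.

```python
import random


def hot_deck_impute(records, target, class_key, rng):
    """Return copies of `records` with missing `target` values filled from donors.

    A donor is any record in the same class with a reported (non-None) value.
    Records with no available donor are left missing.
    """
    donors = {}
    for rec in records:
        if rec[target] is not None:
            donors.setdefault(rec[class_key], []).append(rec[target])

    result = []
    for rec in records:
        new = dict(rec)  # copy so the input is not modified
        pool = donors.get(new[class_key])
        if new[target] is None and pool:
            new[target] = rng.choice(pool)
            new["imputed"] = True
        else:
            new["imputed"] = False
        result.append(new)
    return result


# Hypothetical records: household income is missing for two households
households = [
    {"province": "ON", "income": 50000},
    {"province": "ON", "income": None},
    {"province": "QC", "income": None},
]
imputed = hot_deck_impute(households, "income", "province", random.Random(0))
```

The "imputed" flag lets analysts tell reported values apart from imputed ones when they decide how to treat partial non-response.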


Three sets of weights are provided for point estimation: longitudinal weights for each of the three longitudinal populations; longitudinal weights (called funnel weights) for children from Cycle 1 who responded to all cycles; and cross-sectional weights for the 0-5 year olds at Cycle 5 (i.e., in 2002). Cross-sectional weights are not produced for older children because these older samples are no longer considered representative of cross-sectional populations, only of the original longitudinal ones.

For longitudinal cohorts, the initial weight from the preceding cycle is used. A nonresponse adjustment is calculated within each homogeneous response group, where these groups are determined using survey data from the previous cycle. The weights are then post-stratified by province, age and sex on the basis of population totals at the time of sample selection.
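The two adjustment steps for longitudinal weights can be sketched as follows. The example uses a weighted response-rate adjustment within each response group, followed by a ratio adjustment to known post-stratum totals. These are standard forms of the techniques named above, not formulas taken from the NLSCY documentation, and the field names and totals are hypothetical.

```python
from collections import defaultdict


def nonresponse_adjust(units):
    """Within each response group, inflate respondents' weights so that they
    also represent the group's nonrespondents. Nonrespondents are dropped."""
    total = defaultdict(float)
    responding = defaultdict(float)
    for u in units:
        total[u["group"]] += u["weight"]
        if u["responded"]:
            responding[u["group"]] += u["weight"]

    adjusted = []
    for u in units:
        if u["responded"]:
            factor = total[u["group"]] / responding[u["group"]]
            adjusted.append({**u, "weight": u["weight"] * factor})
    return adjusted


def post_stratify(units, control_totals):
    """Scale weights so each post-stratum sums to its known population total."""
    sums = defaultdict(float)
    for u in units:
        sums[u["stratum"]] += u["weight"]
    return [
        {**u, "weight": u["weight"] * control_totals[u["stratum"]] / sums[u["stratum"]]}
        for u in units
    ]


# Hypothetical sample: one post-stratum (province, age, sex) and two response groups
sample = [
    {"group": "g1", "stratum": "ON-8-F", "weight": 100.0, "responded": True},
    {"group": "g1", "stratum": "ON-8-F", "weight": 100.0, "responded": False},
    {"group": "g2", "stratum": "ON-8-F", "weight": 50.0, "responded": True},
]
after_nonresponse = nonresponse_adjust(sample)
final = post_stratify(after_nonresponse, {"ON-8-F": 500.0})
```

After the first step the responding units carry the full weight of their response group; after the second, the weights in each post-stratum sum to the control total.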

For the cross-sectional cohort, the initial weight for children coming from the LFS is the LFS subweight (which is the inverse of the probability of selection for the LFS). Adjustments are then made for the number of LFS rotation groups selected, multiple economic families and the number of eligible children, as well as for non-response within geographic response groups. The weights are then post-stratified to counts for the 2002 cross-sectional population.
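A minimal sketch of how such a weight is built up: the starting weight is the inverse of the selection probability, and each later adjustment is applied as a multiplicative factor. The selection probability and the factor values below are invented for illustration; the real adjustments are derived from LFS and NLSCY design information.

```python
def initial_weight(selection_probability):
    """Design weight: the inverse of the unit's probability of selection."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


def adjusted_weight(selection_probability, adjustment_factors):
    """Apply successive multiplicative adjustments to the design weight."""
    weight = initial_weight(selection_probability)
    for factor in adjustment_factors:
        weight *= factor
    return weight


# Hypothetical household selected with probability 1/400, followed by two
# made-up adjustment factors (for example rotation groups and non-response)
example_weight = adjusted_weight(1 / 400, [1.5, 1.2])
```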

Approximate coefficient-of-variation Excel spreadsheets are available to users, as well as Bootstrap weights for sampling variance estimation.

Quality evaluation

Various analyses are performed to determine whether the survey data meet the initial objectives. Among them are coverage and non-response analyses. Adjustments are also made to the design weights for the non-random nature of non-response. In addition, consistency checks are performed.

Longitudinal weights for children introduced in a given cycle and cross-sectional weights of children for that given cycle represent the same population. For example, the Cycle 5 longitudinal weights for the children introduced in Cycle 1 and the Cycle 1 cross-sectional weights represent the same population. Estimates for Cycle 1 variables with each set of weights have been compared to make sure the estimates produced are similar. The same process has been done with the other sets of weights.

Because no top-ups were done for children aged 2-5 in Cycle 5, the cross-sectional weights are known to undercover these ages, although this undercoverage is fairly small. For the 8-19 year olds, the undercoverage is larger, which is one of the reasons cross-sectional weights are not produced for children aged 8 to 19.

Disclosure control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. Suppression of direct identifiers (name, address, etc.) and indirect identifiers (combination of variables identifying a respondent) is used.

The Quebec Institute of Statistics is engaged in a study of child development called the Quebec Longitudinal Study of Child Development (QLSCD). Past respondents of the National Longitudinal Survey of Children and Youth (NLSCY) are being asked to participate in the QLSCD. NLSCY respondents are being contacted in January and February 2010 to obtain their permission to release their contact information and collected survey data to the Quebec Institute of Statistics.

Revisions and seasonal adjustment

In 2004, cross-sectional, longitudinal and bootstrap weights (for Cycles 1 through 4) were updated to reflect the 2001 Census population counts. The intended impact of updating the survey weights is to create accurate, representative estimates that reflect the growing population of Canadians.

Data accuracy

A set of Excel spreadsheets, with a user-friendly Visual Basic interface, is available to users to obtain approximate sampling variances of proportions. SAS and SPSS macros have also been developed to calculate the sampling variance of an estimate using Bootstrap weights. These macros and spreadsheets are available at Statistics Canada Research Data Centres (RDCs).
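To show the idea behind variance estimation with bootstrap weights, the sketch below re-estimates a weighted proportion once per set of bootstrap weights and measures how much those replicate estimates vary around the full-sample estimate. The variance formula used here (the mean squared deviation of the replicates from the full-sample estimate) is a commonly used form and is an assumption; the macros available at the RDCs may apply a different variant. All data values are invented.

```python
import math


def weighted_proportion(flags, weights):
    """Weighted share of units with the characteristic (flag is truthy)."""
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)


def bootstrap_variance(flags, final_weights, bootstrap_weight_sets):
    """Return the full-sample estimate and its bootstrap variance estimate."""
    estimate = weighted_proportion(flags, final_weights)
    replicates = [weighted_proportion(flags, bw) for bw in bootstrap_weight_sets]
    variance = sum((r - estimate) ** 2 for r in replicates) / len(replicates)
    return estimate, variance


def coefficient_of_variation(estimate, variance):
    """Standard error expressed as a fraction of the estimate."""
    return math.sqrt(variance) / estimate


# Hypothetical data: four children, final weights and two bootstrap weight sets
has_characteristic = [1, 0, 1, 0]
final_weights = [1.0, 1.0, 1.0, 1.0]
bootstrap_sets = [[2.0, 0.0, 1.0, 1.0], [0.0, 2.0, 1.0, 1.0]]

estimate, variance = bootstrap_variance(has_characteristic, final_weights, bootstrap_sets)
cv = coefficient_of_variation(estimate, variance)
```

A real analysis would use the full set of bootstrap weights supplied with the data file rather than two sets.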

Users should keep in mind that the NLSCY is a general population survey. It is not designed for the analysis of rare characteristics or rare subpopulations, which would yield small samples and result in high relative sampling variance.

Possible sources of nonsampling errors in the NLSCY include: response errors due to sensitive questions, poor memory, translated questionnaires, approximate answers, and conditioning bias; nonresponse errors; and coverage errors.

In the case of nonresponse errors, weight adjustments are performed to minimize the effect of bias due to total nonresponse. Some longitudinal respondents do not participate in every cycle. This is cycle non-response. When dealing with the longitudinal data for a respondent, data from every cycle are not necessarily available. For example, a child may be a respondent in Cycles 1, 3, 4 and 5, but not Cycle 2. If data from every cycle are crucial, analysts can restrict themselves to children without cycle non-response and use the longitudinal weights for this group, variable EWTCWd1L.
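Restricting an analysis to children without cycle non-response can be sketched as a simple filter. The record layout, identifiers and weight values below are hypothetical; only the variable name EWTCWd1L comes from the text above.

```python
CYCLES = (1, 2, 3, 4, 5)


def has_no_cycle_nonresponse(responded_cycles, cycles=CYCLES):
    """True when the child responded in every listed cycle."""
    return all(c in responded_cycles for c in cycles)


def funnel_sample(children):
    """Keep only children who responded in every cycle."""
    return [c for c in children if has_no_cycle_nonresponse(c["responded_cycles"])]


# Hypothetical records; child B matches the example in the text (no Cycle 2)
children = [
    {"id": "A", "responded_cycles": {1, 2, 3, 4, 5}, "EWTCWd1L": 812.4},
    {"id": "B", "responded_cycles": {1, 3, 4, 5}, "EWTCWd1L": None},
]
complete_cases = funnel_sample(children)
```

Estimates for the retained children would then be weighted with EWTCWd1L rather than the regular longitudinal weight.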

Regarding partial nonresponse, imputation is performed on the following variables: adult income, youth income, household income, adult labour force, and Motor and Social Development items.

In the case of coverage errors, the longitudinal and cross-sectional weights are post-stratified to population counts to minimize coverage bias. The NLSCY uses multiple frames, the main one being the Labour Force Survey's samples. Sources of coverage error arising from the use of the LFS include: only LFS respondents are selected by the NLSCY sample; and the NLSCY sample is selected based on the household's composition at the time of the LFS interview.

In Cycle 3, provincial birth registry data were used to sample one-year-olds born in 1997. Consequently, one-year-olds born outside Canada in 1997 were excluded, and some births may not have been registered until after the sample was selected. A similar bias applies to the five-year-olds sampled in Cycle 4 using birth registry data for 1995 births.

Regarding coverage of the cross-sectional population for 2-5 year olds in Cycle 5, immigrants (including interprovincial migrants) who are members of the cross-sectional population, but were not present at the time of sample selection, are excluded from the sample.

The NLSCY has non-uniform coverage of children, by month of birth, for births in 1997 and 1998: it excludes births in January, February, March or April of 1997; and for children born in 1998 (4 years old at Cycle 5), births in January, February and March are overrepresented. Also, there are no children aged 34, 35 or 36 months at the time of the Cycle 5 interview.

