Data release – November 24, 2008
The National Longitudinal Survey of Children and Youth (NLSCY) is a long-term study of Canadian children that follows their development and well-being from birth to early adulthood. The NLSCY began in 1994 and is jointly conducted by Statistics Canada and Human Resources and Skills Development Canada (HRSDC), formerly known as Human Resources Development Canada (HRDC).
The study is designed to collect information about factors influencing a child's social, emotional and behavioural development and to monitor the impact of these factors on the child's development over time.
The survey covers a comprehensive range of topics including the health of children, information on their physical development, learning and behaviour as well as data on their social environment (family, friends, schools and communities).
Information from the NLSCY is used by a variety of people at all levels of government, in universities, and in policy-making organizations.
The target population comprises the non-institutionalized civilian population (aged 0 to 11 at the time of their selection) in Canada's 10 provinces. The survey excludes children living on Indian reserves or Crown lands, residents of institutions, full-time members of the Canadian Armed Forces, and residents of some remote regions.
All questionnaires were developed in coordination with HRSDC and an expert advisory group. All instruments were tested in focus groups and pilot surveys prior to collection.
This is a sample survey with a longitudinal design.
The NLSCY is a longitudinal survey consisting of several longitudinal and cross-sectional samples. The longitudinal samples are representative of the original longitudinal populations (i.e., the populations at the time of sample selection). Cross-sectional weights are provided when an age cohort can also be considered to be representative of a cross-sectional population.
All samples were drawn from the Labour Force Survey's (LFS) sample of respondent households.
The initial sample for Cycle 7 comprised 37,655 children and youth aged 0 to 9 and 12 to 23.
Data collection for this reference period: 2006-09-01 – 2007-07-31
Responding to this survey is voluntary.
Data are collected directly from survey respondents.
Most questionnaires are administered by an interviewer using computer-assisted interviewing (CAI).
The collection has many components (up to two children per household are surveyed):
1. Child component (for 0-17 year olds): the respondent is the person most knowledgeable (PMK) about the child.
2. Adult component (for 0-17 year olds): the respondent is the PMK and, if applicable, the spouse of the PMK.
3. Youth component (for youth aged 16 and over): the respondent is the youth.
4. Cognitive tests
- Children aged 4-5 are administered three tests: the PPVT-R (Peabody Picture Vocabulary Test - Revised), Who Am I? and Number Knowledge.
- A mathematics test is administered to children in Grades 2 to 10 (aged 7-15).
- A problem-solving exercise is administered to youth aged 16-17.
- A literacy assessment is administered to youth aged 18-19.
- A numeracy assessment is administered to youth aged 20-21.
5. Self-completed questionnaire (for 12-17 year olds): the respondent is the child or youth.
For the household collection, edits are built into the computer application (e.g., range, flow and consistency edits). Edits are also performed after collection, for example during the capture of data from paper questionnaires, as well as outlier detection once the cross-sectional and longitudinal weights have been calculated.
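The three kinds of built-in edits named above can be illustrated with a minimal Python sketch. The field names (child_age, attends_school, grade, pmk_age) and the specific rules are hypothetical, chosen only to show one range edit, one flow edit and one consistency edit; they are not the edits actually used in the NLSCY application.

```python
def check_record(rec):
    """Return a list of edit failures for one (hypothetical) interview record."""
    problems = []
    # Range edit: the value must lie in a plausible interval.
    if not 0 <= rec["child_age"] <= 23:
        problems.append("range: child_age outside 0-23")
    # Flow edit: a follow-up answer should exist only when its filter question applies.
    if not rec["attends_school"] and rec["grade"] is not None:
        problems.append("flow: grade reported for a child not in school")
    # Consistency edit: related answers must not contradict each other.
    if rec["pmk_age"] <= rec["child_age"]:
        problems.append("consistency: PMK not older than child")
    return problems


# Illustrative records only; all values are made up.
clean = {"child_age": 8, "attends_school": True, "grade": 3, "pmk_age": 36}
suspect = {"child_age": 30, "attends_school": False, "grade": 5, "pmk_age": 29}

clean_problems = check_record(clean)
suspect_problems = check_record(suspect)
```

In a computer-assisted interview such checks would normally fire while the interviewer is still with the respondent, so that a flagged answer can be confirmed or corrected on the spot.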
Imputation is performed only for missing values in the following items: adult income, youth income, household income, adult labour force, and Motor and Social Development. Hot deck donor imputation is used.
Missing values for all other questions are coded "Don't know", "Refusal" or "Not Stated" on the final data file. It is up to each data user to deal with partial non-response in a manner that is appropriate to the research being undertaken.
For more details, see the Microdata User Guide.
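As an illustration of hot deck donor imputation in general, the Python sketch below fills a missing value by copying the value of a randomly chosen respondent (the donor) from the same imputation class. The choice of imputation classes (province and age group), the field names and the random selection rule are assumptions made for this example; the actual NLSCY procedure is described in the Microdata User Guide.

```python
import random


def hot_deck_impute(records, target, class_vars, seed=0):
    """Copy `target` from a random donor in the same imputation class."""
    rng = random.Random(seed)
    # Build donor pools: observed values of `target`, grouped by class.
    donors = {}
    for rec in records:
        if rec[target] is not None:
            key = tuple(rec[v] for v in class_vars)
            donors.setdefault(key, []).append(rec[target])
    result = []
    for rec in records:
        new = dict(rec)
        new["imputed"] = False
        if rec[target] is None:
            pool = donors.get(tuple(rec[v] for v in class_vars))
            if pool:  # the value stays missing when the class has no donor
                new[target] = rng.choice(pool)
                new["imputed"] = True
        result.append(new)
    return result


# Made-up records; classes and values are purely illustrative.
records = [
    {"province": "ON", "age_group": "30-39", "income": 52000},
    {"province": "ON", "age_group": "30-39", "income": None},
    {"province": "QC", "age_group": "40-49", "income": 61000},
    {"province": "QC", "age_group": "40-49", "income": None},
    {"province": "BC", "age_group": "30-39", "income": None},
]
imputed = hot_deck_impute(records, "income", ["province", "age_group"])
```

The sketch keeps an imputation flag on each record so that analysts can tell reported values from imputed ones.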
Three sets of weights are provided for point estimation: longitudinal weights for each of the five longitudinal populations; longitudinal weights for children who have responded to every cycle since they joined the survey (called funnel weights); and cross-sectional weights for 0-9 year olds at Cycle 7 (i.e., in 2006). Cross-sectional weights are not produced for older children because those samples are no longer considered representative of cross-sectional populations, only of the original longitudinal ones.
Each child's final survey weight has been adjusted for nonresponse, and post-stratified by province, age and sex to match known population totals at the time of sample selection.
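The post-stratification step can be sketched in Python as follows. Within each post-stratum (here province by age by sex), every weight is multiplied by the ratio of the known population total to the sum of the current weights, so that the adjusted weights add up to the control total. The field names, weights and control totals are invented for the example, and the sketch assumes the nonresponse adjustment has already been applied.

```python
def post_stratify(records, control_totals, strata_vars, weight="weight"):
    """Scale weights so each post-stratum sums to its known population total."""
    # Sum the current weights within each post-stratum.
    weighted_counts = {}
    for rec in records:
        key = tuple(rec[v] for v in strata_vars)
        weighted_counts[key] = weighted_counts.get(key, 0.0) + rec[weight]
    # Apply the ratio adjustment: control total / weighted sample count.
    adjusted = []
    for rec in records:
        key = tuple(rec[v] for v in strata_vars)
        factor = control_totals[key] / weighted_counts[key]
        adjusted.append({**rec, weight: rec[weight] * factor})
    return adjusted


# Made-up sample and control totals, for illustration only.
sample = [
    {"province": "ON", "age": 4, "sex": "F", "weight": 100.0},
    {"province": "ON", "age": 4, "sex": "F", "weight": 150.0},
    {"province": "ON", "age": 4, "sex": "M", "weight": 200.0},
]
totals = {("ON", 4, "F"): 300.0, ("ON", 4, "M"): 220.0}
adjusted = post_stratify(sample, totals, ["province", "age", "sex"])
```

In this example the two female records are scaled by 300/250 and the male record by 220/200, so the adjusted weights reproduce the control totals exactly.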
Approximate coefficient-of-variation Excel spreadsheets are available to users, as well as Bootstrap weights for sampling variance estimation.
Various analyses are performed to determine whether the survey data meet the initial objectives. Among them are coverage and non-response analyses. Adjustments are also made to the design weights for the non-random nature of non-response. The weights are also post-stratified to correct for undercoverage. In addition, consistency checks are performed.
Since the original sample drawn at Cycle 1 has never been topped up for immigrants, this sample is not considered to be representative of all children aged 12-23 at Cycle 7 (2006). For this reason, cross-sectional weights are no longer produced for this sample.
Longitudinal weights for children introduced in a given cycle and cross-sectional weights of children for that given cycle represent the same population. For example, the Cycle 7 longitudinal weights for the children introduced in Cycle 1 and the Cycle 1 cross-sectional weights represent the same population. Estimates for Cycle 1 variables with each set of weights have been compared to make sure the estimates produced are similar. The same process has been done with the other sets of weights.
Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. Suppression of direct identifiers (name, address, etc.) and indirect identifiers (combination of variables identifying a respondent) is used.
The Québec Institute of Statistics is engaged in a study of child development called the Quebec Longitudinal Study of Child Development (QLSCD). Past respondents of the National Longitudinal Survey of Children and Youth (NLSCY) are being asked to participate in the QLSCD. NLSCY respondents are being contacted in January and February 2010 to obtain their permission to release their contact information and collected survey data to the Québec Institute of Statistics.
A set of Excel spreadsheets, with a user-friendly Visual Basic interface, is available to users to obtain approximate sampling variances of proportions. SAS and SPSS macros have also been developed to calculate the sampling variance of an estimate using Bootstrap weights. These macros and spreadsheets are available at Statistics Canada Research Data Centres (RDCs).
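For readers unfamiliar with bootstrap weights, the Python sketch below shows the general idea behind such macros; it is not a reproduction of the SAS or SPSS macros. The estimate is recomputed once with each set of bootstrap weights, and the variance is taken as the average squared deviation of those replicate estimates from the estimate based on the main survey weight. The field names (flag, w, bsw1, bsw2) are hypothetical, and only two replicates are used to keep the example short; a real file would carry many more.

```python
def weighted_proportion(records, flag, weight):
    """Weighted share of records for which `flag` is true."""
    total = sum(r[weight] for r in records)
    return sum(r[weight] for r in records if r[flag]) / total


def bootstrap_variance(records, flag, main_weight, replicate_weights):
    """Point estimate and bootstrap variance of a weighted proportion."""
    theta = weighted_proportion(records, flag, main_weight)
    # Recompute the estimate once per set of bootstrap weights.
    replicates = [weighted_proportion(records, flag, w) for w in replicate_weights]
    # Average squared deviation of the replicates from the main estimate.
    variance = sum((t - theta) ** 2 for t in replicates) / len(replicates)
    return theta, variance


# Made-up data: one main weight and two bootstrap weights per record.
records = [
    {"flag": True, "w": 100.0, "bsw1": 200.0, "bsw2": 100.0},
    {"flag": False, "w": 100.0, "bsw1": 0.0, "bsw2": 100.0},
    {"flag": True, "w": 100.0, "bsw1": 100.0, "bsw2": 0.0},
    {"flag": False, "w": 100.0, "bsw1": 100.0, "bsw2": 200.0},
]
estimate, variance = bootstrap_variance(records, "flag", "w", ["bsw1", "bsw2"])
cv = variance ** 0.5 / estimate  # coefficient of variation of the estimate
```

Because the bootstrap weights already reflect the sample design and the weight adjustments, the same loop works for other estimators (totals, means, regression coefficients) by swapping out weighted_proportion.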
Users should keep in mind that the NLSCY is a general population survey and is not designed for the analysis of rare characteristics or rare subpopulations, which yield small samples and high relative sampling variance.
Possible sources of nonsampling errors in the NLSCY include: response errors due to sensitive questions, poor memory, translated questionnaires, approximate answers, and conditioning bias; nonresponse errors; and coverage errors.
In the case of nonresponse errors, weight adjustments are performed to minimize the bias due to total nonresponse. Some longitudinal respondents do not participate in every cycle; this is called cycle nonresponse. As a result, longitudinal data for a respondent are not necessarily available for every cycle. For example, a child may be a respondent in Cycles 1, 3, 4, 5, 6 and 7, but not in Cycle 2. If data from every cycle are crucial, the analyst can restrict the analysis to children without cycle nonresponse and use the longitudinal weights for this group, variable GWTCWD1L.
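Restricting an analysis to children without cycle nonresponse might look like the Python sketch below. Only the variable name GWTCWD1L comes from the survey documentation; the other field names (entry_cycle, responded_cycles, score), the record layout and all values are assumptions made for illustration, as is the choice to store a missing weight as None for children with cycle nonresponse.

```python
LAST_CYCLE = 7


def responded_every_cycle(child, last_cycle=LAST_CYCLE):
    """True when the child responded in each cycle from entry to last_cycle."""
    expected = set(range(child["entry_cycle"], last_cycle + 1))
    return expected <= set(child["responded_cycles"])


# Made-up records for three children who all entered at Cycle 1.
children = [
    {"id": "A", "entry_cycle": 1, "responded_cycles": [1, 2, 3, 4, 5, 6, 7],
     "GWTCWD1L": 410.0, "score": 12},
    {"id": "B", "entry_cycle": 1, "responded_cycles": [1, 3, 4, 5, 6, 7],
     "GWTCWD1L": None, "score": 9},  # missed Cycle 2, so excluded below
    {"id": "C", "entry_cycle": 1, "responded_cycles": [1, 2, 3, 4, 5, 6, 7],
     "GWTCWD1L": 290.0, "score": 15},
]

# Keep only children without cycle nonresponse, then weight with GWTCWD1L.
complete = [c for c in children if responded_every_cycle(c)]
weight_sum = sum(c["GWTCWD1L"] for c in complete)
weighted_mean = sum(c["GWTCWD1L"] * c["score"] for c in complete) / weight_sum
```

Child B matches the example in the text (respondent in Cycles 1, 3, 4, 5, 6 and 7, but not in Cycle 2) and is dropped before any weighted estimate is computed.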
Regarding partial nonresponse, imputation is performed on the following variables: adult income, youth income, household income, adult labour force, and Motor and Social Development items.
In the case of coverage errors, the longitudinal and cross-sectional weights are post-stratified to population counts to minimize coverage bias. Sources of coverage error arising from the use of the LFS include the following: only LFS respondents can be selected for the NLSCY sample; and the NLSCY sample is selected based on the household's composition at the time of the LFS interview.
The NLSCY also has non-uniform coverage by month of birth for children who were born in 2000, 2002, 2004 or 2006; there is undercoverage of babies born at the end of the year. (For more details, see the User's Guide.)