Data release – November 10, 2010
The National Longitudinal Survey of Children and Youth (NLSCY) is a long-term study of Canadian children that follows their development and well-being from birth to early adulthood. The NLSCY began in 1994 and is jointly conducted by Statistics Canada and Human Resources and Skills Development Canada (HRSDC), formerly known as Human Resources Development Canada (HRDC).
The study is designed to collect information about factors influencing a child's social, emotional and behavioural development and to monitor the impact of these factors on the child's development over time.
The survey covers a comprehensive range of topics including the health of children, information on their physical development, learning and behaviour as well as data on their social environment (family, friends, schools and communities).
Information from the NLSCY is used by a variety of people at all levels of government, in universities, and in policy-making organizations.
The target population comprises the non-institutionalized civilian population (aged 0 to 11 at the time of their selection) in Canada's 10 provinces. The survey excludes children living on Indian reserves or Crown lands, residents of institutions, full-time members of the Canadian Armed Forces, and residents of some remote regions.
All questionnaires were developed in coordination with HRSDC and an expert advisory group. All instruments were tested in focus groups and pilot surveys prior to collection.
This is a sample survey with a longitudinal design.
The NLSCY is a longitudinal survey consisting of several longitudinal and cross-sectional samples. The longitudinal samples are representative of the original longitudinal populations (i.e., the populations at the time of sample selection). Cross-sectional weights are provided when an age cohort can also be considered to be representative of a cross-sectional population.
All samples were drawn from the Labour Force Survey's (LFS) sample of respondent households.
The initial sample for Cycle 8 comprised 35,795 children and youth aged 0 to 7 and 14 to 25.
Data collection for this reference period: 2008-09-01 – 2009-07-31
Responding to this survey is voluntary.
Data are collected directly from survey respondents.
Most questionnaires are administered by an interviewer using computer-assisted interviewing (CAI).
The collection has several components (up to two children per household are surveyed):
1. Child component (for children aged 0-7 and 14-17): the respondent is the person most knowledgeable (PMK) about the child.
2. Adult component (for children aged 0-7 and 14-17): the respondent is the PMK and, if applicable, the spouse of the PMK.
3. Youth component (for youth aged 16 and above): the respondent is the youth.
4. Cognitive tests
- Children aged 4-5 are administered three tests: the PPVT-R (Peabody Picture Vocabulary Test, Revised), Who Am I?, and Number Knowledge.
- A mathematics test is administered to children in Grades 2 to 10 (children aged 7 and 14-15).
- A problem-solving exercise is given to youth aged 16-17.
- A literacy assessment is administered to youth aged 18-19.
- A numeracy assessment is administered to youth aged 20-21.
5. Self-completed questionnaire (for 14-17 year olds): the respondent is the child or youth.
For the household collection, edits are built into the computer application (e.g., range, flow and consistency edits). Further edits are performed after collection, for example during the capture of data from paper questionnaires, and outlier detection is carried out once the cross-sectional and longitudinal weights have been calculated.
Imputation is performed only for missing values in the following items: adult income, youth income, household income, and the Motor and Social Development items. Hot deck donor imputation is used.
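As an illustration of the general idea behind hot deck donor imputation, the following is a minimal Python sketch. It is not the NLSCY procedure: the choice of imputation classes (here a single hypothetical `province` field), the random donor selection, the function name `hot_deck_impute`, and the toy records are all assumptions made for the example.

```python
import random

def hot_deck_impute(records, target, class_keys, seed=0):
    """Fill missing `target` values by copying the value of a randomly
    chosen donor from the same imputation class (illustrative only)."""
    rng = random.Random(seed)
    # Collect observed values (potential donors) by imputation class.
    donors = {}
    for rec in records:
        if rec[target] is not None:
            key = tuple(rec[k] for k in class_keys)
            donors.setdefault(key, []).append(rec[target])
    result = []
    for rec in records:
        new = dict(rec)
        new["imputed"] = False
        if rec[target] is None:
            pool = donors.get(tuple(rec[k] for k in class_keys))
            if pool:  # the value stays missing if the class has no donor
                new[target] = rng.choice(pool)
                new["imputed"] = True
        result.append(new)
    return result

# Toy records; field names and values are hypothetical.
records = [
    {"id": 1, "province": "ON", "income": 52000},
    {"id": 2, "province": "ON", "income": None},
    {"id": 3, "province": "QC", "income": 41000},
    {"id": 4, "province": "QC", "income": None},
]
result = hot_deck_impute(records, "income", ["province"])
```

Keeping an `imputed` flag alongside each value lets an analyst distinguish reported values from donated ones.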
Missing values for all other questions are coded "Don't know", "Refusal" or "Not Stated" on the final data file. It is up to each data user to deal with partial non-response in a manner that is appropriate to the research being undertaken.
Three sets of weights are provided for point estimation: longitudinal weights for each of the four longitudinal populations; longitudinal weights for children who responded to all cycles since they joined the survey (called funnel weights); and cross-sectional weights for the 0-7 year olds at Cycle 8 (i.e., in 2008). Cross-sectional weights are not produced for older children, since these older samples are no longer considered representative of cross-sectional populations, only of the original longitudinal ones.
Each child's final survey weight has been adjusted for nonresponse, and post-stratified by province, age and sex to match known population totals at the time of sample selection.
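The post-stratification step can be sketched in Python as a simple ratio adjustment: within each post-stratum, weights are scaled so that they sum to the known population total. This is a minimal sketch of the general technique, not the survey's production code; the function name `post_stratify`, the strata, and the totals below are made-up values for illustration.

```python
def post_stratify(weights, strata, totals):
    """Scale weights so that, within each post-stratum, they sum to the
    known population total for that stratum."""
    # Sum of the incoming weights in each post-stratum.
    sums = {}
    for w, s in zip(weights, strata):
        sums[s] = sums.get(s, 0.0) + w
    # Apply the ratio (known total / weighted sample total) to each weight.
    return [w * totals[s] / sums[s] for w, s in zip(weights, strata)]

# Hypothetical nonresponse-adjusted weights and (province, age, sex) strata.
weights = [100.0, 150.0, 200.0, 50.0]
strata = [("ON", 3, "F"), ("ON", 3, "F"), ("QC", 5, "M"), ("QC", 5, "M")]
# Hypothetical known population totals at the time of sample selection.
totals = {("ON", 3, "F"): 300.0, ("QC", 5, "M"): 200.0}

adjusted = post_stratify(weights, strata, totals)
```

In this toy example the Ontario weights are scaled up by 300/250 and the Quebec weights scaled down by 200/250, so each stratum's adjusted weights sum to its assumed total.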
Various analyses are performed to determine whether the survey data meet the initial objectives. Among them are coverage and non-response analyses. Adjustments are also made to the design weights for the non-random nature of non-response. The weights are also post-stratified to correct for undercoverage. In addition, consistency checks are performed.
Since the original sample drawn at Cycle 1 has never been topped up for immigrants, this sample is not considered to be representative of all children aged 14-25 at Cycle 8 (2008). For this reason, cross-sectional weights are no longer produced for this sample.
Longitudinal weights for children introduced in a given cycle and cross-sectional weights of children for that given cycle represent the same population. For example, the Cycle 8 longitudinal weights for the children introduced in Cycle 1 and the Cycle 1 cross-sectional weights represent the same population. Estimates for Cycle 1 variables with each set of weights have been compared to make sure the estimates produced are similar. The same process has been done with the other sets of weights.
Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
SAS and SPSS macros have been developed to calculate the sampling variance of an estimate using bootstrap weights. These macros are available on the Statistics Canada website (refer to the "Additional documentation" link below).
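For readers unfamiliar with bootstrap weights, the general calculation can be sketched in Python as follows. This is not a translation of the SAS or SPSS macros: it assumes the common approach of recomputing the estimate with each set of bootstrap weights and averaging the squared deviations from the full-sample estimate. The function names and toy numbers are hypothetical, and the macros' own documentation should be consulted for the exact formula they use.

```python
def weighted_mean(values, weights):
    """Weighted mean of `values` using the supplied survey weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def bootstrap_variance(values, weights, bootstrap_weights):
    """Estimate the sampling variance of a weighted mean.

    `bootstrap_weights` is a list of B weight vectors, one per bootstrap
    replicate. The variance is taken as the average squared deviation of
    the replicate estimates from the full-sample estimate (an assumption
    of this sketch).
    """
    theta = weighted_mean(values, weights)
    replicates = [weighted_mean(values, bw) for bw in bootstrap_weights]
    return sum((r - theta) ** 2 for r in replicates) / len(replicates)

# Toy data: three respondents and two bootstrap replicates.
values = [1.0, 2.0, 3.0]
weights = [1.0, 1.0, 1.0]
bootstrap_weights = [[2.0, 1.0, 0.0], [0.0, 1.0, 2.0]]

variance = bootstrap_variance(values, weights, bootstrap_weights)
```

Real files carry far more than two replicates; two are used here only to keep the arithmetic easy to follow.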
Users should keep in mind that the NLSCY is a general population survey and is not designed for the analysis of rare characteristics or rare subpopulations, which would yield small samples and high relative sampling variance.
Possible sources of nonsampling errors in the NLSCY include: response errors due to sensitive questions, poor memory, translated questionnaires, approximate answers, and conditioning bias; nonresponse errors; and coverage errors.
In the case of nonresponse errors, weight adjustments are performed to minimize the effect of bias due to total nonresponse. Some longitudinal respondents do not participate in every cycle; this is called cycle non-response. As a result, data from every cycle are not necessarily available for a given longitudinal respondent. For example, a child may be a respondent in Cycles 1, 3, 4, 5, 6, 7 and 8, but not in Cycle 2. If data from every cycle are crucial, the analyst can restrict the analysis to children without cycle non-response and use the longitudinal weights for this group, variable HWTCWD1L.
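Restricting an analysis to children without cycle non-response could look like the Python sketch below. Only the weight variable name HWTCWD1L comes from the survey documentation; the record layout (`first_cycle`, `responded`), the helper name, and the toy values are assumptions for illustration, and the way the actual data file codes the weight for children with cycle non-response is not represented here.

```python
LAST_CYCLE = 8

def has_no_cycle_nonresponse(child):
    """True if the child responded in every cycle from the one in which
    they joined the survey through Cycle 8."""
    expected = set(range(child["first_cycle"], LAST_CYCLE + 1))
    return expected <= child["responded"]

# Toy records for children introduced in Cycle 1 (hypothetical layout).
children = [
    {"id": "A", "first_cycle": 1,
     "responded": {1, 2, 3, 4, 5, 6, 7, 8}, "HWTCWD1L": 210.0},
    # Child B missed Cycle 2; the weight value is a placeholder.
    {"id": "B", "first_cycle": 1,
     "responded": {1, 3, 4, 5, 6, 7, 8}, "HWTCWD1L": None},
    {"id": "C", "first_cycle": 1,
     "responded": {1, 2, 3, 4, 5, 6, 7, 8}, "HWTCWD1L": 180.0},
]

# Keep only children with data in every cycle, then weight them with
# the funnel weight to estimate the population they represent.
complete = [c for c in children if has_no_cycle_nonresponse(c)]
estimated_population = sum(c["HWTCWD1L"] for c in complete)
```

Child B, who matches the example of a respondent missing only Cycle 2, is dropped from `complete`.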
Regarding partial nonresponse, imputation is performed on the following variables: adult income, youth income, household income, and Motor and Social Development items.
In the case of coverage errors, the longitudinal and cross-sectional weights are post-stratified to population counts to minimize coverage bias. Sources of coverage error arising from the use of the LFS include the following: the NLSCY sample is selected only from LFS respondents; and the NLSCY sample is selected based on the household's composition at the time of the LFS interview.
Because of the way the NLSCY samples babies from the LFS, babies born at the end of the calendar year typically have a lower probability of selection than those born at the beginning of the year. This unequal distribution in the sample by birth month became pronounced at Cycles 6 and 7, and weight adjustments were performed. At Cycle 6, a uniform adjustment was applied to the survey weights for 0 to 1 year-olds. At Cycle 7, the birth-month weight adjustment for 0 to 1 year-olds was refined. At Cycle 8, the methodology from Cycle 7 was used.