Survey of Young Canadians (SYC)

Detailed information for 2010-2011







The Survey of Young Canadians provides nationally representative indicators on child development.

Data release - April 16, 2012



The objectives of the Survey of Young Canadians are:

- To determine the prevalence of various risk and protective factors for children.
- To provide information on child development (such as cognitive, emotional and behavioural development).
- To make this information available for developing policies and programs that will help children.
- To collect information about the environment in which the child is growing up--family, peers, school, and community.


  • Child care
  • Child development and behaviour
  • Children and youth
  • Education
  • Health and well-being (youth)

Data sources and methodology

Target population

The target population consists of Canadian children 1 to 9 years of age living in the 10 provinces, excluding those living on an Indian reserve or in an institution.

Instrument design

The questionnaire is inspired by the one used for the National Longitudinal Survey of Children and Youth (NLSCY). Some sections were modified to better meet the needs of the Survey of Young Canadians (and its cross-sectional nature), but many aspects of the NLSCY were retained. NLSCY content was developed in coordination with our client, Human Resources and Skills Development Canada (HRSDC), and an expert advisory group. All modified content was tested in focus groups prior to collection.


This is a sample survey with a cross-sectional design.

One of the main objectives is to produce indicators of early child development at the provincial level, by age, for younger children (1 to 5 years old). There is also a need for national-level data on children 6 to 9 years old. The target population is stratified by age and by province to ensure that the sample is representative while remaining efficient. It was determined that a sample of approximately 17,000 units would yield results with the desired accuracy.

Respondents were randomly selected from administrative files produced by Statistics Canada using information obtained from the Canada Revenue Agency.

The sampling unit is the child, but the respondent is their parent or guardian; as such, to reduce response burden, the design prohibits two children from the same household from being selected for the sample.

Data sources

Data collection for this reference period: 2010-11-01 to 2011-03-19

Responding to this survey is voluntary.

Data are collected directly from survey respondents.

The questionnaires were administered by an interviewer using computer-assisted interviewing (CAI). The first contact for all interviews was by telephone. At the beginning of the interview the Person Most Knowledgeable (PMK) about the child is identified. The survey consists of three questionnaires:

- Child component: the respondent is the PMK (proxy only)
- PMK component: the respondent is the PMK (non-proxy only)
- Spouse of the PMK component: the respondent is the spouse of the PMK or the PMK (proxy or non-proxy)

If the selected child is 4 or 5 years old at the time of interview the PMK may also be asked to schedule an in-home interview to complete three assessments (revised Peabody Picture Vocabulary Test, Number Knowledge and Who am I?).


Error detection

The use of computer-assisted interviewing (CAI) allows for complex flows and edits to be built into the questionnaire, helping with data quality and ensuring that respondents answer only the questions appropriate to their situations. The following methods were used in the SYC CAI:

Review screens - These were created for important and complex information. For example, the selection procedures for the person most knowledgeable (PMK), a critical element of the survey, are based on the household roster. The household roster screen shows the demographic information for each household member and his/her relationship to every other household member. The collected information is displayed on the screen for the interviewer to confirm with the respondent before continuing the interview.

Range edits - These were built into the CAI system to deal with questions asking for numeric values. If values entered are outside the range, the system generates a pop-up window that states the error and instructs the interviewer to make corrections to the appropriate question.
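A range edit of this kind can be sketched in a few lines of Python. This is a minimal illustration only, not the SYC application: the function name, the message wording and the bounds used in the example are assumptions.

```python
def range_edit(value, low, high):
    """Return None when value lies in [low, high]; otherwise return the
    message an interviewer would see asking for a correction.

    Hypothetical helper: the real CAI system's bounds and wording are
    not given in the source.
    """
    if low <= value <= high:
        return None
    return (f"Value {value} is outside the allowed range "
            f"{low} to {high}. Please correct the answer.")


# Example with made-up bounds for a child's age in years.
print(range_edit(5, 1, 9))
print(range_edit(12, 1, 9))
```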

Flow pattern edits - All flow patterns were automatically built into the CAI system. For example, in the Child Care section, the PMKs are asked whether they have ever used regular child care. Based on the response given, the flow of the questions could be different. If child care is used, the CAI system continues with a series of questions about the specific types of child care used for the child and the time spent in each type of care. If not, the CAI system automatically skips this series of questions.

General consistency edits - Some consistency edits were included as part of the CAI system to allow interviewers to return to previous questions to correct for inconsistencies. Instructions are displayed to interviewers for handling or correcting problems such as incomplete or incorrect data. The system generates a pop-up window that states the error and instructs the interviewer to return to the appropriate question to confirm the data and make corrections as required.

After the collection of the SYC data, the following steps were performed:

Relationship edits - This edit step validates the relationships among the members of the household and creates the family-derived variables. Standard edits are made to the relationship information entered for all members of a given household; some inconsistencies are corrected automatically using a set of rules, whereas others are flagged for manual review and recoding.

Flow edits - These edits replicate the flow patterns from the questionnaire. Records that the application sent down an incorrect flow have their data corrected: answers to questions that should not have been asked are replaced with 'valid skip', and questions that should have been asked but were not are set to 'not stated'.
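The flow edit rule can be illustrated with the child care example used earlier. The sketch below is an assumption-laden illustration: the field names (uses_care, care_type, care_hours) and the "yes"/"no" coding are hypothetical, and the gate question is assumed to have been answered.

```python
VALID_SKIP = "valid skip"
NOT_STATED = "not stated"


def apply_flow_edit(record, gate, follow_ups):
    """Return a copy of record whose follow-up items agree with the gate answer.

    record: dict of answers, with None for an item that was never asked.
    gate: name of the question that controls the flow (assumed "yes" or "no").
    follow_ups: names of the questions asked only when the gate is "yes".
    """
    edited = dict(record)
    on_path = edited.get(gate) == "yes"
    for item in follow_ups:
        if on_path:
            # Should have been asked: a missing answer becomes 'not stated'.
            if edited.get(item) is None:
                edited[item] = NOT_STATED
        else:
            # Should not have been asked: any answer becomes 'valid skip'.
            edited[item] = VALID_SKIP
    return edited


record = {"uses_care": "no", "care_type": "daycare", "care_hours": None}
print(apply_flow_edit(record, "uses_care", ["care_type", "care_hours"]))
```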

Consistency editing - After the flow edits were completed, consistency editing was carried out to verify the relationship between two or more variables.


Imputation in the SYC was performed on income, on the Motor and Social Development (MSD) scale and on the direct measures: the Peabody Picture Vocabulary Test - Revised (PPVT-R), Who Am I (WAI) and the Number Knowledge Assessment (NK).
Income imputation was carried out to assign values to missing items as well as to rectify incoherencies when possible.

The SYC collected data on three components of income: PMK, Spouse and Household.

Two imputation methods were used: deterministic imputation and nearest-neighbour imputation. Wherever possible, missing or inconsistent values were imputed using the deterministic method, which involved adding or subtracting the available components to determine the value of the missing component. Deterministic imputation was used only for households that had no other members aged 18 or older and only one missing income source. When deterministic imputation was impossible, nearest-neighbour imputation was used. Impudon, a generalized statistical routine (written as a SAS macro), was used to perform donor imputation for income.
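The deterministic step can be sketched as follows. The sketch assumes that, in a household with no other adults, household income equals PMK income plus spouse income; that identity, the function name and the rejection of negative results are assumptions made for illustration, not the documented SYC rules.

```python
def impute_income_deterministic(pmk, spouse, household, other_adults=0):
    """Fill the single missing income component (None) by addition or subtraction.

    Returns (pmk, spouse, household), or None when the deterministic rule
    does not apply and donor imputation would be needed instead.
    Assumes household = pmk + spouse when there are no other adults.
    """
    if other_adults > 0 or [pmk, spouse, household].count(None) != 1:
        return None
    if household is None:
        household = pmk + spouse
    elif pmk is None:
        pmk = household - spouse
    else:
        spouse = household - pmk
    if min(pmk, spouse) < 0:
        # Inconsistent inputs: leave the record for donor imputation.
        return None
    return (pmk, spouse, household)


# Made-up amounts, for illustration only.
print(impute_income_deterministic(30000, None, 50000))
```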

22.9% of the SYC records had the income of the PMK imputed. 20.4% of the records had the income of the PMK's spouse imputed and 29.0% of the records had total household income imputed.

MSD: Imputation was done on records considered eligible. These were records in which 13 or 14 of the 15 items of the MSD scale had responses.

Imputation was carried out by the nearest-neighbour method. A donor record was chosen at random from among the children having complete responses and the same response pattern to the common questions. When one item was imputed, the "Yes" or "No" from the selected donor replaced the original missing value. When two items were imputed, these were done independently. Imputation was performed using SAS.
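The donor selection described above might look like the sketch below. The production work was done in SAS; this Python version is only an illustration. It reads "done independently" as one separate random donor draw per missing item, which is an interpretation, and the data layout (a list of 15 "Yes"/"No" answers with None for a missing item) is assumed.

```python
import random


def impute_msd(recipient, donors, rng=None):
    """Impute 1 or 2 missing MSD items from randomly chosen matching donors.

    recipient: list of 15 answers ("Yes", "No" or None when missing).
    donors: list of complete 15-item answer lists.
    Returns the completed list, or None if the record is not eligible
    or no donor shares the recipient's pattern on the answered items.
    """
    rng = rng or random.Random()
    missing = [i for i, v in enumerate(recipient) if v is None]
    if not 1 <= len(missing) <= 2:
        return None
    answered = [i for i, v in enumerate(recipient) if v is not None]
    pool = [d for d in donors
            if None not in d and all(d[i] == recipient[i] for i in answered)]
    if not pool:
        return None
    imputed = list(recipient)
    for i in missing:
        # Independent draw for each missing item (an interpretation).
        imputed[i] = rng.choice(pool)[i]
    return imputed


donors = [["Yes"] * 15, ["No"] * 15]
print(impute_msd(["Yes"] * 14 + [None], donors, random.Random(0)))
```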

5.9% of all eligible children had 1 or 2 MSD items imputed.

Direct measures: Imputation was carried out by the nearest-neighbour method. In the few cases where the child's language or age in months was not obtained in the telephone interview, deterministic imputation was performed using the information in the sampling frame.

Children who were deemed able to perform the direct measures activities, as determined from the questions at the end of the telephone interview, were imputed when necessary. That imputation was carried out at the level of the raw scores for each direct measure, not at the level of the constituent elements or questions.


When more than one direct measure required imputation for a given recipient, the same donor was used to impute all missing direct measures.
Impudon, a generalized statistical routine (written as a SAS macro), was used to perform donor imputation for the direct measures.
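The rule that a single donor supplies every missing direct measure can be sketched as below. This is not Impudon: the matching variables (same language, closest age in months), the field names and the scores are assumptions chosen for illustration, since the source does not list the variables used to find the nearest neighbour.

```python
MEASURES = ("ppvt", "nk", "wai")


def impute_direct_measures(recipient, donors):
    """Fill all missing raw scores of a recipient from one nearest donor.

    recipient, donors: dicts with 'language', 'age_months' and the raw
    scores 'ppvt', 'nk', 'wai' (None when missing).
    Hypothetical distance: same language, smallest gap in age in months.
    """
    missing = [m for m in MEASURES if recipient[m] is None]
    if not missing:
        return dict(recipient)
    pool = [d for d in donors
            if d["language"] == recipient["language"]
            and all(d[m] is not None for m in missing)]
    if not pool:
        return None
    donor = min(pool, key=lambda d: abs(d["age_months"] - recipient["age_months"]))
    imputed = dict(recipient)
    for m in missing:
        # The same donor is used for every missing measure.
        imputed[m] = donor[m]
    return imputed


donors = [
    {"language": "en", "age_months": 50, "ppvt": 60, "nk": 10, "wai": 20},
    {"language": "en", "age_months": 58, "ppvt": 75, "nk": 14, "wai": 28},
]
child = {"language": "en", "age_months": 57, "ppvt": None, "nk": None, "wai": 25}
print(impute_direct_measures(child, donors))
```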

21.2% of the eligible children had their PPVT-R raw score imputed. 20.9% of the eligible children had their number knowledge assessment raw score imputed. 21.6% of the eligible children had their WAI raw score imputed.


The Survey of Young Canadians (SYC) is a probability survey. As is the case with any probability survey, the sample is selected so as to be able to produce estimates for a reference population. Therefore, each unit in the sample represents a number of units in the population.

Survey weights are calculated by taking the child's design weight and making adjustments for survey non-response and post-stratification to ensure that the final survey weights sum to known counts of children by age and province. The design weight is the inverse of the probability of selection, that is, the probability that a child in the population is selected into the SYC sample.

First adjustment: Subsampling for parents with more than one child selected in the sample

Since the SYC sample was selected from a list of children, a parent may be selected more than once in the initial sample as the person most knowledgeable (PMK) concerning his or her children. To reduce the response burden, subsampling was done to ensure that there would be only one child per PMK in the final sample.
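A minimal sketch of this subsampling step follows. It assumes that one child is kept at random, with equal probability, among the children selected for the same PMK, and that the kept child's weight is multiplied by the number of children selected for that PMK. The source does not state the adjustment factor, so both the factor and the field names are assumptions.

```python
import random
from collections import defaultdict


def subsample_one_child_per_pmk(children, rng=None):
    """Keep one randomly chosen child per PMK and inflate its weight.

    children: list of dicts with 'child_id', 'pmk_id' and 'weight'.
    Assumes equal-probability selection within each PMK, so the kept
    child's weight is multiplied by the number of selected children.
    """
    rng = rng or random.Random()
    by_pmk = defaultdict(list)
    for child in children:
        by_pmk[child["pmk_id"]].append(child)
    kept = []
    for group in by_pmk.values():
        chosen = dict(rng.choice(group))  # copy so the input is untouched
        chosen["weight"] *= len(group)
        kept.append(chosen)
    return kept


sample = [
    {"child_id": 1, "pmk_id": "A", "weight": 10.0},
    {"child_id": 2, "pmk_id": "A", "weight": 10.0},
    {"child_id": 3, "pmk_id": "B", "weight": 12.0},
]
print(subsample_one_child_per_pmk(sample, random.Random(1)))
```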

Second adjustment: Non-response adjustment

The weights must be adjusted so that the respondents also represent the non-respondents; otherwise, totals, for example, would be underestimated. To decide how to assign the weight of the non-respondents to the respondents, the method of response homogeneous groups (RHGs) is applied. The RHG method involves grouping individuals with the same likelihood of response; an adjustment factor is then computed for each RHG. The adjustment was made in two steps. In the first step, the weights of the cases that could not be contacted were given to the cases for which contact was made (respondents and other non-respondents). In the second step, the weights of the other non-respondents were given to the respondents.
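The two-step redistribution can be sketched as below. The status codes, the field names and the use of the same RHG variable in both steps are assumptions for illustration; the source does not say how the groups were formed.

```python
from collections import defaultdict


def rhg_adjust(units, keep, group_key="rhg"):
    """Give the weight of dropped units to the kept units of the same RHG.

    Within each group the adjustment factor is
    (sum of all weights) / (sum of kept weights).
    """
    total = defaultdict(float)
    kept_total = defaultdict(float)
    for u in units:
        total[u[group_key]] += u["weight"]
        if keep(u):
            kept_total[u[group_key]] += u["weight"]
    return [
        {**u, "weight": u["weight"] * total[u[group_key]] / kept_total[u[group_key]]}
        for u in units if keep(u)
    ]


def two_step_nonresponse_adjustment(units):
    # Step 1: non-contacted cases pass their weight to contacted cases.
    contacted = rhg_adjust(units, lambda u: u["status"] != "no_contact")
    # Step 2: other non-respondents pass their weight to respondents.
    return rhg_adjust(contacted, lambda u: u["status"] == "respondent")


units = [
    {"rhg": "g1", "status": "respondent", "weight": 10.0},
    {"rhg": "g1", "status": "refusal", "weight": 10.0},
    {"rhg": "g1", "status": "no_contact", "weight": 20.0},
    {"rhg": "g2", "status": "respondent", "weight": 5.0},
]
print(two_step_nonresponse_adjustment(units))
```

In this made-up example the total weight before adjustment equals the total weight carried by the respondents afterwards, which is the property the adjustment is meant to preserve.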

Third adjustment: Post-stratification

The last adjustment factor ensures consistency between the estimates produced by the SYC and Statistics Canada's population estimates by age and province. This method is called post-stratification. Its purpose is to ensure that the sum of the weights matches known population totals.

Numbers used in the post-stratification refer to the population counts on December 31, 2010, as estimated by Statistics Canada.
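Post-stratification by age and province can be sketched as follows. The field names and the population count in the example are invented for illustration; they are not SYC figures.

```python
from collections import defaultdict


def post_stratify(respondents, population_counts):
    """Scale weights so they sum to the known count in each post-stratum.

    respondents: list of dicts with 'province', 'age' and 'weight'.
    population_counts: dict mapping (province, age) to the known count.
    """
    weight_sums = defaultdict(float)
    for r in respondents:
        weight_sums[(r["province"], r["age"])] += r["weight"]
    adjusted = []
    for r in respondents:
        cell = (r["province"], r["age"])
        factor = population_counts[cell] / weight_sums[cell]
        adjusted.append({**r, "weight": r["weight"] * factor})
    return adjusted


respondents = [
    {"province": "ON", "age": 3, "weight": 100.0},
    {"province": "ON", "age": 3, "weight": 300.0},
]
# Made-up population count for the (ON, age 3) cell.
print(post_stratify(respondents, {("ON", 3): 500}))
```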

Sampling variance calculation

It would be difficult, if not impossible, to derive an exact formula to calculate the sampling variance for the SYC because of the sample design, non-response adjustments, treatment of out-of-scope units and post-stratification. Such a task could only be undertaken under assumptions so strong that the resulting framework would be too simplistic to be of any use in practice. One way to approximate the sampling variance is to use the bootstrap method. With that method, a set of 1,000 weights, known as bootstrap weights, is derived from the survey weights and used to estimate the variance of the estimates. These 1,000 bootstrap weights are available for the SYC on a separate data file.
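The use of bootstrap weights can be sketched as below for a weighted mean. The sketch assumes the simple estimator in which the variance is the average squared deviation of the replicate estimates from the full-sample estimate; the exact formula, including any scaling factor, depends on how the bootstrap weights were constructed, which the source does not describe. The data in the example are invented and use 2 replicates rather than 1,000.

```python
def weighted_mean(values, weights):
    """Weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def bootstrap_variance(values, survey_weights, bootstrap_weights):
    """Approximate the sampling variance of a weighted mean.

    bootstrap_weights: list of B weight vectors, one per replicate.
    Assumed estimator: (1 / B) * sum((replicate - full_sample) ** 2).
    """
    full = weighted_mean(values, survey_weights)
    replicates = [weighted_mean(values, bw) for bw in bootstrap_weights]
    return sum((r - full) ** 2 for r in replicates) / len(replicates)


values = [1.0, 3.0]
survey_weights = [1.0, 1.0]
bootstrap_weights = [[2.0, 0.0], [0.0, 2.0]]
print(bootstrap_variance(values, survey_weights, bootstrap_weights))
```

The square root of this quantity gives an approximate standard error for the estimate.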

Quality evaluation


The SYC is conducted by Statistics Canada interviewers. Project managers and senior interviewers are responsible for ensuring that SYC interviewers are familiar with the survey's concepts and procedures. The project managers and/or senior interviewers working on the SYC were trained at the Statistics Canada head office, and then in turn delivered the same training to the interviewers in the regional offices.


Statistics Canada interviewers are trained in basic refusal conversion techniques for cases where a respondent refuses to be interviewed. If a respondent is adamant, the interviewer is instructed to obtain as much information as possible about the respondent (such as why they are refusing to participate) and refer the case to the senior interviewer. The senior interviewer then attempts to contact the respondent and convert the case. If the senior interviewer is unable to do so, a letter is sent to the respondent as a final effort to convert the case.


In the event a household could not be reached due to incorrect/outdated contact information from the sample file, the case was transferred to tracing. Attempts were then made to find the correct contact information using various means.

The survey applications

The use of CAI allows for complex flows and edits to be built into the questionnaire, helping with data quality and ensuring that respondents answer only the questions appropriate to their situations. The survey application underwent testing at Statistics Canada to ensure that it functioned properly. During collection, review screens, range edits, flow pattern edits and general consistency edits were used for quality control.


Once the data were collected, they underwent processing steps to check the quality of the data. The steps were pre-edit, flow edit and consistency edit. After the consistency edit step, the derived variables were checked to ensure that they were programmed correctly and that all records were assigned their correct values. The key indicators from the SYC were compared to a past survey that had similar content to ensure there were no issues with the SYC means and frequencies.

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects that could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

To prevent any data disclosure, confidentiality analysis is done using the Statistics Canada Generalized Disclosure Control System (G-Confid). G-Confid is used for primary suppression (direct disclosure) as well as for secondary suppression (residual disclosure). Direct disclosure occurs when the value in a tabulation cell is composed of or dominated by a few enterprises, while residual disclosure occurs when confidential information can be derived indirectly by piecing together information from different sources or data series.

Revisions and seasonal adjustment

This methodology does not apply to this survey.

Data accuracy

The survey population is built using the list of applicants to the Canadian Child Tax Benefit (CCTB), which is a monthly file provided by the Canada Revenue Agency (CRA). The difference between the target and survey populations consists of the children for whom no parent or guardian applied for CCTB benefits. This may include families who are not aware of the benefit, or who chose not to request it. It may also include children who lived with a foster family for the entire year, as their caregivers would have been subsidized by the provincial government. The coverage of the CCTB was estimated to be around 96% in 2009, by comparing with demographic projections that were based on the 2006 Canadian Census of Population. Analyses revealed no important undercoverage bias.

The overall response rate for the SYC is 64.9%. The provincial response rates range between 56.3% and 71.7%. New Brunswick, Ontario and British Columbia are the provinces with the lowest response rates. The age group response rates varied over a much narrower range, from 60.4% (8-year-olds) to 68.6% (9-year-olds).
