Survey of Approaches to Educational Planning (SAEP)

Record number:

The primary objective of the Survey of Approaches to Educational Planning (SAEP) is to improve our understanding of the processes by which the parents/guardians of children aged 0-18 marshal the monetary and non-monetary resources needed to successfully pursue post-secondary education.

Detailed information for 2002

Data release - November 20, 2003 (In 2008, the data were collected by the Access and Support to Education and Training Survey (ASETS, record number 5151).)


Statistics Canada was approached by Human Resources Development Canada (HRDC) to conduct a cross-sectional survey which would examine how Canadians are preparing their children for post-secondary education.

Parents/guardians can participate in several ways. They can pro-actively plan for the financing of their children's post-secondary education by putting aside savings for that purpose and by actively participating in government sponsored mechanisms that facilitate savings for post-secondary education (i.e., Registered Education Savings Plans (RESPs), Canada Education Savings Grants (CESGs)). They can also prepare in a non-monetary fashion by encouraging, guiding and supporting their children through their early education, thereby laying the groundwork for successful participation in post-secondary education.

The primary objective of the Survey of Approaches to Educational Planning (SAEP) is to improve our understanding of the processes by which the parents/guardians of children aged 0-18 marshal the monetary and non-monetary resources needed to successfully pursue post-secondary education. These include financial saving strategies, parents'/guardians' attitudes and values with respect to post-secondary education, and the child's demonstration of commitment to education through academic performance and extra-curricular involvement.


  • Education finance
  • Education, training and learning

Data sources and methodology

Target population

The target population consists of parents and guardians of Canadian children aged 0 to 18 years.

Instrument design

The current SAEP questionnaire was introduced in 2002. At the time, changes were made to the 1999 questionnaire in order to address existing data gaps, improve data quality and make use of Computer Assisted Interviewing (CAI).

The changes included the addition of new questions. For example, questions with dollar ranges were added to collect information on the value of RESPs, contributions to RESPs, the value of savings and contributions to savings.

Since the 1999 questionnaire had been designed as a paper questionnaire, the questionnaire redesign represented an opportunity to make extensive use of the power of CAI. This included the incorporation of question wording that depended upon answers to earlier questions, more complex question flows and an extensive set of on-line edits checking for logical inconsistencies.

The implementation of the 2002 questionnaire followed an extensive reassessment of data requirements, input consultations with academic experts and the client (HRDC), questionnaire development and questionnaire testing by the Questionnaire Development Resource Centre (QDRC). Questionnaire testing by the QDRC was done via focus groups in both official languages.


This is a sample survey with a cross-sectional design.

The SAEP was administered to a sub-sample of the dwellings in the October 2002 Labour Force Survey (LFS) sample, and therefore its sample design is closely tied to that of the LFS. Because the SAEP was a supplement to the LFS, the same frame was used.

The LFS is a monthly household survey whose sample of individuals is representative of the civilian, non-institutionalized population 15 years of age or older in Canada's ten provinces. Specifically excluded from the survey's coverage are residents of the Yukon, Northwest Territories and Nunavut, persons living on Indian Reserves, full-time members of the Canadian Armed Forces and inmates of institutions. These groups together represent an exclusion of approximately 2% of the population aged 15 or over.

The LFS follows a rotating panel sample design, in which households remain in the sample for six consecutive months. The total sample consists of six representative sub-samples or panels, and each month a panel is replaced after completing its six month stay in the survey.

The SAEP used five of the six rotation groups in the October LFS sample. For the SAEP, the coverage of the LFS was modified to include only those households with at least one child aged 18 and under and, within those households, only one randomly selected child.

SAEP-eligible households that were pre-identified as being part of the National Longitudinal Survey of Children and Youth (record number 4450) were excluded from SAEP collection.

Data sources

Data collection for this reference period: 2002-10-20 to 2002-11-15

Responding to this survey is voluntary.

Data are collected directly from survey respondents.

The SAEP was collected by way of Computer Assisted Telephone Interviewing (CATI). The SAEP collected information from the person most knowledgeable (PMK) about the selected child's education. If the PMK was not available, the interviewer arranged for a convenient time to phone back. Proxy response was not allowed.


Error detection

Some editing was done directly at the time of the interview using the computer assisted program. Where the information entered was outside the range of expected values (too large or too small), or inconsistent with previous entries, the interviewer was prompted, through message screens on the computer, to modify the information. However, interviewers had the option of bypassing the edits, and of skipping questions if the respondent did not know the answer or refused to answer. Therefore, the response data were subject to further edit processes once they arrived at head office.

At head office, the first type of error searched for was the questionnaire flow error. These errors happen when questions that did not apply to a respondent were nevertheless answered. In such cases, the responses were removed and replaced with a valid skip code. Questionnaire flow errors can also happen when the respondent was not asked questions that she/he should have been asked. For this type of error, a "not stated" code was assigned to the unanswered questions.
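The two flow-error cases described above can be illustrated with a minimal sketch. The code values `VALID_SKIP` and `NOT_STATED`, the function name and the field names are hypothetical placeholders; the actual SAEP codes and variable names are not given here.

```python
# Hypothetical code values; the real SAEP codes are not documented in this text.
VALID_SKIP = "VALID_SKIP"
NOT_STATED = "NOT_STATED"


def apply_flow_edit(record, lead_in, follow_up, applies_when="yes"):
    """Return a copy of `record` whose follow-up item is consistent with the lead-in.

    The follow-up question applies only when the lead-in equals `applies_when`.
    """
    edited = dict(record)
    applicable = edited.get(lead_in) == applies_when
    if not applicable:
        # The question did not apply: any recorded response is removed
        # and replaced with a valid skip code.
        edited[follow_up] = VALID_SKIP
    elif edited.get(follow_up) is None:
        # The question applied but was never asked or answered.
        edited[follow_up] = NOT_STATED
    return edited


# Example with made-up field names: a value was recorded although no RESP exists.
example = apply_flow_edit({"has_resp": "no", "resp_value": 5000}, "has_resp", "resp_value")
```

A real edit system would cover many lead-in/follow-up pairs at once; a single pair is shown only to make the two cases concrete.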

Further editing phases of processing involved the identification of logically inconsistent items and the modification of such conditions. Since the true value of each entry on the questionnaire was not known, the identification of errors could be done only through recognition of obvious inconsistencies. An erroneous value that was still plausible could not be detected and will have found its way into the survey statistics.

Where errors were detected, the erroneous items were replaced either by logically consistent values or by a "not stated" value. These changes were based on pre-specified criteria and involved the internal logic of the questionnaire. In order to make the changes, logic tables were developed, programmed and run on all the survey data to ensure that all the changes were made consistently and automatically.


Total non-response was handled by adjusting the weight of households who responded to the survey to compensate for those who did not respond.

In most cases, partial non-response to the survey occurred when the respondent did not understand or misinterpreted a question, refused to answer a question, or could not recall the requested information.

For the 2002 SAEP, donor imputation was used to fill missing data in household income and six key items (see table below). This was done in order to provide complete data, thereby allowing for totals to be estimated (e.g., total group RESP contributions in Ontario).

The six key items gathered information on the current value of, or annual contribution to, savings for the post-secondary education of children aged 0-18. The savings were in terms of registered educational savings plans (RESPs) or "other" (e.g., term deposits, guaranteed investment certificates (GICs), savings bonds, registered retirement savings plans (RRSPs), mutual funds).

Because the six items depend on previous questions (lead-ins), missing values in the lead-ins were imputed first. The lead-ins ask whether there are (or will be) savings and, if so, whether these savings are for the post-secondary education of children aged 0-18.

Imputation involved filling the missing values in household income, the six items and/or the lead-ins on a given record (the "recipient" record) using another record whose values were all known and whose characteristics were the "closest" (the "donor" record). The characteristics of each recipient were compared to those of each donor in a pool of donors. When the recipient and a donor had the same value for a characteristic, the weight (value) of that characteristic was added to a "score" for that donor. In the end, the donor with the highest score was deemed to be the closest, and was therefore chosen to fill the missing value(s) in the recipient. If more than one donor had the highest score, one of them was randomly selected. The pool of donors was made up in such a way that the imputed value assigned to the recipient, in conjunction with other non-imputed items from the recipient, would still pass the edits.
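The scoring rule described above can be sketched as follows. This is an illustration only: the matching characteristics, their weights and the field names are invented for the example, and the donor pool is assumed to have been filtered beforehand so that any donor's values would pass the edits.

```python
import random


def donor_score(recipient, donor, char_weights):
    """Sum the weights of the characteristics on which recipient and donor agree."""
    return sum(
        weight
        for name, weight in char_weights.items()
        if recipient.get(name) == donor.get(name)
    )


def choose_donor(recipient, donors, char_weights, rng=random):
    """Pick the highest-scoring donor; ties are broken at random."""
    scores = [donor_score(recipient, d, char_weights) for d in donors]
    best = max(scores)
    tied = [d for d, s in zip(donors, scores) if s == best]
    return rng.choice(tied)


def impute(recipient, donor, targets):
    """Fill only the missing target items on the recipient from the donor."""
    filled = dict(recipient)
    for item in targets:
        if filled.get(item) is None:
            filled[item] = donor[item]
    return filled


# Hypothetical characteristics and weights, for illustration only.
char_weights = {"province": 3, "household_size": 1, "education": 2}
recipient = {"province": "ON", "household_size": 4, "education": "university", "income": None}
donors = [
    {"province": "ON", "household_size": 3, "education": "university", "income": 72000},
    {"province": "QC", "household_size": 4, "education": "university", "income": 65000},
]
completed = impute(recipient, choose_donor(recipient, donors, char_weights), ["income"])
```

In this example the first donor scores 5 (province and education match) against 3 for the second, so its income is copied to the recipient.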

Donor imputation was done in three steps. First, household income was imputed, partly because household income is an important factor in the donor score when imputing key items. Second, the five parents' savings items and their corresponding lead-ins were imputed. These variables were imputed simultaneously for consistency and coherence. Finally, the others' savings item and its corresponding lead-ins were imputed simultaneously.


The principle behind estimation in a probability sample such as the LFS is that each person in the sample "represents", besides himself or herself, several other persons not in the sample. For example, in a simple random 2% sample of a population of 2,500 persons, each person in the sample represents 50 persons in the population.

The weighting phase is a step which calculates, for each record, what this number is. This weight appears on the microdata file, and must be used to derive meaningful estimates from the survey. For example, if the number of children whose parents/guardians have set aside savings for post-secondary education is to be estimated, it is done by selecting the records referring to those individuals in the sample with that characteristic and summing the weights entered on those records.
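As a small worked illustration of these two paragraphs, the sketch below computes the simple-random-sample weight from the example (a 2% sample of 2,500 persons is 50 sampled persons) and then estimates a count by summing weights. The record layout and the field names `weight` and `has_savings` are assumptions made for the example; on the microdata file the final weight is the one supplied with each record.

```python
def design_weight(population_size, sample_size):
    """Number of population units represented by each sampled unit
    under simple random sampling."""
    return population_size / sample_size


def weighted_count(records, has_characteristic, weight_key="weight"):
    """Estimate a population count by summing the weights of the
    sampled records that have the characteristic."""
    return sum(r[weight_key] for r in records if has_characteristic(r))


# A 2% sample of 2,500 persons contains 50 persons, each representing 50.
w = design_weight(2500, 50)

# Hypothetical records: estimate the number of children with savings set aside.
records = [
    {"weight": 120.0, "has_savings": True},
    {"weight": 80.0, "has_savings": False},
    {"weight": 150.0, "has_savings": True},
]
estimate = weighted_count(records, lambda r: r["has_savings"])
```

Here `w` is 50.0 and `estimate` is 270.0, the sum of the weights of the two records with savings.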

During processing of the data, 25 SAEP records did not match to corresponding records in the LFS. Thus they were coded as out of scope and were dropped from further processing. When supplementary survey records do not match to host survey records they must be dropped since a weight cannot be derived for them.

The principles behind the calculation of the weights for the SAEP are identical to those for the LFS. However, 5 adjustments are made to the LFS sub-weights in order to derive a final weight for the individual records on the SAEP microdata file:

1) An adjustment to account for the use of a five-sixths sub-sample, instead of the full LFS sample.

2) An adjustment to account for the additional non-response to the supplementary survey, i.e., non-response to the SAEP for households with at least one child aged 0-18 years which did respond to the LFS or for which the previous month's LFS data were brought forward. The procedure is similar to the LFS non-response weight adjustment, but the groupings are based on province, unemployment insurance region, rotation group, design type, urban versus rural area, census metropolitan versus non-census metropolitan area, type of dwelling, economic family type, household size, and Mom/Dad characteristics such as education, labour force status and social occupation. Since households without children are out-of-scope (and therefore not selected into the SAEP), their weights are not affected by this step.

3) An adjustment for the number of all households (i.e., those with or without at least one child aged 0-18 years) by household size (1, 2 and 3+ people) by province, according to independent estimates.

4) An adjustment to account for the random selection of one child from the selected household. In particular, the weight of the selected child is multiplied by the number of children in the household, up to a maximum (cap) of 4 children.

5) An adjustment for the number of children by province, sex group, and age group (0 to 5, 6 to 12, 13 to 15 and 16 to 18 years old), according to independent estimates.

The resulting weight, WTPM, is the final weight which appears on the SAEP microdata file.
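Some of the adjustments listed above can be sketched in a few lines. Adjustments 1 and 4 follow directly from the text (a factor of 6/5 for the five-sixths sub-sample, and the number of children capped at 4). For the adjustments to independent estimates, the sketch assumes a simple ratio adjustment within groups; the actual calibration method is not specified here, and all function and field names are hypothetical.

```python
def subsample_adjustment(weight, groups_used=5, groups_total=6):
    """Adjustment 1: inflate the weight because only five of the six
    rotation groups were used."""
    return weight * groups_total / groups_used


def child_selection_adjustment(weight, n_children, cap=4):
    """Adjustment 4: multiply by the number of children in the household,
    up to a maximum of `cap`."""
    return weight * min(n_children, cap)


def ratio_adjust(records, group_of, control_totals, weight_key="weight"):
    """Assumed form of adjustments 3 and 5: scale the weights in each group
    so that they sum to an independent control total for that group."""
    sums = {}
    for r in records:
        g = group_of(r)
        sums[g] = sums.get(g, 0.0) + r[weight_key]
    adjusted = []
    for r in records:
        g = group_of(r)
        new_r = dict(r)
        new_r[weight_key] = r[weight_key] * control_totals[g] / sums[g]
        adjusted.append(new_r)
    return adjusted


# Example: an LFS sub-weight of 100 for a household with six children.
w1 = subsample_adjustment(100.0)
w4 = child_selection_adjustment(w1, n_children=6)

# Example: two records in one hypothetical group with a control total of 80.
calibrated = ratio_adjust(
    [{"group": "A", "weight": 10.0}, {"group": "A", "weight": 30.0}],
    group_of=lambda r: r["group"],
    control_totals={"A": 80.0},
)
```

In the example, the sub-weight of 100 becomes 120 after adjustment 1 and 480 after adjustment 4 (six children, capped at four). The non-response adjustment (adjustment 2) is omitted because its grouping variables are survey-specific.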

Based on the LFS design, 1,000 bootstrap weights were generated, and for each set the complete adjusted weighting process was applied.
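Replicate weights of this kind are typically used to estimate sampling variance. The sketch below assumes the common estimator in which the variance is the average squared deviation of the replicate estimates from the full-sample estimate; the text does not state which formula applies to the SAEP, and the function names and data are illustrative.

```python
import math


def weighted_total(values, weights):
    """Weighted total of a variable over the sampled records."""
    return sum(v * w for v, w in zip(values, weights))


def bootstrap_variance(values, final_weights, replicate_weights):
    """Return (point estimate, variance estimate).

    Assumed estimator: mean squared deviation of the replicate estimates
    from the estimate computed with the final weights.
    """
    point = weighted_total(values, final_weights)
    replicates = [weighted_total(values, w) for w in replicate_weights]
    variance = sum((t - point) ** 2 for t in replicates) / len(replicates)
    return point, variance


# Toy data: an indicator variable, final weights, and two replicate weight sets
# (a real file would carry 1,000 sets).
values = [1, 0, 1]
final_weights = [10, 20, 30]
replicate_weights = [[12, 18, 30], [8, 22, 30]]

point, variance = bootstrap_variance(values, final_weights, replicate_weights)
cv = math.sqrt(variance) / point  # coefficient of variation of the estimate
```

With the toy data the point estimate is 40, the replicate estimates are 42 and 38, and the variance estimate is 4.0, giving a coefficient of variation of 0.05.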

Disclosure control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

In order to prevent disclosure of certain items, the SAEP dropped some variables, collapsed others, and even changed the value of highly visible variables to Not-Stated.

Data accuracy

The estimates derived from this survey are based on a sample of households. Somewhat different estimates might have been obtained if a complete census had been taken using the same questionnaire, interviewers, supervisors, processing methods, etc. as those actually used in the survey. The difference between the estimates obtained from the sample and those resulting from a complete count taken under similar conditions is called the sampling error of the estimate.

Errors which are not related to sampling may occur at almost every phase of a survey operation. Interviewers may misunderstand instructions, respondents may make errors in answering questions, the answers may be incorrectly entered on the questionnaire and errors may be introduced in the processing and tabulation of the data. These are all examples of non-sampling errors.

Over a large number of observations, randomly occurring errors will have little effect on estimates derived from the survey. However, errors occurring systematically will contribute to biases in the survey estimates. Considerable time and effort were devoted to reducing non-sampling errors in the survey. Quality assurance measures were implemented at each step of the data collection and processing cycle to monitor the quality of the data. These measures include the use of highly skilled interviewers, extensive training of interviewers with respect to the survey procedures and questionnaire, observation of interviewers to detect problems of questionnaire design or misunderstanding of instructions, procedures to ensure that data capture errors were minimized, and coding and edit quality checks to verify the processing logic.


Data file