Survey of Household Spending (SHS)
Detailed information for 2017
Every 2 years
The main purpose of the survey is to obtain detailed information about household spending as well as limited information on dwelling characteristics and household equipment.
Data release - December 12, 2018
The SHS primarily collects detailed information on household expenditures. It also collects information about the annual income of household members (from personal income tax data), demographic characteristics of the household, certain dwelling characteristics (e.g., type, age and tenure) and certain information on household equipment (e.g., electronics and communications equipment). The survey is conducted in the 10 provinces and the 3 territorial capitals every 2 years starting from 2017.
SHS data are used at Statistics Canada by the Canadian System of Macroeconomic Accounts as an input to calculate the gross domestic product (GDP) and by the Consumer Prices Division to calculate basket weights for the Consumer Price Index (CPI). In addition, federal and provincial governments use the data to develop social and economic policies and programs. Various groups also use the data to address issues directly or indirectly related to Canadians' spending habits.
Collection period: The data are collected on a continuous basis from January to December of the survey reference year, from a sample of households spread over twelve monthly collection cycles.
- Families, households and housing
- Household characteristics
- Household spending and savings
- Housing and dwelling characteristics
- Income, pensions, spending and wealth
Data sources and methodology
The target population is the population of Canada's 10 provinces, as well as the territorial capitals of Whitehorse, Yellowknife and Iqaluit, excluding residents of institutions, members of the Canadian Forces living in military camps and people living on Indian reserves. In all, these exclusions account for about 2% of the population.
For operational reasons, people living in some remote areas, where the rate of vacant dwellings is very high and collection costs would be exorbitant, are excluded from collection. In addition to people living in institutions, people living in other types of collective dwellings are also excluded:
- people living in residences for dependent seniors;
- people living permanently in school residences, work camps, etc.; and
- members of religious and other communal colonies.
Collection exclusions make up less than 0.5% of the target population. However, these people are included in the population estimates to which the SHS estimates are adjusted.
Since 2010 in the provinces and 2015 in the three territorial capitals, the SHS combines the use of a questionnaire and an expenditure diary. The questionnaire is used for the most part to collect regular and less frequent expenses during a computer assisted personal interview. The diary is used to collect frequent or smaller expenses, which are difficult to recall during a retrospective interview.
In order to introduce this new collection model, each of the many expenditure items to be collected by the SHS was assigned a collection mode in advance, which was either the questionnaire or the diary. For the questionnaire items, a reference period of one, three or twelve months, last payment, or four weeks, was also selected. Those choices were largely based on the results of qualitative tests, on the international expertise in that type of collection model and on the studies conducted to gauge the potential increase in the variances of estimates due to shortening of reference periods. These studies were based on the data from the Consumer Expenditure Survey conducted by the U.S. Bureau of Labor Statistics.
The content of the questionnaire was determined in consultation with the primary internal and external users of the survey data.
This is a sample survey with a cross-sectional design.
In the 10 provinces, the sampling unit in the first stage of sampling is the geographic area (referred to as a cluster). In the second stage, the sampling unit is the dwelling.
In the three territorial capitals, the sampling unit is the dwelling.
A stratified multi-stage sampling design was used to select the sample in the 10 provinces. It is essentially a two-stage design, of which the first stage is a sample of geographic areas (referred to as clusters). Next, a list of all the dwellings in the selected clusters is prepared and a sample of dwellings is selected within each cluster. The selected dwellings that are inhabited by members of the target population constitute the survey's sample of households. The SHS uses a number of components of the Labour Force Survey's (LFS) sample design to minimize operating costs, though the dwellings selected for the SHS are different from those selected for the LFS.
The national sample is first allocated among the provinces based on the variability of total household expenditures and, to a lesser extent, the number of households in each province. The goal is to obtain estimates of similar quality across all provinces. The sample is then divided into strata defined by grouping clusters with similar characteristics based on a number of socio-demographic variables. Some strata were defined to target specific subpopulations such as high-income households. To improve the quality of the estimates, the high-income household strata are allocated a larger share of the sample than the allocation proportional to stratum size that is used in other strata.
A one-stage sampling design was used to select the sample in the three territorial capitals. The first step of the sample allocation was to determine the number of dwellings to be sampled in each city. The overall sample was allocated to each city taking into account the size of the city and the quality of the estimates obtained from previous cycles of the SHS in the North.
Sampling and sub-sampling:
The target sample of the 2017 Survey of Household Spending consists of 17,500 households across the 10 provinces, and 900 households across the three territorial capitals.
Since data are collected monthly, the sample is divided into 12 subsamples of similar sizes. During that process, the SHS sample is coordinated with the samples of the LFS and, to a lesser extent, the Canadian Community Health Survey (CCHS), which use the same sampling frame and conduct personal interviews for part of their sample. Coordination means that, wherever possible, if a cluster is selected for more than one survey, collection for the surveys will take place in the same month. This enables the interviewer to become familiar with the neighborhood, collect the data and carry out the necessary follow-up for more than one survey at a time.
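As a rough illustration of dividing a sample into 12 subsamples of similar sizes, the following Python sketch shuffles the sampled units and deals them out to the months in turn. It is a simplified assumption: it ignores the coordination with the LFS and CCHS described above, and the function name, the seed and the use of plain integer identifiers are hypothetical.

```python
import random

def split_into_monthly_subsamples(unit_ids, n_months=12, seed=2017):
    """Randomly spread sampled units over n_months subsamples whose
    sizes differ by at most one unit (illustrative only)."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    subsamples = [[] for _ in range(n_months)]
    for position, unit in enumerate(shuffled):
        # Deal units out month by month, like dealing cards
        subsamples[position % n_months].append(unit)
    return subsamples

# 17,500 is the target provincial sample size quoted in the text;
# the identifiers 0..17499 are placeholders.
subsamples = split_into_monthly_subsamples(range(17500))
sizes = [len(s) for s in subsamples]
```

Because 17,500 is not a multiple of 12, some monthly subsamples in this sketch hold one more unit than others.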
Data collection for this reference period: 2017-01-02 to 2017-12-31
Responding to this survey is voluntary.
Data are collected directly from survey respondents and extracted from administrative files.
Data from respondents are collected through a computer-assisted personal interview and using a paper diary.
Households in the sample are first asked to complete the personal interview, which mainly collects regular expenditures (such as rent and electricity) and less frequent expenditures (such as furniture and dwelling repairs) for a recall period that varies in length depending on the type of expenditure. For regular expenditures, the amount of the last payment and the period it covers are typically collected. For other types of expenditures collected in the interview, recall periods of one, three or twelve months are used. The recall periods are defined in terms of months preceding the month of the interview. For example, for a household in the June sample, "the last three months" corresponds to the period from March 1 to May 31. Demographic characteristics, dwelling characteristics and household equipment information, which are also collected in the interview, refer to the household's situation at the time of the interview.
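The recall-period rule can be expressed as a small date calculation. The Python sketch below is illustrative only; the function name and signature are hypothetical and are not taken from the survey's systems.

```python
from datetime import date, timedelta

def recall_window(interview_year, interview_month, months_back):
    """Return (start, end) of the recall period made up of the
    `months_back` full calendar months preceding the interview month."""
    # The window ends on the last day of the month before the interview
    end = date(interview_year, interview_month, 1) - timedelta(days=1)
    # Count months from year 0 so that windows crossing a year boundary work
    month_index = interview_year * 12 + (interview_month - 1) - months_back
    start = date(month_index // 12, month_index % 12 + 1, 1)
    return start, end

# A household in the June 2017 sample, three-month recall period:
window = recall_window(2017, 6, 3)   # (date(2017, 3, 1), date(2017, 5, 31))
```

For a January interview, the same rule reaches back into the previous calendar year.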
Following the interview, respondents selected to complete the expenditure diary are asked to record the expenditures of all household members for a period of two weeks starting the day after the interview. Households are required to include all of their spending, except for a few types of expenditures, such as rent, regular utilities payments, and real estate and vehicle purchases. Households have the option of providing receipts of their purchases made during the two-week period in order to reduce the amount of information manually recorded in the diary. However, they are asked to write out additional information on the receipt if the description of the item appearing on the receipt is incomplete.
A telephone follow-up is carried out a few days after the interview to address any questions the respondent may have about the diary and to provide important information about how it should be completed. At the end of the two-week period, the interviewer returns to the respondent's residence to pick up the diary and ask a few additional questions to help the respondent report expenditures that may have been forgotten.
In the 10 provinces, 50% of households sampled for the interview are selected to complete the diary for a period of 2 weeks. In the territorial capitals, all households sampled for the interview are also selected to complete the 2-week expenditure diary.
The diaries and all receipts supplied by respondents are scanned and captured at Statistics Canada's head office. An expenditure classification code is assigned to each item from a list of over 650 different codes.
Household income for the SHS is derived by linking income tax information from the Canada Revenue Agency (CRA) to household members. Respondents are informed that the survey data will be combined with tax data to obtain personal income information for household members aged 16 and over on December 31 of the calendar year preceding the survey year. Income is imputed for individuals who do not agree to have their tax data linked, as well as for those for whom linkage to income tax information is unsuccessful.
The SHS links income tax data to survey respondents using deterministic and probabilistic record linkage techniques.
View the Questionnaire(s) and reporting guide(s).
The questionnaire includes many features designed to maximize the quality of the data collected. Many edits are built into the questionnaire to compare the reported data with tolerance thresholds and to detect logical inconsistencies. When an edit fails, the interviewer is prompted to correct the information with the respondent's help, if necessary. Once the data are transmitted to Head Office, a comprehensive series of processing steps is undertaken for the purpose of verifying each questionnaire in detail. Invalid responses are corrected or flagged for imputation.
A number of edits are also carried out on the diary data when the diaries are received at Head Office and throughout the capture and coding stages of data processing. For example, checks are carried out to ensure that the start and end dates of the reference period of the diary are indicated, that the reported expenditures were made during the specified reference period, and that no items are duplicated by appearing both in the diary and on the receipts provided by the respondent. After validation, capture and coding, quality control procedures are applied. A sample of diaries is selected and completely rechecked to ensure that the diaries were captured and coded as specified in the procedures.
Following this initial processing, a series of detailed edits is applied to all diary data. Invalid responses are corrected or flagged for imputation. The final step is to assess whether the information reported in the diaries is of sufficient quality, using parameters that differ according to household characteristics. The reported expenditures and number of items are compared with minimum thresholds estimated for each geographic area (Atlantic Provinces, Quebec, Ontario, Prairie Provinces, and British Columbia), each household income class and each household size. Diaries that satisfy the conditions are deemed usable. The other diaries are examined and are deemed usable if there is a note explaining why the number and value of all reported items are low. Diaries that do not meet the usability criteria are excluded from the estimates.
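The usability assessment can be pictured as a threshold lookup followed by a simple test. In the Python sketch below, the threshold values, the income-class label and the function name are all hypothetical placeholders; the actual thresholds are estimated by Statistics Canada and are not given in this document.

```python
# Hypothetical minimum thresholds keyed by
# (geographic area, household income class, household size).
MIN_THRESHOLDS = {
    ("Ontario", "middle", 2): {"min_spent": 150.0, "min_items": 20},
}

def diary_is_usable(total_spent, n_items, key,
                    thresholds=MIN_THRESHOLDS, has_explanatory_note=False):
    """Deem a diary usable if it meets both minimums for its group,
    or if a note explains why reported spending is low."""
    limits = thresholds[key]
    meets_minimums = (total_spent >= limits["min_spent"]
                      and n_items >= limits["min_items"])
    return meets_minimums or has_explanatory_note

group = ("Ontario", "middle", 2)
usable = diary_is_usable(420.0, 35, group)                          # meets both minimums
rescued = diary_is_usable(60.0, 5, group, has_explanatory_note=True)  # low, but explained
rejected = diary_is_usable(60.0, 5, group)                          # low, no explanation
```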
Donor imputation by the nearest neighbour method is generally used to resolve missing or invalid information in interview questions. Data from another respondent with similar characteristics (the donor) are used to impute. The imputation is done on one group of variables at a time, with the groups formed on the basis of the relationships among the variables. The characteristics used to identify the donor are selected such that they are correlated with the variables to be imputed. Household income, dwelling type, and the number of adults and children are commonly used characteristics. The household income used for imputation is taken from the personal income tax data and equals the sum of the incomes of all household members aged 16 and over on December 31 of the calendar year preceding the survey.
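A minimal sketch of nearest-neighbour donor imputation is shown below in Python. It is not the CANCEIS algorithm: the scaled squared-difference distance, the scale values, the variable names and the sample records are all assumptions chosen for illustration.

```python
def impute_from_nearest_donor(recipient, donors, scales, target_vars):
    """Fill missing target variables in `recipient` with the values of the
    donor closest on the matching characteristics listed in `scales`."""
    def distance(donor):
        # Scaled squared differences so income does not dominate the counts
        return sum(((recipient[v] - donor[v]) / s) ** 2
                   for v, s in scales.items())

    nearest = min(donors, key=distance)
    filled = dict(recipient)
    for var in target_vars:
        if filled.get(var) is None:      # impute only what is missing
            filled[var] = nearest[var]
    return filled

# Hypothetical records: matching characteristics plus one expenditure item
donors = [
    {"income": 50000, "adults": 2, "children": 0, "furniture": 800.0},
    {"income": 90000, "adults": 2, "children": 2, "furniture": 1500.0},
]
recipient = {"income": 88000, "adults": 2, "children": 2, "furniture": None}
scales = {"income": 10000.0, "adults": 1.0, "children": 1.0}

completed = impute_from_nearest_donor(recipient, donors, scales, ["furniture"])
```

Here the second donor is closest on income and household composition, so its furniture expenditure is copied to the recipient.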
Donor imputation is also used when information is missing from the daily expenditure diary. A respondent may have reported a particular expenditure item without its cost or given a total amount spent (on groceries, for example) without listing the individual items. Over time, it has been observed that a growing number of households report grocery totals in the diary instead of detailed expenses for individual grocery items. Starting from SHS 2017, a new imputation method was implemented in order to improve the process of distributing the grocery totals among different grocery items.
Imputation is also used to enhance the level of detail in coding the reported items. For example, the information provided by the respondent may simply indicate that a bakery product was purchased, but a more detailed code is required to meet the survey's needs. In this case, donor imputation is used to impute the type of bakery product (e.g., bread, crackers, cookies, cakes and other pastries, etc.). Diary imputation is carried out at the reported item level, and the characteristics most often used to identify the donor are cost, available partial item code, household income and household size. Imputation is done by province and quarter to control for provincial differences and the seasonality of expenditures.
For personal income, respondents are matched to their records in the personal income tax data file. Missing or invalid tax data are generally donor imputed.
Income and expenditure imputation is performed primarily with Statistics Canada's Canadian Census Edit and Imputation System (CANCEIS).
After imputation, taxes are added to those diary items that respondents are told to report without taxes. The applicable Goods and Services Tax (GST) and the Provincial Sales Tax (PST), or the Harmonized Sales Tax (HST) are added to these diary items, according to the appropriate federal and provincial taxation rates.
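The tax step amounts to multiplying a pre-tax amount by one plus the applicable rates. The Python sketch below assumes the rates are supplied by the caller and that GST and PST are both applied to the pre-tax amount; the function name is hypothetical, the rates shown are examples only, and the actual rules applied by the survey may differ by province and item.

```python
def add_sales_tax(amount, gst=0.0, pst=0.0, hst=0.0):
    """Return the tax-inclusive amount, rounded to cents.
    Either HST alone, or GST plus PST, is expected (an assumption)."""
    if hst and (gst or pst):
        raise ValueError("use either HST, or GST and PST, not both")
    return round(amount * (1.0 + gst + pst + hst), 2)

# Example rates, supplied for illustration only
with_hst = add_sales_tax(100.00, hst=0.13)
with_gst_pst = add_sales_tax(20.00, gst=0.05, pst=0.07)
```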
The estimation of population characteristics from a sample survey is based on the premise that each sampled household represents a certain number of other households in addition to itself. This number is referred to as the survey weight, and the weighting process involves computing the weight assigned to each household. There are a number of steps in this process.
First, each household is given an initial weight equal to the inverse of its selection probability. A few adjustments are later applied to the interview weights and the diary weights.
The interview weights are first adjusted to take into account the households that did not answer the questionnaire. They are then adjusted so that selected survey estimates agree with aggregates or estimates from independent auxiliary sources.
The diary weights are also subject to a series of adjustments. A first factor adjusts for nonresponse to the questionnaire. A second factor compensates for households that respond to the questionnaire but refuse to complete the diary. The weights are also adjusted to demographic estimates in a manner similar to that used for the interview.
More information on the interview and diary weights can be found in the User Guide for the Survey of Household Spending, 2017.
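The weighting steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the nonresponse adjustment is a single ratio applied within one group, and the adjustment to auxiliary totals is a one-dimensional ratio adjustment. The function names and figures are hypothetical, and the adjustments used in production are more elaborate.

```python
def design_weights(selection_probs):
    """Initial weight: the inverse of each household's selection probability."""
    return [1.0 / p for p in selection_probs]

def adjust_for_nonresponse(weights, responded):
    """Spread the weight of nonrespondents over respondents (one group)."""
    total_all = sum(weights)
    total_resp = sum(w for w, r in zip(weights, responded) if r)
    factor = total_all / total_resp
    return [w * factor if r else 0.0 for w, r in zip(weights, responded)]

def ratio_calibrate(weights, control_total):
    """Scale the weights so they sum to an independent control total."""
    factor = control_total / sum(weights)
    return [w * factor for w in weights]

# Hypothetical sample of four households
w0 = design_weights([0.01, 0.01, 0.02, 0.02])            # 100, 100, 50, 50
w1 = adjust_for_nonresponse(w0, [True, False, True, True])
w2 = ratio_calibrate(w1, control_total=330.0)
```

In this sketch the diary weights would receive one further factor of the same form for households that answered the interview but did not complete a usable diary.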
All interview and diary expenditure variables are annualized by multiplying them by an appropriate factor based on their reference period. Some expenditure data are also corrected by another adjustment factor when they have been identified as influential values or outliers. For the diary, a further adjustment is made to compensate for days on which the respondent did not report.
For a category of expenditure collected using the interview, estimates are equal to the sum of the annualized, adjusted and weighted (using interview weights) expenditures for that category. Estimates for an expenditure category derived from diary data are calculated in a similar manner using diary weights and the appropriate annualization and adjustment factors. For expenditure categories that include data from both collection vehicles, estimates are based on the sum of estimates from the diary and from the interview.
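Annualization and estimation can be illustrated with a short Python sketch. The annualization factor of 12 divided by the number of months in the reference period is an assumption consistent with the description above, not a documented formula; the record layout and figures are hypothetical, and the influential-value and diary-day adjustments are omitted.

```python
def annualization_factor(period_months):
    """Factor that scales an amount reported for `period_months` to a year."""
    return 12.0 / period_months

def weighted_annual_total(records):
    """Sum of weight x annualized amount over households for one category."""
    return sum(r["weight"] * r["amount"] * annualization_factor(r["period_months"])
               for r in records)

# Hypothetical interview records for one expenditure category
interview_records = [
    {"weight": 100.0, "amount": 50.0, "period_months": 1},    # one-month recall
    {"weight": 200.0, "amount": 300.0, "period_months": 3},   # three-month recall
]
interview_estimate = weighted_annual_total(interview_records)

# For a category collected by both vehicles, the two estimates are added;
# the diary figure here is a placeholder.
diary_estimate = 25000.0
combined_estimate = interview_estimate + diary_estimate
```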
Starting with 2017, a new method of influential value detection (conditional bias method) was introduced to identify weighted expenditure amounts for a given household and a given item that are much larger or smaller than the weighted amounts of other households for that same item in a given geographic area. Adjustments are made to the most extreme influential expenditure estimates. This method corrects a larger number of influential values but applies smaller adjustments than the previous method.
When all processing and estimation steps are complete, the data are compared with the previous year's estimates and, when possible, with other data sources such as the Census, administrative sources and other Statistics Canada surveys.
Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
Suppression rules are applied to the various tables of SHS estimates to ensure respondent confidentiality.
Revisions and seasonal adjustment
The 2017 SHS estimates were revised with weights adjusted to 2017 demographic population estimates based on 2016 Census data as well as more recent information from administrative sources such as birth, death and migration registers.
The standard error is a common measure of sampling error. It is the degree of variation in the estimates due to the selection of one particular sample rather than another. The standard errors for the SHS are estimated using the bootstrap method. The coefficient of variation (CV) is the standard error expressed as a percentage of the estimate. SHS CVs are available for the national and provincial estimates as well as for estimates by household type, age of reference person, income quintile, household tenure and size of area of residence.
The CV, at the national level for total household expenditures, is 0.77% (only the 10 provinces are included).
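The relationship between bootstrap replicates, the standard error and the CV can be sketched in Python as below. The variance formula used here, the average squared deviation of the replicate estimates from the full-sample estimate, is one common bootstrap convention and is assumed rather than taken from SHS documentation; the replicate values are made up.

```python
import math

def bootstrap_se_and_cv(estimate, replicate_estimates):
    """Return (standard error, CV in percent) from bootstrap replicates."""
    n_reps = len(replicate_estimates)
    # Average squared deviation of replicates around the full-sample estimate
    variance = sum((rep - estimate) ** 2 for rep in replicate_estimates) / n_reps
    se = math.sqrt(variance)
    cv_percent = 100.0 * se / estimate
    return se, cv_percent

# Hypothetical full-sample estimate and four replicate estimates
se, cv = bootstrap_se_and_cv(100.0, [98.0, 102.0, 100.0, 100.0])
```

In practice many more replicates would be used; four are shown only to keep the arithmetic easy to follow.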
For the 10 provinces altogether, the response rate for the 2017 SHS interview is 66.9%. The final diary response rate (defined as the percentage of usable diaries relative to the number of households selected to fill out the diary) is 41.3%.
For the three territorial capitals combined, the interview response rate is 64.4% for the 2017 SHS. In the three territorial capitals, all households selected for the interview are also selected to fill out a diary. The final diary response rate is 33.7%.
Non-sampling errors occur because certain factors make it difficult to obtain accurate responses and to ensure that these responses retain their accuracy throughout processing. Unlike sampling errors, non-sampling errors are not easily quantified. Four sources of non-sampling error can be identified: coverage error, response error, non-response error and processing error. For more details about these errors, refer to the SHS 2019 user guide.
Errors due to non-response occur when potential respondents do not provide the required information or when the information they provide is unusable. The main impact of non-response on data quality is that it can cause a bias in the estimates if the characteristics of non-respondents differ from those of respondents in a way that impacts the expenditures studied. While non-response rates can be calculated, they provide only an indication of data quality, since they do not measure the degree of bias present in the estimates. The magnitude of non-response can be considered a simple indicator of the risks of bias in the estimates.
While the weights of respondent households are adjusted to compensate for non-respondent households, partial non-response, such as failure to answer some questions, is handled through imputation.
- Household Expenditures Research Paper Series - User Guide for the Survey of Household Spending
This report describes the quality indicators produced for the Survey of Household Spending. These quality indicators, such as coefficients of variation, nonresponse rates, slippage rates and imputation rates, help users interpret the survey data.
Last review: January 27, 2017.