Canadian Income Survey (CIS)
Detailed information for 2012
The Canadian Income Survey (CIS) is a cross-sectional survey developed to provide a portrait of the income and income sources of Canadians, with their individual and household characteristics.
Data release - December 10, 2014
The primary objective of the Canadian Income Survey (CIS) is to provide information on the income and income sources of Canadians, along with their individual and household characteristics. The data collected in the CIS are combined with data from the Labour Force Survey (LFS, record number 3701) and with tax data.
The survey gathers information on labour market activity, school attendance, activity limitation, support payments, child care expenses, inter-household transfers, personal income, and characteristics and costs of housing. This content is supplemented with information on individual and household characteristics (e.g., age, educational attainment, main job characteristics, family type), as well as geographic details (e.g., province, census metropolitan area (CMA)) from the LFS. Tax data for income and income sources are also combined with the survey data.
Results from the survey are made available not only to various levels of government but also to individuals and organizations. All levels of government can use CIS data to shape policies and programs related to the economic well-being of Canadians. Organizations such as the Organisation for Economic Co-operation and Development (OECD) use the results for international benchmarking and comparison studies.
Reference period: Calendar year
- Families, households and housing
- Household, family and personal income
- Income, pensions, spending and wealth
- Low income and inequality
Data sources and methodology
The target population consists of all individuals in Canada, excluding residents of the Yukon, the Northwest Territories and Nunavut, residents of institutions, and persons living on reserves and other Aboriginal settlements in the provinces. Overall, these exclusions amount to less than 3 percent of the population.
Qualitative testing was carried out by Statistics Canada's Questionnaire Design Resource Centre (QDRC) for selected modules of the survey questionnaire, while questions for the remaining modules came from other Statistics Canada surveys. Question wording adheres as closely as possible to questions established by the Harmonized Content Committee at Statistics Canada.
The questionnaire follows standard practices and wording used in a computer-assisted interviewing environment, such as the automatic control of flows that depend upon answers to earlier questions and the use of edits to check for logical inconsistencies and capture errors. The computer application for data collection was tested extensively.
This is a sample survey with a cross-sectional design.
The Canadian Income Survey is administered to a sub-sample of the individuals already selected for the Labour Force Survey (LFS), record number 3701. The LFS sample is drawn from an area frame and is based on a stratified, multi-stage design that uses probability sampling. The total sample is composed of six independent samples, called rotation groups, because each month one sixth of the sample (or one rotation group) is replaced.
The 2012 CIS used four rotation groups from the LFS, namely the rotation groups answering the LFS for the last time in March, April, May and June of 2013. The sample size is approximately 8,400 per rotation group.
Data collection for this reference period: 2013-03-17 to 2013-07-02
Responding to this survey is voluntary.
Data are collected directly from survey respondents and extracted from administrative files.
The LFS interview is completed by a responsible member of the household, who generally provides the LFS responses for all members. Following the LFS interview, and subject to operational constraints, the interviewer requests that this same member answer the CIS questionnaire for all members.
Interviews are conducted from Statistics Canada's regional offices using a Computer Assisted Telephone Interviewing (CATI) application.
In order to reduce response burden and improve the accuracy of the data, CIS does not ask respondents questions on every aspect of their income. Rather, CIS retrieves this information from tax records. CIS respondents are informed of these plans during the interview, a practice which is called informed replacement.
A series of verifications is undertaken to ensure that the records are consistent and that collection and capture of the data do not introduce errors. Reported data are examined for completeness and consistency using automated edits coupled with manual review. Some responses reporting uncommon values or characteristics are processed manually.
Households are kept as respondents if information for at least one person in the household was provided, and any key data that is missing for individuals within responding households is imputed. Imputation is carried out for income variables as well as labour and school attendance variables. CIS uses a nearest neighbour approach. First, a set of matching variables, each of which is correlated with the variables to be imputed, is defined. Then, through the combined use of a score function (for categorical matching variables) and a distance function (for numeric matching variables), the most similar consistent donor record is identified and used to impute data for the record.
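As a rough illustration of the donor search described above, the following Python sketch ranks donors first by a score on categorical matching variables and then by a distance on numeric matching variables. The variable names, the sample records, and the way the score and distance are combined (score first, distance as tie-breaker) are assumptions made for illustration only; the actual CIS matching variables and functions are not specified here, and the consistency checks applied to donors are omitted.

```python
def match_score(recipient, donor, categorical_keys):
    """Count the categorical matching variables on which both records agree."""
    return sum(1 for key in categorical_keys if recipient[key] == donor[key])


def distance(recipient, donor, numeric_keys):
    """Sum of absolute differences over the numeric matching variables."""
    return sum(abs(recipient[key] - donor[key]) for key in numeric_keys)


def impute_from_nearest_donor(recipient, donors, categorical_keys, numeric_keys, target):
    """Return a copy of `recipient` with `target` filled from the most similar donor.

    Assumption: a higher categorical score wins, and the numeric distance
    breaks ties. Donors missing the target variable are not eligible.
    """
    eligible = [d for d in donors if d.get(target) is not None]
    if not eligible:
        return dict(recipient)  # nothing to impute from
    best = min(
        eligible,
        key=lambda d: (
            -match_score(recipient, d, categorical_keys),
            distance(recipient, d, numeric_keys),
        ),
    )
    filled = dict(recipient)
    filled[target] = best[target]
    return filled


# Hypothetical records, not CIS data.
donors = [
    {"province": "ON", "family_type": "couple", "age": 40, "wages": 52000},
    {"province": "ON", "family_type": "couple", "age": 63, "wages": 31000},
    {"province": "QC", "family_type": "single", "age": 41, "wages": 45000},
]
recipient = {"province": "ON", "family_type": "couple", "age": 42, "wages": None}

filled = impute_from_nearest_donor(
    recipient, donors, ["province", "family_type"], ["age"], "wages"
)
```

In this toy example the first donor agrees on both categorical variables and is closest in age, so its wage value is copied to the recipient.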
Amounts for certain government programs, such as refundable provincial tax credits, child benefits, and the Goods and Services/Harmonized Sales Tax Credit, are derived based on qualifying characteristics. Data from the tax files do not need imputation.
The CIS sample is a sub-sample selected from the Labour Force Survey sample. The LFS uses a complex random sampling plan to select households. Each household in the sample represents a number of other households in the population. Estimates for a given characteristic are obtained by multiplying each household's weight by the corresponding value of the characteristic for that household and summing the results over the sampled households. The key step in the point estimation process is therefore the derivation of the weight.
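A weighted estimate of this kind can be sketched in a few lines of Python. The function names and the income and weight values below are hypothetical and are used only to show the arithmetic of a weighted total and a weighted mean.

```python
def weighted_total(values, weights):
    """Estimated population total: sum of weight x value over sampled households."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * y for y, w in zip(values, weights))


def weighted_mean(values, weights):
    """Estimated population mean: weighted total divided by the sum of weights."""
    return weighted_total(values, weights) / sum(weights)


# Hypothetical sample of three households (not CIS data).
incomes = [40000, 60000, 80000]
weights = [100, 200, 100]  # each household represents this many households

total_income = weighted_total(incomes, weights)
mean_income = weighted_mean(incomes, weights)
```

Because the weights enter every estimate in this way, any change to the weights carries through directly to the published figures.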
The initial weights are the LFS weights, which are adjusted to account for the fact that CIS is a sub-sample of the LFS sample.
Two types of adjustment are then applied to these weights in order to improve the reliability of the estimates. The weights are first inflated to compensate for non-response. The non-response adjusted weights are then further adjusted to ensure that estimates on relevant population characteristics respect population totals from sources other than the survey.
The first set of population totals used for CIS is based on Statistics Canada's Demography Division population counts for different age/sex groups for each province and six Census Metropolitan Areas (Montreal, Toronto, Winnipeg, Calgary, Edmonton, and Vancouver). Totals by household size and economic family size are also used for each province. These annual population totals are based in large part on totals from the 2006 Census of Population.
The second set of totals is derived from Canada Revenue Agency (CRA) administrative data (T4 file) and is intended to ensure that the weighted distribution of income (based on wages and salaries) in the data set matches that of the Canadian population.
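The two adjustments can be illustrated with a simplified Python sketch: a non-response adjustment within cells, followed by a ratio adjustment to a single set of external control totals (post-stratification). This is a deliberate simplification. The CIS calibrates to several sets of totals at once, which requires a more general calibration method, and the cell definitions, field names and numbers below are hypothetical.

```python
from collections import defaultdict


def adjust_for_nonresponse(units):
    """Inflate respondent weights so they also represent non-respondents.

    Within each adjustment cell, respondent weights are multiplied by
    (total weight of all units) / (total weight of respondents).
    """
    total = defaultdict(float)
    responding = defaultdict(float)
    for u in units:
        total[u["cell"]] += u["weight"]
        if u["responded"]:
            responding[u["cell"]] += u["weight"]
    return [
        {**u, "weight": u["weight"] * total[u["cell"]] / responding[u["cell"]]}
        for u in units
        if u["responded"]
    ]


def poststratify(respondents, control_totals, group_key):
    """Scale weights so weighted counts in each group equal the control totals."""
    weighted = defaultdict(float)
    for r in respondents:
        weighted[r[group_key]] += r["weight"]
    return [
        {**r, "weight": r["weight"] * control_totals[r[group_key]] / weighted[r[group_key]]}
        for r in respondents
    ]


# Hypothetical sampled units (not CIS data).
units = [
    {"id": "A", "weight": 100, "responded": True, "cell": "x", "age_group": "15-64"},
    {"id": "B", "weight": 100, "responded": False, "cell": "x", "age_group": "15-64"},
    {"id": "C", "weight": 100, "responded": True, "cell": "y", "age_group": "65+"},
    {"id": "D", "weight": 50, "responded": True, "cell": "y", "age_group": "15-64"},
]
control_totals = {"15-64": 300, "65+": 120}  # hypothetical external totals

respondents = adjust_for_nonresponse(units)
calibrated = poststratify(respondents, control_totals, "age_group")
final_weights = {r["id"]: r["weight"] for r in calibrated}
```

After both steps, the weighted count in each age group equals the corresponding control total, which is the property the calibration step is meant to guarantee.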
In order to estimate sampling variance, the bootstrap approach is used. A set of 1,000 bootstrap weights is produced.
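The sketch below shows one common way bootstrap weights are used: the estimate is recomputed with each set of bootstrap weights, and the variance is taken as the average squared deviation of these replicate estimates from the full-sample estimate. The exact variance formula used for the CIS is not stated here, so this convention is an assumption, and the example uses three hypothetical replicate weight sets rather than 1,000.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean of `values` using the given survey weights."""
    return sum(w * y for y, w in zip(values, weights)) / sum(weights)


def bootstrap_standard_error(values, main_weights, bootstrap_weight_sets):
    """Standard error from replicate estimates around the full-sample estimate."""
    full_estimate = weighted_mean(values, main_weights)
    replicates = [weighted_mean(values, bw) for bw in bootstrap_weight_sets]
    variance = sum((r - full_estimate) ** 2 for r in replicates) / len(replicates)
    return math.sqrt(variance)


# Hypothetical data (not CIS data): three households, three replicate weight sets.
incomes = [40000, 60000, 80000]
main_weights = [100, 200, 100]
bootstrap_weight_sets = [
    [200, 200, 0],
    [0, 200, 200],
    [100, 200, 100],
]

se = bootstrap_standard_error(incomes, main_weights, bootstrap_weight_sets)
```

The same loop works for any estimator, such as a median or a low-income rate, by swapping out `weighted_mean`.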
Results from the survey are compared with other data sources that include administrative databases and other Statistics Canada surveys.
Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
Suppression rules, or data reliability cutoffs, are currently established based on a combination of the sample size that underlies the estimate and the coefficient of variation. In general, an estimate is published only if it is based on a sample of at least 25 observations and has a coefficient of variation less than or equal to 33.3%. These rules help protect the confidentiality of survey respondents and ensure the reliability of estimates.
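The general rule can be written as a simple check. This is a minimal sketch of the two thresholds stated above, with hypothetical function and constant names; it does not cover any additional confidentiality rules that may apply to a given table.

```python
MIN_SAMPLE_SIZE = 25     # minimum number of observations behind an estimate
MAX_CV_PERCENT = 33.3    # maximum coefficient of variation, in percent


def is_publishable(sample_size, cv_percent):
    """Return True if an estimate meets both general reliability cutoffs."""
    return sample_size >= MIN_SAMPLE_SIZE and cv_percent <= MAX_CV_PERCENT
```

For example, an estimate based on 24 observations fails the check regardless of its coefficient of variation, and an estimate with a coefficient of variation of 33.4% fails regardless of its sample size.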
Revisions and seasonal adjustment
This methodology does not apply to this survey.
There are two types of errors inherent in sample survey data, namely, non-sampling errors and sampling errors. The reliability of survey estimates depends on the combined impact of non-sampling and sampling errors.
Non-sampling errors resulting from human errors such as simple mistakes, misunderstanding or misinterpretation will generally have a minor impact on the overall accuracy of the estimates. On the other hand, errors occurring systematically and errors arising from sources such as coverage, erroneous response, non-response and processing can have a major impact on the reliability of estimates. Considerable time and effort are invested in reducing non-sampling errors in the CIS.
Coverage error arises when sampling frame units do not exactly represent the target population. Units may have been omitted from the sampling frame (undercoverage), or units not in the target population may have been included (overcoverage), or units may have been included more than once (duplicates). Undercoverage represents the most common coverage problem. Slippage is a measure of survey coverage error. It is defined as the percentage difference between control totals (postcensal population estimates) and weighted sample counts. In 2012, the CIS slippage rate was 9.4%.
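The slippage calculation can be sketched as follows. The text defines slippage as a percentage difference but does not state the denominator; this sketch assumes the difference is expressed relative to the control total, and the figures are hypothetical.

```python
def slippage_rate(control_total, weighted_sample_count):
    """Percentage by which the weighted sample count falls short of the control total.

    Assumption: the difference is expressed relative to the control total.
    A positive value indicates undercoverage, a negative value overcoverage.
    """
    if control_total <= 0:
        raise ValueError("control_total must be positive")
    return 100.0 * (control_total - weighted_sample_count) / control_total


# Hypothetical figures (not CIS data).
rate = slippage_rate(control_total=2000, weighted_sample_count=1800)
```

Under this convention, a weighted count of 1,800 against a control total of 2,000 gives a slippage rate of 10%.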
Non-response can also affect the quality of survey estimates. The lower the response rate, the greater the potential for non-response bias. Total non-response is dealt with by adjusting the weights of the respondents to account for the non-respondents. In 2012, the CIS collection response rate was 76.4%.
Sampling errors occur because inferences about the survey population are based on data from a sample of that population rather than the entire population. The sample design, the variability of the characteristic being measured, and the sample size will all contribute to the magnitude of the sampling error. The standard error is a common measure of sampling error. The standard error measures the degree of variation introduced in estimates by selecting one particular sample rather than another of the same size and design. Another widely used measure of the sampling error is the coefficient of variation (CV), which is the estimated standard error expressed as a percentage of the estimate. For example, the CV for the median after-tax income from the 2012 CIS was 0.5% for economic families and 1.3% for unattached individuals.
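The relationship between the standard error and the CV can be shown with a short calculation. The estimate and standard error below are hypothetical values chosen for illustration; they are not CIS results.

```python
def coefficient_of_variation(estimate, standard_error):
    """CV in percent: the standard error expressed as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("CV is undefined for an estimate of zero")
    return 100.0 * standard_error / abs(estimate)


# Hypothetical estimate and standard error (not CIS results).
cv = coefficient_of_variation(estimate=50000, standard_error=250)
```

In this example a standard error of 250 on an estimate of 50,000 corresponds to a CV of 0.5%. Because the CV is relative to the size of the estimate, it allows the precision of estimates on different scales to be compared.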