Canadian Income Survey (CIS)
Detailed information for 2013
The Canadian Income Survey (CIS) is a cross-sectional survey developed to provide a portrait of the income and income sources of Canadians, with their individual and household characteristics.
Data release - July 8, 2015
The primary objective of the Canadian Income Survey (CIS) is to provide information on the income and income sources of Canadians, along with their individual and household characteristics. The data collected in the CIS are combined with Labour Force Survey (LFS, record number 3701) and tax data.
The survey gathers information on labour market activity, school attendance, disability, support payments, child care expenses, inter-household transfers, personal income, and characteristics and costs of housing. This content is supplemented with information on individual and household characteristics (e.g. age, educational attainment, main job characteristics, family type), as well as geographic details (e.g. province, census metropolitan area (CMA)) from the LFS. Tax data for income and income sources are also combined with the survey data.
Results from the survey are made available not only to various levels of government, but also to individuals and organizations. All levels of government can use CIS data to shape policies and programs related to the economic well-being of Canadians. Statistical organizations such as the Organisation for Economic Co-operation and Development (OECD) use the results for international benchmarking and comparison studies.
Reference period: Calendar year
Collection period: January through April of the year following the reference year.
- Families, households and housing
- Household, family and personal income
- Income, pensions, spending and wealth
- Low income and inequality
Data sources and methodology
All individuals in Canada, excluding residents of the Yukon, the Northwest Territories and Nunavut, residents of institutions, persons living on reserves and other Aboriginal settlements in the provinces and members of the Canadian Forces living in military camps. Overall, these exclusions amount to less than 3 percent of the population.
The survey is conducted nationwide, in both the provinces and the territories. It covers all individuals in Canada, excluding persons living on reserves and other Indigenous settlements in the provinces, the institutionalized population, and households in extremely remote areas with very low population density. Overall, these exclusions amount to less than two percent of the population.
Qualitative testing was carried out by Statistics Canada's Questionnaire Design Resource Centre (QDRC) for selected modules of the survey questionnaire, while questions for the remaining modules came from other Statistics Canada surveys. Question wording adheres as closely as possible to questions established by the Harmonized Content Committee at Statistics Canada.
The questionnaire follows standard practices and wording used in a computer-assisted interviewing environment, such as the automatic control of flows that depend upon answers to earlier questions and the use of edits to check for logical inconsistencies and capture errors. The computer application for data collection was tested extensively.
This is a sample survey with a cross-sectional design.
The Canadian Income Survey is administered to a sub-sample of LFS respondents. The LFS sample is drawn from an area frame and is based on a stratified, multi-stage design that uses probability sampling. The LFS total sample is composed of six independent samples, called rotation groups, because each month one sixth of the sample (or one rotation group) is replaced.
The 2013 CIS used four rotation groups from the LFS, i.e. the rotation group answering the LFS for the last time in January, February, March and April of 2014 (sample size is approximately 8,400 per rotation group).
Data collection for this reference period: 2014-01-19 to 2014-05-05
Responding to this survey is voluntary.
Data are collected directly from survey respondents and extracted from administrative files.
The LFS interview is completed by a responsible member of the household who generally provides the LFS responses for all members. Following the LFS interview, and subject to operational constraints, the interviewer then requests that this same member answer the CIS questionnaire. This person generally responds to the CIS questions for all household members aged 16 years or older, including the disability screening questions, which are only asked of one randomly selected household member aged 16 years or older.
Interviews are conducted from Statistics Canada's regional offices using a Computer Assisted Telephone Interviewing (CATI) application.
In order to reduce response burden and improve the accuracy of the data, CIS does not ask respondents questions on every aspect of their income. Rather, CIS retrieves this information from tax records. CIS respondents are informed of these plans during the interview, a practice which is called informed replacement.
A series of verifications is undertaken to ensure that the records are consistent and that collection and capture of the data do not introduce errors. Reported data are examined for completeness and consistency using automated edits coupled with manual review. Some responses reporting uncommon values or characteristics are processed manually.
Households are kept as respondents if information for at least one person in the household was provided, and any key data that is missing for individuals within responding households is imputed. Imputation is carried out for income variables as well as variables related to labour, school attendance, housing and utility costs.
CIS uses a nearest neighbour approach for the imputation of most income variables, and for labour, school attendance and housing variables. This imputation method involves the selection of a donor record based on matching variables. First, a set of matching variables, each of which is correlated with the variables to be imputed, is defined. Then, through the combined use of a score function (for categorical matching variables) and a distance function (for numeric matching variables), the most similar consistent donor record is identified and used to impute data for the record.
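The donor selection described above can be illustrated with a minimal sketch. It is not the production procedure: the record layout, the matching variables (province, family_type, age), the simple sum of mismatch score and distance, and the pre-scaling of age are all assumptions made for illustration, and the consistency checks applied to donors in practice are omitted.

```python
from math import inf

def find_donor(recipient, donors, categorical_keys, numeric_keys):
    """Return the donor record most similar to the recipient.

    Categorical matching variables contribute a score of 1 per mismatch;
    numeric matching variables contribute their absolute difference.
    """
    best, best_cost = None, inf
    for donor in donors:
        score = sum(recipient[k] != donor[k] for k in categorical_keys)
        distance = sum(abs(recipient[k] - donor[k]) for k in numeric_keys)
        cost = score + distance
        if cost < best_cost:
            best, best_cost = donor, cost
    return best

# Hypothetical records; age is expressed in decades so that the score
# and distance terms are on a comparable scale.
donors = [
    {"province": "ON", "family_type": "couple", "age": 4.1, "wages": 52000},
    {"province": "ON", "family_type": "single", "age": 3.9, "wages": 38000},
    {"province": "QC", "family_type": "couple", "age": 4.0, "wages": 47000},
]
recipient = {"province": "ON", "family_type": "couple", "age": 4.0, "wages": None}

donor = find_donor(recipient, donors, ["province", "family_type"], ["age"])
imputed = dict(recipient, wages=donor["wages"])  # copy the donor's value
```

In this example the first donor agrees on both categorical variables and is closest in age, so its wages value is the one copied to the recipient.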
Deterministic imputation is also used for selected income variables. Amounts for certain government programs, such as refundable provincial tax credits, child benefits, and the Goods and Services/Harmonized Sales Tax Credit, are derived based on qualifying characteristics.
Cold-deck imputation using donor information from the 2011 National Household Survey (NHS) is used to impute utility costs for all CIS households. Imputation classes are formed to identify groups of NHS households sharing characteristics with the CIS household to be imputed. Data from a randomly selected (with replacement) NHS household is used to impute data for the CIS household.
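As an illustration only, the cold-deck step can be sketched as follows. The class variables (province, dwelling), the field name utilities and the dollar amounts are hypothetical; the actual imputation classes are not specified here.

```python
import random
from collections import defaultdict

def cold_deck_impute(recipients, donors, class_keys, target, seed=0):
    """Impute `target` for each recipient from a random donor in its class."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible

    # Group donor values by imputation class.
    pools = defaultdict(list)
    for donor in donors:
        pools[tuple(donor[k] for k in class_keys)].append(donor[target])

    imputed = []
    for rec in recipients:
        pool = pools.get(tuple(rec[k] for k in class_keys))
        if not pool:
            raise ValueError("no donor available in this imputation class")
        # Each draw is independent, so a donor can be reused (with replacement).
        imputed.append(dict(rec, **{target: rng.choice(pool)}))
    return imputed

# Hypothetical external (NHS-like) donors and CIS-like recipient households.
nhs_donors = [
    {"province": "ON", "dwelling": "house", "utilities": 2400},
    {"province": "ON", "dwelling": "house", "utilities": 2700},
    {"province": "ON", "dwelling": "apartment", "utilities": 900},
]
cis_households = [
    {"province": "ON", "dwelling": "house"},
    {"province": "ON", "dwelling": "apartment"},
]
result = cold_deck_impute(cis_households, nhs_donors,
                          ["province", "dwelling"], "utilities")
```

The donors come from a different, earlier source than the recipients, which is what distinguishes cold-deck from hot-deck imputation.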
The CIS sample is a sub-sample of the Labour Force Survey sample. LFS uses a complex random sampling plan to select households. Each household in the sample represents a number of other households in the population. Estimates for a given characteristic are obtained by multiplying the survey weight of each sampled unit by the corresponding value of this characteristic and summing the results over all sampled units. The key step in the point estimation process is therefore the derivation of the weight.
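To make the estimation step concrete, the sketch below computes a weighted total and a weighted mean. The incomes and weights are hypothetical values chosen for illustration.

```python
def weighted_total(values, weights):
    """Estimate a population total: sum of weight times value."""
    return sum(w * y for y, w in zip(values, weights))

def weighted_mean(values, weights):
    """Estimate a population mean: weighted total over the sum of weights."""
    return weighted_total(values, weights) / sum(weights)

# Hypothetical sample of three units.
incomes = [40000, 60000, 80000]
weights = [100, 200, 100]  # each unit represents this many population units

total = weighted_total(incomes, weights)
mean = weighted_mean(incomes, weights)
```

Here the three sampled units stand for 400 units in the population, and the second unit counts twice as much as either of the others.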
The initial weights are the LFS subweights, which are then adjusted to account for the fact that the CIS is a sub-sample of the LFS sample.
Two types of adjustment are then applied to these weights in order to improve the reliability of the estimates. The weights are first inflated to compensate for CIS non-response. Then, the non-response adjusted weights are further adjusted to ensure that estimates on relevant population characteristics respect population totals from sources other than the survey.
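The two adjustments can be sketched as below, under simplifying assumptions. The field names (nr_class, group), the adjustment classes and the control totals are hypothetical, and simple post-stratification to a single set of groups stands in for the calibration to several sets of totals used in practice.

```python
from collections import defaultdict

def nonresponse_adjust(units):
    """Inflate respondent weights within each class so that respondents
    also carry the weight of the non-respondents in that class."""
    all_w, resp_w = defaultdict(float), defaultdict(float)
    for u in units:
        all_w[u["nr_class"]] += u["weight"]
        if u["responded"]:
            resp_w[u["nr_class"]] += u["weight"]
    return [
        dict(u, weight=u["weight"] * all_w[u["nr_class"]] / resp_w[u["nr_class"]])
        for u in units if u["responded"]
    ]

def poststratify(units, control_totals):
    """Scale weights so that each group sums to its external control total."""
    sums = defaultdict(float)
    for u in units:
        sums[u["group"]] += u["weight"]
    return [
        dict(u, weight=u["weight"] * control_totals[u["group"]] / sums[u["group"]])
        for u in units
    ]

# Hypothetical sample: one non-respondent in class 1.
sample = [
    {"id": "A", "nr_class": 1, "group": "15-64", "weight": 100.0, "responded": True},
    {"id": "B", "nr_class": 1, "group": "15-64", "weight": 100.0, "responded": False},
    {"id": "C", "nr_class": 2, "group": "65+", "weight": 50.0, "responded": True},
    {"id": "D", "nr_class": 2, "group": "65+", "weight": 50.0, "responded": True},
]
adjusted = nonresponse_adjust(sample)
calibrated = poststratify(adjusted, {"15-64": 220.0, "65+": 90.0})
final_weights = {u["id"]: u["weight"] for u in calibrated}
```

Unit A first absorbs the weight of non-respondent B (100 to 200) and is then scaled to the control total of 220; units C and D are each scaled from 50 to 45.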
The first set of population totals used by the CIS are estimates provided by Statistics Canada's Demography Division of population counts based on the 2006 Census of Population. For each province, population counts for different age/sex groups, household size and economic family size are used. CIS also employs population counts for six Census Metropolitan Areas (Montreal, Toronto, Winnipeg, Calgary, Edmonton, and Vancouver).
The second set of totals is derived from the T4 file from the Canada Revenue Agency (CRA) and is intended to ensure that the weighted distribution of income (based on wages and salaries) in the dataset matches that of the Canadian population.
In order to estimate sampling variance, the bootstrap approach is used. A set of 1,000 bootstrap weights is produced.
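A minimal sketch of how bootstrap weights yield a variance estimate follows. It assumes the common convention of averaging the squared deviations of the replicate estimates from the full-sample estimate; the data are hypothetical and only four replicates are shown rather than 1,000.

```python
from math import sqrt

def weighted_mean(values, weights):
    return sum(w * y for y, w in zip(values, weights)) / sum(weights)

def bootstrap_variance(values, main_weights, bootstrap_weights):
    """Variance of a weighted mean from a set of bootstrap weight vectors."""
    theta = weighted_mean(values, main_weights)
    # Recompute the same estimate once per bootstrap weight vector.
    replicates = [weighted_mean(values, bw) for bw in bootstrap_weights]
    return sum((t - theta) ** 2 for t in replicates) / len(replicates)

# Hypothetical two-unit sample with four bootstrap weight vectors.
values = [10.0, 30.0]
main_weights = [1.0, 1.0]
bootstrap_weights = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0], [1.0, 1.0]]

variance = bootstrap_variance(values, main_weights, bootstrap_weights)
standard_error = sqrt(variance)
```

The same pattern applies to any estimator: the statistic is recomputed with each bootstrap weight vector and the spread of the results measures the sampling variance.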
A separate set of weights is created specifically for estimating disability. The initial weights are the CIS non-response adjusted weights. These weights are then inflated to account for the fact that only one person in the household among those aged 16 years or older is selected for the disability questions. They are further increased to compensate for non-response to these questions. To ensure that estimates of population characteristics respect population totals, weights are adjusted to match age/sex group counts within each province.
A set of 1,000 bootstrap weights is also produced in order to estimate sampling variance related to disability.
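The selection adjustment for the disability weights can be illustrated as follows. The sketch assumes that each eligible household member has the same chance of being selected, so the weight is multiplied by the number of eligible members; the function name and the example household are hypothetical.

```python
def selection_adjusted_weight(household_weight, ages, min_age=16):
    """Inflate a weight to reflect selecting one eligible person at random."""
    eligible = sum(1 for age in ages if age >= min_age)
    if eligible == 0:
        raise ValueError("household has no eligible member")
    # Selection probability is 1 / eligible, so the weight is multiplied
    # by its inverse.
    return household_weight * eligible

# Hypothetical household: three members aged 16 or older and one child.
person_weight = selection_adjusted_weight(150.0, [45, 43, 17, 12])
```

The non-response and calibration adjustments described above would then be applied to this inflated weight.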
Results from the survey are compared with other data sources that include administrative databases and other Statistics Canada surveys.
Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
Suppression rules, or data reliability cutoffs, are currently established based on a combination of the sample size that underlies the estimate and the coefficient of variation. In general, an estimate is published only if it is based on a sample of at least 25 observations and has a coefficient of variation of 33.3% or less. These rules help protect the confidentiality of survey respondents and ensure the reliability of estimates.
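The general rule can be expressed as a simple check. The function name and the example values are hypothetical, and the sketch covers only the two general thresholds stated above, not any additional confidentiality rules.

```python
def is_publishable(sample_size, cv_percent, min_n=25, max_cv=33.3):
    """Return True if an estimate meets both general reliability cutoffs."""
    return sample_size >= min_n and cv_percent <= max_cv

# Hypothetical estimates.
ok = is_publishable(sample_size=120, cv_percent=8.4)
too_few = is_publishable(sample_size=18, cv_percent=8.4)
too_variable = is_publishable(sample_size=120, cv_percent=41.0)
```

An estimate failing either condition would be suppressed.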
Revisions and seasonal adjustment
This methodology does not apply to this survey.
There are two types of errors inherent in sample survey data, namely, non-sampling errors and sampling errors. The reliability of survey estimates depends on the combined impact of non-sampling and sampling errors.
Non-sampling errors resulting from human errors such as simple mistakes, misunderstanding or misinterpretation will generally have a minor impact on the overall accuracy of the estimates. Errors occurring systematically and errors arising from sources such as coverage, erroneous response, non-response and processing can have, on the other hand, a major impact on the reliability of estimates. Considerable time and effort are invested in reducing non-sampling errors in CIS.
Coverage error arises when sampling frame units do not exactly represent the target population. Units may have been omitted from the sampling frame (undercoverage), or units not in the target population may have been included (overcoverage), or units may have been included more than once (duplicates). Undercoverage represents the most common coverage problem. Slippage is a measure of survey coverage error. It is defined as the percentage difference between control totals (postcensal population estimates) and weighted sample counts. In 2013, the CIS slippage rate was 10.7%.
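The slippage calculation can be sketched as below. The sketch assumes the difference is expressed relative to the control total, and the counts are hypothetical values chosen so that the result matches the 10.7% rate reported for 2013.

```python
def slippage_rate(control_total, weighted_count):
    """Percentage shortfall of the weighted sample count from the control total."""
    return (control_total - weighted_count) / control_total * 100

# Hypothetical counts, in thousands of persons.
rate = slippage_rate(control_total=1000.0, weighted_count=893.0)
```

A positive rate indicates undercoverage: the weighted sample, before calibration, represents fewer people than the postcensal estimates.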
Non-response can also impact the quality of survey estimates. The lower the response rate, the greater the potential for non-response bias. Total non-response is dealt with by adjusting the weights of the respondents to account for the non-respondents. In 2013, the CIS collection response rate was 73.4%.
Sampling errors occur because inferences about the survey population are based on data from a sample of that population rather than the entire population. The sample design, the variability of the characteristic being measured, and the sample size will all contribute to the magnitude of the sampling error. The standard error is a common measure of sampling error. The standard error measures the degree of variation introduced in estimates by selecting one particular sample rather than another of the same size and design. Another widely used measure of the sampling error is the coefficient of variation (CV), which is the estimated standard error expressed as a percentage of the estimate. For example, the CV for the median after-tax income from the 2013 CIS was 0.6% for economic families and 1.5% for persons not in an economic family.
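The relationship between the standard error and the CV can be shown in a few lines. The estimate and standard error below are hypothetical values chosen to give a CV of 0.6%; they are not published CIS figures.

```python
def coefficient_of_variation(estimate, standard_error):
    """Standard error expressed as a percentage of the estimate."""
    return standard_error / estimate * 100

# Hypothetical median income estimate and its standard error, in dollars.
cv = coefficient_of_variation(estimate=50000.0, standard_error=300.0)
```

Because the CV is relative to the size of the estimate, it allows the precision of estimates of different magnitudes to be compared directly.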