Canadian Community Health Survey - Annual Component (CCHS)
Detailed information for 2009
The central objective of the Canadian Community Health Survey (CCHS) is to gather health-related data at the sub-provincial levels of geography (health region or combined health regions).
Data release - June 15, 2010 (first in a series of data releases)
In 1991, the National Task Force on Health Information cited a number of issues and problems with the health information system. To respond to these issues, the Canadian Institute for Health Information (CIHI), Statistics Canada and Health Canada joined forces to create a Health Information Roadmap. From this mandate, the Canadian Community Health Survey (CCHS) was conceived.
The CCHS is a cross-sectional survey that collects information related to health status, health care utilization and health determinants for the Canadian population. It relies upon a large sample of respondents and is designed to provide reliable estimates at the health region level. The CCHS has the following objectives:
- Support health surveillance programs by providing health data at the national, provincial and intra-provincial levels;
- Provide a single data source for health research on small populations and rare characteristics;
- Release information in a timely manner and make it easily accessible to a diverse community of users;
- Create a flexible survey instrument that includes a rapid response option to address emerging issues related to the health of the population.
Prior to 2007, data were collected every two years over an annual period. Data are available for the 2001, 2003 and 2005 periods. In 2007, major changes were made to the survey design with the goal of improving its effectiveness and flexibility through ongoing data collection. Data are now collected every year.
The CCHS produces an annual microdata file and a file combining two years of data. Users can also combine CCHS collection years to examine small populations or rare characteristics.
The primary use of the CCHS data is for health surveillance and population health research. Federal and provincial departments of health and human resources, social service agencies, and other types of government agencies use the information collected from respondents to monitor, plan, implement and evaluate programs to improve the health of Canadians. Researchers from various fields use the information to conduct research to improve health. Non-profit health organizations and the media use the CCHS results to raise awareness about health, an issue of concern to all Canadians.
Reference period: Varies according to the question (for example: "over the last 12 months", "over the last 6 months", "during the last week", etc.)
Collection period: January to December
- Diseases and health conditions
- Lifestyle and social conditions
- Prevention and detection of disease
Data sources and methodology
The CCHS covers the population 12 years of age and over living in the ten provinces and the three territories. Excluded from the survey's coverage are: persons living on reserves and other Aboriginal settlements in the provinces; full-time members of the Canadian Forces; the institutionalized population; and persons living in the Quebec health regions of Région du Nunavik and Région des Terres-Cries-de-la-Baie-James. In Nunavut, the coverage is limited to the ten largest communities*, which represent about 70% of the Nunavut population. Altogether, these exclusions represent less than 3% of the target population.
*The 10 largest communities in Nunavut are: Iqaluit, Rankin Inlet, Cambridge Bay, Kugluktuk, Cape Dorset, Pangnirtung, Igloolik, Pond Inlet, Baker Lake and Arviat.
Each component of the CCHS questionnaire is developed in collaboration with specialists from Statistics Canada, other federal and provincial departments and/or academic fields. The CCHS questions are designed for computer-assisted interviewing (CAI), meaning that, as the questions were developed, the associated logical flow into and out of the questions was programmed. This includes specifying the type of answer required, the minimum and maximum values, on-line edits associated with the question and what to do in case of item non-response.
The CCHS has three content components: the common content, the optional content and the rapid response content. The common content is collected from all survey respondents. Some modules are collected every year and remain relatively unchanged over several years. Other common modules are collected for one or two years and rotate every two or four years. The optional content fulfils the need for data at the health region level. This content, while often harmonized across the province, is unique to each region or province and may vary from year to year. The rapid response component is offered to organizations interested in national estimates on an emerging or specific issue related to the population's health. The rapid response content may be included in the survey in each collection period, that is, in every two-month period. The data are released shortly after the collection period via a data availability announcement in The Daily.
New modules and revisions to existing CCHS content are tested using different methods. Qualitative tests using individual cognitive interviews or, more rarely, focus groups are used to ensure that questions and concepts are appropriately worded. Field testing can also be conducted to test new modules or significant revisions of the collection instrument. This kind of test was conducted before CCHS began. The test involved Statistics Canada's Regional Offices. The main objectives of the test were to observe respondent reaction to the survey, to obtain estimates of time for the various sections, to study the response rates, and to test feedback questions. Field operations and procedures, interviewer training and the data collection computer application were also tested.
In addition to the field test, the computer application for data collection is extensively tested in-house each time changes are made. The objective of these tests is to identify any errors in the program flow and text before the start of the main survey.
This is a sample survey with a cross-sectional design.
To provide reliable estimates for the 121 health regions (HRs), a sample of 65,000 respondents is required on an annual basis. A multi-stage sample allocation strategy gives relatively equal importance to the HRs and the provinces. In the first step, the sample is allocated among the provinces according to the size of their respective populations and the number of HRs they contain. Each province's sample is then allocated among its HRs proportionally to the square root of the population in each HR.
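The second allocation step can be illustrated with a minimal sketch. The function name, the health region labels, the populations and the provincial sample size below are hypothetical, and the rule used for the first (provincial) step is not reproduced because the text does not give its formula.

```python
import math

def allocate_by_sqrt(province_sample, hr_populations):
    """Split a province's sample among its HRs in proportion to sqrt(population)."""
    roots = {hr: math.sqrt(pop) for hr, pop in hr_populations.items()}
    total_root = sum(roots.values())
    return {hr: province_sample * root / total_root for hr, root in roots.items()}

# Hypothetical province with three health regions and a sample of 1,600.
populations = {"HR-A": 10_000, "HR-B": 40_000, "HR-C": 250_000}
allocation = allocate_by_sqrt(1_600, populations)

# For comparison: a strictly proportional allocation for the smallest region.
proportional_a = 1_600 * populations["HR-A"] / sum(populations.values())

print(allocation)      # sample sizes under square-root allocation
print(proportional_a)  # sample size HR-A would get under proportional allocation
```

In this sketch the smallest region receives a much larger share than it would under an allocation strictly proportional to population, which is how a square-root rule supports estimates for small HRs.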
The CCHS uses three sampling frames to select the sample of households: 49% of the sampled households come from an area frame, 50% come from a list frame of telephone numbers and the remaining 1% come from a Random Digit Dialling (RDD) telephone number frame. For most of the health regions, 50% of the sample is selected from the area frame and 50% from the list frame of telephone numbers. In two health regions (Nord-du-Québec and Prairie North), only the RDD frame is used. In Nunavut, only the area frame is used. In the Yukon and Northwest Territories, most of the sample comes from the area frame but a small RDD sample is also selected in the territorial capitals.
The CCHS uses the area frame designed for the Labour Force Survey (LFS) as its area frame. Thus, the sampling plan of the LFS must be considered in selecting the CCHS dwelling sample. The LFS plan is a complex two stage stratified design in which each stratum is formed of clusters. The LFS first selects clusters using a sampling method with a probability proportional to size (PPS), and then the final sample is chosen using a systematic sampling of dwellings in the cluster. The CCHS uses the LFS clusters, which it then stratifies by HRs. Lastly, it selects a sample of clusters and dwellings in each HR.
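The two-stage idea (clusters drawn with probability proportional to size, then a systematic sample of dwellings within a selected cluster) can be sketched as follows. This is an illustration only: the text does not say which PPS variant the LFS uses, so systematic PPS is an assumption here, and the cluster sizes, sample sizes and function names are hypothetical.

```python
import random

def pps_systematic(sizes, n, rng):
    """Select n cluster indices with probability proportional to size.

    Uses systematic PPS (an assumed variant): n equally spaced points are
    laid over the cumulative sizes, and each point selects the cluster it
    falls in. A cluster larger than the step could be selected twice.
    """
    total = sum(sizes)
    step = total / n
    start = rng.random() * step          # random start in [0, step)
    selected, cumulative, k = [], 0, 0
    for index, size in enumerate(sizes):
        cumulative += size
        while k < n and start + k * step < cumulative:
            selected.append(index)
            k += 1
    return selected

def systematic_sample(units, n, rng):
    """Select n units at a fixed interval after a random start (n <= len(units))."""
    step = len(units) / n
    start = rng.random() * step
    return [units[int(start + k * step)] for k in range(n)]

rng = random.Random(2009)
cluster_sizes = [120, 80, 200, 150, 50]   # hypothetical dwelling counts per cluster
chosen = pps_systematic(cluster_sizes, 2, rng)

# Systematic sample of 10 dwellings in the first selected cluster.
dwellings = list(range(cluster_sizes[chosen[0]]))
sample = systematic_sample(dwellings, 10, rng)
print(chosen, sample)
```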
The list frame of telephone numbers is used in all but four HRs to complement the area frame. The list frame is an external administrative frame of telephone numbers updated every six months. It is stratified by HR by means of a postal code conversion file in order to match the HRs to the telephone numbers. Telephone numbers are selected using a random sampling process in each HR.
Lastly, in four HRs, a Random Digit Dialling (RDD) sampling frame of telephone numbers is used in accordance with the working banks technique, whereby only 100-number banks with at least one valid residential telephone number are retained. The banks are grouped in RDD strata to encompass, as closely as possible, the HR areas. Within each stratum, a 100-number bank is randomly chosen and a number between 00 and 99 is generated at random to create a complete, ten-digit telephone number. This procedure is repeated until the required sample size is reached.
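The number-generation step within one RDD stratum can be sketched as below. The bank prefixes are fictitious, the function name is hypothetical, and discarding duplicate numbers is an assumption; the text only says the procedure is repeated until the required sample size is reached.

```python
import random

def draw_rdd_sample(working_banks, sample_size, rng):
    """Draw telephone numbers from the working banks of one RDD stratum.

    Each bank is an 8-digit prefix (area code, exchange and the first two
    digits of the line number). Two random digits (00 to 99) complete the
    ten-digit number. Duplicates are discarded (an assumption).
    """
    if sample_size > 100 * len(set(working_banks)):
        raise ValueError("sample size exceeds the numbers available in the banks")
    numbers = set()
    while len(numbers) < sample_size:
        bank = rng.choice(working_banks)   # pick a 100-number bank at random
        last_two = rng.randrange(100)      # pick a number between 00 and 99
        numbers.add(f"{bank}{last_two:02d}")
    return sorted(numbers)

# Fictitious working banks for one stratum.
banks = ["61355501", "61355502", "61355517"]
rdd_sample = draw_rdd_sample(banks, 5, random.Random(1))
print(rdd_sample)
```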
The size of the sample is enlarged during the selection process to account for non-response and units outside the coverage (for example, vacant dwellings, institutions, telephone numbers not in use, etc.).
Once the dwelling or telephone number sample has been chosen, the next step is to select a member in each household. This decision is made at the time of contact for data collection. All members of the household are listed and a person aged 12 years or over is automatically selected using various selection probabilities based on age and household composition.
Responding to this survey is voluntary.
Data are collected directly from survey respondents.
The CCHS questionnaire is administered using computer-assisted interviewing (CAI). Sample units selected from the area frame are interviewed using the Computer-Assisted Personal Interviewing (CAPI) method while units selected from the Random Digit Dialling (RDD) and telephone list frames are interviewed using the Computer-Assisted Telephone Interviewing (CATI) method.
CAI offers a number of data quality advantages over other collection methods. First, question text, including reference periods and pronouns, is customised automatically based on factors such as the age and sex of the respondent, the date of the interview and answers to previous questions. Second, edits to check for inconsistent answers or out-of-range responses are applied automatically and on-screen prompts are shown when an invalid entry is recorded. Immediate feedback is given to the respondent and the interviewer is able to correct any inconsistencies. Third, questions that are not applicable to the respondent are skipped automatically.
CAPI interviewers work independently from their homes using laptop computers and are supervised from a distance by senior interviewers. Completed interviews are transmitted daily to Statistics Canada's head office using a secure telephone transmission directly from the interviewer's home. CATI interviewers work in centralised offices and are supervised by a senior interviewer located in the same office. Transmission of cases from each of the four CATI offices to head office is the responsibility of the regional office project supervisor, senior interviewer and the technical support team.
An automated call scheduler, that is, a central system that optimises the timing of call-backs and the scheduling of appointments, is used to support CATI collection.
View the Questionnaire(s) and reporting guide(s).
Some editing of the data is performed at the time of the interview by the computer-assisted interviewing (CAI) application. It is not possible for interviewers to enter out-of-range values and flow errors are controlled through programmed skip patterns. For example, CAI ensures that questions that do not apply to a respondent are not asked. In response to some types of inconsistent or unusual reporting, warning messages are invoked but no corrective action is taken at the time of the interview.
Several edits are performed at Head Office during the data processing step. A critical error edit is done that rejects respondent entries (for instance, excluded populations). Flow errors are also adjusted during processing and a data inconsistency detection and correction program is applied. Where appropriate, edits are instead developed to be performed at Head Office after data collection. Inconsistencies are usually corrected by setting one or both of the variables in question to "not stated". Response frequency obtained during the current period and previous reference periods is also compared to identify errors prior to release.
Finally, health indicators originating from the CCHS common content go through a validation process after the final microdata files are produced. Estimates for all geography levels by sex and by age group are compared to estimates from previous years. This process confirms that estimates of key indicators are acceptable.
No imputation is done for this survey.
The principle behind estimation in a probability sample is that each person in the sample "represents", besides himself or herself, several other persons not in the sample. For example, in a simple random 2% sample of the population, each person in the sample represents 50 persons in the population. In the terminology used here, it can be said that each person has a weight of 50. The weighting phase is a step that calculates, for each person, his or her associated sampling weight. This weight must be used to derive meaningful estimates from the survey. For example, if the number of individuals who had a major depressive episode is to be estimated, the weights of survey respondents having that characteristic should be summed. In order for estimates produced from survey data to be representative of the covered population and not just the sample itself, a user must incorporate the survey weights into their calculations.
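The use of weights described above can be shown with a minimal sketch. The records, the field names and the weights are hypothetical; they mimic the 2% simple random sample example, where every respondent carries a weight of 50.

```python
def weighted_total(records, has_characteristic):
    """Estimate a population count by summing the weights of matching respondents."""
    return sum(r["weight"] for r in records if has_characteristic(r))

def weighted_proportion(records, has_characteristic):
    """Estimate a population proportion as a ratio of weighted totals."""
    return weighted_total(records, has_characteristic) / sum(r["weight"] for r in records)

# Hypothetical respondents from a simple random 2% sample (weight = 50 each).
respondents = [
    {"id": 1, "depressive_episode": True,  "weight": 50},
    {"id": 2, "depressive_episode": False, "weight": 50},
    {"id": 3, "depressive_episode": True,  "weight": 50},
    {"id": 4, "depressive_episode": False, "weight": 50},
    {"id": 5, "depressive_episode": True,  "weight": 50},
]

estimated_count = weighted_total(respondents, lambda r: r["depressive_episode"])
estimated_share = weighted_proportion(respondents, lambda r: r["depressive_episode"])
print(estimated_count, estimated_share)
```

With equal weights the weighted proportion equals the unweighted sample proportion; in the actual survey the weights differ between respondents, so the two would generally not agree.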
In order to determine the quality of an estimate, the variance must be calculated. Because the CCHS uses a multi-stage survey design, there is no simple formula that can be used to calculate variance estimates. Therefore, an approximate method is needed. The coefficient of variation, standard deviation and confidence intervals can then be calculated from the variance. The bootstrap re-sampling method used in the CCHS involves the selection of simple random samples known as replicates, and the calculation of the variation between the estimates from replicate to replicate. In each stratum, a simple random sample of (n-1) of the n clusters is selected with replacement to form a replicate. Note that since the selection is with replacement, a cluster may be chosen more than once. In each replicate, the survey weight for each record in the (n-1) selected clusters is recalculated. These weights are then post-stratified according to demographic information in the same way as the sampling design weights in order to obtain the final bootstrap weights. The entire process (selecting simple random samples, recalculating and post-stratifying weights for each stratum) is repeated B times, where B is large. The CCHS typically uses B=500, to produce 500 bootstrap weights. To obtain the bootstrap variance estimator, the point estimate for each of the B samples must be calculated. The variance of these estimates is the bootstrap variance estimator. Statistics Canada has developed a program that can perform all of these calculations for the user: the Bootvar program.
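Once the bootstrap weights exist, the final calculation can be sketched as below. This is not the Bootvar program: the data are hypothetical, only B=4 replicates are shown instead of 500, and the variance is taken around the mean of the replicate estimates with divisor B, which is an assumption about the exact formula.

```python
import math
import statistics

def weighted_total(values, weights):
    """Point estimate of a total: sum of value times weight."""
    return sum(v * w for v, w in zip(values, weights))

def bootstrap_variance(values, bootstrap_weights):
    """Variance of the point estimate across the B sets of bootstrap weights."""
    replicate_estimates = [weighted_total(values, w) for w in bootstrap_weights]
    return statistics.pvariance(replicate_estimates)

# Hypothetical data: 1 = has the characteristic, 0 = does not.
values = [1, 0, 1, 0]
final_weights = [50, 50, 50, 50]
bootstrap_weights = [          # B = 4 replicates for illustration
    [60, 40, 50, 50],
    [40, 60, 50, 50],
    [50, 50, 60, 40],
    [50, 50, 40, 60],
]

estimate = weighted_total(values, final_weights)
variance = bootstrap_variance(values, bootstrap_weights)
std_dev = math.sqrt(variance)
cv_percent = 100 * std_dev / estimate
# Approximate 95% confidence interval, assuming normality of the estimator.
ci_95 = (estimate - 1.96 * std_dev, estimate + 1.96 * std_dev)
print(estimate, variance, std_dev, cv_percent, ci_95)
```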
To ensure the survey meets its objectives (see "Survey Description"), a Steering Committee and an Advisory Board comprised of authorities from the provincial and territorial Ministries of Health, Health Canada and the Public Health Agency of Canada determined the concepts and focus. Expert Groups were convened to advise on the measures to obtain the results envisioned by the Steering Committee and Advisory Board, and to recommend proven collection vehicles and indices. The resulting data are recognized as valid measures of contemporary concepts such as: depression, activity limitation, weight problems and chronic pain.
Throughout the collection process, control and monitoring measures were put in place and corrective action was taken to minimize non-sampling errors. These measures included response rate evaluation, reported and non-reported data evaluation, on site observation of interviews, improved collection tools for interviewers and others.
Once processing steps are completed, three data validation steps are undertaken. First, a validation program is run to compare estimates for the health indicators taken from the common content with those of the previous year. This validation is performed at various geographical levels, as well as by age and sex. Significant differences are examined further to find any anomalies in the data. Second, the work of analysts who use the CCHS data to publish analytical articles on specific themes allows for an in-depth look at many variables of the survey and is a very effective way to find errors.
Last, an external validation step is also part of the validation process. Share files are sent before release to provincial and federal partners for a two-week examination period. They can then scrutinize the data and inform Statistics Canada of any concerns or anomalies related to data quality.
Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
Disclosure control methods are applied to Public Use Microdata Files (PUMFs). Although the survey now produces annual data files, PUMFs will still be produced with two years of data at a time (for example, 2007-2008, 2009-2010, etc.). The PUMFs differ in a number of important aspects from the survey 'master' files held by Statistics Canada. These differences are the result of actions taken to protect the anonymity of individual survey respondents. First, sensitive variables are grouped, capped or completely removed from the files. Second, some health regions are collapsed with other regions on the PUMF because of their small population sizes and the associated risk of disclosure. Last, some values of indirect identifiers (mostly socio-demographic variables) are suppressed.
Users requiring access to information excluded from the microdata files may purchase custom tabulations, or access the master files through the Research Data Centres program or the Remote Access program. Outputs are vetted for confidentiality before being given to users.
Estimates in the main body of a statistical table are rounded to the nearest hundred units using the normal rounding technique: if the first or only digit dropped is zero to four, the last digit retained is not changed; if the first or only digit dropped is five to nine, the last digit retained is raised by one. Marginal sub-totals and totals in statistical tables are derived from their corresponding unrounded components and are then rounded to the nearest 100 units using normal rounding. Averages, proportions, rates and percentages are computed from unrounded components (for example, numerators and/or denominators) and are then rounded to one decimal using normal rounding. Sums and differences of aggregates (or ratios) are derived from their corresponding unrounded components and are then rounded to the nearest 100 units (or to one decimal) using normal rounding. Under no circumstances are unrounded estimates, published or otherwise, released, because unrounded estimates imply greater precision than actually exists.
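The rounding rules can be sketched as follows. The function names and figures are hypothetical. Python's built-in round() rounds halves to the nearest even digit, so the sketch uses the decimal module with ROUND_HALF_UP to reproduce the "five to nine is raised by one" rule.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to_hundred(value):
    """Round a count to the nearest 100 units using normal (half-up) rounding."""
    hundreds = (Decimal(str(value)) / 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(hundreds * 100)

def round_to_one_decimal(value):
    """Round a rate or percentage to one decimal using normal (half-up) rounding."""
    return Decimal(str(value)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

# Hypothetical unrounded components of a table.
components = [1249, 1249]

rounded_cells = [round_to_hundred(c) for c in components]
# The total is derived from the unrounded components, then rounded itself.
rounded_total = round_to_hundred(sum(components))

print(rounded_cells, rounded_total)
print(round_to_hundred(1250), round_to_one_decimal(12.25))
```

Because totals are rounded from unrounded components, a published total may differ from the sum of the published (rounded) cells, as it does for the hypothetical figures here.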
Revisions and seasonal adjustment
This methodology does not apply to this survey.
For details on data accuracy measures and response rates, please refer to the Mode Study and User Guide documents under the Documentation section.
- CCHS Content Overview, 2009-2010
- Interpreting Estimates from the Redesigned CCHS
- Mode Study
- 2009 Master File - 12 months
- Health Surveys - Aspects that may explain differences in the estimates obtained from two different survey occasions
- Canadian Community Health Survey - Errata (last updated June 2010)