Canadian Community Health Survey (CCHS)
The central objective of the Canadian Community Health Survey (CCHS) is to gather health-related data at the sub-provincial levels of geography (health region or combined health regions).
Detailed information for 2005 (Cycle 3.1)
Data release - December 21, 2005 (this data release covers data collected from January to June 2005); June 13, 2006 (this data release covers all data collected from January to December 2005).
In 1991, the National Task Force on Health Information cited a number of issues and problems with the health information system. These problems were that: data was fragmented; data was incomplete; data could not be easily shared; data was not being analysed to the fullest extent; and the results of research were not consistently reaching Canadians. In responding to these issues, the Canadian Institute for Health Information (CIHI), Statistics Canada and Health Canada joined forces to create a National Health Information Roadmap.
A major achievement resulting from this joint initiative was the inception of the Canadian Community Health Survey (CCHS). The primary objectives of the CCHS are to:
- provide timely, reliable, cross-sectional estimates of health determinants, health status and health system utilization across Canada,
- gather data at the sub-provincial level of geography,
- create a flexible survey instrument that:
  - meets specific health region data gaps,
  - develops focused survey content for key data,
  - deals with emerging health and health care issues as they arise.
The CCHS is a cross-sectional survey that collects information related to health status, health care utilization and health determinants for the Canadian population. The CCHS operates on a two-year collection cycle. The first year of the survey cycle, designated ".1", is a large-sample general population health survey designed to provide reliable estimates at the health region level. The second year of the survey cycle, designated ".2", has a smaller sample and is designed to provide provincial-level results on specific focused health topics.
In Canada, the data are used primarily for health surveillance, such as estimating the prevalence of disease, and for other forms of health research. The data are used extensively by the research community and other health professionals. The uniqueness of the CCHS arises from the regional nature of both content and survey implementation. These aspects allow for analysis of health data at a regional level, across Canada. Federal and provincial departments of health and human resources, social service agencies, and other types of government agencies use the information collected from respondents to plan, implement and evaluate programs to improve health and the efficiency of health services. Non-profit health organizations and researchers in academic fields use the information to conduct research aimed at improving health. The media use the results from the surveys to raise awareness about health, an issue of concern to all.
- Diseases and health conditions
- Lifestyle and social conditions
- Prevention and detection of disease
Data sources and methodology
The CCHS covers the population 12 years of age and over living in the ten provinces and the three territories. Excluded from the survey's coverage are: persons living on reserves and other Aboriginal settlements in the provinces; full-time members of the Canadian Forces; the institutionalized population; and persons living in the Quebec health regions of Région du Nunavik and Région des Terres-Cries-de-la-Baie-James. In Nunavut, coverage is limited to the ten largest communities*, which represent about 70% of the Nunavut population. Altogether, these exclusions represent less than 3% of the target population.
*The 10 largest communities in Nunavut are: Iqaluit, Rankin Inlet, Cambridge Bay, Kugluktuk, Cape Dorset, Pangnirtung, Igloolik, Pond Inlet, Baker Lake and Arviat.
Each component of the CCHS questionnaire is developed in collaboration with specialists from Statistics Canada, other federal and provincial departments and/or academic fields. The CCHS questions are designed for computer-assisted interviewing (CAI), meaning that, as the questions were developed, the associated logical flow into and out of the questions was programmed. This includes specifying the type of answer required, the minimum and maximum values, on-line edits associated with the question and what to do in case of item non-response.
One field test was conducted for cycle 3.1 in June 2004. The test involved Statistics Canada's Regional Offices. Experienced Statistics Canada interviewers carried out interviews. The main objectives of the test were to observe respondent reaction to the survey, to obtain estimates of time for the various sections, to study the response rates and to test feedback questions. Field operations and procedures, interviewer training and the data collection computer application were also tested.
In addition to the field test, the data collection computer application was extensively tested in-house in order to identify any errors in the program flow and text. The testing of the data collection computer application was an ongoing operation up until the start of the main survey.
This is a sample survey with a cross-sectional design.
To provide reliable estimates for the 122 Health Regions (HRs), and given the budget allocated to the CCHS component, a sample of 130,000 respondents was desired. The sample allocation strategy gave relatively equal importance to the HRs and the provinces. First, the sample was allocated among the provinces according to their respective populations and the number of HRs they contain. Then, each province's sample was allocated among its HRs proportionally to the square root of the estimated population in each HR.
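The second allocation step can be illustrated with a minimal sketch. The health region names, populations and provincial sample size below are hypothetical, and the function name `allocate_by_sqrt` is introduced only for illustration. It shows how square-root allocation gives small regions a larger share than strictly proportional allocation would.

```python
import math

def allocate_by_sqrt(province_sample, hr_populations):
    """Split a provincial sample among health regions in proportion to sqrt(population)."""
    roots = {hr: math.sqrt(pop) for hr, pop in hr_populations.items()}
    total_roots = sum(roots.values())
    return {hr: province_sample * root / total_roots for hr, root in roots.items()}

# Hypothetical populations for three health regions in one province.
populations = {"HR-A": 1_000_000, "HR-B": 250_000, "HR-C": 10_000}
province_sample = 5_000  # hypothetical provincial sample size

allocation = allocate_by_sqrt(province_sample, populations)

# For comparison: what a strictly proportional allocation would give the smallest region.
proportional_hr_c = province_sample * populations["HR-C"] / sum(populations.values())

for hr, size in allocation.items():
    print(hr, round(size, 1))
print("HR-C under proportional allocation:", round(proportional_hr_c, 1))
```

In this made-up case the smallest region receives about 312 units under square-root allocation, compared with about 40 under proportional allocation, which is what makes regional estimates feasible.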
The CCHS used three sampling frames to select the sample of households: 49% of the sample of households came from an area frame, 50% came from a list frame of telephone numbers and the remaining 1% came from a Random Digit Dialling (RDD) sampling frame. For most of the health regions, 50% of the sample was selected from the area frame and 50% from the list frame of telephone numbers. In two health regions (Northern Quebec and Northern Saskatchewan), only the RDD frame was used. In Nunavut, only the area frame was used. In Yukon and Northwest Territories, most of the sample came from the area frame but a small RDD sample was also selected in Whitehorse and Yellowknife.
The CCHS used the area frame designed for the Labour Force Survey (LFS) as a sampling frame. The sampling plan of the LFS is a multistage stratified cluster design in which the dwelling is the final sampling unit. In the first stage, homogeneous strata are formed and independent samples of clusters are drawn from each stratum. In the second stage, dwelling lists are prepared for each cluster and dwellings, or households, are selected from the lists.
Each province is divided into three types of regions: major urban centres, cities and rural regions. Geographic or socio-economic strata are created within each major urban centre. Within the strata, dwellings are grouped to create clusters. In each stratum, clusters or residential buildings are chosen by a random sampling method with probability proportional to size (PPS), where size is the number of households.
The other cities and rural regions of each province are stratified first on a geographical basis, then according to socio-economic characteristics. Clusters are then selected using the PPS method. The final sample is obtained using a systematic sampling of dwellings.
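The two selection stages described above can be sketched as follows. This is an illustration only: the text does not say which PPS algorithm the LFS uses, so systematic PPS selection is assumed here as one common approach, and all cluster names, sizes and function names are hypothetical.

```python
import random

def pps_systematic(cluster_sizes, n_clusters, rng):
    """Select clusters with probability proportional to size (assumed systematic PPS)."""
    total = sum(cluster_sizes.values())
    step = total / n_clusters
    point = rng.random() * step  # random start in [0, step)
    selected, cumulative = [], 0
    for cluster_id, size in cluster_sizes.items():
        cumulative += size
        # A cluster is hit once for every selection point falling in its size range.
        while point < cumulative and len(selected) < n_clusters:
            selected.append(cluster_id)
            point += step
    return selected

def systematic_sample(units, n_units, rng):
    """Select every k-th unit from a list after a random start."""
    step = len(units) / n_units
    start = rng.random() * step
    return [units[int(start + k * step)] for k in range(n_units)]

rng = random.Random(1)

# Hypothetical stratum: cluster id -> number of households.
sizes = {"c1": 50, "c2": 200, "c3": 30, "c4": 120}
clusters = pps_systematic(sizes, 2, rng)

# Hypothetical dwelling list for one selected cluster.
dwellings = [f"dwelling-{i}" for i in range(200)]
sampled_dwellings = systematic_sample(dwellings, 10, rng)

print(clusters)
print(sampled_dwellings)
```

With these sizes the sampling interval is 200 households, so cluster `c2` (200 households) is certain to be selected, while smaller clusters are selected in proportion to their size.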
The list frame of telephone numbers was used in all but 5 HRs to complement the area frame. The Canada Phone directory was linked to internal administrative conversion files to obtain postal codes, and these were mapped to HRs to create list frame strata. There was one list frame stratum per HR. Within each stratum the required number of telephone numbers was selected using a simple random sampling process from the list.
Finally, in four HRs, a Random Digit Dialling (RDD) sampling frame of telephone numbers was used to select the sample of households. Using available geographic information (postal codes), telephone numbers on the frame were regrouped to create RDD strata to encompass, as closely as possible, the HR areas. Within each RDD stratum, telephone numbers were randomly selected until the required number within the RDD stratum was reached.
Data collection for this reference period: 2005-01-04 to 2005-06-30
Responding to this survey is voluntary.
Data are collected directly from survey respondents.
The CCHS questionnaire is administered using computer-assisted interviewing (CAI). Sample units selected from the area frame are interviewed using the Computer-Assisted Personal Interviewing (CAPI) method while units selected from the Random Digit Dialling (RDD) and telephone list frames are interviewed using the Computer-Assisted Telephone Interviewing (CATI) method.
CAI offers a number of data quality advantages over other collection methods. First, question text, including reference periods and pronouns, is customised automatically based on factors such as the age and sex of the respondent, the date of the interview and answers to previous questions. Second, edits to check for inconsistent answers or out-of-range responses are applied automatically and on-screen prompts are shown when an invalid entry is recorded. Immediate feedback is given to the respondent and the interviewer is able to correct any inconsistencies. Third, questions that are not applicable to the respondent are skipped automatically.
CAPI interviewers work independently from their homes using laptop computers and are supervised from a distance by senior interviewers. Completed interviews are transmitted daily to Statistics Canada's head office using a secure telephone transmission directly from the interviewer's home. CATI interviewers work in centralised offices and are supervised by a senior interviewer located in the same office. Transmission of cases from each of the four CATI offices to head office is the responsibility of the regional office project supervisor, senior interviewer and the technical support team.
An automated call scheduler, i.e. a central system to optimise the timing of call-backs and the scheduling of appointments, is used to support CATI collection.
Some editing of the data is performed at the time of the interview by the computer-assisted interviewing (CAI) application. It is not possible for interviewers to enter out-of-range values and flow errors are controlled through programmed skip patterns. For example, CAI ensures that questions that do not apply to a respondent are not asked. In response to some types of inconsistent or unusual reporting, warning messages are invoked but no corrective action is taken at the time of the interview. Where appropriate, edits are instead developed to be performed at Head Office after data collection. Inconsistencies are usually corrected by setting one or both of the variables in question to "not stated".
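A post-collection consistency edit of the kind described above might look like the following minimal sketch. The rule, the variable names (`age`, `age_first_smoked`) and the function name are hypothetical and are not taken from the CCHS processing system. The sketch only shows the general pattern of setting an inconsistent value to "not stated" rather than guessing a correction.

```python
NOT_STATED = "not stated"

def apply_consistency_edit(record):
    """Hypothetical rule: the age at which a respondent first smoked cannot exceed current age."""
    edited = dict(record)  # leave the collected record untouched
    age = edited.get("age")
    age_first_smoked = edited.get("age_first_smoked")
    if isinstance(age, int) and isinstance(age_first_smoked, int) and age_first_smoked > age:
        # The inconsistency cannot be resolved after the interview, so the value is blanked.
        edited["age_first_smoked"] = NOT_STATED
    return edited

inconsistent = {"age": 30, "age_first_smoked": 45}
consistent = {"age": 30, "age_first_smoked": 16}

print(apply_consistency_edit(inconsistent))
print(apply_consistency_edit(consistent))
```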
Several edits are performed at Head Office during the data processing step. A critical error edit is done that rejects respondent entries (for instance, excluded populations). Flow errors are also adjusted during processing and a data inconsistency detection and correction program is applied. Response frequency obtained during the current period and previous reference periods is also compared to identify errors prior to release.
Health indicators originating from the CCHS core content go through a validation process after the final microdata files are produced. Estimates for all geography levels, by sex and by age group, are compared to estimates from previous years. This process makes it possible to confirm that estimates of key indicators are acceptable.
The principle behind estimation in a probability sample is that each person in the sample "represents", besides himself or herself, several other persons not in the sample. For example, in a simple random 2% sample of the population, each person in the sample represents 50 persons in the population. In the terminology used here, it can be said that each person has a weight of 50. The weighting phase is a step that calculates, for each person, his or her associated sampling weight. This weight must be used to derive meaningful estimates from the survey. For example, if the number of individuals who had a major depressive episode is to be estimated, the weights of survey respondents having that characteristic should be summed. In order for estimates produced from survey data to be representative of the covered population and not just the sample itself, a user must incorporate the survey weights into their calculations.
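The arithmetic above can be illustrated with a short sketch. The respondent records and weights are made up, and the field names are hypothetical. The sketch shows that a population count is estimated by summing the weights of respondents with the characteristic, and that ignoring the weights can give a different rate.

```python
# A simple random 2% sample gives every respondent a weight of 1 / 0.02 = 50.
sampling_fraction = 0.02
design_weight = 1 / sampling_fraction

# Hypothetical respondents; real CCHS weights vary because of the design and adjustments.
respondents = [
    {"weight": 50.0, "major_depressive_episode": True},
    {"weight": 50.0, "major_depressive_episode": False},
    {"weight": 62.5, "major_depressive_episode": True},
    {"weight": 40.0, "major_depressive_episode": False},
]

# Estimated number of people in the population with the characteristic.
estimated_count = sum(r["weight"] for r in respondents if r["major_depressive_episode"])

# Estimated size of the covered population, and the weighted rate.
estimated_population = sum(r["weight"] for r in respondents)
weighted_rate = estimated_count / estimated_population

# The unweighted rate describes only the sample, not the population.
unweighted_rate = sum(r["major_depressive_episode"] for r in respondents) / len(respondents)

print(estimated_count, estimated_population)
print(round(weighted_rate, 3), unweighted_rate)
```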
In order to determine the quality of an estimate, the variance must be calculated. Because the CCHS uses a multi-stage survey design, there is no simple formula that can be used to calculate variance estimates. Therefore, an approximate method is needed. The coefficient of variation, standard deviation and confidence intervals can then be calculated from the variance. The bootstrap re-sampling method used in the CCHS involves the selection of simple random samples known as replicates, and the calculation of the variation between the estimates from replicate to replicate. In each stratum, a simple random sample of (n-1) of the n clusters is selected with replacement to form a replicate. Note that since the selection is with replacement, a cluster may be chosen more than once. In each replicate, the survey weight for each record in the (n-1) selected clusters is recalculated. These weights are then post-stratified according to demographic information in the same way as the sampling design weights in order to obtain the final bootstrap weights. The entire process (selecting simple random samples, recalculating and post-stratifying weights for each stratum) is repeated B times, where B is large. The CCHS typically uses B=500, to produce 500 bootstrap weights. To obtain the bootstrap variance estimator, the point estimate for each of the B samples must be calculated. The variance of these B estimates is the bootstrap variance estimator, and its square root is the corresponding standard deviation. Statistics Canada has developed a program that can perform all of these calculations for the user: the Bootvar program.
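The replicate procedure can be sketched in a few lines. This is not the Bootvar program. The weight rescaling factor used here (number of times a cluster is drawn, multiplied by n/(n-1)) is an assumption borrowed from the commonly used rescaling bootstrap, the post-stratification step is omitted, and the variance uses a divisor of B, which is also an assumption. All data and names are hypothetical.

```python
import math
import random
from collections import Counter
from statistics import pvariance

def make_bootstrap_weights(records, n_replicates, rng):
    """Return one list of rescaled weights per replicate (post-stratification omitted)."""
    clusters_by_stratum = {}
    for rec in records:
        clusters_by_stratum.setdefault(rec["stratum"], set()).add(rec["cluster"])

    replicates = []
    for _ in range(n_replicates):
        factor = {}
        for stratum, ids in clusters_by_stratum.items():
            ids = sorted(ids)
            n = len(ids)  # this sketch needs at least 2 clusters per stratum
            draws = Counter(rng.choices(ids, k=n - 1))  # n-1 draws with replacement
            for cluster_id in ids:
                # Assumed rescaling: clusters not drawn get weight 0 in this replicate.
                factor[(stratum, cluster_id)] = draws[cluster_id] * n / (n - 1)
        replicates.append(
            [rec["weight"] * factor[(rec["stratum"], rec["cluster"])] for rec in records]
        )
    return replicates

def weighted_total(records, weights, flag):
    """Sum the weights of records that have the characteristic."""
    return sum(w for rec, w in zip(records, weights) if rec[flag])

# Hypothetical data: 2 strata x 3 clusters x 2 respondents, all with weight 50.
records = [
    {"stratum": s, "cluster": c, "weight": 50.0, "has_condition": (s + c + p) % 3 == 0}
    for s in (1, 2) for c in (1, 2, 3) for p in (0, 1)
]

rng = random.Random(2005)
B = 500
boot = make_bootstrap_weights(records, B, rng)
estimates = [weighted_total(records, weights, "has_condition") for weights in boot]

point_estimate = weighted_total(records, [r["weight"] for r in records], "has_condition")
variance = pvariance(estimates)          # variance of the B replicate estimates
std_dev = math.sqrt(variance)            # standard deviation derived from the variance
cv_percent = 100 * std_dev / point_estimate

print(point_estimate, round(variance, 1), round(std_dev, 1), round(cv_percent, 1))
```

In practice users would work with the bootstrap weights supplied with the data files rather than generating their own, since those weights already include the post-stratification adjustments.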
To ensure the survey meets its objectives (see "Survey Description"), a Steering Committee and an Advisory Board comprised of authorities from the provincial and territorial Ministries of Health, Health Canada and the Public Health Agency of Canada determined the concepts and focus. Expert Groups were convened to advise on the measures to obtain the results envisioned by the Steering Committee and Advisory Board, and to recommend proven collection vehicles and indices. The resulting data are recognized as valid measures of contemporary concepts such as depression, activity limitation, weight problems and chronic pain.
Throughout the collection process, control and monitoring measures were put in place and corrective action was taken to minimize non-sampling errors. These measures included response rate evaluation, evaluation of reported and non-reported data, on-site observation of interviews, and improved collection tools for interviewers, among others.
Once processing steps are completed, three data validation steps are undertaken. First, analysts use the data internally to publish analytical articles on specific themes. This work allows for an in-depth look at many variables of the survey and represents a very effective way to find errors. Second, a validation program is run in order to compare some key survey indicators with those of the previous year. This validation is performed at various geographical levels, as well as by age and sex. Significant differences are examined further to find any anomalies in the data. Third, CCHS variables that are also collected by other Statistics Canada surveys are compared with those sources. When important differences with other sources are found, the CCHS team investigates and documents possible causes.
Last, an external validation step is also part of the validation process. Share files are sent before release to provincial and federal partners for a two-week examination period. They can then scrutinize the data and inform Statistics Canada of any concerns or anomalies related to data quality.
Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
Public Use Microdata Files (PUMFs) based on the full 12 months of data from CCHS Cycle 3.1 are planned for release in September 2006. The PUMFs differ in a number of important aspects from the survey 'master' files held by Statistics Canada. These differences are the result of actions taken to protect the anonymity of individual survey respondents. First, sensitive variables are grouped, capped or completely removed from the files. Second, some health regions are collapsed with other regions on the PUMF because of their small population sizes and the associated risk of disclosure. Last, some values of indirect identifiers (mostly socio-demographic variables) are suppressed.
Users requiring access to information excluded from the microdata files may purchase custom tabulations, or access the master files through the Research Data Centres program or the Remote Access program. Outputs are vetted for confidentiality before being given to users.
Before releasing and/or publishing any estimate from these files, microdata users should first determine the number of sampled respondents who contribute to the calculation of the estimate. If this number is less than 30, the weighted estimate should not be released regardless of the value of the coefficient of variation for this estimate. For weighted estimates based on sample sizes of 30 or more, users should determine the coefficient of variation of the rounded estimate and follow the guidelines below.
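The minimum sample size check can be sketched as below. Only the threshold of 30 comes from the text above. The records, field names and function name are hypothetical, and the coefficient of variation guidelines are not reproduced here. The point of the example is that the check uses the unweighted count of contributing respondents, however large the weighted estimate is.

```python
MIN_RESPONDENTS = 30  # threshold stated in the release guidelines above

def meets_minimum_sample(records, contributes):
    """Count the sampled respondents behind an estimate and compare with the threshold."""
    n = sum(1 for rec in records if contributes(rec))
    return n, n >= MIN_RESPONDENTS

# Hypothetical domain: 25 respondents with the characteristic, each with a weight of 400.
records = [{"weight": 400.0, "has_condition": True} for _ in range(25)]
records += [{"weight": 400.0, "has_condition": False} for _ in range(100)]

weighted_estimate = sum(r["weight"] for r in records if r["has_condition"])
n_contributing, releasable = meets_minimum_sample(records, lambda r: r["has_condition"])

# The weighted estimate is 10,000 people, but it rests on only 25 respondents.
print(weighted_estimate, n_contributing, releasable)
```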
Estimates in the main body of a statistical table are rounded to the nearest hundred units using the normal rounding technique. In normal rounding, if the first or only digit dropped is zero to four, the last digit retained is not changed. If the first or only digit dropped is five to nine, the last digit retained is raised by one. Marginal sub-totals and totals in statistical tables are derived from their corresponding unrounded components and then are rounded themselves to the nearest 100 units using normal rounding. Averages, proportions, rates and percentages are computed from unrounded components (for example, numerators and/or denominators) and then are rounded themselves to one decimal using normal rounding. Sums and differences of aggregates (or ratios) are derived from their corresponding unrounded components and then are rounded themselves to the nearest 100 units (or the nearest one decimal) using normal rounding. Under no circumstances are unrounded estimates, published or otherwise, released.
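A minimal sketch of these rounding rules follows. The function names and values are hypothetical. Python's built-in `round` rounds halves to the nearest even digit, which differs from the normal rounding described above, so the sketch uses the `decimal` module with `ROUND_HALF_UP` instead.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to_hundred(value):
    """Normal rounding to the nearest 100 units (a dropped 5 rounds up)."""
    hundreds = (Decimal(str(value)) / 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(hundreds * 100)

def round_one_decimal(value):
    """Normal rounding to one decimal place, for rates and percentages."""
    return Decimal(str(value)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)

# Hypothetical unrounded component estimates for two table cells.
components = [1249, 1249]

rounded_cells = [round_to_hundred(c) for c in components]

# The total is derived from the unrounded components and then rounded itself.
rounded_total = round_to_hundred(sum(components))

print(rounded_cells, rounded_total)
print(round_to_hundred(1250), round(1250, -2))
print(round_one_decimal(12.25))
```

Here the rounded total (2,500) differs from the sum of the rounded cells (2,400), which is the expected consequence of deriving totals from unrounded components.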
For details on data accuracy measures and response rates, please refer to the Mode Study and User Guide documents under the Documentation section.
- Master File - 6 months
- Master File - 12 months
- Public Use Microdata File
- ARCHIVED - Mode Study (PDF Version, 131.32kb)
- ARCHIVED - Response Rates (PDF Version, 139.53kb)
- ARCHIVED - Health Surveys - Cross-sectional samples - Aspects that may explain differences in the estimates obtained from two different survey occasions (PDF Version, 128.30kb)
- ARCHIVED - Canadian Community Health Survey - Errata (last updated October 2009) (PDF Version, 196.77kb)