Canadian Community Health Survey (CCHS)
The central objective of the Canadian Community Health Survey (CCHS) is to gather health-related data at the sub-provincial levels of geography (health region or combined health regions).
Detailed information for 2003 (Cycle 2.1)
Data release - June 15, 2004
- Questionnaire(s) and reporting guide(s)
- Variables and definitions
- Data sources and methodology
- Data accuracy
- Data file
In 1991, the National Task Force on Health Information cited a number of issues and problems with the health information system. These problems were that: data was fragmented; data was incomplete; data could not be easily shared; data was not being analysed to the fullest extent; and the results of research were not consistently reaching Canadians. In responding to these issues, the Canadian Institute for Health Information (CIHI), Statistics Canada and Health Canada joined forces to create a National Health Information Roadmap.
A major achievement resulting from this joint initiative was the inception of the Canadian Community Health Survey (CCHS). The primary objectives of the CCHS are to:
- provide timely, reliable, cross-sectional estimates of health determinants, health status and health system utilization across Canada,
- gather data at the sub-provincial level of geography,
- create a flexible survey instrument that:
  - meets specific health region data gaps,
  - develops focused survey content for key data,
  - deals with emerging health and health care issues as they arise.
The CCHS is a cross-sectional survey that collects information related to health status, health care utilization and health determinants for the Canadian population. The CCHS operates on a two-year collection cycle. The first year of the survey cycle ".1" is a large sample, general population health survey, designed to provide reliable estimates at the health region level. The second year of the survey cycle ".2" has a smaller sample and is designed to provide provincial level results on specific focused health topics.
In Canada, the primary use of the data is for health surveillance, such as in prevalence of disease and other forms of health research. The data are used extensively by the research community and other health professionals. The uniqueness of the CCHS arises from the regional nature of both content and survey implementation. These aspects allow for analysis of health data at a regional level, across Canada. Federal and provincial departments of health and human resources, social service agencies, and other types of government agencies use the information collected from respondents to plan, implement and evaluate programs to improve health and the efficiency of health services. Non-profit health organizations and researchers in academic fields use the information to conduct research to improve health. The media use the results from the surveys to raise awareness about health, an issue of concern to all.
- Diseases and health conditions
- Lifestyle and social conditions
- Prevention and detection of disease
Data sources and methodology
The CCHS covers the population 12 years of age and over living in the ten provinces and the three territories. Excluded from the survey's coverage are: persons living on reserves and other Aboriginal settlements in the provinces; full-time members of the Canadian Forces; the institutionalized population and persons living in the Quebec health regions of Région du Nunavik and Région des Terres-Cries-de-la-Baie-James. In Nunavut, the coverage is limited to the ten largest communities*, which represent about 70% of the Nunavut population. Altogether, these exclusions represent less than 3% of the target population.
*The 10 largest communities in Nunavut are: Iqaluit, Rankin Inlet, Cambridge Bay, Kugluktuk, Cape Dorset, Pangnirtung, Igloolik, Pond Inlet, Baker Lake and Arviat.
Each CCHS cycle questionnaire was developed in collaboration with specialists from Statistics Canada, other departments and/or academic fields. The CCHS questions were designed for computer-assisted interviewing (CAI), meaning that, as the questions were developed, the associated logical flow into and out of the questions was programmed. This included specifying the type of answer required, the minimum and maximum values, on-line edits associated with the question and what to do in case of item non-response.
With CAI, the interview can be controlled based on answers provided by the respondent. On-screen prompts are shown when an invalid entry is recorded and thus immediate feedback is given to the respondent and/or the interviewer to correct inconsistencies. Another enhancement is the automatic insertion of reference periods based on current dates. Pre-filling of text or data based on information gathered during the interview allows the interviewer to proceed without having to search back for previous answers. This type of pre-fill includes such things as using the correct name or sex within the questions themselves. Allowable ranges/answers based on data collected during the interview can also be programmed. In other words, the questionnaire can be customized to the respondent based on data collected at that time or during a previous interview.
One field test was conducted for cycle 2.1. The test involved Statistics Canada's Regional Offices. Experienced Labour Force Survey interviewers carried out interviews. The main objectives of the test were to observe respondent reaction to the survey, to obtain estimates of time for the various sections, to study the response rates and to test feedback questions. Field operations and procedures, interviewer training and the data collection computer application were also tested. In addition to the field test, the data collection computer application was extensively tested in-house in order to identify any errors in the program flow and text. The testing of the data collection computer application was an ongoing operation up until the start of the main survey.
This is a sample survey with a cross-sectional design.
To provide reliable estimates for the 133 Health Regions (HRs), and given the budget allocated to the CCHS component, a sample of 130,000 respondents was desired. Although producing reliable estimates at the HR level was a primary objective, the quality of the estimates for certain key characteristics at the provincial level was also deemed important. The sample allocation strategy, consisting of three steps, gave relatively equal importance to the HRs and the provinces. In the first two steps, the sample was allocated among the provinces according to their respective populations and the number of HRs they contain. In the third step, each province's sample was allocated among its HRs proportionally to the square root of the estimated population in each HR.
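The third allocation step can be illustrated with a minimal sketch. The function name, the health region labels and the population figures below are hypothetical and are used only to show how allocation proportional to the square root of population behaves; the actual CCHS allocation involved the first two steps as well, which are not reproduced here.

```python
import math

def allocate_sqrt(province_sample, hr_populations):
    """Split a province's sample among its health regions (HRs)
    in proportion to the square root of each HR's population."""
    roots = {hr: math.sqrt(pop) for hr, pop in hr_populations.items()}
    total_root = sum(roots.values())
    return {hr: province_sample * root / total_root for hr, root in roots.items()}

# Hypothetical populations for three HRs in one province.
populations = {"HR-A": 10_000, "HR-B": 40_000, "HR-C": 90_000}
allocation = allocate_sqrt(1_000, populations)
for hr, size in allocation.items():
    print(hr, round(size, 1))
```

With these made-up figures, HR-A receives about 167 of the 1,000 sampled units, compared with about 71 under allocation strictly proportional to population, which shows how the square root favours smaller regions.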
The CCHS used three sampling frames to select the sample of households. The majority of the sample of households came from an area frame. In some HRs, a Random Digit Dialling (RDD) sampling frame or a list frame of telephone numbers was also used.
The CCHS used the area frame designed for the Canadian Labour Force Survey (LFS) as its primary frame. The sampling plan of the LFS is a multistage stratified cluster design in which the dwelling is the final sampling unit. In the first stage, homogeneous strata were formed and independent samples of clusters were drawn from each stratum. In the second stage, dwelling lists were prepared for each cluster and dwellings, or households, were selected from the lists.
For the purpose of the plan, each province is divided into three types of regions: major urban centres, cities and rural regions. Geographic or socio-economic strata are created within each major urban centre. Within the strata, dwellings are regrouped to create clusters. Some urban centres have separate strata for apartments or for census enumeration areas (EA) in which the average household income is high. In each stratum, six clusters or residential buildings (sometimes 12 or 18 apartments) are chosen by a random sampling method with a probability proportional to size (PPS), the size of which corresponds to the number of households. The number six was used throughout the sample design to allow a one-sixth rotation of the sample every month for the LFS.
The other cities and rural regions of each province are stratified first on a geographical basis, then according to socio-economic characteristics. In the majority of strata, six clusters (usually census EAs) are selected using the PPS method. Where there is low population density, a three-step plan is used whereby two or three primary sampling units (PSU), which normally correspond to groups of EAs, are selected and divided into clusters, six of which are sampled. The selection is made at each step using the PPS method. Once the new clusters are listed, the sample is obtained using a systematic sampling of dwellings.
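The two selection stages described above can be sketched as follows. The text does not say which PPS technique the LFS frame uses, so this example assumes systematic PPS selection with a random start; the function names, cluster sizes and dwelling labels are hypothetical.

```python
import random

def pps_systematic(sizes, n, rng):
    """Select n clusters with probability proportional to size (PPS),
    using a random start and a fixed interval on the cumulated sizes.
    Returns the indices of the selected clusters."""
    interval = sum(sizes) / n
    start = rng.random() * interval          # random start in [0, interval)
    selected = []
    cumulative = 0.0
    idx = 0
    for k in range(n):
        point = start + k * interval
        # Advance until the selection point falls inside cluster idx.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= point:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected

def systematic_sample(items, n, rng):
    """Select n items from a list using a random start and a fixed step."""
    interval = len(items) / n
    start = rng.random() * interval
    return [items[min(int(start + k * interval), len(items) - 1)] for k in range(n)]

rng = random.Random(2003)
# Hypothetical stratum: household counts for 20 clusters.
cluster_sizes = [120, 80, 200, 150, 90, 60, 170, 110, 140, 100,
                 95, 130, 75, 160, 85, 105, 180, 70, 125, 115]
clusters = pps_systematic(cluster_sizes, 6, rng)   # six clusters per stratum

# Second stage: systematic sample of dwellings from one selected cluster's list.
dwelling_list = [f"dwelling-{i}" for i in range(cluster_sizes[clusters[0]])]
dwellings = systematic_sample(dwelling_list, 10, rng)
print(clusters, dwellings[:3])
```

A cluster larger than the sampling interval could be selected more than once under this scheme; real designs usually treat such clusters separately, which this sketch does not attempt.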
Responding to this survey is voluntary.
Data are collected directly from survey respondents.
The CCHS questionnaire was administered using computer-assisted interviewing (CAI). Sample units selected from the area frame were interviewed using the Computer-Assisted Personal Interviewing (CAPI) method while units selected from the Random Digit Dialling (RDD) and telephone list frames were interviewed using the Computer-Assisted Telephone Interviewing (CATI) method.
CAI offers a number of data quality advantages over other collection methods. First, question text, including reference periods and pronouns, is customised automatically based on factors such as the age and sex of the respondent, the date of the interview and answers to previous questions.
Second, edits to check for inconsistent answers or out-of-range responses are applied automatically and on-screen prompts are shown when an invalid entry is recorded. Immediate feedback is given to the respondent and the interviewer is able to correct any inconsistencies.
Third, questions that are not applicable to the respondent are skipped automatically.
CAPI interviewers worked independently from their homes using laptop computers and were supervised from a distance by senior interviewers. Completed interviews were transmitted daily to Statistics Canada's head office using a secure telephone transmission directly from the interviewer's home.
CATI interviewers worked in centralised offices and were supervised by a senior interviewer located in the same office. Transmission of cases from each of 5 CATI offices to head office was the responsibility of the regional office project supervisor, senior interviewer and the technical support team.
An automated call scheduler, i.e., a central system to optimise the timing of call-backs and the scheduling of appointments, was not available to support CATI collection. Instead, at the start of each month a batch of cases was assigned to each personal computer in each CATI office. The caseload on each PC was then managed manually. Because the number of CATI cases was relatively small, this approach was reasonably efficient and the absence of a call scheduler is not thought to have had an adverse effect on data quality.
View the Questionnaire(s) and reporting guide(s).
Some editing of the data is performed at the time of the interview by the computer-assisted interviewing (CAI) application. It is not possible for interviewers to enter out-of-range values and flow errors are controlled through programmed skip patterns. For example, CAI ensures that questions that do not apply to a respondent are not asked. In response to some types of inconsistent or unusual reporting, warning messages are invoked but no corrective action is taken at the time of the interview. Where appropriate, edits are instead developed to be performed at Head Office after data collection. Inconsistencies are usually corrected by setting one or both of the variables in question to "not stated".
Several edits are performed at Head Office during the data processing step. A critical error edit is done that rejects respondent entries (for instance, excluded populations). Flow errors are also adjusted during processing and a data inconsistency detection and correction program is applied. Response frequency obtained during the current period and previous reference periods is also compared to identify errors prior to release.
Health indicators originating from the CCHS core content go through a validation process after the final microdata files are produced. Estimates for all geography levels, by sex and by age group, are compared to estimates from previous years. This process confirms that estimates of key indicators are acceptable.
The principle behind estimation in a probability sample is that each person in the sample "represents", besides himself or herself, several other persons not in the sample. For example, in a simple random 2% sample of the population, each person in the sample represents 50 persons in the population. In the terminology used here, it can be said that each person has a weight of 50. The weighting phase is a step that calculates, for each person, his or her associated sampling weight. This weight must be used to derive meaningful estimates from the survey. For example, if the number of individuals who had a major depressive episode is to be estimated, the weights of survey respondents having that characteristic should be summed. In order for estimates produced from survey data to be representative of the covered population and not just the sample itself, a user must incorporate the survey weights into their calculations.
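As a minimal sketch of the weighting principle, the example below sums the weights of respondents who report a characteristic. The record layout, field names and values are hypothetical and do not correspond to actual CCHS variable names.

```python
def weighted_total(records, has_characteristic):
    """Estimate the number of persons in the population with a characteristic
    by summing the weights of the respondents who have it."""
    return sum(r["weight"] for r in records if has_characteristic(r))

def weighted_proportion(records, has_characteristic):
    """Estimate the population share with the characteristic."""
    total_weight = sum(r["weight"] for r in records)
    return weighted_total(records, has_characteristic) / total_weight

# Hypothetical simple random 2% sample: every respondent has a weight of 50.
sample = [
    {"weight": 50, "depressive_episode": True},
    {"weight": 50, "depressive_episode": False},
    {"weight": 50, "depressive_episode": False},
    {"weight": 50, "depressive_episode": True},
]

estimate = weighted_total(sample, lambda r: r["depressive_episode"])
share = weighted_proportion(sample, lambda r: r["depressive_episode"])
print(estimate, share)
```

In a real file the weights differ from person to person, which is why an unweighted count or proportion generally does not represent the covered population.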
In order to determine the quality of an estimate, the variance must be calculated. Because the CCHS uses a multi-stage survey design, there is no simple formula that can be used to calculate variance estimates. Therefore, an approximate method is needed. The coefficient of variation, standard deviation and confidence intervals can then be calculated from the variance. The bootstrap re-sampling method used in the CCHS involves the selection of simple random samples known as replicates, and the calculation of the variation between the estimates from replicate to replicate. In each stratum, a simple random sample of (n-1) of the n clusters is selected with replacement to form a replicate. Note that since the selection is with replacement, a cluster may be chosen more than once. In each replicate, the survey weight for each record in the (n-1) selected clusters is recalculated. These weights are then post-stratified according to demographic information in the same way as the sampling design weights in order to obtain the final bootstrap weights. The entire process (selecting simple random samples, recalculating and post-stratifying weights for each stratum) is repeated B times, where B is large. The CCHS typically uses B=500, to produce 500 bootstrap weights. To obtain the bootstrap variance estimator, the point estimate for each of the B samples must be calculated. The variance of these B estimates is the bootstrap variance estimator; its square root gives the standard deviation. Statistics Canada has developed a program that can perform all of these calculations for the user: the Bootvar program.
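The final step, going from B sets of bootstrap weights to a variance, can be sketched as below. This is not the Bootvar program; it is an illustration that assumes the variance is taken around the mean of the replicate estimates with divisor B, and that a normal-approximation 95% confidence interval is acceptable. Function names and data are hypothetical.

```python
import math
import statistics

def replicate_totals(values, bootstrap_weights):
    """Compute one weighted total per replicate.
    values: one value per respondent (e.g. 1 if the characteristic is present).
    bootstrap_weights: B lists of weights, each aligned with values."""
    return [sum(w * y for w, y in zip(weights, values))
            for weights in bootstrap_weights]

def bootstrap_summary(point_estimate, replicate_estimates):
    """Derive variance, standard deviation, CV and a 95% interval
    from the replicate estimates (assumed conventions, see lead-in)."""
    variance = statistics.pvariance(replicate_estimates)  # divisor B
    sd = math.sqrt(variance)
    return {
        "variance": variance,
        "sd": sd,
        "cv": sd / point_estimate,
        "ci95": (point_estimate - 1.96 * sd, point_estimate + 1.96 * sd),
    }

# Hypothetical data: three respondents and B = 2 replicates (the CCHS uses 500).
values = [1, 0, 1]
weights_by_replicate = [[50, 50, 50], [60, 40, 50]]
estimates = replicate_totals(values, weights_by_replicate)
print(bootstrap_summary(100, estimates))
```

The same pattern applies to any estimator: compute it once with the final survey weight for the point estimate, then once per bootstrap weight to obtain the replicate estimates.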
Survey design has a profound effect on the objectives of the survey which are listed under "Survey Description". To meet these objectives, a Steering Committee and an Advisory Board comprised of authorities from the provincial and territorial Ministries of Health, the Canadian Institute for Health Information, and Health Canada determined the concepts and focus. Expert Groups were convened to advise on the measures to obtain the results envisioned by the Steering Committee and Advisory Board, and to recommend proven collection vehicles and indices. The resulting data are recognized as valid measures of contemporary concepts such as depression, activity limitation, weight problems and chronic pain.
The frames chosen to provide the sample, the Labour Force Survey, RDD and list frame of telephone numbers, have been combined with sampling design methodologies which have been tested, used repeatedly, and have been proven to produce accurate results. The large sample in each province/territory helps ensure accurate and meaningful results.
High response rates are essential for quality data. Actions have been taken to reduce non-sampling errors to a minimum. To reduce the number of non-response cases, the interviewers are all extensively trained by Statistics Canada, provided with detailed Interviewer Manuals, and are under the direction of interviewer supervisors. The extent of non-response varies from partial non-response (failure to answer just one or some questions) to total non-response. Partial non-response was basically non-existent because once the interview was started, the respondents generally finished it. In most cases, partial non-response to the survey occurred when the respondent did not understand or misinterpreted a question, refused to answer a question, could not recall the requested information or could not provide personal or proxy information. Total non-response occurred because the interviewer was either unable to trace the respondent, no member of the household was able to provide the information or the respondent refused to participate in the survey. Total non-response was handled by adjusting the weight of households that responded to the survey to compensate for those who did not respond. Refusals were followed up by senior interviewers, project supervisors or by other interviewers to encourage respondents to participate in the survey. In addition, to maximize the response rate, non-response cases were also followed up in subsequent collection periods.
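The weight adjustment for total non-response can be illustrated with a simple ratio adjustment. The text does not describe how adjustment classes were formed, so this sketch treats all households passed to the function as one class; the function name and data are hypothetical.

```python
def adjust_for_nonresponse(households):
    """Inflate the weights of responding households so that they also
    represent the non-responding households in the same adjustment class."""
    total_weight = sum(h["weight"] for h in households)
    respondents = [h for h in households if h["responded"]]
    responding_weight = sum(h["weight"] for h in respondents)
    factor = total_weight / responding_weight
    return [{**h, "weight": h["weight"] * factor} for h in respondents]

# Hypothetical class of four households, one of which did not respond.
households = [
    {"id": 1, "weight": 50, "responded": True},
    {"id": 2, "weight": 50, "responded": True},
    {"id": 3, "weight": 50, "responded": False},
    {"id": 4, "weight": 50, "responded": True},
]
adjusted = adjust_for_nonresponse(households)
print([round(h["weight"], 2) for h in adjusted])
```

After adjustment the three responding households carry the full weight of the original four, so the class still represents the same number of households in the population.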
Public Use Microdata Files (PUMFs) are produced in addition to the Master files. The PUMFs differ in a number of important aspects from the survey "master" files held by Statistics Canada. These differences are the result of actions taken to protect the anonymity of individual survey respondents. First, only cross-sectional data are available on such files, because longitudinal information can lead to the identification of respondents. Also, some sensitive variables are regrouped, capped or completely deleted from the files. Users requiring access to information excluded from the microdata files may purchase custom tabulations, or access the master files through the Research Data Centres program or the Remote Access program. Outputs are vetted for confidentiality before being given to users.
Before releasing and/or publishing any estimate from these files, users should first determine the number of sampled respondents who contribute to the calculation of the estimate. If this number is less than 30, the weighted estimate should not be released regardless of the value of the coefficient of variation for this estimate. For weighted estimates based on sample sizes of 30 or more, users should determine the coefficient of variation of the rounded estimate and follow the guidelines below.
Estimates in the main body of a statistical table are rounded to the nearest hundred units using the normal rounding technique. If the first or only digit dropped is zero to four, the last digit retained is not changed. If the first or only digit dropped is five to nine, the last digit retained is raised by one. Marginal sub-totals and totals in statistical tables are derived from their corresponding unrounded components and then are rounded themselves to the nearest 100 units using normal rounding methods. Averages, proportions, rates and percentages are computed from unrounded components (for example, numerators and/or denominators) and then are rounded themselves to one decimal using normal rounding. In normal rounding to a single digit, if the final or only digit dropped is zero to four, the last digit retained is not changed. If the first or only digit dropped is five to nine, the last digit retained is increased by one. Sums and differences of aggregates (or ratios) are derived from their corresponding unrounded components and then are rounded themselves to the nearest 100 units (or the nearest one decimal) using normal rounding. Under no circumstances are unrounded estimates, published or otherwise, released. Unrounded estimates imply greater precision than actually exists.
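The release and rounding rules above can be sketched in a few lines. The helper names are hypothetical. The `decimal` module is used because Python's built-in `round` rounds halves to the nearest even digit, which differs from the normal rounding described here.

```python
from decimal import Decimal, ROUND_HALF_UP

def meets_minimum_sample(n_respondents, minimum=30):
    """An estimate based on fewer than 30 sampled respondents is not released."""
    return n_respondents >= minimum

def round_to_hundred(value):
    """Normal rounding of an aggregate to the nearest 100 units."""
    hundreds = (Decimal(str(value)) / 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(hundreds * 100)

def round_one_decimal(value):
    """Normal rounding of a rate, proportion or average to one decimal."""
    return float(Decimal(str(value)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))

# Totals are derived from unrounded components, then rounded themselves.
components = [1250, 1250, 1250]
total = round_to_hundred(sum(components))                  # 3750 -> 3800
sum_of_rounded = sum(round_to_hundred(c) for c in components)  # 1300 * 3 = 3900
print(total, sum_of_rounded, round_one_decimal(12.25))
```

The example shows why a published total may differ from the sum of the published components: each figure is rounded independently from unrounded values.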
Most editing for the CCHS is conducted at the time of the interview by the Computer Assisted Interview (CAI) application. Some types of inconsistent or unusual reporting were edited after data collection at Head Office. Inconsistencies were usually corrected by setting answers to questions to 'not stated'.
For Cycle 2.1, some questions or modules were appropriate for self-response only and these were skipped for the interviews completed by proxy.
Detailed Microdata User Guides are developed for each cycle to provide all the relevant background information on each of the surveys (background, methodology, data quality, data dictionary, derived variables specifications, etc).
Special studies were conducted on the survey data. These include a validation of CCHS results in relation to various other surveys for both CCHS cycles conducted so far. A special study comparing the profile of respondents who agreed to share their data with other departments was conducted.
- Master File
- Public Use Microdata File
- Health Services Access Survey
- ARCHIVED - Guidelines for the use of sub-sample variables (PDF Version, 81.95kb)
- ARCHIVED - Mode Study (PDF Version, 131.32kb)
- ARCHIVED - Health Surveys - Cross-sectional samples - Aspects that may explain differences in the estimates obtained from two different survey occasions (PDF Version, 128.30kb)
- ARCHIVED - Canadian Community Health Survey - Errata (last updated October 2009) (PDF Version, 196.77kb)
- Public use microdata file (PUMF): Canadian Community Health Survey - 2003 - Cycle 2.1