Joint Canada/United States Survey of Health (JCUSH)
Detailed information for 2003
The Joint Canada/United States Survey of Health (JCUSH) collects information from both Canadian and U.S. residents about their health, their use of health care and their functional limitations.
Data release - June 2, 2004
The Joint Canada/United States Survey of Health (JCUSH) is an outgrowth of a session at the 2000 United States/Canada Interchange on making data more comparable and integrated. The Interchanges, two-day meetings whose site alternates between the Washington DC area and Ottawa, began in 1999. The purpose of the meetings is to promote communication, collaboration, cooperation, synergy, facilitation of comparative analyses, and interaction between countries. Participants at the 2000 Interchange suggested that the best way to make comparisons would be to conduct a joint survey in which the questionnaire, sample design, data collection methods, data processing, and editing were done at the same time in the same way. A team of staff members from the two agencies was designated to explore the feasibility of the project and implement it.
Through integrating survey design and data collection methodologies, the JCUSH attempts to increase knowledge about the comparability of the two health data systems, as well as provide a model for future international comparisons. Although previous studies have examined cross-cultural comparisons of social, economic, and political characteristics using separate data sets, a number of methodological limitations exist in their research. Differences in survey results may exist even when taking into account methodology specific to each body of data. While debates such as this over validity of comparative research continue, the JCUSH adds to the literature as well as produces a body of fully comparable data between Canada and the United States.
The principal objectives of the JCUSH are:
- To develop, implement, and document a collaboration between national statistical offices for conducting joint health surveys of their national populations;
- To use knowledge gained in conducting the JCUSH to modify or fine-tune questionnaires from the two countries' ongoing national health surveys so as to enhance comparability between those surveys; and
- To produce a data set with highly comparable data on the Canadian and American populations for use by researchers studying the effect of variations in health systems, health care, health status, and functional status, and for use in survey methodological studies.
Reference period: Varies according to the question (for example: "over the last 12 months", "over the last 6 months", "during the last week", etc.)
Data sources and methodology
The target population of the JCUSH is Canadian and American household residents aged 18 years or older. The institutionalized population is excluded, as are people in prison and full-time members of the Canadian or American Armed Forces. In Canada, the three northern territories (Yukon, Northwest Territories and Nunavut) were excluded. Similarly, in the United States, the United States territories (Puerto Rico, the United States Virgin Islands, American Samoa, Guam and the Commonwealth of the Northern Mariana Islands) were excluded, but residents of the District of Columbia were included.
The questionnaire was developed in collaboration with specialists from Statistics Canada, other departments and/or academic fields. The questions were designed for computer-assisted interviewing (CAI), meaning that, as the questions were developed, the associated logical flow into and out of the questions was programmed. This included specifying the type of answer required, the minimum and maximum values, on-line edits associated with the question and what to do in case of item non-response.
The JCUSH questionnaire was administered using the Computer-Assisted Telephone Interviewing (CATI) method. CATI offers a number of data quality advantages over other collection methods. First, question text, including reference periods and pronouns, is customised automatically based on factors such as the age and sex of the respondent, the date of the interview, and answers to previous questions. Second, edits to check for inconsistent answers or out-of-range responses are applied automatically, and on-screen prompts are shown when an invalid entry is recorded. Immediate feedback is given to the respondent, and the interviewer is able to correct any inconsistencies. Third, questions that are not applicable to the respondent are skipped automatically.
This is a sample survey with a cross-sectional design.
The JCUSH sample was designed to produce reliable national estimates for three age groups (18-44, 45-64 and 65 years and older), by sex. Statistics Canada and NCHS were each responsible for designing their respective samples. To provide reliable national estimates for three age groups, by sex, and to adhere to the budget allocated to the JCUSH, a sample of 3,500 respondents in Canada and 5,000 respondents in the United States was desired. These sample sizes were increased before data collection to take into account out-of-scope cases and anticipated non-response.
The JCUSH sample was stratified by province in Canada and by four geographic regions in the United States (Northeast, Midwest, West and South). In each country, the sample was allocated to the strata in proportion to their population sizes.
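The proportional allocation described above can be sketched as follows. This is only an illustration: the function name, the largest-remainder rounding rule and the regional population figures are assumptions made for the example, not values or procedures documented for the JCUSH.

```python
def allocate_proportionally(stratum_populations, total_sample):
    """Split total_sample across strata in proportion to population size.

    The largest-remainder rule (an assumption of this sketch) makes the
    integer allocations add up exactly to total_sample.
    """
    total_pop = sum(stratum_populations.values())
    exact = {s: total_sample * p / total_pop
             for s, p in stratum_populations.items()}
    alloc = {s: int(x) for s, x in exact.items()}  # round down first
    leftover = total_sample - sum(alloc.values())
    # Give the remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc


# Hypothetical population counts (millions), for illustration only.
regions = {"Northeast": 50, "Midwest": 60, "South": 100, "West": 65}
allocation = allocate_proportionally(regions, 5000)
```

With these made-up counts, each region receives a share of the 5,000 interviews equal to its share of the total population, up to rounding.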
The sample selection method allowing for the best comparability between the two countries was Random Digit Dialling (RDD). Each organization was responsible for drawing its own sample. In Canada, households were sampled from the RDD frame using the Elimination of Non-Working Banks (ENWB) method. In the United States, the RDD sample was selected using the GENESYS Sampling System (a proprietary product of Donnelley Marketing Information Systems (DMIS)).
Sampling of Respondents
With the RDD method, it is difficult to control the sample composition since the age and the sex of the respondents are unknown beforehand. Since males aged 65 years and older represent only about 7% of the population, and since only about 13% of the households contain at least one male aged 65 years or older, a purely random selection of the respondents among the adult household members would have necessitated a very large sample size to guarantee reliable estimates for this group. For the JCUSH, the age group 65 years and older is important. To avoid an overly large sample and to respect operational and budget constraints, it was decided to increase the probability of selection for persons aged 65 years and older.
To increase the selection probability in this group, the computer application was designed to randomly select the respondent from among only the household members aged 65 years and older when at least one person in the household was part of this group. For households containing only people younger than 65 years old, the respondent was randomly selected from among all the adult members. This strategy slightly increased the representation of those 65 years and older in the sample, without creating an overly large distortion compared to the observable distribution in the population. The main inconvenience of this approach is that it systematically excludes from the sample the population younger than 65 years old living with one or more people aged 65 years and older. A bias might be introduced in the sample if these people have particular characteristics. On the other hand, this approach avoids obtaining extreme weights. Such weights would be obtained for the population younger than 65 years old living with one or more people 65 years old and older, if their probability of selection was decreased and close to zero. For this reason and to ensure a sufficient representation of those 65 years and older, it was concluded that the possible bias was an acceptable compromise.
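The selection rule described above can be sketched in a few lines. The function name and the representation of a household as a list of ages are assumptions made for illustration; the actual computer application is not described in more detail here.

```python
import random


def select_respondent(household_ages, rng=random):
    """Return the index of the selected household member, or None.

    If any adult is 65 or older, the respondent is drawn only from that
    group; otherwise the draw is made among all adults (18 or older).
    """
    adults = [i for i, age in enumerate(household_ages) if age >= 18]
    if not adults:
        return None  # no eligible adult in the household
    seniors = [i for i in adults if household_ages[i] >= 65]
    pool = seniors if seniors else adults
    return rng.choice(pool)


# In a household with members aged 70, 40 and 16, only the 70-year-old
# can be selected; the 40-year-old has a selection probability of zero.
chosen = select_respondent([70, 40, 16])
```

The example makes the trade-off visible: adults under 65 who live with someone aged 65 or older are never selected, which is the source of the possible bias discussed above.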
For the JCUSH, the final numbers of respondents are as follows: 3,505 respondents in Canada and 5,183 respondents in the United States.
Data collection for this reference period: 2002-11-04 to 2003-03-31
Responding to this survey is voluntary.
Data are collected directly from survey respondents.
Data collection took place between November 4, 2002 and March 31, 2003. Additional collection took place over several weeks in April and June 2003, for the American sample only, to encourage selected persons who had previously refused to participate in the survey. In all selected households, a knowledgeable household member aged 18 years or older was asked to supply basic demographic information on all residents of the household. A household member aged 18 years or older was then randomly selected for a more in-depth interview.
Both the Canadian and American interviews were conducted by Statistics Canada permanent employees from Statistics Canada's regional offices using the same questionnaire. Interviewers are employees hired and trained specifically to carry out surveys using computer-assisted interviewing, and most are experienced interviewers. All interviewers attended a training session and received a manual for use as a reference tool. The questionnaire was administered in three languages: French and English for Canadian interviews and Spanish and English for American interviews. Interview duration was about 30 minutes.
Prior to the first contact by an interviewer, an introductory letter was mailed to each selected dwelling for which a valid mailing address was available. The letter explained the importance of the survey and assured respondents that their answers would be kept confidential.
Advance letters for both countries were nearly identical, the divergence stemming from the mention of authorizing legislation (The Canadian Statistics Act versus the United States Public Health Service Act) and agencies involved. The letters were written to meet the requirements of both agencies' institutional criteria, reflecting the effort of staff from both countries to make the letter concise and readable at an 8th grade level. Statistics Canada was responsible for mailing out advance letters to Canadians in the sample, while NCHS mailed the advance letters to the United States sample through the United States Public Health Service mailing facility in Rockville, Maryland.
Interviewers were instructed to make all reasonable attempts to obtain interviews. When the timing of the interviewer's call was inconvenient, an appointment was made to call back at a more convenient time. If no one was home, numerous call-backs were made. For individuals who at first refused to participate in the survey, a letter was sent to the respondent stressing the importance of the survey and the household's collaboration. This was followed by a second call from a senior interviewer, a project supervisor or another interviewer to try to convince respondents of the importance of participating in the survey. During the final months of data collection, collection efforts focused on non-response cases and on selected persons who had previously refused to participate in the survey.
View the Questionnaire(s) and reporting guide(s).
Most editing of the data was performed at the time of the interview by the computer-assisted interviewing (CAI) application. It was not possible for interviewers to enter out-of-range values and flow errors were controlled through programmed skip patterns. For example, CAI ensured that questions that did not apply to the respondent were not asked. In response to some types of inconsistent or unusual reporting, warning messages were invoked but no corrective action was taken at the time of the interview. Where appropriate, edits were instead developed to be performed after data collection at Head Office. Inconsistencies were usually corrected by setting one or both of the variables in question to "not stated".
This methodology does not apply.
The principle behind estimation in a probability sample such as the JCUSH is that each person in the sample represents himself/herself and a number of others not in the sample who have similar socio-demographic characteristics. For example, in a simple random sample in which each person had a 2% probability of being selected, each person in the sample represents 50 persons in the population. In the terminology used here, it can be said that each person has a weight of 50.
The weighting phase is a step that calculates, for each person, his or her associated sampling weight. This weight appears on the microdata file and must be used to derive meaningful estimates from the survey. For example, the number of individuals who smoke is calculated by selecting the records for individuals in the sample having that characteristic and summing the weights entered on those records.
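As a minimal sketch of the two ideas above, the following computes a weight as the inverse of the selection probability and then estimates a population count by summing weights. The record layout, field names and weight values are hypothetical; on the actual microdata file the final weight already includes adjustments beyond the simple inverse probability.

```python
def design_weight(selection_probability):
    """Weight of a sampled person: the inverse of the selection probability."""
    return 1.0 / selection_probability


def weighted_total(records, characteristic, weight_key="weight"):
    """Estimate a population count by summing the weights of the records
    that have the given characteristic."""
    return sum(r[weight_key] for r in records if r[characteristic])


# A 2% selection probability gives a weight of 50, as in the text.
w = design_weight(0.02)

# Hypothetical records, for illustration only.
sample = [
    {"smoker": True, "weight": 50.0},
    {"smoker": False, "weight": 62.5},
    {"smoker": True, "weight": 40.0},
    {"smoker": False, "weight": 47.5},
]
estimated_smokers = weighted_total(sample, "smoker")
```

Here the two smokers in the toy sample stand for 50 + 40 = 90 smokers in the population.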
In order for estimates produced from survey data to be representative of the target population, and not just of the sample itself, users must incorporate the survey weights into their calculations. A survey weight is given to each person included in the final sample, that is, the sample of persons who responded to the survey questions. This weight corresponds to the number of persons represented by the respondent for the entire population.
The weights for the Canadian and the U.S. samples were obtained separately, but both used the same method and the same weight adjustments.
Since it is an unavoidable fact that estimates from a sample survey are subject to sampling error, sound statistical practice calls for researchers to provide users with some indication of the magnitude of this sampling error. The basis for measuring the potential size of sampling errors is the standard deviation of the estimates derived from survey results. However, because of the large variety of estimates that can be produced from a survey, the standard deviation of an estimate is usually expressed relative to the estimate to which it pertains. This resulting measure, known as the coefficient of variation (CV) of an estimate, is obtained by dividing the standard deviation of the estimate by the estimate itself and is expressed as a percentage of the estimate.
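The definition of the coefficient of variation translates directly into code. The function name and the example figures below are illustrative assumptions, not JCUSH results.

```python
def coefficient_of_variation(estimate, standard_deviation):
    """Return the CV as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("the CV is undefined for an estimate of zero")
    return 100.0 * standard_deviation / abs(estimate)


# Hypothetical estimate of 2,000 with a standard deviation of 100.
cv = coefficient_of_variation(2000, 100)
```

In this made-up case the standard deviation is 5% of the estimate.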
The JCUSH uses a complex survey design, which means that there is no simple formula that can be used to calculate variance estimates. Therefore, an approximate method is needed. For the JCUSH, it is recommended to use the bootstrap method or the Taylor series method for variance estimation.
The bootstrap method involves the selection, from the initial sample, of simple random samples (known as replicates). In each replicate, the survey weight for each record is recalculated. These weights are then adjusted and post-stratified to population estimates in the same way as the initial weights in order to obtain the final bootstrap weights. The standard deviation is calculated based on the variation in the estimates from replicate to replicate.
The JCUSH uses 1,000 bootstrap replicates to produce 1,000 sets of bootstrap weights, which are provided with the Public Use Microdata File. To obtain a bootstrap variance estimator, the point estimate for each of the 1,000 samples must be calculated. The variance of these estimates is the bootstrap variance estimator. A program was developed and can perform all of these calculations for the user: the Bootvar program.
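The calculation can be sketched as follows for a weighted total. This is not the Bootvar program: the function names, the toy data with three replicates instead of 1,000, and the choice to centre on the mean of the replicate estimates with a divisor equal to the number of replicates are all assumptions of the sketch.

```python
import math


def weighted_total(values, weights):
    """Point estimate of a total for one set of weights."""
    return sum(v * w for v, w in zip(values, weights))


def bootstrap_variance(values, bootstrap_weight_sets):
    """Variance of the point estimates computed with each set of
    bootstrap weights (divisor B, centred on the replicate mean)."""
    estimates = [weighted_total(values, w) for w in bootstrap_weight_sets]
    mean = sum(estimates) / len(estimates)
    return sum((e - mean) ** 2 for e in estimates) / len(estimates)


# Hypothetical data: 1 = has the characteristic, 0 = does not.
values = [1, 0, 1]
# Three hypothetical sets of bootstrap weights (the JCUSH file has 1,000).
replicate_weights = [
    [50, 50, 50],
    [40, 60, 70],
    [60, 40, 30],
]
variance = bootstrap_variance(values, replicate_weights)
standard_deviation = math.sqrt(variance)
```

The same pattern applies to any estimator: recompute the estimate once per set of bootstrap weights, then measure how much those estimates vary.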
Calculation of standard deviation for estimates obtained with the JCUSH data can also be done with the Taylor series method, using statistical software such as SUDAAN. The design information (strata and clusters) is included in the Public Use Microdata File.
Actions have been taken to reduce non-sampling errors to a minimum. To reduce the number of non-response cases, the interviewers are all extensively trained by Statistics Canada, provided with detailed Interviewer Manuals, and are under the direction of interviewer supervisors. The extent of non-response varies from partial non-response (failure to answer just one or some questions) to total non-response. Partial non-response was virtually non-existent because, once an interview had started, respondents generally finished it. Where it did occur, it was usually because the respondent did not understand or misinterpreted a question, refused to answer a question, could not recall the requested information or could not provide personal or proxy information. Total non-response occurred because the interviewer was unable to contact the respondent, no member of the household was able to provide the information or the respondent refused to participate in the survey. Total non-response was handled by adjusting the weight of households that responded to the survey to compensate for those that did not respond. Refusals were followed up by senior interviewers, project supervisors or other interviewers to encourage respondents to participate in the survey. In addition, to maximize the response rate, the data collection period was extended for the U.S. portion of the sample.
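One common way to adjust the weights of responding households for total non-response is a ratio adjustment, sketched below. The exact adjustment used for the JCUSH (including any adjustment classes) is not described here, so the function, field names and data are assumptions made for illustration.

```python
def adjust_for_nonresponse(households):
    """Inflate the weights of responding households so that they also
    represent the non-responding ones.

    Each household is a dict with a 'weight' and a boolean 'responded'.
    Returns the responding households with adjusted weights.
    """
    total = sum(h["weight"] for h in households)
    responding = sum(h["weight"] for h in households if h["responded"])
    if responding == 0:
        raise ValueError("no responding households to carry the weight")
    factor = total / responding
    return [
        {**h, "weight": h["weight"] * factor}
        for h in households
        if h["responded"]
    ]


# Hypothetical households: one of four did not respond.
households = [
    {"weight": 50.0, "responded": True},
    {"weight": 50.0, "responded": True},
    {"weight": 50.0, "responded": False},
    {"weight": 50.0, "responded": True},
]
adjusted = adjust_for_nonresponse(households)
```

After the adjustment, the three responding households together carry the full weight of all four sampled households.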
In order to evaluate the quality of the JCUSH data, comparisons with other existing surveys were done. The Canadian data was compared to the Canadian Community Health Survey (CCHS), cycles 1.1, 1.2 and 2.1. The U.S. data was compared to the U.S. National Health Interview Survey (NHIS).
Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
Public Use Microdata Files (PUMFs) are produced in addition to the Master files. The PUMFs differ in a number of important respects from the survey "master" files held by Statistics Canada. These differences are the result of actions taken to protect the anonymity of individual survey respondents. First, only cross-sectional data are available on such files, because longitudinal information can lead to the identification of respondents. Also, some sensitive variables are regrouped, capped or completely deleted from the files.
Before releasing and/or publishing any estimate from these files, users should first determine the number of sampled respondents who contribute to the calculation of the estimate. If this number is less than 30, the weighted estimate should not be released regardless of the value of the coefficient of variation for this estimate. For weighted estimates based on sample sizes of 30 or more, users should determine the coefficient of variation of the rounded estimate and follow the guidelines below.
Estimates in the main body of a statistical table are rounded to the nearest hundred units using the normal rounding technique. If the first or only digit dropped is zero to four, the last digit retained is not changed. If the first or only digit dropped is five to nine, the last digit retained is raised by one. Marginal sub-totals and totals in statistical tables are derived from their corresponding unrounded components and then are rounded themselves to the nearest 100 units using normal rounding methods. Averages, proportions, rates and percentages are computed from unrounded components (for example, numerators and/or denominators) and then are rounded themselves to one decimal using normal rounding. In normal rounding to a single digit, if the final or only digit dropped is zero to four, the last digit retained is not changed. If the first or only digit dropped is five to nine, the last digit retained is increased by one. Sums and differences of aggregates (or ratios) are derived from their corresponding unrounded components and then are rounded themselves to the nearest 100 units (or the nearest one decimal) using normal rounding. Under no circumstances are unrounded estimates, published or otherwise, released. Unrounded estimates imply greater precision than actually exists.
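The normal rounding rule above (a dropped digit of five to nine always rounds up) differs from the default behaviour of some software; Python's built-in round(), for example, rounds exact ties to the nearest even digit. A minimal sketch using the decimal module follows; the function names and example values are illustrative assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_hundred(value):
    """Round an aggregate to the nearest 100 units, ties rounding up."""
    hundreds = (Decimal(str(value)) / 100).quantize(
        Decimal("1"), rounding=ROUND_HALF_UP
    )
    return int(hundreds * 100)


def round_to_one_decimal(value):
    """Round a rate, proportion or average to one decimal, ties rounding up."""
    return float(
        Decimal(str(value)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    )


# Round each component only at the end, from unrounded inputs.
total = round_to_hundred(1250)        # tie: rounds up to 1300
lower = round_to_hundred(1249)        # rounds down to 1200
rate = round_to_one_decimal(12.25)    # tie: rounds up to 12.3
```

Totals and ratios should be computed from the unrounded components first and passed through these functions only as the last step.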
Revisions and seasonal adjustment
This methodology does not apply to this survey.
- JCUSH - Data Dictionary Topical Index
- JCUSH - User Guide