Canadian COVID-19 Antibody and Health Survey (CCAHS)
Detailed information for April 2022 to August 2022
The Canadian COVID-19 Antibody and Health Survey (CCAHS) collected key information relevant to the COVID-19 pandemic to learn as much as possible about the virus, how it affects overall health, how it spreads, and whether Canadians are developing antibodies against it.
Data release - March 27, 2023
The Canadian COVID-19 Antibody and Health Survey (CCAHS) collected information in two parts. The first part was an electronic questionnaire about general health and exposure to COVID-19. The second part consisted of two self-administered, at-home sample collections. The first was a finger-prick sample, called a dried blood spot (DBS) sample, which was used to measure the presence of antibodies against SARS-CoV-2, the virus that causes COVID-19, from vaccination or prior infection. The second was a saliva sample, which was used to determine whether there was a recent or current SARS-CoV-2 infection at the time of sampling by testing for viral material using a polymerase chain reaction (PCR) test. Participants were asked to complete both sample collections as soon as possible after the questionnaire.
The data can be used to:
- estimate how many Canadians test positive for antibodies against COVID-19. Combining each participant's DBS sample with their survey responses makes it possible to determine how many Canadians have antibodies against COVID-19 due to infection, vaccination or both;
- provide a platform to explore emerging public health issues;
- assist in the development of programs and services to respond to the needs of the current pandemic;
- estimate the prevalence of infection on any given day from May to August 2022 in Canada.
Reference period: Varies according to the question (e.g., March 2020 until today, in the past 12 months, etc.).
Collection period: April 2022 - August 2022
- Diseases and health conditions
- Lifestyle and social conditions
Data sources and methodology
The target population for the survey was adults 18 years of age and older living in the 10 provinces.
The observed population excluded: persons living in the three territories; persons living on reserves and other Indigenous settlements in the provinces; members of the Canadian Forces living on a base; the institutionalized population and residents of certain remote regions.
The content for the survey was developed by Statistics Canada's Centre for Population Health Data, with input from the COVID-19 Immunity Task Force (CITF) and in consultation with Health Canada and the Public Health Agency of Canada.
The survey took place in two parts. The first part was an electronic questionnaire about general health and exposure to COVID-19. The second part was two self-administered sample collections: an at-home finger-prick dried blood spot sample and a saliva sample. The samples were sent to a lab to determine the presence of COVID-19 antibodies and active or recent COVID-19 infections.
This is a sample survey with a cross-sectional design.
The Dwelling Universe File (DUF) was used to select dwellings for persons 18 and over. Contact information for the selected dwellings was then updated where possible using the 2021 Census database.
A stratified random sample of dwellings was selected, and age-order selection was used to choose the respondent within each dwelling. The strata were based on the provinces and the census metropolitan area (CMA) regions within them.
The following sampling frames were used in order to have accurate information on dwellings:
Residential Telephone Frame (RTF)
Dwelling Universe File (DUF)
Given the heterogeneity of COVID-19 in the population, particularly by geography, sub-provincial strata were created and the sample was allocated across these strata.
In the provinces, 27 strata were created from first subdividing each province into CMA and non-CMA areas. The CMAs of St. John's, Halifax, Saint John, Montréal, Québec City, Toronto, Ottawa, Hamilton, Winnipeg, Regina, Saskatoon, Calgary, Edmonton and Vancouver form their own strata. From Ontario, Québec and British Columbia there were three additional strata of aggregated remaining CMA areas. Finally, there were 10 non-CMA regions, one for each province.
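The stratum count can be checked with a few lines of arithmetic. This is only a tally of the groups named in the paragraph above; the variable names are illustrative and not part of the survey documentation.

```python
# CMAs that form their own stratum, as listed in the text.
own_stratum_cmas = [
    "St. John's", "Halifax", "Saint John", "Montréal", "Québec City",
    "Toronto", "Ottawa", "Hamilton", "Winnipeg", "Regina", "Saskatoon",
    "Calgary", "Edmonton", "Vancouver",
]

# Provinces with one additional stratum of aggregated remaining CMA areas.
aggregated_cma_provinces = ["Ontario", "Quebec", "British Columbia"]

# One non-CMA stratum per province.
non_cma_strata = 10

total_strata = len(own_stratum_cmas) + len(aggregated_cma_provinces) + non_cma_strata
print(total_strata)
```

The three groups (14 single-CMA strata, 3 aggregated-CMA strata and 10 non-CMA strata) add up to the 27 strata stated in the text.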
Typically, the population size of a stratum contributes to the sample size determination, where larger strata get more sample. This is then balanced by the need to ensure all strata receive sufficient sample to produce estimates. Increasing the sample in larger populations and increasing the sample in populations with more heterogeneity leads to more precise results at the national level. In this context, this means increasing the sample in large CMAs and strata with more COVID-19 confirmed cases leads to increased precision in the national estimates. Statistical sample allocation formulae were adapted to fit this specific situation, where the specific population size and proportion of confirmed COVID-19 cases for all strata were used in the allocation. Strata sample sizes were determined by a formula that favors larger population sizes and higher proportions of COVID-19 confirmed cases. The formula was then balanced to ensure sufficient sample was allocated to smaller strata with fewer cases. The results provide a sample allocation that will facilitate analysis for the hardest hit and larger strata with the added benefit of yielding more precise results nationally. Weighting that incorporates the sampling design ensures that the final weighted sample is representative of the population.
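The exact allocation formula is not given here. As a rough illustration only, the sketch below assumes a Neyman-style rule in which each stratum's share grows with its population size and with the variability implied by its proportion of confirmed cases, after a guaranteed minimum is set aside for every stratum. The function name, the scoring rule, the minimum and the example figures are all hypothetical.

```python
import math


def allocate_sample(strata, total_sample, min_per_stratum):
    """Allocate a total sample across strata (illustrative assumption only).

    strata: dict mapping stratum name -> (population_size, case_proportion).
    Each stratum first receives min_per_stratum; the remainder is shared in
    proportion to population_size * sqrt(p * (1 - p)).
    """
    remaining = total_sample - min_per_stratum * len(strata)
    if remaining < 0:
        raise ValueError("total_sample is too small for the requested minimum")

    # Larger populations and higher case proportions (below 50%) score higher.
    scores = {
        name: pop * math.sqrt(p * (1.0 - p))
        for name, (pop, p) in strata.items()
    }
    total_score = sum(scores.values())

    return {
        name: min_per_stratum + remaining * score / total_score
        for name, score in scores.items()
    }


# Hypothetical strata: (population, proportion of confirmed cases).
example_strata = {
    "large_cma": (3_000_000, 0.10),
    "small_non_cma": (300_000, 0.04),
}
example_allocation = allocate_sample(example_strata, total_sample=1000, min_per_stratum=100)
```

In this sketch the minimum plays the role of the balancing step described above: small strata with few cases still receive enough sample to support estimates, while the rest of the sample follows population size and case proportion.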
Sampling and sub-sampling
The age groups defined in the proposal are quite broad (18 to 39, 40 to 59, and 60 and over), but analysis is not limited to these broad groups.
Within each household, one individual aged 18+ was selected based on specific instructions within the letter they received (or provided by the interviewer if they responded by phone). The instructions used the age of household members to determine who was selected, and varied from one household to another. For some households, the oldest member was selected; for others, the second oldest, or the youngest, etc. These letters were randomly assigned to the selected dwellings, ensuring that the individual selected from within the dwelling was random. This method randomly selects individuals of all ages (18+) and, given the proposed sample sizes, analysis can be conducted for much finer age groups for aggregated geographies. Weighting of the sample was also performed for these finer age groups to ensure representativeness.
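The age-order selection described above can be sketched as follows. The text does not say how a rule such as "second oldest" is applied when a household has fewer adults than the rule requires, so the wrap-around used here is an assumption, as are the function names and the number of distinct rules.

```python
import random


def assign_rules(dwelling_ids, max_rank, rng):
    """Randomly assign an age-order rule to each dwelling.

    A rule is an integer rank: 0 = oldest adult, 1 = second oldest, and so on.
    """
    return {dwelling: rng.randrange(max_rank) for dwelling in dwelling_ids}


def select_by_age_order(adult_ages, rank):
    """Return the age of the adult picked by the rule.

    Assumption: if the household has fewer adults than the rank requires,
    the count wraps around to the oldest adult again.
    """
    ordered = sorted(adult_ages, reverse=True)
    return ordered[rank % len(ordered)]


rng = random.Random(2022)
rules = assign_rules(["dwelling_a", "dwelling_b"], max_rank=4, rng=rng)
selected_age = select_by_age_order([34, 71, 52], rules["dwelling_a"])
```

Because the rule is fixed by the letter before anyone in the household responds, the household cannot influence who is chosen, which is the point of the design.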
This comprehensive sample will provide nationally representative estimates as well as facilitating more granular estimation.
For those aged 18 and over, dwellings with a mailing address were randomly selected, and one person from within each dwelling was selected at random to participate. Strict instructions were given to ensure that the selected individual was not replaced by a different person in the household.
A sample of 105,998 people was selected for the survey, split among three approximately equal and overlapping waves of collection. Respondents in wave 1 received a dried blood spot (DBS) collection kit, while respondents in waves 2 and 3 (approximately 70,000 respondents) received both a DBS and a saliva collection kit.
A response rate of 50% for the questionnaire, 30% for the DBS antibody test and 30% for the PCR saliva test was assumed. It was hypothesized that the prevalence of Canadians aged 18 years and older with antibodies against the SARS-CoV-2 virus during collection was approximately 90%. This represents persons that have previously had an infection from or have been vaccinated against SARS-CoV-2, and have antibodies against the virus. It was hypothesized that the prevalence of Canadians aged 18 years and older with an active SARS-CoV-2 infection during waves 2 and 3 of collection was less than 5%.
Data collection for this reference period: 2022-04-01 to 2022-08-31
Responding to this survey is voluntary.
Data are collected directly from survey respondents.
1- Collection methods
A) Electronic questionnaire
The only contact with respondents was a letter sent through the mail with the DBS and saliva collection kits. The letter informed people living at the sampled address that a randomly selected person had been chosen to participate in the survey. The letter included a code that gave access to the online questionnaire. The electronic questionnaire took an average of 20 minutes to complete. Respondents were asked a series of questions covering a wide range of COVID-19 related topics, as well as questions on chronic conditions, medication, health behaviour and interactions with the health care system.
i. Dried blood spots (DBS) sample
Respondents were asked to provide a small blood sample (via finger prick) to be tested for COVID-19 antibodies. Respondents pricked their finger and placed up to 5 blood spots on a test strip.
ii. A saliva sample
The respondent was asked to provide a saliva sample which was collected as soon as possible after completing the electronic questionnaire and the DBS.
Respondents then returned the DBS and saliva samples using the enclosed prepaid package for further analysis to determine if the respondent had antibodies to SARS-CoV-2 in their blood and whether they had a recent or active SARS-CoV-2 infection.
All materials related to the survey (initial letter, questionnaire, DBS and saliva sample collection instructions, etc.) were available in both official languages.
2- Follow-up methods
A Statistics Canada interviewer contacted invited participants by phone, email or text to follow up if the respondent's completed questionnaire had not been received. A tracking system was used to flag the DBS and saliva samples that were not sent.
3- Languages offered
The questionnaire was developed in both official languages.
4- Average time to complete the survey
The electronic questionnaire took an average of 20 minutes to complete and each sample collection took approximately 10 minutes.
The CCAHS covered the population aged 18 and older living in the 10 provinces. Excluded from the survey's coverage were: persons living in the three territories; persons living on reserves and other Indigenous settlements in the provinces; members of the Canadian Forces living on a base; the institutionalized population and residents of certain remote regions. Together, these exclusions represent about 3% of the target population aged 18 and over.
View the Questionnaire(s) and reporting guide(s).
Electronic files containing the daily transmissions of completed respondent survey records were combined to create the "raw" survey file. Before further processing, verification was performed to identify and eliminate potential duplicate records and to drop non-response and out-of-scope records.
In addition, some out-of-scope respondent records were found during the data clean-up stage. All respondent records that were determined to be out-of-scope and those records that contained no data were removed from the data file.
After the verification stage, editing was performed to identify errors and modify affected data at the individual variable level. The first editing step was to identify errors and determine which items from the survey output needed to be kept on the survey master file.
Subsequent to this, invalid characters were deleted and the remaining data items were formatted appropriately.
Imputation was carried out for only a few variables. There are two household size variables: total household size and the number of household members aged 18 and over. Respondents were asked only about the number of household members aged 18 and over; where a respondent did not answer, donor imputation was used. Total household size and income data were obtained by linkage to the 2021 Census of Population database where possible and donor imputed otherwise.
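The text does not describe how donors were chosen. As a minimal sketch of the general idea, the example below assumes a random hot-deck: a missing value is filled with the value of a randomly chosen responding record (the donor) from the same imputation class. The class variable, field names and records are hypothetical.

```python
import random


def donor_impute(records, target, class_key, rng):
    """Fill missing values of `target` from a random donor in the same class.

    records: list of dicts; a missing value is represented by None.
    Records with no available donor in their class are left unchanged.
    """
    # Pool the observed values by imputation class.
    donors = {}
    for record in records:
        if record[target] is not None:
            donors.setdefault(record[class_key], []).append(record[target])

    imputed = []
    for record in records:
        filled = dict(record)  # copy so the input is not modified
        pool = donors.get(filled[class_key])
        if filled[target] is None and pool:
            filled[target] = rng.choice(pool)
        imputed.append(filled)
    return imputed


example_records = [
    {"province": "ON", "adults_in_household": 2},
    {"province": "ON", "adults_in_household": 2},
    {"province": "ON", "adults_in_household": None},
    {"province": "QC", "adults_in_household": None},
]
example_imputed = donor_impute(
    example_records, "adults_in_household", "province", random.Random(0)
)
```

In the example, the third record receives a value from an Ontario donor, while the Quebec record stays missing because no donor exists in its class.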
The estimation of population characteristics from a sample survey is based on the premise that each person in the sample represents a certain number of other persons in addition to themselves. This number is referred to as the survey weight. The process of computing survey weights for each survey respondent involves several steps.
1) Each selected dwelling (in the household sample) is given an initial weight equal to the inverse of its selection probability from the sampling frame (DUF). Dwellings identified as out-of-scope during collection are dropped from the sample.
2) The weights for responding households are adjusted to represent the households that did not respond. Adjustment factors are calculated separately by province, using a nonresponse model based on frame information.
3) The household weights are calibrated so that the sum of the weights matches province-level household size demographic counts.
4) Person weights are computed by multiplying the household level weights by the inverse of the probability of selecting the person within the household.
5) Each selected person in the targeted respondent sample is given an initial weight equal to the inverse of the probability of selecting that person within the household. Note that the probability of a person being selected changes depending on the number of persons in the household.
6) Using the control totals for known COVID-19 vaccination cases, and persons reporting having been vaccinated against COVID-19, a post-stratification is applied to the person weights.
7) After responding to the electronic questionnaire (EQ), a number of respondents opted out of completing a DBS or saliva sample collection. The weight of non-respondent persons is redistributed to respondents within homogeneous response groups (HRGs). Logistic regression models are created using some of the questionnaire responses, and adjustment factors are calculated within each HRG.
8) The person weights for EQ, DBS and PCR are calibrated so that the sum of the weights matches demographic population counts by region, age group and sex. The weights are also calibrated to demographic counts for large census metropolitan areas (CMAs) and groupings of small CMAs.
9) Note that following a series of adjustments applied to the weights, it is possible that some units will have weights that stand out from the other weights to the point of being aberrant. Some respondents may actually represent an abnormally high proportion in their group and therefore strongly influence both the estimates and the variance. To avoid this situation, a respondent weight that contributes aberrantly to the provincial total is adjusted downward using a method known as "winsorization." In this process, respondent weights that are considered to be outliers are replaced by the highest non-outlier weight for that province. All of the weights are then adjusted to redistribute the surplus weight (the part of the weight that is higher than the highest non-outlier weight). This is done by multiplying the non-outlier weights by an adjustment factor to create the winsorized adjusted weights.
10) A second calibration (an exact repetition of the first calibration) is done on the winsorized weights to produce the final weight.
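Three of the steps above can be sketched in code: the initial dwelling weight (step 1), the person weight (step 4) and winsorization (step 9). This is an illustration of the arithmetic only, not the G-Est implementation. How outliers are identified is not stated, so the cap (the highest non-outlier weight) is passed in as a hypothetical parameter, and the example weights are made up.

```python
def design_weight(selection_probability):
    """Step 1: initial weight is the inverse of the selection probability."""
    return 1.0 / selection_probability


def person_weight(household_weight, eligible_adults):
    """Step 4: one adult is selected among `eligible_adults`, so the
    within-household selection probability is 1 / eligible_adults."""
    return household_weight * eligible_adults


def winsorize(weights, cap):
    """Step 9: replace outlier weights with the cap, then scale up the
    non-outlier weights so that the total weight is preserved.

    cap: highest non-outlier weight (how it is determined is an assumption).
    """
    total = sum(weights)
    non_outlier_total = sum(w for w in weights if w <= cap)
    outlier_count = sum(1 for w in weights if w > cap)

    # Surplus is the weight removed from the outliers by capping them.
    surplus = total - (non_outlier_total + outlier_count * cap)
    factor = 1.0 + surplus / non_outlier_total

    return [cap if w > cap else w * factor for w in weights]


example_weights = [100.0, 120.0, 110.0, 105.0, 115.0, 200.0]
example_winsorized = winsorize(example_weights, cap=120.0)
```

In the example, the weight of 200 is reduced to 120 and the surplus of 80 is spread over the other five weights, so the weights still sum to 750 before the second calibration in step 10.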
Sampling variance estimation is based on a resampling method called the bootstrap.
The Generalized Estimation System (G-Est) was used to generate the survey weights and bootstrap weights.
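For readers working with the bootstrap weights, the sketch below shows one common way such weights are used: the estimate is recomputed with each set of replicate weights, and the variance is taken as the average squared deviation of the replicate estimates from the full-sample estimate. The exact formula to apply (for example, any adjustment factor) is an assumption here and should be checked against the survey's user guide; function names and data are illustrative.

```python
def weighted_proportion(indicator, weights):
    """Weighted share of records with indicator equal to 1."""
    return sum(i * w for i, w in zip(indicator, weights)) / sum(weights)


def bootstrap_variance(indicator, final_weights, replicate_weights):
    """Average squared deviation of replicate estimates from the
    full-sample estimate (a common convention, assumed here)."""
    full_estimate = weighted_proportion(indicator, final_weights)
    replicate_estimates = [
        weighted_proportion(indicator, weights) for weights in replicate_weights
    ]
    squared_deviations = [(est - full_estimate) ** 2 for est in replicate_estimates]
    return sum(squared_deviations) / len(replicate_estimates)


# Made-up data: four respondents, two bootstrap replicates.
example_indicator = [1, 0, 1, 0]
example_final = [1.0, 1.0, 1.0, 1.0]
example_replicates = [
    [2.0, 0.0, 1.0, 1.0],
    [0.0, 2.0, 1.0, 1.0],
]
example_variance = bootstrap_variance(example_indicator, example_final, example_replicates)
```

A real analysis would use the full set of bootstrap weights supplied with the data file rather than two replicates.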
Estimations of Seroprevalence Based on Dried Blood Spot Testing
High rates of vaccination in the Canadian adult population have resulted in a lower nucleocapsid response to infection, increasing uncertainty around declaring a sample positive for infection or not. Different methods can be used to estimate the proportion of a population with a particular antibody. For example, a threshold is often used to identify how many individuals have a sufficiently high quantity of antibodies for test results to be deemed positive. In that approach, case studies can be used to determine the risk of false positives (e.g., an antibody measurement quantity was above the threshold but the person never had an infection) and false negatives (e.g., an antibody measurement quantity was below the threshold but the person did have an infection).
In the context of this study, a probability-based model was developed in partnership with the University of Ottawa and Sinai Health to account for the uncertainty associated with determining nucleocapsid positivity. The model was derived from a case study of positive and negative samples, confirmed by repeated PCR and rapid antigen tests (RAT), conducted by researchers at Sinai Health. The cohort in this study was followed after the roll-out of vaccines and through the initial onset of the omicron variant. The model assumes that the relationship between the past infected status and the nucleocapsid levels in this cohort is representative of the general population covered by the CCAHS-2 between April and August 2022. The probability model was applied to the nucleocapsid data to obtain a probability of testing positive for nucleocapsid.
The spike and receptor binding domain of spike (RBD) antibodies were considered positive if their value was above a pre-determined threshold. The sensitivity and specificity of the assays for spike and RBD were not accounted for in the seroprevalence estimations. Further information is available in the study's user guide.
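The difference between the threshold approach and the probability-based approach can be illustrated with a small sketch. It assumes that a prevalence estimate is the weighted share of positives under the threshold rule, and the weighted mean of the individual probabilities under the probability model. The text does not state how the probabilities are aggregated, so that part is an assumption, and all values below are made up.

```python
def threshold_prevalence(values, weights, threshold):
    """Weighted share of respondents whose antibody value is above the threshold."""
    positive_weight = sum(w for v, w in zip(values, weights) if v > threshold)
    return positive_weight / sum(weights)


def probability_prevalence(probabilities, weights):
    """Weighted mean of individual probabilities of being positive
    (assumed aggregation; not confirmed by the source)."""
    weighted_sum = sum(p * w for p, w in zip(probabilities, weights))
    return weighted_sum / sum(weights)


# Made-up antibody values, model probabilities and survey weights.
example_values = [0.2, 1.5, 3.0]
example_probabilities = [0.10, 0.60, 0.95]
example_weights = [100.0, 200.0, 100.0]

spike_style_estimate = threshold_prevalence(example_values, example_weights, threshold=1.0)
nucleocapsid_style_estimate = probability_prevalence(example_probabilities, example_weights)
```

Under the threshold rule, a borderline respondent counts fully as positive or negative. Under the probability approach, the same respondent contributes a fraction of their weight, which is how uncertainty near the cut-off is carried into the estimate.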
While quality assurance mechanisms are applied at all stages of the statistical process, the validation and detailed review of data by statisticians is the final verification of quality prior to release. Many validation measures were implemented, including:
a. Verification of estimates through cross-tabulations
b. Consultation with stakeholders internal to Statistics Canada
c. Consultation with external stakeholders
Survey weights were also adjusted to minimise any potential bias that could arise from survey non-response; non-response adjustments and calibration using available auxiliary information were applied and are reflected in the survey weights provided with the data file.
Extensive validations of survey estimates were also performed and examined from a bias analysis perspective. Despite these rigorous adjustments and validations, the high non-response increases the risk of a remaining bias and the magnitude with which such a bias could impact estimates produced using the survey data. Therefore, users are advised to use the CCAHS data with caution, especially when creating estimates for small sub-populations or when comparing to other publicly available sources of data.
Statistics Canada is prohibited by law from releasing any information it collects that could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
Estimates with fewer than 5 positive counts in the numerator are suppressed for confidentiality reasons.
Estimates for which the effective sample size is below 30 are also suppressed.
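The two suppression rules above can be written as a simple check. The text does not say how the effective sample size is computed, so the Kish approximation used in the helper below is an assumption, and the function names are illustrative.

```python
def kish_effective_sample_size(weights):
    """Kish approximation: (sum of weights)^2 / sum of squared weights.
    Assumed here; the survey may define effective sample size differently."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


def is_suppressed(positive_count, effective_sample_size):
    """Apply the two release rules stated in the text."""
    too_few_positives = positive_count < 5
    too_small_sample = effective_sample_size < 30
    return too_few_positives or too_small_sample


# Equal weights give an effective sample size equal to the number of records.
example_effective_n = kish_effective_sample_size([250.0] * 40)
example_decision = is_suppressed(positive_count=4, effective_sample_size=example_effective_n)
```

In the example, the effective sample size of 40 passes the second rule, but the estimate is still suppressed because only 4 positive cases contribute to the numerator.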
Revisions and seasonal adjustment
This methodology does not apply.
The survey aims to produce unbiased national and provincial estimates of good quality. Age group and sex breakdowns are also possible, but sample size and quality indicators (confidence intervals) must be carefully considered.
In all, 105,998 persons were selected to participate in the Canadian COVID-19 Antibody and Health Survey cycle 2 (CCAHS-2) by two-stage sampling (household, then person).
The response rate to the electronic questionnaire was 30.7%. Of those who completed a questionnaire, 53.9% provided a completed DBS sample and consented to testing and 54.5% provided a saliva sample.
The CCAHS covers the population aged 18 and over living in the 10 provinces. Excluded from the survey's coverage are: persons living in the three territories; persons under the age of 18; persons living on reserves and other Indigenous settlements in the provinces; members of the Canadian Forces living on a base; persons living in institutions and residents of certain remote regions.
Much time and effort was devoted to reducing non-sampling errors in the survey. Quality assurance measures were applied at each stage of the data collection and processing cycle to control the quality of the data.