Ethnic Diversity Survey (EDS)

Detailed information for 2002




One Time



The survey has two primary objectives. First, it will help us better understand how people's backgrounds affect their participation in the social, economic and cultural life of Canada. Second, it will provide information on how Canadians of different ethnic backgrounds interpret and report their ethnicity.

Data release - September 29, 2003


Statistics Canada (STC) was approached by Canadian Heritage (PCH) to develop and conduct a survey on ethnicity, its various dimensions and related issues of changing cultural diversity in Canada.

The survey followed the 2001 Census, which provided the sampling frame. The survey is funded jointly by STC and PCH.


Topics covered in the survey include ethnic ancestry, ethnic identity, place of birth, visible minority status, religion, religious participation, knowledge of languages, family background, family interaction, social networks, civic participation, interaction with society, attitudes, satisfaction with life, trust and socio-economic activities.

The information collected in the survey will be used to inform policy and program development in the Department of Canadian Heritage. Several of the modules, such as Interaction with Society, Civic Participation, Attitudes and Socio-economic Activities, specifically relate to the policy and program needs of Canadian Heritage.

In addition, information collected will be used in future data collections in the area of ethnicity, specifically in the content development of the 2006 Census. For example, the survey asks questions about ethnic ancestry and ethnic identity, and the importance of ethnicity to the respondent. The survey explores both objective and subjective dimensions of ethnicity and asks questions about the respondent's ethno-cultural background in order to better understand how respondents choose or do not choose certain ethnic identifications.


  • Immigration and ethnocultural diversity (formerly Ethnic diversity and immigration)
  • Integration of newcomers
  • Society and community

Data sources and methodology

Target population

The target population for the main survey consists of persons aged 15 years or over living in private households in the 10 provinces. It excludes persons living in collective dwellings, persons living on Indian reserves, persons who declared an Aboriginal origin or identity in the 2001 Census, and persons living in Northern and remote areas.

Instrument design

Extensive discussions were held between Statistics Canada and the Department of Canadian Heritage in order to develop a questionnaire that reflects the survey's objectives. Where possible, standard questions from Statistics Canada's surveys, as well as from other surveys in the area of ethnicity, have been used in the questionnaire. The development of the survey's content was also guided by the discussions and recommendations of the Advisory Committee to the survey.

The content for this survey also reflects results of a series of external qualitative tests, conducted between January and October 2001. External testing of the questionnaire involved one-on-one interviews and focus groups, conducted across Canada. In total, 265 participants were involved in the qualitative testing of the questionnaire content for this survey.

Results from a pilot test, conducted in September 2001, were also used to develop and refine the content and the survey instrument. The pilot test was conducted with approximately 1,500 respondents from across Canada, with the 1998 National Census Test providing the frame. Its objective was to evaluate the questions, the format of the questionnaire, the collection procedures, the interviewing procedures (manuals, instructions, etc.), and the CATI and Blaise capture procedures in preparation for the main survey.


This is a sample survey with a cross-sectional design.

Respondents for the EDS were selected from those who answered the long questionnaire of the 2001 Census, which had been distributed to one household in five in Canada. The sample was selected on the basis of the responses given to the questions on ethnic origin, place of birth, and place of birth of parents. Responses to the ethnic origin question were divided into the two main categories of interest: CBFA+ (Canadian, British, French, American, Australian and/or New Zealander origins) and non-CBFA+ (all other responses containing at least one origin other than CBFA+). The non-CBFA+ category was further divided into European origins and non-European origins.

Finally, questions on the birthplace of respondents and their parents were used to establish the respondent's generational status. The first generation includes respondents born outside Canada. The second generation includes respondents born in Canada with at least one parent born outside Canada. The third-plus generation includes respondents born in Canada whose parents were both also born in Canada. The generation-based strata were then consolidated so that each stratum contained a sufficiently high number of persons.
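The generational rule described above can be sketched as a small function. This is an illustration only: the function name and its boolean parameters are hypothetical, not variables from the EDS or Census files.

```python
def generation_status(born_in_canada: bool,
                      mother_born_in_canada: bool,
                      father_born_in_canada: bool) -> str:
    """Classify a respondent by generation (hypothetical helper)."""
    # First generation: the respondent was born outside Canada.
    if not born_in_canada:
        return "first"
    # Third-plus generation: respondent and both parents born in Canada.
    if mother_born_in_canada and father_born_in_canada:
        return "third-plus"
    # Second generation: born in Canada, at least one parent born abroad.
    return "second"
```

For example, a respondent born in Canada whose father was born abroad is classified as second generation.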

Because of the goals of the survey and the data requirements for certain subpopulations, sample distribution was established at one-third for CBFA+ and two-thirds for non-CBFA+. This distribution ensured that a sufficient number of persons would be obtained in the categories of interest, especially where immigrants were concerned.

The final sample was 57,242 persons. Of that number, 42,476 responded to the survey, which corresponds to an overall response rate of 75.6% once the 1,057 persons classified as being outside the scope of the survey are excluded from the denominator.
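The response rate can be reproduced from the three counts given above. The sketch below assumes that out-of-scope persons are removed from the denominator, which is the reading that matches the published 75.6%; the function name is illustrative.

```python
def response_rate(sample_size: int, respondents: int, out_of_scope: int) -> float:
    """Return the response rate in percent among in-scope sampled persons."""
    in_scope = sample_size - out_of_scope  # 57,242 - 1,057 = 56,185
    return 100.0 * respondents / in_scope

rate = response_rate(57_242, 42_476, 1_057)
print(f"{rate:.1f}%")  # 75.6%
```

Leaving the out-of-scope persons in the denominator would instead give about 74.2%.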

Data sources

Data collection for this reference period: April 2002 to August 2002

Responding to this survey is voluntary.

Data are collected directly from survey respondents.

The data for the Ethnic Diversity Survey were collected using the Blaise software and the computer-assisted telephone interview (CATI) method. The average length of interviews was 35 to 40 minutes, but this varied with the respondent's situation. Proxy (or third person) responses were not permitted. In addition to the two official languages, interviews were conducted in seven non-official languages: Mandarin, Cantonese, Italian, Punjabi, Portuguese, Vietnamese and Spanish.


Error detection

The data received from the interviews were checked to ensure the validity, consistency and completeness of the questionnaires. Wherever possible, automated controls were integrated into the collection mode to minimize errors and correct them with the respondent's assistance. Subsequently, potential errors were corrected with the help of notes made by the interviewers. Control and edit rules were developed to identify and correct inconsistencies for each question and each potential path within the questionnaire. In addition to these checks, missing responses to geographic (i.e., province of residence, census metropolitan area) and demographic (age, sex, marital status and relationships between members of household) variables were imputed in a deterministic way.




Since the Ethnic Diversity Survey is a survey based on a probability sampling plan, each respondent represents a certain number of other persons of the population who do not form part of the sample. This number of persons is known as the weight. A weight is attributed to each respondent selected. The weight is then adjusted in order to take into account non-respondents as well as the differences between the sample's characteristics and those of the target population.
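The two kinds of adjustment described above can be illustrated with a simplified sketch. It assumes a single adjustment class and a single known population total; the actual EDS weighting is more elaborate, and the function names and numbers below are hypothetical.

```python
def nonresponse_adjust(weights, responded):
    """Spread the weight of non-respondents over respondents in one class."""
    total = sum(weights)
    respondent_total = sum(w for w, r in zip(weights, responded) if r)
    factor = total / respondent_total
    # Only respondents keep a weight; each is inflated by the same factor.
    return [w * factor for w, r in zip(weights, responded) if r]

def poststratify(weights, population_total):
    """Scale weights so they sum to a known population total."""
    factor = population_total / sum(weights)
    return [w * factor for w in weights]

# Hypothetical class: four sampled persons, each with a design weight of 100
# (each represents 100 persons); one of them did not respond.
design_weights = [100.0, 100.0, 100.0, 100.0]
responded = [True, True, False, True]

adjusted = nonresponse_adjust(design_weights, responded)   # about 133.3 each
final = poststratify(adjusted, population_total=450.0)     # about 150 each
```

After both steps the three respondents together represent the assumed population total of 450 persons.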

The Ethnic Diversity Survey used the bootstrap method to estimate the variance. This re-sampling method drew 500 independent samples (with replacement) from our initial sample, corrected as per the survey sampling plan. The initial weights were calculated for each of the samples and adjustments to the weights performed using the same steps as for the initial sample (non-response, post-stratification and raking ratio estimation). In this way, the component of variance generated at each step of the weighting can be taken into account. Estimators of interest were then calculated for each sample. Empirical variance was calculated on all the samples to produce an estimate of the variance of the estimators in question. The Ethnic Diversity Survey uses the coefficient of variation (c.v.) as a quality indicator measurement. The WesVar software, developed by Westat, was used as a tool to calculate coefficients of variation.
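The idea behind the bootstrap variance estimate can be sketched as follows. This is a deliberately simplified illustration, not the EDS procedure: it resamples records with simple random sampling with replacement and does not redo the non-response, post-stratification or raking adjustments for each replicate, as the survey did. The data and function names are hypothetical.

```python
import random
import statistics

def weighted_proportion(records):
    """Weighted share of records whose indicator is True; records are (flag, weight)."""
    total = sum(w for _, w in records)
    return sum(w for flag, w in records if flag) / total

def bootstrap_cv(records, replicates=500, seed=0):
    """Return (estimate, c.v. in percent) using a simplified bootstrap."""
    rng = random.Random(seed)
    n = len(records)
    estimate = weighted_proportion(records)
    # Draw each replicate sample with replacement and recompute the estimator.
    replicate_estimates = [
        weighted_proportion(rng.choices(records, k=n)) for _ in range(replicates)
    ]
    # Empirical variance across the replicate estimates.
    variance = statistics.variance(replicate_estimates)
    return estimate, 100.0 * variance ** 0.5 / estimate

# Hypothetical sample: 200 records, one in four has the characteristic.
records = [(i % 4 == 0, 100.0 + i) for i in range(200)]
estimate, cv = bootstrap_cv(records)
```

Fixing the seed makes the replicate draws reproducible, which is convenient when the same c.v. has to be regenerated later.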

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects that could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

To prevent any data disclosure, confidentiality analysis is done using the Statistics Canada Generalized Disclosure Control System (G-Confid). G-Confid is used for primary suppression (direct disclosure) as well as for secondary suppression (residual disclosure). Direct disclosure occurs when the value in a tabulation cell is composed of or dominated by a few enterprises, while residual disclosure occurs when confidential information can be derived indirectly by piecing together information from different sources or data series.

Data accuracy

The possible errors of a survey can be grouped into two main categories: sampling errors and non-sampling errors. Sampling error derives from the fact that the estimates were obtained from a sample, rather than from a census of the entire population performed under the same conditions. In the Ethnic Diversity Survey, this error was measured using the coefficient of variation (c.v.). This number, expressed as a percentage, corresponds to the standard error (the square root of the variance of the estimator) divided by the estimator itself. The smaller the c.v., the smaller the sampling variability and the more precise the estimator. The EDS applies the following quality thresholds:
(i) when the c.v. is greater than or equal to 33.4%, the estimator is considered "unacceptable" and the symbol "F" appears beside the corresponding estimator;
(ii) when the c.v. falls between 16.6% and 33.3%, the estimator is considered "poor" and the symbol "E" for caution appears beside the estimator;
(iii) when the c.v. is 16.5% or less, the estimator is considered "acceptable"; it can be used without restriction and no indication appears beside it.
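The thresholds above can be expressed as a small lookup. The sketch assumes the c.v. is rounded to one decimal place before it is compared, because the published bands leave gaps (for example between 16.5% and 16.6%); that rounding step and the function names are assumptions, not something stated by the survey.

```python
def coefficient_of_variation(estimate: float, standard_error: float) -> float:
    """Return the c.v. in percent: standard error divided by the estimate."""
    return 100.0 * standard_error / estimate

def quality_flag(cv_percent: float) -> str:
    """Map a c.v. (in percent) to the symbol shown beside the estimate."""
    cv = round(cv_percent, 1)  # assumed rounding to one decimal place
    if cv >= 33.4:
        return "F"   # unacceptable
    if cv >= 16.6:
        return "E"   # poor, use with caution
    return ""        # acceptable, no symbol shown
```

For instance, an estimate of 20 with a standard error of 5 has a c.v. of 25%, so it would carry the "E" symbol.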

All other types of errors are not due to sampling and may arise at any stage of a survey. They consist primarily of coverage, non-response, response and processing errors. In general, the effect of some of these errors (response and processing) is more difficult to identify and to quantify. The editing and verification steps taken at each phase of the survey were designed to minimize these two types of errors.

As mentioned earlier, the overall response rate to the survey was 75.6%. Response rates per stratum ranged from 72% to 80%. As might be expected, the first generation had the lowest response rate: 73%, compared with 77% for the second and third-plus generations. Partial non-response accounted for only 3.2% of responses, which means that, generally, when a person began the interview for this survey, all the questions were answered.

A coverage error occurs when there is a discrepancy between the target population and the sampled population, such as when persons are omitted from the survey frame, mistakenly included in it, or counted twice. Using the Census as a survey frame helped reduce this kind of error. The net under-coverage error of the Census is around 3%. It should also be noted that, when the sample was selected, a coding error slipped into the survey frame for some Census subdivisions. This problem mainly affects two Atlantic provinces, Nova Scotia and New Brunswick. A detailed study of the characteristics of persons not covered, using census data, shows that the sample remained representative of the target population for the majority of the survey variables of interest. Adjustments similar to those used to correct for non-response were made to the weights to reduce potential bias due to this coverage error. Although the bias created could not be completely eliminated, the quality of the data at the national level was not affected. However, all identification of provinces in the Atlantic Region was removed from the final database. The lowest geographical level available for the Atlantic Region is the indicator of Census Metropolitan Area (CMA) versus non-CMA.

