The two primary objectives of the General Social Survey (GSS) are: to gather data on social trends in order to monitor changes in the living conditions and well-being of Canadians over time; and to provide information on specific social policy issues of current or emerging interest.

This survey collects information on the nature and extent of criminal victimization in Canada.

Data release - July 7, 2005


Statistical activity

This record is part of the General Social Survey (GSS) program. The GSS, originating in 1985, conducts telephone surveys. Each survey contains a core topic, focus or exploratory questions and a standard set of socio-demographic questions used for classification. More recent cycles have also included some qualitative questions, which explore opinions and perceptions.

Until 1998, the target sample of respondents was approximately 10,000 persons. This was increased in 1999 to 25,000. With a sample of respondents of 25,000, results are available at both the national and provincial levels and possibly for some special population groups such as disabled persons and seniors.

Subjects


  • Crime and justice
  • Society and community
  • Victims and victimization

Data sources and methodology

Target population

The target population is non-institutionalized persons 15 years of age or older, living in the ten provinces.

The samples for most GSS cycles are selected using random digit dialing telephone methods and the interviews are conducted by telephone. Thus persons in households without telephones cannot be interviewed. However, persons living in such households represent less than 2% of the target population. Interviews are not conducted by cellular telephone so persons with only cellular telephone service are also excluded; again, this group makes up a very small proportion of the population, less than 3%.

Instrument design

The questionnaire was designed based on qualitative testing (focus groups), a pilot test and interviewer debriefing.

Sampling


This is a sample survey with a cross-sectional design.

Data for Cycle 18 of the GSS were collected in seven waves starting in January 2004 and ending in December 2004. The sample was evenly distributed over the seven waves to counterbalance, as much as possible, seasonal variation in the information gathered. Households were selected for the survey by Random Digit Dialing (RDD). The telephone numbers in the sample were selected using the Elimination of Non-Working Banks technique. This technique attempts to identify all working banks for an area, i.e., all sets of 100 telephone numbers sharing the same first eight digits that contain at least one number belonging to a household. All telephone numbers within non-working banks are then eliminated from the sampling frame.
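
The bank-elimination idea can be illustrated with a minimal sketch. It assumes that working banks are identified from a list of known household numbers and that numbers are then drawn with equal probability from the remaining frame. The function names and the sample telephone numbers are hypothetical and are not taken from the survey's production system.

```python
import random

def working_banks(known_household_numbers):
    """Return the 8-digit prefixes (banks) that contain at least one household number."""
    return {number[:8] for number in known_household_numbers}

def draw_sample(banks, n, seed=None):
    """Draw n distinct 10-digit numbers at random from the working banks only."""
    rng = random.Random(seed)
    # Each working bank contributes all 100 of its numbers to the frame;
    # numbers in non-working banks never enter the frame.
    frame = [bank + f"{suffix:02d}" for bank in sorted(banks) for suffix in range(100)]
    return rng.sample(frame, n)

# Hypothetical household numbers, for illustration only.
listed = ["9025550101", "9025550147", "9025551230"]
banks = working_banks(listed)
sample = draw_sample(banks, 5, seed=1)
```

Here three listed numbers fall into two banks, so the frame holds 200 numbers, including unlisted ones within those banks.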

In order to carry out sampling, each of the ten provinces was divided into strata, i.e. geographic areas.

Many of the Census Metropolitan Areas (CMAs) were each considered separate strata. This was the case for St. John's, Halifax, Saint John, Montreal, Quebec City, Toronto, Ottawa, Hamilton, Winnipeg, Regina, Saskatoon, Calgary, Edmonton, Vancouver and Victoria. CMAs not on this list are located in Quebec and Ontario. Two more strata were formed by grouping the remaining CMAs in each of these two provinces. Finally, the non-CMA areas of each of the ten provinces were also grouped to form ten more strata. This resulted in 27 strata in all.
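
As a quick check of the stratum count, the sketch below rebuilds the list of strata from the description above. The stratum labels are illustrative; only the grouping logic comes from the text.

```python
# The 15 CMAs that each form their own stratum.
cma_strata = [
    "St. John's", "Halifax", "Saint John", "Montreal", "Quebec City",
    "Toronto", "Ottawa", "Hamilton", "Winnipeg", "Regina",
    "Saskatoon", "Calgary", "Edmonton", "Vancouver", "Victoria",
]

# Remaining CMAs are grouped into one stratum per province (Quebec and Ontario).
grouped_cma_strata = ["Other Quebec CMAs", "Other Ontario CMAs"]

# One non-CMA stratum for each of the ten provinces.
provinces = [
    "Newfoundland and Labrador", "Prince Edward Island", "Nova Scotia",
    "New Brunswick", "Quebec", "Ontario", "Manitoba", "Saskatchewan",
    "Alberta", "British Columbia",
]
non_cma_strata = [f"{province} (non-CMA)" for province in provinces]

strata = cma_strata + grouped_cma_strata + non_cma_strata
stratum_count = len(strata)  # 15 + 2 + 10
```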

Data sources

Data collection for this reference period: January 2004 to December 2004

Responding to this survey is voluntary.

Data are collected directly from survey respondents.

Computer assisted telephone interviewing (CATI) was used to collect data for the GSS. Households were selected through Random Digit Dialing methods. When a private household was reached, interviewers enumerated all household members, collecting basic demographic information including age, sex and marital status. An algorithm was then used to randomly select one household member (age 15 and older) to participate in the survey. Respondents were interviewed in the official language of their choice. Interviews by proxy were not allowed. The overall response rate during collection for Cycle 18 was 75%.
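
The respondent-selection step might look like the following sketch. The source does not describe the selection algorithm, so equal-probability selection among eligible members is an assumption here, and the roster layout and function name are hypothetical.

```python
import random

def select_respondent(roster, rng=random):
    """Pick one household member aged 15 or older, each with equal probability.

    Returns None when the household has no eligible member.
    """
    eligible = [member for member in roster if member["age"] >= 15]
    if not eligible:
        return None
    return rng.choice(eligible)

# Hypothetical household roster collected by the interviewer.
roster = [
    {"name": "A", "age": 42},
    {"name": "B", "age": 39},
    {"name": "C", "age": 12},
]
chosen = select_respondent(roster, random.Random(0))
```

Under this assumption, a person in a household with more eligible members has a lower chance of selection, which is one reason person-level weights differ from household-level weights.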

All interviewing took place using centralized telephone facilities in three Statistics Canada regional offices, with calls being made from approximately 9:00 a.m. until 9:00 p.m., Monday to Saturday inclusive. The three regional offices were: Halifax, Montreal, and Winnipeg. Statistics Canada staff trained interviewers in survey concepts and procedures as well as telephone interviewing techniques using CATI. The majority of interviewers had previous experience interviewing for the GSS.

View the Questionnaire(s) and reporting guide(s).

Error detection

Error detection is done through edits programmed into the CATI system.

The CATI data capture program allows a valid range of codes for each question and built-in edits, and automatically follows the flow of the questionnaire.

All survey records are subjected to computer edits throughout the course of the interview. The CATI system principally edits the flow of the questionnaire and identifies out of range values. As a result, such problems can be immediately resolved with the respondent. If the interviewer is unable to correctly resolve the detected errors, it is possible for the interviewer to bypass the edit and forward the data to head office for resolution. All interviewer comments are reviewed and taken into account in head office editing.
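
A range edit with an interviewer bypass could be sketched as follows. This is an illustration of the logic described above, not the CATI system itself; the function name and the action labels are hypothetical.

```python
def range_edit(value, valid_codes, bypass=False):
    """Return the action taken for one keyed answer.

    valid_codes is the set of codes allowed for the question.
    bypass=True models the interviewer overriding an unresolved edit.
    """
    if value in valid_codes:
        return "accept"
    if bypass:
        # The value is kept as keyed and left for head office to resolve.
        return "forward_to_head_office"
    # The edit fires during the interview, so the answer can be checked at once.
    return "resolve_with_respondent"

# Hypothetical question with valid codes 1 to 5.
valid = set(range(1, 6))
in_range = range_edit(3, valid)
out_of_range = range_edit(6, valid)
bypassed = range_edit(6, valid, bypass=True)
```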

Head office edits perform the same checks as the CATI system as well as more detailed edits.

Imputation


The flow editing carried out by head office followed a 'top down' strategy, in that whether or not a given question was considered "on path" was based on the response codes to the previous questions. If the response codes to the previous questions indicated that the current question was "on path", the responses, if any, to the current question were retained, though "don't know" was recoded as 9 (99 or 999, etc.) and refusals were recoded as "Not Stated", i.e. 8 (98 or 998, etc.). If the response codes to the previous questions indicated that the current question was "off path" because the respondent was clearly identified as belonging to a sub-population for which the current question was inappropriate or not of interest, the current question was coded as "Not Applicable", i.e. 7 (97 or 997, etc.).
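
The recoding rules above can be expressed as a small sketch. The reserved codes (7, 8 and 9, padded with leading 9s to the width of the field) follow the text; the function names and the string labels used for raw "don't know" and refusal responses are hypothetical.

```python
def reserved_code(kind, width=1):
    """Reserved code for a field of the given width, e.g. 7, 97 or 997."""
    last_digit = {"not_applicable": 7, "not_stated": 8, "dont_know": 9}[kind]
    return int("9" * (width - 1) + str(last_digit))

def flow_edit(on_path, response, width=1):
    """Apply the top-down flow edit to one question."""
    if not on_path:
        # Earlier answers place the respondent outside this question's sub-population.
        return reserved_code("not_applicable", width)
    if response == "dont_know":
        return reserved_code("dont_know", width)
    if response == "refused":
        return reserved_code("not_stated", width)
    # On-path substantive responses are retained unchanged.
    return response

kept = flow_edit(True, 2)
refusal = flow_edit(True, "refused", width=3)
off_path = flow_edit(False, 2, width=2)
```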

Due to the nature of the survey, imputation was not appropriate for most items so missing data were coded as 'not stated'.

However, non-response was not permitted for those items required for weighting. Values were imputed in the rare cases where either of the following was missing: sex or number of residential telephone lines.

Estimation


There are two microdata files from which GSS Cycle 18 estimates can be made.

The Main File contains questionnaire responses and associated information from 23,766 respondents. Characteristics on this file concern the person as opposed to information about any individual victimization incidents which he or she may have experienced.

Three weighting factors were placed on the Main File. They are listed and explained below:

WGHT_PER: This is the basic weighting factor for analysis at the person level, i.e. to calculate estimates of the number of persons (non-institutionalized and aged 15 or over) having one or several given characteristics. WGHT_PER should be used for all person-level estimates.

WGHT_HSD: This weighting factor can be used to estimate the number of households with a given characteristic. For example, to estimate the number of households that live in low-rise apartments, WGHT_HSD should be summed over all records with this characteristic.

WGHT_ABU: This weighting factor is required to estimate the number of victimization incidents that occurred over the past 12 months within certain violent relationships, namely those with spousal or ex-spousal violence. It should therefore only be used for estimates involving variables PR_101_2004 and PR_304_2004.
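
As a minimal sketch of how these weights are used, the example below sums a weight over the records that have a given characteristic. WGHT_PER and WGHT_HSD are the variables named above; the `dwelling` field, its values and all weight values are made up for illustration.

```python
def weighted_total(records, weight, has_characteristic):
    """Sum the chosen weight over all records with the characteristic."""
    return sum(r[weight] for r in records if has_characteristic(r))

# Hypothetical Main File records (weights are invented).
main_file = [
    {"WGHT_PER": 1200.0, "WGHT_HSD": 600.0, "dwelling": "low_rise"},
    {"WGHT_PER": 800.0, "WGHT_HSD": 800.0, "dwelling": "single_detached"},
    {"WGHT_PER": 1500.0, "WGHT_HSD": 500.0, "dwelling": "low_rise"},
]

def in_low_rise(record):
    return record["dwelling"] == "low_rise"

# Estimated number of households living in low-rise apartments.
households = weighted_total(main_file, "WGHT_HSD", in_low_rise)
# Estimated number of persons living in low-rise apartments.
persons = weighted_total(main_file, "WGHT_PER", in_low_rise)
```

The same records give different totals depending on the weight, so the weight must match the unit of analysis (person or household).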

The second microdata file is the Incident File. The 9,824 records on this file contain reports of victimization incidents. Each victimization incident experienced by a respondent of the survey is included on one of the file's records, excluding incidents of stalking and spousal or ex-spousal violence. Each record of the Incident File can be thought of as representing a number of victimization incidents experienced by persons in the overall population. This number is given by the weighting factor WGHT_VIC. Usually there is a report for each victimization incident, but victimization incidents with very similar details are recorded on the same report (known as a series report). The number of incidents that the report represents is known as the series factor and is given by variable NUMINC. To estimate the total number of incidents with a given characteristic, one would multiply WGHT_VIC by the series factor and sum over all records with the characteristic.

Note that some series reports involve a large number of similar incidents. Some analysts may feel that leaving them as is will lead to a disproportionate contribution to victimization estimates from this type of incident. Indeed, the series factor was capped at 3 for estimates published in the User's Guide. If analysts wish to use the same cap for the series factor, they may use the weighting factor ADJWTVIC, which is WGHT_VIC multiplied by the series factor capped at 3.
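
The incident-level estimate and the effect of capping the series factor can be sketched as follows. WGHT_VIC and NUMINC are the variables named above; the `type` field and all values are hypothetical.

```python
def incident_total(records, has_characteristic, cap=None):
    """Sum WGHT_VIC times the series factor over matching incident records.

    With cap=3 each term equals the ADJWTVIC value described in the text.
    """
    total = 0.0
    for r in records:
        if has_characteristic(r):
            factor = r["NUMINC"] if cap is None else min(r["NUMINC"], cap)
            total += r["WGHT_VIC"] * factor
    return total

# Hypothetical Incident File records (values are invented).
incident_file = [
    {"WGHT_VIC": 1000.0, "NUMINC": 1, "type": "theft"},
    {"WGHT_VIC": 500.0, "NUMINC": 10, "type": "theft"},   # a series report
    {"WGHT_VIC": 700.0, "NUMINC": 2, "type": "assault"},
]

def is_theft(record):
    return record["type"] == "theft"

uncapped = incident_total(incident_file, is_theft)         # 1000*1 + 500*10
capped = incident_total(incident_file, is_theft, cap=3)    # 1000*1 + 500*3
```

In this made-up example a single series report accounts for most of the uncapped total, which is the situation the cap is meant to temper.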

Note also that violence by a current or ex-spouse is only captured in abuse reports on the Main File. Suppose a given estimate of a number of victimization incidents is to include violence of this type. Then the number of victimization incidents involving this type of violence must be calculated from the Main File separately and then added to the estimate from the Incident File.

More information and examples of these kinds of estimates can be found in Section 7.6 of the User's Guide.

In addition to the estimation weights, bootstrap weights have been created for the purpose of design-based variance estimation.
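
A common way to use bootstrap weights is to recompute the estimate once with each set of replicate weights and take the mean squared deviation from the full-sample estimate. The sketch below assumes that formula, which the source does not spell out; the function names and the numbers are illustrative.

```python
import math

def bootstrap_variance(estimate, replicate_estimates):
    """Mean squared deviation of the replicate estimates from the full-sample estimate."""
    squared = [(r - estimate) ** 2 for r in replicate_estimates]
    return sum(squared) / len(squared)

def coefficient_of_variation(estimate, replicate_estimates):
    """Standard error expressed as a fraction of the estimate."""
    return math.sqrt(bootstrap_variance(estimate, replicate_estimates)) / estimate

# Hypothetical estimate and four replicate estimates (real files carry many more).
estimate = 100.0
replicates = [98.0, 102.0, 101.0, 99.0]
variance = bootstrap_variance(estimate, replicates)
cv = coefficient_of_variation(estimate, replicates)
```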

Disclosure control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Variables that provide data on abuse by a current spouse / partner have been suppressed in the Cycle 18 public use microdata file. This includes all direct and derived variables based on sections 3 and 5 of the Cycle 18 questionnaire.

Revisions and seasonal adjustment

This methodology type does not apply to this statistical program.

Data accuracy

The methodology of this survey has been designed to control errors and to reduce their potential effects. However, the results of the survey remain subject to both sampling error and non-sampling error.

Sampling error:
As the data are based on a sample of persons they are subject to sampling error. That is, estimates based on a sample will vary from sample to sample, and typically they will be different from the results that would have been obtained from a complete census. The potential range of this difference has been estimated for key data and used to produce tables that can be used to estimate the sampling variability of many estimates. More precise estimates of the sampling variability of estimates can be produced with the bootstrap method using bootstrap weights that have been created for this survey. The bootstrap method was used to estimate the sampling variability for all of the estimates included in 'General Social Survey on Victimization, cycle 18: an overview of findings' and for the important comparisons made in the text. Estimates with high sampling variability are indicated in this article and all of the highlighted differences between subgroups of the population are significant at the 95% level.
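
One way to check whether a difference between two subgroups is significant at the 95% level is sketched below. The source does not state which test was used, so this assumes a normal approximation (critical value 1.96) and a standard error derived from bootstrap replicates of the difference. All names and numbers are hypothetical.

```python
import math

def is_significant(difference, replicate_differences, critical_value=1.96):
    """True if the difference exceeds critical_value standard errors in absolute size."""
    squared = [(d - difference) ** 2 for d in replicate_differences]
    standard_error = math.sqrt(sum(squared) / len(squared))
    return abs(difference) > critical_value * standard_error

# Hypothetical differences in percentage points between two subgroups.
large_gap = is_significant(5.0, [4.0, 6.0, 5.5, 4.5])
small_gap = is_significant(1.0, [-1.0, 3.0, 0.0, 2.0])
```

The first difference is large relative to its variability across replicates and is flagged as significant; the second is not.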

Non-sampling error:
Even a census of the population of interest produces estimates subject to error. Errors of this kind are called non-sampling errors, and estimates from samples are subject to them as well. Common sources are imperfect coverage, non-response, response errors, and processing errors.

Coverage of the GSS-18 targeted population by the RDD frame is estimated to be more than 96% complete; rates of telephone service are very high in Canada. These rates are high for virtually all socio-demographic groups, but are lowest among those households with the lowest incomes. As a result persons living in such households are slightly under-represented in the GSS-18 sample. In addition, while every effort was made to avoid non-response, the non-response rate for GSS-18 was 25%. Little or nothing is known about the non-responding cases, and so the results may be biased to the extent that the non-responding cases differ from those that provided responses.


