General Social Survey Historical Database
Detailed information for 1985-2006
The Historical Database gathers the data from existing cycles of the General Social Survey (GSS) together in an easily accessed form so that researchers may follow trends in Canadian society over time.
Data release - The GSS Historical Database will only be available in the Canadian Research Data Centres, but it is possible to purchase special tabulations from Statistics Canada.
While the annual releases of GSS data typically comprise analytic output that includes some comparisons of data over time, the 20th year of GSS data is an excellent opportunity to look back over the years and ask: what has been discovered about Canadian society over those 20 years?
The GSS Historical Database was developed to help researchers answer this question. Because processing has changed over the years (for example, variable names and code sets have changed), the files cannot simply be concatenated with all of their variables. Nevertheless, a database has been created that contains much of the GSS information in a form researchers can use easily. The idea behind the database was to link information from past cycles in a harmonized way so that researchers may follow trends over time.
The basic model for the GSS Historical Database is presented in Figure 1 in the "Documentation" section below. In the figure, rows represent the cycles and columns represent the collections of variables. The database consists of twenty separate, but harmonized, files (one cannot simply combine them and consider a large pooled set of data, as there is no real population to which such data would refer).
Each record of the database represents a respondent. There are approximately 10,000 records for each of GSS-1 to GSS-12 and 25,000 records for the recent cycles, giving a database with around 330,000 records in total. Every record has the same number of fields, but different fields are filled from cycle to cycle; the diagram shows this as "holes" in columns that do not apply to a specific cycle.
Harmonized variables that are common across cycles have a common name and format.
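As a rough illustration of this data model, the sketch below builds records that all carry the same set of fields, leaving fields not collected in a given cycle empty. The variable names and values are hypothetical and are not taken from the actual database; only the general layout (one file per cycle, constant field list, "holes" where a cycle did not collect a variable) follows the description above.

```python
# Hypothetical harmonized field names, for illustration only.
HARMONIZED_FIELDS = ["cycle", "age_group", "sex", "time_use_hours", "victimization_flag"]


def make_record(cycle, **values):
    """Build one respondent record with every harmonized field present.

    Fields that the cycle did not collect stay None (the "holes").
    """
    unknown = set(values) - set(HARMONIZED_FIELDS)
    if unknown:
        raise KeyError(f"not a harmonized field: {sorted(unknown)}")
    record = {field: None for field in HARMONIZED_FIELDS}
    record["cycle"] = cycle
    record.update(values)
    return record


def filled_fields(record):
    """Return the names of the fields that actually hold a value."""
    return [field for field in HARMONIZED_FIELDS if record[field] is not None]


# One file per cycle; the files are kept separate rather than concatenated.
files = {
    1: [make_record(1, age_group="25-34", sex="F")],
    2: [make_record(2, age_group="45-54", sex="M", time_use_hours=6.5)],
}
```

Because every record shares the same field list and names, the same analysis code can be run against each cycle's file in turn, skipping cycles where the field of interest is empty.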
This record is part of the General Social Survey (GSS) program. The GSS, originating in 1985, conducts telephone surveys. Each survey contains a core topic, focus or exploratory questions and a standard set of socio-demographic questions used for classification. More recent cycles have also included some qualitative questions, which explore opinions and perceptions.
Until 1998, the target sample of respondents was approximately 10,000 persons. This was increased in 1999 to 25,000. With a sample of 25,000 respondents, results are available at both the national and provincial levels and possibly for some special population groups such as disabled persons and seniors.
Reference period: The GSS Historical Database was created once based on the first twenty cycles of data (1985-2006).
- Society and community
Data sources and methodology
To be in the GSS Historical Database, a variable must be comparable (in definition, questionnaire flow, concept measurement, etc.) in at least two cycles of the General Social Survey. Variables that were found to be not comparable across years (because of changes in definitions, for example) have been excluded.
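The inclusion rule can be sketched as a simple filter. The example below is only an illustration: the variable names, the review data and the idea of tagging each cycle's version of a variable with a "definition" label are all assumptions made here, not the procedure Statistics Canada actually used.

```python
from collections import defaultdict

# Hypothetical review results: (variable, cycle, definition label).
# Two cycles share a label only if the variable was judged comparable in both.
reviews = [
    ("marital_status", 1, "A"),
    ("marital_status", 5, "A"),
    ("marital_status", 9, "A"),
    ("main_activity", 2, "A"),
    ("main_activity", 7, "B"),  # definition changed, so not comparable with cycle 2
    ("pet_ownership", 4, "A"),  # asked in only one cycle
]


def eligible_variables(reviews, min_cycles=2):
    """Keep variables that are comparable in at least `min_cycles` cycles."""
    cycles_by_definition = defaultdict(set)
    for variable, cycle, definition in reviews:
        cycles_by_definition[(variable, definition)].add(cycle)
    return sorted(
        {
            variable
            for (variable, _definition), cycles in cycles_by_definition.items()
            if len(cycles) >= min_cycles
        }
    )


kept = eligible_variables(reviews)
```

With this hypothetical input, only `marital_status` passes: `main_activity` appears in two cycles but under different definitions, and `pet_ownership` appears in a single cycle.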
This methodology does not apply.
Data are collected from other Statistics Canada surveys and/or other sources.
The GSS Historical Database gathers together some 330,000 respondent records in twenty individual data sets with harmonized variable names and formats. All data come from the General Social Survey.
The GSS Historical Database has gone through a series of quality checks to make sure that the data conform with those originally published in all of the General Social Survey cycles. Historical trend analysis was conducted and when variables varied slightly in content, the categories were collapsed to make them compatible over time. All documentation was verified with existing cycle documentation to ensure consistency.
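Collapsing categories so that a variable is compatible over time can be pictured as mapping each cycle's code set onto a shared, coarser one. The sketch below assumes two hypothetical code sets for a marital status variable; the codes and labels are invented for illustration and do not reproduce any actual GSS code set.

```python
# Hypothetical code set used in an early cycle (three categories).
EARLY_TO_HARMONIZED = {
    1: "married_or_common_law",
    2: "single",
    3: "widowed_separated_divorced",
}

# Hypothetical, more detailed code set used in a later cycle (six categories).
LATE_TO_HARMONIZED = {
    1: "married_or_common_law",       # married
    2: "married_or_common_law",       # common-law
    3: "single",
    4: "widowed_separated_divorced",  # widowed
    5: "widowed_separated_divorced",  # separated
    6: "widowed_separated_divorced",  # divorced
}


def harmonize(codes, mapping):
    """Translate cycle-specific codes to the shared categories.

    An unmapped code raises KeyError rather than being silently dropped.
    """
    return [mapping[code] for code in codes]


early = harmonize([1, 2, 3], EARLY_TO_HARMONIZED)
late = harmonize([1, 2, 5, 6], LATE_TO_HARMONIZED)
```

After the mapping, both cycles use the same three categories, at the cost of the extra detail in the later cycle (for example, married and common-law respondents can no longer be told apart).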
Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
No public use microdata file will be produced for the GSS Historical Database.
Revisions and seasonal adjustment
This methodology does not apply to this survey.
Not all variables could be harmonized in the GSS Historical Database. In some cases, variables were collapsed in order to have consistency over time; however, this meant that some detail could be lost.
Caution should be taken when pooling data because of the population changes over time (see the link "Considerations before Pooling Data from Two Different Cycles of the General Social Survey" in the "Documentation" section).
- The General Social Survey: An Overview
Last review: January 7, 2021
- Figure 1: Basic Data Model for GSS 20th Anniversary Historical Database
- Considerations before Pooling Data from Two Different Cycles of the General Social Survey
The General Social Survey Program has been gathering data for some time now in a series of independent annual household surveys. A rich collection of data now exists for social science and other researchers. One potentially useful approach is the integration of data across two or more cycles, that is, using the data from two or more cycles to estimate quantities of interest. There are two ways of doing this (which, in general, give different answers). One is to compute separate estimates by cycle and combine them. The other is to pool the data sets from different cycles and compute estimates from the pooled data. This document briefly describes the situations in which pooling is appropriate, as well as some points that a researcher interested in pooling should take into account. Additionally, some basic, practical recipes are provided to aid the researcher in integrating the data.
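The difference between the two approaches can be seen in a small numerical sketch. Everything below is an assumption made for illustration: the data are invented, the separate estimates are combined with a simple unweighted average, and the pooled estimate uses the survey weights as they are. The referenced document may recommend other combination rules or weight adjustments.

```python
def weighted_mean(values, weights):
    """Weighted mean of `values` using survey weights `weights`."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Invented data: 1 = respondent has the characteristic, 0 = does not.
cycle_a = {"values": [1, 0, 1, 1], "weights": [100, 100, 100, 100]}
cycle_b = {"values": [0, 0, 1, 0], "weights": [300, 300, 300, 300]}

# Approach 1: estimate each cycle separately, then combine the estimates.
# The simple average used here is one possible combination rule (an assumption).
separate = [weighted_mean(c["values"], c["weights"]) for c in (cycle_a, cycle_b)]
combined = sum(separate) / len(separate)

# Approach 2: pool the records from both cycles and estimate once.
pooled = weighted_mean(
    cycle_a["values"] + cycle_b["values"],
    cycle_a["weights"] + cycle_b["weights"],
)
```

In this invented case the per-cycle estimates are 0.75 and 0.25, so the combined estimate is 0.5, while the pooled estimate is 0.375 because the second cycle's larger weights dominate the pooled file.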