The two primary objectives of the General Social Survey (GSS) are: to gather data on social trends in order to monitor changes in the living conditions and well-being of Canadians over time; and to provide information on specific social policy issues of current or emerging interest.

This survey monitors the changes in the structure of families with respect to marriages, common-law unions, children and fertility intentions.

Data release - February 28, 1997


This survey monitors changes in Canadian families. It collects information on: conjugal and parental history (chronology of marriages, common-law unions and children), family origins, children's home leaving, fertility intentions as well as work history and other socioeconomic characteristics.

The information collected will impact program and policy areas such as parental benefits, early learning and child-care strategies, affordable housing, child custody and spousal support programs.

Statistical activity

This record is part of the General Social Survey (GSS) program. The GSS, originating in 1985, conducts telephone surveys. Each survey contains a core topic, focus or exploratory questions and a standard set of socio-demographic questions used for classification. More recent cycles have also included some qualitative questions, which explore opinions and perceptions.

Until 1998, the target sample of respondents was approximately 10,000 persons. This was increased in 1999 to 25,000. With a sample of respondents of 25,000, results are available at both the national and provincial levels and possibly for some special population groups such as disabled persons and seniors.


  • Families, households and housing
  • Family history
  • Family types
  • Household characteristics
  • Society and community

Data sources and methodology

Target population

The target population includes all persons 15 years of age and older in Canada, excluding:

  • Residents of the Yukon, Northwest Territories and Nunavut
  • Full-time residents of institutions

Respondents were contacted and interviewed by telephone, so persons in households without telephones could not be interviewed. However, persons living in such households represent less than 2% of the target population.

Instrument design

The questionnaire was designed based on qualitative testing (focus groups), a pilot test and interviewer debriefing.


This is a sample survey with a cross-sectional design.

Derivation of sampling variabilities for each of the estimates which could be generated from the survey would be an extremely costly procedure, and for most users, an unnecessary one. Consequently, approximate measures of sampling variability, in the form of tables, have been developed for use.

The Approximate Variance Tables have been produced using the coefficient of variation formula based on a simple random sample and the straightforward expansion estimator. Since estimates for the General Social Survey were based on a complex sample design and a more complicated raking ratio estimator, a factor called the Design Effect was introduced into the variance formula.

The Design Effect for an estimate is the actual variance for the estimate (taking into account the design and estimator that were used) divided by the variance that would result if the estimate had been derived from a simple random sample and a simple expansion estimator. The Design Effect used to produce the Approximate Variance Tables has been determined by first calculating Design Effects for a wide range of characteristics and then choosing among these a conservative value which will not give a false impression of high precision.
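As a rough illustration of how a design effect enters an approximate coefficient of variation, the sketch below uses the textbook simple-random-sample variance of an estimated proportion and inflates it by a design effect. The function name, the example inputs and the design effect of 1.5 are hypothetical; they are not values from the GSS Approximate Variance Tables, and the finite population correction is ignored.

```python
import math

def approximate_cv(p, n, design_effect=1.0):
    """Approximate CV (as a fraction) of an estimated proportion p.

    Under simple random sampling, ignoring the finite population
    correction, Var(p_hat) = p * (1 - p) / n. The design effect
    multiplies that variance to reflect the complex design.
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    srs_variance = p * (1 - p) / n
    variance = design_effect * srs_variance
    return math.sqrt(variance) / p

# Hypothetical inputs: a 10% characteristic and 10,000 respondents.
cv_srs = approximate_cv(0.10, 10_000)
cv_complex = approximate_cv(0.10, 10_000, design_effect=1.5)
```

Because the design effect multiplies the variance, the coefficient of variation grows by its square root, so choosing a conservative (larger) design effect always makes the tabulated precision look worse rather than better.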

It should be noted that all coefficients of variation in these tables are approximate and therefore unofficial. Estimates of actual variance for specific variables may be purchased from Statistics Canada. Use of actual variance estimates may allow users to release otherwise unreleasable estimates, i.e. estimates with coefficients of variation in the "Not for Release" range.

Data sources

Data collection for this reference period: January 1995 to December 1995 (12 independent monthly samples)

Responding to this survey is voluntary.

Data are collected directly from survey respondents.

Data for Cycle 10 were collected monthly from January 1995 to December 1995. The sample was distributed evenly over the 12 months to represent seasonal variation in the information gathered. Most of the sample was selected using the Elimination of Non-working Banks technique of Random Digit Dialing (RDD). An additional sample of 1,250 respondents sponsored by the province of Quebec was added in May and spread equally over the remaining months.
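The general idea behind eliminating non-working banks can be sketched as follows. This is only an illustration under stated assumptions: a "bank" is taken to be the first eight digits of a ten-digit telephone number, and a bank counts as "working" if at least one listed residential number falls in it. Neither definition is given in this record, and the function name and the telephone numbers are made up.

```python
import random

def enwb_sample(listed_residential_numbers, sample_size, seed=None):
    """Draw telephone numbers from working banks only.

    Assumption: a bank is the first 8 digits of a 10-digit number,
    and a bank is working if it holds at least one listed
    residential number. All other banks are never dialled.
    """
    rng = random.Random(seed)
    working_banks = sorted({num[:8] for num in listed_residential_numbers})
    if not working_banks:
        raise ValueError("no working banks available")
    sample = []
    for _ in range(sample_size):
        bank = rng.choice(working_banks)
        # Complete the number with two random digits (00 to 99).
        last_two = f"{rng.randrange(100):02d}"
        sample.append(bank + last_two)
    return sample

listed = ["6135550123", "6135550187", "9025551234"]  # made-up numbers
numbers = enwb_sample(listed, sample_size=5, seed=1)
```

Generating the last two digits at random is what lets unlisted numbers in a working bank enter the sample, while banks with no known residential numbers are skipped entirely.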

Data collection was conducted by Computer Assisted Telephone Interviewing (CATI) methods. All interviewing took place using centralized telephone facilities in four of Statistics Canada's regional offices.

Data collection for the GSS was conducted by Computer Assisted Telephone Interviewing (CATI) methods and involved two possible questionnaires. Respondents were interviewed in the official language of their choice. The French and English versions of the main questionnaire were identical with the exception of question R25, "What language did you first speak in childhood?" Respondents were not asked if they still understood the language in which they were being interviewed.

The questionnaires, the procedures and the CATI system were field tested in August 1994 in Winnipeg and Montreal. Data collection began in January 1995 and continued through the second week of December 1995. The main sample was evenly distributed over the 12 months.

All interviewing took place using centralized telephone facilities in four of Statistics Canada's regional offices (Halifax, Montreal, Winnipeg and Vancouver), with calls being made from approximately 09:00 until 21:00, Monday to Saturday inclusive. Interviewers were trained by Statistics Canada staff in telephone interviewing techniques using CATI, survey concepts and procedures in a four-day classroom training session. The majority of interviewers had computer and telephone interviewing experience.

Using CATI, responses to survey questions were entered directly into computers as the interview progressed. The CATI data capture program allowed a valid range of codes for each question and built-in edits, and automatically followed the flow of the questionnaire. The data were transmitted to Ottawa electronically.


Error detection

All survey records were subjected to computer edits throughout the course of the interview. With CATI, built-in edits identified invalid or inconsistent information as the interview progressed. As a result, such problems could be immediately resolved with the respondent.

The system principally edited the main questionnaire for possible flow errors, out-of-range values and missing values. Edits on the 10-1 were limited to a few edits for the respondent's age and sex. The CATI system implemented such edits throughout the course of the interview. If the interviewer was unable to resolve the detected errors, it was possible for the interviewer to bypass the edit and forward the data to head office for resolution.

Head office edits performed the same checks as the CATI system as well as more detailed edits. Records with missing or incorrect information were assigned non-response codes and, in a small number of cases, corrected using other information from the respondent's questionnaire. In most cases when editing, if data were inconsistent with responses that came earlier, the earlier information was considered to be correct. For example, if a screening question introduced two or more mutually exclusive "branches" (or "paths") in the questionnaire and data existed for more than one branch, it was the response to the screening question that was deemed correct, and only data in the branch corresponding to this response were retained.
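The branch rule in the example above can be sketched as a small edit function. The question names, the branch map and the choice to overwrite off-path answers with a single-digit "not applicable" code are all assumptions made for illustration; this record does not describe how the head office edits were implemented.

```python
NOT_APPLICABLE = 7  # single-digit "not applicable" code, for illustration

def resolve_branches(record, screener, branches):
    """Keep data only in the branch selected by the screening answer.

    branches maps each screener answer to the list of question
    names on that path. Questions on every other path are set to
    the not-applicable code; the screener itself is trusted.
    """
    cleaned = dict(record)
    chosen = record[screener]
    for answer, questions in branches.items():
        if answer == chosen:
            continue
        for q in questions:
            if q in cleaned:
                cleaned[q] = NOT_APPLICABLE
    return cleaned

branches = {1: ["Q2A", "Q2B"], 2: ["Q3A"]}      # hypothetical question names
raw = {"Q1": 1, "Q2A": 3, "Q2B": 2, "Q3A": 4}   # data present on both paths
clean = resolve_branches(raw, "Q1", branches)
```

Here the screener answer of 1 selects the first path, so the stray answer to Q3A is discarded while Q2A and Q2B are left untouched.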

Due to the nature of the survey, imputation was not appropriate for most items and thus "Not stated" codes were usually assigned for missing data. In some cases, the answer was not known but could be derived deterministically, either from the questions that followed or from information in other areas of the survey.

There are three reasons that can explain the absence of a response to a question: the question may have been skipped because of a previous response; the question may have been skipped because of a previous refusal; or the respondent may have refused to answer the question. In the first case, the question is considered "not applicable" and is given a code of 7, 97, 997 or 9997. In the second case, the applicability of the question is not known, since the question that determines applicability was refused; the question is considered "not asked, applicability unknown" and is given a code of 6, 96, 996 or 9996. In the third case, the question is "refused" and is given a code of 9, 99, 999 or 9999.
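When reading the microdata, these reserved codes need to be separated from valid answers before any analysis. A minimal sketch is shown below; the function name is hypothetical, and it assumes the analyst knows the field width (number of digits) of each variable, since a value such as 7 is a reserved code in a one-digit field but a valid answer in a two-digit one.

```python
RESERVED_LAST_DIGITS = {
    "7": "not applicable",
    "6": "not asked, applicability unknown",
    "9": "refused",
}

def classify_code(value, width):
    """Return the non-response label for a reserved code, else None.

    width is the number of digits in the field (1 to 4). Reserved
    codes are a run of 9s ending in 7, 6 or 9, e.g. 97, 996, 9999.
    """
    if not 1 <= width <= 4:
        raise ValueError("width must be between 1 and 4")
    for last_digit, label in RESERVED_LAST_DIGITS.items():
        reserved = int("9" * (width - 1) + last_digit)
        if value == reserved:
            return label
    return None
```

For example, `classify_code(97, 2)` yields "not applicable", while `classify_code(7, 2)` yields `None` because 7 is an ordinary value in a two-digit field.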

Non-response was not permitted for those items required for weighting. Values were imputed in the rare cases where the number of residential telephone lines was missing. The imputation was based on a detailed examination of the data and the consideration of any useful data such as the ages and sexes of other household members, and the interviewer's comments. The procedure used to select the respondent ensured that there was always a value for age. When not provided by the respondent, DVTEL (number of residential phone lines) was assigned a value of one (1).

Quality evaluation

The extent of non-response varies from partial non-response (failure to answer just one or some questions) to total non-response. Total non-response occurred because the interviewer was unable to contact the respondent, a language problem prevented the interview from taking place, or the respondent refused to participate in the survey. Total non-response was handled by adjusting the weight of households that responded to the survey to compensate for those that did not respond.
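One common way to make such an adjustment is a ratio adjustment, in which respondent weights are inflated so that they sum to the total weight of all sampled households. The sketch below applies a single factor to one group of households; the actual adjustment classes and formula used for the GSS are not given in this record, and the weights shown are invented.

```python
def adjust_for_nonresponse(households):
    """Inflate respondent weights so they also represent non-respondents.

    households is a list of dicts with keys "weight" (design weight)
    and "responded" (bool). Returns the adjusted weights of the
    responding households only, in input order.
    """
    total = sum(h["weight"] for h in households)
    responding = sum(h["weight"] for h in households if h["responded"])
    if responding == 0:
        raise ValueError("no responding households")
    factor = total / responding
    return [h["weight"] * factor for h in households if h["responded"]]

sample = [
    {"weight": 100.0, "responded": True},
    {"weight": 100.0, "responded": True},
    {"weight": 100.0, "responded": False},
    {"weight": 100.0, "responded": True},
]
adjusted = adjust_for_nonresponse(sample)
```

After the adjustment the three responding households carry the full weight of the four sampled ones, which removes bias only to the extent that non-respondents resemble respondents within the group.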

In most cases, partial non-response to the survey occurred when the respondent did not understand or misinterpreted a question, refused to answer a question, or could not recall the requested information.

Users should determine the number of records on the microdata file that contribute to the calculation of a given estimate. When the number of contributors to the weighted estimate is fewer than 15, the weighted estimate should not be released, regardless of the value of the Approximate Coefficient of Variation.
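This check is on the unweighted record count, not on the weighted estimate. A minimal sketch follows; the record layout, field names and helper names are hypothetical.

```python
MIN_CONTRIBUTORS = 15

def count_contributors(records, in_domain):
    """Count unweighted microdata records that fall in the domain."""
    return sum(1 for r in records if in_domain(r))

def can_release(unweighted_count, minimum=MIN_CONTRIBUTORS):
    """True when enough records contribute to the weighted estimate."""
    return unweighted_count >= minimum

# Invented records: six respondents, each with a large survey weight.
records = [{"age": a, "weight": 2000.0} for a in (17, 22, 35, 41, 68, 70)]
n_seniors = count_contributors(records, lambda r: r["age"] >= 65)
releasable = can_release(n_seniors)
```

In this invented example the weighted estimate would be 4,000 persons, yet it rests on only two records and so fails the rule.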

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects that could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Data accuracy

Statistics from the General Social Survey (GSS) databases are estimates based on data collected from a small fraction of the population (roughly one person in 2,000) and are subject to error. The error can be divided into two components: sampling error and non-sampling error.

Sampling error is the difference between the estimate derived from a sample and the result that would have been obtained from a population census using the same data collection procedures. For a sample survey such as the GSS, this error is estimated from the survey data. The measurement of error used is the standard deviation of the estimate. When a sampling error is more than 33 1/3% of the estimate itself, it is considered to be too unreliable to be published. In such a case, the symbol "--" appears in statistical tables in place of the estimate. When the sampling error is between 16 2/3% and 33 1/3%, the corresponding estimate is accompanied by the symbol "*" in a table. Such estimates should be used with caution. Finally, all estimates with a sampling error of less than 16 2/3% can be used without restriction.
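The three ranges above translate directly into a small classification routine. The sketch below is illustrative only: the function name is hypothetical, and treating the exact boundary values of 16 2/3% and 33 1/3% as belonging to the "use with caution" range is an interpretation of the wording above.

```python
CAUTION_LIMIT = 50 / 3     # 16 2/3 percent
SUPPRESS_LIMIT = 100 / 3   # 33 1/3 percent

def release_flag(estimate, standard_error):
    """Return the table symbol for an estimate.

    "--" : sampling error above 33 1/3% of the estimate, suppress
    "*"  : sampling error from 16 2/3% to 33 1/3%, use with caution
    ""   : sampling error below 16 2/3%, no restriction
    """
    if estimate == 0:
        raise ValueError("relative error is undefined for a zero estimate")
    relative_error = 100 * standard_error / abs(estimate)
    if relative_error > SUPPRESS_LIMIT:
        return "--"
    if relative_error >= CAUTION_LIMIT:
        return "*"
    return ""
```

For instance, an estimate of 1,000 with a standard error of 200 has a relative error of 20% and would carry the "*" symbol.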

All other types of errors, such as coverage, response, processing, and non-response, are non-sampling errors. Many of these errors are difficult to identify and quantify.

Coverage errors arise when there are differences between the target population and the surveyed population. Households without telephones represent a part of the target population that was excluded from the surveyed population. To the extent that this excluded population differs from the rest of the target population, the estimates will be biased. Since these exclusions are small, one would expect the biases introduced to be small. However, since there are correlations between a number of questions asked on this survey and the groups excluded, the biases may be more significant than the small size of the groups would suggest.

Individuals residing in institutions were excluded from the surveyed population. The effect of this exclusion is greatest for people aged 65 and over, for whom it approaches 9%.

In a similar way, to the extent that the non-responding households and persons differ from the rest of the sample, the estimates will be biased. The overall response rate for the GSS was 80%. Non-response could occur at several stages in this survey. There were two stages of information collection: at the household level and at the individual level. Non-response at the household level was about 6%. Non-response also occurs at the level of individual questions. For most questions, the response rate was high and, in tables, the non-responses generally appear under the heading "not stated".


