General Social Survey - Education, Work and Retirement (GSS)

Status: Inactive
Frequency: Quinquennial (5 year)
Record number: 4500

Detailed information for 1994 (Cycle 9)

Data release - November 30, 1995

Description

The General Social Survey (GSS) has two primary objectives: to gather data on social trends in order to monitor changes in the living conditions and well-being of Canadians over time; and to provide information on specific social policy issues of current or emerging interest.

This survey monitored changes in education, work and retirement, and examined the relationships between these three main activities.

Statistical activity

This record is part of the General Social Survey (GSS) program. The GSS, originating in 1985, conducts telephone surveys. Each survey contains a core topic, focus or exploratory questions and a standard set of socio-demographic questions used for classification. More recent cycles have also included some qualitative questions, which explore opinions and perceptions.

Until 1998, the target sample was approximately 10,000 respondents. This was increased to 25,000 in 1999. With 25,000 respondents, results are available at both the national and provincial levels and possibly for some special population groups such as disabled persons and seniors.

Subjects

  • Educational attainment
  • Education, training and learning
  • Job training and educational attainment
  • Labour
  • Seniors
  • Society and community
  • Work and retirement
  • Work transitions and life stages

Data sources and methodology

Target population

The target population includes all persons 15 years of age and older in Canada, excluding:
1. Residents of the Yukon, Northwest Territories, and Nunavut
2. Full-time residents of institutions.

Sampling

This is a sample survey with a cross-sectional design.

The GSS used a stratified design, with significant differences in sampling fractions between strata. Thus some areas are over-represented in the sample (relative to their populations) while some other areas are relatively under-represented; this means that the unweighted sample is not representative of the target population.
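
The effect of unequal sampling fractions can be illustrated with a minimal sketch. The strata, population counts and responses below are hypothetical and are not GSS figures; the design weight is assumed to be the stratum population divided by the stratum sample size (the inverse of the sampling fraction).

```python
# Hypothetical two-stratum example; none of these figures come from the GSS.
strata = {
    # name: (population size, 0/1 responses from the sampled persons)
    "small_stratum": (100_000, [1, 1, 1, 0]),  # over-represented in the sample
    "large_stratum": (900_000, [0, 0, 1, 0]),  # under-represented in the sample
}


def unweighted_mean(strata):
    """Average of all responses, ignoring the sample design."""
    values = [y for _, ys in strata.values() for y in ys]
    return sum(values) / len(values)


def weighted_mean(strata):
    """Average where each response carries its design weight."""
    total_weight = 0.0
    total_weighted_y = 0.0
    for population, ys in strata.values():
        weight = population / len(ys)  # inverse of the sampling fraction
        for y in ys:
            total_weight += weight
            total_weighted_y += weight * y
    return total_weighted_y / total_weight


print("unweighted:", unweighted_mean(strata))
print("weighted:  ", weighted_mean(strata))
```

In this made-up case the unweighted proportion overstates the population proportion because the small stratum, where the characteristic is more common, supplies half of the sample but only a tenth of the population.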


The GSS employed two different Random Digit Dialing (RDD) sampling techniques. For Newfoundland, Nova Scotia, Ontario and Alberta, the Elimination of Non-Working Banks method was used. For the remaining provinces, the Waksberg method was used.
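
The source does not give the parameters used for either technique. As a rough illustration only, the sketch below follows the textbook description of the Waksberg (Mitofsky-Waksberg) two-stage method: a bank of 100 numbers is retained only if one randomly dialed number in it is residential, and further numbers are then dialed within retained banks. The simulated frame, the function name `waksberg_sample` and the values of `n_banks` and `k` are all assumptions.

```python
import random

BANK_SIZE = 100  # a "bank" is a block of 100 consecutive telephone numbers


def waksberg_sample(banks, is_residential, n_banks, k, rng):
    """Two-stage RDD sketch (hypothetical helper, not the GSS implementation).

    banks          -- list of integer bank prefixes (simulated frame)
    is_residential -- callable standing in for dialing a number
    n_banks        -- number of banks to retain
    k              -- additional residential numbers wanted per retained bank
    """
    sample = []
    retained = 0
    order = list(banks)
    rng.shuffle(order)  # visit banks in random order, without replacement
    for bank in order:
        if retained == n_banks:
            break
        numbers = [bank * BANK_SIZE + s for s in range(BANK_SIZE)]
        rng.shuffle(numbers)
        first, rest = numbers[0], numbers[1:]
        # Stage 1: dial one random number; reject the bank if not residential.
        if not is_residential(first):
            continue
        retained += 1
        picked = [first]
        # Stage 2: keep dialing within the bank until k more residential
        # numbers are found, or the bank is exhausted.
        for number in rest:
            if len(picked) == k + 1:
                break
            if is_residential(number):
                picked.append(number)
        sample.extend(picked)
    return sample


# Simulated frame: every fifth bank has 30 residential numbers, the rest none.
rng = random.Random(1994)
residential = set()
for bank in range(0, 50, 5):
    residential.update(
        bank * BANK_SIZE + s for s in rng.sample(range(BANK_SIZE), 30)
    )

sample = waksberg_sample(
    list(range(50)), residential.__contains__, n_banks=3, k=4, rng=rng
)
print(len(sample), "residential numbers selected")
```

Because banks with no residential numbers can never pass the first stage, dialing effort concentrates in working banks, which is the practical appeal of the method.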

Data sources

Responding to this survey is voluntary.

Data are collected directly from survey respondents.

The data were collected over a 12-month period, from January 1994 to December 1994, using a computer-assisted telephone interview system.

Error detection

All survey records were subjected to an exhaustive computer edit to identify and correct invalid or inconsistent information on the questionnaires. Records with missing or incorrect information were assigned non-response codes or corrected using other information from the respondent's questionnaire. In most cases, editing was 'bottom-up', meaning that specific related information following a question with a branching pattern was used to ensure that the branching was correct. For example, question E1 of the "Education and Work Questionnaire" was "In the next five years, do you plan to start an additional educational or training program?" This question was edited in relation to question E2, "What is your main reason for planning to do this?" The edit ensured that the information was consistent and complete between these questions.
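
A 'bottom-up' edit of the E1/E2 pair could look roughly like the sketch below. The numeric codes (`YES`, `NO`, `NOT_STATED`), the record layout and the function name are hypothetical; the source does not publish the actual edit specifications.

```python
# Hypothetical codes; the real questionnaire codes are not given in the source.
YES, NO = 1, 2
NOT_STATED = 9


def edit_e1_e2(record):
    """Return a copy of the record with E1 and E2 made consistent.

    Bottom-up: the follow-up answer (E2) is used to check the branching
    question (E1), rather than the other way round.
    """
    edited = dict(record)
    e1, e2 = edited.get("E1"), edited.get("E2")
    if e2 is not None and e1 != YES:
        # A stated reason implies the respondent does plan a program.
        edited["E1"] = YES
    elif e1 == YES and e2 is None:
        # The follow-up should have been asked but nothing was recorded.
        edited["E2"] = NOT_STATED
    return edited


print(edit_e1_e2({"E1": NO, "E2": 3}))
print(edit_e1_e2({"E1": YES, "E2": None}))
```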

Due to the nature of the survey, imputation was not appropriate for most items and thus 'not stated' codes were usually assigned for missing data. In some cases, the answer was not known but could be narrowed to a subset of possible answers using the skip pattern (e.g. variable A11, value 21, where the answer is one of the values 01 to 04 but the exact one is not known).

However, non-response was not permitted for those items required for weighting. Values were imputed for the following: age; sex; number of residential telephone lines. Sex was imputed for 12 records. The imputation was based on the respondent's name as recorded on the screening form. For 2 cases where the name was not clearly one sex or the other, the sex was imputed randomly.

Age was imputed for 8 records. The imputation was based on the respondent's values for various age-related fields on the questionnaire, such as question G2 from the Education and Work Questionnaire ("Which of the following best describes your main activity during 1988?") and the years when particular events had occurred (e.g. question C1, "In what year did you complete your studies or stop taking courses?"). As well, a number of records had age "derived" using the year and month of birth as reported on the Education and Work Questionnaire.

DVTEL (number of residential telephone lines) was derived from questions P5 to P9 of the Education and Work Questionnaire (4-2). For 137 records, there was incomplete or conflicting information for these questions:
- 48 records had some of the P5 to P9 fields changed by an edit program;
- 89 records had no information for any of P5 to P9 and were assigned a DVTEL value of 1.

Data from the survey questionnaires were entered directly into mini-computers in Statistics Canada's regional offices and transmitted to Ottawa. The data capture program allowed for a valid range of codes for each question and automatically followed the flow of the questionnaire. Operators were able to enter either invalid data or information that violated the questionnaire flow but only through the use of special functions after they had been alerted that the entry was not valid. No editing to check consistency between questions was done at this stage.

Given the content of this cycle, the coding of occupation, industry and education was important. The coding for the three education fields (A15, B3 and E5) was performed manually using the 1986 Census specifications for the Major Field of Study.

For each job held by the respondent (with the exception of L14), the questionnaire collected information on the name of the employer, the kind of business, industry or service the employer was in and the kind of work done. This information was used to assign industry and occupation codes to each job using the 1980 version of Statistics Canada's Standard Industrial and Occupational Classifications.

The coding for industry and occupation was done using the automated coding system used by the Labour Force Survey of Statistics Canada. Responses that could not be coded automatically were coded manually. The codes

Estimation

Statistics from the General Social Survey (GSS) databases are estimates based on data collected from a small fraction of the population (roughly one person in 2,000) and are subject to error. The error can be divided into two components: sampling error and non-sampling error.

Sampling error is the difference between the estimate derived from a sample and the result that would have been obtained from a population census using the same data collection procedures. For a sample survey such as the GSS, this error is estimated from the survey data. The measure of error used is the standard deviation of the estimate. When the sampling error is more than 33 1/3% of the estimate itself, the estimate is considered too unreliable to be published. In such a case, the symbol "--" appears in statistical tables in place of the estimate. When the sampling error is between 16 2/3% and 33 1/3% of the estimate, the estimate is accompanied by the symbol "*" in a table. Such estimates should be used with caution. Finally, all estimates with a sampling error of less than 16 2/3% can be used without restriction.
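
The publication thresholds above can be written as a small decision rule. This is a sketch only: the function name and return values are illustrative, and the sampling error is assumed to be supplied as a standard error in the same units as the estimate.

```python
SUPPRESS_ABOVE = 100 / 3  # 33 1/3 %
CAUTION_FROM = 50 / 3     # 16 2/3 %


def release_symbol(estimate, standard_error):
    """Return the table symbol implied by the relative sampling error.

    "--" : too unreliable to publish (error above 33 1/3% of the estimate)
    "*"  : publish with caution (error between 16 2/3% and 33 1/3%)
    ""   : no restriction (error below 16 2/3%)
    """
    if estimate == 0:
        raise ValueError("relative error is undefined for a zero estimate")
    relative_error = 100 * standard_error / abs(estimate)
    if relative_error > SUPPRESS_ABOVE:
        return "--"
    if relative_error >= CAUTION_FROM:
        return "*"
    return ""


for est, se in [(1000, 100), (1000, 200), (1000, 400)]:
    print(est, se, repr(release_symbol(est, se)))
```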

All other types of errors, such as coverage, response, processing, and non-response, are non-sampling errors.

Many of these errors are difficult to identify and quantify.

Coverage errors arise when there are differences between the target population and the surveyed population. Households without telephones represent a part of the target population that was excluded from the surveyed population. To the extent that this excluded population differs from the rest of the target population, the estimates will be biased. Since these exclusions are small, one would expect the biases introduced to be small. However, since there are correlations between a number of questions asked on this survey and the groups excluded, the biases may be more significant than the small size of the groups would suggest.

Individuals residing in institutions were excluded from the surveyed population. The effect of this exclusion is greatest for people aged 65 and over, for whom it approaches 9%.

In a similar way, to the extent that the non-responding households and persons differ from the rest of the sample, the estimates will be biased. The overall response rate for the GSS was approximately 80%. Non-response could occur at several stages in this survey. There were two stages of information collection: at the household level and at the individual level. Non-response at the household level averaged 6%. Non-response also occurs at the level of individual questions. For most questions, the response rate was high and, in tables, the non-responses generally appear under the heading "not stated".

While refusal to answer specific questions was very low, accuracy of recall and ability to answer some questions completely can be expected to affect some of the results presented in the subsequent chapters. Awareness of exact question wording will help the reader interpret the survey results.

Since the survey is cross-sectional, caution is required in making causal inferences about the association between variables. Observed associations may be a reflection of differences between cohorts, period effects, differences between age groups or a combination of these factors.

The estimates derived from this survey are based on a sample of households. Somewhat different figures might have been obtained if a complete census had been taken using the same questionnaire, interviewers, supervisors, processing methods, etc. as those actually used. The difference between the estimates obtained from the sample and the results from a complete count taken under similar conditions is called the sampling error of the estimate.

Although the exact sampling error of the estimate, as defined above, cannot be measured from sample results alone, it is possible to estimate a statistical measure of sampling error, the standard error, from the sample data.

Quality evaluation

Users should determine the number of records on the microdata file which contribute to the calculation of a given estimate. This number should be 25 or more. When the number of contributors to the weighted estimate is less than this, the weighted estimate should not be released regardless of the value of the Approximate Coefficient of Variation.
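
A user working with the microdata file might apply the 25-record rule along the lines of the sketch below. The record layout (a `weight` field plus analysis variables), the variable name `retired` and the function name are hypothetical, and the example data are synthetic.

```python
MIN_CONTRIBUTORS = 25  # minimum number of contributing records stated above


def weighted_count(records, in_domain):
    """Return (weighted estimate, number of contributors, releasable flag)."""
    contributors = [r for r in records if in_domain(r)]
    estimate = sum(r["weight"] for r in contributors)
    releasable = len(contributors) >= MIN_CONTRIBUTORS
    return estimate, len(contributors), releasable


# Synthetic file: 30 retired respondents and 10 others, each with weight 2000.
records = (
    [{"weight": 2000.0, "retired": True} for _ in range(30)]
    + [{"weight": 2000.0, "retired": False} for _ in range(10)]
)

print(weighted_count(records, lambda r: r["retired"]))      # 30 contributors
print(weighted_count(records, lambda r: not r["retired"]))  # 10 contributors
```

The check uses the unweighted number of records, so a large weighted estimate built from only a handful of respondents is still withheld.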

It should be noted that the public use microdata files differ in a number of important respects from the survey 'master' files held by Statistics Canada. These differences are the result of actions taken to protect the anonymity of individual survey respondents. Users requiring access to information excluded from the microdata files may purchase custom tabulations. Estimates generated will be released to the user subject to meeting the publication and release guidelines.

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Data accuracy

The response rate for the 1994 GSS was 82%.
