Workplace and Employee Survey (WES)
Detailed information for 2001
The overall goal of the survey is to examine the way in which employers and their employees respond to the changing competitive and technological environment.
Data release - July 11, 2003
- Questionnaire(s) and reporting guide(s)
- Data sources and methodology
- Data accuracy
The Workplace and Employee Survey (WES) is designed to explore a broad range of issues relating to employers and their employees. The survey aims to shed light on the relationships among competitiveness, innovation, technology use and human resource management on the employer side and technology use, training, job stability and earnings on the employee side.
The survey is unique in that employers and employees are linked at the micro data level; employees are selected from within sampled workplaces. Thus, information from both the supply and demand sides of the labour market is available to enrich studies on either side of the market.
To create the best conditions for growth in the knowledge-based economy, governments need to fine-tune their policies on education, training, innovation, labour adjustment, workplace practices, industrial relations and industry development. The results from the survey will help clarify many of these issues and will assist in policy development.
The Workplace and Employee Survey offers potential users several unique innovations: chief among these is the link between events occurring in workplaces and the outcomes for workers. In addition, being longitudinal, it allows for a clearer understanding of changes over time.
There are two reference periods used for the WES. Questions concerning employment breakdown use the last pay period of March for the reference year while other questions refer to the last 12-month period ending in March of the reference year.
- Adult education and training
- Education, training and learning
- Hours of work and work arrangements
- Job training and educational attainment
- Non-wage benefits
- Wages, salaries and other earnings
- Workplace organization, innovation, performance
Data sources and methodology
The target population for the employer component is defined as all business locations operating in Canada that have paid employees in March, with the following exceptions:
a) Employers in Yukon, Nunavut and Northwest Territories; and
b) Employers operating in crop production and animal production; fishing, hunting and trapping; private households, religious organizations and public administration.
The target population for the employee component is all employees working or on paid leave in March in the selected workplaces who receive a Canada Revenue Agency T-4 Supplementary form. If a person receives a T-4 slip from two different workplaces, then the person will be counted as two employees on the WES frame.
The survey population is the collection of all units for which the survey can realistically provide information. The survey population may differ from the target population due to operational difficulties in identifying all the units that belong to the target population.
The WES draws its sample from the Business Register (BR) maintained by the Business Register Division of Statistics Canada and from lists of employees provided by the surveyed employers.
The Business Register is a list of all businesses in Canada and is updated each month using data from various surveys, business profiling and administrative data.
In 1994, the Business and Labour Market Analysis Division conducted research on the possibility of an integrated approach to the collection and analysis of data on establishments and their employees. The findings were presented, a pre-test was funded and a WES working group was created. The group consulted with experts such as the EKOS group to determine the important research issues, and a questionnaire was drafted.
A pre-test of 50 businesses was conducted. There were consultations with an Advisory Subject Matter Group and based on the results of the pre-test and recommendations, changes were made to the questionnaire.
A pilot survey was conducted. Experts such as the Subject Matter Advisory Group, Human Resources Development Canada and the EKOS Group were consulted, and the questionnaires were changed based on their comments and suggestions. Also, because of improvements in Business Register profiling, the survey was simplified from a 3-stage process to a 2-stage process.
Ongoing scrutiny of the questionnaire by subject matter analysts, researchers, and interviewers alerts the WES team to any further modifications needed in the wording or order of questions.
This is a sample survey with a longitudinal design.
The survey frame is a list of all statistical locations that carries contact and classification (e.g., industrial classification) information on the units. This list is used for sample design and selection; ultimately, it provides contact and classification information for the selected units.
The survey frame of the Workplace component of WES is created from the information available on the Statistics Canada Business Register.
Prior to sample selection, the business locations on the frame are stratified into relatively homogeneous groups called strata, which are then used for sample allocation and selection. The WES frame is stratified by industry (14), region (6) and size (3), where size is defined using estimated employment. The size stratum boundaries are typically different for each industry/region combination. The cut-off points defining a particular size stratum are computed using a model-based approach. The sample is allocated to strata using Neyman allocation. This process partitions the target population into 252 strata (14 x 6 x 3), from which 10,595 business locations were selected in 2001.
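As a rough illustration of Neyman allocation, the sketch below splits a fixed sample size across strata in proportion to the stratum size multiplied by the stratum standard deviation. The function name, the stratum labels and all numbers are hypothetical and are not taken from the WES design. Rounding and capping at the stratum size are left out.

```python
def neyman_allocation(total_sample, strata):
    """Split total_sample across strata in proportion to N_h * S_h.

    strata maps a stratum label to (N_h, S_h): the number of units on the
    frame and the standard deviation of the design variable in that stratum.
    """
    products = {label: n_units * std for label, (n_units, std) in strata.items()}
    denominator = sum(products.values())
    return {label: total_sample * p / denominator for label, p in products.items()}


# Hypothetical size strata within one industry/region cell (illustrative numbers).
strata = {"small": (1000, 2.0), "medium": (300, 10.0), "large": (50, 40.0)}
allocation = neyman_allocation(100, strata)
for label, n_h in allocation.items():
    print(label, round(n_h, 1))
```

Strata that are large or highly variable receive more of the sample, which is why a small number of large workplaces can be sampled at a much higher rate than the many small ones.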
All sampled units are assigned a sampling weight, a raising factor attached to each sampled unit so that estimates for the population can be obtained from the sample. For example, if two units are selected at random and with equal probability out of a population of ten units, then each selected unit will represent five units in the population, and it will have a sampling weight of five.
The 2001 WES collected data from 6,223 of the 10,595 sampled employers. The remaining employers were either out-of-business, seasonally inactive, holding companies, or out-of-scope. The majority of non-respondents were owner-operators with no paid help and in possession of a payroll deduction account.
The initial sample selected in 1999 is followed over time and is supplemented at two-year intervals with a sample of births selected from units added to the Business Register since the last survey occasion. Stratification of units remains constant over the life of the initial panel (set at 8 years). Whenever possible, the same sampling fractions are applied to all panels. Sometimes the sampling fractions are adjusted to offset stratum erosion, or to compensate for upswings or downswings in the economy. For 2001, they were revised slightly upward. This resulted in a birth panel of 1,792 workplaces.
The frame of the employee component of WES is based on lists of employees made available to interviewers by the selected workplaces. A maximum of twenty-four employees are sampled using a probability mechanism. In workplaces with fewer than four employees, all employees are selected.
Sample Size - Employer
1999 - 6,322
2000 - 6,068
2001 - 6,223
Sample Size - Employee
1999 - 23,540
2000 - 20,167
2001 - 20,377
Employees will be followed for two years only, due to the difficulty of integrating new employers into the location sample as workers change companies. As such, fresh samples of employees will be drawn on every second survey occasion (i.e. first, third, fifth).
Data collection for this reference period: The employer collection ran from April 1 to June 30, 2001. The employee collection ran from September 1 to November 30, 2001.
Responding to this survey is mandatory.
Data are collected directly from survey respondents.
Data collection, data capture, preliminary editing and follow-up of non-respondents are all done in Statistics Canada Regional Offices. In 1999, workplace data were collected in person. Since 2000, computer-assisted telephone interviews (CATI) have been conducted. For about 20% of the surveyed units (mostly large workplaces), more than one contact person is required.
For the employee component, telephone interviews are conducted with persons who agree to participate in the survey by filling out and mailing in an employee participation form.
The use of CATI for data collection greatly reduces the number of response and typographical errors. The system incorporates basic data validation and verification of known relationships, such as full-time and part-time employment not exceeding total employment. To detect errors that have eluded the CATI application, questionable responses are analyzed at both the micro and macro levels to protect the coherence of the data.
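The kind of relationship check mentioned above can be sketched as a simple edit rule. This is only an illustration: the field names and the function are hypothetical and do not describe the actual CATI application.

```python
def employment_edit_errors(record):
    """Return a list of edit failures for one workplace record."""
    errors = []
    full_time = record["full_time"]
    part_time = record["part_time"]
    total = record["total_employment"]
    if min(full_time, part_time, total) < 0:
        errors.append("negative employment count")
    # Known relationship: full-time plus part-time cannot exceed the total.
    if full_time + part_time > total:
        errors.append("full-time plus part-time exceeds total employment")
    return errors


consistent = employment_edit_errors(
    {"full_time": 30, "part_time": 10, "total_employment": 45}
)
inconsistent = employment_edit_errors(
    {"full_time": 30, "part_time": 20, "total_employment": 45}
)
print(consistent, inconsistent)
```

A record that fails such a rule would typically be queried with the respondent during the interview rather than corrected after the fact.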
Imputation methods are used cross-sectionally for item non-response for units appearing within each wave for the first time. Longitudinal imputation methods are used for wave non-response if historical data are available or for item non-response for units that have been in-sample for more than a year. In the absence of prior information, total non-response is handled by modifying the weights of the respondents. This approach assumes that the non-response is occurring completely at random.
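One common way to modify respondent weights for total non-response is to inflate them within an adjustment class so that they also account for the non-respondents. The sketch below assumes that form of adjustment. The class definition, the field names and the numbers are hypothetical, and the source does not specify the exact adjustment used.

```python
def adjust_for_nonresponse(units):
    """Reweight respondents in one adjustment class to cover non-respondents.

    units: list of dicts with a 'weight' and a boolean 'responded' flag.
    Returns the respondents only, with inflated weights.
    """
    total_weight = sum(u["weight"] for u in units)
    respondent_weight = sum(u["weight"] for u in units if u["responded"])
    if respondent_weight == 0:
        raise ValueError("no respondents in this class")
    factor = total_weight / respondent_weight
    return [dict(u, weight=u["weight"] * factor) for u in units if u["responded"]]


# Four sampled units with equal weights, one of which did not respond.
sampled = [
    {"id": 1, "weight": 5.0, "responded": True},
    {"id": 2, "weight": 5.0, "responded": True},
    {"id": 3, "weight": 5.0, "responded": True},
    {"id": 4, "weight": 5.0, "responded": False},
]
adjusted = adjust_for_nonresponse(sampled)
print([round(u["weight"], 3) for u in adjusted])
```

The adjusted weights sum to the same total as the original weights, which is only appropriate if non-respondents resemble respondents within the class, as the missing-completely-at-random assumption implies.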
There are four main imputation methods being used for the first wave of the employer portion of WES: deterministic, distributional, ratio and weighted hot deck. Deterministic imputation is used when a single missing field can be deduced uniquely from the given information. For example, if one component of a sum is missing and the remaining components including the sum are present, then the missing component can be determined uniquely.
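The sum example for deterministic imputation can be written as a minimal sketch. The function and field names are hypothetical. The value is only filled in when exactly one component is missing and the total is reported, since otherwise it cannot be deduced uniquely.

```python
def impute_missing_component(total, components):
    """Fill in the single missing component of a reported sum, if possible."""
    missing = [name for name, value in components.items() if value is None]
    if total is None or len(missing) != 1:
        # Nothing can be deduced uniquely; leave the record unchanged.
        return dict(components)
    known_sum = sum(value for value in components.values() if value is not None)
    completed = dict(components)
    completed[missing[0]] = total - known_sum
    return completed


reported = {"full_time": 30, "part_time": None, "contract": 5}
completed = impute_missing_component(45, reported)
print(completed)

# With two components missing, the record is returned as reported.
unchanged = impute_missing_component(45, {"full_time": None, "part_time": None})
```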
Distributional imputation is used for questions where the respondent is asked to provide a total and its breakdown into multiple categories when either two or more of the categories are missing. The distribution of the categories is computed at a macro level and applied at the micro level. To illustrate this approach, let us assume that the respondent gave us total employment but was unable to provide a breakdown by occupational group. We would apply the distribution of the occupational groups computed at the industry/size level to the total employment figure to impute the missing fields.
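The occupational group example can be sketched as follows. The distribution is computed from complete respondents in the same industry/size class and then applied to the reported total. The group names and numbers are hypothetical, and the sketch uses unweighted sums for simplicity, which is an assumption rather than something the source states.

```python
def class_distribution(respondents, groups):
    """Sum each occupational group over the complete respondents of one class."""
    return {g: sum(r[g] for r in respondents) for g in groups}


def distributional_impute(total, group_sums):
    """Spread a reported total across groups using the class-level distribution."""
    grand_total = sum(group_sums.values())
    return {g: total * s / grand_total for g, s in group_sums.items()}


groups = ["managers", "professionals", "other"]
complete_respondents = [
    {"managers": 2, "professionals": 3, "other": 5},
    {"managers": 1, "professionals": 1, "other": 8},
]
group_sums = class_distribution(complete_respondents, groups)

# A respondent reported total employment of 40 but no breakdown.
imputed = distributional_impute(40, group_sums)
print(imputed)
```

The imputed values may be fractional. Any rounding rule applied afterwards would need to preserve the reported total.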
Ratio imputation is mainly used for continuous variables. The missing value is replaced by the adjusted value of an auxiliary variable from a randomly selected donor within an imputation class. The adjustment usually takes the form of a ratio: the sum of the variable being imputed over all donors, divided by the sum of the auxiliary variable over the same donors.
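A minimal sketch of a class-level ratio adjustment follows. It assumes the textbook form, in which the ratio computed from the donors is applied to the recipient's own auxiliary value. The random donor selection mentioned above is not modelled, and the variable names and numbers are hypothetical.

```python
def class_ratio(donors, target, auxiliary):
    """Sum of the target variable over donors divided by the sum of the auxiliary."""
    aux_sum = sum(d[auxiliary] for d in donors)
    if aux_sum == 0:
        raise ValueError("auxiliary variable sums to zero in this class")
    return sum(d[target] for d in donors) / aux_sum


def ratio_impute(recipient, donors, target, auxiliary):
    """Impute the target as ratio * auxiliary (assumed textbook form)."""
    return class_ratio(donors, target, auxiliary) * recipient[auxiliary]


# Hypothetical donors in one imputation class.
donors = [
    {"revenue": 100.0, "employment": 10},
    {"revenue": 300.0, "employment": 20},
    {"revenue": 200.0, "employment": 20},
]
recipient = {"revenue": None, "employment": 5}
imputed_revenue = ratio_impute(recipient, donors, "revenue", "employment")
print(imputed_revenue)
```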
For weighted hot deck, a missing field is imputed using the response of a suitable donor. The donor is selected randomly with a probability of selection equal to the ratio of its sample weight over the sum of the sample weights of all units in the corresponding cross-sectional imputation class. The weighted hot deck approach was adopted for four reasons. First, the method is easy to implement. Second, it leads to approximately p-unbiased, or design-unbiased, point estimates (Rao, 1996). Third, a consistent variance estimator can be constructed in the presence of imputed data (Rao, 1996). Lastly, most questions are independent, which keeps the number of post-imputation adjustments needed to maintain internal data consistency to a minimum.
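The donor selection rule can be sketched with the standard library, where `random.choices` draws a donor with probability proportional to its weight. The donor records and the seed are hypothetical and serve only to make the illustration reproducible.

```python
import random


def weighted_hot_deck(donors, rng):
    """Return the value of a donor drawn with probability weight / sum of weights."""
    weights = [d["weight"] for d in donors]
    donor = rng.choices(donors, weights=weights, k=1)[0]
    return donor["value"]


# Two hypothetical donors in one imputation class; the second carries 90% of the weight.
donors = [
    {"value": 10, "weight": 1.0},
    {"value": 20, "weight": 9.0},
]
rng = random.Random(0)  # fixed seed so the illustration is reproducible
draws = [weighted_hot_deck(donors, rng) for _ in range(2000)]
print(draws.count(10), draws.count(20))
```

Over many draws, the heavily weighted donor is selected far more often, which is what keeps the imputed data consistent with the weighted distribution of the respondents.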
Missing data on the employee questionnaire are imputed using deterministic and weighted hot-deck imputation. To avoid producing inconsistencies in the data, most interrelated fields are imputed as a block. Since there are a number of questions falling into this category, a post-imputation system has been developed to preserve all inter-field relationships.
The reported (or imputed) values for each workplace and employee in the sample are multiplied by the weight for that workplace or employee; these weighted values are summed up to produce estimates. An initial weight equal to the inverse of the original probability of selection is assigned to each unit. To calculate variance estimates, the initial survey weights are adjusted to force the estimated totals in each industry/region group to agree with the known population totals. These adjusted weights are then used in forming estimates of means or totals of variables collected by the survey.
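The weighted sum described above can be sketched as follows, together with a breakdown by a domain such as industry. The records, field names and weights are hypothetical. Calibration of the weights is not shown here.

```python
from collections import defaultdict


def weighted_total(records, variable):
    """Estimate a population total as the sum of weight * reported value."""
    return sum(r["weight"] * r[variable] for r in records)


def weighted_totals_by_domain(records, variable, domain):
    """Estimate the same total separately for each value of a domain variable."""
    totals = defaultdict(float)
    for r in records:
        totals[r[domain]] += r["weight"] * r[variable]
    return dict(totals)


# Hypothetical sample: each weight is the inverse of the selection probability.
sample = [
    {"weight": 5.0, "employment": 12, "industry": "retail"},
    {"weight": 5.0, "employment": 8, "industry": "retail"},
    {"weight": 2.0, "employment": 150, "industry": "manufacturing"},
]
total_employment = weighted_total(sample, "employment")
by_industry = weighted_totals_by_domain(sample, "employment", "industry")
print(total_employment, by_industry)
```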
Variables for which population totals are known are called auxiliary variables. They are used to calibrate survey estimates to increase their precision. Each business location is calibrated to known population totals at the industry/region level. The auxiliary variable used for WES is total employment obtained from the Survey of Employment, Payrolls and Hours.
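A simple form of calibration to a known total is a ratio adjustment within each industry/region group: every weight in the group is multiplied by the known total divided by the weighted sample estimate of the auxiliary variable. The sketch below assumes that form. The numbers are hypothetical and the actual WES calibration may differ in detail.

```python
def calibrate_weights(units, known_total, aux="employment"):
    """Scale the weights in one group so the weighted auxiliary total matches known_total."""
    estimated = sum(u["weight"] * u[aux] for u in units)
    if estimated == 0:
        raise ValueError("estimated auxiliary total is zero")
    g = known_total / estimated  # calibration factor for this group
    return [dict(u, weight=u["weight"] * g) for u in units]


# Hypothetical industry/region group whose known employment total is 600.
group = [
    {"weight": 10.0, "employment": 20},
    {"weight": 10.0, "employment": 30},
]
calibrated = calibrate_weights(group, known_total=600)
calibrated_total = sum(u["weight"] * u["employment"] for u in calibrated)
print([u["weight"] for u in calibrated], calibrated_total)
```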
Estimates are computed for many domains of interest such as industry and region.
The estimation method is described in full in the attached document.
To validate estimates of key financial variables such as revenues and expenditures, comparisons were made with the Unified Enterprise Survey, the Annual Retail and Wholesale Trade Survey, and the Census of Manufacturing. Other data sources such as the Longitudinal Employment Analysis Program (LEAP) were used to assess survey coverage and death rates. On the employee side, comparisons were made with wage data collected by the Survey of Labour and Income Dynamics and the Labour Force Survey. Other variables were scrutinized as well. Most of these data verification activities took place during the revision of the 1999 wave. Since then, data have been rigorously validated and edited each year of the survey to ensure sufficient data quality.
Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
The release of the workplace portion of WES to the Research Data Centres marks the first occasion that business data have been made available outside of Statistics Canada. Despite all external users of WES data being sworn to uphold the confidentiality of the files, further steps are taken to ensure disclosure avoidance. All obvious workplace identifiers are removed from the files and a number of large or unique respondents are suppressed. The procedure is done in two steps.
In the first step, an average rank is computed for each record based on its contribution to the estimates of totals for a number of key variables. The top five ranked records in each industry are analyzed to assess their likelihood of being identified. The second step involves a multivariate technique called Principal Component Analysis, whereby data are reduced to at most three dimensions (the principal components) such that the characteristics of the original data are preserved. Any records whose principal components set them apart from the rest of the observations are reviewed. This can also be done visually by rotating a 3-D representation of the principal components to identify units that lie outside of the main data cloud. The records deemed unique by this step are combined with those obtained in the first step and suppressed. The suppression pattern is reviewed at two-year intervals coinciding with the sample top-up.
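The first step, ranking records by their contribution to estimated totals, can be sketched as follows. The sketch covers only the ranking for a single industry. The principal component step is not shown, and the record identifiers, key variables and numbers are hypothetical.

```python
def average_ranks(records, key_vars):
    """Mean rank of each record across key variables (rank 1 = largest contribution)."""
    rank_sums = {r["id"]: 0 for r in records}
    for var in key_vars:
        # A record's contribution to an estimated total is weight * value.
        ordered = sorted(records, key=lambda r: r["weight"] * r[var], reverse=True)
        for rank, r in enumerate(ordered, start=1):
            rank_sums[r["id"]] += rank
    return {rid: s / len(key_vars) for rid, s in rank_sums.items()}


def top_ranked(records, key_vars, n=5):
    """Identifiers of the n records with the best (lowest) average rank."""
    avg = average_ranks(records, key_vars)
    return sorted(avg, key=avg.get)[:n]


# Hypothetical records from a single industry.
records = [
    {"id": "A", "weight": 1.0, "revenue": 1000, "employment": 500},
    {"id": "B", "weight": 10.0, "revenue": 10, "employment": 5},
    {"id": "C", "weight": 2.0, "revenue": 100, "employment": 30},
]
ranks = average_ranks(records, ["revenue", "employment"])
flagged = top_ranked(records, ["revenue", "employment"], n=2)
print(ranks, flagged)
```

Records with a small weight and very large values dominate the ranking, which matches the intuition that large, self-representing workplaces are the easiest to identify.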
The confidentiality of the employee portion is less problematic in that the sampling weights tend to mask the identity of respondents. Note that the employees associated with the suppressed workplaces are also suppressed.
The information presented in publications is reviewed to ensure that the confidentiality of individual responses is respected. Any estimate that could reveal the identity of a specific respondent is declared confidential, and consequently not published.
While considerable effort is made to ensure a high standard throughout all survey operations, the resulting estimates are inevitably subject to a certain degree of error. This is true in every survey. The total survey error can be divided into two main components: the sampling error and the non-sampling errors. The sampling error is due to the fact that estimates are computed using only a sample of the whole population instead of a complete census while the non-sampling errors are due to all other causes such as an imperfect frame, measurement errors or non-response. For instance, measurement errors can arise from mistakes made by respondents or interviewers during the collection of data, from errors made in keying in the data, or from other sources. This type of error may lead to the imputation of consistent but not necessarily correct values.
The WES sample was designed to be efficient for estimating totals at an industry by region by size level within the available budget. The projected coefficients of variation were around 5% for industry and 10% for industry by region for variables highly correlated with employment. When estimates are produced, they are compared to the projected precision. Approximately 60% of all estimates of totals exceeded expectation, with another 25% being within the Statistics Canada publishable cut-off of 33%. The remaining 15% were not publishable by our standards. These were mostly estimates not highly correlated with employment. All estimates falling into the unpublishable category are validated. Estimates with a coefficient of variation in the range of 25% to 33% are published with a cautionary flag, denoting their relatively high variability.
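The publication rule above can be sketched as a small classifier on the coefficient of variation, taken here as the standard error divided by the estimate and expressed as a percentage. The function name and the treatment of values exactly at 25% and 33% are assumptions.

```python
def publication_status(estimate, standard_error):
    """Classify an estimate by its coefficient of variation (CV), in percent."""
    if estimate == 0:
        raise ValueError("CV is undefined for a zero estimate")
    cv = 100 * standard_error / abs(estimate)
    if cv > 33:
        return "not publishable"
    if cv >= 25:
        # Assumption: the flagged range is treated as inclusive at both ends.
        return "publishable with cautionary flag"
    return "publishable"


precise = publication_status(100.0, 10.0)   # CV of 10%
variable = publication_status(100.0, 30.0)  # CV of 30%
too_noisy = publication_status(100.0, 40.0)  # CV of 40%
print(precise, variable, too_noisy)
```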
The measure of non-response error and the coefficient of variation must be considered jointly to assess the quality of the estimates. The lower the coefficient of variation and the higher the response fraction, the better will be the published estimate.
The 2001 response rates:
Workplace - 85.9%
Employee - 86.9%
- The development and use of a Canadian linked employer-employee survey
- Message intended for Workplace Employee Survey microdata users - 2001