Biennial Environmental Protection Expenditures Survey

Detailed information for 2016




Frequency: Every 2 years

Record number:


The purpose of this survey is to obtain information on the expenditures made by industry to protect the environment in Canada. This information serves as an important indicator of Canadian investment in environmental protection.

Data release - Planned for Spring of 2019


The Environmental Protection Expenditures Survey provides a measure of the costs incurred by Canadian industries to comply with present or anticipated environmental regulations, conventions and voluntary agreements. The survey also collects information on the processes and practices adopted by industries to protect the environment.

Data from this survey are used by all levels of government for establishing informed environmental policies. The private sector also uses this information in the corporate decision-making process.

Statistical activity

The survey is administered as part of the Integrated Business Statistics Program (IBSP). The IBSP has been designed to integrate approximately 200 separate business surveys into a single master survey program. It aims to collect industry and product detail at the provincial level while minimizing overlap between different survey questionnaires. The redesigned business survey questionnaires have a consistent look, structure and content.

The integrated approach makes reporting easier for firms operating in different industries because they can provide similar information for each branch operation. This way they avoid having to respond to questionnaires that differ for each industry in terms of format, wording and even concepts. The combined results produce more coherent and accurate statistics on the economy.

Reference period: The calendar year or the 12-month fiscal period for which the final day occurs on or between April 1st of the reference year and March 31st of the following year.
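The reference period rule above can be illustrated with a short sketch. The function name `reference_year` is hypothetical and is not part of any Statistics Canada system; it only shows how a fiscal period's end date would map to a reference year under the stated rule.

```python
from datetime import date


def reference_year(fiscal_period_end: date) -> int:
    """Return the reference year for a 12-month period ending on the given date.

    Hypothetical helper: a period belongs to reference year Y when its final
    day falls between April 1 of Y and March 31 of Y + 1 (inclusive).
    """
    # Periods ending from January 1 to March 31 belong to the previous year.
    if fiscal_period_end.month <= 3:
        return fiscal_period_end.year - 1
    return fiscal_period_end.year


# A calendar-year reporter and a March fiscal-year-end reporter
# both fall in reference year 2016.
print(reference_year(date(2016, 12, 31)))
print(reference_year(date(2017, 3, 31)))
```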

Collection period: October through April of the year after the reference period.


  • Environment
  • Environmental protection

Data sources and methodology

Target population

The target population includes all establishments of 20 or more employees engaged in primary industries (resource extraction), manufacturing industries, the electric power generation, transmission, and distribution industry and the natural gas distribution industry.

The observed population comes from the Generic Survey Universe File (GSUF) created by Statistics Canada's Statistical Registers and Geography Division in April 2017. It contains all establishments in Canada existing in April 2017. From this frame, only establishments with 20 or more employees and a North American Industry Classification System (NAICS) code belonging to manufacturing industries (31-33), logging (except contract) (113311), oil and gas extraction (211), coal mining (2121), metal ore mining (2122), other non-metallic mineral mining and quarrying (21239), shale, clay and refractory mineral mining and quarrying (212326), electric power generation, transmission and distribution (2211) and natural gas distribution (2212) are retained.
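As a rough illustration of the frame restriction described above, the following sketch tests whether an establishment would be retained. The function and constant names are hypothetical, and NAICS codes are assumed to be stored as digit strings; the actual GSUF processing is not documented here.

```python
# NAICS prefixes retained from the frame, as listed in the text above.
RETAINED_NAICS_PREFIXES = (
    "31", "32", "33",   # manufacturing
    "113311",           # logging (except contract)
    "211",              # oil and gas extraction
    "2121",             # coal mining
    "2122",             # metal ore mining
    "21239",            # other non-metallic mineral mining and quarrying
    "212326",           # shale, clay and refractory mineral mining and quarrying
    "2211",             # electric power generation, transmission and distribution
    "2212",             # natural gas distribution
)
MIN_EMPLOYEES = 20


def in_observed_population(naics_code: str, employees: int) -> bool:
    """Return True if an establishment meets the size and industry criteria."""
    # str.startswith accepts a tuple and matches any of the prefixes.
    return employees >= MIN_EMPLOYEES and naics_code.startswith(RETAINED_NAICS_PREFIXES)


print(in_observed_population("221122", 45))  # in-scope industry, large enough
print(in_observed_population("311111", 12))  # in-scope industry, too small
```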

Instrument design

The questionnaire has undergone various transformations since its inception in 1994. The original material was developed by the Environmental Accounts and Statistics Division with assistance from Industry Canada. A pilot study involving a limited group of respondents was carried out as an initial test of content and terminology. The basic expenditure questions have remained relatively constant over the cycles, with the most significant changes involving the qualitative material on environmental management processes and technologies. Further testing was done in 2010 to ensure that the concepts were clearly understood and that respondents were able to answer the questions correctly.

The electronic questionnaire for the Environmental Protection Expenditures Survey has been in use since reference year 2016. Prior to its implementation, the questionnaire was tested on survey respondents to ensure that the concepts were clearly understood and that respondents were able to answer the questions correctly.


Sampling

This is a sample survey with a cross-sectional design.

The Generic Survey Universe File (GSUF) was used as the survey frame. A stratified Bernoulli sample of establishments classified to the North American Industry Classification System (NAICS) Canada 2012 and to geographical regions was selected.
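In Bernoulli sampling, each unit is selected independently with a fixed probability, so the realized sample size is random. A minimal sketch of a stratified version follows, assuming each stratum (here an industry and region pair) has its own selection rate. The function name, record layout and rates are hypothetical; the survey's actual strata and rates are not given in this document.

```python
import random


def stratified_bernoulli_sample(units, rates, seed=None):
    """Select each unit independently with its stratum's probability.

    units: list of dicts with a "stratum" key (hypothetical layout).
    rates: dict mapping stratum -> selection probability in [0, 1].
    Selected units are returned with a design weight of 1 / probability.
    """
    rng = random.Random(seed)
    sample = []
    for unit in units:
        p = rates[unit["stratum"]]
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"invalid selection probability: {p}")
        # rng.random() lies in [0, 1), so p == 1.0 always selects (must-take)
        # and p == 0.0 never selects.
        if rng.random() < p:
            sample.append({**unit, "design_weight": 1.0 / p})
    return sample


frame = [
    {"id": 1, "stratum": ("311", "Ontario")},
    {"id": 2, "stratum": ("311", "Ontario")},
    {"id": 3, "stratum": ("2211", "Quebec")},
]
rates = {("311", "Ontario"): 0.5, ("2211", "Quebec"): 1.0}
print(stratified_bernoulli_sample(frame, rates, seed=2016))
```

Because every draw is independent, two runs with different seeds can return samples of different sizes within the same stratum.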

Data sources

Data collection for this reference period: 2017-10-17 to 2018-04-30

Responding to this survey is mandatory.

Data are collected directly from survey respondents.

Data are collected using English and French electronic questionnaires. Respondents are contacted by email or letter and given an access code for the electronic questionnaire for the survey. The electronic questionnaires are addressed to a contact person who is either responsible for, or has knowledge of, the environment-related operations of the firm.

Telephone follow-up is used to obtain data from establishments that return incomplete questionnaires or fail to respond.

View the Questionnaire(s) and reporting guide(s).

Error detection

Many factors affect the accuracy of data produced in a survey. For example, respondents may make errors in interpreting questions, answers may be incorrectly entered on the questionnaires, and errors may be introduced during the data capture or tabulation process. Every effort is made to reduce the occurrence of such errors in the survey.

The electronic questionnaire contains edits that help respondents correct inconsistencies (e.g., a total variable that does not equal the sum of its parts). Historical edits are used to identify large year-over-year reporting changes.
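The two kinds of edit mentioned above can be sketched as simple checks. This is only an illustration: the function names, the tolerance and the 50% change threshold are assumptions, not the rules built into the questionnaire.

```python
def total_matches_parts(total, parts, tolerance=0.0):
    """Consistency edit: the reported total should equal the sum of its parts."""
    return abs(total - sum(parts)) <= tolerance


def large_year_over_year_change(current, previous, threshold=0.5):
    """Historical edit: flag a relative change larger than the threshold.

    The threshold of 0.5 (50%) is a hypothetical value for illustration.
    """
    if previous == 0:
        # No base for a relative change; flag any non-zero current value.
        return current != 0
    return abs(current - previous) / abs(previous) > threshold


# A total of 100 reported with parts 60 and 30 fails the consistency edit.
print(total_matches_parts(100, [60, 30]))
# Expenditures doubling from 100 to 200 are flagged by the historical edit.
print(large_year_over_year_change(200, 100))
```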

For returned data, the system verifies that all mandatory cells have been filled in, that certain values are within acceptable ranges, that questionnaire flow patterns have been respected, that percentages are converted to dollars, that certain transformed and derived variables are assigned values, and that blanks are set to zeroes where appropriate. Consistency edit rules are applied to the data for each usable record. These rules ensure that all the variables have valid responses and are complete and coherent both within the questionnaire and across questionnaires.

Further data checking is performed by subject matter officers who research companies (annual reports, web sites, etc.) in an effort to verify information submitted by respondents.

Outliers are identified after collection and outside of IBSP, and are removed from the imputation process.


Imputation

Statistical imputation is used for total non-response and partial non-response records. Five methods of imputation are used: deterministic imputation (there is only one possible value for the field to impute), historical imputation when available, imputation by ratio, donor imputation (using a nearest neighbour approach to find, for each record requiring imputation, the valid record that is most similar to it) and manual imputation. The criteria used for ratio and donor imputation are various combinations of industry group and geographical location (province, region, or Canada). Statistics Canada's generalized edit and imputation system (Banff) is used for this process. Usually, key variables are imputed first and are used as anchors in subsequent steps to impute other related variables.
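To make two of these methods concrete, here is a minimal sketch of ratio imputation and nearest-neighbour donor imputation within a single imputation class (for example, one industry group in one province). It is not the Banff implementation. The function names are hypothetical, and a single auxiliary variable (such as a size measure) is assumed as the basis for both the ratio and the distance.

```python
def ratio_impute(aux_value, respondents):
    """Impute a target value as aux_value times the class ratio.

    respondents: list of (auxiliary, target) pairs from valid records
    in the same imputation class.
    """
    total_aux = sum(aux for aux, _ in respondents)
    total_target = sum(target for _, target in respondents)
    if total_aux == 0:
        raise ValueError("auxiliary total is zero; ratio is undefined")
    return aux_value * total_target / total_aux


def nearest_neighbour_donor(aux_value, respondents):
    """Return the target value of the valid record closest on the auxiliary."""
    donor = min(respondents, key=lambda pair: abs(pair[0] - aux_value))
    return donor[1]


# Valid records: (auxiliary size measure, reported expenditure).
valid_records = [(100, 10), (200, 30), (300, 20)]
print(ratio_impute(150, valid_records))             # 150 * 60 / 600
print(nearest_neighbour_donor(190, valid_records))  # donor is the (200, 30) record
```

Ratio imputation smooths over the whole class, while donor imputation copies an actually observed value, which keeps imputed records plausible when the target variable is lumpy.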

Imputation generates a complete and coherent microdata file that covers all survey variables.


Estimation

The Generalized Estimation System (G-Est), developed at Statistics Canada, is used to produce domain estimates and quality indicators. It is a SAS-based application for producing estimates (totals, ratios, percentages) for domains of a population based on a sample. An initial sampling weight (the design weight) is calculated for each unit in the survey as the inverse of its probability of selection. This weight indicates how many units in the population the sampled unit represents, itself included. Sampling units that are selected with certainty (must-take units) have sampling weights of one and represent only themselves. Outlier units that are larger than expected are treated as misclassified: their weight is usually adjusted so that they represent only themselves, and the weights of the other units are adjusted accordingly. The final weights are usually either one or greater than one. Estimates are computed at several levels of interest, such as NAICS and region or province, based on the most recent classification information for the statistical entity and the survey reference period.
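The weighting logic can be illustrated with a small sketch. It is not G-Est, it ignores the outlier weight adjustments, and the names and record layout are hypothetical. It only shows a design weight computed as the inverse of the selection probability and a weighted total computed per domain.

```python
from collections import defaultdict


def design_weight(selection_probability):
    """Design weight: the inverse of the unit's probability of selection."""
    return 1.0 / selection_probability


def domain_totals(records):
    """Sum weight * value within each domain (e.g., industry by province).

    records: iterable of dicts with "domain", "value" and "weight" keys.
    """
    totals = defaultdict(float)
    for record in records:
        totals[record["domain"]] += record["weight"] * record["value"]
    return dict(totals)


sample = [
    # A must-take unit (probability 1) represents only itself.
    {"domain": "A", "value": 500, "weight": design_weight(1.0)},
    # A unit selected with probability 0.25 represents four units.
    {"domain": "A", "value": 100, "weight": design_weight(0.25)},
    {"domain": "B", "value": 50, "weight": design_weight(0.5)},
]
print(domain_totals(sample))
```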

Sampling error is measured by the coefficient of variation (CV), which is the standard error of an estimate expressed as a proportion (usually a percentage) of the estimate itself. The CVs are calculated and are indicated in the data tables.
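As a small worked example, the CV is conventionally computed as the standard error of an estimate divided by the estimate. The function below is a hypothetical helper that expresses it as a percentage; how the standard error itself is obtained depends on the sample design and is not shown.

```python
def cv_percent(estimate, standard_error):
    """Coefficient of variation as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("CV is undefined for an estimate of zero")
    return 100.0 * standard_error / abs(estimate)


# An estimated total of 1,000 with a standard error of 50.
print(cv_percent(1000, 50))
```

A lower CV indicates a more precise estimate; an estimate of 1,000 with a standard error of 50 has a CV of 5%.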

Quality evaluation

Data evaluation and error detection during data collection are important ways to ensure that the final estimates are of good quality. After collection, the survey results and estimates are evaluated as a further check on data quality. One way to assess data quality is to compare the data with the trends of other data collected. For example, in comparing the environmental protection expenditures with those of the previous cycle, some industry groups would show increases in expenditures, while others would show a decrease. Increased expenditures would be expected in industry groups that were subject to new environmental regulations or that had experienced a general increase in capital investments. On the other hand, some industries would have made large expenditures in the past to comply with regulations that came into effect before the reference year, and therefore would not be reporting as much in the current year. Instead, a shift in expenditures from capital to operating may be noticed, as firms now pay for the operation and maintenance of previously implemented processes and equipment. Respondents are given space on the questionnaire to briefly explain significant changes in the establishment's environmental protection expenditures (either increases or decreases compared to previous reporting periods).

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects that could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Statistics Canada's generalized G-Confid system is used to identify all data points that are at risk of disclosure (i.e., sensitive) as well as those data points that need to be suppressed to prevent the residual disclosure of the sensitive data points.

Revisions and seasonal adjustment

Revisions are made for the previous survey reference period, with the initial release of the current data, as required. The purpose is to address any significant issues with the data that were found between survey cycles. The actual period of revision depends on the nature of the issue, but rarely exceeds three years. For the most current data, please refer to CANSIM tables 153-0052 to 153-0056 and 153-0117 to 153-0120. The data are not seasonally adjusted.

Data accuracy

The accuracy of data collected in a sample survey is affected by both sampling and non-sampling errors. Sampling errors arise from the fact that the information obtained from a sample of the population is applied to the entire population. The sampling method, the estimation method, the sample size and the variability associated with each measured variable determine the sampling error. A possible measure of sampling error is the coefficient of variation (CV), which is the standard error of an estimate expressed as a proportion (usually a percentage) of the estimate itself. The CVs are calculated and are indicated in the data tables. Non-sampling errors arise from coverage error, data response error, non-response error, and processing errors. Every effort is made to reduce these types of errors, including verification of keyed data, consistency and validity edits, extensive follow-up and consultation with government departments and industry associations.

Data response error may be due to questionnaire design, the characteristics of a question, inability or unwillingness of the respondent to provide correct information, misinterpretation of the questions or definitional problems. These errors are controlled through careful questionnaire design and testing and the use of simple concepts and consistency checks.

Processing errors may occur at various stages of processing such as data entry, editing and tabulation. Measures have been taken to minimize these errors.

Non-response error results when respondents refuse to answer, are unable to respond or are too late in reporting. Missing data items are imputed for both partial non-response (i.e., when only some questions are left unanswered) and total non-response (i.e., when all questions from the survey are left unanswered).
