Survey of Environmental Protection Expenditures (SEPE)

Detailed information for 2014




Frequency: Every 2 years



This survey provides a measure of the expenditures made by industry operating in Canada for environmental protection in response to Canadian and international environmental regulations, conventions and voluntary agreements.

Data release - Planned for Spring of 2017


The Survey of Environmental Protection Expenditures provides a measure of the cost to Canadian industry to comply with present or anticipated environmental regulations, conventions and voluntary agreements. The survey also collects information on environmental management practices and environmental technologies used by industry for the purpose of preventing or abating pollution.

Data from the survey are used predominantly by policy analysts and by both industry and academic researchers. Data may be used to monitor expenditure trends within industry groups or regions for the purpose of developing business or research and development strategies, as well as to evaluate the effectiveness of policy.

Reference period: The calendar year or the 12-month fiscal period for which the final day occurs on or between April 1st of the reference year and March 31st of the following year.
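The reference period rule can be sketched in a few lines of Python. This is only an illustration of the date logic stated above; the function name `reference_year` is hypothetical and is not part of any Statistics Canada system.

```python
from datetime import date


def reference_year(fiscal_period_end: date) -> int:
    """Map the final day of a 12-month fiscal period to its reference year.

    A period ending between April 1 of year Y and March 31 of year Y + 1
    (inclusive) is assigned to reference year Y.
    """
    if fiscal_period_end.month >= 4:
        # April 1 through December 31: same calendar year.
        return fiscal_period_end.year
    # January 1 through March 31: belongs to the previous reference year.
    return fiscal_period_end.year - 1
```

Under this reading, the 2014 reference year covers fiscal periods ending from 2014-04-01 through 2015-03-31, as well as the 2014 calendar year.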

Collection period: October of the year following the reference year through April of the year after that.


  • Environment
  • Environmental protection

Data sources and methodology

Target population

The target population includes all establishments of 20 or more employees engaged in primary industries (resource extraction), manufacturing industries, the electric power generation, transmission, and distribution industry and the natural gas distribution industry.

The observed population comes from the Generic Survey Universe File (GSUF) created by Statistics Canada's Statistical Registers and Geography Division in April 2015. It contains all establishments in Canada existing in April 2015. From this frame, only establishments with 20 or more employees and a North American Industry Classification System (NAICS) code belonging to manufacturing industries (31-33), logging (except contract) (113311), oil and gas extraction (211), coal mining (2121), metal ore mining (2122), other non-metallic mineral mining and quarrying (21239), shale, clay and refractory mineral mining and quarrying (212326), electric power generation, transmission and distribution (2211) and natural gas distribution (2212) are retained.
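As a rough illustration, the frame selection rule above could be expressed as a filter on employee count and NAICS code. The sketch below assumes that a code "belongs" to an industry when it starts with the listed digits (prefix matching); that interpretation, and the names `IN_SCOPE_NAICS_PREFIXES` and `in_observed_population`, are assumptions made for the example rather than details taken from the GSUF.

```python
# NAICS codes listed in the text; shorter codes are treated as prefixes
# (an assumption made for this sketch).
IN_SCOPE_NAICS_PREFIXES = (
    "31", "32", "33",  # manufacturing industries (31-33)
    "113311",          # logging (except contract)
    "211",             # oil and gas extraction
    "2121",            # coal mining
    "2122",            # metal ore mining
    "21239",           # other non-metallic mineral mining and quarrying
    "212326",          # shale, clay and refractory mineral mining and quarrying
    "2211",            # electric power generation, transmission and distribution
    "2212",            # natural gas distribution
)
MIN_EMPLOYEES = 20


def in_observed_population(naics_code: str, employees: int) -> bool:
    """Return True if an establishment passes both retention criteria."""
    return employees >= MIN_EMPLOYEES and naics_code.startswith(
        IN_SCOPE_NAICS_PREFIXES
    )
```

For instance, an establishment coded 211110 with 19 employees would be dropped by the size rule, while one coded 212326 with exactly 20 employees would be retained.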

Instrument design

The questionnaire has undergone various transformations since its inception in 1994. The original material was developed by the Environmental Accounts and Statistics Division with assistance from Industry Canada. A pilot study involving a limited group of respondents was carried out to test content and terminology. The basic expenditure questions have remained relatively constant over the cycles, with the most significant changes involving the qualitative material on environmental management processes and technologies. Further testing was done in 2010 to ensure that the concepts were clearly understood and that respondents were able to answer the questions correctly.


This is a sample survey with a cross-sectional design.

The GSUF was used as the survey frame. A stratified sample of establishments classified to the North American Industry Classification System (NAICS) Canada 2012 and to geographical regions was selected.

Data sources

Data collection for this reference period: 2015-10-15 to 2016-04-29

Responding to this survey is mandatory.

Data are collected directly from survey respondents.

Data were collected using a paper questionnaire mailed to the respondent. A letter explaining the purpose of the survey, the requested return date and the legal requirement to respond was included in the mail-out package. The questionnaire is available in English and French.

The questionnaires were addressed to a contact person who is either responsible for, or has knowledge of, the environment-related operations of the firm. In the case of some multi-establishment firms, the survey was mailed to the head office which either forwarded the questionnaires to the appropriate establishments or gathered the information and completed the questionnaires or a combined report for all targeted establishments.

Telephone follow-up was used to obtain data from establishments that returned incomplete questionnaires or failed to respond. Information was automatically captured and entered into a database using image character recognition software. This process also applied edit checks, which served to flag real or potential response errors. Telephone follow-up was performed to verify information in cases where edit checks failed.

View the questionnaire(s) and reporting guide(s).

Error detection

Many factors affect the accuracy of data produced in a survey. For example, respondents may have misinterpreted questions, answers may have been incorrectly entered on the questionnaires, and errors may have been introduced during the data capture or tabulation process. Every effort was made to reduce the occurrence of such errors in the survey.

Returned data were first checked using an automated edit-check program immediately after capture. This first procedure verified that all mandatory cells had been filled in, that certain values were within acceptable ranges, that questionnaire flow patterns had been respected, and that totals equalled the sum of their components. Collection officers evaluated the edit failures and concentrated follow-up efforts accordingly. Consistency edit rules were then applied to the data for each usable record. These rules ensured that all the variables had valid responses and were complete and coherent both within the questionnaire and across questionnaires.
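Three of the checks described above (mandatory cells, acceptable ranges, and totals equal to the sum of their components) can be sketched as follows. This is a minimal illustration, not the edit-check program actually used for the survey; the function name, the failure labels and the cell names in the usage note are hypothetical, and questionnaire flow checks are left out.

```python
def run_edit_checks(record, mandatory, ranges, totals):
    """Return a list of edit failures for one captured questionnaire.

    record:    dict mapping cell name to a number, or None if left blank
    mandatory: list of cell names that must be filled in
    ranges:    dict mapping cell name to an (low, high) acceptable range
    totals:    dict mapping a total cell to the list of its component cells
    """
    failures = []

    # Mandatory cells must not be blank.
    for cell in mandatory:
        if record.get(cell) is None:
            failures.append(f"missing:{cell}")

    # Reported values must fall inside their acceptable range.
    for cell, (low, high) in ranges.items():
        value = record.get(cell)
        if value is not None and not (low <= value <= high):
            failures.append(f"range:{cell}")

    # A total must equal the sum of its components when all are reported.
    for total_cell, parts in totals.items():
        values = [record.get(cell) for cell in [total_cell, *parts]]
        if None not in values and values[0] != sum(values[1:]):
            failures.append(f"total:{total_cell}")

    return failures
```

A record such as `{"capital": 100, "operating": 50, "total": 160}` checked with `totals={"total": ["capital", "operating"]}` would return `["total:total"]`, which a collection officer could then use to prioritize follow-up.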

If a record had no response for at least one mandatory cell after editing, the record was not processed any further and was considered a total non-response.

Further data checking was performed by subject matter officers, who researched companies (annual reports, websites, etc.) in an effort to verify information submitted by respondents.

Outliers were identified after collection and were excluded from the imputation process.


Statistical imputation was used for partial non-response records. Five methods of imputation were used: deterministic imputation (there is only one possible value for the field to impute), historical imputation when available, imputation by ratio, donor imputation (using a nearest neighbour approach to find, for each record requiring imputation, the valid record that is most similar to it) and manual imputation. The criteria used for ratio and donor imputation were various combinations of industry group and geographical location (province, region, or Canada). Statistics Canada's generalized edit and imputation system (Banff) was used for this process.
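To make two of these methods concrete, the sketch below shows ratio imputation and nearest neighbour donor imputation within a single imputation class (for example, one industry group in one province). It is a simplified illustration and does not reproduce Banff: the function names are hypothetical, and using a single auxiliary variable (such as employment) to measure similarity is an assumption of the example.

```python
def ratio_impute(aux_value, donors):
    """Impute a missing value as aux_value times the class-level ratio.

    donors: list of (auxiliary, target) pairs from complete records
            in the same imputation class.
    """
    total_aux = sum(aux for aux, _ in donors)
    total_target = sum(target for _, target in donors)
    return aux_value * total_target / total_aux


def nearest_neighbour_impute(aux_value, donors):
    """Copy the target value of the donor closest on the auxiliary variable."""
    _, target = min(donors, key=lambda donor: abs(donor[0] - aux_value))
    return target
```

With donors `[(10, 100.0), (20, 300.0), (50, 400.0)]` and a recipient whose auxiliary value is 45, the ratio method gives 450.0 (the class ratio is 800 / 80 = 10), while the nearest neighbour method copies 400.0 from the donor with auxiliary value 50.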


Estimates for the target population were calculated by multiplying the response values for the sampled units by their sampling weight. The sampling weight was calculated using a number of factors, including the probability of the unit being selected in the sample. The weights were adjusted to account for respondents who could not be contacted or who refused to respond to the survey.

Totals, ratios and percentages were estimated with the Horvitz-Thompson estimator and the Generalized Estimation System (GES).
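The weighting step can be illustrated with a small sketch of the Horvitz-Thompson estimator, in which each sampled unit is weighted by the inverse of its selection probability. The function names are hypothetical, and the sketch leaves out the non-response adjustment and everything else the GES does.

```python
def design_weight(selection_probability):
    """Sampling weight as the inverse of the unit's selection probability."""
    return 1.0 / selection_probability


def ht_total(values, weights):
    """Horvitz-Thompson estimate of a population total."""
    return sum(weight * value for value, weight in zip(values, weights))


def ht_ratio(numerators, denominators, weights):
    """Ratio estimate built from two Horvitz-Thompson totals."""
    return ht_total(numerators, weights) / ht_total(denominators, weights)
```

For three sampled units reporting 10, 20 and 30 with selection probabilities 0.5, 0.25 and 1.0, the weights are 2, 4 and 1, and the estimated total is 2 × 10 + 4 × 20 + 1 × 30 = 130.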

Sampling error was measured by the coefficient of variation (CV), which expresses the standard error of an estimate as a proportion of the estimate itself. The CVs were calculated and are indicated in the data tables.
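Taking the CV in its usual sense, as the standard error of an estimate divided by the estimate, the calculation is a one-liner. The sketch below assumes a variance estimate is already available; how that variance is obtained under the survey's stratified design is not shown, and the function name is hypothetical.

```python
import math


def coefficient_of_variation(estimate, variance):
    """Return the CV: the standard error as a proportion of the estimate."""
    return math.sqrt(variance) / estimate
```

An estimated total of 200 with an estimated variance of 100 has a standard error of 10, and therefore a CV of 0.05, or 5%.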

Quality evaluation

Evaluating data and detecting errors during collection are important ways to ensure that the final estimates are of good quality. After collection, the survey results and estimates are evaluated again as a further check on data quality. One way to assess data quality is to compare the data with trends in other collected data. For example, when environmental protection expenditures are compared with those of the previous cycle, some industry groups show increases in expenditures while others show decreases. Increased expenditures would be expected in industry groups that were subject to new environmental regulations or that had experienced a general increase in capital investments. On the other hand, some industries would have made large expenditures in the past to comply with regulations that came into effect before the reference year, and would therefore report less in the current year. In such cases, a shift in expenditures from capital to operating may be observed, as firms now pay for the operation and maintenance of previously implemented processes and equipment. The questionnaire gives respondents space to briefly explain significant changes (increases or decreases relative to previous reporting periods) in the establishment's environmental protection expenditures.

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects that could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Statistics Canada's generalized G-Confid system is used to identify all data points that are confidential, as well as the additional data points that need to be suppressed to prevent residual disclosure of those confidential data points.

Revisions and seasonal adjustment

Revisions are made for the previous survey reference period, with the initial release of the current data, as required. The purpose is to address any significant issues with the data that were found between survey cycles. The actual period of revision depends on the nature of the issue, but rarely exceeds three years. For the most current data please refer to CANSIM tables 153-0052 to 153-0056 and 153-0117 to 153-0120. The data are not seasonally adjusted.

Data accuracy

The accuracy of data collected in a sample survey is affected by both sampling and non-sampling errors. Sampling errors arise from the fact that the information obtained from a sample of the population is applied to the entire population. The sampling method, the estimation method, the sample size and the variability associated with each measured variable determine the sampling error. One measure of sampling error is the coefficient of variation (CV), which expresses the standard error of an estimate as a proportion of the estimate itself. The CVs were calculated and are indicated in the data tables. Non-sampling errors arise from coverage error, data response error, non-response error and processing error. Every effort was made to reduce these types of errors, including verification of keyed data, consistency and validity edits, extensive follow-up and consultation with government departments and industry associations.

Data response error may be due to questionnaire design, the characteristics of a question, inability or unwillingness of the respondent to provide correct information, misinterpretation of the questions or definitional problems. These errors are controlled through careful questionnaire design and testing and the use of simple concepts and consistency checks.

Processing errors may occur at various stages of processing such as data entry, editing and tabulation. Measures have been taken to minimize these errors.

Non-response error results when respondents refuse to answer, are unable to respond or are too late in reporting. Total non-response, i.e., when all questions from the survey are left unanswered, was dealt with by adjusting the weights assigned to the responding records, such that one responding record might also represent other non-responding units with similar characteristics (i.e., size, province, industry). Missing data items were imputed for partial non-responses (i.e., when only some questions were left unanswered).
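The weight adjustment for total non-response can be sketched as a reweighting within classes of similar units. In the example below, respondents in each class absorb the weight of the non-respondents in that class, so the class's total weight is preserved. The function name, the dictionary keys and the way classes are labelled are hypothetical, and the sketch assumes every class contains at least one respondent.

```python
from collections import defaultdict


def adjust_for_nonresponse(units):
    """Return responding units with weights inflated within each class.

    units: list of dicts with keys "cls" (adjustment class, e.g. a
           size/province/industry combination), "weight" and "responded".
    """
    total_weight = defaultdict(float)
    respondent_weight = defaultdict(float)
    for unit in units:
        total_weight[unit["cls"]] += unit["weight"]
        if unit["responded"]:
            respondent_weight[unit["cls"]] += unit["weight"]

    # Each respondent's weight is scaled by (all units) / (respondents)
    # within its class; non-respondents are dropped.
    return [
        {
            **unit,
            "weight": unit["weight"]
            * total_weight[unit["cls"]]
            / respondent_weight[unit["cls"]],
        }
        for unit in units
        if unit["responded"]
    ]
```

If a class holds three units with weight 2 and one of them does not respond, the two respondents each end up with weight 3, so together they still represent six establishments.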
