Industrial Water Survey (IWS)

Detailed information for 2007

Status: Active

Frequency: Every 2 years

Record number: 5120

This survey is being conducted to provide Canadians with national and regional indicators related to the use of water in industry.

Data release - March 16, 2010

Description

This survey provides information about the intake, costs, sources, treatments and discharge of water used for the manufacturing, mining and thermal-electric power generating industries in Canada. These data are used in the development of environmental accounts and fulfill the requirements for producing water-related indicators as part of the Canadian Environmental Sustainability Indicators (CESI) published by Environment and Climate Change Canada.

Reference period: Calendar year

Subjects

  • Environment
  • Environmental quality

Data sources and methodology

Target population

The target population consists of locations primarily engaged in manufacturing, coal mining, metal ore mining, non-metallic mineral mining (excluding sand and gravel quarrying), nuclear electric power generation and fossil-fuel electric power generation. The population size was 31,101 manufacturing locations (NAICS 31 - 33), 696 mining locations (NAICS 2121, 2122, 2123, excl. 21232) and 112 thermal-electric power generating plants (NAICS 221112, 221113).

Instrument design

The Industrial Water Survey uses three separate questionnaires to collect data from respondents. A separate questionnaire was designed for each of the three sectors being surveyed, i.e. one for manufacturing, one for the mineral extraction industries and another for the thermal-electric power generators.

The questionnaires collect data on the volume of water brought into the facility, including the source, purpose, treatment and possible recirculation of this water. Data are also collected on the volumes of water discharged and on the treatment of this discharged water. Cost information on the intake and discharge of water is collected as well.

The questionnaires were developed in collaboration with data users in order to meet their statistical needs. Respondents were also consulted through individual meetings to ensure the information being asked was available and that the questionnaire could be filled out within a reasonable timeframe.

Sampling

This is a sample survey with a cross-sectional design.

The frame used for sampling purposes is the Statistics Canada Business Register, with the observed population comprised of all manufacturing, selected mining and all thermal-electric locations. The statistical unit is the location. The population size is approximately 97,000 manufacturing locations (NAICS 31 - 33), 800 mining locations (NAICS 2121, 2122, 2123, excl. 21232) and 100 thermal-electric power generating plants (NAICS 221112, 221113).

There is an independent sampling strategy for each of the three sectors.

The sample for the thermal-electric sector is a census of the approximately 100 electric power stations. A probability design is used for sample selection in the manufacturing and mineral extraction sectors.

In the mining sector, establishments are stratified by province, by 4-digit NAICS industry and by size (revenues). All multi-locations (more than one location for one establishment) and all locations identified as employers of 50 persons or more were selected as "must-take" units, and the rest of the population was sampled with varying sampling fractions, depending on the industry. All of the approximately 350 in-sample units receive a questionnaire.

In the manufacturing sector, establishments are stratified by major river drainage region, by industry and by size (shipments). To reduce response burden on small units, the smallest units of the industries of interest are excluded from sampling: in each combination of industries, locations that make up the bottom 5% of the size measure by major river basin were excluded. Some specific industries, identified as large consumers of water, are selected with certainty; the rest of the population is sampled with varying sampling fractions, depending on the industry. All of the approximately 5,150 in-sample units receive a questionnaire.
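The selection logic described above (a size cut-off, "must-take" units and a random draw from the remainder of each stratum) can be sketched as follows. This is an illustration only, not the procedure actually used by Statistics Canada: the function name, the record fields (id, stratum, size, must_take), the per-stratum sampling fractions and the use of simple random sampling within strata are all assumptions made for the example.

```python
import random
from collections import defaultdict


def select_sample(units, fractions, cutoff_share=0.05, seed=0):
    """Return (unit id, design weight) pairs for a stratified cut-off sample.

    units: list of dicts with hypothetical keys id, stratum, size, must_take.
    fractions: hypothetical sampling fraction for each stratum.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[unit["stratum"]].append(unit)

    selected = []
    for stratum, members in strata.items():
        members = sorted(members, key=lambda u: u["size"])
        # The smallest units that together hold the bottom share of the
        # size measure fall below the cut-off and are never sampled.
        threshold = cutoff_share * sum(u["size"] for u in members)
        running = 0.0
        take_all, take_some = [], []
        for unit in members:
            running += unit["size"]
            if unit["must_take"]:
                take_all.append(unit)
            elif running > threshold:
                take_some.append(unit)

        # Draw a simple random sample from the remaining eligible units.
        n = 0
        if take_some:
            n = min(len(take_some),
                    max(1, round(fractions[stratum] * len(take_some))))
        drawn = rng.sample(take_some, n)

        # Must-take units represent only themselves (weight 1); each drawn
        # unit represents len(take_some) / n eligible units.
        selected += [(u["id"], 1.0) for u in take_all]
        selected += [(u["id"], len(take_some) / n) for u in drawn]
    return selected
```

In this sketch the units below the cut-off receive no weight at all; their contribution would have to be covered at the estimation stage.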

Data sources

Data collection for this reference period: 2008-05-01 to 2008-12-31

Responding to this survey is mandatory.

Data are collected directly from survey respondents.

Mail out occurs in April of the year following the reference year and is usually directed to an "environment manager or coordinator". Respondents are asked to return the completed questionnaires within thirty days of receipt. Fax reminders are sent to respondents whose questionnaires are outstanding 45 days after the mail out. Collection is generally completed no later than December of the year following the reference year.

Error detection

A number of factors can affect the accuracy of data produced in a survey. For example, respondents may make errors in interpreting questions, answers may be incorrectly entered on the questionnaires, and errors may be introduced during the data capture or tabulation process. Every effort is made to reduce the occurrence of such errors in the survey.

Upon receipt, questionnaires were scanned using an imaging system that captured the data for transfer into a database. Captured data were first checked using an automated edit-check program (BLAISE). This program verified that all mandatory cells were filled in, certain values fell within acceptable ranges, questionnaire flow patterns were respected and totals equalled the sum of their components. Collection officers evaluated the edit failures and concentrated follow-up efforts accordingly. Follow-up for non-response and for data validation was conducted by telephone or fax.
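The four kinds of checks listed above (mandatory cells, acceptable ranges, questionnaire flow and totals equal to the sum of their components) might look roughly like the sketch below. It is not the BLAISE edit program; the field names, the tolerance and the single flow rule are hypothetical and chosen only to illustrate each type of edit.

```python
def check_record(record, tolerance=0.5):
    """Return a list of edit failures for one captured questionnaire.

    All field names are hypothetical placeholders.
    """
    failures = []

    # Mandatory cells must be filled in.
    for field in ("intake_total", "discharge_total"):
        if record.get(field) is None:
            failures.append(f"missing:{field}")

    # Range check: reported volumes cannot be negative.
    for field, value in record.items():
        if isinstance(value, (int, float)) and value < 0:
            failures.append(f"range:{field}")

    # Flow check: a facility reporting no recirculation should not
    # report a recirculated volume.
    if record.get("recirculates") == "no" and record.get("recirculated_volume"):
        failures.append("flow:recirculated_volume")

    # Totals must equal the sum of their components (within a tolerance).
    parts = [record.get(k)
             for k in ("intake_surface", "intake_ground", "intake_public")]
    total = record.get("intake_total")
    if total is not None and all(p is not None for p in parts):
        if abs(total - sum(parts)) > tolerance:
            failures.append("total:intake_total")

    return failures
```

A record that returns an empty list passes all edits; any other result would be routed to a collection officer for follow-up.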

Further data checking was performed by subject matter officers, who compared returned data with historical data to determine whether differences between survey cycles were reasonable. If they were not, collection officers were asked to confirm the responses with respondents. Subject matter officers also researched companies (using annual reports, web sites, etc.) in an effort to verify information submitted by respondents.

Imputation

Statistical imputation is used for partial response records. Five methods of imputation were used for the Industrial Water Survey: deterministic imputation (only one possible value for the field to impute), historical imputation, imputation by ratio, donor imputation (using a "nearest neighbour" approach to find a valid record that is most similar to the record requiring imputation) and manual imputation. Ratios were calculated and donors were selected for imputation purposes based on the same or closest industry group within specified geographic areas.
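Two of these methods, imputation by ratio and nearest-neighbour donor imputation, can be illustrated with a short sketch. The function names, the field names and the unscaled absolute-difference distance are assumptions for the example; the survey's actual imputation specifications are not given here. The caller is assumed to have already restricted the respondents or donors to the same or closest industry group and geographic area.

```python
def ratio_impute(record, respondents, target, auxiliary):
    """Impute record[target] as (class ratio of target to auxiliary) x auxiliary.

    respondents: complete records from the same imputation class.
    """
    numerator = sum(r[target] for r in respondents)
    denominator = sum(r[auxiliary] for r in respondents)
    return record[auxiliary] * numerator / denominator


def nearest_donor(record, donors, match_fields):
    """Return the donor closest to the record on the matching fields.

    Distance is a plain sum of absolute differences; in practice the
    fields would usually be scaled first (an assumption of this sketch).
    """
    def distance(donor):
        return sum(abs(donor[f] - record[f]) for f in match_fields)

    return min(donors, key=distance)
```

For example, if respondents in a class report 200 units of intake for 400 units of shipments in total, a record with shipments of 40 would be imputed an intake of 20.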

Estimation

The estimates are calibrated to the size measure variable (shipments or revenues) to account for the uncovered portion of each industry that was excluded from the sample. The response values for sampled units were multiplied by a sampling weight in order to estimate for the entire population (including the uncovered portion, i.e. those below the sampling cut-off). The sampling weight was calculated using a number of factors, including the probability of the unit being selected in the sample. A raising factor (weight) adjustment was used in the estimation process to account for the uncovered portion and for respondents who could not be contacted or who refused to complete the survey.
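One simple way to calibrate weighted estimates to a size measure is a ratio adjustment, sketched below. It assumes that the total of the size measure (shipments or revenues) is known for the whole population, including units below the cut-off; the function name, the field names and the single overall factor are illustrative assumptions, and the production system may calibrate differently.

```python
def calibrated_total(sample, population_size_total):
    """Estimate a population total, calibrated to a known size-measure total.

    sample: list of dicts with hypothetical keys weight, size, value.
    population_size_total: known total of the size measure for the whole
    population, including the uncovered portion.
    Returns (calibrated estimate, raising factor).
    """
    # Weighted size total represented by the sample alone.
    weighted_size = sum(u["weight"] * u["size"] for u in sample)

    # The raising factor scales the weights so that the weighted size
    # measure reproduces the known population total.
    factor = population_size_total / weighted_size

    estimate = sum(factor * u["weight"] * u["value"] for u in sample)
    return estimate, factor
```

If the sample represents a size total of 580 and the population total is 609, every weight is raised by a factor of about 1.05, so the uncovered units are assumed to use water in proportion to their size.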

Quality evaluation

When the Industrial Water Survey was reinstituted for reference year 2005, it had been almost ten years since the survey had last been conducted. In addition to the extended lapse of time between survey years, the use of different industrial classification systems and different sampling strategies between the survey years made historical comparisons difficult. Reported data for 2005 were evaluated for consistency within the reporting unit and within a reporting unit's industry. With the survey being conducted again for reference year 2007, however, a comparison of the two years became possible. An important result of this comparison was the discovery of inconsistencies between the 2005 and 2007 results of the survey. Most of these inconsistencies were the result of response errors by some respondents in 2005 that were not adjusted at the time, as it is now concluded they should have been. Additionally, classification errors led to problems related to imputations for missing data. Revisions to the 2005 data have been made and the revised results are available.

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects that could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

In order to prevent any data disclosure, confidentiality analysis is done using the Statistics Canada Generalized Disclosure Control System (G-Confid). G-Confid is used for primary suppression (direct disclosure) as well as for secondary suppression (residual disclosure). Direct disclosure occurs when the value in a tabulation cell is composed of or dominated by a few enterprises, while residual disclosure occurs when confidential information can be derived indirectly by piecing together information from different sources or data series.
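The idea of a cell being "composed of or dominated by a few enterprises" is often expressed as a minimum-contributor and dominance rule. The sketch below shows a generic rule of that kind. It is not G-Confid, and the thresholds are hypothetical: the parameters actually applied to this survey are not stated here.

```python
def needs_primary_suppression(contributions, min_contributors=3,
                              top_n=1, max_share=0.8):
    """Flag a tabulation cell at risk of direct disclosure.

    contributions: values contributed by each enterprise in the cell.
    All thresholds are hypothetical illustration values.
    """
    # Too few contributors: the cell is composed of only a few enterprises.
    if len(contributions) < min_contributors:
        return True

    total = sum(contributions)
    if total <= 0:
        return False

    # Dominance: the largest contributors account for too much of the total.
    largest = sum(sorted(contributions, reverse=True)[:top_n])
    return largest / total > max_share
```

Secondary suppression, which protects against recovering a suppressed cell from row and column totals, is a separate optimization problem and is not covered by this sketch.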

Revisions and seasonal adjustment

This methodology does not apply to this survey.

Data accuracy

Sampling errors arise from the fact that the information obtained from a sample of the population is applied to the entire population. The sampling method, the estimation method, the sample size and the variability associated with each measured variable determine the sampling error. A possible measure of sampling error is the coefficient of variation (CV), which expresses the standard error of an estimate as a percentage of the estimate itself. For the Industrial Water Survey, CVs were calculated for the major variables and are indicated on the data tables. This information is available in the Statistics Canada publication entitled "Industrial Water Use, 2007" (catalogue number 16-401-X).
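As an illustration of how a CV is obtained, the sketch below computes the CV of an estimated total under stratified simple random sampling, using the standard textbook variance formula. The function name and input layout are assumptions, and the survey's own variance estimator (which must also reflect calibration and imputation) may differ.

```python
import math
import statistics


def cv_of_total(strata):
    """Return the CV (in percent) of an estimated total.

    strata: list of (population_count, sampled_values) pairs, one per
    stratum, assuming simple random sampling within each stratum and at
    least two sampled values in every stratum that is not a census.
    """
    estimate = 0.0
    variance = 0.0
    for population_count, values in strata:
        n = len(values)
        estimate += population_count * statistics.mean(values)
        # Take-all strata (n equal to the population count) add no
        # sampling variance.
        if n < population_count:
            s2 = statistics.variance(values)
            variance += population_count ** 2 * (1 - n / population_count) * s2 / n
    return 100 * math.sqrt(variance) / estimate
```

A stratum that is fully enumerated, such as the thermal-electric census, contributes to the estimate but not to the sampling variance, which lowers the overall CV.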

Data response error may be due to questionnaire design, the characteristics of a question, inability or unwillingness of the respondent to provide correct information, misinterpretation of the questions or definitional problems. These errors are controlled through careful questionnaire design and testing and the use of simple concepts and consistency checks.

Processing errors may occur at various stages of processing such as data entry, editing and tabulation. Measures have been taken to minimize these errors.

Non-response error results when respondents refuse to answer, are unable to respond or are too late in reporting. Total non-response, i.e. when all questions from the survey are left unanswered, is dealt with by adjusting the weights assigned to the responding records, such that one responding record might also represent other non-responding units with similar characteristics (i.e. size, province, industry). Missing data items are imputed for partial non-responses (i.e. when only some questions are left unanswered).
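The weight adjustment for total non-response can be sketched as follows: within each class of similar units, the weights of respondents are inflated so that they also carry the weight of the non-respondents. The function name, the field names and the way classes are formed are assumptions for the example.

```python
from collections import defaultdict


def adjust_for_nonresponse(sample):
    """Return adjusted weights for responding units, keyed by unit id.

    sample: list of dicts with hypothetical keys id, cls (adjustment
    class, e.g. a size/province/industry combination), weight, responded.
    """
    sampled = defaultdict(float)
    responded = defaultdict(float)
    for unit in sample:
        sampled[unit["cls"]] += unit["weight"]
        if unit["responded"]:
            responded[unit["cls"]] += unit["weight"]

    # Within each class, respondents' weights are scaled by
    # (weight of all sampled units) / (weight of responding units).
    return {
        unit["id"]: unit["weight"] * sampled[unit["cls"]] / responded[unit["cls"]]
        for unit in sample
        if unit["responded"]
    }
```

The adjusted weights in a class sum to the original weight of all sampled units in that class. A class with no respondents at all gets no adjusted weights and would need to be merged with a similar class first.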

In the 2007 reference year, the response rate was 72% for the manufacturing component of the survey, 79% for the mining component and 92% for the thermal-electric component.
