This survey collects the financial and commodity information used to compile statistics on Canada's manufacturing and logging industries.
Data release – April 3, 2013 (first in a series of releases; see The Daily)
The Annual Survey of Manufactures and Logging (ASML) is a survey of the manufacturing and logging industries in Canada. It is intended to cover all establishments primarily engaged in manufacturing and logging activities, as well as the sales offices and warehouses which support these establishments.
The details collected include principal industrial statistics (such as revenue, employment, salaries and wages, cost of materials and supplies used, cost of energy and water utility, inventories, etc.), as well as information about the commodities produced and consumed.
Data collected by the Annual Survey of Manufactures and Logging are important because they help measure the production of Canada's industrial and primary resource sectors, as well as provide an indication of the well-being of each industry covered by the survey and its contribution to the Canadian economy. Within Statistics Canada, the data are used by the Canadian System of National Accounts, the Monthly Survey of Manufacturing (record number 2101) and Prices programs. The data are also used by the business community, trade associations, federal and provincial departments, as well as international organizations and associations to profile the manufacturing and logging industries, undertake market studies, forecast demand and develop trade and tariff policies.
The target population comprises all establishments primarily engaged in manufacturing and logging activities. Data for reference years 2004 to 2006 were classified by industry based on the North American Industry Classification System (NAICS) 2002. Beginning with reference year 2007, the data are classified by industry based on NAICS 2007. Under NAICS, logging establishments are classified to NAICS 1133 and manufacturing establishments to NAICS sectors 31, 32 and 33.
The Annual Survey of Manufactures and Logging uses only one detailed paper questionnaire to collect data from respondents. This questionnaire is sent to all establishments above certain thresholds that vary by province, by industry and by survey year.
One hundred eighty-five separate "templates" have been developed based on the ASML questionnaire, one for each 5-digit industry defined in the North American Industry Classification System (NAICS). Each template contains questions asking for standard financial data, as well as data on the commodities that each establishment in a given industry consumed and produced during the reference year. The commodity list in each template includes the commodities that typically account for the largest shares of total outputs produced and inputs used by the 5-digit NAICS industry. These templates are available upon request.
The questionnaire was developed in collaboration with data users in order to meet their statistical needs. Respondents and industry associations were also consulted through focus groups and individual meetings to ensure that the information being asked was available and that the questionnaire could be filled out within a reasonable timeframe.
This is a sample survey with a cross-sectional design.
The frame used for sampling purposes is the Statistics Canada Business Register. The statistical unit is the establishment. The survey population includes all manufacturing and logging establishments above certain thresholds that vary by province, by industry and by reference year.
A sample of establishments is selected from among units in the survey population based on a one-phase probability sampling plan. Establishments are stratified by province, by industry and by revenue. "Take-all" units are selected based on their complexity, their size and their importance in their industry. A "take-some" sample is also drawn. All sampled units receive questionnaires.
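The selection described above can be illustrated with a minimal sketch. It rests on several assumptions that are not part of the survey documentation: take-all status is reduced to a single hypothetical revenue cutoff (the survey also considers complexity and importance), take-some units are drawn by simple random sampling at a hypothetical fixed fraction, and the frame records and the function name `select_sample` are invented for illustration.

```python
import random
from collections import defaultdict


def select_sample(units, take_all_revenue, take_some_fraction, seed=0):
    """Return (unit, sampling_weight) pairs for a one-phase stratified design."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    selected = []
    for unit in units:
        if unit["revenue"] >= take_all_revenue:
            # Take-all unit: selected with certainty, represents only itself.
            selected.append((unit, 1.0))
        else:
            strata[(unit["province"], unit["naics"])].append(unit)
    for members in strata.values():
        # Take-some stratum: simple random sample of n_h out of N_h units,
        # each carrying a weight of N_h / n_h.
        n = max(1, round(take_some_fraction * len(members)))
        for unit in rng.sample(members, n):
            selected.append((unit, len(members) / n))
    return selected


# Hypothetical frame: five establishments in one province and industry.
frame = [
    {"id": i, "province": "ON", "naics": "32111", "revenue": rev}
    for i, rev in enumerate([10, 20, 30, 40, 1000])
]
sample = select_sample(frame, take_all_revenue=500, take_some_fraction=0.5)
print([(unit["id"], weight) for unit, weight in sample])
```

In this sketch the sampling weights always sum to the number of establishments on the frame, which is a quick check that every unit is represented exactly once.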
Financial data are obtained from administrative files for non-sampled units in the target population. The proportions of data collected with questionnaires and obtained from administrative files vary by province, industry and reference year. These proportions vary based on the resources available, as well as the survey's target coverage at the national, provincial and industry levels. Using administrative files, where possible, reduces both the survey response burden and data collection costs, while maintaining the necessary level of accuracy.
Responding to this survey is mandatory.
Data are collected directly from survey respondents and extracted from administrative files.
Data are collected annually using a mail-out/mail-back questionnaire. Respondents are asked to return the completed questionnaires within thirty days of receipt. Upon receipt, the questionnaires are imaged and their data are captured using key-from-image technology. Preliminary editing is also performed to ensure the validity of the collected data. Follow-up for non-response and for data validation is conducted by telephone or fax.
Sampled units are prioritized for follow-up and editing using a tool called the "score function". This tool divides units into three categories (priorities 1, 2 and 3), largely based on the estimated value of their sales. Priority 1 records are re-contacted for non-response, and all attempts are made to elicit a response because their contribution to the survey estimates (their score) is significant. For priority 2 units, re-contact is based on score, with the highest-scoring units contacted first. The units in this category are substitutable for one another: a priority 2 unit that cannot be contacted for follow-up is skipped, and the next highest-scoring unit or units in the same group are followed up instead. The replacement unit(s) must represent the same contribution to the collection target coverage as the skipped unit. Full edits are applied to all priority 1 and 2 units required to attain the target coverage level for collection. The remaining respondents and non-respondents (some priority 2 and all priority 3 units) are subject to minimal collection follow-up and edits.
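As a rough illustration of this kind of prioritization, the sketch below scores each unit by its share of estimated sales and sorts the follow-up queue accordingly. The actual score function is not specified in this document; the share thresholds, the unit labels and the function names are hypothetical.

```python
def assign_priorities(est_sales, p1_share=0.10, p2_share=0.02):
    """Map each unit to (priority, score), where score is its share of sales."""
    total = sum(est_sales.values())
    result = {}
    for unit, sales in est_sales.items():
        score = sales / total
        if score >= p1_share:
            priority = 1
        elif score >= p2_share:
            priority = 2
        else:
            priority = 3
        result[unit] = (priority, score)
    return result


def follow_up_order(priorities):
    """Priority 1 units first, then priority 2 by descending score.

    Priority 3 units are left out because they receive minimal follow-up.
    """
    queue = [(p, -score, unit) for unit, (p, score) in priorities.items() if p in (1, 2)]
    return [unit for _, _, unit in sorted(queue)]


# Hypothetical estimated sales for five sampled units.
est_sales = {"A": 605, "B": 250, "C": 80, "D": 50, "E": 15}
priorities = assign_priorities(est_sales)
order = follow_up_order(priorities)
print(order)
```

Because priority 2 units are ordered by score, skipping one that cannot be reached simply means moving to the next entry in the queue.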
Error detection is an integral part of both collection and data processing activities. Two tiers of automated edits are applied to data records during collection to identify reporting and capture errors. The first tier of edits is applied to all collected records immediately following data capture and is intended to flag gross errors that can generally be resolved without contacting respondents. These edits identify potential errors based on year-over-year changes in key variables, totals and ratios (e.g. total revenue to expenses) that exceed tolerance thresholds, and they identify problems in the consistency of collected data (e.g. total expenses do not equal the sum of detailed expenses). These edits also identify coding problems in the commodity data supplied by respondents. The second tier of collection edits is similar to the first, although the tolerances for error are more narrowly defined and the number of edits is larger. These edits are applied only to the data records that are required to reach the target coverage level for collection. In general, these records have the largest impact on the final survey estimates. All edit failures are reviewed.
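The two kinds of edit mentioned above (a consistency check and a year-over-year tolerance check) can be sketched as follows. The field names, the tolerance values and the function `run_edits` are assumptions made for illustration; the survey's actual edit rules and thresholds are not given here. The second tier is represented simply by calling the same function with a narrower tolerance.

```python
def run_edits(current, previous, yoy_tolerance, rounding_tolerance=0.5):
    """Return a list of edit-failure messages for one establishment record."""
    flags = []
    # Consistency edit: total expenses should equal the sum of the details.
    detail_sum = sum(current["expense_details"].values())
    if abs(current["total_expenses"] - detail_sum) > rounding_tolerance:
        flags.append("total expenses do not equal the sum of detailed expenses")
    # Year-over-year edit: relative change in total revenue versus last year.
    prev_revenue = previous["total_revenue"]
    if prev_revenue > 0:
        change = abs(current["total_revenue"] - prev_revenue) / prev_revenue
        if change > yoy_tolerance:
            flags.append("year-over-year revenue change exceeds tolerance")
    return flags


# Hypothetical record for the current and previous reference years.
current = {
    "total_revenue": 1000.0,
    "total_expenses": 800.0,
    "expense_details": {"materials": 500.0, "wages": 250.0, "energy": 40.0},
}
previous = {"total_revenue": 900.0}

tier1 = run_edits(current, previous, yoy_tolerance=0.5)  # wide tolerance
tier2 = run_edits(current, previous, yoy_tolerance=0.1)  # narrower tolerance
print(tier1)
print(tier2)
```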
Prior to imputation, subject matter specialists use a variety of tools to identify and resolve outliers in the year-over-year change and in the relationships among financial variables in both the survey and administrative data. This is a selective review of the micro data that focuses on records which may have a significant impact on the final survey estimates either because of the size of their contribution to an industry or province aggregate or because of their use in imputation. For financial data, problems are resolved using historical and administrative data, data from other related surveys, business specific information from external sources (annual reports, trade publications), and the current or historical average for businesses in the same industry. For commodity data, the focus prior to imputation is to ensure that commodities with large values are correctly coded.
Following imputation, subject matter analysts confront financial estimates for the ASML with those for the Monthly Survey of Manufacturing (MSM). When the estimates are different, data for all comparable units in the ASML and MSM are reviewed to ensure that businesses included in both surveys are reporting consistently. Also, outlier year-over-year changes in the estimates for key variables and in the ratios among estimates for selected variables are identified at the 6-digit NAICS by province level. Analysts can "drill down" on these estimates as necessary to correct any remaining data issues in records for the top contributors. A similar top-down approach is performed for commodity variables.
Imputation is used to determine plausible values for all variables that are missing or inconsistent in both the collected and administrative data records that cover the ASML target population. A number of statistical techniques are employed for this purpose that use survey data collected during the current cycle as well as auxiliary information sources. These auxiliary sources include survey data from a previous cycle (historical) and administrative data. The financial and commodity variables for the ASML are imputed separately.
A number of different approaches are used to impute missing or inconsistent financial variables for the ASML. The simplest technique uses a group of deterministic and coherence rules that dictate the acceptable relationships among the variables and derive missing values residually. For example, one rule might be that A plus B equals C. If A is 20 and C is 100, then the missing value of B must be imputed as 80. Missing variables are also imputed by applying the ratios between variables and the historical trends observed in respondent data to data records with partial information. Direct replacement of missing variables with administrative or historical data is another imputation approach. Lastly, donor imputation is used. This involves identifying a respondent record that is similar to a non-respondent based on information that is available for both businesses. The data available for the respondent are then used to derive the data for the non-respondent.
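Three of these approaches can be shown in a few lines each. This is a simplified sketch, not the survey's imputation system: the function names, the variables (`revenue`, `wages`) and the nearest-value matching used to pick a donor are assumptions chosen for illustration.

```python
def impute_residual(a, b, c):
    """Fill the single missing value (None) so that a + b == c."""
    if [a, b, c].count(None) != 1:
        raise ValueError("exactly one of a, b, c must be missing")
    if a is None:
        return c - b, b, c
    if b is None:
        return a, c - a, c
    return a, b, a + b


def impute_ratio(known_value, respondents, known_key, missing_key):
    """Apply the respondents' aggregate ratio missing/known to a partial record."""
    numerator = sum(r[missing_key] for r in respondents)
    denominator = sum(r[known_key] for r in respondents)
    return known_value * numerator / denominator


def nearest_donor(recipient, donors, match_key):
    """Pick the respondent closest to the recipient on a shared variable."""
    return min(donors, key=lambda d: abs(d[match_key] - recipient[match_key]))


# The A + B = C example from the text: A = 20, C = 100, B missing.
print(impute_residual(20, None, 100))

# Hypothetical respondents used for ratio and donor imputation.
respondents = [
    {"id": "R1", "revenue": 100.0, "wages": 20.0},
    {"id": "R2", "revenue": 300.0, "wages": 60.0},
]
non_respondent = {"id": "N1", "revenue": 250.0}
print(impute_ratio(non_respondent["revenue"], respondents, "revenue", "wages"))
print(nearest_donor(non_respondent, respondents, "revenue")["id"])
```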
Historic trend imputation and donor imputation (for units new to ASML) are used to derive missing commodity information. There is no administrative information available for use in imputing commodity data.
Imputation generates a complete and coherent micro data file that covers financial variables for all units in the target population and commodity variables for a sample of units. Rules have been put in place to ensure that there is coherence between the financial and commodity data portions of the survey.
For ASML financial variables, a complete micro data file is created for all establishments in the survey portion of the target population, whether in-sample or out-of-sample. The data sources for these units include survey responses and/or administrative records. Each establishment in the survey portion of the target population represents itself. This means that the weight for each establishment is one. Estimation for financial variables is done by simple aggregation of these micro records, for incorporated and non-incorporated businesses in the manufacturing and logging industries. The totals are produced and published at the industry by province or national levels.
Selected variables are estimated from administrative records for establishments falling in the non-survey portion of the target population.
Estimation for commodity variables is sample-based. A sample of establishments from the survey portion of the target population is selected, where each sampled establishment from a stratum represents a number of other establishments in the stratum (a stratum is defined by province, industry and revenue). A sampling weight is calculated for each establishment to indicate how many establishments in the stratum it represents, including itself. The sampling weight for an establishment is either one or greater than one. Establishments that are "take-all" have sampling weights of one; those that are "take-some" have sampling weights greater than one. For ASML commodity variables, complete micro data are available only for the establishments in the sample. Prior to estimation, sampling weights are adjusted in such a way that the estimates of commodities consumed are consistent at the industry and province level with the corresponding financial estimates for total inputs, and that the estimates of commodities produced are consistent at the industry and province level with the corresponding financial estimates for total outputs for the survey population.
Commodity estimates are obtained by aggregating the products of the adjusted weights and values for a given commodity, for all sampled units that reported or were imputed with the given commodity. Commodity estimates are published by province and for Canada. All commodity estimates represent only the survey portion of the target population.
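A minimal sketch of weighted commodity estimation for a single industry-by-province cell follows. The weight adjustment shown is a simple ratio adjustment to a known financial total; this is an assumption, since the document does not state which adjustment method the survey uses. The records, commodity names and function names are hypothetical.

```python
def adjust_weights(sample, financial_total):
    """Scale weights so weighted commodity output matches the financial total."""
    weighted_total = sum(
        unit["weight"] * sum(unit["commodities"].values()) for unit in sample
    )
    factor = financial_total / weighted_total
    return [{**unit, "weight": unit["weight"] * factor} for unit in sample]


def commodity_estimate(sample, commodity):
    """Sum of adjusted weight times reported (or imputed) commodity value."""
    return sum(
        unit["weight"] * unit["commodities"].get(commodity, 0.0) for unit in sample
    )


# Hypothetical sampled units in one industry-by-province cell.
cell_sample = [
    {"id": "E1", "weight": 1.0, "commodities": {"lumber": 400.0, "chips": 100.0}},
    {"id": "E2", "weight": 3.0, "commodities": {"lumber": 100.0}},
]
adjusted = adjust_weights(cell_sample, financial_total=1000.0)
lumber = commodity_estimate(adjusted, "lumber")
chips = commodity_estimate(adjusted, "chips")
print(lumber, chips, lumber + chips)
```

After the adjustment, the commodity estimates for the cell add up to the financial total supplied, which is the consistency property the weight adjustment is meant to deliver.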
The survey estimates are analyzed for comparability with patterns observed in the historical data series for the manufacturing and logging industries, with trends observed in related Statistics Canada data series (e.g. Monthly Survey of Manufacturing (record number 2101), sub-annual manufacturing commodity surveys) and with information from other external sources (e.g. associations, trade publications, newspaper articles).
Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
In order to prevent any data disclosure, confidentiality analysis for financial and commodity variables is done using the G-CONFID system. G-CONFID is used for primary confidentiality as well as for secondary suppression (residual disclosure). Direct disclosure, or primary confidentiality, occurs when the value in a tabulation cell is composed of or dominated by a few enterprises, while residual disclosure occurs when confidential information can be derived indirectly by piecing together information from different sources or data series.
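The idea of a cell being "composed of or dominated by a few enterprises" can be illustrated with a generic dominance check. This sketch does not reproduce G-CONFID or its rules; the minimum contributor count, the share threshold and the function name are hypothetical values chosen only to show the concept of primary sensitivity.

```python
def is_sensitive(contributions, min_contributors=3, top_n=1, max_share=0.8):
    """Flag a cell with too few contributors or a dominant top contributor."""
    values = sorted((v for v in contributions if v > 0), reverse=True)
    if not values:
        return False  # empty cell: nothing to disclose
    if len(values) < min_contributors:
        return True
    # Share of the cell total held by the top_n largest contributors.
    return sum(values[:top_n]) / sum(values) > max_share


dominated = is_sensitive([900.0, 50.0, 50.0])   # one enterprise holds 90%
balanced = is_sensitive([400.0, 300.0, 300.0])  # no dominant enterprise
too_few = is_sensitive([500.0, 500.0])          # only two contributors
print(dominated, balanced, too_few)
```

Secondary (residual) suppression is a separate step that is not shown: it selects additional cells to suppress so that a flagged value cannot be recovered from row or column totals.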
The most recent annual data are subject to a one year revision policy.
For financial variables, the Annual Survey of Manufactures and Logging (ASML) has data available for every unit in the population. For commodity variables, data are available for the units in the sample only. These data are obtained from the survey and administrative files or are imputed. Data quality for the financial and the commodity variables is assessed based on measures of sampling error and non-sampling error. Sampling errors occur as a result of taking a sample of the population. Non-sampling error is not related to sampling and may occur for various reasons during the collection and processing of data. For example, non-response is an important source of non-sampling error. Under- or over-coverage of the population, differences in the interpretation of questions and mistakes in recording, coding and processing data are other examples of non-sampling errors. To the maximum extent possible, these errors are minimized through careful design of the survey questionnaire, verification of the survey data, and follow-up with delinquent respondents to maximize response rates.
For commodity variables, data quality measures are based on coefficients of variation (a measure of sampling error) and the extent of non-sampling error resulting from non-response and imputation. A small coefficient of variation and a low imputation rate for a published commodity variable indicate high data quality. Conversely, a high coefficient of variation and a high non-response rate indicate low data quality. Quality indicators and their guidelines are disseminated with the commodity estimates.
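For readers unfamiliar with the coefficient of variation (CV), the sketch below computes it for an estimated total as the standard error divided by the estimate. It assumes simple random sampling without replacement within each stratum and uses the textbook variance formula for a stratified total; the document does not state the variance estimator actually used, and the data are hypothetical.

```python
from math import sqrt
from statistics import variance


def stratified_total_cv(strata):
    """strata: list of (population_size, sampled_values). Returns (total, cv).

    Each take-some stratum needs at least two sampled values so that a
    sample variance can be computed.
    """
    total = 0.0
    var = 0.0
    for pop_size, values in strata:
        n = len(values)
        total += pop_size * sum(values) / n
        if n < pop_size:
            # Take-all strata (n == N) contribute no sampling variance.
            var += pop_size**2 * (1 - n / pop_size) * variance(values) / n
    return total, sqrt(var) / total


# Hypothetical cell: one take-all unit and a take-some stratum (2 of 4 sampled).
strata = [(1, [1000.0]), (4, [100.0, 200.0])]
estimate, cv = stratified_total_cv(strata)
print(estimate, round(cv, 4))
```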
For financial variables the quality indicators in the table below (Total Revenue by Source) are measures of non-sampling errors resulting from non-response and imputation of financial data. These measures are based on total revenue estimates published for the current reference year. Response measures the proportion of the total revenue from reported data. Imputation measures the proportion of total revenue from imputed data.
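The response and imputation measures described above are shares of total revenue by data source. A minimal sketch of the calculation is shown here; the records are hypothetical, and the "administrative" source category is an assumption about how revenue taken from administrative files would be labelled.

```python
def revenue_by_source(records):
    """Return each source's proportion of total revenue."""
    totals = {}
    for record in records:
        source = record["source"]
        totals[source] = totals.get(source, 0.0) + record["total_revenue"]
    grand_total = sum(totals.values())
    return {source: value / grand_total for source, value in totals.items()}


# Hypothetical establishment records with the origin of their revenue value.
records = [
    {"id": "E1", "source": "response", "total_revenue": 600.0},
    {"id": "E2", "source": "response", "total_revenue": 150.0},
    {"id": "E3", "source": "administrative", "total_revenue": 200.0},
    {"id": "E4", "source": "imputation", "total_revenue": 50.0},
]
shares = revenue_by_source(records)
print(shares)
```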
Relative to other variables, total revenue is generally well-reported and is readily available from administrative sources. Therefore, the quality of this variable is relatively high. However, the distribution of value by source can vary significantly across variables, industries and provinces. This variability is a function of response rates and the availability of administrative data. The proportional distributions for all published variables by industry and province are available on request.