# Monthly Survey of Manufacturing (MSM)

## Detailed information for September 2019

**Status:**

Active

**Frequency:**

Monthly

**Record number:**

2101

The Monthly Survey of Manufacturing (MSM) publishes statistical series for manufacturers: sales of goods manufactured, inventories, unfilled orders, new orders and capacity utilization rates.

**Data release** - November 19, 2019

### Description

The MSM publishes the values (in current Canadian dollars) of sales of goods manufactured, inventories and orders, as well as capacity utilization rates.

Since the release on March 17, 2017, the MSM has published industry-level estimates of real (price-deflated) sales of goods manufactured, orders, inventory owned and inventory-to-sales ratios. The industry-level real data are now available in table 16-10-0013-01.

Results from this survey are used by both the private and public sectors including federal and provincial government departments, the Bank of Canada, the System of National Accounts, the manufacturing community, consultants and research organizations in Canada, the United States and abroad, and the business press. Data collected by the MSM provide a current 'snapshot' of the value of sales of goods manufactured by the Canadian manufacturing sector, enabling analysis of the state of the Canadian economy, as well as the health of specific industries, in the short to medium term.

**Reference period:** Month

**Collection period:** Collection of the data begins approximately 7 working days after the end of the reference month, and continues for the duration of that calendar month.

#### Subjects

- Machinery, computers and electronics
- Manufacturing

### Data sources and methodology

#### Target population

Statistics Canada's Business Register provides the sampling frame for the MSM. The target population for the MSM consists of all statistical establishments on the Business Register that are classified to the manufacturing sector (by NAICS), which are categorized into over 156 industries. An establishment comprises the smallest manufacturing unit capable of reporting the variables of interest. The sampling frame for the MSM is determined from the target population by removing the establishments that represent the bottom 10% of the total sales of goods manufactured estimate for each cell. These establishments are excluded from the frame so that the sample size can be reduced without significantly affecting quality.

#### Instrument design

Both electronic and paper questionnaires are used to collect data for the Monthly Survey of Manufacturing (MSM). The questionnaires were developed at Statistics Canada and were reviewed and tested in the field in both official languages. In the course of redeveloping the MSM, Statistics Canada consulted with a number of manufacturers as well as with industry associations. In February 2016, the capacity utilization rate was added to the MSM questionnaire. The MSM questionnaire also became available to respondents in electronic format in May 2017.

#### Sampling

This is a sample survey with a cross-sectional design.

The MSM sample is a probability sample composed of approximately 6,500 establishments.

A new sample was chosen in the fall of 2017, followed by a six-month parallel run (from reference month September 2017 to reference month February 2018). The new sample was used officially for the first time for dissemination with the reference month December 2017.

This was the first refresh of the MSM sample since 2012. The objective of the process is to keep the sampling frame as fresh and up to date as possible. All establishments in the sample are refreshed to take into account changes in their value of sales of goods manufactured, dead units are removed from the sample, and some small units are rotated out of the sample while others are rotated in.

Prior to selection, the sampling frame is subdivided into industry-province cells. Depending upon the number of establishments within each cell, further subdivisions are made to group similarly sized establishments together (each group is called a stratum). An establishment's size is based on revenue variables from the Business Register.

Each industry by province cell has a 'take-all' stratum composed of establishments sampled each month with certainty. This 'take-all' stratum is composed of establishments that are the largest statistical enterprises, and have the largest impact on estimates within a particular industry by province cell. These large statistical establishments comprise about 50% of the national manufacturing sales of goods manufactured estimates.

Each industry by province cell can have at most two 'take-some' strata. Not all establishments within these strata need to be sampled with certainty. A random sample is drawn from these strata. The responses from the sampled establishments are weighted according to the inverse of their probability of selection. In cells with a take-some portion, a minimum sample size of 3 was imposed.
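
As a rough illustration of the weighting rule above, the sketch below draws a simple random sample without replacement in each stratum of one hypothetical cell and gives each sampled unit the inverse of its selection probability as its weight. The stratum names, unit identifiers and sample sizes are invented for the example and are not MSM values.

```python
import random


def draw_stratified_sample(strata, sample_sizes, seed=0):
    """Sample each stratum without replacement and attach design weights.

    strata: dict mapping stratum name -> list of unit identifiers (hypothetical)
    sample_sizes: dict mapping stratum name -> number of units to select
    """
    rng = random.Random(seed)
    sample = []
    for name, units in strata.items():
        n = min(sample_sizes[name], len(units))
        # Selection probability is n / N, so the design weight is N / n.
        weight = len(units) / n
        for unit in rng.sample(units, n):
            sample.append({"unit": unit, "stratum": name, "weight": weight})
    return sample


# One hypothetical industry-province cell: a take-all stratum (every unit
# selected) and two take-some strata (a minimum of 3 units selected).
strata = {
    "take_all": ["A1", "A2"],
    "take_some_1": [f"B{i}" for i in range(10)],
    "take_some_2": [f"C{i}" for i in range(30)],
}
sample_sizes = {"take_all": 2, "take_some_1": 5, "take_some_2": 3}

sample = draw_stratified_sample(strata, sample_sizes)
total_weight = sum(record["weight"] for record in sample)
print(total_weight)  # equals the number of units on the cell's frame
```

Take-all units carry a weight of 1, while each sampled unit in the second take-some stratum stands in for 10 units on the frame.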

The take-none portion of the sample is now estimated from administrative data and as a result, 100% of the sample universe is covered. Estimation of the take-none portion also improved efficiency as a larger take-none portion was delineated and the sample could be used more efficiently on the smaller sampled portion of the frame.

#### Data sources

Responding to this survey is mandatory.

Data are collected directly from survey respondents and extracted from administrative files.

The complete sample of establishments is sent out for data collection. Collection of the data is performed by Statistics Canada's Regional Offices. Respondents are sent an electronic or paper questionnaire or are contacted by telephone to obtain their sales, inventories, unfilled orders, capacity utilization rates, as well as to confirm the opening or closing of business trading locations. Collection also undertakes follow-up of non-respondents. Collection of the data begins approximately 7 working days after the end of the reference month and continues for the duration of that calendar month.

New entrants to the survey are introduced to the survey via introductory questions that confirm the respondent's business activity and contact information.

If data are unavailable at the time of collection, a respondent's best estimates are also accepted, and are subsequently revised once the actual data become available.

To minimize total non-response for all variables, partial responses are accepted.

Use of Administrative Data:

Managing response burden is an ongoing challenge for Statistics Canada. In an attempt to alleviate response burden, especially for small businesses, the MSM derives sales data for low-revenue establishments from Goods and Services Tax (GST) files using a ratio estimator. The ratio estimator also increases the precision of the surveyed portion of the estimate. For more information on the ratio estimator, see the section on estimation.

#### Error detection

Data are analyzed within each industry-province cell. Extreme values are listed for inspection, ranked by the magnitude of their deviation from average behavior. Respondents are contacted to verify extreme values. Records that fail statistical edits are considered outliers and are not used for imputation.

Values are imputed for non-response, that is, for establishments that do not report or that only partially complete the survey form. A number of imputation methods are used depending on the variable requiring treatment. Methods include using industry-province cell trends and historical responses. Following imputation, the MSM staff performs a final verification of the responses that have been imputed.

#### Imputation

Imputation in the MSM is the process used to assign replacement values for missing data. Values are assigned where they are missing on the record being edited, so that estimates are of high quality and each record is plausible and internally consistent. Because of concerns about response burden, cost and timeliness, it is generally impossible to follow up with every respondent to resolve missing responses. Since it is desirable to produce a complete and consistent microdata file, imputation is used to handle the remaining missing cases.

In the MSM, imputation for missing values can be based on either historical data or administrative data. The appropriate method is selected according to a strategy that is based on whether historical data are available, administrative data are available and/or which reference month is being processed.

There are three types of historical imputation methods. The first type is a general trend that uses one historical data source (the previous month, the next month or the same month of the previous year). The second type is a regression model in which data from the previous month and from the same month of the previous year are used simultaneously. The third type uses the historical data as a direct replacement value for a non-respondent. Depending on the reference month, the methods are tried in an order of preference designed to yield the highest-quality imputation. The third type is always the last option in that order.
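
The source does not give the formulas behind these methods. As a minimal sketch of the first type, assuming the trend is the ratio of current to previous totals among units in the same cell that reported in both months, a non-respondent's previous value could be carried forward as follows. The function name and the fallback rule are assumptions made for illustration.

```python
def trend_impute(previous_value, respondents):
    """Impute a current-month value by applying a cell-level trend.

    previous_value: the non-respondent's value in the historical period
    respondents: list of (previous, current) pairs for units in the same
        cell that reported in both periods (hypothetical input layout)
    """
    previous_total = sum(prev for prev, _ in respondents)
    current_total = sum(curr for _, curr in respondents)
    if previous_total == 0:
        # No usable trend: fall back to direct replacement with the
        # historical value, mirroring the third type described above.
        return previous_value
    return previous_value * current_total / previous_total


# Respondents in the cell grew by 10% month over month, so a unit that
# reported 50 last month is imputed at 55 this month.
cell_respondents = [(100.0, 110.0), (200.0, 220.0)]
imputed = trend_impute(50.0, cell_respondents)
print(imputed)
```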

The imputation method using administrative data is automatically selected when historical information is unavailable for a non-respondent. Trends are then applied to the administrative data source (monthly size) depending on whether the unit has a simple structure, e.g. enterprises with only one establishment, or a more complex structure.

#### Estimation

Estimation is a process by which Statistics Canada obtains values for the population of interest so that it can draw conclusions about that population based on information gathered from only a sample of the population. More specifically, the MSM uses a ratio estimator.

Ratio estimation consists of replacing the initial sampling weights (defined as the inverse of the probability of selection in the sample) by new weights in a manner that satisfies the constraints of calibration. Calibration ensures that the total of an auxiliary variable estimated using the sample must equal the sum of the auxiliary variable over the entire population, and that the new sampling weights are as close as possible (using a specific distance measure) to the initial sampling weights.

For example, suppose that the known population total of the auxiliary variable is equal to 100 and based on a sample the estimated total is equal to 90, so that we are underestimating by approximately 10%. Since we know the population total of the auxiliary variable, it would be reasonable to increase the weights of the sampled units so that the estimate would be exactly equal to it. Now since the variable of interest is related to the auxiliary variable, it is not unreasonable to believe that the estimate of the sales based on the same sample and weights as the estimate of the auxiliary variable may also be an underestimation by approximately 10%. If this is in fact the case, then the adjusted weights could be used to produce an alternative estimator of the total sales. This alternate estimator is called the ratio estimator.
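
The example above can be sketched numerically. The weights, auxiliary values and sales figures below are invented so that the weighted auxiliary total is 90 against a known population total of 100; this is the simplest single-group form of the adjustment and is not the MSM production system.

```python
def ratio_estimate(weights, y, x, x_population_total):
    """Return the ratio estimate of the total of y and the adjusted weights.

    weights: initial sampling weights (inverse selection probabilities)
    y: variable of interest (for example, sales) for each sampled unit
    x: auxiliary variable for each sampled unit
    x_population_total: known population total of the auxiliary variable
    """
    x_hat = sum(w * xi for w, xi in zip(weights, x))
    adjustment = x_population_total / x_hat
    adjusted_weights = [w * adjustment for w in weights]
    y_hat = sum(w * yi for w, yi in zip(adjusted_weights, y))
    return y_hat, adjusted_weights


weights = [10.0, 10.0, 10.0]
x = [2.0, 3.0, 4.0]        # weighted total is 90, known total is 100
y = [20.0, 33.0, 37.0]     # weighted total is 900 before adjustment

y_hat, adjusted_weights = ratio_estimate(weights, y, x, x_population_total=100.0)
x_check = sum(w * xi for w, xi in zip(adjusted_weights, x))
print(y_hat, x_check)
```

With the adjusted weights, the auxiliary total is reproduced exactly and the sales estimate rises from 900 to about 1,000, the same proportional correction.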

In essence, the ratio estimator tries to compensate for 'unlucky' samples and brings the estimate closer to the true total. The reduction in variance will depend on the strength of the relationship between the variable of interest and the auxiliary data.

The take-none portion is taken into account by the ratio estimator. This is done by simply including the take-none portion in the control totals for the sample portion. By doing this, the weights for the sampled portion will be increased in such a way that the estimates will be adjusted to take into account the take-none portion.

The calculated weighted sales values are summed by domain to produce the total sales estimates for each industrial group/geographic area combination and the other totals by industrial group. A domain is defined by the most recent classification values available from the Business Register (BR) for the unit and the survey reference period. These domains may differ from the original sampling strata because units may have changed size, industry or location. Changes in classification are reflected immediately in the estimates and do not accumulate over time.

For the capacity utilization rate, the estimate for a given domain is calculated by first calculating the total production and monthly production capacity for the domain and then by dividing the total production by the total monthly production capacity.
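
A small sketch of this ratio-of-totals calculation follows. The record layout and the use of estimation weights inside the totals are assumptions for illustration, and the figures are invented.

```python
def capacity_utilization_rate(records):
    """Domain rate (in percent) as total production over total capacity."""
    total_production = sum(r["weight"] * r["production"] for r in records)
    total_capacity = sum(r["weight"] * r["capacity"] for r in records)
    return 100.0 * total_production / total_capacity


domain_records = [
    {"weight": 1.0, "production": 80.0, "capacity": 100.0},  # unit rate 80%
    {"weight": 2.0, "production": 30.0, "capacity": 60.0},   # unit rate 50%
]

domain_rate = capacity_utilization_rate(domain_records)
print(round(domain_rate, 2))
```

Dividing totals gives about 63.6% here, not the 65% that a simple average of the two unit rates would give, because larger capacities count for more.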

The measure of precision used for the MSM to evaluate the quality of a population parameter estimate and to obtain valid inferences is the variance. The variance from the survey portion is derived directly from a stratified simple random sample without replacement.

Sample estimates may differ from the expected value of the estimates. However, since the estimate is based on a probability sample, the variability of the sample estimate with respect to its expected value can be measured. The variance of an estimate is a measure of the precision of the sample estimate and is defined as the average, over all possible samples, of the squared difference of the estimate from its expected value.
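
For reference, the textbook variance of an expansion (weighted-sum) estimate of a total under stratified simple random sampling without replacement can be sketched as below. This is a simplified illustration with invented data; it ignores the ratio adjustment and imputation that the MSM also has to account for.

```python
from statistics import variance


def stratified_total_and_variance(strata):
    """Estimate a total and its sampling variance.

    strata: list of (N, values) pairs, where N is the stratum population
        size and values are the sampled observations (at least 2 per
        take-some stratum).
    """
    total = 0.0
    var = 0.0
    for population_size, values in strata:
        n = len(values)
        total += population_size * sum(values) / n
        if n < population_size:
            # Finite population correction times N^2 * s^2 / n.
            fpc = 1.0 - n / population_size
            var += population_size ** 2 * fpc * variance(values) / n
        # A take-all stratum (n == N) contributes no sampling variance.
    return total, var


strata_data = [
    (2, [500.0, 700.0]),                    # take-all stratum
    (10, [10.0, 20.0, 30.0, 40.0, 50.0]),   # take-some stratum
]
estimated_total, estimated_variance = stratified_total_and_variance(strata_data)
print(estimated_total, estimated_variance)
```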

Estimation of sales by census metropolitan area

Estimates of sales for twelve census metropolitan areas (CMA) have been derived by using a small area estimation (SAE) technique based on a Fay-Herriot methodology. In this methodology, a model that describes the relationship between estimated sales coming from the MSM and sales coming from the Goods and Services Tax (GST) data at a small level of geography is combined with traditional estimates obtained from the weighted MSM sample. The resulting small area estimates are often significantly more precise than standard MSM weighted estimates, particularly for areas where the latter become unreliable due to small area sample sizes. This increase in precision is obtained at the expense of introducing model assumptions. Unlike standard MSM estimates, small area estimates may thus be subject to model misspecification errors, which may result in biases. Careful model validation has been performed before releasing the estimates in order to decrease the risk of bias. More information concerning the SAE methodology is available in the additional documentation.
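
The MSM's actual model is described in the additional documentation and is not reproduced here. The sketch below shows only the generic area-level composite used in Fay-Herriot methods, in which a direct survey estimate and a model-based (synthetic) prediction are blended according to their variances. All inputs are invented and the function name is hypothetical.

```python
def fay_herriot_composite(direct, sampling_variance, synthetic, model_variance):
    """Blend a direct estimate with a model prediction for one small area.

    The weight on the direct estimate is
    gamma = model_variance / (model_variance + sampling_variance),
    so a noisy direct estimate is pulled more strongly toward the model.
    """
    gamma = model_variance / (model_variance + sampling_variance)
    return gamma * direct + (1.0 - gamma) * synthetic, gamma


# A hypothetical area with a noisy direct estimate of 120 and a model
# prediction of 100 based on auxiliary data.
composite, gamma = fay_herriot_composite(
    direct=120.0, sampling_variance=400.0, synthetic=100.0, model_variance=100.0
)
print(composite, gamma)
```

Here the direct estimate receives a weight of only 0.2, which illustrates why areas with small samples gain the most precision and also why they depend most on the model assumptions.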

Real manufacturing sales of goods manufactured, inventories, and orders

Changes in the values of the data reported by the Monthly Survey of Manufacturing (MSM) may be attributable to changes in their prices or to the quantities measured, or both. To study the activity of the manufacturing sector, it is often desirable to separate out the variations due to price changes from those of the quantities produced. This adjustment is known as deflation.

Deflation consists of dividing the values at current prices obtained from the survey by suitable price indexes in order to obtain estimates evaluated at the prices of a previous period, currently the year 2012. The resulting deflated values are said to be "at 2012 prices". Note that the expression "at current prices" refers to the time the activity took place, not to the present time, nor to the time of compilation.
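
A minimal numerical sketch of this division, assuming a price index scaled so that the base year 2012 equals 100 (the values and index levels are invented):

```python
def deflate(current_value, price_index, base=100.0):
    """Convert a current-price value to base-year prices."""
    return current_value * base / price_index


current_values = [1000.0, 1100.0, 1210.0]   # sales at current prices
price_indexes = [100.0, 110.0, 110.0]       # index with 2012 = 100

real_values = [deflate(v, p) for v, p in zip(current_values, price_indexes)]
print(real_values)
```

The first increase in current-price sales is entirely a price effect, so the deflated value is unchanged; only the second increase shows up as growth in volume.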

The deflated MSM estimates reflect the prices that prevailed in 2012. This is called the base year. The year 2012 was chosen as base year since it corresponds to that of the price indexes used in the deflation of the MSM estimates. Using the prices of a base year to measure current activity provides a representative measurement of the current volume of activity with respect to that base year. Current movements in the volume are appropriately reflected in the constant price measures only if the current relative importance of the industries is not very different from that in the base year.

The deflation of the MSM estimates is performed at a very fine industry detail, equivalent to the 6-digit industry classes of the North American Industry Classification System (NAICS). For each industry at this level of detail, the price indexes used are composite indexes which describe the price movements for the various groups of goods produced by that industry.

With very few exceptions the price indexes are weighted averages of the Industrial Product Price Indexes (IPPI). The weights are derived from the annual Canadian Input-Output tables and change from year to year. Since the Input-Output tables only become available with a delay of about two and a half years, the weights used for the most current years are based on the last available Input-Output tables.

The same price index is used to deflate sales of goods manufactured, new orders and unfilled orders of an industry. The weights used in the compilation of this price index are derived from the output tables, evaluated at producer's prices. Producer prices reflect the prices of the goods at the gate of the manufacturing establishment and exclude such items as transportation charges, taxes on products, etc. The resulting price index for each industry thus reflects the output of the establishments in that industry.

The price indexes used for deflating the goods / work in progress and the finished goods inventories of an industry are moving averages of the price index used for sales of goods manufactured. For goods / work in progress inventories, the number of terms in the moving average corresponds to the duration of the production process. The duration is calculated as the average over the previous 48 months of the ratio of end-of-month goods / work in progress inventories to the output of the industry, which is equal to sales of goods manufactured plus the changes in both goods / work in progress and finished goods manufactured inventories.

For finished goods manufactured inventories, the number of terms in the moving average reflects the length of time a finished product remains in stock. This number, known as the inventory turnover period, is calculated as the average over the previous 48 months of the ratio of end-of-month finished goods manufactured inventory to sales of goods manufactured.
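
The two steps described above, computing the turnover period and then averaging the sales price index over that many months, can be sketched as follows. Rounding the period to a whole number of terms and using a trailing window are assumptions; the source does not say how fractional periods are handled.

```python
def turnover_period(inventories, sales, window=48):
    """Average ratio of end-of-month inventory to sales over the window."""
    pairs = list(zip(inventories[-window:], sales[-window:]))
    return sum(inv / sale for inv, sale in pairs) / len(pairs)


def trailing_moving_average(index, n_terms):
    """Trailing moving average of a price index over n_terms months.

    n_terms is rounded to a whole number of months (an assumption).
    The first few values use however many months are available.
    """
    n = max(1, round(n_terms))
    smoothed = []
    for t in range(len(index)):
        window = index[max(0, t - n + 1): t + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Invented data: inventory is steadily twice monthly sales, so a finished
# product stays in stock for about two months.
inventories = [200.0] * 48
sales = [100.0] * 48
period = turnover_period(inventories, sales)

sales_price_index = [100.0, 102.0, 104.0, 106.0]
inventory_price_index = trailing_moving_average(sales_price_index, period)
print(period, inventory_price_index)
```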

To deflate raw materials and components inventories, price indexes for raw materials consumption are obtained as weighted averages of the IPPI. The weights used are derived from the input tables evaluated at purchaser's prices, i.e. these prices include such elements as wholesaling margins, transportation charges, and taxes on products, etc. The resulting price index thus reflects the cost structure in raw materials and components for each industry.

The raw materials and components inventories are then deflated using a moving average of the price index for raw materials consumption. The number of terms in the moving average corresponds to the rate of consumption of raw materials. This rate is calculated as the average over the previous four years of the ratio of end-of-year raw materials and components inventories to the intermediate inputs of the industry.

The estimation system generates estimates using the NAICS. National estimates are produced for all variables collected by the MSM; at the provincial level, however, estimates are produced only for sales of goods manufactured. A measure of quality, the coefficient of variation (CV), is also produced. Seasonally adjusted series are available for the main aggregates.

#### Quality evaluation

The final data sets are subject to rigorous analysis that includes comparison to historical series and comparisons to other sources of data in order to put the economic changes in context. Information available from the media, other government organizations and economic think tanks is also used in the validation process.

#### Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible direct disclosure, which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.

#### Revisions and seasonal adjustment

In conjunction with preliminary estimates for the current month, estimates for the previous three months are revised to account for any late returns. Data are revised when late responses are received or if an incorrect response was recorded earlier.

Up to and including 2003, the MSM was benchmarked to the Annual Survey of Manufactures and Logging (ASML). Benchmarking was the regular review of the MSM estimates in the context of the annual data provided by the ASML. Benchmarking re-aligned the annualized level of the MSM based on the latest verified annual data provided by the ASML.

Statistics Canada completed significant research in 2006-2007 on whether the benchmark process should be maintained. The conclusion was that benchmarking of the MSM estimates to the ASML should be discontinued. With the refreshing of the MSM sample in 2007, it was determined that benchmarking would no longer be required (retroactive to 2004) because the MSM now accurately represented 100% of the sample universe. Data confrontation will continue between MSM and ASML to resolve potential discrepancies.

As of the December 2017 reference month, a new sample was introduced. It is standard practice that every few years the sample is refreshed to ensure that the survey frame is up to date with births, deaths and other changes in the population. The refreshed sample is linked at the detailed level to prevent data breaks and to ensure the continuity of time series. It is designed to be more representative of the manufacturing industry at both the national and provincial levels.

Economic time series contain the elements essential to the description, explanation and forecasting of the behavior of an economic phenomenon. They are statistical records of the evolution of economic processes through time. In using time series to observe economic activity, economists and statisticians have identified four characteristic behavioral components: the long-term movement or trend, the cycle, the seasonal variations and the irregular fluctuations. These movements are caused by various economic, climatic or institutional factors. The seasonal variations occur periodically on a more or less regular basis over the course of a year. These variations occur as a result of seasonal changes in weather, statutory holidays and other events that occur at fairly regular intervals and thus have a significant impact on the rate of economic activity.

In the interest of accurately interpreting the fundamental evolution of an economic phenomenon and producing forecasts of superior quality, Statistics Canada uses the X12-ARIMA seasonal adjustment method to seasonally adjust its time series. This method minimizes the impact of seasonal variations on the series and essentially consists of adding one year of estimated raw data to the end of the original series before it is seasonally adjusted per se. The estimated data are derived from forecasts using ARIMA (Auto Regressive Integrated Moving Average) models of the Box-Jenkins type.

The X-12 program uses primarily a ratio-to-moving average method. It is used to smooth the modified series and obtain a preliminary estimate of the trend-cycle. It also calculates the ratios of the original series (fitted) to the estimates of the trend-cycle and estimates the seasonal factors from these ratios. The final seasonal factors are produced only after these operations have been repeated several times. The technique that is used essentially consists of first correcting the initial series for all sorts of undesirable effects, such as the trading-day and the Easter holiday effects, by a module called regARIMA. These effects are then estimated using regression models with ARIMA errors. The series can also be extrapolated for at least one year by using the model. Subsequently, the raw series, pre-adjusted and extrapolated if applicable, is seasonally adjusted by the X-12 method.

The procedures to determine the seasonal factors necessary to calculate the final seasonally adjusted data are executed every month. This approach ensures that the estimated seasonal factors are derived from an unadjusted series that includes all the available information about the series, i.e. the current month's unadjusted data as well as the previous month's revised unadjusted data.

While seasonal adjustment permits a better understanding of the underlying trend-cycle of a series, the seasonally adjusted series still contains an irregular component. Slight month-to-month variations in the seasonally adjusted series may be simple irregular movements. To get a better idea of the underlying trend, users should examine several months of the seasonally adjusted series.

The aggregated Canada level series are now seasonally adjusted directly, meaning that the seasonally adjusted totals are obtained via X12-ARIMA. Afterwards, these totals are used to reconcile the provincial total series which have been seasonally adjusted individually.

For other aggregated series, indirect seasonal adjustments are used. In other words, their seasonally adjusted totals are derived indirectly by the summation of the individually seasonally adjusted kinds of business.

Trend

A seasonally adjusted series may contain the effects of irregular influences and special circumstances and these can mask the trend. The short term trend shows the underlying direction in seasonally adjusted series by averaging across months, thus smoothing out the effects of irregular influences. The result is a more stable series. The trend for the last month may be subject to significant revision as values in future months are included in the averaging process.

### Data accuracy

While considerable efforts have been taken to ensure high standards throughout all stages of collection and processing, the resulting estimates are inevitably subject to a certain degree of non-sampling error. Non-sampling error is not related to sampling and may occur for various reasons. For example, non-response is an important source of non-sampling error. Population coverage, differences in the interpretations of questions and mistakes in recording, coding and processing data are other examples of non-sampling errors.

Non-sampling errors are controlled through a careful design of the questionnaire, the use of a minimal number of simple concepts and consistency checks. Measures such as response rates are used as indicators of the possible extent of non-sampling errors.

The MSM's average weighted response rate for collected and edited sales of goods manufactured data at the national level was in the range of 94% to 96% in 2017. Table 2 in the 'Concepts, Definitions and Data Quality' document shows the weighted response or edit and imputation rates for collected data, as well as for take-none portion data based on the GST, for the following characteristics: sales of goods manufactured, raw materials and components inventories, goods / work in progress inventories, finished goods manufactured inventories, unfilled orders and capacity utilization rate.

Sampling error can be measured by the standard error (or standard deviation) of the estimate. The coefficient of variation (CV) is the estimated standard error expressed as a percentage of the survey estimate. Estimates with smaller CVs are more reliable than estimates with larger CVs. Table 1 in the 'Concepts, Definitions and Data Quality' document shows the national level CVs for the following characteristics: sales of goods manufactured, raw materials and components inventories, goods / work in progress inventories, finished goods manufactured inventories, unfilled orders and capacity utilization rates.

Measures of Sampling and Non-sampling Errors

1. Sampling Error Measures

The sample used in this survey is one of a large number of all possible samples of the same size that could have been selected using the same sample design under the same general conditions. If it were possible to survey each of these samples under essentially the same conditions and to calculate an estimate from each, the sample estimates would be expected to differ from each other.

The average estimate derived from all these possible sample estimates is termed the expected value. The expected value can also be expressed as the value that would be obtained if a census enumeration were taken under identical conditions of collection and processing. An estimate calculated from a sample survey is said to be precise if it is near the expected value.

Sample estimates may differ from this expected value of the estimates. However, since the estimate is based on a probability sample, the variability of the sample estimate with respect to its expected value can be measured. The variance of an estimate is a measure of the precision of the sample estimate and is defined as the average, over all possible samples, of the squared difference of the estimate from its expected value.

The standard error is a measure of precision in absolute terms. The coefficient of variation (CV), defined as the standard error divided by the sample estimate, is a measure of precision in relative terms. For comparison purposes, one may more readily compare the sampling error of one estimate to the sampling error of another estimate by using the coefficient of variation.
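
The sketch below, with invented figures, shows why the CV is the more convenient measure for comparisons: the second estimate has the larger standard error in absolute terms but is the more precise of the two in relative terms.

```python
import math


def cv_percent(estimate, variance):
    """Coefficient of variation: standard error as a percentage of the estimate."""
    return 100.0 * math.sqrt(variance) / estimate


# Estimate A: 1,500 with a standard error of 50.
cv_small_domain = cv_percent(1500.0, 2500.0)
# Estimate B: 150,000 with a standard error of 500.
cv_large_domain = cv_percent(150000.0, 250000.0)

print(round(cv_small_domain, 2), round(cv_large_domain, 2))
```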

In this publication, the coefficient of variation is used to measure the sampling error of the estimates. However, since the coefficient of variation published for this survey is calculated from the responses of individual units, it also measures some non-sampling error.

2. Non-sampling Error Measures

Both a sample survey and a census aim to obtain the exact population value. We say the estimate is accurate if it is near this value. Although this value is desired, we cannot assume that the exact value of every unit in the population or sample can be obtained and processed without error. Any difference between the expected value and the exact population value is termed the bias. Systematic biases in the data cannot be measured by the probability measures of sampling error as previously described. The accuracy of a survey estimate is determined by the joint effect of sampling and non-sampling errors.

Sources of non-sampling error in the MSM include non-response error, imputation error and the error due to editing. To assist users in evaluating these errors, weighted rates are given in Text table 2. The following is an example of what is meant by a weighted rate. A cell with a sample of 20 units in which five respond for a particular month would have a response rate of 25%. If these five reporting units represented $8 million out of a total estimate of $10 million, the weighted response rate would be 80%.
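
The worked example above can be reproduced with a few lines of code. The split of the $10 million across individual units is invented; only the totals (5 of 20 units responding, $8 million of $10 million) come from the text.

```python
def response_rates(units):
    """Return (unweighted, weighted) response rates in percent.

    units: list of dicts with a 'responded' flag and a 'weighted_value',
        the unit's contribution to the total estimate (hypothetical layout).
    """
    responded = [u for u in units if u["responded"]]
    unweighted = 100.0 * len(responded) / len(units)
    total_value = sum(u["weighted_value"] for u in units)
    responded_value = sum(u["weighted_value"] for u in responded)
    weighted = 100.0 * responded_value / total_value
    return unweighted, weighted


units = (
    # 5 respondents contributing $8 million in total
    [{"responded": True, "weighted_value": 1_600_000} for _ in range(5)]
    # 15 non-respondents contributing the remaining $2 million
    + [{"responded": False, "weighted_value": 100_000} for _ in range(10)]
    + [{"responded": False, "weighted_value": 200_000} for _ in range(5)]
)

unweighted_rate, weighted_rate = response_rates(units)
print(unweighted_rate, weighted_rate)
```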

The definitions for the weighted rates noted in Text table 2 follow. The weighted response and edited rate is the proportion of a characteristic's total estimate that is based upon reported data and includes data that has been edited. The weighted imputation rate is the proportion of a characteristic's total estimate that is based upon imputed data. The weighted take-none fraction rate is the proportion of the characteristic's total estimate modeled from administrative data.

Joint Interpretation of Measures of Error

The measure of non-response error as well as the coefficient of variation must be considered jointly to have an overview of the quality of the estimates. The lower the coefficient of variation and the higher the weighted response rate, the better will be the published estimate.

In the case of estimates of sales by CMA, the quality of the estimates is measured using a global variance that takes into account the variance due to sampling, the variance due to imputation and the mean square error of the SAE model. More details concerning the quality of estimation of sales by CMA are available in the additional documentation.