Retail Trade Survey (Monthly) (MRTS)

Detailed information for December 2016

Status:

Active

Frequency:

Monthly

Record number:

2406

The Monthly Retail Trade Survey collects sales, e-commerce sales, and the number of retail locations by province, territory, and selected Census Metropolitan Areas (CMA) from a sample of retailers.

Data release - February 22, 2017

Description

Retail sales estimates are a key monthly indicator of consumer purchasing patterns in Canada. Furthermore, retail sales are an important component of the Gross Domestic Product, which measures Canada's production, and are part of many economic models used by public and private agencies. The Bank of Canada relies partly on monthly retail sales estimates when making decisions that influence interest rates. Businesses use retail sales estimates to track their own performance against industry averages and to prepare investment strategies.

Reference period: Month

Collection period: Collection of the data begins approximately 7 working days after the end of the reference month, and continues for the duration of that calendar month.

Subjects

  • Retail and wholesale
  • Retail sales by type of store

Data sources and methodology

Target population

The target population consists of all statistical establishments on Statistics Canada's Business Register (BR) that are classified to the retail sector using the North American Industry Classification System (NAICS 2012). The NAICS code range for the retail sector is 441100 to 454110.

The exclusions to the target population are establishments with a missing or a zero gross business income (GBI) value on the BR and establishments in the following non-covered NAICS:

- 4542 (vending machine operators)
- 45431 (fuel dealers)
- 45439 (other direct selling establishments)
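
As an illustration only, the inclusion rules above can be sketched as a simple filter. The function name and the record layout (a NAICS code and a GBI value per establishment) are assumptions made for this example and do not describe the Business Register's actual structure.

```python
# Non-covered NAICS prefixes taken from the list above.
EXCLUDED_PREFIXES = ("4542", "45431", "45439")


def in_target_population(naics_code, gbi):
    """Return True if an establishment would fall in the target population.

    naics_code: six-digit NAICS 2012 code, as a string or int (assumed layout).
    gbi: gross business income from the BR, or None when missing.
    """
    code = str(naics_code)
    # Retail sector range quoted in the text: 441100 to 454110.
    if not (441100 <= int(code) <= 454110):
        return False
    # Establishments with a missing or zero GBI are excluded.
    if gbi is None or gbi == 0:
        return False
    # Non-covered industries are excluded by code prefix.
    return not code.startswith(EXCLUDED_PREFIXES)
```

Under NAICS 2012 the three excluded industries already lie above 454110, so the prefix check is a safeguard rather than a second filter.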

Additional documentation (hyperlink to NAICS 2012)

Instrument design

Both electronic and paper questionnaires are used to collect data for the MRTS. The questionnaires were developed at Statistics Canada and were reviewed and tested in the field in both official languages. In the course of redeveloping the MRTS, Statistics Canada consulted with a number of retailers as well as with industry associations. In 2016, NAICS 454110 was added to the MRTS questionnaire and the questionnaire became available to respondents in electronic format. The questionnaire for Sales and Inventories of Alcoholic Beverages is unchanged.

Sampling

This is a sample survey with a cross-sectional design.

The MRTS sample consists of 10,000 groups of establishments (clusters) classified to the Retail Trade sector selected from the Statistics Canada Business Register (BR). A cluster of establishments is defined as all establishments belonging to a statistical enterprise that are in the same industry and geographical region. The MRTS uses a stratified design with simple random sample selection in each stratum. The stratification is done by industrial group (mainly, but not only, at the four-digit NAICS level) and by geographical region, consisting of the provinces and territories as well as three provincial sub-regions (Montreal, Toronto, and Vancouver). The population is further stratified by size. The size measure is created using a combination of independent survey data and three administrative variables: the annual profiled revenue, the GST sales expressed on an annual basis, and the declared tax revenue (T1 or T2).

The size strata consist of one take-all (census) stratum, at most two take-some (partially sampled) strata, and one take-none (none sampled) stratum. Take-none strata serve to reduce respondent burden by excluding the smaller businesses from the surveyed population. These businesses should represent at most ten percent of total sales. Instead of sending questionnaires to these businesses, the MRTS produces estimates for them from administrative data.
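
The ten percent limit on the take-none stratum can be illustrated with a minimal sketch. It assumes a single size measure per unit and a simple rule of adding the smallest units first; the actual boundary-setting procedure is not described here, and the function name is hypothetical.

```python
def assign_take_none(sizes, max_share=0.10):
    """Return the indices of units placed in the take-none stratum.

    Units are added from smallest to largest until adding one more
    would push the stratum above max_share of the total size measure.
    """
    total = sum(sizes)
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    chosen, running = [], 0.0
    for i in order:
        if running + sizes[i] > max_share * total:
            break
        running += sizes[i]
        chosen.append(i)
    return chosen
```

For example, with size measures of 50, 2, 300, 5 and 643, the three smallest units (2, 5 and 50) account for 5.7 percent of the total and would form the take-none stratum.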

The sample was allocated optimally in order to reach target coefficients of variation at the national, provincial/territorial, industrial, and industrial groups by province/territory levels. The sample was also inflated to compensate for dead, non-responding, and misclassified units.

MRTS is a repeated survey with maximization of monthly sample overlap. The sample is kept month after month and every month new units are added (births) to the sample. MRTS births, i.e., new clusters of establishment(s), are identified every month via the BR's latest universe. They are stratified according to the same criteria as the initial population. A sample of these births is selected according to the sampling fraction of the stratum to which they belong and is added to the monthly sample. Deaths also occur on a monthly basis. A death can be a cluster of establishment(s) that have ceased their activities (out-of-business) or whose major activities are no longer in retail trade (out-of-scope). The status of these businesses is updated on the BR using administrative sources and survey feedback, including feedback from the MRTS. Methods to treat dead units and misclassified units are part of the sample and population update procedures.

Data sources

Data collection for this reference period: 2017-01-12 to 2017-01-30

Responding to this survey is mandatory.

Data are collected directly from survey respondents and extracted from administrative files.

Collection of the data is performed by Statistics Canada's Regional Offices. Respondents are sent an electronic or paper questionnaire or are contacted by telephone to obtain their sales, internet sales and inventory values, as well as to confirm the opening or closing of business trading locations. The Regional Offices also follow up with non-respondents. Collection of the data begins approximately 7 working days after the end of the reference month and continues for the duration of that month.

New entrants to the survey receive an introductory letter informing them that a representative of Statistics Canada will be calling. The purpose of this call is to introduce the respondent to the survey, confirm the respondent's business activity, establish and begin data collection, and answer any questions that the respondent may have.

If data are unavailable at the time of collection, a respondent's best estimates are also accepted, and are subsequently revised once the actual data become available. To minimize total non-response for all variables, partial responses are accepted.

Managing response burden is an ongoing challenge for Statistics Canada. In an attempt to alleviate response burden and survey costs, especially for smaller businesses, the MRTS has reduced the number of simple establishments in the sample that are surveyed directly and instead derives sales data for these establishments from Goods and Services Tax (GST) files using a statistical model. The model accounts for differences between sales and revenue (reported for GST purposes) as well as for the time lag between the survey reference period and the reference period of the GST file.

For more information on the methodology used for modeling sales from administrative data sources, refer to 'Monthly Retail Trade Survey - Use of Administrative Data' under 'Documentation'.

View the Questionnaire(s) and reporting guide(s).

Error detection

Data editing is the application of checks to detect missing, invalid or inconsistent entries or to point to data records that are potentially in error. In the survey process for the MRTS, data editing is done at two different time periods.

Editing is performed during data collection. Once data are collected via the telephone, or via the receipt of completed questionnaires, the data are captured and/or edited using customized data capture applications. Edits during data collection are referred to as field edits and generally consist of validity and some simple consistency edits. They are used to detect mistakes made during the interview by the respondent or the interviewer and to identify missing information during collection in order to reduce the need for follow-up later on. Another purpose of the field edits is to clean up responses. In the MRTS, the current month's responses are edited against the respondent's previous month's responses and/or the previous year's responses for the current month. Field edits are also used to identify problems with data collection procedures and the design of the questionnaire, as well as the need for more interviewer training.

Follow-up with respondents occurs to validate potentially erroneous data following any failed preliminary edit check of the data. Once validated, the collected data are regularly transmitted to the head office in Ottawa.

Statistical editing is also conducted after data collection and this is more empirical in nature. Statistical editing is run prior to imputation in order to identify the data that will be used as a basis to impute non-respondents. Large outliers that could disrupt a monthly trend are excluded from trend calculations by the statistical edits. It should be noted that adjustments are not made at this stage to correct the reported outliers.

The first step in statistical editing is to identify which responses will be subjected to the statistical edit rules. Reported data for the current reference month will go through various edit checks.

The first set of edit checks is based on the Hidiroglou-Berthelot method, whereby the ratio of the respondent's current month data to historical (last month, same month last year) or auxiliary data is analyzed. When the respondent's ratio differs significantly from the ratios of respondents who are similar in terms of industry and/or geography group, the response is deemed an outlier.
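
A minimal sketch of a Hidiroglou-Berthelot style check for one group of similar respondents is given below. The parameter values (u, a, c) and the function name are assumptions for illustration; the values used in MRTS production are not stated here. All inputs are assumed to be strictly positive.

```python
from statistics import median, quantiles


def hb_outliers(current, previous, u=0.5, a=0.05, c=4.0):
    """Flag outliers among paired current/previous values (all > 0)."""
    ratios = [x / y for x, y in zip(current, previous)]
    r_med = median(ratios)
    # Transform the ratios so that decreases and increases are
    # measured symmetrically around the median ratio.
    s = [1 - r_med / r if r < r_med else r / r_med - 1 for r in ratios]
    # Size effect: larger units are given more importance.
    e = [si * max(x, y) ** u for si, x, y in zip(s, current, previous)]
    q1, e_med, q3 = quantiles(e, n=4, method="inclusive")
    # Guard against quartile distances that are too small.
    d_low = max(e_med - q1, abs(a * e_med))
    d_high = max(q3 - e_med, abs(a * e_med))
    lower, upper = e_med - c * d_low, e_med + c * d_high
    return [not (lower <= ei <= upper) for ei in e]
```

With previous values all equal to 100 and current values of 92, 95, 100, 105, 110, 100 and 300, only the last unit falls outside the acceptance interval under these assumed parameters.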

The second set of edit checks consists of share of market edits. With this method, all respondents can be edited, even those for which historical and auxiliary data are unavailable. The method relies on current month data only. Within a group of respondents that are similar in terms of industrial group and/or geography, a respondent is flagged as an outlier if its weighted contribution to the group's total is too large.
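
The share of market edit can be sketched as follows for one group of similar respondents. The threshold of 50 percent and the function name are hypothetical; the actual threshold is not given here.

```python
def share_of_market_outliers(values, weights, max_share=0.5):
    """Flag units whose weighted contribution to the group total is too large."""
    contributions = [v * w for v, w in zip(values, weights)]
    total = sum(contributions)
    if total == 0:
        # No sales in the group: nothing can dominate.
        return [False] * len(contributions)
    return [c / total > max_share for c in contributions]
```

For reported values of 100, 120, 5,000 and 90 with sampling weights of 10, 10, 1 and 10, the third unit contributes about 62 percent of the weighted total and would be flagged.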

For edit checks based on the Hidiroglou-Berthelot method, data that are flagged as outliers are not included in the imputation models (those based on ratios). Likewise, data that are flagged as outliers by the share of market edit are not included in the imputation models where means and medians are calculated to impute for units that have no historical responses.

In conjunction with the statistical editing after data collection of reported data, there is also error detection done on the extracted GST data. Modeled data based on the GST are also subject to an extensive series of processing steps which thoroughly verify each record that is the basis for the model as well as the record being modeled. Edits are performed at a more aggregate level (industry by geography level) to detect records which deviate from the expected range, either by exhibiting large month-to-month change, or differing significantly from the remaining units. All data which fail these edits are subject to manual inspection and possible corrective action.

Imputation

In the MRTS, imputation is based on historical data or administrative data (GST sales). The appropriate method is selected according to a strategy that is based on whether historical data are available, whether auxiliary data are available, and which reference month is being processed.

There are three types of historical imputation methods. The first type is a general trend that uses one historical data source (data from the previous month, data from the next month, or data from the same month of the previous year). The second type is a regression model in which data from the previous month and from the same month of the previous year are used simultaneously. The third type uses the historical data as a direct replacement value for a non-respondent. For each reference month, the methods are tried in an order of preference designed to give the best available imputation. The third type is always the last option in that order.
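
The first type of historical imputation can be illustrated with a simple ratio (trend) sketch for one imputation class. It assumes that every unit has a historical value and that the trend is the ratio of totals among respondents; the exact MRTS model is not specified here, and the function name is hypothetical.

```python
def trend_impute(previous, current):
    """Impute missing current values (None) with previous value times trend.

    The trend is the ratio of current to previous totals, computed
    over the units that responded in the current month.
    """
    pairs = [(p, c) for p, c in zip(previous, current) if c is not None]
    trend = sum(c for _, c in pairs) / sum(p for p, _ in pairs)
    return [c if c is not None else p * trend for p, c in zip(previous, current)]
```

For example, if two respondents moved from 100 and 200 to 110 and 220, the trend is 1.1 and a non-respondent with a previous value of 50 is imputed at about 55. Setting the trend to 1 corresponds to the third type, direct replacement by the historical value.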

The imputation method using administrative data is automatically selected when historical information is unavailable for a non-respondent. Trends are then applied to the administrative data source (monthly size) depending on whether the structure is simple, e.g. enterprises with only one establishment, or the unit has a more complex structure.

Estimation

Estimation is a process that approximates unknown population parameters using only the part of the population that is included in a sample. Inferences about these unknown parameters are then made, using the sample data and associated survey design. This stage uses Statistics Canada's Generalized Estimation System (GES).

For retail sales, the population is divided into a survey portion (take-all and take-some strata) and a non-survey portion (take-none stratum). From the sample that is drawn from the survey portion, an estimate for the population is determined through the use of a Horvitz-Thompson estimator where responses for sales are weighted by using the inverses of the inclusion probabilities of the sampled units. Such weights (called sampling weights) can be interpreted as the number of times that each sampled unit should be replicated to represent the entire population. The calculated weighted sales values are summed by domain to produce the total sales estimates for each combination of industrial group and geographic area. A domain is defined by the most recent classification values available from the BR for the unit and the survey reference period. These domains may differ from the original sampling strata because units may have changed size, industry or location. Changes in classification are reflected immediately in the estimates and do not accumulate over time. For the non-survey portion, the sales are estimated with statistical models using monthly GST sales.
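
The Horvitz-Thompson estimate by domain described above can be sketched as follows. The record layout (domain, sales, inclusion probability) and the function name are assumptions for illustration; production estimates are computed with the Generalized Estimation System.

```python
from collections import defaultdict


def horvitz_thompson_totals(records):
    """Sum weighted sales by domain.

    records: iterable of (domain, sales, inclusion_probability) tuples.
    """
    totals = defaultdict(float)
    for domain, sales, pi in records:
        # Sampling weight is the inverse of the inclusion probability.
        totals[domain] += sales / pi
    return dict(totals)
```

A take-all unit has an inclusion probability of 1 and represents only itself, while a unit sampled with probability 0.25 carries a weight of 4.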

Sales in volume: The value of retail trade is measured in two ways: including the effects of price change on sales, and net of the effects of price change. The first measure is referred to as retail trade in current dollars and the second as retail trade in constant dollars. The method of calculating the current dollar estimate is to aggregate the weighted value of sales for all retail outlets. The method of calculating the constant dollar estimate is to first adjust the sales values to a base year, using the Consumer Price Index, and then sum up the resulting values. For more information, see 'Monthly Retail Trade Survey - Sales in volume' in the 'Documentation' section below.
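
The constant dollar calculation can be illustrated with a simplified sketch. It assumes one price index value per sales component, expressed relative to a base of 100; the level of detail at which deflation is actually applied is described in the referenced document, not here. The function name is hypothetical.

```python
def constant_dollar_total(sales, price_index, base=100.0):
    """Deflate each sales value to base-year prices, then sum.

    sales: current dollar values, one per component.
    price_index: matching price index values (base year = base).
    """
    return sum(s * base / p for s, p in zip(sales, price_index))
```

For example, current dollar sales of 110 and 220 with a price index of 110 for both components give a constant dollar total of 300.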

The measure of precision used for the MRTS to evaluate the quality of a population parameter estimate and to obtain valid inferences is the variance. The variance from the survey portion is derived directly from a stratified simple random sample without replacement.

Sample estimates may differ from the expected value of the estimates. However, since the estimate is based on a probability sample, the variability of the sample estimate with respect to its expected value can be measured. The variance of an estimate is a measure of the precision of the sample estimate and is defined as the average, over all possible samples, of the squared difference of the estimate from its expected value.
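
For a stratified simple random sample without replacement, the standard textbook estimator of the variance of an estimated total can be sketched as below. The input layout and function name are assumptions for illustration, and the sketch ignores the adjustments (for example, for imputation or domain estimation) that a production system would make.

```python
from statistics import variance


def stratified_total_variance(strata):
    """Estimate the variance of a stratified SRSWOR total.

    strata: list of (N_h, sample_values), where N_h is the stratum
    population size and sample_values are the sampled observations.
    """
    v = 0.0
    for pop_size, values in strata:
        n = len(values)
        if n == pop_size:
            # A take-all (census) stratum has no sampling variance.
            continue
        # N_h^2 * (1 - n_h / N_h) * s_h^2 / n_h; requires n_h >= 2.
        v += pop_size ** 2 * (1 - n / pop_size) * variance(values) / n
    return v
```

For example, a take-all stratum contributes nothing, while a stratum of 10 units from which the values 1 and 3 were sampled contributes 100 x 0.8 x 2 / 2 = 80.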

Quality evaluation

Prior to publication, combined survey results are analyzed for comparability; in general, this includes a detailed review of individual responses (especially for the largest companies), general economic conditions, and historical trends.

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible direct disclosure, which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.
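
The two conditions for direct disclosure can be illustrated with a generic sketch. The thresholds below (minimum of three respondents, 80 percent dominance by the largest contributor) and the function name are purely hypothetical and do not represent Statistics Canada's actual confidentiality rules.

```python
def cell_is_sensitive(contributions, min_respondents=3, top_k=1, dominance_share=0.8):
    """Return True if a tabulation cell fails a simple disclosure check."""
    # Too few respondents in the cell.
    if len(contributions) < min_respondents:
        return True
    total = sum(contributions)
    if total == 0:
        return False
    # Cell dominated by its top_k largest contributors.
    top = sum(sorted(contributions, reverse=True)[:top_k])
    return top / total > dominance_share
```

Under these assumed thresholds, a cell with contributions of 90, 5, 3 and 2 would be flagged because one respondent accounts for 90 percent of the cell total.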

Revisions and seasonal adjustment

Revisions in the raw data are required to correct known non-sampling errors. These normally include replacing imputed data with reported data, corrections to previously reported data, and estimates for new births that were not known at the time of the original estimates.

Raw data are revised, on a monthly basis, for the month immediately prior to the current reference month being published. That is, when data for December are being published for the first time, there will also be revisions, if necessary, to the raw data for November. In addition, revisions are made once a year, with the initial release of the February data, for all months in the previous years. The purpose is to correct any significant problems that have been found that apply for an extended period. The actual period of revision depends on the nature of the problem identified, but rarely exceeds three years. The revision period can be extended when historical revisions or restratification are done.

Retail trade data are seasonally adjusted using the X-12-ARIMA method. This consists of extrapolating a year's worth of raw data with the ARIMA model (auto-regressive integrated moving average model), and of seasonally adjusting the raw time series. Finally, the annual totals of the seasonally adjusted series are forced to the annual totals of the original series.
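
The idea of forcing annual totals can be illustrated with the simplest possible approach, a pro-rata adjustment within one calendar year. This is an assumption made for illustration only: the forcing method actually applied is not described here, and seasonal adjustment software generally uses smoother benchmarking techniques that avoid steps between years. The function name is hypothetical.

```python
def force_annual_total(adjusted, raw):
    """Scale one year of seasonally adjusted values to match the raw annual total."""
    factor = sum(raw) / sum(adjusted)
    return [x * factor for x in adjusted]
```

After the adjustment, the twelve seasonally adjusted values sum to the same annual total as the twelve raw values, while their month-to-month proportions are unchanged.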

The seasonally adjusted data also need to be revised. In part, they need to reflect the revisions identified for the raw data. Also, the seasonally adjusted estimates are calculated using X-12-ARIMA, and are sensitive to the most recent values reported in the raw data. For this reason, with the release of each month of new data, the seasonally adjusted values for the previous three months are revised. A seasonally adjusted time series is a time series that has been modified to eliminate the effect of seasonal and calendar influences. Seasonally adjusted data therefore allow for more meaningful comparisons of economic conditions from month to month.

Once a year, seasonal adjustment options are reviewed to take into account the most recent data. Revised seasonally adjusted estimates for each month in the previous years are released at the same time as the annual revision to the raw data. The actual period of revision depends on the number of years for which the raw data were revised.

Documentation
