Monthly Retail Trade Survey (MRTS)

Detailed information for April 2022

Status: Active

Frequency: Monthly

Record number: 2406

The Monthly Retail Trade Survey collects data on sales, e-commerce sales, and the number of retail locations by province, territory, and selected census metropolitan areas from a sample of retailers.

Data release - June 21, 2022

Description

Retail sales estimates are a key monthly indicator of consumer purchasing patterns in Canada. Furthermore, retail sales are an important component of the Gross Domestic Product, which measures Canada's production, and are part of many economic models used by public and private agencies. The Bank of Canada relies partly on monthly retail sales estimates when making decisions that influence interest rates. Businesses use retail sales estimates to track their own performance against industry averages and to prepare investment strategies.

Reference period: Month

Collection period: Collection of the data begins approximately 7 working days after the end of the reference month, and continues for the duration of that calendar month.

Subjects

  • Retail and wholesale
  • Retail sales by type of store

Data sources and methodology

Target population

The target population consists of all statistical establishments on Statistics Canada's Business Register (BR) that are classified to the retail sector using the North American Industry Classification System (NAICS 2017). The NAICS code range for the retail sector is 441100 to 454110.

The exclusions to the target population are establishments with a missing or a zero gross business income value on the BR and establishments in the following non-covered NAICS:

- 4542 (Vending machine operators)
- 45431 (Fuel dealers)
- 45439 (Other direct selling establishments)

Instrument design

Both electronic and paper questionnaires are used to collect data for the Monthly Retail Trade Survey (MRTS). The questionnaires were developed at Statistics Canada and were reviewed and tested in the field in both official languages. In the course of redeveloping the MRTS, Statistics Canada consulted with a number of retailers as well as with industry associations. In 2016, the code 454110 of the North American Industry Classification System was added to the MRTS questionnaire, and the questionnaire became available to respondents in electronic format.

Sampling

This is a sample survey with a cross-sectional design.

The Business Register is a repository of information reflecting the Canadian business population and exists primarily for the purpose of supplying frames for all economic surveys in Statistics Canada. It is designed to provide a means of coordinating the coverage of business surveys and of achieving consistent classification of statistical reporting units. It also serves as a data source for the compilation of business demographic information.

The major sources of information for the Business Register are updates from the Statistics Canada survey program and from Canada Revenue Agency's (CRA) Business Number account files. This CRA administrative data allows for the creation of a universe of all business entities.

Data sources

Responding to this survey is mandatory.

Data are collected directly from survey respondents and extracted from administrative files.

Collection of the data is performed by Statistics Canada's Regional Offices. Respondents are sent an electronic or paper questionnaire or are contacted by telephone to obtain their sales, internet sales and inventory values, as well as to confirm the opening or closing of business trading locations. Collection also undertakes follow-up of non-respondents. Collection of the data begins approximately 7 working days after the end of the reference month and continues for the duration of that calendar month.

New entrants to the survey are introduced to the survey via introductory questions that confirm the respondent's business activity and contact information.

If data are unavailable at the time of collection, a respondent's best estimates are also accepted, and are subsequently revised once the actual data become available.

To minimize total non-response for all variables, partial responses are accepted.

ADMINISTRATIVE DATA SOURCES
Reducing response burden is an ongoing challenge for Statistics Canada. To alleviate response burden and survey costs, especially for smaller businesses, the Monthly Retail Trade Survey derives sales data for low-revenue establishments from Goods and Services Tax (GST) files using a ratio estimator. The ratio estimator also increases the precision of the surveyed portion of the estimate.

Error detection

Data editing is the application of checks to detect missing, invalid or inconsistent entries or to point to data records that are potentially in error. In the survey process for the Monthly Retail Trade Survey (MRTS), data editing is done at two different time periods.

Editing is performed during data collection. Once data are collected via the telephone, or via the receipt of completed questionnaires, the data are captured and edited using customized data capture applications. Edits during data collection are referred to as field edits and generally consist of validity and some simple consistency edits. They are used to detect mistakes made during the interview by the respondent or the interviewer and to identify missing information during data collection in order to reduce the need for follow-up later on. Another purpose of the field edits is to clean up responses. In the MRTS, the current month's responses are edited against the respondent's previous month's responses or the previous year's responses for the current month. Field edits are also used to identify problems with data collection procedures and the design of the questionnaire, as well as the need for more interviewer training.

Follow-up with respondents occurs to validate potentially erroneous data following any failed preliminary edit check. Once validated, the collected data are regularly transmitted to the head office in Ottawa.

Statistical editing is also conducted after data collection and this is more empirical in nature. Statistical editing is run prior to imputation in order to identify the data that will be used as a basis to impute non-respondents. Large outliers that could disrupt a monthly trend are excluded from trend calculations by the statistical edits. It should be noted that adjustments are not made at this stage to correct the reported outliers.

The first step in statistical editing is to identify which responses will be subjected to the statistical edit rules. Reported data for the current reference month will go through various edit checks.

The edit checks are based on the Hidiroglou-Berthelot method, whereby the ratio of the respondent's current month data to historical (i.e. last month or same month last year) or auxiliary data is analyzed. When the respondent's ratio differs significantly from the ratios of respondents who are similar in terms of industry and/or geography group, the response is deemed an outlier.

Data that are flagged as an outlier will not be included in the imputation models (those based on ratios).
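
As an illustration of the ratio-based check described above, the following Python sketch applies a Hidiroglou-Berthelot style edit to one group of similar respondents. The function name and the parameter values (u, a, c) are assumptions chosen for the example, not the settings used in MRTS production, and the sketch assumes strictly positive values in both periods.

```python
from statistics import median, quantiles


def hb_outliers(current, previous, u=0.5, a=0.05, c=4.0):
    """Flag outliers in one group with a Hidiroglou-Berthelot style edit.

    current, previous: positive values for the same units in two periods.
    u, a, c: illustrative tuning parameters (assumed, not MRTS settings).
    Returns a list of booleans, True where the unit is flagged.
    """
    ratios = [cur / prev for cur, prev in zip(current, previous)]
    r_med = median(ratios)

    # Symmetric transformation so that decreases and increases
    # are treated on the same scale around the median ratio.
    s = [1 - r_med / r if r < r_med else r / r_med - 1 for r in ratios]

    # Size effect: give more importance to changes in larger units.
    e = [
        si * max(cur, prev) ** u
        for si, cur, prev in zip(s, current, previous)
    ]

    q1, e_med, q3 = quantiles(e, n=4, method="inclusive")

    # Quartile distances, floored so that tightly clustered effects
    # do not produce an overly narrow acceptance interval.
    d_q1 = max(e_med - q1, abs(a * e_med))
    d_q3 = max(q3 - e_med, abs(a * e_med))

    lower = e_med - c * d_q1
    upper = e_med + c * d_q3
    return [not (lower <= ei <= upper) for ei in e]


# Hypothetical data: seven similar retailers, one with a fourfold jump.
current_month = [105, 98, 110, 102, 400, 95, 101]
previous_month = [100, 100, 100, 100, 100, 100, 100]
flags = hb_outliers(current_month, previous_month)
```

In this hypothetical group only the fifth unit is flagged. Under the rule stated above, a flagged response would be left out of the ratio-based imputation models rather than corrected.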

Imputation

In the Monthly Retail Trade Survey, imputation is based on historical data or administrative data. The appropriate method is selected according to a strategy that depends on whether historical data are available, whether auxiliary data are available, and which reference month is being processed.

There are three types of historical imputation methods. The first type is a general trend that uses one historical data source (data from the previous month, the next month, or the same month of the previous year). The second type is a regression model in which data from the previous month and from the same month of the previous year are used simultaneously. The third type uses the historical data as a direct replacement value for a non-respondent. For each reference month, the methods are tried in an order of preference designed to yield the best-quality imputation. The third type is always the last option in that order.

The imputation method using administrative data is automatically selected when historical information is unavailable for a non-respondent. Trends are then applied to the administrative data source (monthly size) depending on whether the structure is simple, e.g. enterprises with only one establishment, or the unit has a more complex structure.
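
To make the first type of historical imputation more concrete, the following Python sketch applies a general trend to a non-respondent's previous value. It is a simplified illustration only: the function name, the way the trend is computed (ratio of summed current to summed previous values among usable respondents in the same group), and the data are assumptions, not the production MRTS specification.

```python
def trend_impute(previous_value, donors):
    """Impute a current value from one historical value and a group trend.

    previous_value: the non-respondent's own historical value.
    donors: (current, previous) pairs from usable respondents in the same
            group, i.e. after outliers have been excluded (assumed input).
    """
    total_current = sum(cur for cur, _ in donors)
    total_previous = sum(prev for _, prev in donors)
    trend = total_current / total_previous
    return previous_value * trend


# Hypothetical group whose sales grew by 10% between the two periods.
donor_pairs = [(110.0, 100.0), (220.0, 200.0)]
imputed = trend_impute(50.0, donor_pairs)  # about 55
```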

Estimation

Estimation is a process by which Statistics Canada obtains values for the population of interest so that it can draw conclusions about that population based on information gathered from only a sample of the population. More specifically, the Monthly Retail Trade Survey (MRTS) uses a ratio estimator.

Ratio estimation consists of replacing the initial sampling weights (defined as the inverse of the probability of selection in the sample) by new weights in a manner that satisfies the constraints of calibration. Calibration ensures that the total of an auxiliary variable estimated using the sample equals the sum of the auxiliary variable over the entire population, and that the new sampling weights are as close as possible (using a specific distance measure) to the initial sampling weights.

For example, suppose that the known population total of the auxiliary variable is equal to 100 and based on a sample the estimated total is equal to 90, so that we are underestimating by approximately 10%. Since we know the population total of the auxiliary variable, it would be reasonable to increase the weights of the sampled units so that the estimate would be exactly equal to it. Now since the variable of interest is related to the auxiliary variable, it is not unreasonable to believe that the estimate of the sales based on the same sample and weights as the estimate of the auxiliary variable may also be an underestimation by approximately 10%. If this is in fact the case, then the adjusted weights could be used to produce an alternative estimator of the total sales. This alternate estimator is called the ratio estimator.

In essence, the ratio estimator tries to compensate for 'unlucky' samples and brings the estimate closer to the true total. The gain in variance will depend on the strength of the relationship between the variable of interest and the auxiliary data.
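
The numerical example above (a known auxiliary total of 100 against a sample estimate of 90) can be reproduced with a short Python sketch. The function name and the unit-level values are hypothetical, and a single adjustment factor is applied to the whole sample for simplicity.

```python
def ratio_estimate(weights, sales, aux, aux_population_total):
    """Return the ratio estimate of total sales and the adjusted weights.

    weights: initial sampling weights (inverse selection probabilities).
    sales: variable of interest for each sampled unit.
    aux: auxiliary variable for each sampled unit.
    aux_population_total: known population total of the auxiliary variable.
    """
    est_aux = sum(w * x for w, x in zip(weights, aux))
    est_sales = sum(w * y for w, y in zip(weights, sales))

    # Adjustment factor: how far the sample under- or over-estimates
    # the known auxiliary total.
    g = aux_population_total / est_aux

    adjusted_weights = [w * g for w in weights]
    return est_sales * g, adjusted_weights


# Hypothetical sample of three units, each with an initial weight of 10.
w = [10.0, 10.0, 10.0]
x = [2.0, 3.0, 4.0]   # auxiliary variable; weighted total is 90
y = [4.0, 6.0, 8.0]   # sales; weighted total is 180
estimate, new_w = ratio_estimate(w, y, x, aux_population_total=100.0)
```

Here the weights are scaled up by 100/90, so the sales estimate rises from 180 to about 200, and the adjusted weights reproduce the auxiliary total of 100.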

The take-none portion is taken into account by the ratio estimator. This is done by simply including the take-none portion in the control totals for the sample portion. By doing this, the weights for the sampled portion will be increased in such a way that the estimates will be adjusted to take into account the take-none portion.

The calculated weighted sales values are summed by domain, to produce the total sales estimates by each industrial group / geographic area combination. A domain is defined as the most recent classification values available from the Business Register for the unit and the survey reference period. These domains may differ from the original sampling strata because units may have changed size, industry or location. Changes in classification are reflected immediately in the estimates and do not accumulate over time.

Sales in volume:
Changes in current dollar retail sales can be decomposed into two elements: a price element, or the part of the growth linked to price variations and a volume element, which covers the change in quantities and quality of the goods and services sold. Volume measures can be obtained by removing the price variations, as measured by appropriate price indexes, from the current dollar value of sales. This process is known as deflation. As of the January 2021 reference period, constant dollar estimates of the Monthly Retail Trade Survey use industry based price indexes from the Retail Services Price Index (RSPI). The Retail Services Price Index provides information from a wide range of retail industries on the average monthly purchase price (amount paid by the business unit for the acquisition of a given product) and the average monthly selling price (amount received by the business unit for selling the same product), excluding taxes. The Monthly Retail Trade Survey uses the changes in the retail selling price variable of the RSPI. To calculate monthly retail trade volume data, current dollar sales by store type are divided by their respective retail selling price index to derive constant price sales by store type. The volume of total retail sales at constant prices is the sum of the volume of sales at constant prices by store type.
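
The deflation step described above can be sketched in Python as follows. The store types, sales figures and index values are hypothetical, and the sketch assumes the price index is expressed with a base of 100.

```python
def sales_in_volume(current_sales, price_index, base=100.0):
    """Deflate current dollar sales by store type and sum the results.

    current_sales: current dollar sales keyed by store type.
    price_index: retail selling price index keyed by the same store types.
    base: index value of the base period (assumed to be 100).
    Returns (constant price sales by store type, total volume).
    """
    by_type = {
        store: current_sales[store] / (price_index[store] / base)
        for store in current_sales
    }
    return by_type, sum(by_type.values())


# Hypothetical figures for two store types.
sales = {"store type A": 110.0, "store type B": 210.0}
index = {"store type A": 110.0, "store type B": 105.0}
volume_by_type, total_volume = sales_in_volume(sales, index)
```

With these figures the constant price sales are about 100 and 200, for a total volume of about 300.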

Volume estimates of monthly retail sales are published about fifty-two days after the end of the reference period, but the RSPI data that are essential to produce them are available quarterly, within 92 days of the reference quarter. Therefore, a machine learning method is used to project the RSPI for missing months. The specific machine learning method used is artificial neural networks in which each NAICS is modelled separately. The response variable is the index relative for the NAICS at each month; the inputs include 13 months of historical relatives as well as information from the Consumer Price Index (CPI), Bureau of Labor Statistics (BLS) Producer Price Index relatives, 12 month indicators and other alternative data sources.

Constant dollar estimates prior to the January 2021 reference period use a methodology that combines information from the Retail Commodity Survey and the Consumer Price Index. More information on this methodology can be found at the bottom, under Documentation: Monthly Retail Trade Survey (MRTS) - Sales in Volume - ARCHIVED. For the purposes of time-series continuity, this methodology remains in place to deflate current dollar sales of the following industries:

• 4413 Automotive parts, accessories and tire stores
• 4422 Home furnishings stores
• 4452 Specialty food stores
• 4481 Clothing stores
• 4521 Department stores
• 453 Miscellaneous stores
• 453993 Cannabis stores

Quality evaluation

Prior to publication, combined survey results are analyzed for comparability; in general, this includes a detailed review of individual responses (especially for the largest companies), general economic conditions, and historical trends.

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible direct disclosure, which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.

Revisions and seasonal adjustment

Revisions in the raw data are required to correct known non-sampling errors. These normally include replacing imputed data with reported data, corrections to previously reported data, and estimates for new births that were not known at the time of the original estimates.

Raw data are revised, on a monthly basis, for the month immediately prior to the current reference month being published. That is, when data for December are being published for the first time, there will also be revisions, if necessary, to the raw data for November. In addition, revisions are made once a year, with the initial release of the February data, for all months in the previous years. The purpose is to correct any significant problems that have been found that apply for an extended period. The actual period of revision depends on the nature of the problem identified, but rarely exceeds three years. The revision period can be extended when historical revisions or restratification are done.

Retail trade data are seasonally adjusted using the X-12-ARIMA method. This consists of extrapolating a year's worth of raw data with the ARIMA model (auto-regressive integrated moving average model), and of seasonally adjusting the raw time series. Finally, the annual totals of the seasonally adjusted series are forced to the annual totals of the original series.

The seasonally adjusted data also need to be revised. In part, they need to reflect the revisions identified for the raw data. In addition, the seasonally adjusted estimates are calculated using X-12-ARIMA and are sensitive to the most recent values reported in the raw data. For this reason, with the release of each month of new data, the seasonally adjusted values for the previous three months are revised. A seasonally adjusted time series is a time series that has been modified to eliminate the effect of seasonal and calendar influences. Seasonally adjusted data therefore allow for more meaningful comparisons of economic conditions from month to month.

Once a year, seasonal adjustment options are reviewed to take into account the most recent data. Revised seasonally adjusted estimates for each month in the previous years are released at the same time as the annual revision to the raw data. The actual period of revision depends on the number of years for which the raw data were revised.

Data accuracy

The methodology of this survey has been designed to control errors and to reduce their potential effects on estimates. However, the survey results remain subject to errors, of which sampling error is only one component of the total survey error. Sampling error results when observations are made only on a sample and not on the entire population.

All other errors arising from the various phases of a survey are referred to as non-sampling errors. For example, these types of errors can occur when a respondent provides incorrect information or does not answer certain questions; when a unit in the target population is omitted or covered more than once; when a unit that is out of scope for the survey is included by mistake or when errors occur in data processing, such as coding or capture errors. While the impact of non-sampling errors is difficult to evaluate, certain measures such as response and imputation rates can be used as indicators of the potential level of non-sampling error.

Coefficients of variation and response rates are major data quality measures used to validate results from the Monthly Retail Trade Survey (MRTS).

The coefficient of variation, defined as the standard error divided by the sample estimate, is a measure of precision in relative terms. Since the coefficient of variation is calculated from responses of individual units, it also measures some non-sampling errors.
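
As a small worked illustration of this definition, the Python sketch below computes a coefficient of variation from a standard error and an estimate. The function name and the figures are hypothetical.

```python
def coefficient_of_variation(standard_error, estimate):
    """Standard error divided by the estimate: relative precision."""
    return standard_error / estimate


# Hypothetical estimate of 60 (in billions of dollars) with a
# standard error of 1.5, giving a CV of about 0.025, i.e. 2.5%.
cv = coefficient_of_variation(1.5, 60.0)
```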

RESPONSE RATES
The average collection response rate for 2021 is 89.9%.

NON-SAMPLING ERROR
Revisions in the raw data are required to correct known non-sampling errors. These normally include replacing imputed data with reported data, corrections to previously reported data, and estimates for new births that were not known at the time of the original estimates.

NON-RESPONSE BIAS
Non-response has two effects on data: first it introduces bias in estimates when non-respondents differ from respondents in the characteristics measured; and second, it contributes to an increase in the sampling variance of estimates because the effective sample size is reduced from that originally sought.

The degree to which efforts are made to get a response from a non-respondent is based on budget and time constraints, the unit's impact on overall quality, and the risk of non-response bias.

The main method to reduce the impact of non-response at sampling is to inflate the sample size through the use of over-sampling rates that have been determined from similar surveys.

Besides the methods to reduce the impact of non-response at sampling and collection, the non-responses to the survey that do occur are treated through imputation. In order to measure the amount of non-response that occurs each month, various response rates are calculated. For a given reference month, the estimation process is run at least twice (a preliminary and a revised run). Between each run, respondent data can be identified as unusable and imputed values can be corrected through respondent data. As a consequence, response rates are computed following each run of the estimation process.

For the MRTS, two types of rates are calculated (un-weighted and weighted). In order to assess the efficiency of the collection process, un-weighted response rates are calculated. Weighted rates, using the estimation weight and the value for the variable of interest, assess the quality of estimation.

In summary, the response rates are calculated as follows:

Weighted rate:
Sum of weighted sales of units that have reported data / Sum of survey weighted sales

Un-weighted rate:
Number of questionnaires that have reported data / Number of questionnaires sent to collection
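
The two rates above can be computed as in the following Python sketch. The record layout (weight, sales, reported flag) and the figures are assumptions for illustration. The sketch treats each record as one questionnaire, and the sales of non-reporting units stand for their imputed values.

```python
def response_rates(units):
    """Compute the weighted and un-weighted response rates.

    units: list of (weight, sales, reported) tuples, where `reported`
           is True when the unit supplied usable data.
    Returns (weighted_rate, unweighted_rate).
    """
    weighted_reported = sum(w * s for w, s, rep in units if rep)
    weighted_total = sum(w * s for w, s, _ in units)

    n_reported = sum(1 for _, _, rep in units if rep)
    n_sent = len(units)

    return weighted_reported / weighted_total, n_reported / n_sent


# Hypothetical collection results for four units.
sample = [
    (1.0, 500.0, True),    # large unit with weight 1, reported
    (10.0, 20.0, True),
    (10.0, 30.0, False),   # imputed
    (5.0, 40.0, False),    # imputed
]
weighted_rate, unweighted_rate = response_rates(sample)
```

In this example half of the questionnaires reported data (un-weighted rate of 0.5), but because the largest unit responded, the weighted rate is higher, at 700/1200.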

COVERAGE ERROR
Coverage errors consist of omissions, erroneous inclusions, duplications and misclassification of units in the survey frame.

Statistics Canada's Business Register (BR) provides the frame for the Monthly Retail Trade Survey. The BR is a data service centre updated through a number of sources including administrative data files, feedback received from conducting Statistics Canada business surveys, and profiling activities including direct contact with companies to obtain information about their operations and Internet research findings. Using the BR will ensure quality, while avoiding overlap between surveys and minimizing response burden to the greatest extent possible.

OTHER NON-SAMPLING ERRORS
These errors may occur at various stages of data processing, such as coding, data entry, verification, editing, weighting, and tabulation. Non-sampling errors are difficult to measure. More importantly, they must be controlled to a level at which their presence does not impair the use and interpretation of the results.

Measures have been taken to minimize non-sampling errors. For example, units have been defined as precisely as possible and the most up-to-date listings have been used. Questionnaires have been carefully designed to minimize differing interpretations. In addition, detailed acceptance testing has been carried out for the different stages of data editing and processing, and every possible effort has been made to reduce the non-response rate as well as the response burden.

Documentation