Retail Commodity Survey (RCS)

Detailed information for second quarter 2017

Status:

Active

Frequency:

Quarterly

Record number:

2008

This survey collects detailed information about retail commodity sales in Canada to produce estimates of the distribution of the sales of various commodities at the national level, for different types of retail outlets in Canada.

Data release - October 16, 2017

Description

The Retail Commodity Survey (RCS) collects detailed information about retail commodity sales in Canada. The objective is to produce estimates of the distribution of the sales of various commodities at the national level, for different types of retail outlets in Canada. The survey complements the Monthly Retail Trade Survey (MRTS, Survey ID 2406): MRTS gathers total monthly retail sales, while RCS collects a breakdown of these sales by commodity.

The retailers in the Monthly Survey of Large Retailers (Survey ID 5027) are also included in the sample of the Retail Commodity Survey. The same questionnaire is used for both surveys. The data provided to the Monthly Survey of Large Retailers are incorporated into RCS. Excluding recreational and motor vehicle dealers, these large retailers account for about 33% of retail trade. The retailers belonging to the Monthly Survey of Large Retailers are included based on their sales size and contribution to the food, clothing, home furnishings, electronics, sporting goods and general merchandise sectors of retail trade.

The information provided by RCS can be used to track commodity sales within and across various types of retail stores, as well as to calculate commodity market share, and to gain a better understanding of the rapidly changing retail industry. The data show the type of outlets where consumers prefer to buy certain commodities, and the shifts in the different types of commodities retailers decide to sell. Analysis of these data assists in establishing trends in commodity sales over time.

The RCS data are used by Statistics Canada's System of National Accounts for the estimates of personal expenditure. Other users of the data include federal and provincial government departments, retail analysts, market researchers, industry experts and independent consultants.

Reference period: Quarter

Collection period: the month following the reporting period.

Subjects

  • Retail sales by type of product

Data sources and methodology

Target population

The Retail Commodity Survey has almost the same target population as the Monthly Retail Trade Survey (MRTS), the exception being the Electronic Shopping and Mail Order Houses industry (NAICS 454110) which is included in the MRTS but not the RCS.

The MRTS target population consists of all statistical establishments on Statistics Canada's Business Register (BR) that are classified to the retail sector using the North American Industry Classification System (NAICS 2012). The NAICS code range for the retail sector is 441100 to 454110.

The target population excludes ancillary establishments (establishments that provide services in support of the production of goods and services for the market by more than one establishment within the enterprise, and that serve as a cost centre or a discretionary expense centre for which the business can report data on all costs, including labour and depreciation), future establishments, establishments with a missing or zero gross business income (GBI) value on the BR, and establishments in the following non-covered NAICS industries:

- 4542 (vending machine operators)
- 45431 (fuel dealers)
- 45439 (other direct selling establishments)

Instrument design

The questionnaires were developed at Statistics Canada and were reviewed and tested in the field in both official languages. In the course of developing the survey, Statistics Canada consulted with a number of retailers as well as with industry associations. The questionnaire underwent significant changes in 2016 moving to an electronic questionnaire and the North American Product Classification System (NAPCS).

Sampling

This is a sample survey with a cross-sectional design.

The RCS sample is a subset of the retailers in the Monthly Retail Trade Survey (MRTS). It contains all the retailers in MRTS except those in NAICS 454110, which are excluded because they are non-store retailers.

The MRTS sample consists of 10,000 groups of establishments (clusters) classified to the Retail Trade sector and selected from the Statistics Canada Business Register. A cluster of establishments is defined as all establishments belonging to a statistical enterprise that are in the same industry and geographical region. The MRTS uses a stratified design with simple random sample selection in each stratum. The stratification is done by sampling group, defined at the three-, four- or five-digit NAICS level depending on the subsector, and by geographical region, consisting of the provinces and territories as well as three provincial sub-regions (Montreal, Toronto, and Vancouver). The population is further stratified by size. The size measure is created using a combination of independent survey data and three administrative variables: the GBI, the GST sales, and the T2 revenue (from the corporation tax return).

The size strata consist of one take-all (census), at most two take-some (partially sampled) strata, and one take-none (none sampled) stratum. Take-none strata serve to reduce respondent burden by excluding the smaller businesses from the surveyed population. These businesses should represent at most ten percent of total sales.
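The selection logic described above can be sketched as follows. This is only an illustration under simplifying assumptions, not the MRTS production procedure: the function name, the record fields and the single `take_some_fraction` parameter are hypothetical, and the actual design allocates the sample to meet target coefficients of variation rather than using one fixed fraction.

```python
import random
from collections import defaultdict


def select_sample(units, take_some_fraction, seed=0):
    """Stratified simple random selection without replacement.

    units: list of dicts with keys 'id', 'stratum' and 'size_class', where
    size_class is 'take_all', 'take_some' or 'take_none' (hypothetical layout).
    Returns the selected units, each with a design weight N_h / n_h.
    """
    rng = random.Random(seed)
    by_cell = defaultdict(list)
    for unit in units:
        by_cell[(unit["stratum"], unit["size_class"])].append(unit)

    sample = []
    for (stratum, size_class), members in sorted(by_cell.items()):
        if size_class == "take_all":
            chosen = members                      # census stratum
        elif size_class == "take_some":
            n = max(1, round(take_some_fraction * len(members)))
            chosen = rng.sample(members, n)       # SRS without replacement
        else:
            chosen = []                           # take-none: never surveyed
        for unit in chosen:
            weight = len(members) / len(chosen)
            sample.append({**unit, "design_weight": weight})
    return sample
```

Units in a take-all stratum always receive a weight of 1, while take-none units never enter the sample and so contribute nothing to the surveyed population.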

The sample was allocated optimally in order to reach target coefficients of variation at the national, provincial/territorial, industrial, and sampling group by province/territory levels. The sample was also inflated to compensate for dead, non-responding, and misclassified units.

MRTS is a repeated survey with maximization of monthly sample overlap. The sample is kept month after month and every month births are added to the sample and dead units are identified. MRTS births, i.e., new clusters of establishment(s), are identified every month via the BR's latest universe. They are stratified according to the same criteria as the initial population. A sample of these births is selected according to the sampling fraction of the stratum to which they belong and is added to the monthly sample. Deaths also occur on a monthly basis. A death can be a cluster of establishment(s) that have ceased their activities (out-of-business) or whose major activities are no longer in retail trade (out-of-scope). The status of these businesses is updated on the BR using administrative sources and survey feedback, including feedback from the MRTS.

Methods to treat dead units and misclassified units are part of the sample and population update procedures.

For the RCS, one five-digit NAICS industry is subject to a different sampling treatment: the New Car Dealers industry (NAICS 441110). For this industry, approximately 20 manufacturers and importers of new cars are surveyed through the New Motor Vehicle Dealer Commodity Survey to collect information on behalf of their dealers.

Data sources

Responding to this survey is mandatory.

Data are collected directly from survey respondents.

If a respondent finds it more convenient to report their commodity data to Statistics Canada on a monthly basis, they are allowed to do so. Respondents can report annually when the distribution of their sales does not vary throughout the year. The reporting period is the period in which the commodities were actually sold in the retail stores. The collection period is the period in which the data are collected.

Data are principally collected by electronic questionnaire and the Statistics Canada Regional Offices. A selected number of units are collected through the head office in Ottawa.

Respondents are given a choice of collection methods: electronic or paper questionnaire or telephone. They also have the choice to report commodity data in dollars or as a percentage of total sales and receipts. Telephone follow-up is conducted to resolve edit problems with mail-back questionnaires and to collect data from respondents who have not returned the questionnaire.

The initial contact with the respondent consists of sending the respondent a package including an introductory letter informing the respondent that a Statistics Canada representative will be calling. A sample questionnaire is also included. This package is followed by a telephone conversation to introduce the survey to the respondent, identify the person best able to provide the data and obtain a detailed profile of what the business sells over a one-year time frame. A profile is a list of all the commodities sold by the retailer. The electronic questionnaire is then tailored to the commodities sold by the retailer.

Commodity indices were developed to assist interviewers and respondents in choosing the most appropriate commodity codes to classify the type of items being sold by retailers. There are two indices -- one is organised by the North American Product Classification System (NAPCS) code and the other one is an alphabetical listing by product within the 5 digit classes of NAPCS.

View the Questionnaire(s) and reporting guide(s).

Error detection

During data collection, on-line edits are performed to check for consistency between the current period's data and the last period's data. If the commodities reported for the current period are inconsistent with the previous period, the data are verified with the respondent. Edits to ensure that the captured information is numerically valid and that all data fields are completed are also performed, as well as edits to ensure that the reporting period dates are valid.

Once the data are received back at head office an extensive series of processing steps is undertaken to thoroughly verify each record received. Edits are performed at the micro level to ensure that: the commodities sold make sense for the type of store; the sum of the individual commodities equals the total sales reported and that there are no missing fields; the total sales reported to this survey is in line with the sales reported to MRTS; and there are no large fluctuations in commodity sales from period to period. Records failing these edits are subject to manual inspection and possible corrective action.
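Two of the micro-level edits above (no missing fields, and commodity values summing to total sales) plus the comparison with MRTS sales can be sketched as below. This is a minimal illustration: the function name, the record layout and both tolerance values are assumptions, since the source does not state the thresholds actually used. The checks on commodity plausibility by store type and on period-to-period fluctuations are not covered.

```python
def check_record(commodity_sales, total_sales, mrts_sales,
                 sum_tolerance=0.01, mrts_tolerance=0.05):
    """Return the list of edit failures for one reported record.

    commodity_sales: dict mapping commodity code to reported value (None if blank).
    The tolerances are relative and purely illustrative.
    """
    failures = []

    # Every commodity field in the respondent's profile must be filled in.
    if any(value is None for value in commodity_sales.values()):
        failures.append("missing_field")

    # The individual commodities must add up to the reported total sales.
    reported_sum = sum(v for v in commodity_sales.values() if v is not None)
    if abs(reported_sum - total_sales) > sum_tolerance * max(total_sales, 1.0):
        failures.append("sum_mismatch")

    # Total sales reported to RCS should be in line with sales reported to MRTS.
    if abs(total_sales - mrts_sales) > mrts_tolerance * max(mrts_sales, 1.0):
        failures.append("mrts_mismatch")

    return failures
```

A record returning a non-empty list would be routed to manual inspection in this sketch.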

Imputation

An automated imputation system is used to impute for missing or erroneous data. Non-respondents, as well as respondents with one or more fields flagged for imputation (due to incomplete or inconsistent data identified during the editing process), are subject to imputation. Since the RCS sample is monthly-based, the imputation system processes the data for one reference month at a time. The system makes use of the auxiliary information available from MRTS. Since all retailers in RCS are also in MRTS, the total sales for each record is obtained from the MRTS file after the MRTS edit and imputation process has been completed. The commodity fields are then imputed one at a time using the following methods.

For non-respondents, the system uses the most recent historical data available to determine which commodities are sold by the retailer. RCS uses adjusted historical imputation to impute for total non-response. Data from the retailer for the same month of the previous year are used. If those data are unavailable, the previous month's data are used.

For respondents with fields requiring imputation and for non-respondents the system uses recent adjusted historical data available to impute commodity values. Where there is no historical data available, commodity values are imputed using nearest-neighbor donor imputation, and where that is not possible, commodity values are imputed by ratio imputation using a current auxiliary variable. Imputation groups of similar retailers are formed on the basis of type of store, and geographic region. Values imputed to a unit will be derived from the values of respondents belonging to the same group. Respondents that are considered to be outliers (either due to extremely large fluctuations in their commodity distributions when compared to their previous data, or due to unusual commodity sales for the type of store) are excluded from the group. For each commodity requiring imputation, the ratio of the group's commodity sales to the group's total sales is applied to the unit's total sales. When there are not sufficient respondents in an imputation group, groups at successively more aggregated levels of type of store and geographic region are used.
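The ratio step at the end of this hierarchy can be sketched as follows. The function name and record layout are hypothetical; the sketch assumes the imputation group has already been formed and that outliers have already been removed from it.

```python
def ratio_impute(unit_total_sales, group_records, commodity):
    """Impute one commodity value for a unit from its imputation group.

    group_records: list of dicts with 'total_sales' and a 'commodities' dict,
    one per usable respondent in the group (hypothetical layout).
    """
    group_commodity_sales = sum(
        record["commodities"].get(commodity, 0.0) for record in group_records
    )
    group_total_sales = sum(record["total_sales"] for record in group_records)
    if group_total_sales == 0:
        raise ValueError("imputation group has no sales; use a broader group")

    # The group's commodity share of total sales is applied to the unit's total.
    return unit_total_sales * group_commodity_sales / group_total_sales
```

For example, if the group sells 160 of a commodity out of total sales of 400, a unit with total sales of 50 is imputed a value of 20 for that commodity.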

Since the commodity fields are imputed one at a time, the imputation process is followed by a prorating step to ensure that all parts add up to the corresponding totals.
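A minimal sketch of such a prorating step is shown below. It assumes every commodity value is scaled by the same factor; the source does not say whether reported values are held fixed while only imputed values are adjusted, so that choice is an assumption, as is the function name.

```python
def prorate(commodities, total_sales):
    """Scale commodity values so that they add up to total_sales."""
    current_sum = sum(commodities.values())
    if current_sum == 0:
        raise ValueError("cannot prorate when all commodity values are zero")
    factor = total_sales / current_sum
    return {code: value * factor for code, value in commodities.items()}
```

With commodity values of 30 and 90 and a total of 100, the prorated values are 25 and 75, which preserves the 1:3 split between the two commodities.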

The commodity values for the new car dealers industry (NAICS 441110) are derived in a different manner than the other industries. Since commodity distributions are collected in the responses to the New Motor Vehicle Dealer Commodity Survey, these distributions are applied to the MRTS retail sales for this industry to derive commodity distributions for each individual new car dealer.

Estimation

Estimation is a process that approximates unknown population parameters using only the part of the population that is included in a sample. Inferences about these unknown parameters are then made, using the sample data and associated survey design. This stage uses Statistics Canada's Generalized Estimation System (GES).

The goal of RCS is to produce quarterly estimates for the distribution of total retail sales among various commodities. The source for the level of the total retail sales is MRTS. RCS total sales are benchmarked at the 3 digit NAICS level to the MRTS sales estimates. By benchmarking RCS to MRTS, the total sales for the RCS will match the total sales for the MRTS.

Since all missing commodity information is imputed (both for non-respondents and for respondents with some missing data), there is no adjustment at the estimation stage for non-response. The estimation weight that is applied to units in the RCS sample is made up of two components that are multiplied together. The first component is a weight reflecting the need to sample from the population (i.e. a weight to inflate the sample data to represent the entire population). The second component is an adjustment factor to ensure that the RCS total sales estimate equals the MRTS sales estimate at the 3 digit NAICS level.
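The two-component weight can be illustrated with the sketch below. It is not the Generalized Estimation System; the function names and record layout are hypothetical, and the records passed in are assumed to belong to a single benchmarking cell.

```python
def benchmark_factor(records, mrts_sales_estimate):
    """Adjustment factor that makes weighted RCS total sales equal the MRTS estimate."""
    weighted_total = sum(r["design_weight"] * r["total_sales"] for r in records)
    return mrts_sales_estimate / weighted_total


def estimate_commodity(records, commodity, mrts_sales_estimate):
    """Weighted commodity estimate for one benchmarking cell.

    Each unit's estimation weight is its design weight multiplied by the
    benchmark adjustment factor of the cell.
    """
    factor = benchmark_factor(records, mrts_sales_estimate)
    return sum(
        r["design_weight"] * factor * r["commodities"].get(commodity, 0.0)
        for r in records
    )
```

Because the same factor multiplies every unit in the cell, the commodity shares within the cell are unchanged; only the level is aligned with MRTS.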

Since the MRTS and RCS samples are monthly-based, commodity estimates and their variances are calculated for each month of the quarter. The variances are derived directly from a stratified simple random sample without replacement. The monthly estimates are summed to obtain commodity estimates for the quarter. An approximation of the variances of the quarterly estimates is then calculated. The monthly commodity values are summed to produce quarterly commodity values and the variances are approximated as if respondents had reported on a quarterly basis.
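The variance calculation for a stratified simple random sample without replacement can be sketched with the standard textbook formula, where each stratum contributes N_h^2 (1 - n_h/N_h) s_h^2 / n_h. The function name and input layout are hypothetical, and the sketch ignores the benchmark adjustment and any effect of imputation on the variance.

```python
from statistics import variance


def stratified_total_and_variance(strata):
    """Estimate a population total and its variance under stratified SRSWOR.

    strata: list of (N_h, sample_values) pairs, where N_h is the number of
    units in the stratum and sample_values holds the sampled units' values.
    A partially sampled stratum needs at least two sampled units.
    """
    total = 0.0
    var = 0.0
    for population_size, values in strata:
        n = len(values)
        total += population_size * sum(values) / n
        if n < population_size:
            # Finite population correction times the sample variance term.
            fpc = 1.0 - n / population_size
            var += population_size ** 2 * fpc * variance(values) / n
        # Take-all strata (n == N_h) contribute no sampling variance.
    return total, var
```

To approximate the quarterly variance in the way described above, one option is to sum each unit's three monthly values first and pass those quarterly values to the same function, treating them as if they had been reported quarterly.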

Quality evaluation

Prior to publication, combined survey results are analyzed for comparability; in general, this includes a detailed review of individual responses (especially for the largest companies), general economic conditions, and historical trends.

The data are examined at a macro level to ensure that the long-term trends make sense when compared to publicly available information in media reports, company press releases, etc. Large fluctuations in year-over-year sales for commodities are analysed to determine whether they are in error or whether they accurately reflect retail activity. Subject matter officers follow up with the company to confirm the data and to document reasons for large fluctuations in sales.

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible direct disclosure, which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.

Revisions and seasonal adjustment

With each release, preliminary estimates for the current quarter and revised estimates for the previous quarter are made available. Annual revisions are performed once a year. The revisions mainly stem from responses received after the initial release of the quarter's data. Data are also revised due to revisions to the retail sales level provided by MRTS.

RCS total sales estimates are benchmarked at the sampling group level to the sales estimates (before seasonal adjustment) from the Monthly Retail Trade Survey (MRTS). Total sales for RCS differ slightly from the sales published by MRTS in that the sales of department store concessions are included in RCS and not in MRTS.

RCS estimates are not adjusted for seasonality.

Data accuracy

The commodity estimates are derived from a sample survey and, as such, are subject to both sampling and non-sampling errors. Sampling errors are present because observations are made only on a sample and not on the entire population. The sampling error depends on factors such as the size of the sample, variability in the population, sampling design and method of estimation. The coefficient of variation (CV), which is the estimated standard error expressed as a percentage of the estimate, is used to measure the degree to which sampling error potentially exists within the sample. Estimates with smaller CVs are more reliable than estimates with larger CVs.
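As a small illustration of the definition above, the CV can be computed from an estimate and its estimated variance. The function name is hypothetical.

```python
from math import sqrt


def cv_percent(estimate, variance_of_estimate):
    """Coefficient of variation: standard error as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("CV is undefined for an estimate of zero")
    return 100.0 * sqrt(variance_of_estimate) / abs(estimate)
```

An estimate of 1,000 with an estimated variance of 2,500 has a standard error of 50 and therefore a CV of 5%.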

Non-sampling errors are not related to sampling and may occur for many reasons. Population coverage errors, differences in the interpretation of questions, incorrect information from respondents, and mistakes in recording, coding and processing data are examples of non-sampling errors. Non-response is an important source of non-sampling error. While the impact of non-sampling errors is difficult to evaluate, measures such as response rates and imputation rates can be used as indicators of the potential level of non-sampling error.

Each commodity estimate is assigned a code from A to F (where A is most reliable and F is to be used with caution) as an indicator of data quality. This quality indicator code is a joint measure of the magnitude of the CV and the imputation rate. The imputation rate is the proportion of the estimated sales which comes from imputed data. For example, if the total estimated sales for a commodity is $1 million, and $150,000 is from imputed data, then the imputation rate is 15%.

The final quality indicator code is determined by first assigning a code based only on the CV. The code is then adjusted to take into account the imputation rate for that estimate. Estimates with a CV in the range of 0% to 5% are assigned an A; 5% to 10% a B; 10% to 16.5% a C; 16.5% to 25% a D; 25% to 33% an E. If the imputation rate is below 10%, the CV code becomes the final quality indicator code. If the imputation rate is between 10% and 33%, the CV code is downgraded by one (e.g. an A would become a B, a C would become a D). If the imputation rate is between 33% and 60%, the CV code is downgraded by two (e.g. an A would become a C, a C would become an E).
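The coding rules can be sketched as follows. Several details are assumptions because the text leaves them open: a CV exactly on a boundary is placed in the lower (better) band, a CV above 33% is coded F, an imputation rate of exactly 33% is downgraded by two, an imputation rate above 60% is coded F, and downgrades stop at F. The function and constant names are hypothetical.

```python
CV_BANDS = [(5.0, "A"), (10.0, "B"), (16.5, "C"), (25.0, "D"), (33.0, "E")]
CODES = "ABCDEF"


def imputation_rate(imputed_sales, total_estimated_sales):
    """Share of the estimated sales that comes from imputed data, in percent."""
    return 100.0 * imputed_sales / total_estimated_sales


def quality_code(cv, imp_rate):
    """Combine the CV-based code with the imputation-rate downgrade."""
    # Step 1: code from the CV alone (assumption: above 33% is an F).
    base = next((code for upper, code in CV_BANDS if cv <= upper), "F")

    # Step 2: number of downgrade steps from the imputation rate.
    if imp_rate < 10.0:
        steps = 0
    elif imp_rate < 33.0:
        steps = 1
    elif imp_rate <= 60.0:
        steps = 2
    else:
        return "F"  # assumption: the source gives no rule above 60%

    # Assumption: a downgrade cannot go past F.
    return CODES[min(CODES.index(base) + steps, len(CODES) - 1)]
```

Using the example from the text, `imputation_rate(150_000, 1_000_000)` gives 15%, so an estimate with a CV of 4% would be coded A on the CV alone and B after the one-step downgrade.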
