Retail Store Survey (Annual)

Detailed information for 2007





Record number:


The purpose of this survey is to collect the financial and operating/production data needed to develop national and regional economic policies and programs.

Data release - March 30, 2009 (As of reference year 2008, the information concerning this survey will be included in record #2447)


The output for Retail Store is derived from the combined data of two surveys - the Annual Retail Store Survey and the Annual Retail Chain Survey (Survey ID 2447). Coverage for the Annual Retail Store Survey includes independent and franchise establishments only.

The Annual Retail Store Survey measures, on an annual basis, the operating and financial characteristics of domiciled independent and franchise establishments. A "franchise" is part of a group of stores that sell the same products and operate similarly, but each franchise is independently owned. An "independent" store generally operates fewer than four locations.

Data from this survey provide information on revenue, expenses and inventory. The data are used by all levels of government, government agencies, the retail industry and individuals for assessing trends within the industry, measuring performance, benchmarking and studying the evolving structure of the retail industry. The information is also a critical input into the measure of gross margins in the Canadian System of National Accounts (CSNA).

Statistical activity

The survey is administered as part of the Unified Enterprise Survey program (UES). The UES program has been designed to integrate, gradually over time, the approximately 200 separate business surveys into a single master survey program. The UES aims at collecting more industry and product detail at the provincial level than was previously possible while avoiding overlap between different survey questionnaires. The redesigned business survey questionnaires have a consistent look, structure and content. The unified approach makes reporting easier for firms operating in different industries because they can provide similar information for each branch operation. This way they avoid having to respond to questionnaires that differ for each industry in terms of format, wording and even concepts.

Reference period: The calendar year or the 12-month fiscal period for which the final day occurs on or between April 1st of the reference year and March 31st of the following year.
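The reference period rule above can be sketched in a few lines. This is an illustrative example only; the function name `reference_year` is hypothetical and is not part of any survey system.

```python
from datetime import date


def reference_year(fiscal_period_end: date) -> int:
    """Return the reference year for a 12-month fiscal period ending on the given date.

    A period whose final day falls between April 1 of year Y and
    March 31 of year Y + 1 is assigned to reference year Y.
    """
    # Periods ending January 1 to March 31 belong to the previous reference year.
    if fiscal_period_end.month <= 3:
        return fiscal_period_end.year - 1
    return fiscal_period_end.year
```

For example, a fiscal period ending on March 31, 2008 and a calendar year ending on December 31, 2007 both fall in reference year 2007.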

Collection period: March through September


  • Retail and wholesale
  • Retail sales by type of store

Data sources and methodology

Target population

The target population consists of all independent retail establishments operating in Canada for at least one day between January and December of a calendar year. Direct sellers and operators of vending machines are excluded from the target population of this survey.

The survey population is the collection of all retail establishments from which the survey can realistically obtain information. The survey population will differ from the target population due to difficulties in identifying all the units that belong to the target population because of a possible lack of detailed information for some units, particularly small businesses with low sales levels.

The survey population comprises all statistical establishments coded to NAICS 441 through 453 on Statistics Canada's Business Register, as well as those small unincorporated businesses not on the Business Register that are classified to the retail industry.

Instrument design

The questionnaire content was developed by the Content Development Group in conjunction with Subject Matter areas and then field tested with respondents via focus groups to ensure that the questions, concepts and terminology were appropriate from both a conceptual and a respondent point of view. The testing was designed to assess respondents' willingness to respond; determine whether respondents understood the questions and what to report; investigate the compatibility of questions and response categories with respondents' record-keeping practices; identify problems or difficulties that respondents may have in retrieving information and in completing questionnaires; verify that the translations were correct; obtain respondents' suggestions about how to improve the questionnaires; and ensure that the questionnaires were respondent-friendly.


Sampling

This is a sample survey with a cross-sectional design.

In order to reduce response burden and still produce reliable figures, exclusion thresholds based on industrial, provincial, and size dimensions were implemented. Data for the retailing establishments above the prescribed threshold were collected through questionnaires, and administrative (tax) data were used to estimate for small businesses below the threshold.

Before sample selection, the survey population is delineated into cells representing the required provincial, trade group and size dimensions. The establishments in the survey population are first stratified according to their province/territory and trade group based on the NAICS industrial classification. The trade groups are mutually exclusive classifications, each representing similar businesses.

Within each combination of province/territory and trade group, four size strata are created to group businesses of a similar size. The boundaries are determined using total estimated revenues for the businesses. The resulting groups are one take-all stratum of the largest businesses (which are all included in the sample), two take-some strata (from which representative samples are selected) and one take-none stratum (containing small businesses which are not eligible to be sampled). Optimal stratum boundaries or thresholds are determined to minimise the total sample size.
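The assignment of a business to one of the four size strata can be illustrated as follows. This is a minimal sketch: the function name, the parameter names and the boundary values used in any call are hypothetical, and the actual boundaries are determined separately for each cell by an optimisation that is not reproduced here.

```python
def size_stratum(
    estimated_revenue: float,
    take_none_upper: float,
    take_some_split: float,
    take_all_lower: float,
) -> str:
    """Assign a business to a size stratum within one province/trade group cell.

    The three boundaries are hypothetical inputs and must satisfy
    take_none_upper <= take_some_split <= take_all_lower.
    """
    if not take_none_upper <= take_some_split <= take_all_lower:
        raise ValueError("stratum boundaries must be in increasing order")
    if estimated_revenue < take_none_upper:
        return "take-none"      # not eligible for sampling; tax data used instead
    if estimated_revenue >= take_all_lower:
        return "take-all"       # always included in the sample
    if estimated_revenue < take_some_split:
        return "take-some-1"    # smaller sampled stratum
    return "take-some-2"        # larger sampled stratum
```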

Following the sample selection process, data for the take-all and take-some strata are collected through questionnaires. For those units belonging to the take-none stratum, a sample of administrative (tax) records is used to collect selected financial information.

All sampled units are assigned a sampling weight. An initial weight equal to the inverse of the original probability of selection is assigned to each entity. The sampling weight is a raising factor attached to each sampled unit to obtain estimates for the population. For example, if two units are selected at random and with equal probability out of a population of 10 units, then each selected unit represents five units in the population, and it is given a sampling weight of five. These weights are subsequently adjusted, at the time of producing survey results, to reflect the most up-to-date population counts. The final set of weights therefore reflects as closely as possible the characteristics of the population in this industry.
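The worked example in the preceding paragraph (two units selected out of ten) can be expressed as a short sketch. The function names are hypothetical, and the later adjustment to up-to-date population counts is not shown.

```python
def initial_weight(population_size: int, sample_size: int) -> float:
    """Inverse of the selection probability under equal-probability sampling."""
    if sample_size <= 0 or sample_size > population_size:
        raise ValueError("sample size must be between 1 and the population size")
    return population_size / sample_size


def estimate_total(sampled_values: list[float], weight: float) -> float:
    """Weighted estimate of a population total from the sampled values."""
    return sum(value * weight for value in sampled_values)
```

With a population of 10 and a sample of 2, `initial_weight(10, 2)` gives a weight of 5, so each sampled unit stands for five units in the population.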

On the Business Register, there were approximately 211,223 retail establishments having operated for at least one day during the reference year 2007. The sample comprised approximately 37,010 establishments.

Data sources

Data collection for this reference period: 2008-04-07 to 2008-10-24

Responding to this survey is mandatory.

Data are collected directly from survey respondents and extracted from administrative files.

The questionnaires are mailed to the respondents to the survey after the end of the calendar year. The larger reporting units receive a detailed questionnaire while the smaller reporting units receive an abridged version. An automatic fax reminder is sent to non-reporters around 15 days after mailing out the questionnaires. A telephone contact is made with non-reporting companies 15 days after the first fax follow-up to discuss reporting delinquency and possible special arrangements. A second fax is sent to persistent non-reporters later on in the collection period before collection is closed.

Respondents can report to the survey by fax or by mail. Information may also be transmitted by the Internet or by telephone. In exceptional cases a company may not be able to comply with the legal reporting deadlines and special reporting arrangements are determined.

For 'simple' businesses, that is, those that operate in a single province and conduct all their activities in the same industry, tax data are substituted for survey collection. Tax data are also used in imputing for survey non-respondents.

The survey sample comprised 7,761 collection entities. Of these, 4,656 units were mailed questionnaires. Tax data were used for the remaining 3,105 units, which were 'simple' businesses.


Error detection

Several checks are performed on the collected data to verify internal consistency and identify extreme values. Data are analysed within each trade group and geographic region. Extreme values are reviewed and corrective action taken. These extreme values are excluded from use in the calculation of imputation variables by the imputation system.


Imputation

Units that do not respond in the current period are imputed (their characteristics are estimated). Units are imputed by applying a growth factor to previously reported data when available. The growth factor is estimated using the survey responses for the units that are most similar to the unit being imputed.
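Growth-factor imputation can be sketched as below. The text does not state how the growth factor is computed, so the ratio of current to previous totals among similar responding units is an assumption made for illustration, and the function names are hypothetical.

```python
def growth_factor(similar_units: list[tuple[float, float]]) -> float:
    """Estimate a growth factor from (previous, current) values of similar respondents.

    Assumption: the factor is the ratio of summed current values to
    summed previous values. The survey's actual estimator may differ.
    """
    previous_total = sum(previous for previous, _ in similar_units)
    current_total = sum(current for _, current in similar_units)
    if previous_total == 0:
        raise ValueError("previous values of similar units sum to zero")
    return current_total / previous_total


def impute_from_history(
    previous_value: float, similar_units: list[tuple[float, float]]
) -> float:
    """Impute the current value of a non-respondent from its previously reported value."""
    return previous_value * growth_factor(similar_units)
```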

When partial survey data covering three key variables (total operating revenue, total operating expenses and cost of goods sold) are received, the imputation factors are calculated at the unit level using these partial data. For records without historical information, a donor imputation system (nearest neighbour) is used. Information on the size of the non-respondent is obtained and a similarly sized respondent is found. The size information consists of the three key variables (total operating revenue, total operating expenses and cost of goods sold). If this information is not available, sales from the Monthly Retail Trade Survey (Survey ID 2406) are used. In this case, the monthly sales are directly copied over to the non-respondent and the rest of the key variables are calculated using the sales data. In other cases, tax data are used as a proxy for non-response.
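Nearest-neighbour donor selection on the three key variables might look like the following sketch. The distance measure (Euclidean distance on the raw values) and all names are assumptions made for illustration; the survey's imputation system may standardise the variables or use a different measure.

```python
import math

# The three key size variables named in the text.
KEY_VARIABLES = (
    "total_operating_revenue",
    "total_operating_expenses",
    "cost_of_goods_sold",
)


def nearest_donor(recipient: dict, donors: list[dict]) -> dict:
    """Return the responding unit closest in size to the non-respondent."""
    if not donors:
        raise ValueError("at least one donor is required")

    def distance(donor: dict) -> float:
        # Assumed measure: Euclidean distance over the key variables.
        return math.sqrt(
            sum((donor[key] - recipient[key]) ** 2 for key in KEY_VARIABLES)
        )

    return min(donors, key=distance)
```

The remaining variables of the selected donor would then be used to fill in the non-respondent's record.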


Estimation

Estimates are computed at several levels of interest such as trade group and province, based on the most recent classification information available from the Business Register for the statistical entity and the survey reference period. It should be noted that this classification information might differ from the original sampling strata because records may have changed in size, industry, or location. Changes in classification are reflected immediately in the estimates.
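The point that estimates follow the current classification rather than the sampling stratum can be illustrated with a small sketch. The record layout and field names are hypothetical: each unit keeps the weight it was given at sampling but contributes to the domain defined by its most recent trade group and province.

```python
from collections import defaultdict


def domain_totals(units: list[dict]) -> dict:
    """Weighted revenue totals by (trade group, province), using current classification.

    Each unit is a dict with hypothetical keys: "trade_group" and
    "province" (current classification), "weight" and "revenue".
    """
    totals = defaultdict(float)
    for unit in units:
        domain = (unit["trade_group"], unit["province"])
        totals[domain] += unit["weight"] * unit["revenue"]
    return dict(totals)
```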

Quality evaluation

Prior to the data release, combined survey results are analyzed for comparability; in general, this includes a detailed review of: individual responses (especially for the largest companies), general economic conditions, historic trends, and comparisons with annualised monthly survey data and industry and trade association sources.

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Confidentiality analysis includes the detection of possible direct disclosure, which occurs when the value in a tabulation cell is composed of a few respondents or when the cell is dominated by a few companies.
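The two conditions for direct disclosure described above (too few respondents in a cell, or a cell dominated by a few companies) can be sketched as a simple check. The thresholds below are hypothetical placeholders chosen for illustration; they are not the rules actually applied, which are not stated in this document.

```python
def is_sensitive_cell(
    contributions: list[float],
    min_respondents: int = 3,      # hypothetical threshold
    top_k: int = 2,                # hypothetical number of "dominant" units
    dominance_share: float = 0.9,  # hypothetical share of the cell total
) -> bool:
    """Flag a tabulation cell at risk of direct disclosure."""
    # Condition 1: the cell is composed of too few respondents.
    if len(contributions) < min_respondents:
        return True
    total = sum(contributions)
    if total <= 0:
        return False
    # Condition 2: the largest few contributors dominate the cell total.
    largest = sorted(contributions, reverse=True)[:top_k]
    return sum(largest) / total > dominance_share
```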

Revisions and seasonal adjustment

This methodology type does not apply to this statistical program.

Data accuracy

While considerable effort is made to ensure high standards throughout all stages of collection and processing, the resulting estimates are inevitably subject to a certain degree of non-sampling error. Examples of non-sampling error are coverage error, data response error, non-response error and processing error.

Measures such as response rate (total number of completed questionnaires as a percentage of the total active, in-scope survey sample) and response fraction (the proportion of the estimate based upon reported data) can be used as indicators of the possible extent of non-sampling errors. For the 2007 survey, at the Canada level, the response fraction (RF) for total operating revenue (TOR) was 96%.
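The two indicators defined above can be computed as in the following sketch. The function names and the input layout are hypothetical; the response fraction here is taken as the share of the weighted estimate that comes from reported (rather than imputed or tax-based) values.

```python
def response_rate(completed: int, active_in_scope: int) -> float:
    """Completed questionnaires as a percentage of the active, in-scope sample."""
    if active_in_scope <= 0:
        raise ValueError("active in-scope sample must be positive")
    return 100.0 * completed / active_in_scope


def response_fraction(contributions: list[tuple[float, bool]]) -> float:
    """Percentage of an estimate that is based on reported data.

    Each item is (weighted value, reported), where reported is False
    for imputed contributions.
    """
    total = sum(value for value, _ in contributions)
    if total == 0:
        raise ValueError("estimate is zero")
    reported = sum(value for value, is_reported in contributions if is_reported)
    return 100.0 * reported / total
```

An estimate of 1,000 in which 960 comes from reported data has a response fraction of 96%, the level quoted for total operating revenue in 2007.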

For a more detailed discussion of the data accuracy, as well as for response fractions and CVs by province and territory, see the document below.

