Survey of Innovation and Business Strategy

Detailed information for January 1, 2015 to December 31, 2017

Status:

Active

Frequency:

Occasional

Record number:

5171

The Survey of Innovation and Business Strategy collects information on the strategic decisions, innovation activities, operational tactics and global value chain activities of businesses in Canada.

Data release - March 26, 2019

Description

The Survey of Innovation and Business Strategy (SIBS) is a cross-economy survey of business enterprises and industrial non-profit organizations in Canada.

The concepts and definitions employed in the collection and dissemination of innovation data are provided in the Oslo Manual: Guidelines for Collecting and Interpreting Innovation Data, 3rd Edition (Organisation for Economic Co-operation and Development [OECD] and Eurostat, 2005). According to the manual:

"An innovation is the implementation of a new or significantly improved product (good or service), or process, a new marketing method, or a new organisational method in business practices, workplace organisation or external relations."

The SIBS collects complementary qualitative business information, including market characteristics, use of advanced technologies, business strategy, business practices and participation in global value chains. The survey also collects quantitative information on total sales, innovation expenditures, purchase of goods and services and personnel, supplemented by percentage distributions by specified subgroups.

Reference period: The innovation reference period refers to three calendar years (2015 to 2017).

Collection period: January through April of the year following the reference period.

Subjects

  • Business performance and ownership
  • Innovation
  • Science and technology

Data sources and methodology

Target population

The target population for the Survey of Innovation and Business Strategy is limited to enterprises within the following 14 sectors defined according to the North American Industry Classification System (NAICS, Statistics Canada, 2012):

- Agriculture, forestry, fishing and hunting (11)
- Mining, quarrying, and oil and gas extraction (21)
- Utilities (22)
- Construction (23)
- Manufacturing (31-33)
- Wholesale trade (41)
- Retail trade (44-45)
- Transportation and warehousing (48-49)
- Information and cultural industries (51)
- Finance and insurance (52)
- Real estate and rental and leasing (53)
- Professional, scientific and technical services (54)
- Management of companies and enterprises (55)
- Administrative and support, waste management and remediation services (56)

To reduce response burden on small businesses, only enterprises with at least 20 employees and revenues of at least $250,000 were considered for sample selection.
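
The scope rules above can be sketched as a simple filter. This is an illustration only: the function name, the record fields and the expansion of ranged sector codes (for example 31-33) into two-digit prefixes are assumptions, not part of the survey's production system.

```python
# Two-digit NAICS prefixes for the 14 sectors listed above.
# Ranged sectors such as 31-33 are expanded into their individual prefixes.
TARGET_SECTORS = {
    "11", "21", "22", "23", "31", "32", "33", "41", "44", "45",
    "48", "49", "51", "52", "53", "54", "55", "56",
}

MIN_EMPLOYEES = 20      # at least 20 employees
MIN_REVENUE = 250_000   # at least $250,000 in revenues (dollars)


def in_scope(naics_code: str, employees: int, revenue: float) -> bool:
    """Return True if a hypothetical frame record meets the stated thresholds."""
    return (
        naics_code[:2] in TARGET_SECTORS
        and employees >= MIN_EMPLOYEES
        and revenue >= MIN_REVENUE
    )
```

For example, `in_scope("541510", 25, 300_000)` is true, while an enterprise in educational services (NAICS 61) is out of scope regardless of its size.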

Instrument design

The SIBS uses an electronic questionnaire (EQ). The questionnaire was developed to conform to international standards for innovation concepts (OECD and Eurostat, 2005). EQs are the principal mode of collection, and they were tested with enterprise respondents in English and French to confirm respondents' understanding of terminology, concepts and definitions, as well as their ability to provide the requested data and to navigate the EQ applications. Questionnaire content testing occurred in March 2017 in English in Ottawa, Toronto and Montreal, and in French in Gatineau and Montreal. This first round of testing concentrated on validating respondents' understanding of concepts, questions and terminology, the appropriateness of response categories and the availability of requested information. The subsequent round of testing occurred in June 2017 in English in Toronto and in French in Montreal.

Sampling

This is a sample survey with a cross-sectional design.

The target population was stratified by the North American Industry Classification System (NAICS) Canada 2012, region and three size classes based on number of employees per enterprise.

The survey design required a census of all large enterprises within the 14 sectors. The sample was randomly selected to meet two sets of objectives:

1) Produce estimates of proportions for defined characteristics for Canada and for selected provinces or regions with a target standard error (quality measure) to satisfy the requirements for estimates by employment size, by industry. The sample was not designed to produce estimates by employment size, by geography, by industry.

2) Permit microdata analysis within a linked file environment.

Sampling unit:
Enterprise

Stratification method:
The national stratification corresponds to enterprises with at least $250,000 in revenues within the target industrial groupings and the enterprise size (based on the number of employees):

- Small enterprises (20 to 99)
- Medium-sized enterprises (100 to 249)
- Large enterprises (250 or more)

The regional stratification corresponds to enterprises with at least $250,000 in revenues within the target industrial groupings for the following regions: Atlantic; Quebec; Ontario; and the Rest of Canada. Provinces not listed were not individually sampled.
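
A minimal sketch of how a record could be assigned to a stratum from the size classes and regions described above. The two-letter province codes, the function names and the use of the two-digit NAICS prefix as the industry dimension are assumptions made for illustration; the actual industry groupings used for stratification are not detailed here.

```python
ATLANTIC = {"NL", "PE", "NS", "NB"}  # the four Atlantic provinces


def size_class(employees: int) -> str:
    """Map an employee count to the three size classes listed above."""
    if employees < 20:
        raise ValueError("fewer than 20 employees: outside the target population")
    if employees <= 99:
        return "small"
    if employees <= 249:
        return "medium"
    return "large"


def region(province: str) -> str:
    """Map a province or territory code to one of the four survey regions."""
    if province in ATLANTIC:
        return "Atlantic"
    if province == "QC":
        return "Quebec"
    if province == "ON":
        return "Ontario"
    return "Rest of Canada"


def stratum(naics_code: str, employees: int, province: str) -> tuple:
    """Combine industry, region and size into a hypothetical stratum key."""
    return (naics_code[:2], region(province), size_class(employees))
```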

Sampling and sub-sampling:
The sample size for the survey is 13,252 enterprises selected from the target population (69,519 enterprises), with an expected response rate of 50% by stratum. The target standard errors (SE) for the calculation of proportions are defined as follows:

- 7.2% for NAICS2 (11, 22, 23, 44-45, 49, 53, 55, 56) by size
- 5.0% for NAICS2 (21, 31-33, 41, 48, 51, 52, 53, 54) by size
- 9.1% for NAICS3 by size
- 10.0% for NAICS4 by size
- 12.0% for NAICS5 by size
- 12.0% for NAICS6 by size
- 9.0% for Atlantic, Quebec and Ontario
- 12.0% for the rest of Canada
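
To give a sense of how a target standard error and an expected response rate translate into a sample size, the sketch below uses the textbook formula for a proportion under simple random sampling without replacement. It is not the allocation method used for the SIBS, which is not described here; the worst-case proportion p = 0.5 and the function name are assumptions.

```python
import math


def required_sample(target_se: float, population: int,
                    p: float = 0.5, response_rate: float = 0.5) -> int:
    """Units to select in one stratum so that respondents meet the target SE.

    Textbook simple-random-sampling result, shown for illustration only.
    """
    n0 = p * (1 - p) / target_se ** 2          # size needed for a very large population
    n = n0 / (1 + (n0 - 1) / population)       # finite population correction
    selected = math.ceil(n / response_rate)    # inflate for expected non-response
    return min(population, selected)           # cannot select more units than exist
```

Under these assumptions, a stratum of 1,000 enterprises with a 5.0% target would need about 91 respondents, so roughly twice that many selected units at a 50% response rate. A stratum of 50 enterprises would be taken in full.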

Data sources

Data collection for this reference period: 2018-01-23 to 2018-04-13

Responding to this survey is mandatory.

Data are collected directly from survey respondents.

The data were collected through an electronic questionnaire (EQ), with non-response follow-up and failed edit follow-up for priority questions.

Administrative data (T2, PD7 and Exporter data) were used for data validation and to assist with imputation only.

View the Questionnaire(s) and reporting guide(s).

Error detection

Error detection is an integral part of both data collection and data processing activities. Automated edits are applied to data records during collection to identify reporting errors. During data processing, other edits are used to automatically detect errors or inconsistencies that remain in the data following collection. Incoherent data are corrected based on responses to key "gate" questions. If the gate questions are not reported, corrections are based on any data reported for the section and, failing that, on imputation from responses in the stratum.
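
One possible reading of the gate-question hierarchy is sketched below. The function name, the "yes"/"no" coding and the rule that any reported value in a section implies a "yes" gate are assumptions; the actual edit specifications are not given here.

```python
def resolve_gate(gate, section_answers):
    """Decide the gate value for one section of a hypothetical record.

    Returns (gate_value, source). A gate_value of None means the gate
    is left for imputation from responses in the stratum.
    """
    # 1. A reported gate governs the section.
    if gate in ("yes", "no"):
        return gate, "reported"
    # 2. Gate missing: fall back on any data reported in the section.
    if any(answer is not None for answer in section_answers):
        return "yes", "derived from section data"
    # 3. Nothing reported at all: leave for imputation.
    return None, "to impute"
```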

Respondents are asked to report sales and expenditures in thousands of dollars. Totals are reviewed to ensure no order of magnitude reporting errors.
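
An order-of-magnitude review of this kind could look like the following sketch, which compares a reported total (expected in thousands of dollars) with a reference value in dollars, such as a figure from administrative data. The function name, the tolerance of half an order of magnitude and the wording of the flags are assumptions.

```python
import math


def magnitude_flag(reported_thousands: float, reference_dollars: float,
                   tolerance: float = 0.5) -> str:
    """Classify a reported total against a reference value in dollars."""
    if reported_thousands <= 0 or reference_dollars <= 0:
        return "not comparable"
    # The log10 gap is near 0 when units agree and near 3 when the
    # respondent entered dollars instead of thousands of dollars.
    gap = math.log10(reported_thousands * 1000 / reference_dollars)
    if abs(gap) <= tolerance:
        return "consistent"
    if abs(gap - 3) <= tolerance:
        return "possibly reported in dollars"
    return "review"
```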

Imputation

Imputation is performed only to treat item non-responses.

The imputation of item non-responses is performed using the nearest neighbour donor imputation procedure in the generalized system BANFF. For each record requiring item imputation, this procedure finds the most similar valid record, thereby allowing the imputed recipient record to pass the specified imputation edit and post-edit rules.

These similar records are found by taking into account other variables that are correlated with the missing values via the customized imputation classes and matching variables (if required) for each item (variable) to be imputed. If nearest neighbour donors are not found for all recipients, then it is necessary to be less restrictive by extending the imputation classes and reprocessing the data. This imputation process continues using a predetermined sequence until nearest neighbour donors are assigned to all records requiring imputation or until no nearest neighbour donors are available. During imputation, edits and post edits are applied to ensure that the resulting record does not violate any of the specified edits.
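
The general idea of nearest neighbour donor imputation with progressively wider imputation classes can be sketched as follows. This is not BANFF and does not reproduce its interface: the function name, the record layout, the unscaled absolute-difference distance and the rule of dropping the last class variable at each step are all assumptions made for illustration.

```python
def nearest_neighbour_impute(records, target, match_vars, class_vars,
                             passes_edits=lambda rec: True):
    """Fill missing `target` values in place from the closest valid donor.

    records      : list of dicts; a missing value is stored as None
    match_vars   : numeric variables used to measure similarity
    class_vars   : imputation class variables, most important first
    passes_edits : hypothetical check applied to the would-be imputed record
    Returns the recipients for which no donor could be found.
    """
    # Donors are fixed up front so imputed records never act as donors.
    donors = [r for r in records if r[target] is not None]
    recipients = [r for r in records if r[target] is None]
    unresolved = []
    for rec in recipients:
        # Try the full class first, then widen it by dropping class variables.
        for depth in range(len(class_vars), -1, -1):
            keys = class_vars[:depth]
            pool = [d for d in donors if all(d[k] == rec[k] for k in keys)]
            pool.sort(key=lambda d: sum(abs(d[v] - rec[v]) for v in match_vars))
            donor = next(
                (d for d in pool if passes_edits({**rec, target: d[target]})),
                None,
            )
            if donor is not None:
                rec[target] = donor[target]
                break
        else:
            unresolved.append(rec)  # no donor available at any class level
    return unresolved
```

In practice the matching variables would normally be scaled before distances are computed; that step is left out here to keep the sketch short.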

Questions 2, 17 a to l, 25 (total), 43 and 54 were mandatory.

Estimation

The response values for sampled units were multiplied by a final weight in order to provide an estimate for the entire population. The final weight was calculated using a number of factors, such as the probability of a unit being selected in the sample and an adjustment for units that could not be contacted or that refused to respond. Using a statistical technique called calibration, the final set of weights is adjusted so that the sample represents the entire population as closely as possible.
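
A minimal sketch of the first two weighting factors, assuming a design weight equal to the inverse of the selection probability and a non-response adjustment computed within each stratum. The calibration step is not shown, and the field and function names are hypothetical.

```python
from collections import defaultdict


def final_weights(sample):
    """Return one weight per unit: design weight times non-response adjustment.

    sample: list of dicts with 'stratum', 'design_weight' (1 / selection
    probability) and 'responded' (bool). Non-respondents get weight 0.
    """
    sampled = defaultdict(float)
    responded = defaultdict(float)
    for unit in sample:
        sampled[unit["stratum"]] += unit["design_weight"]
        if unit["responded"]:
            responded[unit["stratum"]] += unit["design_weight"]

    weights = []
    for unit in sample:
        if unit["responded"]:
            # Respondents also carry the weight of non-respondents in their stratum.
            factor = sampled[unit["stratum"]] / responded[unit["stratum"]]
            weights.append(unit["design_weight"] * factor)
        else:
            weights.append(0.0)
    return weights


def weighted_proportion(indicators, weights):
    """Estimate a proportion from 0/1 indicators and final weights."""
    return sum(w * y for w, y in zip(weights, indicators)) / sum(weights)
```

With four sampled units of design weight 10 in one stratum and two respondents, each respondent ends up with a weight of 20, so the stratum total of 40 is preserved.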

Sampling error was measured by the standard error (SE) for proportions and by the coefficient of variation (CV), which expresses the standard error as a proportion of the estimate. The SEs and CVs were calculated and are indicated in the data tables by quality flags.

A process known as Random Tabular Adjustment (RTA) changes estimates by a random amount and adds a degree of uncertainty to the accuracy of the estimate to prevent the disclosure of individual values. As a result, estimates that could disclose an individual's response are not released. (Note that if the adjusted estimates are part of a table with totals or sub-totals, the related total and sub-total estimates will also be adjusted.)

Quality evaluation

The Survey of Innovation and Business Strategy, 2017 comprised two main types of questions:

- Qualitative questions - tick box (Yes/No; Yes/No/Don't know; Yes/No/Not applicable; Select one; Select all that apply; Likert scale)
- Quantitative questions - currency (thousands of dollars); employment (FTE personnel); percentage distributions (components add to 100%)

Data validation for qualitative questions involved ensuring that the flows in the questionnaire were respected, to confirm that the correct population answered each question, and checking coherence within a question, across questions in a thematic module and across questions within the questionnaire.

For quantitative questions there were two main validation processes: data coherence and data confrontation. Data coherence involved reviewing different parts of the questionnaire that covered questions which are either directly or indirectly related to ensure that responses were consistent with what was observed in practice.

Example: revenues per employee will fall within a given range for each stratum (industry group by size of enterprise). If a record falls far outside the range of typical values, then data confrontation can be used to identify a possible error, such as reporting in dollars instead of thousands of dollars or reporting for one component of an enterprise rather than the entire enterprise.
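
The revenue-per-employee example can be sketched as a range check. The bounds are supplied by the caller because the typical ranges by stratum are not given here; the function name and the flag wording are assumptions.

```python
def revenue_per_employee_check(revenue_thousands: float, employees: int,
                               low: float, high: float) -> str:
    """Compare revenue per employee (thousands of dollars) with a stratum range.

    low and high are hypothetical bounds for the record's stratum.
    """
    if employees <= 0:
        return "not comparable"
    ratio = revenue_thousands / employees
    if low <= ratio <= high:
        return "in range"
    # A ratio about 1,000 times too high suggests dollars were reported.
    if low <= ratio / 1000 <= high:
        return "possible units error: dollars reported"
    return "outside range: confront with other sources"
```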

Data confrontation involves comparing the response from the respondent with other sources of information about either that particular respondent or other enterprises within that industry. For SIBS, various administrative data were obtained for data confrontation, including:

- Tax data (T2 revenue data used to classify by size and compare responses)
- Employment data (PD7)
- Exporter register (administrative data on exports)
- Historical comparisons (where available)
- Publicly available information for individual companies (Annual Reports of publicly traded companies, information on the internet)

Typically, administrative data are available only for periods earlier than the survey's reference period, but they can still be used to identify order of magnitude issues and instances where the respondent is reporting for a different level of the business entity.

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the disclosure of any information deemed confidential.

Revisions and seasonal adjustment

This methodology type does not apply to this statistical program.

Data accuracy

There are two types of errors which can impact the data: sampling errors and non-sampling errors. Non-sampling errors may occur for various reasons during the collection and processing of the data. For example, non-response is an important source of non-sampling error. Under or over-coverage of the population, differences in the interpretations of questions and mistakes in recording and processing data are other examples of non-sampling errors. To the maximum extent possible, these errors are minimized through careful design of the survey questionnaire and verification of the survey data.

The data accuracy indicators used are the standard error and the coefficient of variation. The standard error is a commonly used statistical measure indicating the error of an estimate associated with sampling. The coefficient of variation is the standard error expressed as a percentage of the estimate.
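
As a small worked illustration of the definition above, the coefficient of variation can be computed from an estimate and its standard error. The function name is hypothetical.

```python
def coefficient_of_variation(estimate: float, standard_error: float) -> float:
    """Return the standard error expressed as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("the CV is undefined for an estimate of zero")
    return 100 * standard_error / abs(estimate)
```

For an estimated proportion of 0.40 with a standard error of 0.02, the CV is 5%.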

Sampling error, response rate and imputation rate are combined into one quality rating code. This code uses letters that range from A to F, where A means the data is of excellent quality and F means it is unreliable. Estimates with a quality of F will not be published. Details on these quality rating codes can be requested and should always be taken into consideration when analyzing the data.

Response rates:
The response rate at the estimation phase is 77.4%.

Non-response bias:
In addition to increasing variance, non-response can result in biased estimates if non-respondents have different characteristics from respondents. Non-response is addressed through survey design, respondent follow-up, reweighting, and verification and validation of microdata.
When non-response occurs, it is taken into account, and the quality rating of the estimate is lowered according to how much the non-response affects that estimate.

Coverage error:
Coverage errors consist of omissions, erroneous inclusions, duplications and misclassifications in the survey frame.

The Business Register (BR) was used as the frame. The BR is a data service centre updated through a number of sources including administrative data files, feedback received from conducting Statistics Canada business surveys and profiling activities including direct contact with companies to obtain information about their operations and Internet research findings.

Other non-sampling errors:

Response errors
Response errors result when data are incorrectly requested, provided, received or recorded. These errors include:
- Poor questionnaire design
- Interview bias
- Respondent errors
- Problems with the survey process

Non-response errors
Non-response errors are the result of not having obtained sufficient answers to survey questions. There are two types of non-response errors: complete and partial.

Processing errors
Processing errors may emerge during the preparation of final data files. For example, errors can occur while data are being coded, captured, edited or imputed.

Estimation errors
If an inappropriate estimation method is used, bias can be introduced at the estimation stage, regardless of how error-free the survey was up to that point.

Analysis errors
Analysis errors are those that occur when using the wrong analytical tools or methods. Errors that occur during the publication of data results are also considered analysis errors.
