Annual Survey of Research and Development in Canadian Industry (RDCI)
Detailed information for 2016
This survey collects research and development expenditures and personnel data used to monitor science and technology related activities of companies and industrial non-profit organizations in Canada.
Data release - April 19, 2017 (intentions)
The Research and Development in Canadian Industry (RDCI) survey is a cross economy survey of companies and industrial non-profit organizations in Canada that 1) perform or fund research and development (R&D) or 2) have previously reported R&D expenditures and have recent payments or receipts for technology. The survey comprises companies and industrial non-profit organizations in all NAICS industries other than universities (NAICS 61131) and all levels of government (NAICS 91 public administration).
The concepts and definitions employed in the collection and dissemination of research and development (R&D) data are provided in the Frascati Manual 2015: Guidelines for Collecting and Reporting Data on Research and Experimental Development (Organisation for Economic Co-operation and Development (OECD), 2015). According to this definition:
"Research and experimental development (R&D) comprise creative and systematic work undertaken in order to increase the stock of knowledge - including knowledge of humankind, culture and society - and to devise new applications of available knowledge."
The RDCI collects in-house R&D expenditures and personnel, outsourced R&D expenditures and payments and receipts for technology.
In-house R&D expenditures include current costs (comprised of wages and salaries of permanent, temporary and casual employees; services to support R&D; R&D materials; and all other current costs) and capital costs (comprised of software; land; buildings and structures; and equipment, machinery and all other capital costs). In-house R&D expenditures are characterized by their geographic distribution (provinces and territories), sources of funds (originating sector inside or outside Canada), fields of research and development and nature of R&D activity (basic research, applied research and experimental development).
In-house R&D personnel include researchers and research managers; R&D technical, administrative and support staff; and other R&D occupations and these are available by geographic distribution (provinces and territories).
Outsourced R&D expenditures comprise payments made to other organizations to perform R&D and may be directed to other organizations (comprising companies; private non-profit organizations; industrial research institutes or organizations; hospitals; universities; federal government departments or agencies; provincial government departments, ministries and agencies; provincial research organizations or other organizations or individuals) inside or outside Canada.
Technology payments include payments made or received for patents, copyrights, trademarks, industrial designs, integrated circuit topography designs, original software, packaged off-the-shelf software, databases with a useful life exceeding one year, other technical assistance, industrial processes and know-how. Technology payments can be made to, or received from affiliated or unaffiliated organizations.
Payments made and received for original software and packaged or off-the-shelf software can also be reported as research and development expenditures on software-related capital.
Payments made for other intellectual property and technology-related services may include outsourced research and development payments.
Payments received for other intellectual property and technology related services may include sources of funds from parent, affiliated, subsidiary, or other companies for R&D expenditures.
The survey is administered as part of the Integrated Business Statistics Program (IBSP). The IBSP program has been designed to integrate approximately 200 separate business surveys into a single master survey program. The survey instrument conforms to the common look, structure and content for business surveys in the integrated program.
Reference period: The fiscal year with an end date between April 1 of the reference year (RY) and March 31 of RY+1
Collection period: December to April after the reference period
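The reference period rule above can be sketched as a small date calculation. This is an illustration only: the function name `reference_year` is hypothetical and is not part of any Statistics Canada system.

```python
from datetime import date


def reference_year(fiscal_year_end: date) -> int:
    """Return the reference year (RY) for a fiscal year ending on the given date.

    Fiscal years ending from April 1 of RY through March 31 of RY+1 map to RY.
    """
    # Year ends falling in January through March belong to the previous RY.
    if fiscal_year_end.month <= 3:
        return fiscal_year_end.year - 1
    return fiscal_year_end.year
```

Under this rule, a fiscal year ending on December 31, 2016 and one ending on March 31, 2017 both fall in reference year 2016.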
- Research and development
- Science and technology
Data sources and methodology
The target population for the survey of Research and Development in Canadian Industry (RDCI) comprises all companies and industrial non-profit organizations which perform and/or fund research and development (R&D) or have had R&D expenditures in the past and continue to make or receive technology payments within the reference period. The survey is a cross economy survey and includes all NAICS codes except NAICS 61131 (universities) and NAICS 91 (public administration).
The Research and Development in Canadian Industry (RDCI) uses two questionnaires: one for companies and another for industrial non-profit organizations. These questionnaires were developed to conform to international standards for research and development concepts (OECD, Frascati Manual 2015). Electronic questionnaires (EQ) are the principal mode of collection and these were tested with company respondents in English and French to confirm respondents' understanding of terminology, concepts and definitions as well as their ability to provide the requested data and to navigate the EQ applications. Questionnaire content testing occurred in March 2014 in English in Ottawa, Toronto and Montreal and in French in Gatineau and Montreal. This first round of testing concentrated on validating respondents' understanding of concepts, questions, terminology, the appropriateness of response categories and the availability of requested information. The subsequent round of testing in June 2015 occurred in English in Toronto and French in Montreal. This final round of testing confirmed that respondents could navigate through the EQ application with ease while providing the requested information.
This is a sample survey with a cross-sectional design.
The RDCI sample is stratified by industry (57 unique industry groups), R&D size and geography.
Data collection for this reference period: 2015-12-02 to 2016-03-31
Responding to this survey is mandatory.
Data are collected directly from survey respondents and extracted from administrative files.
Electronic questionnaire with non-response follow-up and failed edit follow-up.
Administrative data are those data that have been collected for administrative purposes (for example, to administer, regulate or tax activities of companies or individuals) as opposed to statistical purposes. The use of administrative data reduces data collection costs and respondent burden. Concepts or definitions of administrative data variables differ from those identified in survey design. The administrative data source does not cover the entire target population and as such sampling error will be present. The portion not covered by tax data has been identified as Must-Take units to address possible sampling error issues. Non-sampling errors and bias may be present as a result of data collection methodology.
Administrative data are used for many different statistical purposes: replacing or complementing direct data collection to reduce costs and respondent burden; achieving efficiencies in statistical operations, such as the creation of survey frames, design of survey samples, imputation, and estimation. In collaboration with data providers, Statistics Canada uses its mandate under Section 13 of the Statistics Act to access administrative data for statistical purposes.
The confidentiality of administrative data relating to individual persons, companies or organizations (referred to as identifiable administrative data) must be strictly maintained as required by Subsection 17(1) of the Statistics Act.
Scientific Research and Experimental Development (SR&ED) tax incentive program data are used for data replacement for Take-None units. However, SR&ED does not collect capital R&D expenditures and lease costs, R&D expenditures in the social sciences and humanities, or payments for R&D performed by organizations outside Canada.
Corporation income tax return data (T2) provided by the Canada Revenue Agency (CRA) are used to provide revenue information previously collected on the survey.
Payroll deduction tax data (PD7) provided by the Canada Revenue Agency (CRA) are used to generate employment size categories (based on the number of employees variable) for dissemination purposes.
In addition to data collected through the survey, the RDCI uses administrative data from CRA (approved Scientific Research and Experimental Development tax credit applications) for the "take-none" component of the sample. This is done to reduce response burden for smaller companies. These data are also used to assist in imputation for non-response.
Records are matched by Business Number root (BN). The definitions of R&D used by CRA and Statistics Canada are similar but not identical: the Statistics Canada definition includes elements which are not included by CRA. These elements comprise all capital expenditures related to R&D, current costs for rental of capital goods, and R&D in the social sciences and humanities.
Data editing occurs at a number of steps in the survey process: during data collection, during failed edit follow-up and during processing.
During collection: the electronic survey questionnaire (EQ) has embedded edits which activate while the respondent completes the EQ if a pre-specified likely error condition is met (example: if the sources of funds for in-house R&D do not equal total in-house R&D expenditures within +/- 5%). The respondent receives an error message and they can either correct the data or accept their response and proceed.
During failed edit follow-up: once the electronic questionnaire has been submitted, the same edits are applied and an error report generated. If a key edit fails the respondent will be called to correct the information or obtain an explanation. The record is then corrected or an explanation note is added to the file to explain the information provided.
During processing: editing in processing involves a series of pre-specified conditions which identify an error (example: components do not add to total, or totals do not equal each other across questions such as in-house R&D by category, provincial and territorial distribution, sources of funds, and fields of R&D). All errors are flagged so that they can be corrected through an imputation process.
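The two example edits mentioned above (sources of funds within +/- 5% of total in-house R&D, and components adding to the total) can be sketched as follows. This is a minimal illustration, not the actual EQ or processing logic: the function names, the record layout and the field names are all hypothetical, and amounts are assumed to be whole numbers.

```python
def funds_edit_passes(sources_of_funds, total_in_house, tolerance=0.05):
    """True when the sum of sources of funds is within +/- tolerance of the total."""
    funded = sum(sources_of_funds.values())
    return abs(funded - total_in_house) <= tolerance * abs(total_in_house)


def components_edit_passes(components, reported_total):
    """True when the reported components add exactly to the reported total."""
    return sum(components.values()) == reported_total


def failed_edits(record):
    """Return the names of the edits that a record fails (hypothetical layout)."""
    failures = []
    if not funds_edit_passes(record["sources_of_funds"], record["total_in_house"]):
        failures.append("sources_of_funds")
    if not components_edit_passes(record["cost_components"], record["total_in_house"]):
        failures.append("cost_components")
    return failures
```

A record whose failed-edit list is non-empty would be flagged, either for follow-up with the respondent or for correction through imputation.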
When non-response occurs, when respondents do not completely answer the questionnaire, or when reported data are considered incorrect during the error detection steps, imputation is used to fill in the missing information and modify the incorrect information. Many methods of imputation may be used to complete a questionnaire, including manual changes made by an analyst. The automated, statistical techniques used to impute the missing data include:
- deterministic imputation (for example adding components to create a total),
- replacement using previously reported anticipated values for the current period values (the survey asks for reference year (RY) and RY+1 and RY+2 values for key variables),
- replacement using historical data (with a trend calculated, when appropriate),
- replacement using auxiliary information available from other sources,
- replacement based on known data relationships for the sample unit, and
- replacement using data from a similar unit in the sample (known as donor imputation).
For the research and development surveys, the key question on expenditures for in-house R&D is verified or imputed first and then these values are used as anchors in subsequent steps to impute other, related, variables.
Imputation generates a complete and coherent micro data file that covers all survey variables.
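A few of the imputation methods listed above can be sketched as a simple cascade for the key in-house R&D total. The order in which the methods are tried, the function name `impute_total` and its parameters are assumptions made for illustration; the source does not state a priority order.

```python
def impute_total(components=None, anticipated=None, previous=None, trend=1.0):
    """Return (value, method) for a missing total, trying simple methods in turn.

    components  -- reported component values, if all are available
    anticipated -- value the respondent anticipated for this period in an earlier cycle
    previous    -- value reported for the previous reference year
    trend       -- multiplicative trend applied to the historical value
    """
    # Deterministic imputation: the total is the sum of its reported components.
    if components and all(c is not None for c in components):
        return sum(components), "deterministic"
    # Replacement using the previously reported anticipated value.
    if anticipated is not None:
        return anticipated, "anticipated"
    # Replacement using historical data, with a trend applied.
    if previous is not None:
        return previous * trend, "historical"
    # Nothing usable for this unit: fall back to a donor (not sketched here).
    return None, "donor_needed"
```

Once the total is settled, it would serve as the anchor for imputing the related variables, as described above.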
The sample used for estimation comes from a one phase sampling process. An initial sampling weight (the design weight) is calculated for each unit of the survey and is the inverse of the probability of selection. It is then adjusted to take into account outliers that might have been misclassified.
The weight calculated for each sampling unit indicates how many units in the population it represents, including itself. The final weights are therefore usually equal to or greater than one. Sampling units which are selected with certainty (must-take units) have sampling weights of one and only represent themselves; outlier units with larger than expected size are seen as misclassified and their weight is usually adjusted so that they only represent themselves.
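The weighting described above can be sketched as follows. The function names are hypothetical, and setting an outlier's weight to exactly one is a simplification of the "usually adjusted" rule in the text.

```python
def design_weight(selection_probability: float) -> float:
    """Design weight: the inverse of the probability of selection."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


def final_weight(selection_probability: float, is_outlier: bool = False) -> float:
    """Design weight, reduced to one for units treated as misclassified outliers."""
    if is_outlier:
        # Simplifying assumption: the outlier only represents itself.
        return 1.0
    return design_weight(selection_probability)
```

A must-take unit has a selection probability of one, so its weight is one; a unit selected with probability 0.25 has a weight of four.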
The sampling unit, the company, is also the estimation unit. The characteristics of the estimation units, including industrial classification, are used to calculate aggregate estimates. Estimation for the survey portion is done by simple aggregation of the weighted values of all sampled companies that are found in the domain of estimation. Estimates are computed for several domains of estimation, such as industry groups, country of control and company size, based on the most recent classification information available for the company and the survey reference period.
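Simple aggregation of weighted values by domain can be sketched as below. The record layout (dictionaries with a `weight` key) and the function name are hypothetical.

```python
from collections import defaultdict


def domain_estimates(records, domain_key, variable):
    """Sum weight * value over sampled units, grouped by a domain of estimation."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[domain_key]] += rec["weight"] * rec[variable]
    return dict(totals)


# Hypothetical sampled companies: one must-take unit and two sampled units.
sample = [
    {"industry": "A", "weight": 1.0, "in_house_rd": 500.0},
    {"industry": "A", "weight": 4.0, "in_house_rd": 25.0},
    {"industry": "B", "weight": 2.0, "in_house_rd": 10.0},
]
estimates = domain_estimates(sample, "industry", "in_house_rd")
# estimates == {"A": 600.0, "B": 20.0}
```

The same function can be reused for other domains, such as country of control or company size, by changing `domain_key`.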
For the portion of the target population that is ineligible for sampling (the take-none portion), a model estimate is produced using two adjustments. The first is derived from the relationship between two closely related variables: current in-house expenditures from the questionnaire and current in-house expenditures from tax data. The second is used to model all other variables based on either the in-house R&D expenditures or the outsourced R&D expenditures in Canada. The overall estimate is composed of estimates from both the surveyed and modeled portions.
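The source does not specify the form of the first adjustment. One plausible reading is a ratio between questionnaire and tax values, computed on units that have both and then applied to the tax data of take-none units. The sketch below assumes that ratio form; the function names are hypothetical and this is not presented as the model actually used.

```python
def ratio_adjustment(pairs):
    """Ratio of questionnaire totals to tax totals.

    pairs -- (questionnaire_value, tax_value) for units that have both sources
    """
    survey_total = sum(survey for survey, _ in pairs)
    tax_total = sum(tax for _, tax in pairs)
    if tax_total == 0:
        raise ValueError("tax total must be non-zero to form a ratio")
    return survey_total / tax_total


def take_none_estimate(tax_values, ratio):
    """Modeled total for take-none units: tax data scaled by the ratio (assumed form)."""
    return ratio * sum(tax_values)
```

The overall estimate would then be the sum of the weighted survey estimate and this modeled take-none total.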
Prior to the data release, combined survey results are analyzed for comparability; in general, this includes a detailed review of:
- individual responses (especially for the largest companies),
- general economic conditions,
- coherence with results from related economic indicators,
- historical trends, and
- information from other external sources (e.g. associations, trade publications, newspaper articles).
The survey estimates are also compared with trends observed in data from previous collection cycles and with media reports, and questionnaire data are compared with administrative data for important respondents over multiple reporting periods.
Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, company, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
In order to prevent any data disclosure, confidentiality analysis is done using the Statistics Canada generalized confidentiality system (G-CONFID). G-CONFID is used for primary suppression (direct disclosure) as well as for secondary suppression (residual disclosure). Direct disclosure occurs when the value in a tabulation cell is composed of or dominated by a few enterprises, while residual disclosure occurs when confidential information can be derived indirectly by piecing together information from different sources or data series.
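The idea of a cell being "composed of or dominated by a few enterprises" can be illustrated with a generic dominance check. This is not G-CONFID and does not reflect its rules: the function name and both thresholds (`min_units`, `max_top_share`) are hypothetical values chosen for illustration, and secondary suppression is not covered.

```python
def is_sensitive(contributions, min_units=3, max_top_share=0.8):
    """Flag a tabulation cell for primary suppression under a generic dominance rule.

    contributions -- each enterprise's contribution to the cell total
    min_units     -- hypothetical minimum number of contributors
    max_top_share -- hypothetical maximum share for the largest contributor
    """
    positive = [c for c in contributions if c > 0]
    if not positive:
        return False  # an empty cell discloses nothing under this sketch
    if len(positive) < min_units:
        return True  # too few enterprises in the cell
    return max(positive) / sum(positive) > max_top_share
```

Under these illustrative thresholds, a cell with contributions of 90, 5 and 5 is flagged because one enterprise accounts for 90% of the total, while a cell with 40, 30 and 30 is not.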
Revisions and seasonal adjustment
For RY2013, the former methodology will be used for revision.
Data for RY - 1 reference period are revised in the following way:
- Inclusion of new units identified using tax data for companies not in the frame at the time the sample was drawn.
- Use of new tax data received for the non-response for the sample portion to revise imputed values in some cases.
There are two types of errors to which survey data can be subject: sampling errors and non-sampling errors. Sampling error occurs because population estimates are derived from a sample of the population rather than the entire population. Non-sampling error is not related to sampling and may occur for various reasons during the collection and processing of data.
Non-sampling errors include:
- Non-response (both total and partial)
- Under or over-coverage of the population
- Differences in the interpretations of questions and mistakes in reporting
- Coding and processing errors
To the maximum extent possible, these errors are minimized through careful design of the survey questionnaire, verification of the survey data, and follow-up with respondents when needed to maximize response rates.
Imputation rates can be estimated to generate a quality rating code. The imputation rate is calculated based on the contribution of imputed values to the total estimate. The quality indicator code uses letters that range from A to F, where A means the data are of excellent quality and F means the estimates are too unreliable to be published. These quality rating codes should always be taken into consideration when using the estimates.
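The imputation rate described above, the contribution of imputed values to the total estimate, can be sketched as follows. The cut-offs that map a rate to a letter are not given in the source; the values used here are hypothetical placeholders, as are the function names and the record layout.

```python
def imputation_rate(records, variable):
    """Share of the weighted total estimate that comes from imputed values."""
    total = sum(r["weight"] * r[variable] for r in records)
    imputed = sum(r["weight"] * r[variable] for r in records if r["imputed"])
    return imputed / total if total else 0.0


def quality_code(rate, cutoffs=(0.10, 0.25, 0.50, 0.75, 0.90)):
    """Map an imputation rate to a letter from A to F (hypothetical cut-offs)."""
    for letter, cutoff in zip("ABCDE", cutoffs):
        if rate < cutoff:
            return letter
    return "F"
```

For example, if imputed records contribute 20 of a weighted total of 100, the rate is 0.2, which maps to B under these placeholder cut-offs.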
Another indicator of quality is the survey response rate.
- Changes to the survey for RY2014