Annual Survey of Software Development and Computer Services

Detailed information for 2000

Status: Active

Frequency: Annual

Record number: 2410

This survey collects the data necessary for the statistical analysis of the Software Development and Computer Services industry.

Data release - July 22, 2002

Description

The survey objective is to collect and publish data from businesses engaged in providing computer services, data processing services, Internet services or software publishing in Canada.

The information can be used by businesses for market analysis, by trade associations to study performance and other characteristics of their industries, by government to develop national and regional economic policies, and by others involved in research or policy making.

Statistical activity

This survey is part of the Service Industries Program. The survey data gathered are used to compile aggregate statistics for over thirty service industry groupings. Financial data, including revenue, expense and profit statistics, are available for all of the surveys in the program. In addition, many of the surveys compile and disseminate industry-specific information.

Reference period: Calendar year

Subjects

  • Business, consumer and property services
  • Business performance and ownership
  • Financial statements and performance
  • Information and culture
  • Professional, scientific and technical services

Data sources and methodology

Target population

The target population consists of all statistical establishments (sometimes referred to as firms or units) classified during the reference year as Computer Systems Design and Related Services (NAICS 541510), Software Publishers (NAICS 511210) or Data Processing Services (NAICS 514210) according to the North American Industry Classification System (NAICS) 1997.

Instrument design

The 1999 Annual Survey of Software Development and Computer Services questionnaire was field-tested by Statistics Canada in October 1999. Based on contacts with respondents and data users, some modifications have been incorporated into the questionnaire since 1999 in order to reflect the changing nature of the industry surveyed.

Sampling

This is a sample survey.

The survey design was based on probability sampling and covered only the portion of the frame subject to direct data collection.

Prior to the selection of a random sample, establishments are classified into homogeneous groups, i.e., groups with the same NAICS code, the same geography (province/territory) and the same business type (incorporated/unincorporated). Quality requirements are then targeted, and each group is divided into sub-groups called strata: take-all, must-take, and take-some.

The take-all stratum includes the largest firms in terms of performance (based on revenue) in an industry. Every firm is sampled, which means each firm represents itself and is given a weight of one. The must-take stratum is also comprised of self-representing units, but these are selected on the basis of complex structure characteristics (multi-establishment, multi-legal, multi-NAICS, or multi-province enterprises). Units in the take-some strata are subjected to simple random sampling.

Finally, the sample size is increased, mostly to compensate for firms that no longer belong in the industry, i.e., firms that have gone out of business, changed their primary business activity, are inactive, or are duplicates on the frame. After removing such firms, the sample size for 2000 was 1,479 collection entities.
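
To make the design concrete, the following sketch (for illustration only; the revenue threshold, record layout and sampling fraction are hypothetical, not the survey's actual parameters) shows how units could be allocated to take-all, must-take and take-some strata and how the survey weights arise:

    import random

    def allocate_and_sample(establishments, take_all_threshold, take_some_fraction):
        """Illustrative take-all / must-take / take-some allocation and selection."""
        sample = []
        take_some_pool = []
        for unit in establishments:
            if unit["revenue"] >= take_all_threshold:
                # Take-all: the largest firms represent only themselves (weight of one).
                sample.append({**unit, "weight": 1.0})
            elif unit["complex_structure"]:
                # Must-take: complex units (multi-establishment, multi-legal,
                # multi-NAICS or multi-province) are also self-representing.
                sample.append({**unit, "weight": 1.0})
            else:
                take_some_pool.append(unit)
        # Take-some: simple random sampling; each selected unit represents
        # several similar units, so its weight is the inverse of the sampling fraction.
        if take_some_pool:
            n = max(1, round(take_some_fraction * len(take_some_pool)))
            for unit in random.sample(take_some_pool, n):
                sample.append({**unit, "weight": len(take_some_pool) / n})
        return sample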

Data sources

Data collection for this reference period: January 2001 to April 2001

Responding to this survey is mandatory.

Data are collected directly from survey respondents and extracted from administrative files.

Data are collected through a mail-out/mail-back process; respondents also have the option of reporting by telephone or by other electronic filing methods.

Follow-up procedures are applied when a questionnaire has not been received after a pre-specified period.

View the Questionnaire(s) and reporting guide(s).

Error detection

Data are examined for inconsistencies and errors using automated edits coupled with analytical review. Where possible, data are verified using alternate sources.
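
The kinds of automated edits referred to above can be illustrated with a few hypothetical rules (the field names and tolerances below are assumptions, not the survey's actual edit rules):

    def run_edits(record, prior_record=None):
        """Illustrative automated edit checks; failing records go to analytical review."""
        failures = []
        # Validity edit: key financial items must be non-negative.
        if record["total_revenue"] < 0:
            failures.append("negative total revenue")
        # Consistency edit: reported revenue components should sum to the total.
        if abs(sum(record["revenue_components"]) - record["total_revenue"]) > 1:
            failures.append("components do not sum to total revenue")
        # Historical edit: flag large year-over-year movements for review.
        if prior_record and prior_record["total_revenue"] > 0:
            ratio = record["total_revenue"] / prior_record["total_revenue"]
            if ratio > 2 or ratio < 0.5:
                failures.append("large change from previous year")
        return failures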

Imputation

Partial records are imputed to make them complete. Data for non-respondents are imputed using donor imputation, administrative data, or historical data.
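
As a sketch of donor imputation only (the grouping variables and size measure are hypothetical, and non-respondents may instead be imputed from administrative or historical data), a missing record could be filled in from the most similar responding unit:

    def donor_impute(nonrespondent, respondents, size_field="employment"):
        """Illustrative nearest-neighbour donor imputation."""
        candidates = [r for r in respondents
                      if r["naics"] == nonrespondent["naics"]
                      and r["province"] == nonrespondent["province"]]
        if not candidates:
            return None  # fall back to administrative or historical data
        donor = min(candidates,
                    key=lambda r: abs(r[size_field] - nonrespondent[size_field]))
        imputed = dict(nonrespondent)
        # Scale the donor's financial values by the ratio of the size measures.
        ratio = (nonrespondent[size_field] / donor[size_field]
                 if donor[size_field] else 1.0)
        for field in ("revenue", "expenses"):
            imputed[field] = donor[field] * ratio
        return imputed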

Estimation

As part of the estimation process, survey data are weighted and combined with administrative data to produce final industry estimates.
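
A minimal sketch of this step, assuming the weighted survey portion is simply added to the portion of the frame covered by administrative data (the field names are hypothetical):

    def estimate_total(sampled_units, admin_units, field="revenue"):
        """Illustrative estimate of an industry total: weighted survey data
        plus the administratively covered portion of the frame."""
        survey_part = sum(u["weight"] * u[field] for u in sampled_units)
        admin_part = sum(u[field] for u in admin_units)
        return survey_part + admin_part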

Quality evaluation

Prior to dissemination, combined survey results are analyzed for overall quality; in general, this includes a detailed review of individual responses (especially for the largest companies), an assessment of the general economic conditions portrayed by the data, historic trends, and comparisons with other data sources.

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Data accuracy

While considerable effort is made to ensure high standards throughout all stages of collection and processing, the resulting estimates are inevitably subject to a certain degree of error. These errors can be broken down into two major types: non-sampling and sampling.

Non-sampling error is not related to sampling and may occur for many reasons. For example, non-response is an important source of non-sampling error. Population coverage, differences in the interpretation of questions, incorrect information from respondents, and mistakes in recording, coding and processing data are other examples of non-sampling errors.

The response rate for this survey was 60% in reference year 2000, after accounting for firms that no longer belong in the industry, i.e., firms that have gone out of business, changed their primary business activity, are inactive, or are duplicates on the frame.

Sampling error occurs because population estimates are derived from a sample of the population rather than the entire population. Sampling error depends on factors such as sample size, sampling design, and the method of estimation. An important property of probability sampling is that sampling error can be computed from the sample itself by using a statistical measure called the coefficient of variation (CV). The assumption is that over repeated surveys, the relative difference between a sample estimate and the estimate that would have been obtained from an enumeration of all units in the universe would be less than twice the CV, 95 times out of 100. The range of acceptable data values yielded by a sample is called a confidence interval. Confidence intervals can be constructed around the estimate using the CV. First, we calculate the standard error by multiplying the sample estimate by the CV. The sample estimate plus or minus twice the standard error is then referred to as a 95% confidence interval.
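
The arithmetic described above can be worked through with a small, hypothetical example (the figures are illustrative only):

    def confidence_interval(estimate, cv):
        """95% confidence interval built from an estimate and its CV."""
        standard_error = estimate * cv        # standard error = estimate x CV
        return (estimate - 2 * standard_error,
                estimate + 2 * standard_error)

    # A hypothetical revenue estimate of $500 million with a CV of 5% gives a
    # standard error of $25 million and a 95% confidence interval of roughly
    # $450 million to $550 million.
    low, high = confidence_interval(500.0, 0.05)   # (450.0, 550.0)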

For the Annual Survey of Software Development and Computer Services, CVs were calculated for each estimate. Generally, the more commonly reported variables obtained very good CVs (10% or less), while the less commonly reported variables were associated with higher but still acceptable CVs (under 25%). The CVs are available upon request.
