Survey of Fraud Against Businesses
Detailed information for April 1 to September 30, 2008
The purpose of this survey is to collect national-level data on the prevalence and types of fraud experienced by certain business sectors. These data are required to respond to the need for better information on the nature and extent of fraud in Canada in order to improve policy and public education with respect to this issue. The survey also collects information on fraud detection and prevention and actions taken in response to incidents of fraud, including use of the criminal justice system.
Data release - December 10, 2009
The Survey of Fraud Against Businesses (SFAB) was conducted in partnership with Public Safety Canada through funds from the Policy Research Initiative. It is a national survey that collects information on the nature and extent of fraud experienced by small, medium and large business establishments from the retail, banking and insurance industries (both property/casualty insurance and health insurance), as defined by the North American Industry Classification System (NAICS).
Some of the types of fraud measured include: credit and debit card fraud; fraudulent use of cheques; use of counterfeit currency; use of false identity or false information in applications or other documents; advance fee schemes; false billing; and insurance fraud. The SFAB also collects information on the method by which the fraud was committed, dollar loss, non-monetary effects of fraud on business establishments, the actions typically taken regarding incidents of fraud (e.g. reporting to police or other enforcement agencies), and prevention and detection measures business establishments have in place.
- Crime and justice
- Crimes and offences
Data sources and methodology
The target population for the survey is Canadian businesses from the retail (442, 443, 444, 446, 448, 452, 4532), banking (5221), health and disability insurance (52411), and property and casualty insurance (52412) industrial sectors, as defined by the North American Industry Classification System (NAICS). All establishments with five or more employees were considered in scope.
The Survey of Fraud Against Businesses was developed through consultations and focus testing with various stakeholders, including federal partners, law enforcement specialists, private industry fraud specialists and business associations. These stakeholders were given a copy of the questionnaire and asked to comment on the content and fraud definitions. Based on their input, the survey instrument was refined, then tested and revised again following a pilot survey in the fall of 2006.
All SFAB respondents were asked a core set of questions on the following topics:
- General information, number of employees, fiscal year
- Consequences of fraud, dollar loss, impact on company (Section E)
- Fraud detection methods (Section F)
- Fraud prevention methods (Section G)
- Revenue (Section H)
However, as each industry experiences specific types of fraud that reflect some of their inherent business practices, the SFAB had four different questionnaires tailored to the four industries in the survey sample. The industry-specific questions asked respondents about the particular types of fraud experienced by their business establishment (Sections A to D).
This is a sample survey with a cross-sectional design.
The sample was drawn from the Statistics Canada Business Register during October 2007.
Sample unit: 7,597 Establishments
Stratification: Establishments from the following NAICS industries:
Retail (442, 443, 444, 446, 448, 452, 4532),
Banking (5221),
Health and disability insurance (52411),
Property and casualty insurance (52412).
Only establishments with five or more employees were considered in scope. For approximately 90% of the sample, information was collected directly from individual business establishments. For the remainder, information was collected from head offices representing multiple establishments and then, where feasible, broken down to the level of single establishments.
Data collection for this reference period: 2008-03-15 to 2008-12-15
Responding to this survey is voluntary.
Data are collected directly from survey respondents.
The pen-and-paper questionnaires were mailed out in March 2008 and collection closed nine months later, in December. Respondents were asked to return completed forms by post or fax. Follow-up calls were made throughout the collection process to clarify responses or capture additional information where responses were incomplete.
The overall response rate for the survey was about 57%. The approximate response rates by sector were as follows: retail, 57%; banking, 64%; health insurance, 50%; and property insurance, 48%.
Editing of the data was conducted once the collection period was complete. A number of edits were applied using a custom SAS program to detect outliers at the micro level. A univariate approach was used for continuous variables, and several ratio edits were applied to identify records that appeared far from the average within their industry. Records failing edits were manually reviewed.
Questionnaires were edited in order to detect missing, invalid or inconsistent data entries. Editing included validity checks, as well as consistency and distribution edits. Validity checks identify, for example, blanks or impossible entries; consistency edits ensure that responses are uniform across the questionnaire (e.g., an establishment that said it had incurred direct fraud losses must have indicated it had at least one fraud incident); while distribution edits look at the highest values for certain variables to ensure their correctness and are a means for detecting outliers or anomalies.
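The three kinds of edit described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the survey's production edit program: the field names (fraud_incidents, direct_loss) and the review threshold are hypothetical, and the consistency rule is the one quoted in the paragraph above.

```python
def edit_record(record, upper_bounds):
    """Return a list of edit failures for one questionnaire record.

    record       -- dict of field name to value (None means blank); hypothetical fields
    upper_bounds -- dict of field name to a review threshold (assumed values)
    """
    failures = []

    # Validity checks: blanks or impossible (negative) entries.
    for field in ("fraud_incidents", "direct_loss"):
        value = record.get(field)
        if value is None:
            failures.append(f"validity: {field} is blank")
        elif value < 0:
            failures.append(f"validity: {field} is negative")

    # Consistency edit: a reported direct loss implies at least one incident.
    incidents = record.get("fraud_incidents")
    loss = record.get("direct_loss")
    if incidents is not None and loss is not None and loss > 0 and incidents < 1:
        failures.append("consistency: loss reported with no incidents")

    # Distribution edit: flag the highest values for manual review.
    for field, bound in upper_bounds.items():
        value = record.get(field)
        if value is not None and value > bound:
            failures.append(f"distribution: {field} above review threshold")

    return failures
```

A record reporting a 500 dollar loss with zero incidents would fail only the consistency edit, and would then be routed to manual review rather than corrected automatically.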
Imputation was used to complete partial non-response to the survey. The imputation method used for imputing the majority of the missing values for item non-response was nearest neighbour donor imputation. This method involves replacing missing values for a non-responding unit with the corresponding values obtained for the responding unit (donor) nearest to it in terms of a vector of matching variables (i.e., industry type, region, CMA size, business size, and NAICS three-digit code) and a given distance measure. On average, imputation rates for SFAB variables fell below 10%.
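As a rough illustration of nearest neighbour donor imputation, the sketch below fills a missing item from the closest responding record. The matching variables are those listed above, but the distance measure (a simple count of mismatched variables) and all field names are assumptions made for the example; the survey's actual distance measure is not specified here.

```python
from collections import Counter

# Matching variables named in the text; the dict keys are hypothetical.
MATCH_VARS = ("industry", "region", "cma_size", "business_size", "naics3")


def distance(a, b):
    """Assumed distance: number of matching variables on which two records differ."""
    return sum(a[v] != b[v] for v in MATCH_VARS)


def impute(records, target):
    """Fill missing values of `target` from the nearest donor.

    Returns the completed records and a count of how often each donor was used.
    Raises ValueError if no record has a reported value for `target`.
    """
    donors = [r for r in records if r[target] is not None]
    donor_use = Counter()
    completed = []
    for rec in records:
        rec = dict(rec)  # copy so the input records are not modified
        if rec[target] is None:
            donor = min(donors, key=lambda d: distance(rec, d))
            rec[target] = donor[target]
            rec["imputed"] = True
            donor_use[donor["id"]] += 1
        completed.append(rec)
    return completed, donor_use
```

The donor_use counter supports the kind of review that checks whether any single donor was used extensively.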
To deal with total non-response, a weight adjustment was performed so that the weights of the responding units (complete and partial respondents) were increased to represent the non-responding units.
After imputation was completed, edit checks were again performed on the post-imputation data to ensure that imputed values did not break any significant edit rules. As well, a review of the number of times each donor was used was conducted to ensure that the same donor was not used extensively throughout the imputation process. This review showed that in most cases donors were not used more than once or twice.
Data for businesses with more than three in-scope single establishments were collected at the head office level. Therefore, it was necessary to allocate head office data back to the individual business establishments for which they reported. This allocation was applied first to establish a total number of fraud incidents, and then the same allocation percentages were applied to other questions that were thought to be fit for this procedure (i.e., where it was believed this model could provide reasonably accurate allocation of the head office data to the establishments).
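The idea of reusing one set of allocation percentages across several questions can be sketched as follows. The basis for the percentages is not stated in this document, so the example assumes, purely for illustration, that shares are proportional to a size measure such as employee counts; the function names are hypothetical.

```python
def allocation_shares(size_by_establishment):
    """Turn a size measure per establishment into allocation percentages.

    The size measure (e.g. employee counts) is an assumption for this sketch.
    """
    total = sum(size_by_establishment.values())
    return {est: size / total for est, size in size_by_establishment.items()}


def allocate(head_office_value, shares):
    """Split one head-office total across establishments using fixed shares."""
    return {est: head_office_value * share for est, share in shares.items()}
```

For example, with shares of 25% and 75%, a head-office total of 40 incidents would be allocated as 10 and 30, and the same shares would then be applied to a head-office dollar loss.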
At the questionnaire level, the weighting was done within SAS using the standard Horvitz-Thompson weights for stratified simple random sampling (SRS). Wherever possible, re-weighting was done at the stratum level (industry, province, business size and collection type) to account for total non-response. In cases where not enough responses were available within a stratum, the stratum was combined with a similar stratum for the non-response adjustment. For the re-weighting process, the weights of the units resolved at collection were increased to account for the units unresolved at collection, so that the sum of the weights for the resolved units equalled the population total (at the questionnaire level).
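A minimal sketch of these two weights for a single stratum is given below. Under stratified simple random sampling the Horvitz-Thompson design weight is the stratum population count divided by the stratum sample count; the non-response factor shown is the usual ratio of sampled to responding units. The function names and the counts in the usage note are hypothetical.

```python
def design_weight(pop_count, sample_count):
    """Horvitz-Thompson weight under stratified SRS: N_h / n_h."""
    return pop_count / sample_count


def nonresponse_factor(sample_count, respondent_count):
    """Inflate respondent weights so they also represent non-respondents: n_h / r_h."""
    if respondent_count == 0:
        # Mirrors the text: a stratum with too few responses must be
        # combined with a similar stratum before adjusting.
        raise ValueError("no respondents: collapse with a similar stratum first")
    return sample_count / respondent_count
```

With a stratum of 1,000 units, 100 sampled and 60 responding, each respondent carries a weight of 10 times 100/60, and the 60 adjusted weights sum back to the population total of 1,000.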
These two calculated weights were multiplied by each other to produce an initial final weight. However, since the weights were calculated at the questionnaire level, a responding unit collected at the head office level is used to represent a non-responding head office collection level unit, regardless of the number of establishments each enterprise represents. As a result, a post-stratification adjustment to this initial final weight was done, based on the known total counts of establishments of all units to be collected at the enterprise level. Doing this post-stratification brings the weighted sum of the head office level establishment counts for responding units up to this known total of establishments for this sub-population.
The adjustment factor was calculated at the industry level for all questionnaires collected at the enterprise level.
The final estimation weight is calculated as the product of the design weight, the non-response adjustment and the post-stratification weight.
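The post-stratification step and the final product can be sketched as follows, assuming each responding head-office unit carries an initial weight and a count of the establishments it reports for. The function names and example figures are hypothetical.

```python
def poststrat_factor(known_total, responding_units):
    """Ratio that brings the weighted establishment count up to the known total.

    responding_units -- list of (initial_weight, establishment_count) pairs for
                        responding head-office units in one industry
    """
    weighted_count = sum(w * n for w, n in responding_units)
    return known_total / weighted_count


def final_weight(design_w, nonresponse_adj, poststrat_adj=1.0):
    """Final estimation weight: product of the three components.

    Units collected directly from establishments keep poststrat_adj = 1.0
    in this sketch.
    """
    return design_w * nonresponse_adj * poststrat_adj
```

For instance, two responding head offices with initial weight 2.0 that represent 10 and 30 establishments give a weighted count of 80; if the known total is 120, the factor is 1.5 and the adjusted weights reproduce the total of 120 establishments.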
Within the Generalized Estimation System (GES), a single-stage cluster design approach was used to produce estimates, where the questionnaire was considered the cluster.
Statistics Canada is prohibited by law from releasing any information it collects that could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
To prevent data disclosure, confidentiality analysis is done using the Statistics Canada Generalized Disclosure Control System (G-Confid). G-Confid is used for primary suppression (direct disclosure) as well as for secondary suppression (residual disclosure). Direct disclosure occurs when the value in a tabulation cell is composed of or dominated by a few enterprises, while residual disclosure occurs when confidential information can be derived indirectly by piecing together information from different sources or data series.
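To make the notion of a cell "composed of or dominated by a few enterprises" concrete, the sketch below applies a generic minimum-count and dominance check to the contributions in one tabulation cell. It is not G-Confid's interface or its actual rules: the function name and both thresholds are assumptions chosen only for illustration.

```python
def is_sensitive(contributions, min_contributors=3, max_top_share=0.8):
    """Flag a cell for primary suppression under two assumed rules.

    contributions    -- values contributed to the cell by each enterprise
    min_contributors -- assumed minimum number of contributors
    max_top_share    -- assumed maximum share for the largest contributor
    """
    positive = [c for c in contributions if c > 0]
    # Too few contributors: the cell is "composed of" a few enterprises.
    if len(positive) < min_contributors:
        return True
    # One contributor accounts for most of the total: the cell is "dominated".
    return max(positive) / sum(positive) > max_top_share
```

Secondary suppression, which protects a suppressed cell from being recovered by subtraction from row or column totals, is a separate optimization problem and is not shown.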
Revisions and seasonal adjustment
This methodology does not apply to this survey.
Data accuracy measures for this survey included coefficients of variation (CVs) and imputation rates for all key variables. An estimate with a CV higher than 35% was considered too unreliable to be published. When the CV of an estimate was between 25% and 35%, the estimate was used with caution to support a conclusion.
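The release thresholds above can be expressed as a small helper. This is a sketch: the CV is taken as the standard error divided by the estimate, expressed as a percentage, the treatment of the exact boundary values (25% and 35%) is an assumption, and the flag labels are hypothetical.

```python
def cv_percent(estimate, standard_error):
    """Coefficient of variation as a percentage of the estimate.

    Raises ZeroDivisionError for an estimate of zero.
    """
    return 100.0 * standard_error / abs(estimate)


def quality_flag(cv):
    """Map a CV (in percent) to a release decision using the stated thresholds."""
    if cv > 35.0:
        return "too unreliable to publish"
    if cv >= 25.0:
        return "use with caution"
    return "publish"
```

For example, an estimate of 200 with a standard error of 30 has a CV of 15% and would be published without a caution flag.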