Canadian Housing Statistics Program (CHSP)

Detailed information for 2022

Status:

Active

Frequency:

Annual

Record number:

5257

The Canadian Housing Statistics Program is an innovative data project that leverages existing data sources and transforms them into new and timely indicators on Canadian housing.

Data release - October 12, 2023; November 20, 2023; July 29, 2024; August 21, 2024; October 3, 2024; December 9, 2024 (see the Summary of changes page for available jurisdictions)

Description

Statistics Canada was mandated to create a residential property database: a comprehensive repository of data that covers numerous aspects of the housing sector. The database, under the responsibility of the Canadian Housing Statistics Program (CHSP), will ultimately include all residential properties in Canada and their owners.

The database was developed by combining data from multiple sources (e.g., property assessment rolls, land titles, Census of Population, etc.) and provides detailed information at the property and owner levels.

The CHSP residential property database, initialized in 2017, continues to be expanded with new geographies and variables as new data become available. The database currently contains 2955 census subdivisions across twelve provinces and territories. Data from Quebec are not yet available in the CHSP.

Collection period: Ongoing

Subjects

  • Business, consumer and property services
  • Families, households and housing
  • Housing and dwelling characteristics
  • Rental and leasing and real estate

Data sources and methodology

Target population

The target population consists of all residential properties and residential property owners in Canada.

At this time, the CHSP covers residential properties and residential property owners in all the provinces and territories, except Quebec (with varying data vintages to the extent that data have been made available by data providers).

The CHSP does not currently cover information about residential properties on Indian reserves, or collective dwellings (e.g., nursing homes, jails or staff residences). Commercial, industrial and institutional properties are out of scope for the CHSP, which focuses exclusively on residential properties. Properties with mixed residential and non-residential portions are included, but the property characteristics reported in the CHSP reflect only the residential portions of mixed properties.

Instrument design

This methodology type does not apply to this statistical program.

Sampling

This methodology type does not apply to this statistical program.

Data sources

Data are extracted from administrative files.

The CHSP leverages existing data from provincial and territorial land registries, property assessment programs and other administrative data files to create a database of all residential properties in Canada.

Property-level data are obtained from land registries and property assessment programs. Owner-level information is also derived from land registries and property assessment programs, and a variety of owner characteristics are linked from tax data, the Business Register, the Census of Population, and the Longitudinal Immigration Database. This owner information is supplemented with indicators of residency in the economic territory of Canada, which are obtained by linkage to various data sources, including tax data and the Census of Population.

The municipalities covered in the source data are assigned to census subdivisions (CSDs), which are updated on a yearly basis by Statistics Canada's Standard Geographical Classification System. Some CSD types, such as Indian reserves, are out of scope. Values for such CSDs are not part of the estimates.

The record linkage process is implemented using custom software developed at Statistics Canada. G-Link, part of Statistics Canada's suite of generalized systems, was used to perform probabilistic record linkage, while SAS and Mix-Match software were used to perform deterministic linkage.
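The difference between deterministic and probabilistic linkage can be illustrated with a minimal Python sketch. It does not reproduce G-Link, SAS or Mix-Match. The field names, the agreement probabilities and the threshold are hypothetical values chosen for illustration only, and the weighting follows the general Fellegi-Sunter approach rather than any documented CHSP configuration.

```python
import math

# Hypothetical agreement probabilities per field (illustration only):
# m = P(field agrees | the two records are a true match)
# u = P(field agrees | the two records are not a match)
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "postal_code": (0.90, 0.05),
    "birth_year": (0.98, 0.02),
}

THRESHOLD = 3.0  # hypothetical cut-off above which a pair is treated as a link


def deterministic_match(a, b, keys=("surname", "postal_code", "birth_year")):
    """Deterministic linkage: link only if every key agrees exactly."""
    return all(a[k] == b[k] for k in keys)


def match_weight(a, b):
    """Probabilistic linkage: sum log2 agreement or disagreement weights."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if a[field] == b[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


# Two made-up records that differ only on birth year.
owner = {"surname": "EXAMPLE", "postal_code": "A1A1A1", "birth_year": 1970}
tax_record = {"surname": "EXAMPLE", "postal_code": "A1A1A1", "birth_year": 1971}

exact = deterministic_match(owner, tax_record)   # one field disagrees
weight = match_weight(owner, tax_record)         # still a positive total weight
probable = weight >= THRESHOLD
```

In this sketch the deterministic rule rejects the pair because one field disagrees, while the probabilistic score still exceeds the assumed threshold because the agreeing fields carry more weight than the single disagreement.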

A range of data sources are used to determine whether or not property owners are residents of Canada. Key among these is a link between an owner and recent Canadian tax data activity: when the link to tax data is successful, the owner is highly likely to be considered a resident of Canada. However, additional criteria, such as an indication of emigration from Canada to a foreign country or absence from the most recent Canadian Census of Population, may conversely lead to an owner being designated as a non-resident of Canada.
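As a rough illustration of how such criteria could be combined, the following Python sketch applies a simplified rule set. The signal names and the order in which the rules are applied are assumptions made for this example. The actual CHSP decision logic is not described in this document and is likely more nuanced.

```python
from dataclasses import dataclass


@dataclass
class OwnerSignals:
    """Hypothetical linkage outcomes for one owner (names are assumptions)."""
    linked_to_recent_tax: bool
    emigration_indicated: bool
    on_last_census: bool


def residency_status(signals):
    """Return 'resident' or 'non-resident' under simplified, assumed rules."""
    # Assumption: an indication of emigration overrides other evidence.
    if signals.emigration_indicated:
        return "non-resident"
    # A successful link to recent tax activity points to residency.
    if signals.linked_to_recent_tax:
        return "resident"
    # Assumption: without a tax link, fall back on census presence.
    return "resident" if signals.on_last_census else "non-resident"


status_a = residency_status(OwnerSignals(True, False, True))
status_b = residency_status(OwnerSignals(True, True, True))
status_c = residency_status(OwnerSignals(False, False, False))
```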

Data for a given reference year reflect the stock of properties available on the property assessment roll in each province or territory for that year. Each assessment agency applies its own reference date for the creation of municipal assessment rolls. "Assessment value" refers to the assessed value of the property for the purpose of determining property taxes. It is important to note that the assessed value does not necessarily represent the market value.

Concepts and terminology used to describe properties differ between jurisdictions, and the CHSP harmonizes these differences as much as possible.

Error detection

All microdata records contained in the CHSP are verified in order to identify possible errors (e.g., outliers, unexpected values or formatting issues). Validation edits are used to verify that each field contains values that fall within the allowable range for that data element. Correlation edits are used to check the compatibility of different data elements within a record.
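The two kinds of edits described above can be sketched in Python as follows. The field names, allowable ranges and rules are hypothetical and serve only to show the difference between a single-field validation edit and a cross-field correlation edit.

```python
def validation_edits(record):
    """Check that each field falls within its (assumed) allowable range."""
    errors = []
    if not 0 <= record["living_area_m2"] <= 5000:
        errors.append("living_area_m2 outside allowable range")
    if not 1600 <= record["construction_year"] <= 2100:
        errors.append("construction_year outside allowable range")
    if record["assessment_value"] < 0:
        errors.append("assessment_value is negative")
    return errors


def correlation_edits(record):
    """Check that different fields of the same record are compatible."""
    errors = []
    if record["construction_year"] > record["reference_year"]:
        errors.append("construction_year later than reference_year")
    if record["property_type"] == "vacant land" and record["living_area_m2"] > 0:
        errors.append("vacant land reported with a living area")
    return errors


# Made-up record: every field is in range, but two fields contradict each other.
record = {
    "property_type": "single-detached house",
    "living_area_m2": 140,
    "construction_year": 2025,
    "reference_year": 2022,
    "assessment_value": 410000,
}

range_errors = validation_edits(record)
cross_field_errors = correlation_edits(record)
```

Here the record passes every validation edit but fails a correlation edit, which is the kind of inconsistency that a range check alone would miss.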

Data abnormalities are resolved in collaboration with data providers and by comparing aggregated values available from alternate sources like the Census of Population and tax data.

The CHSP estimates undergo several levels of error detection, from internal checks during data production to post-development sampling for the detection of linkage errors. Data providers are extensively consulted with respect to the concepts and any data abnormalities pertaining to externally obtained files.

Imputation

Imputation was performed to fill missing data on living area and period of construction on a subset of the properties in Newfoundland and Labrador, Nova Scotia, New Brunswick, Ontario, Manitoba, Saskatchewan, Alberta, British Columbia and Yukon, as well as the assessment value in Newfoundland and Labrador, New Brunswick, and Yukon. For Prince Edward Island, imputation was performed to fill missing data on period of construction for a subset of properties.

Estimation

Estimation methodology is not currently required.

Quality evaluation

A number of strategies have been developed and implemented to assess data quality and to minimize errors.

The contents of administrative databases containing property or owner information are compared between vintages to ensure consistency over time.
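A comparison between vintages could look like the following Python sketch, which flags census subdivisions whose property counts change by more than a tolerance from one vintage to the next. The 10% tolerance, the CSD codes and the counts are invented for illustration.

```python
def flag_vintage_changes(previous, current, tolerance=0.10):
    """Flag CSDs whose counts differ too much between two vintages.

    Returns a dict mapping CSD code to the relative change, or to None
    when the change cannot be computed (CSD missing from one vintage,
    or a previous count of zero).
    """
    flagged = {}
    for csd in sorted(set(previous) | set(current)):
        old = previous.get(csd)
        new = current.get(csd)
        if old is None or new is None:
            flagged[csd] = None  # CSD appears in only one vintage
        elif old == 0:
            if new != 0:
                flagged[csd] = None  # relative change undefined
        else:
            change = (new - old) / old
            if abs(change) > tolerance:
                flagged[csd] = change
    return flagged


# Made-up property counts by (hypothetical) CSD code.
previous = {"A": 1000, "B": 500, "C": 200}
current = {"A": 1020, "B": 650, "D": 40}

flagged = flag_vintage_changes(previous, current)
```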

Steps were taken to consolidate and standardize variables originating from various data sources to achieve the best possible matches between records.

The linkage results are extensively reviewed during the linkage process to ensure that the methods used are correct and appropriate. Furthermore, samples of linked records are manually reviewed and estimates of linkage error rates are calculated to ensure that linkages are of high quality.
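One way to turn a manually reviewed sample into an estimated linkage error rate is sketched below in Python. The use of a Wilson score interval and the sample figures are assumptions for illustration. The document does not state which estimator the CHSP uses.

```python
import math


def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for an error proportion (z=1.96 for about 95%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    if not 0 <= errors <= n:
        raise ValueError("errors must be between 0 and n")
    p = errors / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Made-up review outcome: 12 false links found among 400 reviewed links.
reviewed = 400
false_links = 12
error_rate = false_links / reviewed
low, high = wilson_interval(false_links, reviewed)
```

The interval gives a sense of how much the estimated rate could vary given the size of the reviewed sample, which matters when the sample is small relative to the number of linked records.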

Linkage quality varies among the provinces and territories as a result of the prevalence of common names and the presence of non-civic addresses such as post office boxes in the source data. This variation in linkage quality can affect some indicators derived from the linked data sets, such as residency ownership and property use.

Other minor data quality issues can also affect linkage quality, and linkage quality affects some derived variables more than others. Although the quality estimates for most variables are very strong, the derived non-resident ownership rate in particular is affected by variation in linkage quality.

The indicator on property use is also affected by variation in linkage quality. It is determined by a methodology relying on a range of data, particularly civic-style address data used in an algorithm that links an owner's property address to the owner's stated address of residence. This indicator is not available in some areas that lack civic-style addresses.
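The idea of comparing a property address with the owner's stated address of residence can be sketched in Python as follows. The normalization rules, the civic-address test and the category labels are assumptions for illustration and do not represent the CHSP algorithm.

```python
import re

# Hypothetical abbreviation table used to standardize street types.
ABBREVIATIONS = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD"}


def normalize_address(address):
    """Upper-case, drop punctuation and standardize common street types."""
    cleaned = re.sub(r"[^\w\s]", " ", address.upper())
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in cleaned.split())


def is_civic(address):
    """Assumed test: starts with a civic number and is not a post office box."""
    norm = normalize_address(address)
    if re.search(r"\b(PO|P O) BOX\b", norm):
        return False
    return re.match(r"\d+\s+\S+", norm) is not None


def property_use(property_address, residence_address):
    """Return an illustrative label, or None when no comparison is possible."""
    if not (is_civic(property_address) and is_civic(residence_address)):
        return None  # indicator not available without civic-style addresses
    same = normalize_address(property_address) == normalize_address(residence_address)
    return "owner-occupied" if same else "not owner-occupied"


use_same = property_use("123 Main Street", "123 MAIN ST.")
use_other = property_use("123 Main Street", "9 Elm Avenue")
use_unknown = property_use("123 Main Street", "P.O. Box 45")
```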

Quality indicators

A composite quality indicator value is available for each estimate in tables 46-10-0027-01, 46-10-0053-01, 46-10-0054-01, 46-10-0030-01, 46-10-0038-01, 46-10-0051-01, 46-10-0052-01 and 46-10-0062-01. The composite quality indicator is created by combining many individual quality indicators, each one representing the quality of a different data processing step (i.e., coding, geocoding, linkage and imputation). The composite quality indicator can take values from A to F, where A indicates an excellent level of quality and F indicates that the estimate is too unreliable to be published.
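To make the idea of a composite indicator concrete, the Python sketch below combines per-step grades by taking the worst one. This combination rule is purely an assumption for illustration. The actual methodology is described in the document referenced below and may differ substantially.

```python
GRADES = "ABCDEF"  # A = excellent, F = too unreliable to be published


def composite_grade(step_grades):
    """Combine per-step grades by taking the worst one (assumed rule)."""
    for step, grade in step_grades.items():
        if grade not in GRADES:
            raise ValueError(f"invalid grade {grade!r} for step {step!r}")
    return max(step_grades.values(), key=GRADES.index)


def publishable(grade):
    """An estimate graded F is not published."""
    return grade != "F"


# Made-up grades for the four processing steps named in the text.
steps = {"coding": "A", "geocoding": "B", "linkage": "D", "imputation": "A"}
overall = composite_grade(steps)
can_publish = publishable(overall)
```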

For detailed information about the methodology to produce the composite quality indicators, please refer to the document "Development of a Composite Quality indicator for Statistical Products Derived from Administrative Sources".

Disclosure control

Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge and the consent in writing of that person, business or organization. Various confidentiality protections are applied to all data published to prevent the disclosure of any information deemed confidential. As necessary, data are suppressed or rounded to prevent direct or residual disclosure of identifiable data.
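Suppression and rounding of published counts can be sketched in Python as follows. The minimum count of 10 and the rounding base of 5 are hypothetical parameters, and the deterministic rounding shown here is a simplification. The specific rules applied to CHSP tables are not given in this document.

```python
def protect_count(count, min_count=10, base=5):
    """Suppress small counts and round the rest to a multiple of `base`.

    Returns None for a suppressed cell. Both thresholds are assumptions.
    """
    if count < 0:
        raise ValueError("count cannot be negative")
    if count < min_count:
        return None  # suppressed to prevent direct disclosure
    # Round to the nearest multiple of `base`, halves rounded up.
    return base * ((count + base // 2) // base)


# Made-up table cells.
raw_cells = {"cell_1": 3, "cell_2": 12, "cell_3": 13, "cell_4": 1003}
published = {name: protect_count(value) for name, value in raw_cells.items()}
```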

The use of the CHSP data is subject to Statistics Canada's privacy and confidentiality constraints to prevent the disclosure of personal information.

Revisions and seasonal adjustment

As the CHSP is a program in development, published data may be subject to revision. When data are released for the first time for a given jurisdiction, the data are considered experimental for that reference year. Consistency and coherence edits may occur when subsequent vintages are released for those new jurisdictions.

Data accuracy

Data accuracy can be impacted by different types of errors: sampling errors and non-sampling errors. In the context of the CHSP, the quality of the estimates is essentially defined by the non-sampling errors since the CHSP is a census of all residential properties in Canada based on administrative files. Listed below are different quality aspects impacted by non-sampling errors.

Completeness

Since each Canadian municipality, province or territory has a legislated responsibility for property monitoring and assessment, completeness of the administrative data provided by external sources is considered relatively good.

The CHSP database reflects the current content of the external data provider's registry of residential properties as of the date of extraction, which varies by province and territory.

The CHSP assigns properties to a geographic location using data from property assessment rolls.

Duplicates

Initial investigations are performed to ensure that all properties on the data files are unique. Through internal linkages, duplicate records are identified and then suppressed if owners are listed twice for the same property.
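The removal of owners listed twice for the same property can be sketched in Python as follows. The record layout and the simple name normalization are assumptions made for this example. In practice duplicates are identified through internal linkages rather than exact key comparison.

```python
def drop_duplicate_owners(records):
    """Keep the first record for each (property, owner) pair."""
    seen = set()
    unique = []
    for rec in records:
        # Assumed normalization: ignore case and surrounding whitespace.
        key = (rec["property_id"], rec["owner_name"].strip().upper())
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Made-up records: the first owner appears twice on property P1.
records = [
    {"property_id": "P1", "owner_name": "Example Owner"},
    {"property_id": "P1", "owner_name": "EXAMPLE OWNER "},
    {"property_id": "P1", "owner_name": "Second Owner"},
    {"property_id": "P2", "owner_name": "Example Owner"},
]

deduplicated = drop_duplicate_owners(records)
```

The same owner on two different properties is kept in both places, since only repeated listings on a single property count as duplicates.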

Undercoverage

Undercoverage of residential properties may exist for a variety of reasons. For example, properties undergoing unreported changes between assessment periods (e.g., new constructions, demolitions or improvements performed without a building permit) are not captured in the assessment values.

Timeliness

The CHSP is an innovative data project that uses new linkage and processing techniques. These techniques may be refined over time, leading to improvements in the accuracy and precision of the data that are released. The first year of data for each province and territory should be considered preliminary and may contain a precocity error, which may be corrected in future data releases.

The CHSP data are used to produce annual estimates.

Data Processing Errors

Administrative files go through several CHSP processes in order to produce estimates. Each of these processing steps has inherent error risks, which can affect the quality of the resulting estimates. The composite quality indicator mentioned previously combines the relative levels of error introduced in the coding, geocoding, linkage and imputation processes. The composite quality indicator values A to F indicate how these data processing steps have affected the quality of CHSP estimates.
