Income and Financial Data of Individuals, Preliminary T1 Family File
Detailed information for 2020
This activity is conducted for the development and dissemination of annual small area economic data for Canadians.
Data release - March 17, 2022 (First in a series of releases for this reference period.); April 12, 2022
The data, collected from income tax returns submitted to the Canada Revenue Agency, provide income and financial information on Canadians and are used to analyse economic conditions. They are available for Canada, the provinces and territories, and sub-provincial geographic areas (postal areas and selected census areas).
Reference period: Calendar year "y" for income and contributions, end of calendar year "y" for age, point in time (usually April of calendar year "y+1") for address information.
Collection period: Income tax returns are filed mainly in the spring following the year of reference. The preliminary T1 file for calendar year "y" is received from the Canada Revenue Agency in September or October of the year "y+1".
- Household, family and personal income
- Income, pensions, spending and wealth
Data sources and methodology
These data cover all persons who completed a T1 tax return for the year of reference by the date the file was copied for Statistics Canada. Because this is a preliminary version of the T1 file, a certain number of late tax filers are missing from it.
This methodology type does not apply to this statistical program.
No sampling is done for this statistical program.
Data are extracted from administrative files.
The individual T1 tax file is received from the Canada Revenue Agency in the early fall following the taxation year. Because this is a preliminary version of the T1 file, a certain number of late tax filers are missing from it. The input file contains records for 27.9 million unique individuals for the 2020 tax year. Tax filers who died within the year are not counted.
The period of income is the calendar year.
Data processing combines automated and manual editing. Some variables with a value of 1 (a type of flag for the Canada Revenue Agency) are converted to zero, and values above a variable's absolute maximum are corrected automatically. Variables with outliers are identified and examined, and values found to be erroneous are corrected manually.
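The two automated edits can be sketched as follows. This is a minimal illustration, not the program's actual processing code: the function name, the `is_flag_field` parameter and the choice to cap an out-of-range value at its maximum are all assumptions, since the source says only that such values are "corrected automatically".

```python
def edit_value(value: float, absolute_max: float, is_flag_field: bool = False) -> float:
    """Apply the two automated edits to a single value (illustrative only)."""
    # A value of 1 that acts as a Canada Revenue Agency flag is converted to zero.
    # Which variables are treated this way is not stated in the source, so the
    # caller marks them with the hypothetical is_flag_field argument.
    if is_flag_field and value == 1:
        return 0.0
    # Assumption: the automatic correction is modelled as capping at the maximum.
    if value > absolute_max:
        return float(absolute_max)
    return float(value)


edited_flag = edit_value(1, absolute_max=1_000, is_flag_field=True)
edited_high = edit_value(5_000, absolute_max=1_000)
edited_normal = edit_value(500, absolute_max=1_000)
```

Outlier review is left out because it is a manual step.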
This methodology type does not apply to this statistical program.
The data are aggregated to approximate the standard geographic areas of Statistics Canada. Census metropolitan areas (CMAs) and census agglomerations (CAs) are areas consisting of one or more neighbouring municipalities situated around a major urban core. A CMA must have a total population of at least 100,000 of which 50,000 or more live in the urban core. A CA must have an urban core population of at least 10,000.
Other levels of postal and census geography are also available.
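The CMA and CA population thresholds described above can be written as a small classification rule. The sketch below is illustrative only: the function name and the "other" label are assumptions, and it does not model how municipalities are assigned to an urban core.

```python
def classify_area(total_population: int, core_population: int) -> str:
    """Classify an area using the CMA and CA population thresholds."""
    # A CMA needs at least 100,000 people in total, with 50,000 or more in the core.
    if total_population >= 100_000 and core_population >= 50_000:
        return "CMA"
    # A CA needs an urban core population of at least 10,000.
    if core_population >= 10_000:
        return "CA"
    # Hypothetical label for areas that meet neither threshold.
    return "other"


large_area = classify_area(total_population=120_000, core_population=60_000)
mid_area = classify_area(total_population=120_000, core_population=40_000)
small_area = classify_area(total_population=8_000, core_population=5_000)
```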
When performing calculations, Canada Revenue Agency tax rules are used.
Because these data are based on the entire preliminary T1 file rather than a sample, they are left unweighted and unadjusted.
The estimates are evaluated in several ways:
1. The geography is evaluated by comparing the number of tax filers and dependants with population estimates from Statistics Canada for the same areas.
2. The demographic information is evaluated in much the same way - by comparisons with estimates from Statistics Canada for the same areas.
3. The income information is evaluated by trend analysis, by comparison with both the preliminary and final T1 files from the previous year, and by comparison with data from the Canadian Income Survey whenever possible.
4. When Census or National Household Survey data are available, comparisons are made of population, income and demographics.
5. In addition, the income of individuals is compared with annual income data produced by the Canada Revenue Agency.
Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential.
Only a small group of people within the Income Statistics Division of Statistics Canada have access to confidential data. Users must specify their requirements to these staff, who then carry out the retrievals. Before release, data are subjected to stringent non-disclosure practices:
1. There must be a minimum of 100 tax filers in any geographic area before any data will be produced.
2. Any cell must represent a minimum of 15 tax filers, otherwise it is suppressed.
3. Each cell which can be dominated by one tax filer (or one family) is checked for dominance and suppressed if a problem is identified.
4. Once the primary suppressions are made, complementary suppressions are made so that suppressed information cannot be discovered residually. This is an iterative process - each complementary suppression may require an additional complementary suppression. Patterns are created to keep these to a minimum.
5. Finally, the counts and amounts are rounded - counts to the nearest ten, aggregate amounts to the nearest $5,000 and distribution measures such as percentiles to the nearest $10.
6. Averages and percentages are based on rounded counts and amounts to prevent the unravelling of non-disclosure procedures.
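The size thresholds and rounding rules in the list above can be sketched as follows. This is a simplified illustration, not Statistics Canada's disclosure-control code: the function names are hypothetical, the tie-breaking rule for rounding is an assumption (the source does not state one), and the dominance check and complementary suppression (steps 3 and 4) are not modelled.

```python
from typing import Optional

MIN_AREA_FILERS = 100  # step 1: minimum tax filers in a geographic area
MIN_CELL_FILERS = 15   # step 2: minimum tax filers represented by a cell


def round_to(value: float, base: int) -> int:
    """Round to the nearest multiple of base.

    Assumption: exact halves round toward positive infinity.
    """
    return int((value + base / 2) // base) * base


def publish_cell(area_filers: int, cell_filers: int, cell_amount: float) -> Optional[dict]:
    """Return the rounded cell, or None when the cell must be suppressed."""
    if area_filers < MIN_AREA_FILERS or cell_filers < MIN_CELL_FILERS:
        return None
    count = round_to(cell_filers, 10)       # step 5: counts to the nearest ten
    amount = round_to(cell_amount, 5_000)   # step 5: aggregates to the nearest $5,000
    average = amount / count                # step 6: computed from rounded inputs
    return {"count": count, "amount": amount, "average": average}


small_area_cell = publish_cell(area_filers=90, cell_filers=50, cell_amount=1_000_000)
small_cell = publish_cell(area_filers=500, cell_filers=14, cell_amount=1_000_000)
released_cell = publish_cell(area_filers=500, cell_filers=123, cell_amount=4_561_200)
percentile_rounded = round_to(43_217, 10)  # distribution measures to the nearest $10
```

Because the average is computed from the rounded count and amount, it cannot be combined with either published figure to recover the unrounded values.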
Revisions and seasonal adjustment
Once the data are finalized, they are not revised. For analyses, data are sometimes adjusted to constant dollars for comparison with data from other years, but only current dollars are kept on the file.
The data for these products are derived from an early file from the Canada Revenue Agency. They benefit from timeliness, but lose some accuracy because of it. This preliminary T1 tax file typically contains about 97% of the records on the file received four to five months later. In 2019, it is likely that more late filers were missed due to the pandemic.
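As a rough illustration of what this coverage rate implies, the sketch below scales the preliminary count up by the typical coverage. It assumes the "about 97%" figure applies to the 27.9 million records reported for the 2020 tax year, which the source does not confirm. The result is an order-of-magnitude indication only, not a published estimate, and the function name is hypothetical.

```python
def implied_final_count(preliminary_count: float, coverage: float) -> float:
    """Scale a preliminary record count up by an assumed coverage rate."""
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in the interval (0, 1]")
    return preliminary_count / coverage


preliminary_records = 27.9e6   # records on the 2020 preliminary file (from the source)
typical_coverage = 0.97        # assumption: the typical rate holds for this year

final_estimate = implied_final_count(preliminary_records, typical_coverage)
late_filers_missing = final_estimate - preliminary_records

print(f"Implied final file: {final_estimate / 1e6:.1f} million records")
print(f"Implied late filers not yet on file: {late_filers_missing / 1e6:.2f} million")
```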
The data are unadjusted apart from editing and estimation of missing components to achieve a definition of income that is closer to Statistics Canada's definition of income. There are no coefficients of variation from sampling, as the population studied is nearly a census of filers and the data are neither weighted nor adjusted to compensate for the earliness of the file.