Annual Income Estimates for Census Families and Individuals (T1 Family File)
Detailed information for 1999
Status: Active
Frequency: Annual
Record number: 4105
This activity is conducted for the development and dissemination of annual small area economic data for Canadians.
Data release - August 10, 2001
Description
This activity is conducted for the development and dissemination of annual small area socio-economic data for Canadians and their families. These data, collected primarily from income tax returns submitted to the Canada Revenue Agency (CRA), provide income and demographic information for sub-provincial geographic areas. Municipal, provincial and federal government departments use the data to evaluate programs and support policy recommendations. Businesses and educational institutions use the data to learn more about the markets they target. Academics and researchers use the data for analyses of socio-economic conditions.
Reference period: Calendar year "y" for income and contributions, end of calendar year "y" for age, point in time (usually April of calendar year "y+1") for address information.
Collection period: Income tax returns are filed mainly in the spring following the year of reference. The T1 files for income year "y" are received from Canada Revenue Agency (CRA) in January of the year "y+2".
Subjects
- Household, family and personal income
- Household spending and savings
- Income, pensions, spending and wealth
- Pension plans and funds and other retirement income programs
- Personal and household taxation
Data sources and methodology
Target population
These data cover all persons who completed a T1 tax return for the year of reference or who received CCTB (Canada Child Tax Benefits), their non-filing spouses (including wage and salary information from the T4 file), their non-filing children identified from three sources (the CCTB file, the births files, and an historical file) and filing children who reported the same address as their parent. Development of the small area family data is based on the census family concept. The census family concept groups individuals either in a census family (parent(s) and children living at the same address) or identifies them as persons not in census families.
Instrument design
This methodology does not apply.
Sampling
This methodology does not apply.
Data sources
Data are extracted from administrative files.
The data collected cover all individuals who filed an individual tax return (T1) or were CCTB recipients. Non-filing spouses, partners and children are determined from these records. When complete, the file covers approximately 96% of the population and is left unweighted and unadjusted.
Error detection
Processing combines automatic and manual editing. Variables with values of unity (a type of flag for CCRA) are converted to zero, and variables with values above their absolute maxima are corrected automatically. Outliers are identified and examined, and those found to be erroneous are corrected manually.
Imputation
Because the source files have limited direct information on the number and characteristics of non-filing individuals, this information must be derived. The family system creates families by linking filing family members together and estimates non-filing members from information on the taxfilers' returns, based on marital status, deductions and information for tax credits or from the CCTB file or from an historical file. For example, the family system imputes a non-filing spouse whenever a filer has declared him/herself married but was not linked with a filing spouse. Wage and salary income for non-filing spouses is derived from the T4 file when such information exists.
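The spouse imputation rule described above can be illustrated with a minimal sketch. It is not the production family system: the `Filer` record, its field names, and the idea of looking up T4 wages by a spouse identifier reported on the filer's return are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Filer:
    # All field names are hypothetical; the real file layout is not described here.
    filer_id: str
    marital_status: str                         # e.g. "married", "single"
    linked_spouse_id: Optional[str] = None      # set when matched to a filing spouse
    reported_spouse_sin: Optional[str] = None   # assumed key into the T4 file


def impute_spouses(filers: List[Filer], t4_wages: Dict[str, float]) -> List[dict]:
    """Create a non-filing spouse record for each married filer with no linked spouse."""
    imputed = []
    for filer in filers:
        # Only married filers who were not linked to a filing spouse qualify.
        if filer.marital_status != "married" or filer.linked_spouse_id is not None:
            continue
        # Wages come from the T4 file when such information exists, else stay unknown.
        wages = None
        if filer.reported_spouse_sin is not None:
            wages = t4_wages.get(filer.reported_spouse_sin)
        imputed.append(
            {"spouse_of": filer.filer_id, "filed_return": False, "wages": wages}
        )
    return imputed
```

In this sketch a married filer already linked to a filing spouse produces no imputed record, and an imputed spouse with no T4 information keeps `wages` as `None` rather than zero, so that missing information is not mistaken for zero income.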
Between 1982 and 1992, information about children was derived directly from the tax file. Starting with the 1993 tax year, a combination of files was used to identify non-filing children: the Canada Child Tax Benefit (CCTB) file, the provincial births files and the T1 Family File (T1FF) of the previous year.
Approximately 70% of the Canadian population files a tax return. A completed T1FF accounts for approximately 96% of the population, the difference being non-filers identified from other files or from filers' information.
Estimation
The production of estimates involves the following major processes:
Edit & Imputation: If a value is identified as being outside the maximum range for its type, the value is often divided by 10 until it is within range. This methodology was chosen because it was found that occasionally values were expressed as if they were dollars, while in fact they were really dollars and cents. If a value is identified as being an erroneous outlier, then manual correction can take the above form, or whatever seems reasonable.
Geocoding: During geocoding, Statistics Canada's Postal Code Conversion File (PCCF) is used to convert postal codes to standard geographic areas (Census Divisions, Census Metropolitan Areas, Census Agglomerations, Census Tracts). In addition, the postal code is used as a building block for postal geography to create "postal areas."
Family formation: Census families are formed through matching by social insurance number (SIN), family name and postal code, while accounting for age, sex and marital status. It is assumed that parents must be a minimum of 15 years older than their children. When a spouse is imputed, their sex is assigned as opposite that of the filing person and their age is probabilistically assigned from husband-wife age distributions. Children are not assigned a sex, and their ages are usually known. In the small number of cases where a child's age is unknown, it is assigned as a probabilistic function of the mother's age.
Income & tax estimation of missing values: There are some non-taxable sources of income missing from the tax return. These are calculated from the information contained within the tax return. The Canada Child Tax Benefit is obtained directly from the CCTB file. The provincial refundable credits and provincial benefits of the National Child Benefit program are calculated using the current year's information as a proxy. The GST/HST credit is calculated for those who applied. Quebec taxes are calculated based on the information contained within the federal return.
Aggregation: The data are aggregated to approximate the standard geographic areas of Statistics Canada and to the postal areas.
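The automatic range edit in the Edit & Imputation step (dividing an out-of-range value by 10 until it falls within range) can be sketched as follows. The function name and the maximum used in the example call are assumptions; the actual maxima for each variable type are not given here.

```python
def scale_into_range(value: float, maximum: float) -> float:
    """Divide a value by 10 repeatedly until it no longer exceeds the maximum.

    `maximum` is the assumed upper bound for the variable's type. Values
    already within range are returned unchanged.
    """
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    while abs(value) > maximum:
        value /= 10
    return value


# Hypothetical example: a field whose assumed maximum is 1,000,000.
corrected = scale_into_range(123_000_000, 1_000_000)
```

The loop covers only the automatic correction. Values flagged as erroneous outliers still need manual review, as described in the step above.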
Quality evaluation
The 1998 tax family file has 29,230,000 records representing 96% of the population estimates. Coverage is consistent across all provinces. Provincial coverage can be affected by provincial legislation regarding provincial income tax liability and/or eligibility for provincial tax credits. The strongest component of the family data is husband-wife families. The 1998 T1FF has a coverage rate of 100.6% when comparing the counts of husband-wife families to estimates from Statistics Canada's Demography Division; lone-parent families have a 99.9% coverage rate. Difficulty exists in accurately assigning ages to some imputed children and to imputed spouses.
Disclosure control
Statistics Canada is prohibited by law from releasing any data which would divulge information obtained under the Statistics Act that relates to any identifiable person, business or organization without the prior knowledge or the consent in writing of that person, business or organization. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
Only a small group of people within the Division have access to confidential data. Users must specify their requirements to these people who then carry out the retrievals. Before release, data are subjected to stringent non-disclosure practices:
1. There must be a minimum of 100 taxfilers in any geographic area before any data will be produced.
2. Any cell must represent a minimum of 15 taxfilers or families, otherwise it is suppressed.
3. Each cell which can be dominated by one tax filer (or one family) is checked for dominance and suppressed if a problem is identified.
4. Once the primary suppressions are made, complementary suppressions are made so that suppressed information cannot be discovered residually. This is an iterative process - each complementary suppression may require an additional complementary suppression. Patterns are created to keep these to a minimum.
5. Finally, the counts and amounts are rounded -- counts to the nearest ten, aggregate amounts to the nearest $5,000 and distribution measures such as percentiles to the nearest $10.
6. Averages and percentages are based on rounded counts and amounts to prevent the unravelling of non-disclosure procedures.
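Rules 1, 2, 5 and 6 above can be sketched for a single cell as follows. This is an illustration only: it omits the dominance check and complementary suppression (rules 3 and 4), the function names are hypothetical, and rounding exact ties upward is an assumption, since the tie rule is not stated.

```python
from typing import Optional


def round_to(value: int, base: int) -> int:
    """Round a non-negative integer to the nearest multiple of base (ties round up)."""
    return (value + base // 2) // base * base


def publish_cell(area_taxfilers: int, cell_count: int, cell_amount: int) -> Optional[dict]:
    """Return the releasable figures for a cell, or None if it must be suppressed."""
    # Rule 1: no data for areas with fewer than 100 taxfilers.
    if area_taxfilers < 100:
        return None
    # Rule 2: suppress cells representing fewer than 15 taxfilers or families.
    if cell_count < 15:
        return None
    # Rule 5: counts to the nearest ten, aggregate amounts to the nearest $5,000.
    rounded_count = round_to(cell_count, 10)
    rounded_amount = round_to(cell_amount, 5000)
    # Rule 6: the average is derived from the rounded figures, not the raw ones.
    return {
        "count": rounded_count,
        "amount": rounded_amount,
        "average": rounded_amount / rounded_count,
    }
```

Because any cell that passes rule 2 has a count of at least 15, its rounded count is at least 20, so the average never divides by zero.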
Revisions and seasonal adjustment
Once the data are finalized, they are not revised. For analyses, data are sometimes adjusted to constant dollars for comparison with data from other years, but only current dollars are kept on the file.
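A constant-dollar adjustment of the kind mentioned above is usually a rescaling by a price index. The sketch below assumes that form. The choice of index is not specified here, and the index values in the example are placeholders rather than published figures.

```python
def to_constant_dollars(amount: float, index_in_data_year: float, index_in_base_year: float) -> float:
    """Express a current-dollar amount in the dollars of a chosen base year.

    Both index arguments are values of the same price index (an assumption;
    the index used for these data is not stated).
    """
    if index_in_data_year <= 0:
        raise ValueError("index_in_data_year must be positive")
    return amount * index_in_base_year / index_in_data_year


# Placeholder index values for illustration only.
adjusted = to_constant_dollars(100.0, index_in_data_year=80.0, index_in_base_year=100.0)
```

Only the current-dollar amount is stored on the file, so an adjustment like this would be applied at analysis time.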
Documentation
- Economic Dependency Profiles - Data Quality Statements
- Family Data - Data Quality Statements
- Labour Force Income Profiles - Data Quality Statements
- Neighbourhood Income and Demographics - Data Quality Statements
- Seniors - Data Quality Statements