Survey on the Official Language Minority Population (SOLMP)
Detailed information for 2022
Status:
Active
Frequency:
Occasional
Record number:
5355
The Survey on the Official Language Minority Population (SOLMP) is a postcensal survey of the English-speaking population in Quebec and the French-speaking population elsewhere in Canada. The data will provide a better understanding of the current situation of official language minorities regarding issues such as education and access to various services in the minority language.
Data release - December 16, 2024
Description
The Survey on the Official Language Minority Population (SOLMP) was conducted by Statistics Canada in 2022 with the cooperation and support of Canadian Heritage. This survey focuses on Canada's minority official languages, English in Quebec and French in Canada outside Quebec, although the languages with official status in some provinces and territories may differ. It is a postcensal survey of the English-language population in Quebec and the French-language population elsewhere in Canada.
This survey was conducted to gain a better understanding of the situation of official language minority populations. Questions were designed to meet certain data requirements, to provide data on current issues and topics that affect these minority populations and to assess changes in the official language minority populations since the previous survey, the Survey on the Vitality of Official-Language Minorities (SVOLM), which Statistics Canada conducted following the 2006 Census.
The survey includes two broad samples: one for adults belonging to the official language minority population, and another composed of children who either have a parent belonging to an official language minority population or are themselves eligible for instruction in the minority official language.
Results of the survey were produced to support federal, provincial and territorial governments, as well as municipal administrations and community organizations in providing services, programs and initiatives for official language minorities. The data also serve researchers and other stakeholders in this area.
Subjects
- Education, training and learning
- Languages
- Population and demography
- Society and community
Data sources and methodology
Target population
The target population of the 2022 SOLMP is composed of two segments: adults and children.
A person is included in the adult component of the SOLMP if they 1) were 18 years of age or older on May 16, 2022 (the first day of SOLMP data collection); 2) reside in one of the 10 provinces or the three territorial capitals, and do not live in a collective dwelling, on a First Nations reserve or in an Inuit community in northern Quebec; 3) are Canadian (non-permanent residents are excluded); and 4) are part of the official language minority population.
This includes (1) individuals whose mother tongue is the minority official language (alone or with other languages), (2) individuals with neither English nor French as their mother tongue who know the minority official language, but not the majority official language, and (3) individuals with neither English nor French as their mother tongue who know both official languages but do not speak the majority official language most often at home.
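The three language criteria above can be read as a small decision rule. The following is a minimal sketch only, not Statistics Canada's derivation: the `Person` fields, the function names and the treatment of "language spoken most often at home" as a set (to allow multiple responses) are all assumptions made for illustration.

```python
from dataclasses import dataclass

OFFICIAL = {"English", "French"}


@dataclass
class Person:
    # Hypothetical record layout; field names are illustrative only.
    province: str
    mother_tongues: set
    known_official: set
    home_languages_most_often: set


def minority_language(province):
    """English is the minority official language in Quebec, French elsewhere."""
    return "English" if province == "Quebec" else "French"


def in_language_minority(p):
    minority = minority_language(p.province)
    majority = "French" if minority == "English" else "English"

    # (1) Mother tongue is the minority language, alone or with other languages.
    if minority in p.mother_tongues:
        return True

    # (2) and (3) apply only when neither English nor French is a mother tongue.
    if p.mother_tongues & OFFICIAL:
        return False

    knows_minority = minority in p.known_official
    knows_majority = majority in p.known_official

    # (2) Knows the minority official language but not the majority one.
    if knows_minority and not knows_majority:
        return True

    # (3) Knows both, but does not speak the majority language most often at home.
    if knows_minority and knows_majority:
        return majority not in p.home_languages_most_often

    return False
```

For example, under these assumptions a person in Ontario whose mother tongue is Spanish, who knows both official languages and speaks Spanish most often at home, would be classified as part of the official language minority population through criterion (3).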
The child component of the SOLMP consists of two parts:
The first part is composed of children living with a parent who is part of the official language minority. These are children who 1) were younger than 18 years of age on May 16, 2022; 2) reside in one of the 10 provinces or the three territorial capitals, and do not live in a collective dwelling, on a First Nations reserve or in an Inuit community in northern Quebec; 3) are not a parent; and 4) have at least one parent who meets the adult component requirements, except that the parent must be 15 years of age or older.
The second part is composed of children who are eligible for instruction in the minority official language. To be included, children must meet the following criteria:
1) be younger than 18 years of age on May 16, 2022
2) reside in one of the 10 provinces or the three territorial capitals, and not live in a collective dwelling, on a First Nations reserve or in an Inuit community in northern Quebec
3) not be a parent.
Additionally, they must also meet one of the following criteria:
4a) they live outside Quebec within the same census family as a person aged 15 years or older who has French as their mother tongue, or
4b) they live within the same census family as a person aged 15 years or older who is receiving or has received instruction at the primary school level in the minority official language of their province or territory, or
5) they receive or have received instruction in the minority official language of their province or territory at the primary or secondary level, or
6) they live with a sibling within the same census family who has received instruction in the minority official language of their province or territory at the primary or secondary level.
Because of operational constraints in the territories, only the individuals residing in their capitals were eligible to be surveyed. According to the census, about 77% of French speakers in the territories reside in the capitals.
Instrument design
The 2022 SOLMP questionnaire is based on that of its previous iteration, the Survey on the Vitality of Official-Language Minorities (SVOLM), a postcensal survey conducted following the 2006 Census. Historical comparability was identified as a priority. While a large majority of the content was repeated, updates to the questionnaire were made based on consultations with Canadian Heritage and key data users, federal and community partners and external advisory committees.
Two rounds of qualitative testing were conducted in August 2020 and March 2021 with respondents who met the survey participation criteria. The objective was to ensure that all questions, particularly the new and modified questions identified as addressing new data needs, were easy to understand and could provide quality data. These tests were organized by experts in questionnaire design. The qualitative tests were held over a period of two weeks each, with approximately 30 interviews held in both English and French. Each interview lasted about one hour. Overall, the survey was well received by the respondents; questions were well understood, and, aside from some suggestions throughout, the feedback was positive.
While most of the modules underwent minor edits, a few required more significant changes following this testing. One major change was in the linguistic trajectory module, which was initially the last module of the questionnaire. Respondents felt the module was redundant when it was placed at the end, so it was moved to appear earlier in the questionnaire. The module itself was also modified. The geographic mobility module was perceived as convoluted and difficult to answer; therefore, major changes were made to simplify and shorten it. Finally, the education section was modified to facilitate understanding.
Sampling
This is a sample survey with a cross-sectional design.
Survey frame:
The SOLMP sample was selected from the linguistic and education profile reported by respondents on the 2021 Census questionnaire. While those eligibility questions appear on the short-form census questionnaire (2A), the sample was selected from responses to the long-form census questionnaire (the 2A-L questionnaire or the 2A-R questionnaire, depending on the geographic location of the dwelling) to enrich SOLMP's answers with those from the long-form questionnaire. Exceptionally, wherever the number of respondents of the long-form questionnaire was insufficient to support the SOLMP's targeted analytical requirements, sampling was carried out from the larger pool of respondents of the short-form questionnaire.
Sampling design and stratification:
SOLMP's sampling design is a two-phase stratified simple random sampling without replacement. The first phase corresponds to the sample selection associated with the long form and the second phase is the selection of SOLMP's sample.
Stratification will produce more accurate estimates if the characteristic of interest is homogeneous within strata and heterogeneous across strata. In addition, it is desirable that the estimation weights be similar for respondents from the same stratum.
The SOLMP sampling frame was first stratified by region and age group. The provinces of Quebec, Ontario and New Brunswick were split into six, five and three regions, respectively. For all other provinces, the region corresponds to the province. For the three territories, the capital is each territory's sole region.
The age groups for the provinces were 1 to 4 years, 5 to 11 years, 12 to 17 years, 18 to 24 years, 25 to 44 years, 45 to 64 years and 65 years or older, and the age groups for the capitals of the territories were 17 years or younger and 18 years or older.
The frame underwent additional stratification, which differs between the adult and child components:
For adults: The stratification was based on their language group.
For children: The stratification was based on the language profile of their parents and on whether the children are eligible for instruction in the minority official language.
In the Montréal area in Quebec, the basic language group was refined by adding the following two strata:
Individuals whose:
- Mother tongue is English and another non-official language, and whose known official language is only English; or,
- Mother tongue is English and another non-official language, and who know both official languages; or,
- Mother tongue includes English, French and another non-official language, and both official languages are known, and the language spoken most often at home is not French; or,
- Mother tongue does not include English or French, and the known official language is only English; or,
- Mother tongue does not include English or French, and both official languages are known, and the language spoken most often at home is English, but not French.
Individuals whose:
- Mother tongue does not include English or French, and both official languages are known, and the language spoken most often at home is a language other than the two official languages; or,
- Mother tongue includes English and French as well as another non-official language, and both official languages are known, and the languages spoken most often at home are the two official languages; or,
- Mother tongue includes English and French as well as another non-official language, and both official languages are known, and the language spoken most often at home is a language other than the two official languages.
In 2006, for specific analytical needs, a distinct sample of persons with a mother tongue other than French or English and having French as their first official language spoken was added in the Montréal region. No such sample was added in 2022.
Within a region, community experiences likely vary based on the concentration of the official language minority population, as measured by the proportion of people in the official language minority population living in each census dissemination area (DA). The sample design ensured that, within a stratum, the sample represented the full range of concentration levels by preventing an overabundance of DAs with similar concentration levels.
Allocation method:
A method for optimal allocation between the substrata of a particular domain was used, taking into account different types of sample size loss, such as expected non-response and the probability of each unit belonging to the target population.
Sample size:
Approximately 59,000 people were selected to participate in the survey, split almost half-and-half between the adult and child components.
Data sources
Data collection for this reference period: 2022-05-16 to 2022-12-16
Responding to this survey is voluntary.
Data are collected directly from survey respondents and derived from other Statistics Canada surveys.
Among the sampled individuals, approximately 30,000 completed the 2022 SOLMP questionnaire for a response rate of 53.4%. (The response rates for adults and children were 50.9% and 55.9%, respectively.)
For the most part, the 2022 SOLMP sample was drawn from respondents who answered the 2021 Census of Population long-form questionnaire; the rest of the sample came from the short form. SOLMP respondents were advised that Statistics Canada planned to combine their SOLMP responses and census responses. Accordingly, the final edited SOLMP master microdata file was linked with the 2021 Census of Population Dissemination Database. In the end, more than 250 census variables (not including geographic variables) were added to the final SOLMP file for 2022.
The specific benefits of a SOLMP-Census record linkage are reduced response burden for the target population of the SOLMP, the derivation of survey weights which are crucial to providing valid estimates, and the creation of a comprehensive microdata file which can be used by data analysts to extend their learning, and to inform policy and program development for official language minority populations in Canada.
All products containing linked data are disseminated in accordance with Statistics Canada's policies, guidelines and standards. Only aggregate statistical estimates that conform to the confidentiality provisions of the Statistics Act are released.
How the data were collected:
Five methods were used to collect data:
- Respondent Electronic Questionnaire (rEQ). Respondents received a secure access code to log in and complete the survey online.
- Interviewer Electronic Questionnaire (iEQ), also known as Computer-Assisted Telephone Interview (CATI).
- CAPI Lite Plus collection (CLP), which Statistics Canada implemented because various restrictions related to the COVID-19 pandemic were still in place. Interviewers visited selected individuals in person to schedule an appointment with them to later complete the questionnaire via CATI.
- 'Knock, Talk and Call' (KTC), which is similar to CAPI Lite Plus, but a CATI interview is scheduled to take place immediately at the time of the visit.
- A pilot collection method, conducted for approximately one month in Iqaluit, where selected individuals were invited to complete their questionnaire via rEQ or CATI at a municipal building, where computers and in-person assistance were available.
Respondents could choose to complete the questionnaire in English or French. On average, the survey took about 45 minutes to complete.
Proxy respondents (when someone fills out the questionnaire on behalf of a selected person) were not permitted for the SOLMP. However, for the child sample, parents were the targeted respondents, completing the questionnaire on behalf of their selected child.
Error detection
In many cases when a particular response appeared to be inconsistent with previous answers or outside of expected values, the interviewer or respondent was prompted, through message screens on the computer, to confirm answers and, if needed, to modify the information directly at the time of interview. This editing, however, was conducted only with errors that were fairly simple and straightforward to detect and address. These edits were applied at the micro level.
The collected data were then subjected to further editing processes to correct errors that required more complex edit rules. Customized edits consisted of validity checks within and across variables to identify gaps, inconsistencies, and other problems in the data, and corrections were performed based on logical edit rules. Editing at this stage was also applied at the micro level, using SAS (Statistical Analysis System).
Imputation
While no imputation was carried out to address item non-response, data editing was performed to rectify inconsistent answers.
Estimation
The initial weight of a unit corresponds to the product of two components: the inverse of its stratum sampling fraction and its Census weight. The stratum sampling fraction is calculated as the number of people selected for the SOLMP in each stratum divided by the total number of available Census respondents for that stratum. The weights were then adjusted for non-response.
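As a rough illustration of the initial weight described above, the sketch below multiplies the inverse of the stratum sampling fraction by the Census weight. The function and parameter names are hypothetical, and the numbers in the usage note are invented for the example.

```python
def initial_weight(census_weight, n_selected, n_available):
    """Initial SOLMP weight for one unit (illustrative sketch).

    census_weight : the unit's Census weight
    n_selected    : number of people selected for the SOLMP in the stratum
    n_available   : number of available Census respondents in the stratum
    """
    if n_selected <= 0 or n_available < n_selected:
        raise ValueError("need 0 < n_selected <= n_available")
    sampling_fraction = n_selected / n_available
    # Product of the two components: (1 / sampling fraction) * Census weight.
    return census_weight / sampling_fraction
```

With made-up values, a unit with a Census weight of 4 in a stratum where 50 of 200 available Census respondents were selected would receive an initial weight of 4 × (200 / 50) = 16.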
Adjustments were made to the weights to account for two types of non-response: non-contact and non-response with contact (mainly refusals). First, a logistic regression model was constructed for each adjustment to predict the probabilities of being contacted or of responding when contacted using the information contained in the Census variables and collection variables known as "paradata" (number of contact attempts, for example). Second, respondents and non-respondents with similar predicted response probabilities were assigned to adjustment classes using cluster analysis. Third, the inverse of the weighted response rate in a class was used as the adjustment factor for that class, and the weights of the responding units within the class were adjusted accordingly.
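The third step can be sketched as follows, assuming the adjustment classes have already been formed (the logistic regression and cluster analysis are not reproduced here). The record layout with `weight`, `cls` and `responded` keys is hypothetical.

```python
from collections import defaultdict


def adjust_for_nonresponse(units):
    """Return responding units with weights inflated by the inverse of the
    weighted response rate of their adjustment class (illustrative sketch).

    units: list of dicts with keys 'weight', 'cls' and 'responded'.
    """
    total_weight = defaultdict(float)
    responding_weight = defaultdict(float)
    for u in units:
        total_weight[u["cls"]] += u["weight"]
        if u["responded"]:
            responding_weight[u["cls"]] += u["weight"]

    adjusted = []
    for u in units:
        if not u["responded"]:
            continue  # non-respondents drop out; their weight is carried by respondents
        weighted_rate = responding_weight[u["cls"]] / total_weight[u["cls"]]
        adjusted.append({**u, "weight": u["weight"] / weighted_rate})
    return adjusted
```

Within each class, the adjusted weights of respondents sum to the original weighted total of respondents and non-respondents combined. In the SOLMP this kind of adjustment was applied twice, once for non-contact and once for non-response with contact.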
Next, a post-stratification adjustment was made to align key estimates from the SOLMP with the corresponding ones from the Census. The post-strata used were:
- For adults: region, age group and language group
- For children: region, age group and eligibility for instruction in the minority official language.
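A minimal sketch of a post-stratification adjustment of this kind is shown below. It assumes a known Census total for each post-stratum and simply rescales the weights so that they sum to that total; the function name, the post-stratum labels and the totals are illustrative, not taken from the survey.

```python
def post_stratify(weights, post_strata, census_totals):
    """Scale weights so each post-stratum sums to its Census total (sketch).

    weights       : list of non-response-adjusted weights
    post_strata   : list of post-stratum labels, one per weight
    census_totals : dict mapping post-stratum label -> Census total
    """
    weight_sums = {}
    for w, s in zip(weights, post_strata):
        weight_sums[s] = weight_sums.get(s, 0.0) + w
    # Each weight is multiplied by (Census total / current weighted total).
    return [w * census_totals[s] / weight_sums[s]
            for w, s in zip(weights, post_strata)]
```

For adults, a label could be a (region, age group, language group) tuple; for children, a (region, age group, eligibility) tuple.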
The Sigma-gap method was then used to detect and reduce excessively large weights within each post-stratum. After the weights were sorted in descending order, the excessively large weights were reduced to the value of the first non-outlier weight. The mass of the reduced weights was then redistributed proportionally within the post-strata.
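The sketch below shows one common form of the sigma-gap idea applied to a single post-stratum. The source does not give the parameters used, so the gap multiplier `k`, the choice to start searching at the median, and the use of the population standard deviation are all assumptions. The weights are sorted in ascending order here, which identifies the same cutoff as a descending sort.

```python
import statistics


def sigma_gap_trim(weights, k=1.0):
    """Reduce outlying large weights in one post-stratum and redistribute
    the removed mass proportionally (illustrative sketch)."""
    n = len(weights)
    if n < 3:
        return list(weights)

    ordered = sorted(weights)
    sigma = statistics.pstdev(ordered)

    # Look above the median for the first gap between consecutive sorted
    # weights that exceeds k standard deviations.
    cutoff = None
    for i in range(n // 2, n - 1):
        if ordered[i + 1] - ordered[i] > k * sigma:
            cutoff = ordered[i]  # largest weight not flagged as an outlier
            break
    if cutoff is None:
        return list(weights)

    # Outliers are reduced to the cutoff, then every weight is rescaled so
    # that the post-stratum total is unchanged.
    trimmed = [min(w, cutoff) for w in weights]
    scale = sum(weights) / sum(trimmed)
    return [w * scale for w in trimmed]
```

With the invented weights 10, 11, 12, 13 and 100, the last weight is flagged, reduced to 13, and all five weights are then scaled up so that the total remains 146.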
For the 2022 SOLMP, the bootstrap method was used to estimate the variance. For the sole purpose of estimating the variance, the 2021 Census was treated as having two phases: the initial sample of approximately 1 in 4 dwellings as the first phase and census respondents as the second phase. Although the final response rate was quite high for the 2021 Census (97.4% for the long form), this second phase ensures that the variance estimate accounts for the non-response that occurred. The two phases of the Census were later combined into a single phase. The SOLMP sampling was treated as a second phase, and then the general bootstrap method for two-phase sampling developed for the 2006 Aboriginal Peoples Survey (APS) was used (see Langlet, É., Beaumont, J.-F., and Lavallée, P. 2008. "Bootstrap Methods for Two-Phase Sampling Applicable to Postcensal Surveys". Paper submitted to Statistics Canada's Advisory Committee on Statistical Methods, May 2008, Ottawa).
For the SOLMP, a set of 1,000 bootstrap weights was generated using this method. The method can lead to negative bootstrap weights. To overcome this issue, a transformation was applied to the bootstrap weights that reduced their variability. Consequently, the variance estimated from the transformed bootstrap weights must be inflated by a factor of 16, which is the smallest value that makes all bootstrap weights positive.
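For data users, the practical consequence is that a variance computed from the transformed bootstrap weights has to be multiplied by 16. The sketch below assumes the usual bootstrap variance form (the mean squared deviation of the replicate estimates around the full-sample estimate); the exact formula applied to SOLMP files should be taken from the survey's user guide. The function name and inputs are hypothetical.

```python
def bootstrap_variance(full_estimate, replicate_estimates, inflation=16.0):
    """Bootstrap variance from transformed replicate weights (sketch).

    full_estimate       : estimate computed with the final survey weight
    replicate_estimates : the same estimate recomputed with each set of
                          bootstrap weights (1,000 sets for the SOLMP)
    inflation           : factor compensating for the variance-reducing
                          transformation (16 according to the text above)
    """
    b = len(replicate_estimates)
    if b == 0:
        raise ValueError("at least one replicate estimate is required")
    mean_sq_dev = sum((r - full_estimate) ** 2 for r in replicate_estimates) / b
    return inflation * mean_sq_dev
```

Many survey software packages accept such a multiplier directly as an option when bootstrap weights are declared, which avoids applying it by hand.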
Quality evaluation
Differences between the SOLMP and other data sources:
Due to a number of differences in methodology between the 2022 SOLMP, its previous cycle in 2006 and other Statistics Canada surveys, comparisons of data between sources should be done with caution.
The 2022 SOLMP and the 2021 Census:
The Census and the SOLMP are both rich sources of information on official language minority populations that complement each other. The SOLMP takes concepts that are touched on in the Census and asks further questions to provide a more in-depth representation of their situation.
The population counts from the 2022 SOLMP for certain subpopulations may differ from those obtained from the Census, even when the Census population universe is made to match that of the SOLMP. Indeed, the weight calibration described above ensures consistency only for certain subpopulations.
In addition, for the same individual, responses to the same concept may, in some cases, differ between the SOLMP and the Census. There are many reasons why responses to these surveys may differ, such as:
- Method of collection and effect of proxy reporting
- Different questionnaires
- Different contexts
- Effect of time
2022 SOLMP and 2006 Survey on the Vitality of Official-Language Minorities (SVOLM):
The target population for the 2022 SOLMP contains children eligible for instruction in the minority official language due to the instruction profile of their parents. The addition of this new segment was made possible by the language of instruction questions that were added to the 2021 Census.
Methodologically, the biggest change between the 2006 SVOLM and the 2022 SOLMP concerns the quality indicators. Prior to 2022, the coefficient of variation (CV) was used to report on the quality of estimates in terms of their sampling error. For the 2022 SOLMP, the 95% confidence interval (CI) is used for this purpose instead.
Disclosure control
Statistics Canada is prohibited by law from releasing any information it collects that could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.
Revisions and seasonal adjustment
This methodology type does not apply to this survey.
Data accuracy
Two types of errors occur in surveys: sampling errors and non-sampling errors.
Sampling Errors:
Sampling errors are defined as errors resulting from the estimation of a population characteristic based on the measurement of a part of the population rather than the whole population. For probability sampling surveys, there are methods for estimating sampling errors. These methods derive directly from the sampling design and estimation method used in the survey.
The measure most often used to quantify sampling error is the sampling variance. The sampling variance indicates the extent to which the estimate of a characteristic differs from one sample to the next, given several possible samples of the same size and design. The standard error of an estimate is the square root of the sampling variance. This measure is easier to interpret, because it gives an indication of sampling error using the same scale as the estimate, whereas variance is based on squared differences.
Sampling error is reported for the 2022 SOLMP using the 95% CI instead of the coefficient of variation (CV) which was used for the 2006 SVOLM. These two ways of reporting the sampling error are briefly described below.
The CV of an estimate is a relative measure of sampling error. It is defined as the estimate of the standard error divided by the estimate itself, usually expressed as a percentage (e.g., 10% instead of 0.1). It is very useful for measuring and comparing the sampling error of quantitative variables with large positive values. However, its use is not recommended for estimates such as proportions and estimates of variations or differences, or for variables that can take on negative values.
A CI provides upper and lower bounds around a point estimate and indicates the degree of confidence with which the CI covers the true population value. A 95% CI of an estimate means that if the survey were repeated several times, the CI would cover the true population value 95% of the time (or 19 times out of 20). Statistics Canada's best practice is to report the sampling error of an estimate using its 95% CI.
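The relationship between the variance, the standard error, the CV and a 95% CI can be illustrated as follows. This is a sketch only: it assumes a symmetric normal-approximation interval (estimate ± 1.96 standard errors), which is not necessarily the interval method used for published SOLMP estimates, particularly for proportions.

```python
import math


def quality_indicators(estimate, variance, z=1.96):
    """Return (standard error, CV in percent, (CI lower, CI upper)).

    The CV is returned as None when the estimate is zero, since the ratio
    is undefined in that case.
    """
    se = math.sqrt(variance)
    cv_percent = 100.0 * se / estimate if estimate != 0 else None
    ci = (estimate - z * se, estimate + z * se)
    return se, cv_percent, ci
```

For an invented estimate of 200 with a sampling variance of 100, the standard error is 10, the CV is 5% and the approximate 95% CI runs from 180.4 to 219.6.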
Non-sampling Errors:
Non-sampling errors arise primarily from the following sources: non-response errors, coverage errors, measurement errors and processing errors. The response rate for the SOLMP was 53.4%, with the response rate for children 5 percentage points higher than that for adults, as reported earlier. Total non-response will produce a bias if non-respondents have different characteristics from respondents and if non-response is not corrected properly. Non-response adjustments, combined with a relatively high response rate, helped reduce this risk of bias substantially. Non-response to specific questions is often due to difficulty understanding the questions. Thorough quality reviews and questionnaire testing were carried out before the survey, which reduced the extent of partial non-response. Cases in which there was a large proportion of missing responses to key questions were treated as a special form of total non-response.
Coverage errors occur when there are differences between the target population and the sampled population (or survey population). In particular, under-coverage can be problematic. Because the SOLMP sample was selected from those who had participated in the 2021 Census, individuals who did not participate in the census could not be sampled for the SOLMP. If this group of individuals differs significantly from those who participated in the census with respect to the characteristics measured in the SOLMP, a bias could be introduced. This bias is assumed to be relatively small given the very high response rate obtained in the census (97.4% response rate for the long-form questionnaire).
Measurement errors occur when an answer provided differs from the actual value. These errors may be due to respondents, the interviewer, the questionnaire, the collection method, or the data processing system. Extensive efforts were made to develop questions for the 2022 SOLMP that would be understandable, relevant, and culturally appropriate.
Processing errors may occur at various stages, including during the programming of the electronic questionnaire, when the interviewer or respondent enters responses, when coding, and when editing data. Quality control procedures were applied at each stage of the 2022 SOLMP data processing to reduce this type of error. Data collection was conducted using an electronic questionnaire, either administered by an interviewer or self-reported by the respondent. Various edits were built into the system to alert the respondent or interviewer to inconsistencies or unusual values, allowing for the correction of inconsistencies or errors immediately.