Model-based Principal Field Crop Estimates

Detailed information for 2017

Status: Active

Frequency: Annual

Record number: 5225

The model-based crop estimates provide provincial and national yield and production estimates for principal field crops in Canada. The model uses data from low-resolution satellite imagery, historical field crop survey estimates, and agroclimatic information.

Data release - September 19, 2017

Description

Since 2012-13, Statistics Canada has been collaborating with Agriculture and Agri-Food Canada (AAFC) and Environment Canada (EC) on a model which could derive crop yield estimates for principal crops grown in Canada. Statistics Canada recognised a potential opportunity to make new estimates available with these modelled yields. These estimates could eventually replace collected data and reduce the response burden on crop producers.

The modelled crop yield estimates are produced at the provincial and national levels for dissemination.

The modelled estimates provide yield and production information for the month of August, at an earlier date than the September Farm Survey, without adding to the response burden on crop producers.

These estimates provide important information for global food security, crop product markets, and planning the transport of crops from farm to market.

Federal and provincial government agencies, grain marketing agencies, crop insurance companies, researchers and producers are typical users of the yield and production estimate information.

Subjects

  • Agriculture
  • Crops and horticulture

Data sources and methodology

Target population

The target population is the entire agricultural area of Québec, Ontario, Manitoba, Saskatchewan and Alberta.

Instrument design

This methodology does not apply.

Sampling

Data are collected for all units of the target population; therefore, no sampling is done.

Data sources

Data are extracted from administrative files and derived from other Statistics Canada surveys and/or other sources.

Three data sources are used as input variables for the crop models: 1) the Normalized Difference Vegetation Index (NDVI), derived from coarse resolution satellite data, 2) survey yield data, and 3) agroclimatic indices.

The weekly NDVI data are a product of Statistics Canada's Crop Condition Assessment Program (CCAP). The NDVI is a standardized index of vegetation health that allows changing vegetation conditions to be compared directly within a time series. The mean NDVI value for an individual Census Agricultural Region (CAR) is computed by averaging all of the pixels within that CAR. The mean NDVI values are then imported into the crop models as one of the input variable databases, in the form of three-week moving averages from Julian week 18 to 36 (May to August). For more information regarding the NDVI data, follow the link to the CCAP IMDB page in the Documentation section of this document.
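As a rough illustration of the two averaging steps described above, the following Python sketch computes a CAR mean from pixel values and then a three-week moving average of the weekly means. The function names, the sample values, and the choice of a trailing window (the current week plus the two preceding weeks) are assumptions made for illustration; the source does not state how the window is aligned.

```python
def car_mean_ndvi(pixel_values):
    """Mean NDVI of one CAR for one week: the average of all its pixels."""
    return sum(pixel_values) / len(pixel_values)


def three_week_moving_average(weekly_means):
    """Smooth weekly CAR means with a trailing three-week window.

    weekly_means: dict mapping week number -> mean NDVI for that week.
    A week is only reported when it and the two preceding weeks are present.
    The trailing alignment is an assumption, not taken from the source.
    """
    smoothed = {}
    for week in sorted(weekly_means):
        window = [weekly_means.get(week - lag) for lag in range(3)]
        if None not in window:
            smoothed[week] = sum(window) / 3
    return smoothed


# Hypothetical weekly means for one CAR (weeks 18 to 21).
weekly = {18: 0.30, 19: 0.36, 20: 0.42, 21: 0.48}
print(three_week_moving_average(weekly))
```

With a trailing window, the first two weeks of the series have no smoothed value, so the weekly series would need to start two weeks earlier if a value for week 18 is required.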

The Field Crop Reporting Series obtains information on grains and other field crops stored on farms (March, July and December Farm surveys), seeded area (March, June, July and November Farm surveys), and harvested area, expected yield and production of field crops (July and November Farm surveys). The resulting estimates are based on sample surveys collected at five points throughout the year, principally via telephone interviews with farm operators. Historical and current-year expected crop yields from the July Farm Survey are used as input variables for the model. The historical November survey crop yield estimates are used as the dependent variable in the model. Modelled production estimates are calculated by multiplying the harvested area from the July Farm Survey by the modelled crop yield estimate. For more information regarding the Field Crop Reporting Series surveys, follow the link to the IMDB page in the Documentation section of this document.
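The production calculation described above is a single multiplication. A minimal sketch, in which the function name, the units (hectares and tonnes per hectare) and the sample figures are illustrative assumptions:

```python
def modelled_production(harvested_area, modelled_yield):
    """Production = July Farm Survey harvested area x modelled yield.

    Units are assumed here to be hectares and tonnes per hectare,
    which gives production in tonnes.
    """
    return harvested_area * modelled_yield


# Hypothetical figures: 1,200,000 ha harvested, modelled yield of 3.5 t/ha.
print(modelled_production(1_200_000, 3.5))  # 4200000.0
```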

The agroclimatic information measured during the growing season is the third data source used for modelling crop yields. The station-based daily temperature and precipitation data provided by Environment and Climate Change Canada and other partner institutions are used to generate climate-based predictors. In total, 478 climate stations across the cropland extent of Canada are selected to represent the climate of the 82 CARs. Quality control and gap-filling of missing data are performed by AAFC.

The daily series of air temperature and precipitation are incorporated into a Versatile Soil Moisture Budget (VSMB) model by AAFC to generate agroclimatic indices used in the yield model. The VSMB model outputs are generated at a daily time step and used as potential yield predictors.

Average values of the indices at all stations within the cropland extent of a specific CAR are used to represent the mean agroclimate of that CAR. If a CAR lacks input climate data, stations from neighboring CARs are used.
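The averaging rule above, including the fallback to neighbouring CARs, can be sketched as follows. The data layout, the function name and the rule for pooling neighbouring stations (a simple average over all stations in all listed neighbours) are assumptions for illustration; the source does not say how neighbouring stations are chosen or weighted.

```python
def car_mean_index(car, stations_by_car, neighbours):
    """Mean agroclimatic index for one CAR.

    stations_by_car: dict CAR id -> list of station index values.
    neighbours: dict CAR id -> list of neighbouring CAR ids (hypothetical).
    Falls back to stations in neighbouring CARs when the CAR has none.
    Returns None when no station data are available at all.
    """
    values = stations_by_car.get(car, [])
    if not values:
        values = [
            value
            for neighbour in neighbours.get(car, [])
            for value in stations_by_car.get(neighbour, [])
        ]
    if not values:
        return None
    return sum(values) / len(values)


# Hypothetical example: CAR "B" has no stations and borrows from "A" and "C".
stations = {"A": [1.0, 3.0], "B": [], "C": [5.0]}
adjacent = {"B": ["A", "C"]}
print(car_mean_index("A", stations, adjacent))
print(car_mean_index("B", stations, adjacent))
```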

To form a manageable array of potential crop yield predictors, AAFC aggregated the daily agroclimatic indices which are included in the modelling methodology (Newlands et al. 2014 - http://journal.frontiersin.org/article/10.3389/fenvs.2014.00017/full; Chipanshi et al. 2015 - http://dx.doi.org/10.1016/j.agrformet.2015.03.007).

Error detection

After the modelled estimates have been generated, they are compared against the July yield estimates, although differences between the two sets of estimates are to be expected. Subject-matter experts also review the results to identify any estimates which seem questionable. External sources of field crop yield estimates are also used to identify possible errors.

Estimation

The modelled field crop yield estimates are calculated by a robust linear regression model (using the MM method in SAS) fitted to data from 1987 to the present. The dependent variable is the historical November field crop survey estimate of crop yield. The independent variables come from the NDVI dataset, the historical July field crop survey estimates of crop yield, and historical agroclimatic data. A maximum of five independent variables is included in the model; they are selected with the LASSO (Least Absolute Shrinkage and Selection Operator) variable selection approach in SAS.
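To make the variable selection step more concrete, the sketch below implements a generic LASSO by coordinate descent in plain Python and keeps the first penalty value that leaves no more than five predictors. It is not the SAS procedure used in production: the solver, the function names, the penalty grid and the way the five-variable cap is enforced are all assumptions for illustration. The final robust (MM) regression fit on the selected predictors is not shown.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(columns, y, lam, n_iter=200):
    """Minimise (1/2n) * sum(residual^2) + lam * sum(|beta_j|).

    columns: list of predictor columns, each a list of n values.
    Predictors and response are centred internally, so no intercept is fitted.
    """
    n = len(y)
    y_mean = sum(y) / n
    centred = []
    for col in columns:
        col_mean = sum(col) / n
        centred.append([v - col_mean for v in col])
    beta = [0.0] * len(centred)
    residual = [v - y_mean for v in y]  # equals centred y while beta is zero
    for _ in range(n_iter):
        for j, col in enumerate(centred):
            scale = sum(c * c for c in col) / n
            if scale == 0.0:
                continue  # a constant column carries no information
            # Covariance of column j with the residual that excludes column j.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, residual)) / n
            new_beta = soft_threshold(rho, lam) / scale
            delta = new_beta - beta[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                beta[j] = new_beta
    return beta


def select_predictors(columns, y, lam_grid, max_vars=5):
    """Return (lam, indices) for the smallest penalty keeping <= max_vars."""
    for lam in sorted(lam_grid):
        beta = lasso_coordinate_descent(columns, y, lam)
        chosen = [j for j, b in enumerate(beta) if b != 0.0]
        if len(chosen) <= max_vars:
            return lam, chosen
    return None, []


# Hypothetical data: the response depends on the first predictor only.
x0 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x1 = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]
november_yield = [2.0 * v for v in x0]
print(select_predictors([x0, x1], november_yield, lam_grid=[0.5, 1.0]))
```

In this toy example the second predictor is uncorrelated with the response, so its coefficient is shrunk to exactly zero and only the first predictor is retained.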

The crop yield estimates are modelled at the CAR level and aggregated to provincial and national levels for dissemination. The aggregation is accomplished by using seeded area from the June Farm Survey of the Field Crop Reporting Series to weight the contribution of individual CARs to the respective aggregated level. Certain crops that are less abundant in a province are modelled directly at the provincial level.
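The area-weighted aggregation described above amounts to a weighted mean of the CAR yields. A minimal sketch, in which the function name and the sample figures are illustrative assumptions:

```python
def area_weighted_yield(car_estimates):
    """Aggregate CAR-level yields to a higher level.

    car_estimates: list of (seeded_area, modelled_yield) pairs, one per CAR,
    with seeded area taken from the June Farm Survey.
    Each CAR contributes in proportion to its seeded area.
    """
    total_area = sum(area for area, _ in car_estimates)
    weighted_sum = sum(area * crop_yield for area, crop_yield in car_estimates)
    return weighted_sum / total_area


# Hypothetical province with two CARs: 100 ha at 3.0 and 300 ha at 4.0.
print(area_weighted_yield([(100, 3.0), (300, 4.0)]))  # 3.75
```

The same function could be applied a second time to move from provincial to national level, provided provincial seeded areas are used as the weights.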

Quality evaluation

During the model development phase, the quality of the model was tested by using it to predict historical November survey crop yield estimates and comparing the predictions to the actual values at the provincial and national levels. Based on these comparisons, the optimal model was chosen. Model diagnostics were also run to confirm that the input data conformed to the properties required by the model.

A number of requirements must be met before a modelled estimate is eligible for publication. If fewer than twelve years of July or November Farm Survey yield estimates are available, or if the July Farm Survey yield estimate or the June Farm Survey area estimate for the current year is missing for a CAR, then a yield estimate will not be produced for that CAR. At the provincial level, the total area suppressed at the CAR level is computed; if it is greater than 10% of the total area for that crop within that province, the yield estimate is suppressed at the provincial level. Similarly, if more than 10% of the total area for that crop is suppressed at the national level, a yield estimate will not be produced for that crop at the national level.

Additionally, if the coefficient of variation (CV) calculated for the yield estimate of a province is greater than 10%, then the yield estimate is not published for that province.
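The publication rules above can be expressed as a small set of checks. The sketch below is one possible reading of those rules; the function names, argument layout and sample figures are assumptions, and the CV is taken as an input because the source does not describe how the model computes it.

```python
MIN_YEARS = 12               # minimum years of July and November survey yields
MAX_SUPPRESSED_SHARE = 0.10  # largest share of crop area that may be suppressed
MAX_CV_PERCENT = 10.0        # largest coefficient of variation allowed


def car_estimate_allowed(years_july, years_november, july_yield, june_area):
    """A CAR estimate needs enough history and both current-year inputs."""
    if years_july < MIN_YEARS or years_november < MIN_YEARS:
        return False
    return july_yield is not None and june_area is not None


def suppressed_share(area_by_car, allowed_by_car):
    """Share of the crop area that lies in CARs without an estimate."""
    total = sum(area_by_car.values())
    suppressed = sum(
        area for car, area in area_by_car.items() if not allowed_by_car[car]
    )
    return suppressed / total


def province_published(area_by_car, allowed_by_car, cv_percent):
    """Publish only if suppressed area and CV are both within their limits."""
    if suppressed_share(area_by_car, allowed_by_car) > MAX_SUPPRESSED_SHARE:
        return False
    return cv_percent <= MAX_CV_PERCENT


# Hypothetical province: one small CAR has no estimate (5% of the area).
areas = {"CAR1": 95.0, "CAR2": 5.0}
allowed = {"CAR1": True, "CAR2": car_estimate_allowed(8, 30, 2.9, 5.0)}
print(province_published(areas, allowed, cv_percent=4.2))
```

The national check follows the same pattern as the provincial one, with the suppressed share computed over the national crop area.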

Prior to release, the current year's modelled crop yield estimates are compared with those from the July field crop survey and subject-matter experts review the results to identify any estimates which seem questionable. External sources of field crop yield estimates are also used to identify questionable results.

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

Revisions and seasonal adjustment

This methodology type does not apply to this statistical program.

Data accuracy

Coefficients of variation are produced by the model. Any modelled estimate with a coefficient of variation greater than 10% will not be released. This threshold differs from the one used for the survey-based estimates of field crop yield because the two coefficients of variation are calculated differently.

Documentation
