Data Inventory Project (DIP)

Detailed information for 2012

Status:

Inactive

Frequency:

One Time

Record number:

5190

The Data Inventory Project is a government-wide stock-taking of federal data holdings within departments that are part of the Policy Research Data Group, intended to determine the broad range of data holdings that could address medium- to longer-term priorities. The inventory comprises metadata on datasets held within the various departments and will be linked, when possible, to specific key policy issues.

Data release - April 22, 2013

Description

The Policy Research Data Group (PRDG) is an interdepartmental forum consisting of members from approximately 25 policy departments and central agencies. Its main focus is identifying data gaps and collaborating on the development of new data products for research in priority horizontal policy areas. PRDG was recently restructured to provide closer alignment with, and more timely support to, the Deputy Ministers' (DMs') policy community. Given the medium- to longer-term priorities of the DM community, the Deputy Ministers Steering Committee asked PRDG to bring forth a data strategy that would align data development with the DM policy priorities. In this environment, PRDG is addressing data development around these broader priorities in three stages: the construction of a data inventory, the assessment of data needs and the identification of data gaps.

The Data Inventory Project consists of a government-wide stock-taking of federal data holdings within departments to determine the broad range of data holdings that could address the medium- to longer-term priorities. The inventory consists of metadata on datasets held within the various government departments and will be linked, when possible, to specific key policy issues. Other metadata will include the title, subject area and subtopics, time and geographic coverage, data source and size, ownership and contact information, and a description of the dataset.
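As an illustration only, the metadata items listed above could be represented as a simple record. This is a minimal sketch: the class name, field names and the completeness check are assumptions made for the example and do not reflect the actual questionnaire layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetRecord:
    """One inventory entry. Field names are hypothetical, chosen to mirror
    the metadata items named in the description above."""
    title: str
    subject_area: str
    subtopics: List[str] = field(default_factory=list)
    time_coverage: str = ""
    geographic_coverage: str = ""
    data_source: str = ""
    size: str = ""
    owner: str = ""
    contact: str = ""
    description: str = ""
    # Links to key policy issues are made "when possible", so this may stay empty.
    policy_issues: List[str] = field(default_factory=list)


def missing_fields(record: DatasetRecord) -> List[str]:
    """Return the names of fields that are still empty strings or empty lists."""
    return [name for name, value in vars(record).items() if value in ("", [])]
```

A check such as `missing_fields` could help a coordinator see which items remain to be gathered for a dataset before it is reported.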

Subjects

  • Statistical methods

Data sources and methodology

Target population

The Data Inventory Project is targeted towards the approximately 18 policy departments and central agencies that are part of the Policy Research Data Group (PRDG). Each of the responding departments and agencies is expected to have numerous datasets for which metadata will be reported on the questionnaire.

Instrument design

In the fall of 2011, the content of the survey was drafted and reviewed by the Questionnaire Design Resource Centre. Following this, members of several of the respondent departments attended a focus group to review the content of the questionnaire and to be introduced to e-questionnaire formatting. Feedback was incorporated to make the survey tool easier to use and understand, including the addition of "help text" to certain questions for clarification.

An e-questionnaire is used to collect the information. It was developed to allow each respondent to loop through the questionnaire once for each dataset reported. Respondents can save their data as they progress through the e-questionnaire and return to it at a later date. The application received extensive volume and functionality testing to ensure that it can handle a large number of datasets per questionnaire. Some departments have a considerable number of datasets to report and therefore require multiple respondents to complete multiple questionnaires in order to cover the data holdings within that department.

As a result of these consultations, an Excel Collection Tool was developed to help responding departments collect the relevant information before entering it in the questionnaire. This is particularly useful for responding departments with a large number of datasets to report.

Further focus groups were conducted with respondent departments to identify any significant gaps in the main subject areas as well as the subcategories and provide any other feedback. A formal training session on the web application itself was also provided.

One-on-one questionnaire review sessions took place between some of the potential respondents and Special Surveys Division staff in order to assess how they would actually complete the questionnaire and to answer questions they had related specifically to their department's data holdings.

Finally, minor revisions were made to the questionnaire content as a result of review by the Information Management Committee.

Sampling

This survey is a census with a cross-sectional design.

Data are collected for all units of the target population, therefore no sampling is done.

Data sources

Data collection for this reference period: 2012-04-01 to 2012-05-31

Responding to this survey is voluntary.

Data are collected directly from survey respondents.

Data collection for the Data Inventory Project is through an internet-based e-questionnaire. Each questionnaire can loop 20 times, allowing up to 20 datasets to be reported per questionnaire.
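The 20-dataset limit determines how many questionnaires a department needs to complete. The sketch below works through that arithmetic; the function names are hypothetical and the code is not part of the collection application.

```python
from typing import List, Sequence

# Limit stated above: each questionnaire can loop 20 times.
MAX_DATASETS_PER_QUESTIONNAIRE = 20


def questionnaires_needed(dataset_count: int) -> int:
    """Smallest number of questionnaires that can hold dataset_count datasets."""
    if dataset_count < 0:
        raise ValueError("dataset_count must be non-negative")
    # Ceiling division without floating point.
    return -(-dataset_count // MAX_DATASETS_PER_QUESTIONNAIRE)


def split_into_questionnaires(datasets: Sequence[str]) -> List[List[str]]:
    """Group dataset titles into batches of at most 20, one batch per questionnaire."""
    step = MAX_DATASETS_PER_QUESTIONNAIRE
    return [list(datasets[i:i + step]) for i in range(0, len(datasets), step)]
```

For example, a department holding 45 datasets would need three questionnaires: two full ones and a third carrying the remaining five.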

Within each respondent department, a central coordinator is appointed to distribute questionnaires within the department and to liaise with Statistics Canada. The representatives are provided with an Excel-based capture tool that allows them to do some pre-collection legwork and aggregate their dataset information before the official start of collection.

Initial contact within each responding department was made prior to collection during focus group presentations and one-on-one meetings.

During collection, reminder e-mails are sent on a regular basis to departments that have not completed their questionnaires.

View the Questionnaire(s) and reporting guide(s).

Disclosure control

Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Various confidentiality rules are applied to all data that are released or published to prevent the publication or disclosure of any information deemed confidential. If necessary, data are suppressed to prevent direct or residual disclosure of identifiable data.

A share question is provided in the questionnaire, allowing respondent departments to authorize release of the information at one of three levels:

"Who should be allowed access to the information you provided in this questionnaire?"

This question refers to access to the information about the dataset (metadata), not access to the actual data.

1. Accessible to the public
2. Other government departments
3. Only within your own department (intradepartmental use only)
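One way to read the three levels is as a nested access rule applied to the metadata. The sketch below assumes that reading; the enum, the function and the treatment of a viewer with no department as a member of the public are illustrative assumptions, not a description of how the information is actually released.

```python
from enum import Enum
from typing import Optional


class AccessLevel(Enum):
    """The three response options of the share question."""
    PUBLIC = 1
    OTHER_GOVERNMENT_DEPARTMENTS = 2
    INTRADEPARTMENTAL_ONLY = 3


def can_view_metadata(level: AccessLevel,
                      owner_department: str,
                      viewer_department: Optional[str] = None) -> bool:
    """Decide whether a viewer may see a dataset's metadata (not the data itself).

    viewer_department=None stands for a member of the public (an assumption).
    """
    if level is AccessLevel.PUBLIC:
        return True
    if viewer_department is None:
        return False
    if level is AccessLevel.OTHER_GOVERNMENT_DEPARTMENTS:
        return True
    # INTRADEPARTMENTAL_ONLY: the viewer must belong to the owning department.
    return viewer_department == owner_department
```

Under this reading, metadata marked for intradepartmental use would be visible only to viewers from the owning department, while publicly accessible metadata would be visible to everyone.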
