Learn Open Data / Maintaining Open Datasets

Maintaining an open dataset involves managing the electronic data assets for the dataset according to the protocols that have been defined. The task can be simple or very complicated, depending on the size and complexity of the dataset.

The following are fundamental considerations when maintaining an open dataset:

  1. Organization - Determine a responsible entity that is suitable to act as steward of the dataset:
    • Ideally, this entity should have a degree of "ownership" and an organizational mission that aligns with maintaining the dataset, so that it has a continued interest in serving in the maintenance role.
    • The organization and technology platform may overlap, for example if a cross-jurisdictional nonprofit open data portal organization is formed to host data.
    • The entity should have agreed in principle to publish the dataset openly, without barriers or strings attached, using suitable machine-readable formats.
    • A suitable open data policy and suitable licenses should be used.
    • Ideally, funding is identified to help the entity perform maintenance. Some volunteer effort may be appropriate, but reliance on volunteers can lead to gaps in maintenance.
    • Establish protocols for submitting feedback and implement procedures to update the dataset accordingly.
  2. Technology Platform(s) - Select one or more technology platforms that are cost-effective, sustainable, and functional.
    • Technologies are used to collect, process, store, and publish data.
    • Can be simple (such as a file on a website) or complex (such as an open data portal), depending on requirements.
    • Should support open data formats.
    • Recognize that the use of technology can be a limitation if the technology is costly or difficult to learn, use, and support - strive for simplicity.
  3. Data Format - Maintain and publish datasets in formats that can be sustained.
    • For a simple dataset, an Excel workbook may be suitable for maintenance.
    • For a complex dataset, a database may be required.
    • The approach that is chosen should be readily supportable with available human resources.
  4. Data Elements - The dataset should contain elements that enable its use:
    • Unique identifiers should be assigned to data records so that each record can be identified and the data can be joined to other datasets.
    • Descriptive names should be provided in data records.
    • Use a simple data model and "flat" representation if possible.
  5. Metadata - Metadata for the dataset should be published:
    • At a minimum, provide a table that lists data elements and their descriptions.
    • Use open metadata standards where possible.
  6. Publishing - Publish the dataset and metadata:
    • Make the dataset and metadata easily accessible.
    • Provide multiple formats to facilitate use.
    • Minimize the cost of using the data by selecting appropriate technologies.
    • Include license/terms of use, disclaimer, etc. to define terms for use and distribution.
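
The Data Elements, Metadata, and Publishing considerations above can be illustrated with a minimal Python sketch. It assumes a small, flat dataset held in memory; the field names (station_id, station_name, county), the sample values, and the helper function names are all hypothetical and chosen only for illustration. The sketch checks that identifiers are unique, builds a simple table of data elements and descriptions, and serializes the same records to two machine-readable formats (CSV and JSON).

```python
import csv
import io
import json
from collections import Counter

# Hypothetical sample records; field names and values are illustrative only.
RECORDS = [
    {"station_id": "ST-001", "station_name": "North Creek Gauge", "county": "Adams"},
    {"station_id": "ST-002", "station_name": "South Fork Gauge", "county": "Baker"},
]

# Hypothetical element descriptions used to build the metadata table.
DESCRIPTIONS = {
    "station_id": "Unique identifier for the station; used to join to other datasets.",
    "station_name": "Descriptive name of the station.",
    "county": "County in which the station is located.",
}


def find_duplicate_ids(records, id_field):
    """Return a sorted list of identifier values that occur more than once."""
    counts = Counter(record[id_field] for record in records)
    return sorted(value for value, n in counts.items() if n > 1)


def build_data_dictionary(records, descriptions):
    """Return (element, description) rows for each field of the flat records.

    Fields without a description get an empty string so gaps are visible.
    """
    fields = list(records[0].keys())
    return [(name, descriptions.get(name, "")) for name in fields]


def to_csv(records):
    """Serialize flat records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records):
    """Serialize the same records to JSON text as a second published format."""
    return json.dumps(records, indent=2)


if __name__ == "__main__":
    duplicates = find_duplicate_ids(RECORDS, "station_id")
    if duplicates:
        raise SystemExit(f"Duplicate identifiers found: {duplicates}")
    for element, description in build_data_dictionary(RECORDS, DESCRIPTIONS):
        print(f"{element}: {description}")
    print(to_csv(RECORDS))
    print(to_json(RECORDS))
```

In practice, a maintainer might run a check like find_duplicate_ids before each release and publish the data dictionary alongside the data files. Only the Python standard library is used here, in keeping with the advice to strive for simplicity.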