Learn Open Data / Using Open Datasets

Using open datasets requires:

  • finding the dataset (for example using an internet search or an open data portal)
  • understanding dataset contents (for example from metadata or visualization)
  • accessing/downloading the dataset (for example using web services or file download)
  • applying the data in an application (for example an analysis, model, or visualization)

See the Resources page for software tools to help use open data.

Download Data with Curl

The open source curl program can be used to automate dataset downloads once the URL of the dataset is known. The curl program is available for all major operating systems, and its underlying library (libcurl) can also be used from programming languages such as Python. Calling curl from a script is useful when the download is one step in a larger workflow that also processes the data. A simple command-line example is:

curl http://some/dataset/address --output some-dataset.csv

Useful tips:

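  • Use the --location option if the requested URL redirects to another location. By default curl does not follow redirects, so without this option the output file contains the redirect response rather than the dataset.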
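
Calling curl from a script, as described above, can be sketched in Python as follows. This is a minimal illustration rather than a recommended implementation: it assumes curl is installed and on the PATH, the function names build_curl_command and download are hypothetical, and the --fail, --silent, and --show-error options are additions chosen here so that HTTP errors cause a failure instead of saving an error page as the dataset.

```python
import subprocess


def build_curl_command(url, output_path, follow_redirects=True):
    """Return the curl argument list for downloading url to output_path."""
    # --fail: exit with an error on HTTP errors instead of saving the error page
    # --silent --show-error: hide the progress meter but still print errors
    command = ["curl", "--fail", "--silent", "--show-error"]
    if follow_redirects:
        # --location: follow redirects, as recommended in the tip above
        command.append("--location")
    command += ["--output", output_path, url]
    return command


def download(url, output_path, follow_redirects=True):
    """Run curl; raises subprocess.CalledProcessError if curl exits non-zero."""
    subprocess.run(build_curl_command(url, output_path, follow_redirects), check=True)
```

A call such as download("http://some/dataset/address", "some-dataset.csv") would then mirror the command-line example, with the placeholder address replaced by a real dataset URL.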

Download and Process Data with TSTool

The open source TSTool software (see TSTool software website) is maintained by the Open Water Foundation for the State of Colorado. TSTool can be used to automate the download and processing of datasets, for example to convert CSV files into time series and then perform analysis and visualization. The following TSTool commands are useful:

  • WebGet - retrieve a file using a URL
  • ReadTableFromDelimitedFile - read a CSV or other file into a table for additional processing
  • TableToTimeSeries - convert a table into a time series for analysis and visualization
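
For readers unfamiliar with the table-to-time-series step, the idea can be sketched in Python. This is not TSTool command syntax and does not use TSTool; it is a rough analogue built on the Python standard library, and the function names, column names (date, flow_cfs), and sample values are all hypothetical.

```python
import csv
import io
from datetime import date


def read_table(csv_text):
    """Parse delimited text into a list of row dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def table_to_time_series(rows, date_column, value_column):
    """Convert table rows into a date-sorted list of (date, value) pairs."""
    series = []
    for row in rows:
        raw = row[value_column].strip()
        # Keep empty cells as missing values (None) rather than dropping the date
        value = float(raw) if raw else None
        series.append((date.fromisoformat(row[date_column].strip()), value))
    return sorted(series, key=lambda pair: pair[0])


# Hypothetical sample data standing in for a downloaded CSV file
sample = "date,flow_cfs\n2020-01-02,12.5\n2020-01-01,10.0\n2020-01-03,\n"
series = table_to_time_series(read_table(sample), "date", "flow_cfs")
```

The sketch keeps the table and the time series as separate steps, in the same way that the commands listed above separate reading a file from converting it, so that the table can be inspected or filtered before conversion.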