Learn Open Data / Using Open Datasets
Using open datasets requires:
- finding the dataset (for example from an internet search or an open data portal)
- understanding dataset contents (for example from metadata or visualization)
- accessing/downloading the dataset (for example using web services or file download)
- applying the data in an application (for example an analysis, model, or visualization)
See the Resources page for software tools to help use open data.
Download Data with Curl
The open source curl program can be used to automate dataset downloads once the URL of the dataset is known. The curl program is available for all operating systems and is also available as a library in some programming languages such as Python. Calling curl from a script may be necessary when the download is one step in a larger workflow that also processes the data.
A simple command-line example is:
curl http://some/dataset/address --output some-dataset.csv
Useful tips:
- Use the --location option if the URL redirects to another location; curl does not follow redirects by default.
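When the download is part of a Python script, the same step can be done without calling the curl program. The following is a minimal sketch that uses only the Python standard library rather than a curl binding. The function name download_dataset and its parameters are illustrative assumptions, not part of curl or of any dataset's API.

```python
import shutil
import urllib.request
from pathlib import Path


def download_dataset(url: str, output_path: str, timeout: float = 30.0) -> Path:
    """Download the resource at `url` and save it to `output_path`.

    Hypothetical helper for illustration; the name and parameters are assumptions.
    """
    destination = Path(output_path)
    # urlopen follows HTTP redirects automatically, similar to curl --location.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with destination.open("wb") as output_file:
            # Copy in chunks so a large dataset does not need to fit in memory.
            shutil.copyfileobj(response, output_file)
    return destination
```

For example, download_dataset("http://some/dataset/address", "some-dataset.csv") would mirror the command-line example above, with the placeholder URL replaced by a real dataset address.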
Download and Process Data with TSTool
The open source TSTool software (see TSTool software website) is maintained by the Open Water Foundation for the State of Colorado. TSTool can be used to automate download and processing of datasets, for example to process CSV files into time series and perform analysis and visualization. The following TSTool commands are useful:
- WebGet: retrieve a file using a URL
- ReadTableFromDelimitedFile: read a CSV or other file into a table for additional processing
- TableToTimeSeries: convert a table into a time series for analysis and visualization
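To make the table and time series steps concrete, the following sketch shows the same idea in plain Python. It is not TSTool code and does not use TSTool command syntax. The function names, the sample data, and the column names date and flow_cfs are all hypothetical, and the sketch assumes dates are in ISO format (YYYY-MM-DD).

```python
import csv
import io
from datetime import date

# Hypothetical sample data standing in for a downloaded CSV file.
SAMPLE_CSV = "date,flow_cfs\n2024-01-02,12.5\n2024-01-01,10.0\n2024-01-03,\n"


def read_table(csv_text: str) -> list[dict[str, str]]:
    """Read delimited text into a table: a list of rows keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def table_to_time_series(table, date_column, value_column):
    """Convert table rows into a date-sorted list of (date, value) pairs."""
    series = []
    for row in table:
        raw_value = row[value_column].strip()
        # Keep the date and store None when the value is missing.
        value = float(raw_value) if raw_value else None
        series.append((date.fromisoformat(row[date_column].strip()), value))
    # Sort on the date only, so missing values never need to be compared.
    return sorted(series, key=lambda pair: pair[0])
```

Calling table_to_time_series(read_table(SAMPLE_CSV), "date", "flow_cfs") yields the rows in date order, with the missing value on the last date kept as None so that the gap stays visible in later analysis.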