ECReader – Environment Canada Climate Data Reader

ECReader (Environment Canada Climate Data Reader) is a .NET tool for downloading Canadian climate data from the Environment Canada website. It grew out of the SWAT modelling project I’m working on. I was tired of downloading data from the EC website one station at a time, year by year, and then processing it into the file format ArcSWAT wants. It’s no fun at all. As a programmer, I would rather spend that time building a tool that does all this work in one click. It should help others too. So please leave a comment or send me an email (hawklorry@gmail.com) if you need more functions. I would be happy to add them for you. Also, please follow me on WordPress or Facebook to get notified of further development.

Program Package
https://drive.google.com/file/d/0B16YhFB_9MejSjE4RlQ2VjZLVlk/view?usp=sharing

Source Code

https://code.google.com/p/environment-canada-climate-data-reader/

Screenshot

The main features are listed below.

  • Download climate data for a bunch of stations with just one click.
  • A built-in station definition window to help locate stations of interest by name or location.
  • Support for different output formats, including the SWAT-ready dbf/txt format, which can be used in ArcSWAT without any post-processing (only for precipitation and temperature).
  • Support for daily and hourly data.
  • A map of all climate stations, which can also be used outside the tool. The attribute table gives the station ID, name and the availability of hourly, daily and monthly data. It’s a good dataset for any research that requires climate data. It’s available right in the tool and in CSV, shapefile and KMZ formats for different environments.

Several posts have been published on this topic (listed below). New functions will be added shortly to fill data gaps from nearby stations using the IDW (Inverse Distance Weighting) method. Lapse rate will also be considered for precipitation and temperature in this process.

December 12, 2013 Environment Canada Climate Data Reader

December 18, 2013 How to Get Environment Canada Climate Station ID From Name

December 28, 2013 Uncompleted Data Bug Fixed, Please Update

December 28, 2013 Environment Canada Climate Stations (Shapefile and KMZ)

December 31, 2013 6 More Columns Added to Environment Canada Climate Stations (Shapefile) to Help Check Data Availability

January 2, 2014 ECReader 1.1 – New Climate Station Definition Window – No longer need to look up station IDs yourself

January 6, 2014 File Name Convention in ECReader1.1

January 9, 2014 Save and Load Defined Stations in ECReader1.1

 

 


Save and Load Defined Stations in ECReader1.1

Please follow my blog to get the latest updates on ECReader as soon as they are available.

Quick Download Link: ECReader ver 1.1

If you are working on a bunch of stations, you may want to save the list and load it later. There is no need to define exactly the same stations twice. New functions have been added to ECReader1.1 to save and load defined stations.

Contents

  • Save Defined Stations
  • Load Saved Stations
  • Automatically Save Stations

Save Defined Stations

A new “Save as…” button has been added to the “Define Stations” window to save defined stations. Once the stations are ready, click this button and give a file name. It’s done.

  • The stations are saved as a CSV file with the same format as ecstations.csv.
  • Please don’t edit the file, to avoid errors.
  • A default file name is given in the following format.

ECReader_Stations_[time stamp]

where time stamp is the time when the button is clicked.

Example

ECReader_Stations_20140108115611
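
For readers who want to produce the same kind of name in their own scripts, here is a minimal Python sketch. It is not ECReader’s code (ECReader is a .NET tool). The function name default_stations_filename is made up for illustration, and the year-month-day-hour-minute-second layout is an assumption read off the example above.

```python
from datetime import datetime


def default_stations_filename(clicked_at: datetime) -> str:
    """Build a default name like ECReader_Stations_20140108115611.

    The yyyyMMddHHmmss layout is inferred from the example in the post;
    it is an assumption, not a documented format.
    """
    return "ECReader_Stations_" + clicked_at.strftime("%Y%m%d%H%M%S")


# The example from the post: button clicked on Jan 8, 2014 at 11:56:11.
print(default_stations_filename(datetime(2014, 1, 8, 11, 56, 11)))
```

A fixed-width time stamp like this also sorts by time when the file names are sorted alphabetically.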

Load Saved Stations

A new “Load Saved Stations” button has been added to the main interface, right after the “Define Stations…” button. Click this button and select the file just saved; the saved stations will be loaded and a prompt will show the number of stations loaded. The program is then ready to download data for all these stations.

Automatically Save Stations

The defined stations are saved automatically when the interface is closed. A file named ecstations_selected.csv is saved to the user temp folder. This file is loaded automatically the next time the interface is opened.

File Name Convention in ECReader1.1

To make the data files easy to identify and to meet the requirements of ArcSWAT, the file name convention has been modified in ver 1.1.

  1. ArcSWAT 2009 dBase format and ArcSWAT 2012 ASCII format

Except for the file extension, these two formats share the same file name convention.

ArcSWAT 2009/2012 requires file names of no more than 8 characters. The data files are named as follows.

Precipitation        P[station ID].dbf/.txt

Temperature        T[station ID].dbf/.txt

where:

station ID is the ID of the station. If the station ID is shorter than 7 characters, zeros are added on the left so that the file name has 8 characters including the leading P or T.

Discussion: Another option is to use the first 7 letters of the station name, which makes it easier to identify the station from the file name but risks two different stations getting the same name. Which option is better?

Examples:

For station 4922, the data file will be P0004922.dbf/.txt and T0004922.dbf/.txt

For station 10936, the data file will be P0010936.dbf/.txt and T0010936.dbf/.txt
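
The padding rule can be sketched in a few lines of Python. This is only an illustration of the convention described above, not ECReader’s own code; the function name swat_file_name and its parameters are hypothetical.

```python
def swat_file_name(variable: str, station_id: int, extension: str) -> str:
    """Return an 8-character ArcSWAT data file name plus extension.

    variable is "P" (precipitation) or "T" (temperature). The station ID
    is left-padded with zeros to 7 characters, as described in the post.
    """
    if variable not in ("P", "T"):
        raise ValueError("variable must be 'P' or 'T'")
    padded = str(station_id).zfill(7)
    if len(padded) > 7:
        # A longer ID would break the 8-character limit.
        raise ValueError("station ID is longer than 7 characters")
    return f"{variable}{padded}.{extension}"


# The two examples from the post.
print(swat_file_name("P", 4922, "dbf"))   # P0004922.dbf
print(swat_file_name("T", 10936, "txt"))  # T0010936.txt
```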

A gage location file is also generated with a fixed name.

Precipitation Gage     pcp.dbf/.txt

Temperature Gage     tmp.dbf/.txt

  2. Free Format Text and CSV

These two formats share the same file name convention.

[station name]_[province]_[start year]_[end year]

where

  • station name is the name of the station. If the name has more than one word, the spaces are replaced with underscores.
  • province is the province of the station.
  • start year is the later of the start year of the query time period and the first year of available data. These two formats are usually used for data analysis, so there is no need to download years without data. The same applies to end year.
  • end year is the earlier of the end year of the query time period and the last year of available data. If the start year equals the end year, only the start year is used.

Examples:

For station GEORGETOWN WWTP located in Ontario,

  • if the query time period is from 1965 to 1965, the data file will be GEORGETOWN_WWTP_ONT_1965.txt/.csv
  • if the query time period is from 1965 to 1970, the data file will be GEORGETOWN_WWTP_ONT_1965_1970.txt/.csv
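
The rules above can be put together in a short Python sketch. It is an illustration only, not ECReader’s code. The function name free_format_file_name is hypothetical, and the data availability years used in the calls (1962 to 2013) are made up so that the query period falls inside them.

```python
def free_format_file_name(station_name: str, province: str,
                          query_start: int, query_end: int,
                          data_start: int, data_end: int,
                          extension: str) -> str:
    """Build a name like GEORGETOWN_WWTP_ONT_1965_1970.csv."""
    # Clip the query period to the years that actually have data.
    start = max(query_start, data_start)
    end = min(query_end, data_end)
    if start > end:
        raise ValueError("query period does not overlap the available data")
    # Replace spaces in the station name with underscores.
    name = "_".join(station_name.split())
    # A single year is written once instead of as a range.
    years = str(start) if start == end else f"{start}_{end}"
    return f"{name}_{province}_{years}.{extension}"


# Data availability 1962-2013 is a made-up assumption for this example.
print(free_format_file_name("GEORGETOWN WWTP", "ONT", 1965, 1965, 1962, 2013, "txt"))
print(free_format_file_name("GEORGETOWN WWTP", "ONT", 1965, 1970, 1962, 2013, "csv"))
```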

ECReader 1.1 – New Climate Station Definition Window – No longer need to look up station IDs yourself

For the file name convention, please refer to my other post: File Name Convention in ECReader1.1

In the previous version of ECReader, to download climate data for an Environment Canada climate station, its station ID had to be looked up first following the instructions. I must say: it’s NOT user-friendly.

To solve this, the basic information (including station ID) of all EC climate stations was downloaded and saved in a CSV file. This makes it possible to look up station IDs in a user-friendly way.

The ECReader interface has been modified and a new “Define Stations” window added to help choose stations from all 8551 stations. It also serves as a tool to browse and look for stations.

Specify Stations in Previous Version – Station ID is required

Specify Stations in New Version – Dedicated Station Definition Window to Utilize all EC Station Data and a link to download the all EC stations data

Quick Guide

  1. Click the “Define Stations…” button to open the “Define Stations” window.
  2. There are three tabs: By Name, Browse and From Map. The last two will be implemented in the near future. The map function should be interesting.
  3. Three search criteria are supported: Station Name, Province and Data Availability.
  4. If you know part of the station name, enter it in the station name box, e.g. george.
  5. If you know the province of the stations, select it from the province list, e.g. ONT.
  6. If you know the simulation period of the SWAT model or the study period of the data analysis, select hourly, daily or monthly from the data type list and select the start and end years.
  7. Click the “Search” button to retrieve stations.
  8. Stations are listed below the “Search” button. Two columns, name and province, are displayed for each station.
  9. Select a station in the list and its data availability is shown at the bottom of the window. You can check the availability of hourly, daily and monthly data.
  10. Double-click a station in the list to add it to the selected list on the right. The selected list is the list of stations that will be used to download data.
  11. You can also click “Use All 3 stations” below the list to add all the stations in the list to the selected list. The name of this button changes with the search result.
  12. In the selected list on the right, select a station to see its data availability at the bottom, the same as in the search result list. Double-click a station to remove it from the selected list. Click the “Remove All” button to remove all stations.
  13. Click the “Use Selected Stations” button to return to the main interface. The number of selected stations is displayed under the “Define Stations…” button.

Now the query stations are defined and the data can be downloaded the same way as in the previous version.

6 More Columns Added to Environment Canada Climate Stations (Shapefile) to Help Check Data Availability

Quick Download Links

CSV, Shapefile, KMZ, ZIP

Please note that this data was generated on Dec 31, 2013 and the last day of data depends on the day when the data was retrieved from the EC website. If a last day is 2013-12-30, it doesn’t mean there is no data after Dec 30, 2013. I will update the data once in a while to keep the dataset up to date.

Background

  1. Environment Canada climate data is available at hourly, daily and monthly resolution. However, this is not true for every station. Some have only daily and monthly data and no hourly data.

  2. Some stations stopped working at some point and don’t have up-to-date data. For these stations, only a period of data is available.

For climate station selection, it’s useful to know which types of data are available from a station and the time period of available data. This information can be compared with the simulation period of a model to help select the right climate stations.

Solution

Fortunately, this information can be retrieved from the EC website. The following columns were added to the CSV and shapefile published in the previous blog post.

  1. HOURLY_FIRST_DAY: First day of hourly data; will be null if there is no hourly data

  2. HOURLY_LAST_DAY: Last day of hourly data; will be null if there is no hourly data

  3. DAILY_FIRST_DAY: First day of daily data; will be null if there is no daily data

  4. DAILY_LAST_DAY: Last day of daily data; will be null if there is no daily data

  5. MONTHLY_FIRST_DAY: First day of monthly data; will be null if there is no monthly data

  6. MONTHLY_LAST_DAY: Last day of monthly data; will be null if there is no monthly data

Please note that the names of these columns in the shapefile are shortened due to the shapefile limit on column name length.
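
As an illustration of how these columns can be compared with a simulation period, here is a small Python sketch that keeps only the stations whose daily data covers a given range of years. It makes several assumptions that should be checked against the real CSV: dates are written as YYYY-MM-DD, a null is an empty cell, and the three sample rows are made up rather than taken from the dataset. The function name covers_period is hypothetical.

```python
import csv
import io
from datetime import date

# Made-up sample rows; the real file has more columns and stations.
SAMPLE_CSV = """ID,NAME,DAILY_FIRST_DAY,DAILY_LAST_DAY
1,STATION A,1960-01-01,2013-12-30
2,STATION B,1995-06-01,2013-12-30
3,STATION C,,
"""


def covers_period(row: dict, data_type: str, start_year: int, end_year: int) -> bool:
    """True if the station has data_type data for the whole period.

    data_type is HOURLY, DAILY or MONTHLY, matching the column prefixes.
    """
    first = row.get(f"{data_type}_FIRST_DAY", "")
    last = row.get(f"{data_type}_LAST_DAY", "")
    if not first or not last:
        return False  # null columns mean no data of this type
    return (date.fromisoformat(first) <= date(start_year, 1, 1)
            and date.fromisoformat(last) >= date(end_year, 12, 31))


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
usable = [r["NAME"] for r in rows if covers_period(r, "DAILY", 1990, 2000)]
print(usable)
```

With the sample rows, only STATION A covers 1990 to 2000: STATION B starts too late and STATION C has no daily data.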

Environment Canada Climate Stations (Shapefile and KMZ)

Quick Download Links

CSV, Shapefile, KMZ, ZIP

Updates

Dec 31, 2013

To help determine the data availability of each station, 6 more columns (HOURLY_FIRST_DAY, HOURLY_LAST_DAY, DAILY_FIRST_DAY, DAILY_LAST_DAY, MONTHLY_FIRST_DAY, MONTHLY_LAST_DAY) have been added to show the available time range of hourly, daily and monthly data. Please note that the names of these columns in the shapefile are shortened due to the shapefile limit on column name length.

Available Data

EC stations data is the basic information of all EC climate stations. It’s available in three formats.

  1. CSV, which has the following columns: ID, NAME, PROVINCE, LATITUDE, LONGITUDE, ELEVATION. The ID column is the station ID required in ECReader.
  2. Shapefile, which is created from the CSV file with the “Add XY Data” tool in ArcMap. A geographic coordinate system (GCS_North_American_1983) is used. The columns are the same as in the CSV file.

Please note that the 5 stations in the bottom right corner have missing latitude and longitude on the Environment Canada website. In the data here, their latitude and longitude are set to 0.

  3. KMZ, which is converted from the shapefile with the “Layer To KML” tool in ArcGIS. It’s intended for use in Google Earth for further analysis.

A zip file, which contains all these three formats, is also available.

Calculation of Latitude and Longitude

Latitude and longitude are calculated from the information given on the daily data report page. The degree, minute and second format displayed on the page is converted to decimal degrees using the following equation.

Decimal Degree = Degree + (minute + second/60.0) / 60.0

For climate station 100 Mile House in BC,

Latitude = 51 + (38 + 49.02/60)/60 = 51.64695

Longitude = 121 + (18 + 9.06/60)/60 = 121.302516666667 (saved as -121.302516666667)
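
The same conversion as a minimal Python sketch, for anyone who wants to check a value by hand. This is not ECReader’s code. The function name dms_to_decimal and the west_or_south flag are hypothetical; the flag just mirrors the fact that the western longitude above is saved as a negative number.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float,
                   west_or_south: bool = False) -> float:
    """Decimal Degree = Degree + (minute + second/60.0) / 60.0"""
    value = degrees + (minutes + seconds / 60.0) / 60.0
    # Western longitudes (all of Canada) are stored as negative values.
    return -value if west_or_south else value


# Longitude of 100 Mile House from the example: 121 deg 18 min 9.06 sec west.
longitude = dms_to_decimal(121, 18, 9.06, west_or_south=True)
print(round(longitude, 6))
```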

Steps Implemented in ECReader to Generate the CSV File

  1. Get all stations using “Search by Province” in Advanced Search. 8467 stations are displayed over several pages. On each page, read the station ID, station name, province, available data types (hourly, daily and monthly) and the corresponding time period of each station. The station ID and available data types are saved as hidden inputs in the result page.

  2. For each station, go to the daily report page and read the latitude, longitude and elevation.

  3. For each station, write the station ID, name, province, latitude, longitude and elevation into the CSV file. More information is available.

These steps are implemented in ECReader and are as simple as clicking a button. Please note that this button has not been added to the interface for performance reasons. If you want to test this function yourself, please reply at the bottom of this blog post.

Background

When a SWAT model is set up, it would be convenient to have the locations of all EC climate stations in a GIS format, which could be used to determine which stations to use in the model.

EC doesn’t provide this map. Instead, a tool is supplied in the Advanced Search to help locate stations. It’s not good enough for GISers, who want to work in a more visual way. I’m that guy.

Searching on Google, I found a shapefile here. Interestingly, it happens to have been created by the University of Guelph, where I worked for three years starting in 2010. I was in the Department of Geography, which should also be where the data comes from. I may even know the professor or student who created it. Anyway, this data only covers stations with climate normals, and it has 1480 stations compared to the 8467 stations returned by the advanced search on the EC website. Clearly, we can’t use this data for the modelling work.

Solution

EC gives the location (latitude and longitude) of each station on the daily data report page. Retrieving this page for all 8467 stations and then reading the location from the page is a perfect job for a computer.

At the same time, it’s possible to get the station ID required in ECReader and create a simple lookup between station name and station ID, which means users won’t need to find the station ID themselves as before. ECReader will be more user-friendly.

Implementation

ECReader was modified to do this work. The response from the EC website is in HTML format. To read it, System.Xml.XmlDocument was tried first. It didn’t work well for parsing HTML. Exceptions everywhere. Then the third-party library HtmlAgilityPack was found, and it was perfect.

The process is implemented as a static method of class EC. It saves all the information in CSV format. Getting all the stations takes about half an hour, depending on the computer configuration.

The final data was compared with the data generated by the University of Guelph. All the stations are in the same locations.

Future Work

There are several ways to use the station data in future development of ECReader.

  1. Look up the station ID from the station name. No one wants to find the station ID themselves.
  2. Add more methods to define working stations, including interactive map selection (import a watershed boundary or select manually on the map), which could be implemented with DotSpatial.

Uncompleted Data Bug Fixed, Please Update

After ECReader (Environment Canada Climate Data Reader) was released, several users reported that the data for some years is incomplete. It does start from Jan 1st but ends on an arbitrary day of that year, which makes the result files unusable. Since the program doesn’t check whether the data is complete, the problem can only be found by manual data analysis. It’s a critical bug. Sorry for any inconvenience.

The good news is that the bug has been fixed. It’s highly recommended to update your program and re-download your data. The new version is located in the same place as the previous version. Please click here to download.

Other minor changes have also been made in this version and the structure of the package has changed. Some files have been removed. It looks like this.

Other minor changes are listed below.

  1. The program got an icon.
  2. No more DLLs. Only one executable file.
  3. Button is removed. Instead a help link is added.
  4. is added to select all data types for free format txt/csv.