Importing Files
Todd Huss has submitted the file import code for Wind and Tides. Periodically the site needs to fetch data from the NOAA and update the local database. Here’s his import code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 |
class Wind < ActiveRecord::Base validates_presence_of :created, :wind_location_id belongs_to :wind_location # Get all of the wind locations, then fetch, parse, and save the data def self.batch_update!(wind_locations = WindLocation.all) wind_locations.each do | wind_location | raw_noaa_data = Net::HTTP.get_response(URI.parse(wind_location.url)).body raise "Received unexpected wind response from NOAA: " + raw_noaa_data unless raw_noaa_data=~/TIME/ save_wind!(raw_noaa_data, wind_location) end end # Take a text based NOAA forecast, parse it, populate the model objects, and save it def self.save_wind!(noaa_text, wind_location) # ... end # NOAA publishes 181809Z for Wind which we need to turn into YEAR/MONTH/18 18:09 UTC def self.convert_noaa_to_time(noaa_time, now = Time.now.getutc) # ... end end |
There’s one major structural change I’d make, and a few minor enhancements.
The structural change I’d make is to move the downloading and parsing code out of the model and into a utility class. This lets you keep your models focused on your application, and have the file import code tucked away in the lib directory. At the same time, if we tie the importer to a single WindLocation and remove the loop in the batch_update! method, we make the focus tighter:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 |
class WindImporter class << self def batch_update!(wind_location) raw_noaa_data = Net::HTTP.get_response(URI.parse(wind_location.url)).body raise "Received unexpected wind response from NOAA: " + raw_noaa_data unless raw_noaa_data=~/TIME/ save_wind!(raw_noaa_data, wind_location) end # Take a text based NOAA forecast, parse it, populate the model objects, and save it def save_wind!(noaa_text, wind_location) # ... end # NOAA publishes 181809Z for Wind which we need to turn into YEAR/MONTH/18 18:09 UTC def convert_noaa_to_time(noaa_time, now = Time.now.getutc) # ... end end end class WindLocation < ActiveRecord::Base has_many :winds def batch_update! WindImporter.batch_update!(self) end end |
By making WindLocation#batch_update! delegate to the importer, we can work with it in a more object oriented manner. (Also, using delegation makes it possible to use a “mock” object for the importer during testing.)
1 2 3 4 5 |
# Import the files for one location @wind_location.batch_update! # or, import them all WindLocation.find(:all).each(&:batch_update!) |
With that change out of the way, we can make a few enhancements to the implementation of some of the methods. batch_update! has a big chunk of net/http boilerplate which can be replaced by a tiny piece of open-uri.
1 2 3 4 5 6 |
require 'open-uri' def batch_update!(wind_location) noaa_text = open(wind_location.url).read # ... end |
There are a few snippets which can be simplified in save_wind! too. Here’s the original:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
def self.save_wind!(noaa_text, wind_location) # Filter and append lines with data lines = Array.new noaa_text.each_line do | line | next unless line=~/^\d\d\d\d\d\dZ/ lines << line end raise "Received unexpected wind report from NOAA: " + noaa_text unless lines.length > 0 # Process each line in reverse order (oldest first) lines.reverse_each do | line | created = convert_noaa_to_time(line[0,6]) wind = Wind.find_or_create_by_created_and_wind_location_id(created, wind_location.id) class << data = line.split # Convert field to int unless it's "--" then nil def to_i(index) (self[index]=~/^-+/)?nil:self[index].to_i end end wind.direction = data.to_i(wind_location.position) wind.sustained = data.to_i(wind_location.position + 1) wind.peak = data.to_i(wind_location.position + 2) wind.save! end end |
The initialization of lines can be done in one line by using Array#grep and a slightly simplified regular expression. (Array#grep simply returns all elements that match the given regex.)
lines = noaa_text.grep(/^\d{6}Z/) |
Next, instead of calling the unwieldy find_or_create_by_created_and_wind_location_id we can use class methods on associations to initialize wind. (For more about class methods on associations, you can read the article I wrote when we were dissecting the Tracks application.)
wind = wind_location.winds.find_or_create_by_created(created) |
Finally we can simplify the code which converts integers. In the NOAA format, ”—” represents nil, and other integers are as expected. While the original code converted the data by defining a method on the data array, we can instead create a new array containing just the integers with Enumerable#map. After creating the new array, we can safely retrieve the figures.
1 2 3 4 5 6 |
data = line.split.map{|d| d.starts_with?("--") ? nil : d.to_i }
pos = wind_location.position
wind.direction = data[pos]
wind.sustained = data[pos + 1]
wind.peak = data[pos + 2] |
Another option for parsing integers is to use the Integer method. It will raise an exception if the argument cannot be converted, so you could just do something like:
data = line.split.map{ |d| Integer(d) rescue nil } |
That won’t work, of course, if you want any non-integer value besides ”—” to map to 0, but it does reduce the line-noise a bit.
This makes the code much easier to follow and a little more explicit. The fully refactored save_wind! is presented below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
def save_wind!(noaa_text, wind_location) lines = noaa_text.grep(/^\d{6}Z/) raise "Received unexpected wind report from NOAA: " + noaa_text unless lines.length > 0 # Process each line in reverse order (oldest first) lines.reverse_each do | line | created = convert_noaa_to_time(line[0,6]) wind = wind_location.winds.find_or_create_by_created(created) data = line.split.map{|d| d.starts_with?("--") ? nil : d.to_i } pos = wind_location.position wind.direction = data[pos] wind.sustained = data[pos + 1] wind.peak = data[pos + 2] wind.save! end end |
Thanks again to Todd for making the submission.

