Thursday, September 6, 2012

US National Weather Service info feeds

I'm looking at updating my old desktop-weather script, so today I researched some of the weather-related information sources I can use.

There is a lot out there, quite a different landscape than three years ago. Location services, for example, to tell you where the system thinks you are, seem to be pretty mature. Gnome includes one based on both IP address and GPS.

In the US, the National Weather Service (NWS at weather.gov) has a nifty service to translate approximate locations (like postal codes) into approximate lat/lon coordinates. This is handy because most of their services use lat/lon coordinates. They have a "use us, but don't overuse us to the point of a Denial-of-Service-attack" policy.

The good old METAR aviation weather system keeps chugging along. Indeed, my old desktop script scraped METAR current condition reporting, and I likely will again. It's great for current conditions, and in places where METAR airports are frequent. It's not so good for forecasts or alerts...or for places not close to an airport.

Weather Underground is really incredible. Lots of worldwide data, an apparently stable API for it...but their terms of service seem oriented toward paid app developers. Maybe another time.

Looking at most of the desktop-weather info (not research or forecasting) systems out there, most seem to use METAR data...often funnelled through an intermediary like weather.com or Google.

It's a geographically and systemically fragmented market. So this time let's see how I can improve my existing NWS feed.



1) Geolocation

I change locations. It would be nice if the system noticed.

The National Weather Service (NWS) has a geolocation service...but they don't advertise it.
These services are intended for their customers - don't spam them with unrelated requests!

NWS Geolocation converts a City/State pair, a Zipcode, or an ICAO airport code into the appropriate latitude/longitude pair.

Here's an example of geolocation. Let's use an ICAO airport code (kmke), and see how the server redirects to a URL with Lat/Lon:

$ wget -O - -S --spider http://forecast.weather.gov/zipcity.php?inputstring=kmke
Spider mode enabled. Check if remote file exists.
--2012-10-03 15:08:40--  http://forecast.weather.gov/zipcity.php?inputstring=kmke
Resolving forecast.weather.gov (forecast.weather.gov)... 64.210.72.26, 64.210.72.8
Connecting to forecast.weather.gov (forecast.weather.gov)|64.210.72.26|:80... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 302 Moved Temporarily
  Server: Apache/2.2.15 (Red Hat)
  Location: http://forecast.weather.gov/MapClick.php?lat=42.96&lon=-87.9
  Content-Type: text/html; charset=UTF-8
  Content-Length: 0
  Cache-Control: max-age=20
  Expires: Wed, 03 Oct 2012 20:09:01 GMT
  Date: Wed, 03 Oct 2012 20:08:41 GMT
  Connection: keep-alive
Location: http://forecast.weather.gov/MapClick.php?lat=42.96&lon=-87.9 [following]
Spider mode enabled. Check if remote file exists.
--2012-10-03 15:08:41--  http://forecast.weather.gov/MapClick.php?lat=42.96&lon=-87.9
Connecting to forecast.weather.gov (forecast.weather.gov)|64.210.72.26|:80... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  Server: Apache/2.2.15 (Red Hat)
  Content-Type: text/html; charset=UTF-8
  Cache-Control: max-age=82
  Expires: Wed, 03 Oct 2012 20:10:03 GMT
  Date: Wed, 03 Oct 2012 20:08:41 GMT
  Connection: keep-alive
Length: unspecified [text/html]
Remote file exists and could contain further links,
but recursion is disabled -- not retrieving.

So kmke is located near lat=42.96&lon=-87.9

We can script this to reduce the output. Let's try it with a zipcode:

zipcode="43210"
header=$(wget -O - -S --spider -q "http://forecast.weather.gov/zipcity.php?inputstring=$zipcode" 2>&1)
radr=$(echo "$header" | grep 'http://forecast.weather.gov/MapClick.php?' | cut -d'&' -f3 | cut -d'=' -f2)
lat=$(echo "$header" | grep 'http://forecast.weather.gov/MapClick.php?' | cut -d'&' -f4 | cut -d'=' -f2)
lon=$(echo "$header" | grep 'http://forecast.weather.gov/MapClick.php?' | cut -d'&' -f5 | cut -d'=' -f2)
echo "Result: $radr  $lat  $lon"
Result: ILN  39.9889  -82.9874
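
The field numbers in those cut commands depend on the order of the parameters in the redirect URL, and the kmke example above shows that the order is not always the same. Here's a small sketch of a more forgiving approach that pulls each value out by name. The helper name url_param is my own invention, not anything NWS provides:

```sh
# Hypothetical helper: print the value of one named query parameter.
# Usage: url_param "URL or Location header line" NAME
url_param () {
   # Put each parameter on its own line, then keep only the one we want.
   # Anything after a space (like wget's " [following]") is dropped.
   printf '%s\n' "$1" | tr '?&' '\n\n' | \
      sed -n "s/^${2}=\([^ ]*\).*/\1/p" | head -n 1
}

url='http://forecast.weather.gov/MapClick.php?lat=42.96&lon=-87.9'
echo "lat=$(url_param "$url" lat)  lon=$(url_param "$url" lon)"
```

If a parameter is missing, the function prints nothing, so the caller can test for an empty string instead of getting the wrong field.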

Let's try it with a City, ST pair. Replace all spaces ' ' with '+', and the comma is important!

$ wget -O - -S --spider -q http://forecast.weather.gov/zipcity.php?inputstring=San+Francisco,+CA
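
Doing that space-to-plus replacement by hand gets old quickly. This is a minimal sketch of doing it in the script instead. The function names encode_place and zipcity_url are just my own labels, and it only handles spaces, not other characters that would need URL-encoding:

```sh
# Hypothetical helper: trim the ends and turn runs of spaces into '+'.
encode_place () {
   printf '%s\n' "$1" | sed -e 's/^ *//' -e 's/ *$//' -e 's/  */+/g'
}

# Build the lookup URL for a "City, ST" pair, zipcode, or ICAO code.
zipcity_url () {
   printf 'http://forecast.weather.gov/zipcity.php?inputstring=%s\n' \
      "$(encode_place "$1")"
}

zipcity_url "San Francisco, CA"
```

The result can be handed straight to wget, for example: wget -O - -S --spider -q "$(zipcity_url "San Francisco, CA")"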


Finally, the same one-line form works with zip codes:

$ wget -O - -S --spider -q http://forecast.weather.gov/zipcity.php?inputstring=43210

Alternative: This is a small script that estimates location, or accepts a manually-entered zipcode for a location. If run during network connection (by Upstart or by the /etc/network/if-up.d/ directory) it will determine an approximate Latitude and Longitude (within a zip code or two, perhaps)...close enough for weather. This assumes, of course, that GPS is not available, and that the system is not travelling far while online.

#!/bin/sh
# Usage $ script zipcode

strip_tags () { sed -e 's/<[^>]*>//g'; }

# Determine latitude and longitude from USA zipcode using the 
# Weather Service's National Digital Forecast Database (NDFD) 
manual_zipcode () {   
   xml_location=$(wget -q -O - "http://graphical.weather.gov/xml/sample_products/browser_interface/ndfdXMLclient.php?listZipCodeList=${1}")
   lat=$(echo $xml_location | strip_tags | cut -d',' -f1)
   lon=$(echo $xml_location | strip_tags | cut -d',' -f2)
   zipcode=$1;}

# Try to get a close lat/lon using the IP address
ip_lookup () { ip_location=$(wget -q -O - http://ipinfodb.com )
   lat=$(echo "$ip_location" | grep "li>Latitude" | cut -d':' -f2 | cut -c2-8)
   lon=$(echo "$ip_location" | grep "li>Longitude" | cut -d':' -f2 | cut -c2-8)
   zipcode=$(echo "$ip_location" | grep "li>Zip or postal code" | cut -d':' -f2 | cut -c2-6 )
   echo "Estimating location as zipcode ${zipcode}";}

# Test that a Zip Code was included in the command.
if [ "$(echo $1 | wc -c)" -eq 6 ]; then
   manual_zipcode $1
   # Test that the manual zipcode is valid.
   if [ $(echo "$lat" | wc -c) -eq 1 ]; then
      echo "$1 is not a valid US zipcode. Trying to calculate based on IP address..."
      ip_lookup
   fi

else
   ip_lookup
fi

echo "Zip code $zipcode is located at approx. latitude $lat and longitude $lon"

This is just a beginning. It doesn't include Gnome's (or another desktop environment's) geolocation daemon, nor Ubuntu's geolocation IP lookup service, nor caching points to prevent repeat lookups, nor associating networks (or other events) with locations.
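
To give an idea of what the caching piece might look like, here's a rough sketch. Everything about it is an assumption on my part: the cache file location, the one-record-per-line "zipcode latitude longitude" format, and the function names cache_get and cache_put.

```sh
# Hypothetical cache file: one "zipcode latitude longitude" record per line.
location_cache="${location_cache:-${HOME}/.cache/weather-locations}"

# Print "lat lon" for a cached zipcode. Returns non-zero if it is not cached.
cache_get () {
   [ -f "$location_cache" ] || return 1
   # Compare as strings so a leading zero in the zipcode still matters.
   awk -v zip="$1" '($1 "") == (zip "") { print $2, $3; found = 1; exit }
                    END { exit !found }' "$location_cache"
}

# Remember a zipcode's coordinates, unless they are already cached.
cache_put () {
   mkdir -p "$(dirname "$location_cache")"
   cache_get "$1" > /dev/null || \
      printf '%s %s %s\n' "$1" "$2" "$3" >> "$location_cache"
}
```

The idea is to call cache_get first, and only go out to the network (and then cache_put the answer) when it fails. That keeps repeat lookups off the NWS servers.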



2) National Weather Service location-based elements.

NWS has many types of feeds, but they are all based on three elements: Current Observations are based on the local Reporting Station. Radar images are based on the local Radar Location. Forecasts and watches/warnings/alerts are based on the State Zone.

There is no simple way to grab those three elements (Reporting Station, Radar Location, Zone), but they are built into the forecast pages, so I wrote a web scraper to figure them out from lat/lon.

#!/bin/sh
# Usage $ script latitude longitude

# Pull a USA National Weather Service forecast page using lat and lon, 
# and scrape weather station, radar, and zone information.
web_page=$(wget -q -O - "http://forecast.weather.gov/MapClick.php?lat=${1}&lon=${2}")

Station=$(echo "$web_page" | \
          grep 'div class="current-conditions-location">' | \
          cut -d'(' -f2 | cut -d')' -f1 ) 

Station_Location=$(echo "$web_page" | \
                   grep 'div class="current-conditions-location">' | \
                   cut -d'>' -f2 | cut -d'(' -f1 ) 

Radar=$(echo "$web_page" | \
        grep 'div class="div-full">.*class="radar-thumb"' | \
        cut -d'/' -f8 | cut -d'_' -f1 )

radar_web_page="http://radar.weather.gov/radar.php?rid="
radar_1=$(echo $Radar | tr '[:upper:]' '[:lower:]')
Radar_Location=$(wget -q -O - "${radar_web_page}${radar_1}" | \
                 grep "title>" | \
                 cut -d' ' -f5- | cut -d'<' -f1)

Zone=$(echo "$web_page" | \
       grep 'a href="obslocal.*>More Local Wx' | \
       cut -d'=' -f3 | cut -d'&' -f1)

echo "This location is in Weather Service zone $Zone"
echo "The closest weather station is $Station at $Station_Location"
echo "The closest radar is $Radar at $Radar_Location"



3) Current Conditions

NWS takes current conditions at least once each hour. Each reading is released as METAR reports (both raw and decoded), and non-METAR reports.

Raw METAR reports look like this:

$ wget -q -O - http://weather.noaa.gov/pub/data/observations/metar/stations/KMKE.TXT
2012/09/04 03:52
KMKE 040352Z 00000KT 10SM BKN140 BKN250 25/20 A2990 RMK AO2 SLP119 T02500200

Raw METAR can be tough to parse - lots of brevity codes to expand, and the number of fields can be variable. For example, if there are multiple layers of clouds, each gets reported.
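
One way around the variable field count is to recognize each group by its shape instead of its position. This is only a sketch under my own assumptions: the function name metar_summary is made up, it looks only at cloud layers and the temperature/dewpoint group, and it ignores everything after RMK.

```sh
# Hypothetical helper: read raw METAR text on stdin and summarize the
# temperature/dewpoint group and however many cloud layers were reported.
metar_summary () {
   # Only lines whose second field looks like a time group (040352Z).
   awk '$2 ~ /^[0-9]+Z$/ {
      layers = ""; temps = ""
      for (i = 3; i <= NF; i++) {
         if ($i == "RMK") break                   # remarks section; stop
         if ($i ~ /^(FEW|SCT|BKN|OVC)[0-9][0-9][0-9]/) layers = layers " " $i
         if ($i ~ /^M?[0-9][0-9]\/(M?[0-9][0-9])?$/) temps = $i
      }
      split(temps, t, "/")
      sub(/^M/, "-", t[1]); sub(/^M/, "-", t[2])  # M means minus in METAR
      printf "temp_c=%d dewpoint_c=%d clouds=%s\n", t[1], t[2], substr(layers, 2)
   }'
}
```

Piping the KMKE report above through it (wget -q -O - ... | metar_summary) prints temp_c=25 dewpoint_c=20 clouds=BKN140 BKN250. The date line at the top of the file is skipped because it has no time group.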

Here's the same observation in decoded format. Note that the raw format is included on the next-to-last line:

$ wget -q -O - http://weather.noaa.gov/pub/data/observations/metar/decoded/KMKE.TXT
GEN MITCHELL INTERNATIONAL  AIRPORT, WI, United States (KMKE) 42-57N 87-54W 206M
Sep 03, 2012 - 11:52 PM EDT / 2012.09.04 0352 UTC
Wind: Calm:0
Visibility: 10 mile(s):0
Sky conditions: mostly cloudy
Temperature: 77.0 F (25.0 C)
Dew Point: 68.0 F (20.0 C)
Relative Humidity: 73%
Pressure (altimeter): 29.9 in. Hg (1012 hPa)
ob: KMKE 040352Z 00000KT 10SM BKN140 BKN250 25/20 A2990 RMK AO2 SLP119 T02500200
cycle: 4

Raw and decoded METAR reports are available from NWS for all international METAR stations, too. For example, try it for station HUEN (Entebbe airport, Uganda).

Finally, METAR reports are available in XML, too:

$ wget -q -O - "http://www.aviationweather.gov/adds/dataserver_current/httpparam?datasource=metars&requesttype=retrieve&format=xml&hoursBeforeNow=1&stationString=KMKE"
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <request_index>2288456</request_index>
  <data_source name="metars" />
  <request type="retrieve" />
  <errors />
  <warnings />
  <time_taken_ms>4</time_taken_ms>
  <data num_results="2">
    <METAR>
      <raw_text>KMKE 072052Z 36009KT 8SM -RA SCT023 BKN029 OVC047 18/15 A2982 RMK AO2 RAB31 SLP094 P0003 60003 T01780150 53005</raw_text>
      <station_id>KMKE</station_id>
      <observation_time>2012-09-07T20:52:00Z</observation_time>
      <latitude>42.95</latitude>
      <longitude>-87.9</longitude>
      <temp_c>17.8</temp_c>
      <dewpoint_c>15.0</dewpoint_c>
      <wind_dir_degrees>360</wind_dir_degrees>
      <wind_speed_kt>9</wind_speed_kt>
      <visibility_statute_mi>8.0</visibility_statute_mi>
      <altim_in_hg>29.819881</altim_in_hg>
      <sea_level_pressure_mb>1009.4</sea_level_pressure_mb>
      <quality_control_flags>
        <auto_station>TRUE</auto_station>
      </quality_control_flags>
      <wx_string>-RA</wx_string>
      <sky_condition sky_cover="SCT" cloud_base_ft_agl="2300" />
      <sky_condition sky_cover="BKN" cloud_base_ft_agl="2900" />
      <sky_condition sky_cover="OVC" cloud_base_ft_agl="4700" />
      <flight_category>MVFR</flight_category>
      <three_hr_pressure_tendency_mb>0.5</three_hr_pressure_tendency_mb>
      <precip_in>0.03</precip_in>
      <pcp3hr_in>0.03</pcp3hr_in>
      <metar_type>METAR</metar_type>
      <elevation_m>206.0</elevation_m>
    </METAR>
  </data>
</response>
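
The XML gives wind as degrees and knots, which isn't how I want to show it on the desktop. Here's a quick sketch of converting both. The function names are mine, and the 16-point compass is my own choice (NWS's own text products may round to 8 points instead):

```sh
# Hypothetical helper: convert a wind direction in degrees to a
# 16-point compass abbreviation.
degrees_to_compass () {
   awk -v deg="$1" 'BEGIN {
      split("N NNE NE ENE E ESE SE SSE S SSW SW WSW W WNW NW NNW", point, " ")
      # Each point covers 22.5 degrees, centered on its heading.
      print point[int((deg % 360) / 22.5 + 0.5) % 16 + 1]
   }'
}

# Hypothetical helper: convert knots to miles per hour (1 kt = 1.15078 mph).
knots_to_mph () {
   awk -v kt="$1" 'BEGIN { printf "%.1f\n", kt * 1.15078 }'
}

echo "Wind: $(degrees_to_compass 360) at $(knots_to_mph 9) MPH"
```

For the sample above (360 degrees at 9 knots) this prints "Wind: N at 10.4 MPH".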

Non-METAR reports are somewhat similar to the METAR XML, but there are some important differences. Non-METAR looks like this:

$ wget -q -O - http://w1.weather.gov/xml/current_obs/KMKE.xml
<current_observation version="1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://www.weather.gov/view/current_observation.xsd">
 <credit>NOAA's National Weather Service</credit>
 <credit_url>http://weather.gov/</credit_url>
 <image>
  <url>http://weather.gov/images/xml_logo.gif</url>
  <title>NOAA's National Weather Service</title>
  <link>http://weather.gov</link>
 </image>
 <suggested_pickup>15 minutes after the hour</suggested_pickup>
 <suggested_pickup_period>60</suggested_pickup_period>
 <location>Milwaukee, General Mitchell International Airport, WI</location>
 <station_id>KMKE</station_id>
 <latitude>42.96</latitude>
 <longitude>-87.9</longitude>
 <observation_time>Last Updated on Sep 3 2012, 9:52 pm CDT</observation_time>
 <observation_time_rfc822>Mon, 03 Sep 2012 21:52:00 -0500</observation_time_rfc822>
 <weather>Mostly Cloudy</weather>
 <temperature_string>75.0 F (23.9 C)</temperature_string>
 <temp_f>75.0</temp_f>
 <temp_c>23.9</temp_c>
 <relative_humidity>79</relative_humidity>
 <wind_string>Southeast at 4.6 MPH (4 KT)</wind_string>
 <wind_dir>Southeast</wind_dir>
 <wind_degrees>150</wind_degrees>
 <wind_mph>4.6</wind_mph>
 <wind_kt>4</wind_kt>
 <pressure_string>1011.4 mb</pressure_string>
 <pressure_mb>1011.4</pressure_mb>
 <pressure_in>29.88</pressure_in>
 <dewpoint_string>68.0 F (20.0 C)</dewpoint_string>
 <dewpoint_f>68.0</dewpoint_f>
 <dewpoint_c>20.0</dewpoint_c>
 <visibility_mi>10.00</visibility_mi>
  <icon_url_base>http://w1.weather.gov/images/fcicons/</icon_url_base>
 <two_day_history_url>http://www.weather.gov/data/obhistory/KMKE.html</two_day_history_url>
 <icon_url_name>nbkn.jpg</icon_url_name>
 <ob_url>http://www.nws.noaa.gov/data/METAR/KMKE.1.txt</ob_url>
 <disclaimer_url>http://weather.gov/disclaimer.html</disclaimer_url>
 <copyright_url>http://weather.gov/disclaimer.html</copyright_url>
 <privacy_policy_url>http://weather.gov/notice.html</privacy_policy_url>
</current_observation>

There's a lot of good stuff here - how often the data is refreshed, and when each hour to do so, all the current observations in a multitude of formats, and even a suggested icon URL and history. However, these observations tend to update about 10-15 minutes later than METAR reports.
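
Since every value sits in its own simple element, pulling one out doesn't need a real XML parser. Here's a minimal sketch, assuming each element stays on a single line the way it does in the listing above. The name xml_field is my own:

```sh
# Hypothetical helper: print the text of the first <NAME>...</NAME>
# element found on stdin. Assumes the element is on one line.
xml_field () {
   sed -n "s|.*<${1}>\(.*\)</${1}>.*|\1|p" | head -n 1
}

# Example with two lines copied from the listing above.
printf '%s\n' ' <suggested_pickup_period>60</suggested_pickup_period>' \
              ' <temp_f>75.0</temp_f>' | xml_field temp_f
```

Against the live feed it would be something like wget -q -O - http://w1.weather.gov/xml/current_obs/KMKE.xml | xml_field suggested_pickup_period, which is a handy way to let the server tell the script how many minutes to wait between refreshes.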

Yeah, the NWS uses (at least) two different XML servers, plus an http server to serve the same current condition observations in (at least) four different formats. I don't understand why, either.

I use the following in my Current Conditions display: Station, Time, Sky Conditions, Temp, Humidity, Wind Direction, Wind Speed. So my script below handles only them.

#!/bin/sh
# Usage $ script station [ metar | nonmetar ]
# $1 is the Station Code (KMKE)
# $2 is the metar/nonmetar flag

strip_tags () { sed -e 's/<[^>]*>//g'; }

# The information is the same, but formatted differently.
case $2 in
   metar)
      file=$(wget -q -O - http://weather.noaa.gov/pub/data/observations/metar/decoded/${1}.TXT)
      Observation_zulu=$(echo $file | grep -o "${1}).* UTC" | cut -d' ' -f14-15)
      Observation_Time=$(date -d "$Observation_zulu" +%H:%M)
      Sky_Conditions=$(echo $file | grep -o "Sky conditions: .* Temperature" | \
                       cut -d' ' -f3- | cut -d'T' -f1)
      Temperature="$(echo $file | grep -o "Temperature: .* F" | cut -d' ' -f2)F"
      Humidity=$(echo $file | grep -o "Humidity: .*%" | cut -d' ' -f2) 
      Wind_Direction=$(echo $file | grep -o "Wind: .* degrees)" | cut -d' ' -f4)
      Wind_Speed=$(echo $file | grep -o "degrees) .* MPH" | cut -d' ' -f3-4);;

   nonmetar)
      file=$(wget -q -O - http://w1.weather.gov/xml/current_obs/${1}.xml)
      Observation_Time=$(echo $file | grep -o "<observation_time>.*</observation_time>" | cut -d' ' -f7)
      Sky_Conditions=$(echo $file | grep -o "<weather>.*</weather>" | strip_tags)
      Temperature="$(echo $file | grep -o '<temp_f>.*</temp_f>' | strip_tags)F"
      Humidity="$(echo $file | grep -o '<relative_humidity>.*</relative_humidity>' | strip_tags)%"
      Wind_Direction=$(echo $file | grep -o '<wind_dir>.*</wind_dir>' | strip_tags)
      Wind_Speed="$(echo $file | grep -o '<wind_mph>.*</wind_mph>' | strip_tags) MPH";;
esac

echo "Observations at ${1} as of ${Observation_Time}"
Spacer='   '
echo "${Sky_Conditions} ${Spacer} ${Temperature} ${Spacer} ${Humidity} ${Spacer} ${Wind_Direction} ${Wind_Speed}"

The output is very close, but not exactly identical. See the example below - both are based on the same observation, but the humidity and wind information are slightly different. That's not my format error...the data is coming from NWS that way.

$ sh current-conditions KMKE metar
Observations at KMKE as of 11:52
partly cloudy      81.0F     71%     ESE 9 MPH

$ sh current-conditions KMKE nonmetar
Observations at KMKE as of 11:52
Partly Cloudy     81.0F     72%     East 9.2 MPH

Incidentally, this shows that METAR and non-METAR data use the same observation, so there's no improvement using both data sources. METAR updates sooner and has smaller files, but non-METAR is easier to parse.



4) Forecasts 

NWS has three sources for forecasts: Scraping the web pages, downloading normal text, and the National Digital Forecast Database xml server. Scraping the web page is easy, and scraping techniques are well beyond the scope of what I want to talk about. But here's an example of scraping a forecast using lat/lon:

$ wget -q -O - "http://forecast.weather.gov/MapClick.php?lat=43.0633&lon=-87.9666" | grep -A 10 '"point-forecast-7-day"'
<ul class="point-forecast-7-day">
<li class="row-odd"><span class="label">This Afternoon</span> A 20 percent chance of showers and thunderstorms.  Mostly sunny, with a high near 85. Southeast wind around 5 mph. </li>
<li class="row-even"><span class="label">Tonight</span> A 30 percent chance of showers and thunderstorms after 1am.  Mostly cloudy, with a low around 69. Calm wind. </li>
<li class="row-odd"><span class="label">Wednesday</span> Showers and thunderstorms likely.  Mostly cloudy, with a high near 83. Southwest wind 5 to 10 mph.  Chance of precipitation is 70%. New rainfall amounts between a quarter and half of an inch possible. </li>
<li class="row-even"><span class="label">Wednesday Night</span> Mostly clear, with a low around 59. Northwest wind 5 to 10 mph. </li>
<li class="row-odd"><span class="label">Thursday</span> Sunny, with a high near 78. Northwest wind 5 to 10 mph. </li>
<li class="row-even"><span class="label">Thursday Night</span> A 20 percent chance of showers.  Partly cloudy, with a low around 60. West wind around 5 mph. </li>
<li class="row-odd"><span class="label">Friday</span> A 30 percent chance of showers.  Mostly cloudy, with a high near 73. West wind around 5 mph becoming calm  in the afternoon. </li>
<li class="row-even"><span class="label">Friday Night</span> A 30 percent chance of showers.  Mostly cloudy, with a low around 57. Calm wind becoming north around 5 mph after midnight. </li>
<li class="row-odd"><span class="label">Saturday</span> A 20 percent chance of showers.  Mostly sunny, with a high near 69.</li>
<li class="row-even"><span class="label">Saturday Night</span> Partly cloudy, with a low around 56.</li>

The same information is available in a much smaller file by downloading the forecast text for the zone. The text isn't as pretty, but extra information (like the release time) is included. Reformatting to lower case can be a bit of sed work:

$ wget -q -O - http://weather.noaa.gov/pub/data/forecasts/zone/wi/wiz066.txt
Expires:201209042115;;043092
FPUS53 KMKX 041354 AAA
ZFPMKX
SOUTH-CENTRAL AND SOUTHEAST WISCONSIN ZONE FORECAST...UPDATED
NATIONAL WEATHER SERVICE MILWAUKEE/SULLIVAN WI
854 AM CDT TUE SEP 4 2012

WIZ066-042115-
MILWAUKEE-
INCLUDING THE CITIES OF...MILWAUKEE
854 AM CDT TUE SEP 4 2012
.REST OF TODAY...PARTLY SUNNY. A 20 PERCENT CHANCE OF
THUNDERSTORMS IN THE AFTERNOON. HIGHS IN THE MID 80S. NORTHEAST
WINDS UP TO 5 MPH SHIFTING TO THE SOUTHEAST IN THE AFTERNOON. 
.TONIGHT...PARTLY CLOUDY UNTIL EARLY MORNING THEN BECOMING MOSTLY
CLOUDY. A 30 PERCENT CHANCE OF THUNDERSTORMS AFTER MIDNIGHT. LOWS
IN THE UPPER 60S. SOUTHWEST WINDS UP TO 5 MPH. 
.WEDNESDAY...THUNDERSTORMS LIKELY. HIGHS IN THE MID 80S.
SOUTHWEST WINDS UP TO 10 MPH. CHANCE OF THUNDERSTORMS 70 PERCENT.
.WEDNESDAY NIGHT...PARTLY CLOUDY THROUGH AROUND MIDNIGHT THEN
BECOMING CLEAR. LOWS AROUND 60. NORTHWEST WINDS 5 TO 10 MPH. 
.THURSDAY...SUNNY. HIGHS IN THE UPPER 70S. NORTHWEST WINDS 5 TO
15 MPH. 
.THURSDAY NIGHT...PARTLY CLOUDY WITH A 20 PERCENT CHANCE OF LIGHT
RAIN SHOWERS. LOWS AROUND 60. 
.FRIDAY...MOSTLY CLOUDY WITH A 30 PERCENT CHANCE OF LIGHT RAIN
SHOWERS. HIGHS IN THE LOWER 70S. 
.FRIDAY NIGHT...MOSTLY CLOUDY THROUGH AROUND MIDNIGHT THEN
BECOMING PARTLY CLOUDY. A 30 PERCENT CHANCE OF LIGHT RAIN
SHOWERS. LOWS IN THE UPPER 50S. 
.SATURDAY...PARTLY SUNNY WITH A 20 PERCENT CHANCE OF LIGHT RAIN
SHOWERS. HIGHS IN THE UPPER 60S. 
.SATURDAY NIGHT...PARTLY CLOUDY WITH A 20 PERCENT CHANCE OF LIGHT
RAIN SHOWERS. LOWS IN THE MID 50S. 
.SUNDAY...MOSTLY SUNNY. HIGHS AROUND 70. 
.SUNDAY NIGHT...PARTLY CLOUDY. LOWS IN THE UPPER 50S. 
.MONDAY...MOSTLY SUNNY. HIGHS IN THE LOWER 70S. 
$$

Finally, the National Digital Forecast Database is a server that organizes many discrete bits of forecast data, like the high temperature for a specific 12-hour period, or the probability of precipitation. It's the data without all the words. Each forecast element applies to a 5km square and a 12-hour period. For example, try looking at this 7-day snapshot of a specific lat/lon (it's too big to reproduce here):

$ wget -q -O - "http://graphical.weather.gov/xml/SOAP_server/ndfdXMLclient.php?whichClient=NDFDgen&lat=38.99&lon=-77.01&product=glance&begin=2004-01-01T00%3A00%3A00&end=2016-09-04T00%3A00%3A00&Unit=e&maxt=maxt&pop12=pop12&sky=sky&wx=wx&wwa=wwa"
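
That URL is a mouthful to type, so here's a sketch of a function that assembles it. The name ndfd_url is my own. The parameter list is simply copied from the example above rather than taken from any complete reference, and the only encoding it does is turning ':' into %3A in the two timestamps.

```sh
# Hypothetical helper: build an NDFD "glance" request URL.
# Usage: ndfd_url LAT LON BEGIN END   (times like 2004-01-01T00:00:00)
ndfd_url () {
   ndfd_base="http://graphical.weather.gov/xml/SOAP_server/ndfdXMLclient.php"
   ndfd_begin=$(printf '%s' "$3" | sed 's/:/%3A/g')
   ndfd_end=$(printf '%s' "$4" | sed 's/:/%3A/g')
   printf '%s?whichClient=NDFDgen&lat=%s&lon=%s&product=glance&begin=%s&end=%s&Unit=e&maxt=maxt&pop12=pop12&sky=sky&wx=wx&wwa=wwa\n' \
      "$ndfd_base" "$1" "$2" "$ndfd_begin" "$ndfd_end"
}

ndfd_url 38.99 -77.01 2004-01-01T00:00:00 2016-09-04T00:00:00
```

With those arguments it prints the same URL as the wget example, so the output can be dropped straight into wget -q -O - "$(ndfd_url ...)".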

All the elements of the URL are described here. It's a really amazing system, but it's not oriented toward a casual end user. Still, some pieces may be very useful, like future watches/warnings, the 12-hour probability of precipitation, expected highs and lows, etc.

Here's an example script to download and reformat the straight text for a zone forecast, followed by a sample run. All the reformatting has been moved into functions. You can see how it's not difficult to separate and reformat the various forecast elements.

#!/bin/sh
# Usage $ script zone
# Example: $ script wiz066
# $1 is the zone (wiz066)

# Starts each forecast with a "@"
# Replace ". ."  with "@" This separates the second and subsequent forecasts.
# Replace the single " ."  with "@" at the beginning of the first forecast.
# Trim the "$$" at the end of the message. 
separate_forecasts () { sed -e 's|\. \.|@|g' \
                            -e 's| \.|@|g' \
                            -e 's| \$\$||g'; }

# Make uppercase into lowercase
# Then recapitalize the first letter in each paragraph.
# Then recapitalize the first letter in each new sentence.
# Then substitute a ":" for the "..." and capitalize the first letter.
lowercase () { tr '[:upper:]' '[:lower:]' | \
               sed -e 's|\(^[a-z]\)|\U\1|g' \
                   -e 's|\(\.\ [a-z]\)|\U\1|g' \
                   -e 's|\.\.\.\([a-z]\)|: \U\1|g'; }

State=$(echo $1 | cut -c1-2)
raw_forecast=$(wget -q -O - http://weather.noaa.gov/pub/data/forecasts/zone/${State}/${1}.txt)
for period in 1 2 3 4 5 6 7 8 9 10 11 12 13 14; do
   echo ""
   if [ ${period} -eq 1 ]; then
      header=$(echo $raw_forecast | separate_forecasts | cut -d'@' -f${period})
      header_size=$(echo $header | wc -w)
      header_zulu="$(echo $header | cut -d' ' -f4 | cut -c3-4):$(echo $header | cut -d' ' -f4 | cut -c5-6)Z"
      issue_time="$(date -d "$header_zulu" +%H:%M)"
      expire_time="$(echo $header | cut -d':' -f2 | cut -c9-10):$(echo $header | cut -d':' -f2 | cut -c11-12)"
      echo "Issue Time ${issue_time}, Expires ${expire_time}"
   else
      echo $raw_forecast | separate_forecasts | cut -d'@' -f${period} | lowercase
   fi
done
echo ""

And when you run it, it looks like:
 
$ sh forecast wiz066

Issue Time 15:35, Expires 09:15

Tonight: Partly cloudy until early morning then becoming mostly cloudy. Chance of thunderstorms in the evening: Then slight chance of thunderstorms in the late evening and overnight. Lows in the upper 60s. Southwest winds up to 5 mph. Chance of thunderstorms 30 percent

Wednesday: Thunderstorms likely. Highs in the lower 80s. Southwest winds 5 to 10 mph. Chance of thunderstorms 70 percent

Wednesday night: Partly cloudy. Lows in the lower 60s. Northwest winds 5 to 10 mph

Thursday: Sunny. Highs in the upper 70s. Northwest winds up to 10 mph

Thursday night: Partly cloudy through around midnight: Then mostly cloudy with a 20 percent chance of light rain showers after midnight. Lows in the upper 50s. West winds up to 5 mph

Friday: Mostly cloudy with chance of light rain showers and slight chance of thunderstorms. Highs in the lower 70s. Chance of precipitation 30 percent

Friday night: Mostly cloudy with a 40 percent chance of light rain showers. Lows in the upper 50s

Saturday: Mostly sunny with a 20 percent chance of light rain showers. Highs in the upper 60s

Saturday night: Mostly clear. Lows in the mid 50s

Sunday: Sunny. Highs in the lower 70s

Sunday night: Mostly clear. Lows in the upper 50s

Monday: Sunny. Highs in the lower 70s

Monday night: Partly cloudy with a 20 percent chance of light rain showers. Lows in the upper 50s
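
The script above walks through the periods by position. Sometimes I only want one of them, by name. Here's a sketch that works on the raw zone text, assuming the layout shown earlier (each period starts a line with ".NAME..." and the product ends with "$$"). The function name forecast_period is my own:

```sh
# Hypothetical helper: read raw zone forecast text on stdin and print
# the text of one named period, joined onto a single line.
# Usage: forecast_period "THURSDAY NIGHT"
forecast_period () {
   awk -v want="$1" '
      /^\$\$/ { exit }                       # end of product
      /^\./   { inside = (index($0, "." want "...") == 1) }
      inside  { text = text (text == "" ? "" : " ") $0 }
      END {
         sub(/^\.[^.]*\.\.\./, "", text)     # drop the ".NAME..." label
         sub(/ +$/, "", text)                # drop trailing spaces
         print text
      }'
}
```

For example, wget -q -O - http://weather.noaa.gov/pub/data/forecasts/zone/wi/wiz066.txt | forecast_period TONIGHT would print just tonight's forecast, still in upper case. Asking for THURSDAY does not accidentally pick up THURSDAY NIGHT, because the match includes the "..." after the name.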
 


5) Radar Images

NWS radar stations are evenly spread across the United States, to form a (more-or-less) blanket of coverage in the lower 48 states, plus spots in Alaska, Hawaii, Guam, Puerto Rico, and others.

Radars have their own station codes that may not correspond with local reporting stations. Each radar takes about 10 seconds to complete a 360 degree sweep. NWS radar images are in .png format, and are a composite of an entire sweep. The images are subject to ground clutter, humidity, smoke (and other obscurants). NWS does not clean up clutter or obscurants in the images. An animated (time-lapse) radar is just a series of images. NWS releases updated images irregularly...about every 5-10 minutes.

There are three scales: local, regional composite, and national composite. The larger ones are just the locals quilted together. For our purpose, I think the locals are adequate.

There are two ways to download radar images - the Lite method and the Ridge method. Both use the same data, create the same size 600x550 pixel image, and can be used in animated loops. Lite images have no options - you get a current radar image in .png format (27k) with preset options that you cannot change. It's a very easy way to get a fast image. Here's a Lite image example of base reflectivity:

$ wget http://radar.weather.gov/lite/N0R/MKX_0.png
 

In the Lite URL, the end is /N0R/MKX_0.png: N0R is the radar image type (see here for the list of N** options), MKX is the radar station, and 0.png is the latest image (1.png is the next oldest).
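
Since the frame number is just part of the file name, the list of URLs for an animated loop is easy to generate. This is a sketch with a made-up function name, radar_lite_urls. I haven't checked how many older frames the server actually keeps, so the frame count is the caller's guess:

```sh
# Hypothetical helper: print Lite image URLs, newest first.
# Usage: radar_lite_urls STATION [PRODUCT] [FRAMES]
radar_lite_urls () {
   lite_station="$1"
   lite_product="${2:-N0R}"      # default to base reflectivity
   lite_frames="${3:-1}"         # default to just the latest image
   lite_i=0
   while [ "$lite_i" -lt "$lite_frames" ]; do
      printf 'http://radar.weather.gov/lite/%s/%s_%s.png\n' \
         "$lite_product" "$lite_station" "$lite_i"
      lite_i=$((lite_i + 1))
   done
}

radar_lite_urls MKX N0R 3
```

That prints the URLs for MKX_0.png, MKX_1.png, and MKX_2.png, one per line, ready to feed to wget -i - for downloading.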

For more customized maps, the Ridge method provides a series of hideable overlays (base map, radar image, title and legend, county/state lines, major highways, etc). For an example of Ridge in action, see this page. When downloading images, it's a little more complex - each overlay must be downloaded separately, then combined on your system (not at NWS). Here's an example of a script that caches the overlays that don't change (counties), and compares the server update time with its own cached version to avoid needless updates.

#!/bin/sh
# Usage $ script radar_station [clear-radar-cache]
# Example: $ script MKX
# Example: $ script MKX clear-radar-cache

#$1 is the radar station (MKX)
#$2 is a flag to clear the cache of radar overlays. Most overlays don't 
#   change, and don't need to be re-downloaded every few minutes.

cache="/tmp/radar-cache"

# Test for the clear-cache-flag. If so, delete the entire cache and exit.
if [ "$2" = "clear-radar-cache" ]; then
   echo "Clearing cache..."
   rm -rf "${cache}"
   exit 0
fi

# Test that the radar cache exists. If not, create it.
[ -d ${cache} ] || mkdir ${cache}

# Test for each of the overlays for the N0R (Base Reflectivity) radar image.
# If the overlay is not there, download it.
[ -f ${cache}/${1}_Topo_Short.jpg ] || wget -q -P ${cache}/ http://radar.weather.gov/ridge/Overlays/Topo/Short/${1}_Topo_Short.jpg
[ -f ${cache}/${1}_County_Short.gif ] || wget -q -P ${cache}/ http://radar.weather.gov/ridge/Overlays/County/Short/${1}_County_Short.gif
[ -f ${cache}/${1}_Highways_Short.gif ] || wget -q -P ${cache}/ http://radar.weather.gov/ridge/Overlays/Highways/Short/${1}_Highways_Short.gif
[ -f ${cache}/${1}_City_Short.gif ] || wget -q -P ${cache}/ http://radar.weather.gov/ridge/Overlays/Cities/Short/${1}_City_Short.gif

# Test for the radar timestamp file. Read it. If it doesn't exist, create it.
[ -f ${cache}/radar_timestamp ] || echo "111111" > ${cache}/radar_timestamp
latest_local=$(cat ${cache}/radar_timestamp)

# Get the latest radar time from the server and compare it to the latest known.
# This avoids downloading the same image repeatedly.
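# wget -S prints the headers on stderr, and the timestamp itself contains
# colons, so redirect stderr and keep everything after the first colon.
radar_time_string=$(wget -S --spider http://radar.weather.gov/ridge/RadarImg/N0R/${1}_N0R_0.gif 2>&1 | \
                    grep "Last-Modified:" | cut -d':' -f2-)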
radar_time=$(date -d "$radar_time_string" +%s)
echo "Current image is ${radar_time}, cached is ${latest_local}"

# If the local timestamp is different from the server,
# Download a new image and update the timestamp file.
# Then create a new final radar-image.jpg file.
if [ "${radar_time}" -ne "${latest_local}" ]; then
   echo "Downloading updated image..."
   echo "${radar_time}" > ${cache}/radar_timestamp

   # Delete the old radar, warning, and legend layers, and replace them.
   [ -f ${cache}/${1}_N0R_0.gif ] && rm ${cache}/${1}_N0R_0.gif
   wget -q -P ${cache}/ http://radar.weather.gov/ridge/RadarImg/N0R/${1}_N0R_0.gif
   [ -f ${cache}/${1}_Warnings_0.gif ] && rm ${cache}/${1}_Warnings_0.gif
   wget -q -P ${cache}/ http://radar.weather.gov/ridge/Warnings/Short/${1}_Warnings_0.gif
   [ -f ${cache}/${1}_N0R_Legend_0.gif ] && rm ${cache}/${1}_N0R_Legend_0.gif
   wget -q -P ${cache}/ http://radar.weather.gov/ridge/Legend/N0R/${1}_N0R_Legend_0.gif


   # Delete the old final radar-image. We are about to replace it.
   [ -f ${cache}/radar-image.jpg ] && rm ${cache}/radar-image.jpg

   # Create the final radar-image using imagemagick.
   composite -compose atop ${cache}/${1}_N0R_0.gif ${cache}/${1}_Topo_Short.jpg ${cache}/radar-image.jpg
   composite -compose atop ${cache}/${1}_County_Short.gif ${cache}/radar-image.jpg ${cache}/radar-image.jpg
   composite -compose atop ${cache}/${1}_Highways_Short.gif ${cache}/radar-image.jpg ${cache}/radar-image.jpg
   composite -compose atop ${cache}/${1}_City_Short.gif ${cache}/radar-image.jpg ${cache}/radar-image.jpg
   composite -compose atop ${cache}/${1}_Warnings_0.gif ${cache}/radar-image.jpg ${cache}/radar-image.jpg
   composite -compose atop ${cache}/${1}_N0R_Legend_0.gif ${cache}/radar-image.jpg ${cache}/radar-image.jpg

   echo "New radar image composite created at ${cache}/radar-image.jpg"
fi
exit 0 

And here's the result of the script using $ sh radar MKX:


Another handy use of imagemagick is to pad the sides or top/bottom of an image, enlarging the canvas without distorting or resizing the original image, to move the image around the desktop instead of leaving it in the center. This is especially handy so a top menu bar doesn't block the date/time title.

For example, this imagemagick command will add a transparent bar 15 pixels high to the top of the image, changing it from 600px wide by 550px tall to 600x565. See these instructions for more on how to use splice.

convert image.jpg -background none -splice 0x15 image.jpg
 


6) Warnings and Alerts

Warnings and Alerts are easily available in Atom (XML) feed format based on zone. For example, here is a zone with two alerts:
 
$ wget -q -O - "http://alerts.weather.gov/cap/wwaatmget.php?x=LAZ061&y=0"

<?xml version = '1.0' encoding = 'UTF-8' standalone = 'yes'?>
<!--
This atom/xml feed is an index to active advisories, watches and warnings 
issued by the National Weather Service.  This index file is not the complete 
Common Alerting Protocol (CAP) alert message.  To obtain the complete CAP 
alert, please follow the links for each entry in this index.  Also note the 
CAP message uses a style sheet to convey the information in a human readable 
format.  Please view the source of the CAP message to see the complete data 
set.  Not all information in the CAP message is contained in this index of 
active alerts.
-->

<feed xmlns:cap="urn:oasis:names:tc:emergency:cap:1.1" xmlns:ha="http://www.alerting.net/namespace/index_1.0" xmlns="http://www.w3.org/2005/Atom">

<!-- TZN = <cdt> -->
<!-- TZO = <-5> -->
<!-- http-date = Wed, 05 Sep 2012 11:49:00 GMT -->
<id>http://alerts.weather.gov/cap/wwaatmget.php?x=LAZ061&y=0</id>
<generator>NWS CAP Server</generator>
<updated>2012-09-05T06:49:00-05:00</updated>
<author>
<name>w-nws.webmaster@noaa.gov</name>
</author>

<title>Current Watches, Warnings and Advisories for Upper Jefferson (LAZ061) Louisiana Issued by the National Weather Service</title>
<link href="http://alerts.weather.gov/cap/wwaatmget.php?x=LAZ061&y=0"></link>

<entry>
<id>http://alerts.weather.gov/cap/wwacapget.php?x=LA124CC366E514.FlashFloodWatch.124CC367BC50LA.LIXFFALIX.706be3299506e82fae0eb45adc4650b7</id>
<updated>2012-09-05T06:49:00-05:00</updated>
<published>2012-09-05T06:49:00-05:00</published>
<author>
<name>w-nws.webmaster@noaa.gov</name>
</author>
<title>Flash Flood Watch issued September 05 at 6:49AM CDT until September 05 at 12:00PM CDT by NWS</title>
<link href="http://alerts.weather.gov/cap/wwacapget.php?x=LA124CC366E514.FlashFloodWatch.124CC367BC50LA.LIXFFALIX.706be3299506e82fae0eb45adc4650b7"></link>
<summary>...FLASH FLOOD WATCH REMAINS IN EFFECT THROUGH 7 AM CDT... .A LARGE CLUSTER OF THUNDERSTORMS WITH VERY HEAVY RAINFALL WAS MOVING OFF THE MISSISSIPPI COAST AND ADVANCING TOWARDS LOWER SOUTHEAST LOUISIANA. ANY HEAVY RAINFALL WILL EXACERBATE ANY ONGOING FLOODING REMAINING FROM ISAAC. ...FLASH FLOOD WATCH IN EFFECT UNTIL NOON CDT TODAY...</summary>
<cap:event>Flash Flood Watch</cap:event>
<cap:effective>2012-09-05T06:49:00-05:00</cap:effective>
<cap:expires>2012-09-05T12:00:00-05:00</cap:expires>
<cap:status>Actual</cap:status>
<cap:msgtype>Alert</cap:msgtype>
<cap:category>Met</cap:category>
<cap:urgency>Expected</cap:urgency>
<cap:severity>Severe</cap:severity>
<cap:certainty>Possible</cap:certainty>
<cap:areadesc>LAZ069; Lower Jefferson; Lower St. Bernard; Orleans; Upper Jefferson; Upper Plaquemines; Upper St. Bernard</cap:areadesc>
<cap:polygon></cap:polygon>
<cap:geocode>
<valuename>FIPS6</valuename>
<value>022051 022071 022075 022087</value>
<valuename>UGC</valuename>
<value>LAZ061 LAZ062 LAZ063 LAZ064 LAZ068 LAZ069 LAZ070</value>
</cap:geocode>
<cap:parameter>
<valuename>VTEC</valuename>
<value>/O.EXB.KLIX.FF.A.0012.120905T1200Z-120905T1700Z/
/00000.0.ER.000000T0000Z.000000T0000Z.000000T0000Z.OO/</value>
</cap:parameter>
</entry>

<entry>
<id>http://alerts.weather.gov/cap/wwacapget.php?x=LA124CC3668F88.HeatAdvisory.124CC3746680LA.LIXNPWLIX.4bafb11f62c10ea1605a5c0d076d64c9</id>
<updated>2012-09-05T04:30:00-05:00</updated>
<published>2012-09-05T04:30:00-05:00</published>
<author>
<name>w-nws.webmaster@noaa.gov</name>
</author>
<title>Heat Advisory issued September 05 at 4:30AM CDT until September 05 at 7:00PM CDT by NWS</title>
<link href="http://alerts.weather.gov/cap/wwacapget.php?x=LA124CC3668F88.HeatAdvisory.124CC3746680LA.LIXNPWLIX.4bafb11f62c10ea1605a5c0d076d64c9"></link>
<summary>...A HEAT ADVISORY REMAINS IN EFFECT FOR AREAS WITHOUT POWER... .THE CUMULATIVE AFFECT OF TYPICALLY HOT AND HUMID CONDITIONS COMBINED WITH THE LACK OF CLIMATE CONTROL DUE TO POWER OUTAGES FROM HURRICANE ISAAC HAVE CREATED A LIFE THREATENING SITUATION. ...HEAT ADVISORY REMAINS IN EFFECT UNTIL 7 PM CDT THIS EVENING... * BASIS...MAXIMUM HEAT INDICES BETWEEN 100 AND 106 IS EXPECTED</summary>
<cap:event>Heat Advisory</cap:event>
<cap:effective>2012-09-05T04:30:00-05:00</cap:effective>
<cap:expires>2012-09-05T19:00:00-05:00</cap:expires>
<cap:status>Actual</cap:status>
<cap:msgtype>Alert</cap:msgtype>
<cap:category>Met</cap:category>
<cap:urgency>Expected</cap:urgency>
<cap:severity>Minor</cap:severity>
<cap:certainty>Likely</cap:certainty>
<cap:areadesc>LAZ069; Lower Jefferson; Lower Lafourche; Lower St. Bernard; Orleans; St. Charles; St. James; St. John The Baptist; Upper Jefferson; Upper Lafourche; Upper Plaquemines; Upper St. Bernard</cap:areadesc>
<cap:polygon></cap:polygon>
<cap:geocode>
<valuename>FIPS6</valuename>
<value>022051 022057 022071 022075 022087 022089 022093 022095</value>
<valuename>UGC</valuename>
<value>LAZ057 LAZ058 LAZ059 LAZ060 LAZ061 LAZ062 LAZ063 LAZ064 LAZ067 LAZ068 LAZ069 LAZ070</value>
</cap:geocode>
<cap:parameter>
<valuename>VTEC</valuename>
<value>/O.CON.KLIX.HT.Y.0004.000000T0000Z-120906T0000Z/</value>
</cap:parameter>
</entry>

Each alert is within its own <entry>, and has <cap:severity>, <published>, <updated>, and <cap:expires> tags, among other cool info. Unlike radar, you can't check the RSS feed to see if anything is new before downloading it all. So tracking the various alerts must be done by your system.
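To get a quick overview of a feed, something like this sketch could boil each <entry> down to one line. It assumes the layout shown in the sample above (one tag per line), and list_alerts is just a name I made up:

```shell
#!/bin/sh
# list_alerts: read the alert index on stdin and print one line per
# <entry> in the form "event|severity|expires".
# Assumes each tag sits on its own line, as in the sample feed.
list_alerts () {
   awk '
      function strip(s) { gsub(/<[^>]*>/, "", s); return s }
      /<entry>/                    { in_entry = 1; event = severity = expires = "" }
      in_entry && /<cap:event>/    { event = strip($0) }
      in_entry && /<cap:severity>/ { severity = strip($0) }
      in_entry && /<cap:expires>/  { expires = strip($0) }
      /<\/entry>/                  { if (in_entry) print event "|" severity "|" expires
                                     in_entry = 0 }
   '
}

# Example:
# wget -q -O - "http://alerts.weather.gov/cap/wwaatmget.php?x=LAZ061&y=0" | list_alerts
```

This is line-oriented text matching, not real XML parsing, so it will break if the feed layout changes.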

That VTEC line seems pretty handy - a standard set of codes that explain most of the event, and include a reference number. See here for more VTEC information.
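As a rough illustration, the first VTEC line of each alert can be split on its dots. This sketch assumes the layout seen in the samples above (class, action, office, phenomenon, significance, event number, then a begin-end time pair). The function name and the field labels are my own, not official NWS names:

```shell
#!/bin/sh
# parse_vtec: split a VTEC string like
#   /O.CON.KLIX.HT.Y.0004.000000T0000Z-120906T0000Z/
# into labeled fields, one per line. Labels are informal.
parse_vtec () {
   # Drop the leading and trailing slashes
   body=$(echo "$1" | sed -e 's|^/||' -e 's|/$||')
   # Fields are dot-separated; the seventh holds begin and end times
   echo "$body" | awk -F'[.]' '{
      split($7, t, "-")
      print "class=" $1
      print "action=" $2
      print "office=" $3
      print "phenomenon=" $4
      print "significance=" $5
      print "event_number=" $6
      print "begins=" t[1]
      print "ends=" t[2]
   }'
}

# Example:
# parse_vtec "/O.CON.KLIX.HT.Y.0004.000000T0000Z-120906T0000Z/"
```

The second VTEC line on the flash flood alert has a different layout and is not handled here.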

 Here's a sample script that caches alerts. If a new alert comes in, it pops up a notification. It also tracks which active alerts have already been notified, so each time the script runs you don't get spammed.

#!/bin/sh
# Usage $ script zone
# Example: $ script WIZ066

#$1 is the zone (WIZ066)
cache="/tmp/alert-cache"

strip_tags () { sed -e 's/<[^>]*>//g'; }

# Get the RSS feed of active alerts in zone
alerts=$(wget -q -O - "http://alerts.weather.gov/cap/wwaatmget.php?x=${1}&y=0")

# No alerts - if a cache exists, delete it.
if [ $(echo "$alerts" | grep -c "There are no active watches") -eq "1" ]; then
  echo "No active alerts in zone ${1}"  
  [ -d ${cache} ] && rm -r ${cache}/
  exit 0
fi

# Get the number of active alerts
num_of_alerts=$(echo "$alerts" | grep -c "<entry>")
echo "${num_of_alerts} active item(s)"

# Test for an existing cache. If lacking, create one.
# Create a list of cached alert ids. Each cached alert's filename is the id.
[ -d ${cache} ] || mkdir ${cache}
cached_alerts=$(ls ${cache})

# Loop through each online alert
for entry_startline in $(echo "$alerts" | grep -n "<entry>" | cut -d':' -f1)
do
   alert=$(echo "$alerts" | tail -n +$( expr ${entry_startline}) | head -n 32)
   # The id is the hash at the end of the <id> URL (eighth dot-separated field)
   alert_id=$(echo "$alert" | grep "<id>" | strip_tags | cut -d"." -f8)
   alert_title=$(echo "$alert" | grep "<title>" | strip_tags )

   # Test if the alert is already cached.
   if [ $(echo "${cached_alerts}" | grep -c "${alert_id}") -eq 1 ]; then

      # The alert already exists. Do not notify it or re-cache it.
      echo "Alert ${alert_id}, ${alert_title} has already been notified."
   else

      # New alert. Notify and cache
      alert_body=$(echo "$alert" | grep "<summary>" | strip_tags )
      raw_alert_issued=$(echo "$alert" | grep "<published>" | strip_tags )
      alert_issued=$(expr $(expr $(date +%s) - $(date -d "${raw_alert_issued}" +%s)) / 60 )
      echo "New ${alert_title} issued ${alert_issued} minute(s) ago"
      notify-send "${alert_title}" "${alert_body}"
      echo "${alert}" > ${cache}/${alert_id}
   fi
done

# Loop though each item in the cache, and ensure it's not expired.
# If expired, delete it.
for alert in ${cached_alerts}; do
   raw_expire_time=$(cat ${cache}/${alert} | grep "<cap:expires>" | strip_tags)
   [ $(date -d "${raw_expire_time}" +%s) -le $(date +%s) ] && rm ${cache}/${alert}
done
exit 0

5 comments:

Unknown said...

Ian, this is great information, thanks for sharing.

In your forecasts section, you have:

$ wget -q -O - http://weather.noaa.gov/pub/data/forecasts/zone/wi/wiz066.txt

I look at $ wget -q -O - http://weather.noaa.gov/pub/data/forecasts/zone/wi/
and similar for other states, and see that some zones are current, while others are over five years old.

Do you know of a good source that would have current zone information for _all_ NWS zones for a given state?

Everything I've found at NWS is incomplete, having only ~ 55 - 60% of zones with current forecasts.

Thanks in advance.

Ian said...

Not incomplete - those look to me like old/obsolete/withdrawn or seasonal zones.

Don't scrape those directories for a canonical list of zones. That's not what they are meant for.

I publish a table of current zones at https://raw.githubusercontent.com/ian-weisser/data/master/zone.csv (does not include maritime zones)

I get that data weekly from http://www.nws.noaa.gov/geodata/catalog/wsom/html/cntyzone.htm

Unknown said...

Ian,

Thanks for the quick reply. This looks promising, I'll give it a try.

Code Monkey said...

I'm using your information to make a quick and dirty Weather RSS feed. Did you find a way to capture the nice graphics that weather.gov does? (i.e.: http://imgur.com/EU9HpUC )

Ian said...

I did not try to capture forecast graphics, nor try to use RSS for current conditions, forecasts, or radar images. I don't see a forecast RSS feed outside Alaska/Hawaii.

If you do reuse NWS graphics, remember to provide proper credit.