
Using wget to download XML for a conky script does not preserve the HTML tags

I am trying to set up a conkyrc for the weather using the following script:

#!/bin/bash

station="KRAP.xml"
wdir='/tmp/weather'

update_xml() {
    if [ ! -e "$station" ]; then
        wget -q http://w1.weather.gov/xml/current_obs/${station}
        [ -e "$station" ] && touch "${station}"
    else
        # dtime: time the .xml file was downloaded
        # otime: time the weather data was observed
        # ctime: current time (time this script is being run)
        dtime=$(stat -c %Y $station)
        otime=$(date -d "$utime" +%s)
        ctime=$(date +%s)

        if (( "$otime" + 4507 < "$ctime" )); then
            if (( "$dtime" + 307 < "$ctime" )); then
                wget -q -O "$station" http://w1.weather.gov/xml/current_obs/${station}
                [ -e "$station" ] && touch "${station}"
            fi
        fi
    fi
}

from_xml() { xmllint -xpath "//$1" - <<< "$xml" | sed 's/<[^>]*>//g'; }

[ -d "$wdir" ] || mkdir -p "$wdir"
cd "$wdir" || exit 1

xml=''
[ -r $station ] && xml="$(< $station)"
( update_xml >/dev/null 2>&1 ) &

if [ -n "$xml" ]; then
    location=$(from_xml "location")
    utime=$(from_xml "observation_time_rfc822")
    weather=$(from_xml "weather")
    temperature=$(from_xml "temp_f")
    humid=$(from_xml "relative_humidity")
    wind_dir=$(from_xml "wind_dir")
    case "$wind_dir" in
        "North") wind_dir="N" ;;
        "South") wind_dir="S" ;;
        "East") wind_dir="E" ;;
        "West") wind_dir="W" ;;
        "Northwest") wind_dir="NW" ;;
        "Northeast") wind_dir="NE" ;;
        "Southwest") wind_dir="SW" ;;
        "Southeast") wind_dir="SE" ;;
    esac
    wind_speed=$(from_xml "wind_kt")
    baro_pressure=$(from_xml "pressure_in")

    echo "$location"
    echo "Updated: $(date -d "$utime" 2>/dev/null )"
    printf 'Weather: %s %s°F\n' "$weather" "$temperature"
    printf 'Barometric Pressure: %s inches\n' "$baro_pressure"
    printf 'Wind: %s at %s knots\n' "$wind_dir" "$wind_speed"
    printf 'Humidity: %s%%\n' "$humid"
else
    echo "No weather data available."
fi

But when it downloads the XML, it leaves out all the identifying markup, so instead of downloading the source like this:

<?xml version="1.0" encoding="ISO-8859-1"?> 
<?xml-stylesheet href="latest_ob.xsl" type="text/xsl"?>
 <current_observation version="1.0"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:noNamespaceSchemaLocation="http://www.weather.gov/view/current_observation.xsd">
<credit>NOAA's National Weather Service</credit>
<credit_URL>http://weather.gov/</credit_URL>
<image>
    <url>http://weather.gov/images/xml_logo.gif</url>
    <title>NOAA's National Weather Service</title>
    <link>http://weather.gov</link>
</image>
<suggested_pickup>15 minutes after the hour</suggested_pickup>
<suggested_pickup_period>60</suggested_pickup_period>
<location>Rapid City, Rapid City Regional Airport, SD</location>
<station_id>KRAP</station_id>
<latitude>44.04556</latitude>
<longitude>-103.05389</longitude>
<observation_time>Last Updated on Jul 31 2015, 8:52 am MDT</observation_time>
    <observation_time_rfc822>Fri, 31 Jul 2015 08:52:00 -0600</observation_time_rfc822>
<weather>Fair</weather>
<temperature_string>72.0 F (22.2 C)</temperature_string>
<temp_f>72.0</temp_f>
<temp_c>22.2</temp_c>
<relative_humidity>52</relative_humidity>
<wind_string>North at 8.1 MPH (7 KT)</wind_string>
<wind_dir>North</wind_dir>
<wind_degrees>340</wind_degrees>
<wind_mph>8.1</wind_mph>
<wind_kt>7</wind_kt>
<pressure_string>1022.2 mb</pressure_string>
<pressure_mb>1022.2</pressure_mb>
<pressure_in>30.24</pressure_in>
<dewpoint_string>53.1 F (11.7 C)</dewpoint_string>
<dewpoint_f>53.1</dewpoint_f>
<dewpoint_c>11.7</dewpoint_c>
<visibility_mi>10.00</visibility_mi>
<icon_url_base>http://forecast.weather.gov/images/wtf/small/</icon_url_base>
<two_day_history_url>http://www.weather.gov/data/obhistory/KRAP.html</two_day_history_url>
<icon_url_name>skc.png</icon_url_name>
<ob_url>http://www.weather.gov/data/METAR/KRAP.1.txt</ob_url>
<disclaimer_url>http://weather.gov/disclaimer.html</disclaimer_url>
<copyright_url>http://weather.gov/disclaimer.html</copyright_url>
<privacy_policy_url>http://weather.gov/notice.html</privacy_policy_url>

It creates the XML document like this:

NOAA's National Weather Service
http://weather.gov/

    http://weather.gov/images/xml_logo.gif
    NOAA's National Weather Service
    http://weather.gov

15 minutes after the hour
60
Rapid City, Rapid City Regional Airport, SD
KRAP
44.04556
-103.05389
Last Updated on Jul 31 2015, 8:52 am MDT
    Fri, 31 Jul 2015 08:52:00 -0600
Fair
72.0 F (22.2 C)
72.0
22.2
52
North at 8.1 MPH (7 KT)
North
340
8.1
7
1022.2 mb
1022.2
30.24
53.1 F (11.7 C)
53.1
11.7
10.00
 http://forecast.weather.gov/images/wtf/small/
http://www.weather.gov/data/obhistory/KRAP.html
skc.png
http://www.weather.gov/data/METAR/KRAP.1.txt
http://weather.gov/disclaimer.html
http://weather.gov/disclaimer.html
http://weather.gov/notice.html

It strips all the HTML tags and leaves only the raw data. As a result, my script cannot extract "location", because nothing identifies it in the downloaded XML. I tried "wget -F" to force HTML, but it makes no difference. Am I missing something?

3
Brandon

The XML is fine; you can parse it with awk, e.g.:

Location

curl http://w1.weather.gov/xml/current_obs/KRAP.xml | \
    awk -F'[<|>]' '/<location>/ {print $3}'

Output

Rapid City, Rapid City Regional Airport, SD
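
The field separator `[<|>]` splits each line on `<`, `|`, or `>`, so for a line such as `<location>Rapid City</location>` the first field is empty, `$2` is the tag name, and `$3` is the text between the tags. A minimal sketch of wrapping this in a helper follows. The function name `xml_field` and the sample input are assumptions made for illustration, and it only handles elements that sit on a single line:

```shell
# xml_field TAG: read XML on stdin and print the text of each
# one-line <TAG>...</TAG> element. Hypothetical helper, not part
# of the original script or answer.
xml_field() {
    # Compare $2 exactly so that "temp_f" does not also match
    # "temperature_string".
    awk -F'[<>]' -v tag="$1" '$2 == tag { print $3 }'
}

# Sample input standing in for the downloaded KRAP.xml.
sample='<location>Rapid City, Rapid City Regional Airport, SD</location>
<weather>Fair</weather>
    <observation_time_rfc822>Fri, 31 Jul 2015 08:52:00 -0600</observation_time_rfc822>'

printf '%s\n' "$sample" | xml_field location
printf '%s\n' "$sample" | xml_field observation_time_rfc822
```

With the real feed this could be called as `curl -s http://w1.weather.gov/xml/current_obs/KRAP.xml | xml_field location`.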

Location and weather

curl http://w1.weather.gov/xml/current_obs/KRAP.xml \
    | awk -F'[<|>]' '/<location>|<weather>/ {print $3}'

Output

Rapid City, Rapid City Regional Airport, SD
Fair
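
If there is any doubt about whether the saved file still contains its tags, it can be checked directly before parsing. This is a rough sketch, assuming a hypothetical helper `has_tag` and a throwaway sample file. The idea is to inspect the file on disk with a text tool rather than in a viewer that may render the XML:

```shell
# has_tag FILE TAG: succeed if FILE contains an opening <TAG> element.
# Hypothetical helper for illustration only.
has_tag() {
    grep -q "<$2>" "$1"
}

# Throwaway sample standing in for /tmp/weather/KRAP.xml.
tmp=$(mktemp)
printf '%s\n' '<location>Rapid City, Rapid City Regional Airport, SD</location>' > "$tmp"

if has_tag "$tmp" location; then
    echo "tags present"
else
    echo "tags missing"
fi
rm -f "$tmp"
```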

Location and weather in a different output format

curl http://w1.weather.gov/xml/current_obs/KRAP.xml \
    | awk -F'[<|>]' '/<location>/ {print "Location:"$3}/<weather>/ {print "Weather:"$3}'

Output

Location:Rapid City, Rapid City Regional Airport, SD
Weather:Fair
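
The same idea extends to several fields in one pass. The sketch below keeps a small table that maps tag names to labels. The function name `weather_report` and the choice of labels are assumptions, not something the answer specifies, and the sample lines stand in for the downloaded feed:

```shell
# weather_report: read XML on stdin and print "Label: value" lines
# for the tags listed in the label table. Hypothetical helper.
weather_report() {
    awk -F'[<>]' '
        BEGIN {
            label["location"]          = "Location"
            label["weather"]           = "Weather"
            label["temp_f"]            = "Temperature (F)"
            label["relative_humidity"] = "Humidity (%)"
        }
        # $2 is the tag name, $3 the text between the tags.
        ($2 in label) { printf "%s: %s\n", label[$2], $3 }
    '
}

printf '%s\n' \
    '<location>Rapid City, Rapid City Regional Airport, SD</location>' \
    '<weather>Fair</weather>' \
    '<temp_f>72.0</temp_f>' \
    '<relative_humidity>52</relative_humidity>' | weather_report
```

Lines are printed in the order the tags appear in the input, so the output layout follows the feed rather than the label table.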
0
A.B.