herbie.archive.Herbie
- class herbie.archive.Herbie(date=None, *, valid_date=None, model='hrrr', fxx=0, product=None, priority=None, save_dir=WindowsPath('C:/Users/blayl_depgywe/data'), overwrite=False, verbose=True, **kwargs)
Locate GRIB2 file at one of the archive sources.
- Parameters:
date (pandas-parsable datetime) – Model initialization datetime. If None, then valid_date must be set.
valid_date (pandas-parsable datetime) – Model valid datetime. Must be set when date is None.
fxx (int) – Forecast lead time in hours. Available lead times depend on the model type and model version; the range is model and run dependent.
model ({'hrrr', 'hrrrak', 'rap', 'gfs', 'gfs_wave', 'ecmwf', 'rrfs', etc.}) – Model name as defined in the models template folder. CASE INSENSITIVE. Some examples:
  - 'hrrr': HRRR contiguous United States model
  - 'hrrrak': HRRR Alaska model (alias 'alaska')
  - 'rap': RAP model
  - 'ecmwf': ECMWF open data forecast products
product ({'sfc', 'prs', 'nat', 'subh'}) – Output variable product file type. If not specified, the first product in the model template file is used. CASE SENSITIVE. For example, the HRRR model has these products:
  - 'sfc': surface fields
  - 'prs': pressure fields
  - 'nat': native fields
  - 'subh': subhourly fields
member (None or int) – Some ensemble models (e.g. the future RRFS) require an ensemble member to be specified.
priority (list or str) – List of model sources to get the data from, in order of download priority. CASE INSENSITIVE. Some example data sources and the default priority order are listed below.
  - 'aws': Amazon Web Services (Big Data Program)
  - 'nomads': NOAA’s NOMADS server
  - 'google': Google Cloud Platform (Big Data Program)
  - 'azure': Microsoft Azure (Big Data Program)
  - 'pando': University of Utah Pando Archive (gateway 1)
  - 'pando2': University of Utah Pando Archive (gateway 2)
save_dir (str or pathlib.Path) – Location to save GRIB2 files locally. The default save directory is set in ~/.config/herbie/config.cfg.
overwrite (bool) – If True, look for GRIB2 files even if a local copy exists. If False (default), use the local copy (the idx file still needs to be found).
**kwargs – Any other parameter needed to satisfy the conditions in the model template file (e.g., nest=2, other_label='run2').
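The date, valid_date, and fxx parameters are tied together: a forecast is valid at its initialization time plus its lead time. The sketch below illustrates that relationship with the standard library only. The helper names are hypothetical and are not part of Herbie; it assumes the usual convention that valid time equals initialization time plus fxx hours.

```python
from datetime import datetime, timedelta


def valid_from_init(date: datetime, fxx: int) -> datetime:
    """Valid time of a forecast initialized at `date` with lead time `fxx` hours."""
    return date + timedelta(hours=fxx)


def init_from_valid(valid_date: datetime, fxx: int) -> datetime:
    """Initialization time implied by a valid time and a lead time in hours."""
    return valid_date - timedelta(hours=fxx)


init = datetime(2022, 1, 26, 0)
valid = valid_from_init(init, fxx=6)
print(valid)                          # 2022-01-26 06:00:00
print(init_from_valid(valid, fxx=6))  # 2022-01-26 00:00:00
```

Under this convention, passing valid_date together with fxx identifies the same model run as passing the corresponding date with the same fxx.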
- __init__(date=None, *, valid_date=None, model='hrrr', fxx=0, product=None, priority=None, save_dir=WindowsPath('C:/Users/blayl_depgywe/data'), overwrite=False, verbose=True, **kwargs)
Specify model output and find GRIB2 file at one of the sources.
Methods:
__init__([date, valid_date, model, fxx, ...]) – Specify model output and find GRIB2 file at one of the sources.
download([searchString, source, save_dir, ...]) – Download file from source.
find_grib([overwrite]) – Find a GRIB file from the archive sources.
find_idx() – Find an index file for the GRIB file.
get_localFilePath([searchString]) – Get path to local file.
read_idx([searchString]) – Inspect the GRIB2 file contents by reading the index file.
tell_me_everything() – Print all the attributes of the Herbie object.
xarray([searchString, backend_kwargs, ...]) – Open GRIB2 data as xarray Dataset.
Attributes:
get_localFileName – Predict Local File Name of the full file.
get_remoteFileName – Predict Remote File Name.
index_as_dataframe – Read and cache the full index file.
- download(searchString=None, *, source=None, save_dir=None, overwrite=None, verbose=None, errors='warn')
Download file from source.
TODO: When we download a full file, the value of self.grib and self.grib_source should change to represent the local file.
Subsetting by variable follows the same principles described here: https://www.cpc.ncep.noaa.gov/products/wesley/fast_downloading_grib.html
- Parameters:
searchString (str) – If None, download the full file. Otherwise, use a regular expression to subset the file by specific variables and levels.
source ({'nomads', 'aws', 'google', 'azure', 'pando', 'pando2'}) – If None, download the GRIB2 file from self.grib2, which is the first location where the GRIB2 file was found from the priority list when this class was initialized. Otherwise, specify a source to force downloading from a different location.
save_dir (str or pathlib.Path) – Location to save the model output files. If None, uses the default or the path specified in __init__. Otherwise, files are saved to this path.
overwrite (bool) – If True, overwrite existing files. By default, downloading is skipped if the full file exists. Not applicable when searchString is not None because file subsets might be unique.
errors ({'warn', 'raise'}) – When an error occurs, send a warning or raise a value error.
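The two values of errors describe a common pattern: report a problem either as a warning that lets the program continue or as an exception that stops it. The following is a minimal sketch of that pattern, not Herbie's code. The function name report_problem is hypothetical, and raising ValueError is an assumption based on the wording "value error" above.

```python
import warnings


def report_problem(message: str, errors: str = "warn") -> None:
    """Hypothetical helper: warn or raise depending on the `errors` setting."""
    if errors == "warn":
        warnings.warn(message)          # execution continues after the warning
    elif errors == "raise":
        raise ValueError(message)       # execution stops unless the caller handles it
    else:
        raise ValueError(f"errors must be 'warn' or 'raise', got {errors!r}")


report_problem("GRIB2 file not found", errors="warn")

try:
    report_problem("GRIB2 file not found", errors="raise")
except ValueError as exc:
    print(f"caught: {exc}")
```

With 'warn', a batch job looping over many dates can keep going when one file is missing; with 'raise', the failure surfaces immediately.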
- find_grib(overwrite=False)
Find a GRIB file from the archive sources
- Returns:
1) The URL or pathlib.Path to the GRIB2 file that exists
2) The source of the GRIB2 file
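The idea of searching the sources in priority order and returning both the location and the source name can be sketched as follows. This is an illustration only: find_first_source, the availability lookup, and the example.com URLs are hypothetical stand-ins, not Herbie's implementation.

```python
from typing import Callable, Iterable, Optional, Tuple


def find_first_source(
    priority: Iterable[str],
    locate: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (url, source) for the first source in `priority` that has the file."""
    for source in priority:
        name = source.lower()           # source names are case insensitive
        url = locate(name)              # None means "not available at this source"
        if url is not None:
            return url, name
    return None, None


# Hypothetical availability table standing in for real existence checks.
available = {
    "nomads": "https://example.com/nomads/file.grib2",
    "google": "https://example.com/google/file.grib2",
}

url, source = find_first_source(["AWS", "nomads", "google"], available.get)
print(url, source)  # https://example.com/nomads/file.grib2 nomads
```

Because the loop stops at the first hit, a source later in the list is consulted only when every earlier one fails.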
- property get_localFileName
Predict Local File Name of the full file
- property get_remoteFileName
Predict Remote File Name
- property index_as_dataframe
Read and cache the full index file
- read_idx(searchString=None)
Inspect the GRIB2 file contents by reading the index file.
This reads index files created with the wgrib2 utility.
- Parameters:
searchString (str) – Filter the dataframe by a searchString regular expression. Searches for strings in the index file lines, specifically the variable, level, and forecast_time columns. Execute _searchString_help() for examples of a good searchString.
Subsetting is done using the GRIB2 index files. Index files define the grib variables/parameters of each message (sometimes it is useful to think of a grib message as a “layer” of the file) and define the byte range of the message.
Herbie can subset a file by grib message by downloading a byte range of the file. This way, instead of downloading the full file, you can download just the “layer” of the file you want. The searchString method implemented in Herbie to do a partial download is similar to what is explained here: https://www.cpc.ncep.noaa.gov/products/wesley/fast_downloading_grib.html
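To make the byte-range idea concrete, the sketch below derives byte ranges from a few wgrib2-style index lines, where the second colon-separated field is the starting byte of each message. The sample lines and byte offsets are made up for illustration, and the helper names are hypothetical; this is not Herbie's implementation.

```python
from typing import List, Optional, Tuple

# Made-up wgrib2-style index lines: message number, start byte, date, variable, level, forecast.
sample_index = [
    "1:0:d=2022012600:REFC:entire atmosphere:anl:",
    "2:375291:d=2022012600:TMP:2 m above ground:anl:",
    "3:1653209:d=2022012600:UGRD:10 m above ground:anl:",
]


def byte_ranges(index_lines: List[str]) -> List[Tuple[int, Optional[int]]]:
    """Pair each message's start byte with the byte just before the next message."""
    starts = [int(line.split(":")[1]) for line in index_lines]
    ranges = []
    for i, start in enumerate(starts):
        # The last message has no successor, so its end is left open (None).
        end = starts[i + 1] - 1 if i + 1 < len(starts) else None
        ranges.append((start, end))
    return ranges


def range_header(start: int, end: Optional[int]) -> str:
    """Format an HTTP Range header value; an open end means 'to end of file'."""
    return f"bytes={start}-{'' if end is None else end}"


for start, end in byte_ranges(sample_index):
    print(range_header(start, end))
```

A request carrying the second header value would fetch only the 2 m temperature message rather than the whole file.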
Herbie supports reading two different types of index files:
- Index files output by the wgrib2 command-line utility. These index files are common for forecast models provided by NCEP.
- Index files output by the ecCodes/grib_ls command-line utility. These index files are common for forecast models provided by ECMWF.
You can use regular expressions to search for lines in the index file. If H is a Herbie object, the regex search is performed on the H.read_idx().search_this column of the DataFrame.
Tip
If you need help with regular expressions, search the web or look at this cheatsheet. Check regular expressions with regexr or regex101.
Here are some examples you can use for the searchString argument for the wgrib2-style index files.
searchString= – GRIB messages that will be downloaded
":TMP:2 m" – Temperature at 2 m.
":TMP:" – Temperature fields at all levels.
":UGRD:.* mb" – U wind at all pressure levels.
":500 mb:" – All variables on the 500 mb level.
":APCP:" – All accumulated precipitation fields.
":APCP:surface:0-[1-9]*" – Accumulated precipitation since initialization time.
":APCP:surface:[1-9]*-[1-9]*" – Accumulated precipitation over the last hour.
":UGRD:10 m" – U wind component at 10 meters.
":(U|V)GRD:(10|80) m" – U and V wind components at 10 and 80 m.
":(U|V)GRD:" – U and V wind components at all levels.
":(?:U|V)GRD:[0-9]+ hybrid" – U and V wind components at all hybrid levels.
":(?:U|V)GRD:[0-9]+ mb" – U and V wind components at all pressure levels.
":.GRD:" – U and V wind components at all levels (same as ":(U|V)GRD:").
":(TMP|DPT):" – Temperature and dew point at all levels.
":(TMP|DPT|RH):" – TMP, DPT, and relative humidity at all levels.
":REFC:" – Composite reflectivity.
":surface:" – All variables at the surface.
"^TMP:2 m.*fcst$" – Beginning of string (^), end of string ($), wildcard (.*).
Hint
The NCEP Parameters & Units Table is a useful resource to help you identify wgrib2-style GRIB variable abbreviations and their meanings.
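The wgrib2-style patterns can be tried out with Python's re module before any file is downloaded. The sketch below assumes each message is summarized as a colon-prefixed "variable:level:forecast_time" string; the sample strings are made up, and the select helper is hypothetical rather than part of Herbie.

```python
import re
from typing import List

# Made-up message summaries in ":variable:level:forecast_time" form.
messages = [
    ":TMP:2 m above ground:anl",
    ":TMP:500 mb:anl",
    ":UGRD:10 m above ground:anl",
    ":VGRD:10 m above ground:anl",
    ":UGRD:80 m above ground:anl",
    ":UGRD:500 mb:anl",
    ":DPT:2 m above ground:anl",
    ":REFC:entire atmosphere:anl",
]


def select(search_string: str, lines: List[str]) -> List[str]:
    """Keep the lines in which the regular expression matches anywhere."""
    pattern = re.compile(search_string)
    return [line for line in lines if pattern.search(line)]


for search_string in [":TMP:2 m", ":(U|V)GRD:(10|80) m", ":500 mb:", ":(TMP|DPT):"]:
    print(search_string, "->", select(search_string, messages))
```

For a Herbie object H, the corresponding call per the signature above would be H.read_idx(searchString=":TMP:2 m"), which returns the matching rows as a DataFrame.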
Here are some examples you can use for the searchString argument for the grib_ls-style index files.
Look at the ECMWF GRIB Parameter Database: https://apps.ecmwf.int/codes/grib/param-db
This table is for the operational forecast product (and ensemble product):
searchString (oper/enfo) – Messages that will be downloaded
":2t:" – 2-m temperature
":10u:" – 10-m u wind vector
":10v:" – 10-m v wind vector
":10(u|v):" – 10-m u and 10-m v wind
":d:" – Divergence (all levels)
":gh:" – Geopotential height (all levels)
":gh:500" – Geopotential height only at 500 hPa
":st:" – Soil temperature
":tp:" – Total precipitation
":msl:" – Mean sea level pressure
":q:" – Specific humidity
":r:" – Relative humidity
":ro:" – Runoff
":skt:" – Skin temperature
":sp:" – Surface pressure
":t:" – Temperature
":tcwv:" – Total column vertically integrated water vapor
":vo:" – Relative vorticity
":v:" – v wind vector
":u:" – u wind vector
":(t|u|v|r):" – Temperature, u/v wind, and relative humidity (all levels)
":500:" – All variables on the 500 hPa level
This table is for the wave product (and ensemble wave product):
searchString (wave/waef) – Messages that will be downloaded
":swh:" – Significant height of wind waves + swell
":mwp:" – Mean wave period
":mwd:" – Mean wave direction
":pp1d:" – Peak wave period
":mp2:" – Mean zero-crossing wave period
Hint
The ECMWF Parameter Database is a useful resource to help you identify ecCodes-style GRIB variable abbreviations and their meanings.
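The ecCodes-style patterns behave the same way as regular expressions, only with ECMWF short names. The sketch below checks a few of them against made-up summaries. The ":param:level:levtype" layout of the sample strings is an assumption for illustration (inferred from patterns such as ":gh:500"), and matching is a hypothetical helper, not Herbie's code.

```python
import re
from typing import List

# Made-up message summaries; the ":param:level:levtype" layout is assumed.
ecmwf_messages = [
    ":2t:sfc",
    ":10u:sfc",
    ":10v:sfc",
    ":gh:500:pl",
    ":gh:850:pl",
    ":t:500:pl",
]


def matching(search_string: str, lines: List[str]) -> List[str]:
    """Return the summaries in which the regular expression is found."""
    return [line for line in lines if re.search(search_string, line)]


print(matching(":10(u|v):", ecmwf_messages))    # both 10-m wind components
print(matching(":gh:500", ecmwf_messages))      # geopotential height at 500 hPa only
print(matching(":500:", ecmwf_messages))        # every variable on the 500 hPa level
print(matching(":(t|u|v|r):", ecmwf_messages))  # excludes 2t, 10u, and 10v
```

The leading and trailing colons matter: ":(t|u|v|r):" does not match ":2t:" or ":10u:" because the character before the short name is a digit, not a colon.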
- Return type:
A Pandas DataFrame of the index file.
- xarray(searchString=None, backend_kwargs={}, remove_grib=True, **download_kwargs)
Open GRIB2 data as xarray Dataset
- Parameters:
searchString (str) – Variables to read into xarray Dataset
remove_grib (bool) – If True, grib file will be removed ONLY IF it didn’t exist before we downloaded it.
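The remove_grib rule (delete the file only if it was not already present before the download) can be sketched with the standard library. This is an illustration of the rule, not Herbie's implementation: read_then_cleanup and fake_fetch are hypothetical, and the "download" just writes a few bytes to a temporary directory.

```python
import tempfile
from pathlib import Path
from typing import Callable


def read_then_cleanup(path: Path, fetch: Callable[[Path], None], remove_grib: bool = True) -> bytes:
    """Read a file, fetching it first if needed, and remove it only if we created it."""
    existed_before = path.exists()
    if not existed_before:
        fetch(path)                      # stand-in for a real download
    data = path.read_bytes()
    if remove_grib and not existed_before:
        path.unlink()                    # clean up only files this call created
    return data


def fake_fetch(path: Path) -> None:
    """Hypothetical download: write placeholder bytes to the target path."""
    path.write_bytes(b"GRIB")


with tempfile.TemporaryDirectory() as tmp:
    fresh = Path(tmp) / "fresh.grib2"
    cached = Path(tmp) / "cached.grib2"
    cached.write_bytes(b"GRIB")          # pretend this file was downloaded earlier

    read_then_cleanup(fresh, fake_fetch)
    read_then_cleanup(cached, fake_fetch)

    fresh_kept = fresh.exists()          # False: created by this call, so removed
    cached_kept = cached.exists()        # True: existed beforehand, so left alone

print(fresh_kept, cached_kept)
```

Recording whether the file existed before fetching is what protects a previously saved local copy from being deleted.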