Data Support for Climate and Weather Research

Data describing the intricate details of how the Earth's system functions are the cornerstone of climate and weather research. When looking for foundational (or auxiliary) data, many in the global research community studying climate, weather, and Earth, ocean, and atmosphere interactions turn to NCAR's Research Data Archive (RDA). Maintained in the Computational and Information Systems Laboratory, RDA offers a broad range of easy-to-access global and regional data sets, as well as first-class data archive capabilities. Moreover, the team is staffed by people trained in atmospheric and oceanographic sciences, and its combination of knowledge and skills benefits both the user and data-provider communities.

"The RDA has long been known for high quality data curation, going back more than 40 years, and now has significantly improved data acquisition methods," says Doug Schuster, RDA Software Engineer. "Better – and broader – web-interface capabilities make searching for and handling data easier. In the coming year, not only do we expect to apply more computing to improve turn-around time on individual user-specific requests, but we'll have more than 130 Terabytes (TB) of data available online for immediate download."

Because of the in-house expertise of the RDA staff, users have direct access to knowledgeable consultants who can answer data questions and recommend data sets that best match research objectives. This in-depth knowledge shows in the thoughtful, researcher-oriented way the RDA stages and organizes its data sets and metadata. The team also designs and writes efficient software for importing new data and exporting data to users, and supplements it with codes and libraries that aid researchers in their data analysis. This expertise proves invaluable during the quality control/quality assessment process; RDA's data professionals know how data should "look," which makes data content errors stand out more readily and helps eliminate incorrect data before it becomes publicly available.

RDA personnel's experience also feeds into cultivating and improving the type and variety of data available to researchers. The RDA archive is largely composed of data acquired from other organizations, so, among other efforts, the staff is tasked with identifying and pulling together data sets that would benefit the science community, and with extending the reach of a researcher's data set to others working in the same field as well as in complementary fields. RDA staff is well equipped to identify new data sets that either tie strongly into the existing archive or fill gaps in knowledge, benefiting the larger research picture as well as individual scientists.

In addition to serving, archiving, and quality-controlling data, RDA is tasked with identifying useful data sets generated by those within the science community. RDA supports the scientists and organizations that deliver data to the archive, ensuring data integrity as information is transferred between provider and archive. Upon receiving permission for data distribution, RDA then facilitates convenient, straightforward data access for users.

RDA relieves data providers from the bulk of the necessary data support tasks, and also acts as the first point of contact for users, offering answers to many of the scientific questions that arise, providing insights on the most effective data manipulation software tools, and creating documentation and metadata that are both meaningful to users and accurate from the provider's perspective.

Name                                             Time Period          Highest Resolution
                                                                      Temporal  Horizontal     Vertical

NCEP/NCAR Global Atmospheric Reanalysis          1948-2010 (ongoing)  6 hours   T62 (209 km)   17 Plvl
NCEP/DOE Global Atmospheric Reanalysis           1979-2008 (ongoing)  6 hours   T62            17 Plvl
ECMWF Re-Analysis 40-year (ERA-40)               1957-2002            6 hours   T159 (1.125°)  23 Plvl
NCEP North America Regional Reanalysis           1979-2009 (ongoing)  3 hours   32 km          29 Plvl
Japanese Reanalysis (JR-25/JCDAS)                1979-2009 (ongoing)  6 hours   T106 (1.125°)  23 Plvl
ECMWF Interim Reanalysis (ERA-I/ECDAS)           1989-2009 (ongoing)  6 hours   T255 (0.703°)  37 Plvl
NOAA-CIRES 20th Century Reanalysis               1891-2008            6 hours   2° x 2°        28 Plvl
NCEP Climate Forecast System Reanalysis, Atmos.  1979-2009 (ongoing)  1 hour    T362 (38 km)   64 Plvl
NCEP Climate Forecast System Reanalysis, Ocean   1979-2009 (ongoing)  1 hour    0.25°-0.50°    40 Zlvl

Table 1: Global atmospheric and oceanographic reanalyses are among the many valuable data resources that external organizations provide and that draw on the expertise of RDA consultants. Those listed are the most recent major reanalyses available in the Research Data Archive. Most time periods are ongoing; that is, providers continue to produce the products going forward in time. In general, all reanalyses are also available at lower temporal and horizontal resolutions than those shown above. Most reanalyses also have variables on vertical model coordinate levels, as well as large numbers of surface-specific fields and vertically integrated values.

Among its most recent data access efforts, RDA has developed and continues to enhance machine-to-machine interoperability. Most data sets in the World Meteorological Organization's (WMO) Gridded Binary (GRIB) format are accessible using the Client URL Request Library (cURL) and common Internet protocols. This format and these capabilities allow users to extract specific grids from across multi-terabyte archives in the RDA. For first-time users of a particular data set, web interfaces are available to help choose the temporal range and parameters they require. From these specifications, data download scripts are generated automatically. The scripts run on the user's host computer and are easy to modify as research goals expand and as new data arrive in RDA's continually growing archive.
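To make the idea of a generated download script concrete, here is a minimal sketch of how a temporal range and a list of parameters could be turned into a series of cURL commands. It is an illustration only, not RDA's actual generator: the base URL, the data set identifier, the file naming pattern, and the parameter names are all hypothetical placeholders, and the sketch assumes one file per day and parameter.

```python
from datetime import date, timedelta

# Hypothetical placeholders for illustration; not real RDA endpoints or data set IDs.
BASE_URL = "https://data.example.org/archive"
DATASET_ID = "dsNNN.N"


def daily_dates(start, end):
    """Yield every date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def build_urls(start, end, parameters):
    """Build one hypothetical GRIB file URL per day and parameter."""
    return [
        f"{BASE_URL}/{DATASET_ID}/{day:%Y/%m}/{param}.{day:%Y%m%d}.grb"
        for day in daily_dates(start, end)
        for param in parameters
    ]


def make_download_script(urls):
    """Return the text of a shell script that fetches each URL with curl."""
    lines = ["#!/bin/sh", "# Generated download script (illustrative sketch)"]
    # -f: fail on HTTP errors; -O: save under the remote file name
    lines += [f'curl -f -O "{url}"' for url in urls]
    return "\n".join(lines) + "\n"


# Example selection: three days, two (hypothetical) parameter names.
urls = build_urls(date(2009, 1, 1), date(2009, 1, 3), ["tmp", "hgt"])
script = make_download_script(urls)
print(script)
```

Extending the request, for example to a longer period or an additional parameter, only means changing the arguments to build_urls and regenerating the script, which is the kind of easy modification described above.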

"While ease of access is important to our users, we find that it's critical to balance efforts to facilitate data access with adding data content," says Steve Worley, Manager, Data Support Section. "Because one without the other will certainly disappoint our users."

RDA data access has improved substantially over the past five years, says Worley, and that growth and change will continue in the years to come as user expectations and data service challenges evolve. Plans for the RDA in 2010 include expanding the existing online data archive to 130 TB, four times the archive's current size. As resources continue to be added, the online archive will grow to more than 250 TB during 2011. RDA's online data service is supplemented by an easy-to-use request interface that stages data from archive storage to disk for Internet download. This capability gives users access to the complete RDA, which currently holds more than 400 TB of data.