The “Data Infrastructure and Workflows” team develops and provides state-of-the-art data infrastructure and analytical methods for air quality assessments. One of our goals is to make harmonised, quality-controlled data on ground-level air pollutants and meteorological variables freely accessible, following the FAIR principles (findable, accessible, interoperable, and reusable). To this end, we offer standardised analytical methods and machine learning workflows as a service, enabling users to generate reproducible scientific analyses without extensive programming expertise.
A key component of our work is the TOAR (Tropospheric Ozone Assessment Report) Database Infrastructure project (toar-data.fz-juelich.de, toar-data.org), which hosts a 9 TB database of surface measurements of ozone and its precursors, along with meteorological data. The dataset is released under the CC BY 4.0 licence and is a substantially updated successor to the TOAR-I database. Because analyses draw on curated, quality-controlled datasets rather than raw data, they are less prone to bias introduced by data artefacts, and their findings are easier to reproduce across diverse applications.
Our team has obtained CoreTrustSeal (CTS) certification for the TOAR Database Infrastructure, underscoring our commitment to the FAIR principles, open data, and trustworthy repositories. We have also been invited to join the World Data System (WDS), which further recognises us as a reliable and transparent data provider. Our dashboard offers user-friendly access to reproducible scientific analyses, making it easier for researchers to explore and use our data.
In addition to the TOAR Database Infrastructure project, our team is involved in the WarmWorld project, which aims to improve Earth system simulations using the ICON (ICOsahedral Nonhydrostatic) model. As part of the “Easier” module, we are working to simplify access to climate simulation results by building a federated data system in collaboration with the Deutsches Klimarechenzentrum (DKRZ). To achieve this, we combine software tools such as the Fields Database (FDB), developed by the European Centre for Medium-Range Weather Forecasts (ECMWF), and SpatioTemporal Asset Catalogs (STAC). Our ultimate goal is to provide seamless access to high-quality data and analytical tools that support researchers in air quality and climate research.
Projects
Members