Application of fuzzy c-means clustering for analysis of chemical ionization mass spectra: insights into the gas phase chemistry of NO3-initiated oxidation of isoprene

Rongrong Wu, Sören R. Zorn, Sungah Kang, Astrid Kiendler-Scharr, Andreas Wahner, and Thomas F. Mentel

Abstract

Oxidation of volatile organic compounds (VOCs) can lead to the formation of secondary organic aerosol (SOA), a significant component of atmospheric fine particles, which can affect air quality, human health, and climate. However, the current understanding of the formation mechanism of SOA is still incomplete, owing not only to the complexity of the chemistry but also to analytical challenges in SOA precursor detection and quantification. Recent instrumental advances, especially the development of high-resolution time-of-flight chemical ionization mass spectrometry (CIMS), have greatly improved both the detection and quantification of low- and extremely low-volatility organic molecules (LVOCs/ELVOCs), which has greatly facilitated the investigation of SOA formation pathways. However, analyzing and interpreting complex mass spectrometric data remains a challenging task. This necessitates the use of dimension reduction techniques to simplify mass spectrometric data with the purpose of extracting chemical and kinetic information about the investigated system. Here we present an approach that applies fuzzy c-means clustering (FCM) to CIMS data from a chamber experiment, aiming to investigate the gas phase chemistry of the nitrate-radical-initiated oxidation of isoprene.

The performance of FCM was evaluated and validated. By applying FCM to the measurements, various oxidation products were classified into different groups based on their chemical and kinetic properties, and the common patterns of their time series were identified, which provided insight into the chemistry of the investigated system. The chemical properties of the clusters are described by elemental ratios and the average carbon oxidation state, and the kinetic behaviors are parameterized with a generation number and an effective rate coefficient (describing the average reactivity of a species) using the gamma kinetic parameterization model. In addition, the fuzziness of the FCM algorithm makes it possible to separate isomers or the different chemical processes in which a species is involved, which could be useful for mechanism development. Overall, FCM is a technique that is well suited to simplifying complex mass spectrometric data, and the chemical and kinetic properties derived from clustering can be utilized to understand the reaction system of interest.
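
As an illustration of the clustering idea only, the following is a minimal sketch of the textbook fuzzy c-means update applied to normalized time series. It is not the implementation used in this study. The function name `fuzzy_c_means`, the fuzzifier `m = 2`, the Euclidean distance, and the synthetic "early-peaking" and "accumulating" traces are all assumptions made for the example.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Textbook fuzzy c-means (hypothetical helper, Euclidean distance).

    X has shape (n_series, n_timepoints). Returns the cluster centers and
    the membership matrix U, whose rows sum to 1.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the time series.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance of every series to every center; floor avoids division by zero.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Synthetic stand-ins for product time series (assumed shapes, not measured data).
t = np.linspace(0.0, 1.0, 50)
rng = np.random.default_rng(1)
early = t * np.exp(-8.0 * t)        # peaks early, then decays
late = 1.0 - np.exp(-3.0 * t)       # accumulates over the experiment
early /= early.max()
late /= late.max()
series = np.vstack(
    [early + 0.02 * rng.normal(size=t.size) for _ in range(5)]
    + [late + 0.02 * rng.normal(size=t.size) for _ in range(5)]
)

centers, U = fuzzy_c_means(series, n_clusters=2)
labels = U.argmax(axis=1)  # hard assignment, for inspection only
```

Each row of `U` holds the membership degrees of one species across all clusters, so a species with comparable memberships in two clusters can be flagged as a candidate for mixed behavior rather than forced into a single group.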

Last Modified: 04.04.2024