CaDS Seminar 2025 - Dec. 2
Dr. Amirhossein Nikfal (SDL Climate)
A Cloud-Hosted Data Platform for Large-Scale Scientific Data
Abstract:
Managing, cataloguing, and distributing large datasets has become increasingly challenging. To address these challenges, we deployed a cloud-native, Kubernetes-orchestrated data platform on the JSC Cloud that provides scalable services for storing, describing, and publishing NetCDF-based scientific data. The system integrates distributed storage, automated metadata workflows, and federated access, enabling efficient data lifecycles that complement HPC environments. This talk will outline the platform’s architecture, highlight practical lessons learned, and discuss how cloud-based approaches, such as microservices, container orchestration, and standardized metadata pipelines, can inspire more robust and scalable data workflows across different research areas within the Simulation and Data Labs.