xOPS

xOPS is an Ansible-based Configuration-as-Code (CaC) repository that defines and enforces the configuration of the supercomputers, cloud platforms, and storage systems operated by the High-Performance Computing, Cloud and Data Systems and Services (HPCCDSS) division of the Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich GmbH.
xOPS codifies every aspect of system setup (e.g. user accounts, network interfaces, job schedulers, storage mounts, monitoring stacks, security policies) into repeatable, auditable automation scripts called *playbooks* and *roles*. Changes are tracked in Git, reviewed via merge requests, and applied consistently across thousands of nodes.
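
To illustrate what such a playbook can look like, here is a minimal sketch. It is not taken from the xOPS repository: the host group `compute_nodes`, the account `svc_monitor`, and the choice of `chrony` as an example package are all hypothetical. Only standard `ansible.builtin` modules are used.

```yaml
# Hypothetical playbook; host group, account, and package names are
# illustrative assumptions and are not taken from the xOPS repository.
- name: Baseline configuration for compute nodes
  hosts: compute_nodes
  become: true
  tasks:
    - name: Ensure a service account exists
      ansible.builtin.user:
        name: svc_monitor
        shell: /sbin/nologin
        state: present

    - name: Ensure the time synchronisation package is installed
      ansible.builtin.package:
        name: chrony
        state: present

    - name: Ensure the time synchronisation service is enabled and running
      ansible.builtin.service:
        name: chronyd
        state: started
        enabled: true
```

Each task declares a desired state rather than a sequence of commands, so running the playbook again on an already configured node should report no changes. Assuming an inventory file named `inventory.ini` and the playbook saved as `site.yml` (both names hypothetical), `ansible-playbook -i inventory.ini site.yml --check --diff` previews the changes without applying them.
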
The project keeps large-scale infrastructure reliable and predictable while reducing operational risk. It provides a single source of truth for platform configuration, speeds up the rollout of new systems and updates, and helps teams recover quickly from incidents by recreating a previously known working state.
Further documentation is available at: https://apps.fz-juelich.de/jsc/hps/xops/index.html