Disk storage

Even the most advanced and efficient HPC systems are ineffective without an appropriate data storage system. Only a well-matched combination of computing infrastructure and storage solutions can ensure high-quality services for users. The challenges in this area grow with the complexity and performance of high-performance computers. At present, the disk storage systems attached to Cyfronet's supercomputers store over 500 000 000 files, with individual file sizes of up to several terabytes. The wide variety of research conducted on the Centre's resources requires not only diverse configurations of Cyfronet's key computers, but also efficient, dedicated storage systems.

The most fundamental of these is the storage used for users' home directories. Here, all crucial components provide a very high level of availability and data security, supported by mechanisms such as snapshots and backups to external tape libraries. Zeus and Prometheus (the Centre's two main supercomputers) offer this functionality through specialized HNAS file servers (so-called filers) produced by Hitachi Data Systems. These servers are based on a hardware implementation of the NFS protocol and provide very high performance and availability of the file systems. The HNAS filers are coupled with Hitachi Data Systems AMS 2500 and HUS 150 disk arrays, which serve as repositories of physical disk space. These devices also provide very high levels of security and performance, matched to the specific characteristics of the data stored in home directories.

Another type of storage used by supercomputers is scratch space, where the crucial factor is speed. To address this requirement, Cyfronet uses the Lustre distributed file system, which scales both capacity and performance by aggregating the storage of many servers. Moreover, throughput and capacity can be increased by adding more servers dynamically, without interrupting user computations. Currently, all of Cyfronet's supercomputers can use Lustre-based scratch spaces. On Zeus, the scratch file system has a capacity of almost 600 TB and a read/write bandwidth of 12 GB/s. The Prometheus scratch space has a capacity of 5 PB and a read/write bandwidth of 120 GB/s. For even more demanding disk access requirements, a very fast RAM disk provided by the vSMP partition of the Zeus supercomputer is available.
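To put the quoted bandwidth figures in perspective, the following minimal sketch estimates the ideal time needed to stream a dataset to each scratch file system. It assumes decimal units (1 TB = 1000 GB) and that a job can sustain the full quoted bandwidth, which a single job on a shared system will rarely achieve. The 100 TB dataset size and the function name are hypothetical and chosen only for illustration.

```python
# Read/write bandwidth in GB/s, as quoted in the text above.
SCRATCH_BANDWIDTH_GB_PER_S = {
    "Zeus": 12.0,
    "Prometheus": 120.0,
}


def ideal_transfer_seconds(size_tb, bandwidth_gb_per_s):
    """Lower bound on transfer time at peak bandwidth.

    Assumes decimal units (1 TB = 1000 GB) and ignores contention,
    metadata overhead and any client-side limits.
    """
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return size_tb * 1000.0 / bandwidth_gb_per_s


if __name__ == "__main__":
    dataset_tb = 100.0  # hypothetical dataset size
    for system, bandwidth in SCRATCH_BANDWIDTH_GB_PER_S.items():
        seconds = ideal_transfer_seconds(dataset_tb, bandwidth)
        print(f"{system}: {seconds / 3600:.2f} h for {dataset_tb:.0f} TB")
```

Under these assumptions the tenfold difference in bandwidth translates directly into a tenfold difference in transfer time: roughly 2.3 hours on Zeus versus about 14 minutes on Prometheus for the same hypothetical dataset.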

However, the major part of Cyfronet's storage resources is dedicated to the users of domain-specific services developed in the PLGrid program. The PLGrid infrastructure provides a dedicated workspace for groups in domain grid environments, a functionality that is essential for cooperation between scientists in geographically distributed locations. Zeus provides almost 200 TB of such disk space using HNAS filers and the NFS protocol. Prometheus offers similar functionality with higher performance, using the Lustre file system. The maximum capacity of the /archive resource on this supercomputer is 5 PB, and its total read/write throughput reaches 60 GB/s.

A special case of mass storage is the set of resources for large projects and international collaborations in which Cyfronet takes part, such as WLCG (Worldwide LHC Computing Grid), which stores and analyzes the data coming from the LHC at CERN, or CTA (Cherenkov Telescope Array). Such projects demand large volumes of disk space accessible through a set of specialized protocols, such as SRM, xroot or GridFTP. Cyfronet provides this space using DPM (Disk Pool Manager) instances and dedicated networks, such as LHCone. The total amount of disk space provided by these services exceeds 1 PB.

The overall disk space used in Cyfronet exceeds 21 PB and includes:

  • 6.8 TB of high-performance FC drives,
  • 211 TB of economical FATA drives,
  • 19 945 TB of high-performance SAS drives,
  • 1 052 TB of economical SATA drives,
  • 40 TB of NVMe solid-state storage.
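
As a quick consistency check, the per-technology figures in the list above can be summed and compared with the stated total of over 21 PB. The sketch below assumes decimal units (1 PB = 1000 TB); the variable and function names are illustrative only.

```python
# Capacities in TB, copied from the list above.
DRIVE_CAPACITY_TB = {
    "FC": 6.8,
    "FATA": 211.0,
    "SAS": 19945.0,
    "SATA": 1052.0,
    "NVMe": 40.0,
}

TB_PER_PB = 1000.0  # assumption: decimal units


def total_pb(capacities_tb):
    """Return the combined capacity in PB."""
    return sum(capacities_tb.values()) / TB_PER_PB


def shares(capacities_tb):
    """Return each technology's fraction of the combined capacity."""
    total = sum(capacities_tb.values())
    return {name: tb / total for name, tb in capacities_tb.items()}


if __name__ == "__main__":
    print(f"Total: {total_pb(DRIVE_CAPACITY_TB):.2f} PB")
    for name, share in shares(DRIVE_CAPACITY_TB).items():
        print(f"{name}: {share:.1%}")
```

With these assumptions the components add up to about 21.25 PB, consistent with the stated total, and SAS drives account for roughly 94% of the capacity.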