Hosting
The GWDG offers three different operating models for hosting third-party HPC systems:
- The hardware is operated independently at the GWDG. For this purpose, a separate network area is required. Operation is carried out under the institute's own responsibility, so the choice of software/OS and hardware remains entirely with the institute. Additional storage requirements can be met by the GWDG and connected via Ethernet.
- The hardware is operated by the GWDG within its HPC environment. For this purpose, the hardware is integrated into the existing networks. Root access cannot be granted due to the tight integration. The software environment is provided and maintained by the GWDG; however, installations in the home directory are possible, as on the general HPC resources. The third-party hardware may differ from the rest of the systems in the GWDG's HPC environment. The institute owning the third-party system can choose between two operation modes (see the table below; a minimal Slurm example follows this list):
  | Mode     | Advantage | Disadvantage |
  |----------|-----------|--------------|
  | Direct   | Direct access via SSH | No resource management; processes can interfere with and disturb each other |
  | Indirect | A resource allocation is made for each job and is exclusively available to that job | Usage only via Slurm |
- Participation in a GWDG tender, or procurement of hardware identical to a GWDG cluster. In this model, essentially the same options as in model 2 are available. Because the hardware is identical, however, the contribution can also be converted into "fairshare". In this case, the hardware is integrated into normal operation and can also be shared by other users. The fairshare allows the institute owning the third-party system to submit jobs with higher priority, so that these jobs start preferentially, until the fairshare has been used up. Due to the increased fairshare, the institute can, in the short term, distribute a high computational demand over significantly more resources than the ones it procured itself. On the other hand, there may be waiting times even when no jobs of the institute are running yet, because all resources in the cluster are already occupied. However, the next free resource is then reserved for a job of the institute due to the high priority.
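
As an illustration of the indirect mode and the fairshare model, the sketch below shows how a job might be submitted via Slurm and how the remaining fair share of an account could be checked. The partition name `institute-nodes`, the account name `institute`, and the application name are placeholders for illustration only, not actual GWDG values.

```bash
#!/bin/bash
# Minimal sketch of a batch job for the indirect operation mode.
# Partition and account names are placeholders; the actual names
# are assigned by the GWDG.
#SBATCH --partition=institute-nodes
#SBATCH --account=institute
#SBATCH --nodes=1
#SBATCH --ntasks=16
#SBATCH --time=02:00:00

srun ./my_application
```

The script would be submitted with `sbatch job.sh`. In the fairshare model, the consumed and remaining share of the account can be inspected with `sshare -A institute` (assuming that account name), which reports the fairshare factor used in job prioritization.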