SNIC Emerging Technologies
|Name||SNIC Emerging Technologies|
|Description||Coordinating new and emerging technologies within SNIC.|
* [http://next-generation-hpc-desktop.readthedocs.org/en/latest/ Report: Next Generation HPC Desktop]
Currently, many SNIC centres continuously investigate the market for developments in emerging technologies and even buy hardware for evaluation. However, in many of these activities user involvement is limited, and efforts are often duplicated between centres. This activity aims to coordinate these efforts and the related competence within SNIC, and to provide SNIC users and communities with early access to new technologies. The project will also work closely with the user communities to find out whether they are interested in upcoming technologies.
Three focus areas have been identified and are described in the following sections:
Storage Technologies (C3SE, HPC2N)
There are many storage technologies employed within SNIC. To be able to deploy solutions that are suitable for different usage scenarios, it is important that available storage technologies are continuously evaluated. It is also important that prototype solutions are evaluated in close contact with users and facilities. Typical projects within this focus area could be:
- File systems for different I/O patterns.
- New storage hardware.
- Higher-level file services.
- Client tools for accessing available resources.
- Integration services.
Objectives and deliverables
- Investigate how existing resources can be used in new ways, and how new approaches to using existing solutions can give users better support for I/O-intensive simulations and workflows. (C3SE)
- Investigate how the RobinHood service of Lustre v2+ filesystems can be used in conjunction with Tivoli Storage Manager (TSM) to speed up incremental backups. (C3SE, HPC2N)
- Investigate how the RobinHood service of Lustre v2+ filesystems can be used to provide more fine-grained quotas, e.g. for project storage. (C3SE)
- One solution that has been used in SNIC for a number of years for long-term storage of research data is dCache in combination with TSM for archiving to tape. There are several alternatives to using TSM for archiving to tape, which should be studied further from both a performance and a cost-effectiveness perspective. One example is LTFS. Looking at options for long-term storage also connects well to activities within WLCG and EISCAT_3D, in which HPC2N is involved. (HPC2N)
User and group quotas through Littlejohn: tested in operation. It scans the file system and then uses changelogs to update its database. A bug in Lustre's changelog handling has since been discovered. To be integrated for better project reporting.
RobinHood and TSM integration: the changelog-based work is active, but progress is limited. This is also an activity on the Lustre list.
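The scan-then-changelog approach noted above for quota accounting can be illustrated with a minimal sketch. It is not Littlejohn's or RobinHood's implementation, and it does not read real Lustre changelogs: the record layout `(op, path, owner, group, size)` and the operation names are assumptions made purely for illustration.

```python
from collections import defaultdict


def initial_scan(files):
    """Build per-user and per-group usage from one full file system listing.

    `files` is an iterable of (path, owner, group, size) tuples
    (hypothetical layout, standing in for a real file system scan).
    """
    index = {}                 # path -> (owner, group, size)
    usage = defaultdict(int)   # ("user"|"group", name) -> bytes
    for path, owner, group, size in files:
        index[path] = (owner, group, size)
        usage[("user", owner)] += size
        usage[("group", group)] += size
    return index, usage


def apply_changelog(index, usage, records):
    """Update the usage database from change records instead of rescanning.

    Each record is (op, path, owner, group, size); op is "create",
    "write" or "unlink" (assumed names, not Lustre's record types).
    """
    for op, path, owner, group, size in records:
        # Remove the previous contribution of this path, if any.
        old = index.pop(path, None)
        if old is not None:
            old_owner, old_group, old_size = old
            usage[("user", old_owner)] -= old_size
            usage[("group", old_group)] -= old_size
        # Re-add the new state unless the file was deleted.
        if op != "unlink":
            index[path] = (owner, group, size)
            usage[("user", owner)] += size
            usage[("group", group)] += size
    return usage


index, usage = initial_scan([
    ("/proj/a.dat", "alice", "proj1", 100),
    ("/proj/b.dat", "bob", "proj1", 50),
])
apply_changelog(index, usage, [
    ("write", "/proj/a.dat", "alice", "proj1", 300),
    ("unlink", "/proj/b.dat", "bob", "proj1", 0),
    ("create", "/proj/c.dat", "bob", "proj2", 20),
])
```

The point of the design is that the expensive full scan happens once; afterwards only the changed paths are touched, which is why a bug in changelog handling directly affects the correctness of the reported quotas.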
Access-methods and remote visualisation (Lunarc, UPPMAX)
To improve the usability of our HPC resources, many centres provide remote desktop solutions that give users a richer experience of HPC. The service has been a great success in the research groups, and the number of users is increasing. Requests for new applications and use cases are also increasing steadily. However, this growth has raised questions about how to scale these services efficiently in several dimensions:
- Monitoring of usage patterns on the desktop servers.
- Monitoring and scaling of distribution networks using different usage patterns and desktop configurations such as spatial resolution.
- How to provide seamless access to hardware accelerated desktop services.
- How to provide interactive access to applications and specific hardware configurations (e.g. GPU enabled nodes) through the batch-systems in a transparent way in the same desktop environment.
- Identifying bottlenecks in remote desktop architectures.
Objectives and deliverables
Remote desktop service aspects:
- Evaluate different remote-access architectures based on target network specifications.
- Develop best practice guides for configuration and setup of remote desktop services.
- Evaluate techniques for monitoring the load and usage of the servers providing the desktop services, so that the user experience is good. Analyse usage bottlenecks and identify areas for improvement such as easy and (eventually) integrated SFTP file transfer.
- Implement prototypes for providing a scalable, OS-independent, hardware-accelerated back-end to the desktop interface. This can involve taking advantage of single or multiple GPU nodes as a provider of graphical acceleration.
- Develop an on-demand hardware allocation mechanism that uses the queuing system to give each session exclusive access to specific hardware resources, e.g. CPU or accelerator-equipped nodes.
- Define community-specific desktops.
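As an illustration of the on-demand allocation objective, the sketch below builds a SLURM `sbatch` command line that requests an exclusive node, optionally with GPUs, for one desktop session. It is a sketch under stated assumptions, not a description of any centre's setup: the launcher name `start-desktop-session` is hypothetical, the partition name is site-specific, and `--gres=gpu:N` only works where GPU resources are configured in SLURM. The function only constructs the argument list; it does not submit anything.

```python
import shlex


def build_desktop_job(session_id, gpus=1, hours=4, partition=None,
                      start_cmd="start-desktop-session"):
    """Return an sbatch argument list for one per-session desktop job.

    `start_cmd` is a hypothetical launcher script that would start the
    desktop back-end on the allocated node.
    """
    args = [
        "sbatch",
        f"--job-name=desktop-{session_id}",
        "--nodes=1",
        "--exclusive",                 # one session per node
        f"--time={hours:02d}:00:00",   # wall-clock limit, hours:minutes:seconds
    ]
    if gpus > 0:
        args.append(f"--gres=gpu:{gpus}")      # request accelerators
    if partition:
        args.append(f"--partition={partition}")
    # --wrap lets sbatch run a single command without a separate script file.
    args.append(f"--wrap={start_cmd} {shlex.quote(session_id)}")
    return args


cmd = build_desktop_job("s42", gpus=2, hours=4, partition="gpu")
cpu_only = build_desktop_job("s43", gpus=0)
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` on a login node. Keeping command construction separate from submission makes the mechanism easy to test without access to a cluster.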
Lunarc has been improving and testing the configuration of the existing Lunarc HPC Desktop. Knowledge from this work has been the basis for the design of the upcoming prototype hardware. Prototype hardware for evaluating future desktop environments has been procured from SouthPole. The hardware consists of:
- NVIDIA K1/K2 evaluation system for investigating an on-demand desktop solution with accelerated graphics support. Solutions evaluated will be hypervisor-based using SLURM to allocate sessions.
- Intel Xeon Clearwell system for evaluating graphics support in this architecture. Could be used for providing cost-effective desktop solutions for HPC resources.
- Commodity graphics card in a server setting.
The prototype system will be available in mid-February.
Future HPC and Accelerators (PDC, Lunarc, HPC2N)
For future investments in computing resources in SNIC over the coming years (three large and several smaller systems), it is important to maintain current specialist knowledge about different types of existing and future CPU architectures. To obtain a good basis for future decisions on HPC resources, it is important that the work done at the centres, the SNIC GPU project and activities within PRACE are continued and developed further.
Objectives and deliverables
- Evaluate the use of next-generation GPUs (K40 and others) in the context of future SNIC resources. (Lunarc, HPC2N)
- Operate and provide support for SNIC's investments in the Zorn GPU development resource. (PDC)
- Operate and provide support for SNIC's investments in the Erik GPU development resource. (Lunarc)
- Evaluate the use of next generation Xeon Phi accelerators (Knights Landing) (Lunarc, HPC2N)
Lunarc has procured an NVIDIA K80 system, which will be installed in Erik. (Availability mid-February 2015)
HPC2N is installing Xeon Phis in the existing resources. (Availability XX-XX-XX)
|Jonas Lindemann (LUNARC)||LUNARC||Application expert||Grid computing|
|Mathias Lindberg (C3SE)||C3SE||Systems expert|
|Thomas Svedberg (C3SE)||C3SE||Application expert||Solid mechanics|