Agenda

Program at a glance:

July 10 (Room: Salón de Actos ESII)

8:30 – 9:00 – Registration and Welcome
9:00 – 9:30 – Opening
9:30 – 11:00 – Shared Pool of Virtualized Accelerators: A Key Architectural Innovation for Power-Efficient Clusters. Jose Duato, Technical University of Valencia, Spain.
11:00 – 11:30 – Coffee Break
11:30 – 13:00 – Congestion Management for Current and Future Interconnection Networks: Challenges and Solutions. Pedro Javier Garcia, University of Castilla-La Mancha (UCLM), Spain.
13:00 – 15:00 – Lunch time
15:00 – 17:00 – Practical Lab: Hands-on configuring HPC networks. German Maglione-Mathey and Raul Galindo-Moreno, University of Castilla-La Mancha (UCLM), Spain
17:00 – 17:30 – Coffee Break
17:30 – 19:00 – Many-core processors interfacing interconnection networks: lessons learned and possible future directions. Holger Fröning, Ruprecht-Karls University of Heidelberg, Germany.

July 11 (Room: Salón de Actos ESII)

9:00 – 10:30 – Efficient integration of distributed applications with high performance I/O technologies. Bernard Metzler, IBM Zurich Research Laboratory, Zurich, Switzerland.
10:30 – 11:00 – Coffee Break
11:00 – 12:30 – New trends in Data Center and HPC networks. Eitan Zahavi, Mellanox Technologies, Israel.
12:30 – 14:00 – Intel Omni-Path Fabric: Architecture and technology overview. Gaspar Mora, Intel Corp., Santa Clara, USA.
14:00 – 16:00 – Lunch time
16:00 – 17:30 – Exascale fabric administration tools – BXI Software solutions. Alain Cady, Atos/BULL.
17:30 – 18:00 – Coffee Break
18:00 – 19:30 – Panel Session. The Future of the Interconnect Technology in the Exascale and Big-Data Era.
Moderator: Pedro Javier Garcia, University of Castilla-La Mancha (UCLM), Spain.
Panelists:
Jose Duato, Technical University of Valencia, Spain.
Eitan Zahavi, Mellanox Technologies, Israel.
Alain Cady, Atos/BULL.
Bernard Metzler, IBM Zurich Research Laboratory, Zurich, Switzerland.

19:30 – 20:00 – Farewell

Extended Program:

Coffee breaks (included in the registration) will be served in the ESII canteen.
Lunch (not included in the registration) is available in the ESII canteen and in other restaurants and canteens close to the ESII.

Monday, July 10

Room: Salón de Actos ESII

8:30 – 9:00 – Registration and Welcome

9:00 – 9:30 – Opening

9:30 – 11:00 – Shared Pool of Virtualized Accelerators: A Key Architectural Innovation for Power-Efficient Clusters
Jose Duato, Technical University of Valencia, Spain.

Abstract: Power consumption has become a critical issue in HPC clusters and datacenters, and is currently the main obstacle in the Exaflop race. A well-known strategy to increase power efficiency consists of incorporating computing accelerators into the system architecture. Currently, many HPC systems include GPUs, but future systems will likely use a larger variety of accelerators: FPGAs, wavefront arrays of fixed-point units for AI, and ultimately, quantum accelerators. However, the average utilization of GPUs in current HPC systems is quite low, and more specialized accelerators are likely to exhibit even lower utilization.

In this talk, we present a technique to drastically increase the utilization of computing accelerators. It consists of a combination of virtualizing those accelerators to enable concurrent access from different application threads, and providing support for remote access through the system’s high-speed interconnect. The result is a shared pool of virtualized accelerators that will increase resource utilization, application execution speed, and system throughput while reducing energy consumption. A key component to efficiently implement remote accelerator access is a high-speed interconnection network.

We will present performance results for a particular implementation of this architectural innovation, based on enabling access to remote NVIDIA GPUs through an InfiniBand network for unmodified CUDA-accelerated applications. We will also introduce some alternative approaches to implementing remote access to accelerators, as well as some enhancements for this class of techniques.

11:00 – 11:30 – Coffee Break

11:30 – 13:00 – Congestion Management for Current and Future Interconnection Networks: Challenges and Solutions
Pedro Javier Garcia, University of Castilla-La Mancha (UCLM), Spain.

Abstract: Congestion appears in interconnection networks when intense traffic clogs internal paths, thus slowing down traffic and degrading network performance. This keynote offers an overview of current strategies to avoid, reduce or eliminate network congestion and/or its negative effects, analyzing their suitability for future Exascale systems.

13:00 – 15:00 – Lunch time

15:00 – 17:00 – Practical Lab: Hands-on configuring HPC networks
German Maglione-Mathey and Raul Galindo-Moreno, University of Castilla-La Mancha (UCLM), Spain

This session is intended to be entirely practical: attendees will learn several procedures and methods for configuring high-performance interconnection networks. Specifically, we will work with the CELLIA cluster of the RAAP research group at the Albacete Research Institute (I3A) of the UCLM.

17:00 – 17:30 – Coffee Break

17:30 – 19:00 – Many-core processors interfacing interconnection networks: lessons learned and possible future directions
Holger Fröning, Ruprecht-Karls University of Heidelberg, Germany.

Abstract: Concurrency galore is currently defining computing at all levels, leading to a vast amount of parallelism even for small computing systems. Technology constraints prohibit a reversal of this trend, and the still unsatisfied need for more computing power has led to a pervasive use of accelerators to speed up computations. In spite of this pervasive use, accelerators are typically supervised by general-purpose CPUs, which results in frequent control flow switches and data transfers, as CPUs are responsible for communication tasks. This talk will briefly introduce the current state of the art for accelerator-centric communication, and review our observations and insights from experimenting with accelerators sourcing and sinking network traffic. We will discuss current options and limitations, as well as implications for interconnection networks and tool stacks like MPI. In particular, we will learn why specialized processors require specialized communication models. Finally, the talk will offer some opinions on anticipated research problems.

Tuesday, July 11

Room: Salón de Actos ESII

9:00 – 10:30 – Efficient integration of distributed applications with high performance I/O technologies
Bernard Metzler, IBM Zurich Research Laboratory, Zurich, Switzerland.

Abstract: Recent years have seen an unprecedented, rapid development of both network and storage technologies. With the widespread deployment of multi-gigabit Ethernet networking and fast non-volatile storage class memory, these new technologies have become a commodity. However, with today’s communication software stacks, this more than hundredfold speedup in I/O performance does not translate into adequate application acceleration. During this session we identify architectural limitations of the legacy I/O stack of today’s compute and storage systems and discuss techniques to export high-performance I/O right into the application. We focus our discussion on restructuring the communication stack following the principle of user-space network and storage I/O. We exemplify our findings with the acceleration of two real-world distributed applications: we introduce Crail, a user-level I/O architecture to accelerate the Apache data processing ecosystem, and DSS, a distributed shared storage subsystem that serves as a storage abstraction for large-scale brain simulations within the European Human Brain Project.

10:30 – 11:00 – Coffee Break

11:00 – 12:30 – New trends in Data Center and HPC networks
Eitan Zahavi, Mellanox Technologies, Israel.

Abstract: In this session we will inspect two future trends in HPC and data center networks. In recent years, the demand for bare-metal provisioning for machine learning and HPC clouds has grown. This need stems from the requirement for predictable performance, and thus isolation, that some modern applications rely on. As a result, cloud network isolation is no longer just VLAN-based but should actually provide some real guarantees. With the ever-increasing demand for east-west bandwidth, data center networks (DCNs) are turning to optical technologies to sustain exponential bandwidth growth. Consequently, Optical Data Center Networks (ODCNs) are on the rise. We will review their promise and fundamental limitations via the introduction of NEPHELE, a TDMA ODCN.

12:30 – 14:00 – Intel Omni-Path Fabric: Architecture and technology overview
Gaspar Mora, Intel Corp., Santa Clara, USA.

Abstract: Wondering about the new interconnect solution with 28 systems on the current TOP500 list, including the fastest 100 Gbps cluster? This is an introduction to the recent Intel Omni-Path Architecture (Intel OPA), an end-to-end fabric solution that delivers the performance for tomorrow’s high performance computing (HPC) workloads and the ability to scale to tens of thousands of nodes. Special focus will be on the micro-architecture of the 48-port, 4.8 Tbps switch ASIC that powers the Intel Omni-Path Edge switches used to build large multitier fabrics for HPC systems.

14:00 – 16:00 – Lunch time

16:00 – 17:30 – Exascale fabric administration tools – BXI Software solutions
Alain Cady, Atos/BULL.

Abstract: For HPC systems, and in the context of a leap to exascale, the scalability of system management solutions will be a mandatory feature. At the same time, HPC administrators will be asked to install and manage a plethora of systems that vary in size and network complexity. Management solutions should also be flexible enough to be easily configured, installed and used on different HPC systems, from teraflops to exaflops systems. With Bull BXI Software, we provide a series of tools that allow administrators to install, configure and operate the entire system or individual network devices. The proposed solutions also enable administrators to become aware of and pinpoint potential causes of service degradation much more rapidly than is possible today.

17:30 – 18:00 – Coffee Break

18:00 – 19:30 – Panel Session. The Future of the Interconnect Technology in the Exascale and Big-Data Era.
Moderator: Pedro Javier Garcia, University of Castilla-La Mancha (UCLM), Spain.

Panelists:

Jose Duato, Technical University of Valencia, Spain
Eitan Zahavi, Mellanox Technologies, Israel
Alain Cady, Atos/BULL
Bernard Metzler, IBM Zurich Research Laboratory, Zurich, Switzerland

Application demands for computing power and data processing have increased dramatically during this decade, and these requirements clearly influence the design of the computing infrastructure. This panel will gather experts from both academia and industry to share their opinions on technology advances for current and future high-performance interconnection networks.

19:30 – 20:00 – Farewell