{"id":85,"date":"2019-01-16T17:59:58","date_gmt":"2019-01-16T17:59:58","guid":{"rendered":"http:\/\/hipineb.i3a.info\/hipineb2019\/?page_id=85"},"modified":"2019-02-17T13:05:49","modified_gmt":"2019-02-17T13:05:49","slug":"program","status":"publish","type":"page","link":"https:\/\/hipineb.i3a.info\/hipineb2019\/program\/","title":{"rendered":"Program"},"content":{"rendered":"<h2><\/h2>\n<h2><strong>PROGRAM HIGHLIGHTS<\/strong><\/h2>\n<p>The HiPINEB workshop comprises this year the following activities:<\/p>\n<ul>\n<li><span style=\"text-decoration: underline;\">Keynote<\/span>\u00a0will be given by <strong>Prof. Dhabaleswar K. (DK) Panda<\/strong>\u00a0(Ohio State University, USA).<\/li>\n<li><span style=\"text-decoration: underline;\">Research Papers<\/span>: <strong>research papers<\/strong> will be presented<\/li>\n<li><u>Invited talks<\/u> given by <strong>Prof. John Kim<\/strong> (KAIST, South Korea), <strong>Prof. Lizhong Chen<\/strong> (Oregon State University, USA), and\u00a0<strong>Prof. Torsten Hoefler<\/strong> (ETH Z\u00fcrich, Switzerland).<\/li>\n<\/ul>\n<hr \/>\n<h2>PROGRAM AT A GLANCE<\/h2>\n<p><span style=\"color: #ff0000;\"><strong><i>Room: Scarlet Oak\u00a0<\/i><\/strong><\/span><\/p>\n<div>\n<table style=\"border-collapse: collapse; width: 100%; height: 175px;\" border=\"1\">\n<tbody>\n<tr style=\"height: 25px;\">\n<td style=\"width: 18.2234%; height: 25px;\">8:15 &#8211; 8:30am<\/td>\n<td style=\"width: 81.7766%; height: 25px;\"><strong>Opening<\/strong><\/td>\n<\/tr>\n<tr style=\"height: 25px;\">\n<td style=\"width: 18.2234%; height: 25px;\">8:30 &#8211; 10:00am<\/td>\n<td style=\"width: 81.7766%; height: 25px;\"><span style=\"text-decoration: underline;\"><strong>Keynote<\/strong><\/span><strong>:<\/strong><\/p>\n<p><em><strong><a href=\"#keynote\">RDMA-Based Networking Technologies and Middleware for Next-Generation Clusters and Data Centers<\/a><br \/>\n<\/strong><\/em>Prof. Dhabaleswar K. 
(DK) Panda, The Ohio State University, USA<em><strong><br \/>\n<\/strong><\/em><\/td>\n<\/tr>\n<tr style=\"height: 25px;\">\n<td style=\"width: 18.2234%; height: 25px;\">10:00 &#8211; 10:30am<\/td>\n<td style=\"width: 81.7766%; height: 25px;\"><strong>Break<\/strong><\/td>\n<\/tr>\n<tr style=\"height: 25px;\">\n<td style=\"width: 18.2234%; height: 25px;\">10:30 &#8211; 12:00pm<\/td>\n<td style=\"width: 81.7766%; height: 25px;\"><span style=\"text-decoration: underline;\"><strong>Research papers<\/strong><\/span><strong>:<\/strong><\/p>\n<p><a href=\"#bunde\"><em><strong>Shortest paths in Dragonfly systems<\/strong><\/em><\/a><br \/>\n<em>Ryland Curtsinger and David Bunde, Knox College, USA<\/em><\/p>\n<p><a href=\"#zahn\"><strong><em>Effects of Congestion Management on Energy Saving Techniques in Interconnection Networks<\/em><\/strong><\/a><br \/>\n<em>Felix Zahn, Pedro Yebenes, Jesus Escudero-Sahuquillo, Pedro Javier Garcia and Holger Froening, Heidelberg University, Germany<\/em><\/p>\n<p><span style=\"text-decoration: underline;\"><strong>Invited Talk<\/strong><\/span><\/p>\n<p><strong><em><a href=\"#kim\">Revisiting the Dragonfly Topology in High-Performance Interconnection Networks<\/a><br \/>\n<\/em><\/strong><em>Prof. John Kim,\u00a0Associate Professor, KAIST, South Korea<\/em><\/td>\n<\/tr>\n<tr style=\"height: 25px;\">\n<td style=\"width: 18.2234%; height: 25px;\">12:00 &#8211; 1:00pm<\/td>\n<td style=\"width: 81.7766%; height: 25px;\"><strong>Lunch<\/strong><\/td>\n<\/tr>\n<tr style=\"height: 25px;\">\n<td style=\"width: 18.2234%; height: 25px;\">1:00 &#8211; 3:00pm<\/td>\n<td style=\"width: 81.7766%; height: 25px;\"><span style=\"text-decoration: underline;\"><strong>Invited talks<\/strong><\/span><strong>:<\/strong><\/p>\n<p><strong><em><a href=\"#chen\">Routerless Network-on-Chip and Its Optimizations by Deep Reinforcement Learning<\/a><br \/>\n<\/em><\/strong><em>Prof. 
Lizhong Chen, Oregon State University, USA<\/em><\/p>\n<p><em><strong><a href=\"#hoefler\">Hardware implementations of streaming Processing in the Network NICs<\/a><br \/>\n<\/strong><\/em><em>Prof. Torsten Hoefler, ETH Z\u00fcrich, Switzerland<\/em><\/td>\n<\/tr>\n<tr style=\"height: 25px;\">\n<td style=\"width: 18.2234%; height: 25px;\">3:00 &#8211; 3:15pm<\/td>\n<td style=\"width: 81.7766%; height: 25px;\"><strong>Closing remarks<\/strong><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<hr \/>\n<h2>PROGRAM DESCRIPTION<\/h2>\n<h3><a id=\"keynote\"><\/a>KEYNOTE<\/h3>\n<p><em><strong>RDMA-Based Networking Technologies and Middleware for Next-Generation Clusters and Data Centers<\/strong><\/em><\/p>\n<p><img fetchpriority=\"high\" decoding=\"async\" class=\"alignnone size-medium wp-image-102\" src=\"http:\/\/hipineb.i3a.info\/hipineb2019\/wp-content\/uploads\/sites\/10\/2019\/01\/DKPandaBio-235x300.jpg\" alt=\"\" width=\"235\" height=\"300\" srcset=\"https:\/\/hipineb.i3a.info\/hipineb2019\/wp-content\/uploads\/sites\/10\/2019\/01\/DKPandaBio-235x300.jpg 235w, https:\/\/hipineb.i3a.info\/hipineb2019\/wp-content\/uploads\/sites\/10\/2019\/01\/DKPandaBio.jpg 742w\" sizes=\"(max-width: 235px) 100vw, 235px\" \/><\/p>\n<p><em>Prof. Dhabaleswar K. (DK) Panda, The Ohio State University, USA<\/em><\/p>\n<p>This talk will focus on emerging technologies and middleware for designing next-generation clusters and data centers with high performance and scalability. The role and significance of RDMA technology with InfiniBand, RoCE (v1 and v2), and Omni-Path will be presented. Challenges in designing high-performance middleware for running HPC, Big Data and Deep Learning applications on these systems while exploiting the underlying networking features will be highlighted. On the HPC front, RDMA-based designs for MPI and PGAS libraries on modern clusters with GPGPUs will be presented. 
An overview of RDMA-based designs for Spark, Hadoop, HBase, and Memcached will be presented. On the Deep Learning side, RDMA-based designs for popular Deep Learning frameworks such as TensorFlow, Caffe, and CNTK will be highlighted. The talk will conclude with challenges in providing efficient virtualization support for next-generation clusters and data centers with CPUs and accelerators.<\/p>\n<p><strong>DK Panda<\/strong> is a Professor and University Distinguished Scholar of Computer Science and Engineering at the Ohio State University.\u00a0 He has published over 450 papers in the area of high-end computing and networking.\u00a0 The MVAPICH2 (High Performance MPI and PGAS over InfiniBand, Omni-Path, iWARP and RoCE) libraries, designed and developed by his research group (<a href=\"http:\/\/mvapich.cse.ohio-state.edu\">http:\/\/mvapich.cse.ohio-state.edu<\/a>), are currently being used by more than 2,950 organizations worldwide (in 86 countries). More than 518,000 downloads of this software have taken place from the project&#8217;s site. This software is empowering several InfiniBand clusters (including the 3<sup>rd<\/sup>, 14<sup>th<\/sup>, 17<sup>th<\/sup>, and 27<sup>th<\/sup> ranked ones) in the TOP500 list. The RDMA packages for Apache Spark, Apache Hadoop and Memcached, together with the OSU HiBD benchmarks from his group (<a href=\"http:\/\/hibd.cse.ohio-state.edu\">http:\/\/hibd.cse.ohio-state.edu<\/a>), are also publicly available.\u00a0 These libraries are currently being used by more than 300 organizations in 35 countries. More than 28,000 downloads of these libraries have taken place. High-performance and scalable versions of the TensorFlow and Caffe frameworks are available from <a href=\"http:\/\/hidl.cse.ohio-state.edu\">http:\/\/hidl.cse.ohio-state.edu<\/a>. Prof. Panda is an IEEE Fellow. More details about Prof. 
Panda are available at <a href=\"http:\/\/www.cse.ohio-state.edu\/~panda\">http:\/\/www.cse.ohio-state.edu\/~panda<\/a>.<\/p>\n<hr \/>\n<h3>RESEARCH PAPERS<\/h3>\n<p><em><strong><a id=\"bunde\"><\/a>Shortest paths in Dragonfly systems<\/strong><\/em><br \/>\nRyland Curtsinger and David Bunde, Knox College, USA<\/p>\n<p>Dragonfly is a topology for high-performance computer systems designed to exploit technology trends and meet challenging system constraints, particularly on power. In a Dragonfly system, compute nodes are attached to switches, the switches are organized into groups, and the network is organized as a two-level clique, with an edge between every pair of switches in a group and an edge between every pair of groups. This means that every pair of switches is separated by at most three hops: one within the source group, one from the source group to the destination group, and one within the destination group. Routing using paths of this form is typically called \u201cminimal routing\u201d. In this paper, we show that the resulting paths are not always the shortest possible. We then propose a new class of paths that can be used without additional networking hardware and count how many of its members are shorter than or equal in length to the minimal paths.<\/p>\n<p><strong><em><a id=\"zahn\"><\/a>Effects of Congestion Management on Energy Saving Techniques in Interconnection Networks<\/em><\/strong><br \/>\nFelix Zahn, Pedro Yebenes, Jesus Escudero-Sahuquillo, Pedro Javier Garcia and Holger Froening, Heidelberg University, Germany<\/p>\n<p>In the post-Dennard scaling era, energy becomes more and more important. While most components in data centers and supercomputers are becoming increasingly energy-proportional, this trend seems to bypass interconnection networks. Although previous studies have shown huge potential for saving energy in interconnects, the associated performance decrease has been an obstacle. 
An increase in execution time can be caused by decreased bandwidth as well as by transition times, during which links reconfigure and cannot transmit data. This leads to more contention on the network than interconnects usually have to deal with.<br \/>\nCongestion management is used in similar situations to limit the impact of such contention to single links and to prevent it from congesting the entire network. Therefore, we propose combining energy-saving policies and congestion management queueing schemes in order to maintain performance while saving energy. For synthetic hotspot traffic, which we use to stress the network, this combination shows promising results for multiple topologies. In 3D torus, k-ary n-tree, and Dragonfly topologies, this combination provides more than 50% lower latency and more than 50% higher energy efficiency compared to the baseline. Although both techniques aim for fundamentally different goals, none of the investigated configurations seems to suffer any disadvantages from their combination.<\/p>\n<hr \/>\n<h3>INVITED TALKS<\/h3>\n<p><strong><em><a id=\"kim\"><\/a>Revisiting the Dragonfly Topology in High-Performance Interconnection Networks<\/em><\/strong><\/p>\n<p><img decoding=\"async\" src=\"http:\/\/hipineb.i3a.info\/hipineb2015\/wp-content\/uploads\/sites\/3\/2015\/10\/image.jpeg\" alt=\"\" width=\"200\" height=\"261\" \/><\/p>\n<p><em>Prof. John Kim,\u00a0Associate Professor, KAIST, South Korea<\/em><\/p>\n<p>High-radix routers were proposed in high-performance computing to exploit the increasing router pin bandwidth. Building on these routers and the underlying signaling technology, the Dragonfly topology was proposed 10 years ago. The Dragonfly topology has also been implemented in real systems. In this talk, I will revisit the Dragonfly topology and, in particular, the benefits and challenges associated with it. 
In addition, I will try to answer whether the Dragonfly is the most efficient topology for high-performance computing today.<\/p>\n<p><strong>John Kim<\/strong> is currently an associate professor in the School of Electrical Engineering at KAIST (Korea Advanced Institute of Science and Technology) in Daejeon, Korea. John Kim received his Ph.D. from Stanford University and B.S\/M.Eng from Cornell University. His research interests include computer architecture, interconnection networks, security, and mobile systems. Prior to graduate school, he worked on the design of several microprocessors at Intel and at Motorola.<\/p>\n<hr \/>\n<p><strong><em><a id=\"chen\"><\/a>Routerless Network-on-Chip and Its Optimizations by Deep Reinforcement Learning<\/em><\/strong><\/p>\n<p><strong><img decoding=\"async\" class=\"\" src=\"http:\/\/web.engr.oregonstate.edu\/~chenliz\/img\/portrait-chen-large.jpg\" alt=\"Lizhong Chen\" width=\"202\" height=\"277\" \/><\/strong><\/p>\n<p><em>Prof. Lizhong Chen, Oregon State University, USA<\/em><\/p>\n<p>Current and future many-core processors in HPC systems demand highly efficient on-chip networks to connect hundreds or even thousands of processing cores. While router-based networks-on-chip (NoCs) offer excellent scalability, they also incur significant power and area overhead due to complex router structures. In this talk, we present a new class of on-chip networks, referred to as <em>Routerless NoCs<\/em>, where costly routers are eliminated. An example design is proposed that utilizes on-chip wiring resources intelligently to achieve hop counts and scalability comparable to those of router-based NoCs. To explore the large design space of routerless NoCs more effectively, we further develop a novel deep reinforcement learning framework that learns the optimal loop selection for routerless NoCs under various design constraints. 
Compared with a conventional mesh, the proposed design is able to achieve a 9.5X reduction in power, a 7.2X reduction in area, a 2.5X reduction in zero-load packet latency, and a 1.7X increase in throughput. These results demonstrate the viability and promising benefits of the routerless paradigm, and call for future work that continues to improve the performance, reliability and security of routerless NoCs.<\/p>\n<p><strong>Lizhong Chen<\/strong> is currently an Assistant Professor in the School of Electrical Engineering and Computer Science at Oregon State University. Dr. Chen received his Ph.D. in Computer Engineering and M.S. in Electrical Engineering from the University of Southern California in 2014 and 2011, respectively. His research interests include computer architecture, interconnection networks, GPUs, machine learning, hardware accelerators and emerging IoT technologies. Dr. Chen is the recipient of the National Science Foundation\u2019s CRII Award (2016), the NSF CAREER Award (2018), and a Best Paper Nomination at IEEE NAS (2018), and has received multiple other awards\/grants from government agencies and industry. He has served as a program committee member for top computer architecture conferences (e.g., ISCA, DAC, ICS), a reviewer for a number of IEEE and ACM journals (e.g., TC, TPDS, TVLSI, TCAD, TACO), and a panelist on multiple NSF panels related to computer systems architecture. Dr. Chen is also the founder and organizer of the annual International Workshop on AIDArc (AI-assisted Design for Architecture), held in conjunction with ISCA.<\/p>\n<hr \/>\n<p><em><strong><a id=\"hoefler\"><\/a>Hardware implementations of streaming Processing in the Network NICs<\/strong><\/em><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/cacm.acm.org\/system\/assets\/0002\/1806\/hoefler_eth_official.large.jpg?1476779516&amp;1448026227\" alt=\"Torsten Hoefler\" \/><\/p>\n<p><em>Prof. 
Torsten Hoefler, ETH Z\u00fcrich, Switzerland<\/em><\/p>\n<p>We will briefly recap the network acceleration framework streaming Processing in the Network (sPIN), which can best be described as &#8220;CUDA for the network card&#8221;. Then, we will describe two different hardware prototype implementations: one using a Broadcom ARM-based SmartNIC, and a second using a custom RISC-V-based microarchitecture emulated on FPGAs. We will discuss trade-offs and performance for both implementations with several use cases. Overall, we conclude that an implementation is feasible and should take advantage of the properties of the sPIN programming model.<\/p>\n<p><strong>Torsten Hoefler<\/strong> is an Associate Professor of Computer Science at ETH Z\u00fcrich, Switzerland. Before joining ETH, he led the performance modeling and simulation efforts of parallel petascale applications for the NSF-funded\u00a0Blue Waters project at NCSA\/UIUC. He is also a key member of the Message Passing Interface (MPI) Forum, where he chairs the &#8220;Collective Operations and Topologies&#8221; working group. Torsten won best paper awards at the ACM\/IEEE Supercomputing Conference SC10, SC13, SC14, EuroMPI&#8217;13, HPDC&#8217;15, HPDC&#8217;16, IPDPS&#8217;15, and other conferences. He has published numerous peer-reviewed scientific conference and journal articles and\u00a0authored chapters of the MPI-2.2 and MPI-3.0 standards. He received the Latsis prize of ETH Zurich as well as an ERC starting grant in 2015. His\u00a0research interests revolve around the central topic of\u00a0&#8220;Performance-centric System Design&#8221; and include scalable networks,\u00a0parallel programming techniques, and performance modeling. 
Additional\u00a0information about Torsten can be found on his homepage at\u00a0<a href=\"http:\/\/htor.inf.ethz.ch\">htor.inf.ethz.ch<\/a>.<\/p>\n<hr \/>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>PROGRAM HIGHLIGHTS The HiPINEB workshop comprises this year the following activities: Keynote\u00a0will be given by Prof. Dhabaleswar K. (DK) Panda\u00a0(Ohio State University, USA). Research Papers: research papers will be presented &#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"class_list":["post-85","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/hipineb.i3a.info\/hipineb2019\/wp-json\/wp\/v2\/pages\/85","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/hipineb.i3a.info\/hipineb2019\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/hipineb.i3a.info\/hipineb2019\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/hipineb.i3a.info\/hipineb2019\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/hipineb.i3a.info\/hipineb2019\/wp-json\/wp\/v2\/comments?post=85"}],"version-history":[{"count":23,"href":"https:\/\/hipineb.i3a.info\/hipineb2019\/wp-json\/wp\/v2\/pages\/85\/revisions"}],"predecessor-version":[{"id":117,"href":"https:\/\/hipineb.i3a.info\/hipineb2019\/wp-json\/wp\/v2\/pages\/85\/revisions\/117"}],"wp:attachment":[{"href":"https:\/\/hipineb.i3a.info\/hipineb2019\/wp-json\/wp\/v2\/media?parent=85"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}