Wed 26 Feb 2020 10:00 - 10:25 - Concurrency and GPU (Mediterranean Ballroom) Chair(s): Ang Li

The current state of the art in GPU networking uses a host-centric, kernel-boundary communication model that reduces performance and increases code complexity. Recent research has explored performing network operations from within a GPU kernel itself. However, these approaches involve the CPU in the critical path, which leads to high latency and inefficient utilization of network and/or GPU resources. In this work, we introduce GPU Initiated OpenSHMEM (GIO), a new GPU-centric PGAS programming model and runtime that enables GPUs to act as first-class citizens in network-based systems. GIO leverages tight integration of GPUs and NICs to provide high-performance intra-kernel networking by enabling GPUs to efficiently and directly initiate network operations. This paper explores the GPU’s coarse-grained memory model and the mismatch that arises when GPUs interact directly with the network from within a kernel. GIO also reduces latency by relying on a new template-based design to minimize the overhead of initiating a network operation. We illustrate that for a regular application like a Jacobi 2D Stencil, GIO can improve application performance by up to 40% compared to traditional kernel-boundary networking. Furthermore, we demonstrate that on irregular applications like Sparse Triangular Solve (SpTS), GIO provides up to 44% improvement compared to existing intra-kernel networking schemes.

Wed 26 Feb

Displayed time zone: Tijuana, Baja California

09:35 - 10:50
Concurrency and GPU (Mediterranean Ballroom) Main Conference
Chair(s): Ang Li Pacific Northwest National Laboratory
09:35
25m
Talk
Overlapping Host-to-Device Copy and Computation using Hidden Unified Memory
Main Conference
Jaehoon Jung Seoul National University, Daeyoung Park Seoul National University, Youngdong Do Seoul National University, Jungho Park Seoul National University, Jaejin Lee Seoul National University
10:00
25m
Talk
GPU Initiated OpenSHMEM: Correct and Efficient Intra-Kernel Networking for dGPUs
Main Conference
Khaled Hamidouche Advanced Micro Devices (AMD), Michael LeBeane Advanced Micro Devices (AMD)
10:25
25m
Talk
No Barrier in the Road: A Comprehensive Study and Optimization of ARM Barriers
Main Conference
Nian Liu Shanghai Jiao Tong University, Binyu Zang Shanghai Jiao Tong University, Haibo Chen Shanghai Jiao Tong University