Wed 26 Feb 2020 10:25 - 10:50 - Concurrency and GPU (Mediterranean Ballroom) Chair(s): Ang Li

In this paper, we present the first comprehensive performance characterization and optimization of ARM barriers on both mobile and server platforms. We draw a set of observations from several abstracted models and validate them in scenarios where barriers are used intensively. We find that (1) order-preserving approaches that do not involve the bus significantly outperform the others, and (2) most of the overhead comes from barriers that strictly follow remote memory references. Such barriers are usually inserted when threads exchange data, where they enforce the relative order between storing the data to a shared buffer and setting a flag to inform the receiver. Based on these observations, we propose a new mechanism, Pilot, which removes such barriers by leveraging single-copy atomicity to piggyback the flag with the data. Applying Pilot yields 10%-380% performance improvements across multiple benchmarks, close to the ideal performance without barriers.
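
The data-plus-flag pattern described in the abstract, and the general idea of piggybacking the flag with the data, can be sketched in C++ as follows. This is a minimal illustration under stated assumptions, not the authors' Pilot implementation: the type and function names (BarrierSlot, PiggybackSlot, barrier_send, piggyback_send, and so on) are hypothetical, and the 32-bit sequence / 32-bit payload split of a 64-bit word is an assumed layout chosen for the example. The baseline keeps data and flag in separate words, so the producer needs a release store (which implies ordering, for example a barrier, on ARM) between them. The piggyback variant packs both into one aligned 64-bit word, so a single store publishes them together and no ordering against a second location is required.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Baseline: payload and flag are separate words. The release store on the
// flag orders the earlier data write before it.
struct BarrierSlot {
    std::uint32_t data{0};
    std::atomic<std::uint32_t> flag{0};
};

void barrier_send(BarrierSlot& s, std::uint32_t value) {
    s.data = value;                                // 1. write the payload
    s.flag.store(1, std::memory_order_release);    // 2. publish, ordered after 1
}

bool barrier_try_recv(BarrierSlot& s, std::uint32_t& out) {
    if (s.flag.load(std::memory_order_acquire) == 0) return false;
    out = s.data;                                  // safe: ordered after the flag load
    return true;
}

// Piggyback sketch (hypothetical layout): the high 32 bits carry a sequence
// number acting as the flag, the low 32 bits carry the payload.
struct PiggybackSlot {
    std::atomic<std::uint64_t> word{0};            // sequence 0 means "empty"
};

void piggyback_send(PiggybackSlot& s, std::uint32_t seq, std::uint32_t value) {
    assert(seq != 0);                              // 0 is reserved for "empty"
    std::uint64_t packed = (static_cast<std::uint64_t>(seq) << 32) | value;
    // One atomic store publishes flag and data together; relaxed ordering
    // suffices because no second memory location has to be ordered with it.
    s.word.store(packed, std::memory_order_relaxed);
}

bool piggyback_try_recv(const PiggybackSlot& s, std::uint32_t expected_seq,
                        std::uint32_t& out) {
    std::uint64_t packed = s.word.load(std::memory_order_relaxed);
    if (static_cast<std::uint32_t>(packed >> 32) != expected_seq) return false;
    out = static_cast<std::uint32_t>(packed);      // low 32 bits: payload
    return true;
}
```

In this sketch the receiver polls piggyback_try_recv with the sequence number it expects next. Because the sequence and payload arrive in the same atomic word, it never observes a new flag paired with stale data. The trade-off of this assumed layout is that the payload must fit in the bits left over after the flag.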

Wed 26 Feb

Displayed time zone: Tijuana, Baja California

09:35 - 10:50
Concurrency and GPU (Mediterranean Ballroom) Main Conference
Chair(s): Ang Li Pacific Northwest National Laboratory
09:35
25m
Talk
Overlapping Host-to-Device Copy and Computation using Hidden Unified Memory
Main Conference
Jaehoon Jung Seoul National University, Daeyoung Park Seoul National University, Youngdong Do Seoul National University, Jungho Park Seoul National University, Jaejin Lee Seoul National University
10:00
25m
Talk
GPU Initiated OpenSHMEM: Correct and Efficient Intra-Kernel Networking for dGPUs
Main Conference
Khaled Hamidouche Advanced Micro Devices (AMD), Michael LeBeane Advanced Micro Devices (AMD)
10:25
25m
Talk
No Barrier in the Road: A Comprehensive Study and Optimization of ARM Barriers
Main Conference
Nian Liu Shanghai Jiao Tong University, Binyu Zang Shanghai Jiao Tong University, Haibo Chen Shanghai Jiao Tong University