Tutorial: Programming with Lightweight Threads using Argobots
Aim of the Tutorial
Threading models already play an important role in intranode programming and will continue to do so. Over the past few decades, lightweight threading models such as Argobots, Qthreads, OmpSs, and Converse have been developed and have been shown to outperform traditional Pthreads by several orders of magnitude. Consequently, they are already used in a number of large software projects, including LLVM OpenMP (through BOLT), Intel DAOS, Chapel, Charm++, and Mercury, and are supported by several MPI implementations, such as MPICH, Open MPI, and MVAPICH, for hybrid MPI+threads programming. In this tutorial, we will focus on Argobots as the primary example of user-level threads, although we will also briefly introduce other user-level threading models. We will start with the fundamentals of lightweight threading models and then cover advanced features that allow users to achieve the best performance for their specific applications. We will also cover high-level programming interfaces that use lightweight threads, including OpenMP, focusing on aspects that users should watch out for. The tutorial will include a short hands-on “break” after each concept is introduced, so that attendees can apply the concept in additional exercises.
Outline
- Introduction
- Background: OS-Level Threads and Lightweight Threads
- Argobots and other Lightweight Threads
- Running Examples: 2D Stencil Code
- User-Level Thread Scheduling
- User-Defined Schedulers
- Thread Pools
- Synchronization Objects
- Break
- Performance and Correctness Debugging
- Tracking and Identifying Bugs
- Performance Debugging
- Performance Considerations with Massive Parallelism
- Advanced Topics
- OpenMP over Lightweight Threads
- MPI + Lightweight Threads
- Conclusions and Final Q/A
Target Audience
This tutorial is intended for various categories of people working in high-performance computing, storage, networking, and applications related to high-end systems:
- Scientists, engineers, and researchers working on the design and development of next-generation high-end systems, including clusters, data centers, and storage centers.
- Developers of next-generation computing middleware and applications.
- Managers and administrators responsible for setting up next-generation high-end systems and facilities in their organizations or laboratories.
Prerequisite Knowledge
This tutorial is intended for an audience that is already familiar with basic OS-level threading concepts and programming (for example, with Pthreads), although attendees need not be experts. In particular, the code examples will be presented assuming this familiarity.