Write a Blog >>

Streaming computations often exhibit substantial data parallelism that makes them suited to SIMD architectures. However, many such computations also exhibit irregularity, in the form of data-dependent, dynamic data rates, that makes efficient SIMD execution challenging. One aspect of this challenge is the need to schedule execution of a computation realized as a pipeline of stages connected by finite queues. A scheduler must both ensure high SIMD occupancy by gathering queued items into vectors and minimize costs associated with switching execution between pipeline stages.

In this work, we present the AFIE (Active Full, Inactive Empty) scheduling policy for irregular streaming applications on SIMD processors. AFIE provably groups inputs to each stage of a pipeline into a minimal number of SIMD vectors while incurring a bounded number of switches relative to the best possible policy. These results apply even though irregularity prevents the scheduler from knowing a priori how many outputs will be generated for each input to each node in a pipeline.

We have implemented AFIE as an extension to the MERCATOR system for building irregular streaming applications on NVIDIA GPUs. We report our observations on how the AFIE scheduler simplifies MERCATOR’s runtime code and empirically measure the new scheduler’s improved performance on irregular streaming applications.