Write a Blog >>
Tue 25 Feb 2020 10:00 - 10:25 - Scaling (Mediterranean Ballroom) Chair(s): Zhijia Zhao

Finite State Machine (FSM) is a key component in many important applications, such as Huffman decoding, regular expression matching, and HTML tokenization. Due to its inherent dependency and unpredictable memory access, FSM computations are considered to be extremely difficult to parallelize. As such, significant research efforts have been made to accelerate FSM computations. Although they achieve promising performance results on multi-core machines, these methods are not scalable for emerging many-core architectures such as GPUs.

We analyzed the bottleneck of achieving scalability on GPUs is the sequential merge inherent to these methods. However, different from simple reduction loops, parallelization of FSM computations typically incorporates runtime check and re-execution, which may significantly affect performance. Based on these observations, our parallel merge implementations select efficient runtime check implementations and avoids unnecessary re-executions. Further, based on GPU architectural features, we develop optimization techniques to further improve performance.

We evaluate our parallel merge implementations on a set of representative algorithms. Experimental results show that our parallel merge implementation is several times more efficient than sequential merge implementations and achieves linear scalability on different GPUs.

Tue 25 Feb

Displayed time zone: Tijuana, Baja California change

09:35 - 10:25
Scaling (Mediterranean Ballroom)Main Conference
Chair(s): Zhijia Zhao UC Riverside
Using Sample-Based Time Series Data for Automated Diagnosis of Scalability Losses in Parallel Programs
Main Conference
Lai Wei Rice University, John Mellor-Crummey Rice University
Scaling out Speculative Execution of Finite-State Machines with Parallel Merge
Main Conference
Yang Xia The Ohio State University, Peng Jiang The University of Iowa, Gagan Agrawal The Ohio State University