Write a Blog >>
Tue 25 Feb 2020 10:00 - 10:25 - Scaling (Mediterranean Ballroom) Chair(s): Zhijia Zhao

Finite State Machine (FSM) is a key component in many important applications, such as Huffman decoding, regular expression matching, and HTML tokenization. Due to its inherent dependency and unpredictable memory access, FSM computations are considered to be extremely difficult to parallelize. As such, significant research efforts have been made to accelerate FSM computations. Although they achieve promising performance results on multi-core machines, these methods are not scalable for emerging many-core architectures such as GPUs.

We analyzed the bottleneck of achieving scalability on GPUs is the sequential merge inherent to these methods. However, different from simple reduction loops, parallelization of FSM computations typically incorporates runtime check and re-execution, which may significantly affect performance. Based on these observations, our parallel merge implementations select efficient runtime check implementations and avoids unnecessary re-executions. Further, based on GPU architectural features, we develop optimization techniques to further improve performance.

We evaluate our parallel merge implementations on a set of representative algorithms. Experimental results show that our parallel merge implementation is several times more efficient than sequential merge implementations and achieves linear scalability on different GPUs.

Tue 25 Feb
Times are displayed in time zone: (GMT-07:00) Tijuana, Baja California change

09:35 - 10:25: Main Conference - Scaling (Mediterranean Ballroom)
Chair(s): Zhijia ZhaoUC Riverside
PPoPP-2020-papers09:35 - 10:00
Lai WeiRice University, John Mellor-CrummeyRice University
PPoPP-2020-papers10:00 - 10:25
Yang XiaThe Ohio State University, Peng JiangThe University of Iowa, Gagan AgrawalThe Ohio State University