Molecular dynamics (MD) simulations are playing an increasingly important role in many research areas. Pair-wise potentials are widely used in MD simulations of bio-molecules, polymers, and nanoscale materials. Because of their low compute-to-memory-access ratio, their calculation is often limited by memory transfer speeds. Sunway TaihuLight is one of the fastest supercomputers and features its own SW26010 many-core processor. Since the SW26010 has critical limitations in main memory bandwidth and scratchpad memory size, it is a good platform for investigating the optimization of pair-wise potentials, especially in terms of data reuse. MD algorithms often use a neighbor-list data structure to reduce the computational workload. In this paper, we show that a cell-linked-list-based approach is more suitable for the SW26010 processor. We apply a number of novel optimization methods, including self-adaptable replica-summation for conflict-free parallelization, parameter profiles for flexible vectorization, and fast intersection filters for reducing the computational workload. Experiments show that our implementation on a single SW26010 achieves almost the same performance as a KNL-based Xeon Phi for the non-bonded kernel of AMBER. We also established ESMD, an open-source standalone framework featuring the techniques above, which is at least 50% faster than the latest existing LAMMPS port on a single TaihuLight node. Furthermore, ESMD achieves a weak scaling efficiency of 88% on 4,096 nodes.
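To make the cell-linked-list idea concrete, the following is a minimal Python sketch of the general technique, not the ESMD implementation. It assumes a non-periodic box and cubic cells whose edge length equals the cutoff, and the function names (`build_cells`, `pairs_within_cutoff`, `pairs_brute_force`) are hypothetical. Particles are binned into cells, and each particle is then compared only against particles in its own cell and the 26 surrounding cells, rather than against a per-particle neighbor list held in memory.

```python
import itertools
import random
from collections import defaultdict


def build_cells(positions, cutoff):
    """Bin particle indices into cubic cells whose edge equals the cutoff."""
    cells = defaultdict(list)
    for i, (x, y, z) in enumerate(positions):
        key = (int(x // cutoff), int(y // cutoff), int(z // cutoff))
        cells[key].append(i)
    return cells


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def pairs_within_cutoff(positions, cutoff):
    """Return the set of pairs (i, j), i < j, closer than the cutoff."""
    cells = build_cells(positions, cutoff)
    cutoff_sq = cutoff * cutoff
    found = set()
    for (cx, cy, cz), members in cells.items():
        # Partners can only sit in the home cell or its 26 adjacent cells.
        for dx, dy, dz in itertools.product((-1, 0, 1), repeat=3):
            # .get() does not create missing entries in the defaultdict.
            neighbours = cells.get((cx + dx, cy + dy, cz + dz))
            if not neighbours:
                continue
            for i in members:
                for j in neighbours:
                    # i < j keeps each pair once and skips self-pairs.
                    if i < j and squared_distance(positions[i], positions[j]) < cutoff_sq:
                        found.add((i, j))
    return found


def pairs_brute_force(positions, cutoff):
    """Reference O(N^2) search used only to check the cell-based result."""
    cutoff_sq = cutoff * cutoff
    return {
        (i, j)
        for i, j in itertools.combinations(range(len(positions)), 2)
        if squared_distance(positions[i], positions[j]) < cutoff_sq
    }


if __name__ == "__main__":
    random.seed(0)
    points = [tuple(random.uniform(0.0, 10.0) for _ in range(3)) for _ in range(200)]
    assert pairs_within_cutoff(points, 2.5) == pairs_brute_force(points, 2.5)
```

Because all particles in a cell share the same set of candidate cells, their coordinates can be loaded once and reused for many pair evaluations, which is the kind of data reuse that matters when memory bandwidth and scratchpad size are limited. A production kernel would additionally handle periodic boundaries, evaluate the potential itself, and vectorize the inner loops.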