Principles and Practice of Parallel Programming (PPoPP), Workshop on Programming Models for SIMD/Vector Processing (WPMVP), Mar 2016
Lionel LACASSAGNE, Laurent CABARET, Daniel ETIEMBLE, Farouk HEBBACHE, Andrea PETRETO
This paper presents a new multi-pass iterative algorithm for Connected Component Labeling. Its performance is compared to that of state-of-the-art two-pass direct algorithms. We show that, thanks to the parallelism of SIMD multi-core processors and an activity matrix that avoids useless memory accesses, the performance of such iterative algorithms comes increasingly close to that of direct ones. This new active-tile iterative algorithm has been benchmarked on four generations of Intel Xeon processors: 2×4-core Nehalem, 2×12-core Ivy-Bridge, 2×14-core Haswell and 57-core Knights Corner. Macro meta-programming was used to write a single code base for the SSE, AVX2 and KNC SIMD instruction sets.
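
To illustrate the general idea of an activity matrix, here is a minimal scalar sketch in Python. It is not the authors' implementation, which is SIMD and multi-core; it only shows how a multi-pass labeling can skip tiles in which nothing can change any more. The function name `label_active_tiles`, the tile size, the choice of 4-connectivity and the min-propagation update rule are all assumptions made for this example.

```python
def label_active_tiles(image, tile=2):
    """Iterative 4-connected labeling by min-propagation (illustrative sketch).

    `image` is a list of rows of 0/1 values. Returns (labels, passes).
    A boolean activity matrix, one entry per tile, lets each pass skip
    tiles whose labels cannot change.
    """
    h, w = len(image), len(image[0])
    # Unique positive label per foreground pixel, 0 for background.
    labels = [[y * w + x + 1 if image[y][x] else 0 for x in range(w)]
              for y in range(h)]
    th, tw = (h + tile - 1) // tile, (w + tile - 1) // tile
    active = [[True] * tw for _ in range(th)]  # every tile starts active
    passes = 0
    while any(any(row) for row in active):
        passes += 1
        next_active = [[False] * tw for _ in range(th)]
        for ty in range(th):
            for tx in range(tw):
                if not active[ty][tx]:
                    continue  # inactive tile: no memory access at all
                changed = False
                for y in range(ty * tile, min((ty + 1) * tile, h)):
                    for x in range(tx * tile, min((tx + 1) * tile, w)):
                        if labels[y][x] == 0:
                            continue
                        m = labels[y][x]
                        # Take the smallest label among the 4 neighbors.
                        for ny, nx in ((y - 1, x), (y + 1, x),
                                       (y, x - 1), (y, x + 1)):
                            if 0 <= ny < h and 0 <= nx < w and labels[ny][nx]:
                                m = min(m, labels[ny][nx])
                        if m < labels[y][x]:
                            labels[y][x] = m
                            changed = True
                if changed:
                    # A change may propagate: reactivate this tile
                    # and its 4 neighboring tiles for the next pass.
                    for ny, nx in ((ty, tx), (ty - 1, tx), (ty + 1, tx),
                                   (ty, tx - 1), (ty, tx + 1)):
                        if 0 <= ny < th and 0 <= nx < tw:
                            next_active[ny][nx] = True
        active = next_active
    return labels, passes


if __name__ == "__main__":
    img = [[1, 1, 0, 0],
           [0, 1, 0, 1],
           [0, 0, 0, 1],
           [1, 0, 0, 1]]
    labels, passes = label_active_tiles(img)
    for row in labels:
        print(row)
```

In this sketch every pixel of a component ends up with the smallest initial label of that component, and the loop stops once a full pass leaves every tile unchanged. Tiles far from any recent change are never read, which is the memory-access saving the activity matrix is meant to provide.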