Intel Silently Merges New AVX-512 Quicksort Library, Up To 17x Improvement

Image source: NumPy via J.Wilson, Wccftech.

NumPy (Numerical Python), a widely used Python library for scientific computing, has recently integrated Intel's C++ header-file library that implements quicksort with AVX-512. With the new SIMD-based code, sorting runs ten to seventeen times faster.

NumPy is a Python library described as providing:

...a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.

— according to the official NumPy project website.

Intel published x86-simd-sort on the company's GitHub as a C++ header-file library for high-performance SIMD sorting. Raghuveer Devulapalli, an Intel engineer, was instrumental in integrating the x86-simd-sort code into NumPy. For now, however, the library provides only an AVX-512 implementation of quicksort.

[The new x86-simd-sort is a] C++ header file library for SIMD based 16-bit, 32-bit and 64-bit data type sorting on x86 processors. Source header files are available in src directory. We currently only have AVX-512 based implementation of quicksort. This repository also includes a test suite which can be built and run to test the sorting algorithms for correctness. It also has benchmarking code to compare its performance relative to std::sort.

Michael Larabel, Linux analyst and editor of Phoronix, reports that the results are highly favorable: with AVX-512, sorting performance in the project improved by a factor of ten to seventeen.

Larabel notes that PR 22315 was introduced into NumPy and "vectorized the quicksort for 16-bit and 64-bit data types" through the AVX-512 integration. He adds that on a system with an 11th Gen Tiger Lake i7-1165G7, 16-bit int sorting saw the largest speedup (seventeen times faster), while 64-bit float sorting saw the smallest (ten times faster). Sorting of 32-bit data types and random arrays improved by a factor of twelve to thirteen. You can see the results of the benchmarks here.
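For readers who want a rough sense of how sort time varies by data type on their own machine, a minimal sketch along these lines may help. It is not the benchmark suite behind the figures above. The `time_sort` helper, the array size, and the repeat count are assumptions made for illustration. Whether the AVX-512 code path is actually used depends on the installed NumPy build and on the CPU, so the numbers it prints may look nothing like the reported speedups.

```python
import timeit

import numpy as np


def time_sort(dtype, n=100_000, repeats=5, seed=0):
    """Return the best time in seconds to sort n random values of the given dtype.

    Hypothetical helper for illustration; n, repeats and seed are arbitrary choices.
    """
    rng = np.random.default_rng(seed)
    if np.issubdtype(dtype, np.integer):
        # Draw integers across the full range of the type.
        info = np.iinfo(dtype)
        data = rng.integers(info.min, info.max, size=n, dtype=dtype, endpoint=True)
    else:
        # Draw floats uniformly from [0, 1).
        data = rng.random(n, dtype=dtype)
    # np.sort returns a sorted copy, so every repeat sorts the same unsorted input.
    timings = timeit.repeat(lambda: np.sort(data, kind="quicksort"), number=1, repeat=repeats)
    return min(timings)


# Time the 16-, 32- and 64-bit types mentioned in the article.
results = {
    np.dtype(dt).name: time_sort(dt)
    for dt in (np.int16, np.int32, np.float32, np.float64)
}
for name, seconds in results.items():
    print(f"{name:>8}: {seconds * 1e3:.3f} ms")
```

Comparing the output between a NumPy release from before the merge and one after it, on the same AVX-512-capable machine, would be one way to observe the kind of difference described. On hardware without AVX-512, little or no change should be expected.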

News Sources: Phoronix, Intel GitHub 1, 2
