Hello Guix!

This patch series is an attempt to allow users to build or substitute
packages for the very CPU they are using, as opposed to using a generic
binary that targets the baseline architecture (e.g., x86_64 without AVX
extensions).

As a reminder, my take on this is that The Right Thing is for code to
select optimized implementations for the host CPU at load time, using
(possibly hand-crafted) “function multi-versioning”:

  https://hpc.guix.info/blog/2018/01/pre-built-binaries-vs-performance/

Now, there’s at least one situation where developers don’t do “the right
thing”: C++ header-only libraries.  It turns out header-only libraries
with #ifdef’d SIMD code are quite common: Eigen, xsimd, xtensor, etc.
Every user of those libs has to be compiled with ‘-march=native’ to take
advantage of those SIMD-optimized routines and there’s little hope of
seeing those libraries implement load-time or run-time selection¹.

This patch set implements “package multi-versioning”, where a package
can have different variants users may choose from: baseline, haswell,
skylake, etc.  This is implemented as a package transformation option,
‘--tune’.  Without any argument, ‘--tune’ grafts tuned package variants
for each package that has the ‘tunable?’ property.  For example:

  guix shell eigen-benchmarks --tune -- benchBlasGemm 16 16 16 100 100

runs one of the Eigen benchmarks tuned for the host CPU, because
‘eigen-benchmarks’ is marked as “tunable”.

This is achieved not by passing ‘-march=native’, because the daemon
might be running on a separate machine with a different CPU, but by
identifying the ‘-march’ value corresponding to the host CPU and passing
‘-march’ to the compiler, via a wrapper.

On my skylake laptop, that gives a noticeable difference on the GEMM
benchmark of Eigen and good results on the xtensor benchmarks too,
unsurprisingly.
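
To make the mechanism above more concrete, here is a rough Python sketch
of the two steps: picking a ‘-march’ name from the host CPU flags, then
having a wrapper add it to the compiler command line.  This is not the
code from the series (which lives in (guix cpu) and in the
transformation itself); the flag table is a deliberately simplified
assumption, and the names ‘parse_flags’, ‘march_for_flags’ and
‘wrapped_command’ are hypothetical.

```python
# Illustrative table, most specific micro-architecture first.  The flag
# sets are a simplified assumption, NOT the exact criteria used by GCC
# or by (guix cpu).
MARCH_TABLE = [
    ("skylake-avx512",
     {"avx512f", "avx512vl", "avx512bw", "avx512dq", "avx512cd"}),
    ("skylake", {"avx2", "fma", "bmi2", "adx", "clflushopt", "xsavec"}),
    ("haswell", {"avx2", "fma", "bmi1", "bmi2", "movbe"}),
    ("sandybridge", {"avx", "aes", "pclmulqdq"}),
]
BASELINE = "x86-64"


def parse_flags(cpuinfo_text):
    """Return the set of CPU flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def march_for_flags(flags):
    """Return the first table entry whose required flags are all present."""
    for name, required in MARCH_TABLE:
        if required <= flags:
            return name
    return BASELINE


def wrapped_command(compiler, march, args):
    """Build the command line a (hypothetical) compiler wrapper would run."""
    return [compiler, f"-march={march}", *args]
```

On a GNU/Linux host, one would feed the contents of /proc/cpuinfo to
‘parse_flags’.  The point of computing an explicit name rather than
using ‘native’ is that the name can be sent to a daemon running on a
different machine and still mean the same thing there.
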
I don’t have figures for higher-level applications, but it’d be nice to
benchmark some of Eigen’s dependents for instance, as shown by:

  guix graph -M2 -t reverse-package eigen | xdot -f fdp -

If you could run such benchmarks, that’d be great!  :-)  Things like
Fenics may benefit from it.

Nix people chose to introduce separate system types for the various
x86_64 micro-architecture levels: x86_64-linux-v1, x86_64-linux-v2,
etc.²  I think this is somewhat wasteful and impractical though.  It’s
also unclear whether those levels, defined in the new x86_64 psABI³, are
a viable abstraction: vendors seem to be mixing features rather than
really following the cumulative pattern that those levels imply.

Thoughts?

Ludo’.

¹ https://listengine.tuxfamily.org/lists.tuxfamily.org/eigen/2021/11/msg00006.html
² https://discourse.nixos.org/t/nix-2-4-released/15822
³ https://gitlab.com/x86-psABIs/x86-64-ABI/-/blob/master/x86-64-ABI/low-level-sys-info.tex

Ludovic Courtès (10):
  Add (guix cpu).
  transformations: Add '--tune'.
  ci: Add extra jobs for tunable packages.
  gnu: Add eigen-benchmarks.
  gnu: Add xsimd-benchmark.
  gnu: Add xtensor-benchmark.
  gnu: ceres-solver: Mark as tunable.
  gnu: Add ceres-solver-benchmarks.
  gnu: libfive: Mark as tunable.
  gnu: prusa-slicer: Mark as tunable.

 Makefile.am                  |   1 +
 doc/guix.texi                |  54 ++++++++++++++
 gnu/ci.scm                   |  43 ++++++++---
 gnu/packages/algebra.scm     |  79 ++++++++++++++++++++
 gnu/packages/cpp.scm         |  23 ++++++
 gnu/packages/engineering.scm |  10 ++-
 gnu/packages/maths.scm       |  49 ++++++++++++-
 guix/cpu.scm                 | 137 +++++++++++++++++++++++++++++++++++
 guix/transformations.scm     | 134 ++++++++++++++++++++++++++++++++++
 tests/transformations.scm    |  20 +++++
 10 files changed, 538 insertions(+), 12 deletions(-)
 create mode 100644 guix/cpu.scm


base-commit: 052f56e5a614854636563278ee5a2248b3609d87
prerequisite-patch-id: 7e5c2bb5942496daf01a7f6dfc1b0b5b214f1584
-- 
2.33.0