It is not easy to find documentation on Clang’s optimization options, especially for newer versions of LLVM, so I collected them by hand and organized them into tables. This article and its tables were inspired by and forked from lolo32; many thanks!
Currently, two versions of Clang coexist on my Intel NUC 10: Clang 12.0.1 (built from source) and Clang 11.0.0 (installed through yum). The OS information is given in the title. Version 1:

  Clang version 12.0.1
  Target: x86_64-unknown-linux-gnu
  Thread model: posix

The table was generated from the output of these commands:

  echo 'int;' | clang++-12 -xc -O0    - -o /dev/null -###
  echo 'int;' | clang++-12 -xc -O1    - -o /dev/null -###
  echo 'int;' | clang++-12 -xc -O2    - -o /dev/null -###
  echo 'int;' | clang++-12 -xc -O3    - -o /dev/null -###
  echo 'int;' | clang++-12 -xc -Ofast - -o /dev/null -###
  echo 'int;' | clang++-12 -xc -Os    - -o /dev/null -###
  echo 'int;' | clang++-12 -xc -Oz    - -o /dev/null -###
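
The manual collection could probably be automated along the following lines. This is only a rough sketch: the names `CLANG`, `cc1_flags` and `flags_for` are my own, the default binary name is an assumption, and the parsing simply splits the `-cc1` line on spaces, so paths containing spaces would confuse it. Option values (such as the triple) are dropped; only option names are kept.

```shell
# Sketch only: CLANG, cc1_flags and flags_for are illustrative names.
# Point CLANG at whichever binary you want to inspect.
CLANG="${CLANG:-clang++-12}"

# Read `clang -###` output on stdin and print the cc1 option names,
# one per line, sorted and de-duplicated.
cc1_flags() {
  grep -- '"-cc1"' | tr ' ' '\n' | tr -d '"' | grep -E '^-.' | sort -u
}

# Print the cc1 option names for one optimization level, e.g. `flags_for -O2`.
# `-###` writes to stderr, hence the 2>&1.
flags_for() {
  echo 'int;' | "$CLANG" -xc "$1" - -o /dev/null -### 2>&1 | cc1_flags
}

# Only run the compiler if it is actually installed.
if command -v "$CLANG" >/dev/null 2>&1; then
  outdir=$(mktemp -d)
  for level in -O0 -O1 -O2 -O3 -Ofast -Os -Oz; do
    flags_for "$level" > "$outdir/flags${level}.txt"
  done
  echo "flag lists written to $outdir"
  # Options passed at -Ofast but not at -O3:
  comm -13 "$outdir/flags-O3.txt" "$outdir/flags-Ofast.txt"
fi
```

Each `flags-O*.txt` file then corresponds to one column of the table, and `comm` shows which rows differ between two columns.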

-O0 means “no optimization”: this level compiles the fastest and generates the most debuggable code. It also enables the -mrelax-all option.
-O1 is somewhere between -O0 and -O2.
-O2 is a moderate level of optimization which enables most optimizations.
-O3 is like -O2, except that it enables optimizations that take longer to perform or that may generate larger code (in an attempt to make the program run faster).
-Ofast enables everything in -O3, plus other aggressive optimizations that may violate strict compliance with language standards. In particular it turns on -ffast-math, which speeds up math calculations by letting the compiler assume that: 1. floating-point math obeys the regular algebraic rules for real numbers (e.g. + and * are associative, x/y == x * (1/y), and (a + b) * c == a * c + b * c); 2. operands to floating-point operations are never NaN or Inf; and 3. +0 and -0 are interchangeable. -ffast-math also defines the __FAST_MATH__ preprocessor macro. Some math libraries recognize this macro and change their behavior. With the exception of -ffp-contract=fast, explicitly disabling any of the individual optimizations that -ffast-math implies will cause __FAST_MATH__ to no longer be set.
-Os is like -O2 with extra optimizations to reduce code size.
-Oz is like -Os, but tries to reduce code size even further.
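
The `__FAST_MATH__` behavior described for -Ofast can be checked by dumping the predefined macros with `-dM -E`. The snippet below is a small sketch under my own assumptions: `CLANG`, `defines_fast_math` and `report` are names I made up, and I have not recorded the output here, so treat what it prints on your machine as the reference.

```shell
# Sketch only: CLANG, defines_fast_math and report are illustrative names.
CLANG="${CLANG:-clang}"

# Read a `-dM -E` macro dump on stdin; succeed if __FAST_MATH__ is defined.
defines_fast_math() {
  grep -q '^#define __FAST_MATH__ '
}

# Report whether the given options make the compiler predefine __FAST_MATH__.
report() {
  if "$CLANG" "$@" -dM -E -xc /dev/null 2>/dev/null | defines_fast_math; then
    echo "$*: __FAST_MATH__ defined"
  else
    echo "$*: __FAST_MATH__ not defined"
  fi
}

# Only run the compiler if it is actually installed.
if command -v "$CLANG" >/dev/null 2>&1; then
  report -O3
  report -Ofast
  # Turning one fast-math component back off should drop the macro again.
  report -Ofast -fno-finite-math-only
fi
```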

The table for Clang 12.0.1 follows:

Option	-O0	-O1	-O2	-O3	-Ofast	-Os	-Oz	Description
`-cc1`:	✅	✅	✅	✅	✅	✅	✅	the frontend
`-triple x86_64-unknown-linux-gnu`	✅	✅	✅	✅	✅	✅	✅	Specify target triple(architecture)
`-emit-obj`:	✅	✅	✅	✅	✅	✅	✅	Emit native object files
`-mrelax-all`:	✅	❌	❌	❌	❌	❌	❌	(integrated) relax all machine instructions
`--mrelax-relocations`:	✅	✅	✅	✅	✅	✅	✅	These options control whether the assembler should generate relax relocations
`-disable-free`:	✅	✅	✅	✅	✅	✅	✅	Disable freeing of memory on exit
`-mrelocation-model static`:	✅	✅	✅	✅	✅	✅	✅	The relocation model to use (static: the generated code is not position-independent)
`-mframe-pointer=all`:	✅	❌	❌	❌	❌	❌	❌	keep frame pointers
`-mframe-pointer=none`:	❌	✅	✅	✅	✅	✅	✅	Omit frame pointers (which point to the base address of the function’s stack frame)
`-menable-no-inf`:	❌	❌	❌	❌	✅	❌	❌	Allow optimization to assume there are no infinities.
`-menable-no-nans`:	❌	❌	❌	❌	✅	❌	❌	Allow optimization to assume there are no NaNs.
`-menable-unsafe-fp-math`:	❌	❌	❌	❌	✅	❌	❌	Allow unsafe floating-point math optimizations which may decrease precision
`-fno-signed-zeros`:	❌	❌	❌	❌	✅	❌	❌	Allow optimizations for floating point arithmetic that ignore the signedness of zero
`-mreassociate`:	❌	❌	❌	❌	✅	❌	❌	Allow reassociation transformations for floating-point instructions
`-freciprocal-math`:	❌	❌	❌	❌	✅	❌	❌	Allow division operations to be reassociated
`-fdenormal-fp-math=preserve-sign,preserve-sign`:	❌	❌	❌	❌	✅	❌	❌	Select which denormal numbers the code is permitted to require.
`-ffp-contract=fast`:	❌	❌	❌	❌	✅	❌	❌	Form fused FP ops (e.g. FMAs): fast (everywhere) OR on (according to FP_CONTRACT pragma, default) OR off (never fuse)
`-fmath-errno`:	✅	✅	✅	✅	❌	✅	✅	Require math functions to indicate errors by setting errno
`-fno-rounding-math`:	✅	✅	✅	✅	✅	✅	✅	Assume the default rounding mode; the opposite of `-frounding-math`, which forces floating-point operations to honor the dynamically-set rounding mode.
`-ffast-math`:	❌	❌	❌	❌	✅	❌	❌	Enable fast-math mode. This option lets the compiler make aggressive, potentially-lossy assumptions about floating-point math
`-ffinite-math-only`:	❌	❌	❌	❌	✅	❌	❌	Allow floating-point optimizations that assume arguments and results are not NaNs or +-Inf.
`-mconstructor-aliases`:	✅	✅	✅	✅	✅	✅	✅	enable constructor aliases
`-munwind-tables`:	✅	✅	✅	✅	✅	✅	✅	Generate unwinding tables for all functions
`-target-cpu x86-64`:	✅	✅	✅	✅	✅	✅	✅	Target a specific cpu type
`-tune-cpu generic`:	✅	✅	✅	✅	✅	✅	✅	Schedule (order) instructions for the given CPU without changing which instructions may be emitted; the instruction set is chosen by `-target-cpu` (possibly an old baseline such as generic x86-64), while tuning can target a more common CPU such as broadwell or znver2. Same as `-mtune` on GCC
`-fno-split-dwarf-inlining`:	✅	✅	✅	✅	✅	✅	✅	Disable `-fsplit-dwarf-inlining`, which provides minimal debug info in the object/executable to facilitate online symbolication/stack traces in the absence of .dwo/.dwp files when using Split DWARF
`-debugger-tuning=gdb`:	✅	✅	✅	✅	✅	✅	✅	tune the debug info
`-internal-isystem`:	✅	✅	✅	✅	✅	✅	✅
`-internal-externc-isystem`:	✅	✅	✅	✅	✅	✅	✅
`-resource-dir`:	✅	✅	✅	✅	✅	✅	✅	The directory which holds the compiler resource files
`-fdebug-compilation-dir`:	✅	✅	✅	✅	✅	✅	✅	The compilation directory to embed in the debug info.
`-ferror-limit 19`:	✅	✅	✅	✅	✅	✅	✅	Set the maximum number of errors to emit before stopping (0 = no limit).
`-fgnuc-version=4.2.1`:	✅	✅	✅	✅	✅	✅	✅	Sets various macros to claim compatibility with the given GCC version
`-fcolor-diagnostics`:	✅	✅	✅	✅	✅	✅	✅	Use colors in diagnostics
`-faddrsig`:	✅	✅	✅	✅	✅	✅	✅	Emit an address-significance table
`-vectorize-loops`:	❌	❌	✅	✅	✅	✅	❌	Run the Loop vectorization passes
`-vectorize-slp`:	❌	❌	✅	✅	✅	✅	✅	Run the SLP vectorization passes
`-main-file-name`:	✅	✅	✅	✅	✅	✅	✅

And Version 2:

  Clang version 11.0.0 (Red Hat 11.0.0-1.module+el8.4.0)
  Target: x86_64-unknown-linux-gnu
  Thread model: posix

This table was produced with the same commands, replacing clang++-12 with clang++ in each.

Option	-O0	-O1	-O2	-O3	-Ofast	-Os	-Oz	Description
`-cc1`:	✅	✅	✅	✅	✅	✅	✅	the frontend
`-triple x86_64-unknown-linux-gnu`	✅	✅	✅	✅	✅	✅	✅	Specify target triple(architecture)
`-emit-obj`:	✅	✅	✅	✅	✅	✅	✅	Emit native object files
`-mrelax-all`:	✅	❌	❌	❌	❌	❌	❌	(integrated) relax all machine instructions
`-disable-free`:	✅	✅	✅	✅	✅	✅	✅	Disable freeing of memory on exit
`-disable-llvm-verifier`:	✅	✅	✅	✅	✅	✅	✅	Don’t run the LLVM IR verifier pass
`-discard-value-names`:	❌	✅	✅	✅	✅	✅	✅	Discard value names when generating LLVM IR
`-mrelocation-model static`:	✅	✅	✅	✅	✅	✅	✅	The relocation model to use (static: the generated code is not position-independent)
`-mframe-pointer=all`:	✅	❌	❌	❌	❌	❌	❌	keep frame pointers
`-mframe-pointer=none`:	❌	✅	✅	✅	✅	✅	✅	Omit frame pointers (which point to the base address of the function’s stack frame)
`-menable-no-inf`:	❌	❌	❌	❌	✅	❌	❌	Allow optimization to assume there are no infinities.
`-menable-no-nans`:	❌	❌	❌	❌	✅	❌	❌	Allow optimization to assume there are no NaNs.
`-menable-unsafe-fp-math`:	❌	❌	❌	❌	✅	❌	❌	Allow unsafe floating-point math optimizations which may decrease precision
`-fno-signed-zeros`:	❌	❌	❌	❌	✅	❌	❌	Allow optimizations for floating point arithmetic that ignore the signedness of zero
`-mreassociate`:	❌	❌	❌	❌	✅	❌	❌	Allow reassociation transformations for floating-point instructions
`-freciprocal-math`:	❌	❌	❌	❌	✅	❌	❌	Allow division operations to be reassociated
`-fdenormal-fp-math=preserve-sign,preserve-sign`:	❌	❌	❌	❌	✅	❌	❌	Select which denormal numbers the code is permitted to require.
`-ffp-contract=fast`:	❌	❌	❌	❌	✅	❌	❌	Form fused FP ops (e.g. FMAs): fast (everywhere) OR on (according to FP_CONTRACT pragma, default) OR off (never fuse)
`-fmath-errno`:	✅	✅	✅	✅	❌	✅	✅	Require math functions to indicate errors by setting errno
`-fno-rounding-math`:	✅	✅	✅	✅	✅	✅	✅	Assume the default rounding mode; the opposite of `-frounding-math`, which forces floating-point operations to honor the dynamically-set rounding mode.
`-ffast-math`:	❌	❌	❌	❌	✅	❌	❌	Enable fast-math mode. This option lets the compiler make aggressive, potentially-lossy assumptions about floating-point math
`-ffinite-math-only`:	❌	❌	❌	❌	✅	❌	❌	Allow floating-point optimizations that assume arguments and results are not NaNs or +-Inf.
`-mconstructor-aliases`:	✅	✅	✅	✅	✅	✅	✅	enable constructor aliases
`-munwind-tables`:	✅	✅	✅	✅	✅	✅	✅	Generate unwinding tables for all functions
`-target-cpu x86-64`:	✅	✅	✅	✅	✅	✅	✅	Target a specific cpu type
`-fno-split-dwarf-inlining`:	✅	✅	✅	✅	✅	✅	✅	Disable `-fsplit-dwarf-inlining`, which provides minimal debug info in the object/executable to facilitate online symbolication/stack traces in the absence of .dwo/.dwp files when using Split DWARF
`-debugger-tuning=gdb`:	✅	✅	✅	✅	✅	✅	✅	tune the debug info
`-internal-isystem`:	✅	✅	✅	✅	✅	✅	✅
`-internal-externc-isystem`:	✅	✅	✅	✅	✅	✅	✅
`-resource-dir`:	✅	✅	✅	✅	✅	✅	✅	The directory which holds the compiler resource files
`-fdebug-compilation-dir`:	✅	✅	✅	✅	✅	✅	✅	The compilation directory to embed in the debug info.
`-ferror-limit 19`:	✅	✅	✅	✅	✅	✅	✅	Set the maximum number of errors to emit before stopping (0 = no limit).
`-fgnuc-version=4.2.1`:	✅	✅	✅	✅	✅	✅	✅	Sets various macros to claim compatibility with the given GCC version
`-fcolor-diagnostics`:	✅	✅	✅	✅	✅	✅	✅	Use colors in diagnostics
`-faddrsig`:	✅	✅	✅	✅	✅	✅	✅	Emit an address-significance table
`-vectorize-loops`:	❌	❌	✅	✅	✅	✅	❌	Run the Loop vectorization passes
`-vectorize-slp`:	❌	❌	✅	✅	✅	✅	✅	Run the SLP vectorization passes
`-main-file-name`:	✅	✅	✅	✅	✅	✅	✅

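Comparing the two tables, the main differences in optimization parameters between the two versions are the introduction of the CPU tuning option (`-tune-cpu`) and relaxed relocations (`--mrelax-relocations`) in Clang 12. The Clang 11 table additionally lists `-disable-llvm-verifier` and `-discard-value-names`, which do not appear in the Clang 12 table.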
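
A comparison like this can also be done mechanically. The sketch below assumes you have saved the `-###` output of each compiler to a log file (the log file names in the comments, and the function names `option_names` and `flag_diff`, are hypothetical). It extracts the cc1 option names from both logs and prints the ones that appear in only one of them. As with any quick script, the parsing is naive: it splits on spaces and ignores option values.

```shell
# Sketch only: option_names and flag_diff are illustrative names.
#
# Example of producing the two logs (file names are hypothetical):
#   echo 'int;' | clang++-12 -xc -O2 - -o /dev/null -### 2> clang12-O2.log
#   echo 'int;' | clang++    -xc -O2 - -o /dev/null -### 2> clang11-O2.log

# Extract the cc1 option names from a saved `-###` log, one per line, sorted.
option_names() {
  grep -- '"-cc1"' "$1" | tr ' ' '\n' | tr -d '"' | grep -E '^-.' | sort -u
}

# Print the option names that appear in only one of the two logs.
flag_diff() {
  _fd_a=$(mktemp)
  _fd_b=$(mktemp)
  option_names "$1" > "$_fd_a"
  option_names "$2" > "$_fd_b"
  comm -23 "$_fd_a" "$_fd_b" | sed 's/^/only in first: /'
  comm -13 "$_fd_a" "$_fd_b" | sed 's/^/only in second: /'
  rm -f "$_fd_a" "$_fd_b"
}

# Usage: flag_diff clang12-O2.log clang11-O2.log
```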
