I experienced test failures when building OpenColorIO 2.0.0 with -march=znver2 on GCC; by process of elimination I found the culprit to be -mfma specifically. Building with Clang produces the same test failures once -ffp-contract=fast is also explicitly enabled. I also tested the master branch as of commit 4e27f96, and the failures are identical.
https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Optimize-Options.html
https://releases.llvm.org/11.0.1/tools/clang/docs/ClangCommandLineReference.html
clang version 11.1.0
gcc version 10.2.0
Distribution is Gentoo Linux
cmake options:
-DCMAKE_INSTALL_PREFIX=/usr -DBUILD_SHARED_LIBS=ON -DLIB_SUFFIX= -DOCIO_BUILD_APPS=no \
-DOCIO_BUILD_DOCS=no -DOCIO_BUILD_GPU_TESTS=OFF -DOCIO_BUILD_PYTHON=yes -DOCIO_BUILD_TESTS=yes \
-DOCIO_BUILD_JAVA=OFF -DOCIO_INSTALL_EXT_PACKAGES=NONE -DOCIO_USE_SSE=yes
Logs
Build log GCC
Build log Clang
Test logs GCC
Test logs Clang
How to reproduce
Compile OpenColorIO with the following flags and run the tests.
Compiler flags for GCC: -mfma
Compiler flags for Clang: -mfma -ffp-contract=fast
Why the difference between Clang and GCC compiler flags?
GCC defaults to -ffp-contract=fast while Clang defaults to -ffp-contract=on.
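To illustrate why contraction can change test results at all, here is a minimal sketch (not taken from OpenColorIO) comparing a multiply followed by an add against a fused multiply-add. The function names are hypothetical, and the `volatile` store is only there to force the product to be rounded to `float` regardless of the `-ffp-contract` setting in effect.

```cpp
#include <cmath>

// Multiply, round the product to float, then add (two roundings).
// The volatile store prevents the compiler from contracting the two
// operations into one FMA instruction, whatever flags are used.
float mul_add_unfused(float a, float b, float c)
{
    volatile float product = a * b;
    return product + c;
}

// Fused multiply-add: a * b + c computed with a single rounding.
// This is the kind of operation a compiler may emit for "a * b + c"
// when FMA instructions are available and contraction is allowed.
float mul_add_fused(float a, float b, float c)
{
    return std::fma(a, b, c);
}
```

With a = b = 1 + 2^-12 and c = -(1 + 2^-11), the exact product is 1 + 2^-11 + 2^-24. The unfused version rounds that product to 1 + 2^-11 and returns 0, while the fused version keeps the low bit and returns 2^-24. Differences of this size are enough to break exact comparisons and very tight thresholds.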
Why would you have -mfma enabled?
-mfma is enabled by basically every -march option since haswell, and it is also part of the upcoming x86-64-v3 feature level.
https://www.phoronix.com/scan.php?page=news_item&px=GCC-11-x86-64-Feature-Levels
https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html
Impact
Very minor, since only a minority of users compile OpenColorIO from source with these optimizations. Distributions that provide binaries might hit this failure if they build with GCC and target the future micro-architecture feature levels from v3 and up.
Tests which got the failures
GCC failed 11 tests, while Clang failed 12.
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/CPUProcessor_tests.cpp:824:
FAILED: cacheID == expectedID
values were 'CPU Processor: from 16ui to 32f oFlags 263995331 ops: <Lut1D $17d2b407e859021ee87958e5d4e91c8f forward default standard domain none>' and 'CPU Processor: from 16ui to 32f oFlags 263995331 ops: <Lut1D $a57d7444e629d796d2234c18a0539c74 forward default standard domain none>'
[126/991] [CPUProcessor / with_several_ops ] - FAILED
// Truncated the rest; the log includes several more of these, each with values off by one.
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/CPUProcessor_tests.cpp:2167:
FAILED: outValues[idx+3] == OCIO::Converter<outBD>::CastValue(pxl[3])
values were '65214' and '65215'
[135/991] [CPUProcessor / optimizations ] - FAILED
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/fileformats/FileFormatCTF_tests.cpp:8081:
FAILED: expectedCLF == output1.str()
values were '<?xml version="1.0" encoding="UTF-8"?>
<ProcessList compCLFversion="3" id="UID42">
<Range inBitDepth="32f" outBitDepth="32f">
<minInValue> -0.125 </minInValue>
<maxInValue> 1.125 </maxInValue>
<minOutValue> 0 </minOutValue>
<maxOutValue> 1 </maxOutValue>
</Range>
<LUT1D inBitDepth="32f" outBitDepth="32f">
<Array dim="10 1">
0
0.11111112
0.22222224
0.33333334
0.44444448
0.55555558
0.66666675
0.77777779
0.88888896
1
</Array>
</LUT1D>
<LUT3D inBitDepth="32f" outBitDepth="32f">
<Array dim="2 2 2 3">
0 0 0
0.0361 0.0361 0.53609997
0.3576 0.85759997 0.3576
0.3937 0.8937 0.8937
0.6063 0.1063 0.1063
0.64240003 0.1424 0.64239997
0.96389997 0.96389997 0.4639
1 1 1
</Array>
</LUT3D>
</ProcessList>
' and '<?xml version="1.0" encoding="UTF-8"?>
<ProcessList compCLFversion="3" id="UID42">
<Range inBitDepth="32f" outBitDepth="32f">
<minInValue> -0.125 </minInValue>
<minOutValue> 0 </minOutValue>
<maxOutValue> 1 </maxOutValue>
</Range>
<LUT1D inBitDepth="32f" outBitDepth="32f">
<Array dim="10 1">
0
0.11111111
0.22222222
0.33333334
0.44444445
0.55555558
0.66666669
0.77777779
0.8888889
1
</Array>
</LUT1D>
<LUT3D inBitDepth="32f" outBitDepth="32f">
<Array dim="2 2 2 3">
0 0 0
0.0361 0.0361 0.53609997
0.3576 0.85759997 0.3576
0.3937 0.8937 0.8937
0.6063 0.1063 0.1063
0.64240003 0.1424 0.64239997
0.96389997 0.96389997 0.4639
1 1 1
</Array>
</LUT3D>
</ProcessList>
'
[393/991] [FileFormatCTF / bake_1d_3d ] - FAILED
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/exposurecontrast/ExposureContrastOpCPU_tests.cpp:234:
FAILED: rgba[0] == logECVal(rgbaImage[0], const_ec, inMax, outMax)
values were '0.13045' and '0.13045'
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/exposurecontrast/ExposureContrastOpCPU_tests.cpp:235:
FAILED: rgba[1] == logECVal(rgbaImage[1], const_ec, inMax, outMax)
values were '0.50108' and '0.50108'
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/exposurecontrast/ExposureContrastOpCPU_tests.cpp:239:
FAILED: rgba[5] == logECVal(rgbaImage[5], const_ec, inMax, outMax)
values were '0.10108' and '0.10108'
[576/991] [ExposureContrastRenderer / log ] - FAILED
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:122:
FAILED: Index: 17 - Values: 0.896949828 and: 0.896951199 - Threshold: 1.00000001e-07
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:122:
FAILED: Index: 21 - Values: 1.10895336 and: 1.10895324 - Threshold: 1.00000001e-07
[619/991] [GammaOpCPU / apply_basic_style_fwd ] - FAILED
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:185:
FAILED: Index: 16 - Values: 0.830311298 and: 0.830311418 - Threshold: 1.00000001e-07
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:185:
FAILED: Index: 17 - Values: 0.976092517 and: 0.976092875 - Threshold: 1.00000001e-07
[620/991] [GammaOpCPU / apply_basic_style_rev ] - FAILED
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:276:
FAILED: Index: 17 - Values: 0.896949828 and: 0.896951199 - Threshold: 1.00000001e-07
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:276:
FAILED: Index: 21 - Values: -0.896949828 and: -0.896951199 - Threshold: 1.00000001e-07
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:276:
FAILED: Index: 25 - Values: 1.10895336 and: 1.10895324 - Threshold: 1.00000001e-07
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:276:
FAILED: Index: 29 - Values: -1.10895336 and: -1.10895324 - Threshold: 1.00000001e-07
[621/991] [GammaOpCPU / apply_basic_mirror_style_fwd ] - FAILED
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:366:
FAILED: Index: 16 - Values: 0.830311298 and: 0.830311418 - Threshold: 1.00000001e-07
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:366:
FAILED: Index: 17 - Values: 0.976092517 and: 0.976092875 - Threshold: 1.00000001e-07
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:366:
FAILED: Index: 20 - Values: -0.830311298 and: -0.830311418 - Threshold: 1.00000001e-07
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:366:
FAILED: Index: 21 - Values: -0.976092517 and: -0.976092875 - Threshold: 1.00000001e-07
[622/991] [GammaOpCPU / apply_basic_mirror_style_rev ] - FAILED
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:444:
FAILED: Index: 17 - Values: 0.896949828 and: 0.896951199 - Threshold: 1.00000001e-07
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:444:
FAILED: Index: 25 - Values: 1.10895336 and: 1.10895324 - Threshold: 1.00000001e-07
[623/991] [GammaOpCPU / apply_basic_pass_thru_style_fwd ] - FAILED
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:522:
FAILED: Index: 16 - Values: 0.830311298 and: 0.830311418 - Threshold: 1.00000001e-07
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:522:
FAILED: Index: 17 - Values: 0.976092517 and: 0.976092875 - Threshold: 1.00000001e-07
[624/991] [GammaOpCPU / apply_basic_pass_thru_style_rev ] - FAILED
// Truncated the rest, log includes several more of these with values differing by ~0.00001-0.0000001.
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:576:
FAILED: Index: 22 - Values: 1.49998474 and: 1.49998403 - Threshold: 1.00000001e-07
[625/991] [GammaOpCPU / apply_moncurve_style_fwd ] - FAILED
// Truncated the rest, log includes several more of these with values differing by ~0.00001-0.0000001.
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/gamma/GammaOpCPU_tests.cpp:690:
FAILED: Index: 30 - Values: -1.84183896 and: -1.84183872 - Threshold: 1.00000001e-07
[627/991] [GammaOpCPU / apply_moncurve_mirror_style_fwd ] - FAILED
Errors were identical between GCC and Clang, except for the following, which occurred only with Clang.
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/exposurecontrast/ExposureContrastOpCPU_tests.cpp:234:
FAILED: rgba[0] == logECVal(rgbaImage[0], const_ec, inMax, outMax)
values were '0.13045' and '0.13045'
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/exposurecontrast/ExposureContrastOpCPU_tests.cpp:235:
FAILED: rgba[1] == logECVal(rgbaImage[1], const_ec, inMax, outMax)
values were '0.50108' and '0.50108'
/var/tmp/portage/media-libs/opencolorio-2.0.0-r1/work/OpenColorIO-2.0.0/tests/cpu/ops/exposurecontrast/ExposureContrastOpCPU_tests.cpp:239:
FAILED: rgba[5] == logECVal(rgbaImage[5], const_ec, inMax, outMax)
values were '0.10108' and '0.10108'
[576/991] [ExposureContrastRenderer / log ] - FAILED
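The ExposureContrast failures above print the same value on both sides of a failed `==`. A plausible explanation, sketched below, is that the two floats differ only in their last bits while the log shows just a few significant digits. This assumes the test framework prints values like a default `std::ostringstream` (6 significant digits), which I have not verified; the helper names are hypothetical.

```cpp
#include <cmath>
#include <limits>
#include <sstream>
#include <string>

// Format a float the way a default-constructed std::ostringstream does
// (6 significant digits). Whether the test framework prints this way
// is an assumption.
std::string default_format(float v)
{
    std::ostringstream os;
    os << v;
    return os.str();
}

// Next representable float above v, i.e. v plus one ulp.
float next_up(float v)
{
    return std::nextafter(v, std::numeric_limits<float>::infinity());
}
```

For example, `0.13045f` and `next_up(0.13045f)` compare unequal with `==`, yet both format as `0.13045`.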