Greetings Developers,
I've been working on benchmarking the GPU capabilities of the traccc framework and have encountered a reproducible bug where the main example executables fail to process the official ODD dataset. After a thorough investigation, I believe I have isolated the root cause.
Environment Details
- OS: Windows 11 with WSL2 (Ubuntu 22.04)
- Compiler: g++ 13 (installed via Conda from conda-forge)
- CUDA: 12.6 (installed via Conda from conda-forge)
- traccc Version: Cloned from the main branch on January 8, 2026
The Problem
The high-level executables (traccc_seq_example, traccc_seq_example_cuda, and traccc_throughput_mt) consistently crash with a file format error when run on the ODD dataset downloaded via the official data/traccc_data_get_files.sh script.
Error Message: std::invalid_argument (what(): Unsupported data format) or std::__ios_failure (iostream error)
Steps to Reproduce
1. Build the project from the main branch
A minimal build command that reproduces the issue:
cmake -B build -S . -DTRACCC_BUILD_CUDA=ON -DTRACCC_BUILD_EXAMPLES=ON -DTRACCC_USE_ROOT=OFF
cmake --build build -j4
2. Download the ODD data
./data/traccc_data_get_files.sh
3. Run the sequential CPU example on the downloaded data
./build/bin/traccc_seq_example --input-directory="odd/geant4_10muon_10GeV/" --input-events=1
Expected vs. Actual Behavior
Expected: The program should process one event and exit gracefully.
Actual: The program prints a "duplicate cells" warning and then aborts with the "Unsupported data format" exception.
Root Cause Analysis
I investigated the source code and the data files and found the following:
- The stack trace points to a failure within the io::read_cells function
- Source code analysis of examples/run/common/throughput_mt.ipp and examples/run/cpu/seq_example.cpp confirms these executables are hardcoded to call io::read_cells
- This function is hardcoded to open files ending in -cells.csv
- The CSV header in event files is:
geometry_id,measurement_id,channel0,channel1,timestamp,value
- Analysis of the C++ parser in io/src/csv/read_cells.cpp shows it is not designed to handle the measurement_id column in that position, causing the parser to fail
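One way a CSV reader can be made insensitive to where measurement_id sits in the header is to resolve columns by name rather than by position. The following is only a minimal sketch of that idea and is not taken from traccc: split_csv_line, index_header, parse_cell and cell_record are hypothetical names, and the code assumes plain comma-separated numeric fields without quoting.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Split one CSV line on commas. No quoting support (assumption: the cell
// files contain only plain numeric fields).
std::vector<std::string> split_csv_line(const std::string& line) {
    std::vector<std::string> fields;
    std::stringstream stream(line);
    std::string field;
    while (std::getline(stream, field, ',')) {
        fields.push_back(field);
    }
    return fields;
}

// Map every column name found in the header line to its position.
std::unordered_map<std::string, std::size_t> index_header(
    const std::string& header_line) {
    std::unordered_map<std::string, std::size_t> index;
    const std::vector<std::string> names = split_csv_line(header_line);
    for (std::size_t i = 0; i < names.size(); ++i) {
        index[names[i]] = i;
    }
    return index;
}

// Hypothetical record type; the field names mirror the CSV header quoted
// above. measurement_id is deliberately not stored in this sketch.
struct cell_record {
    unsigned long long geometry_id;
    unsigned int channel0;
    unsigned int channel1;
    double timestamp;
    double value;
};

// Parse one data row, looking each field up by column name so that extra or
// reordered columns do not shift the values.
cell_record parse_cell(
    const std::unordered_map<std::string, std::size_t>& index,
    const std::string& row_line) {
    const std::vector<std::string> fields = split_csv_line(row_line);
    auto field = [&](const std::string& name) -> const std::string& {
        const auto it = index.find(name);
        if (it == index.end() || it->second >= fields.size()) {
            throw std::invalid_argument("missing column: " + name);
        }
        return fields[it->second];
    };
    cell_record cell{};
    cell.geometry_id = std::stoull(field("geometry_id"));
    cell.channel0 = static_cast<unsigned int>(std::stoul(field("channel0")));
    cell.channel1 = static_cast<unsigned int>(std::stoul(field("channel1")));
    cell.timestamp = std::stod(field("timestamp"));
    cell.value = std::stod(field("value"));
    return cell;
}
```

With this approach, calling index_header on the first line of a -cells.csv file and then parse_cell on each following line gives the same record whether measurement_id is the second column, the last column, or absent. A missing required column is reported by name instead of surfacing as a generic format error.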
Workaround / Additional Information
I found a workaround that lets me continue my work, but I am not sure whether it is the intended approach:
- I ran the crashing traccc_seq_example with an --output-directory flag, which successfully generated all intermediate data files (including spacepoints.csv) before crashing
- I then ran traccc_seeding_example_cuda, which is hardcoded to call io::read_spacepoints
- This second executable ran to completion successfully, performing the full seeding, track finding (CKF), and fitting on the GPU
Conclusion: The data itself is valid, but the specific C++ parser for cells.csv in the higher-level executables is out of sync with the data format provided by the official download script.
Summary
The issue appears to be a parser incompatibility rather than invalid data. The io::read_cells function seems to need updating to handle the position of the measurement_id column in the official ODD dataset format.
I hope this report is helpful for understanding the problem. Thank you for your work on this great project.