Description
Hi OpenROAD devs,
I am working on a random testing infrastructure similar to https://en.wikipedia.org/wiki/QuickCheck. When I run the tests sequentially they work as expected, but when I run the test cases in parallel with OpenMP's parallel for loop to get some speedup, OpenROAD crashes.
Running OpenROAD inside Valgrind, I get the following stack trace:
==2885== Thread 8:
==2885== Invalid write of size 2
==2885== at 0x63DE8A3: memmove (vg_replace_strmem.c:1270)
==2885== by 0x9B6698A: _IO_file_xsgetn (fileops.c:1296)
==2885== by 0x9B5B11E: fread (iofread.c:38)
==2885== by 0x2AC9F74: LefDefParser::lefReloadBuffer() (lef_keywords.cpp:206)
==2885== by 0x2ACA096: LefDefParser::lefGetc() (lef_keywords.cpp:229)
==2885== by 0x2ACB34E: LefDefParser::lefsublex() (lef_keywords.cpp:694)
==2885== by 0x2ACB090: LefDefParser::lefyylex() (lef_keywords.cpp:634)
==2885== by 0x2ADE1D6: LefDefParser::lefyyparse() (lef_parser.cpp:3825)
==2885== by 0x2AD4681: LefDefParser::lefrRead(_IO_FILE*, char const*, void*) (lefrReader.cpp:297)
==2885== by 0x28C764C: odb::lefin_parse(odb::lefin*, utl::Logger*, char const*) (reader.cpp:560)
==2885== by 0x28C231C: odb::lefin::readLef(char const*) (lefin.cpp:2094)
==2885== by 0x28C32E4: odb::lefin::createTechAndLib(char const*, char const*) (lefin.cpp:2268)
==2885== Address 0x1cfef764 is 22,548 bytes inside a block of size 61,632 free'd
==2885== at 0x63D908B: operator delete(void*, unsigned long) (vg_replace_malloc.c:593)
==2885== by 0x2AD390D: LefDefParser::lefrData::reset() (lefrData.cpp:319)
==2885== by 0x2AD4596: LefDefParser::lefrRead(_IO_FILE*, char const*, void*) (lefrReader.cpp:281)
==2885== by 0x28C764C: odb::lefin_parse(odb::lefin*, utl::Logger*, char const*) (reader.cpp:560)
==2885== by 0x28C231C: odb::lefin::readLef(char const*) (lefin.cpp:2094)
==2885== by 0x28C32E4: odb::lefin::createTechAndLib(char const*, char const*) (lefin.cpp:2268)
==2885== by 0x32B2B2: RandomDesign::loadLibs() (RandomDesign.cpp:187)
==2885== by 0x32A4D5: RandomDesign::generate() (RandomDesign.cpp:70)
==2885== by 0x2BBBB9: test_suite::test_default::test_method() [clone ._omp_fn.0] (TestScanReplace.cpp:93)
==2885== by 0x830EDE5: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==2885== by 0x833CEA6: start_thread (pthread_create.c:477)
==2885== by 0x9BE5A2E: clone (clone.S:95)
==2885== Block was alloc'd at
==2885== at 0x63D7DEF: operator new(unsigned long) (vg_replace_malloc.c:342)
==2885== by 0x2AD3917: LefDefParser::lefrData::reset() (lefrData.cpp:322)
==2885== by 0x2AD4596: LefDefParser::lefrRead(_IO_FILE*, char const*, void*) (lefrReader.cpp:281)
==2885== by 0x28C764C: odb::lefin_parse(odb::lefin*, utl::Logger*, char const*) (reader.cpp:560)
==2885== by 0x28C231C: odb::lefin::readLef(char const*) (lefin.cpp:2094)
==2885== by 0x28C32E4: odb::lefin::createTechAndLib(char const*, char const*) (lefin.cpp:2268)
==2885== by 0x32B2B2: RandomDesign::loadLibs() (RandomDesign.cpp:187)
==2885== by 0x32A4D5: RandomDesign::generate() (RandomDesign.cpp:70)
==2885== by 0x2BBBB9: test_suite::test_default::test_method() [clone ._omp_fn.0] (TestScanReplace.cpp:93)
==2885== by 0x830EDE5: ??? (in /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0)
==2885== by 0x833CEA6: start_thread (pthread_create.c:477)
==2885== by 0x9BE5A2E: clone (clone.S:95)
==2885==
As you can see, the invalid write happens in lef_keywords.cpp:206, because the buffer was already freed in lefrData.cpp:319. So it looks like my threads are sharing lefData, which is defined as a global in src/lef/lef/lefrData.cpp:44.
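To make the suspected failure mode concrete, here is a minimal sketch that replays the interleaving on a single thread, so it stays deterministic. All names in it (ParserState, g_state, beginParse, stateStillMine) are hypothetical stand-ins and not the real LefDefParser code; the only thing it assumes from the trace is that starting a parse deletes and reallocates one shared state object.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the parser's global state; not the real lefrData.
struct ParserState {
  std::vector<char> buffer;    // read buffer, like the block freed in reset()
  std::size_t generation = 0;  // identifies which parse built this state
};

// One process-wide instance, mirroring the global the report points at.
static std::unique_ptr<ParserState> g_state;
static std::size_t g_resets = 0;

// Mimics the start of a parse: drop the old state and build a new one.
std::size_t beginParse() {
  auto fresh = std::make_unique<ParserState>();
  fresh->buffer.resize(64);
  fresh->generation = ++g_resets;
  g_state = std::move(fresh);  // the previous state is deleted here
  return g_state->generation;
}

// True if the state a parse started with is still the live one.
bool stateStillMine(std::size_t my_generation) {
  return g_state && g_state->generation == my_generation;
}

// Replays "reader A starts, then reader B starts" without real threads.
bool replayInterleaving() {
  const std::size_t a = beginParse();  // reader A starts parsing
  assert(stateStillMine(a));
  const std::size_t b = beginParse();  // reader B starts and frees A's state
  return !stateStillMine(a) && stateStillMine(b);
}
```

In the real crash, reader A would keep filling its old buffer at this point, which matches the "inside a block ... free'd" message from Valgrind.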
Suggested Solution
It would be nice to stop lefData from being a global variable, maybe by moving it inside the class lefrData. From a quick grep, there are approx. 326 instances of lefData in the code, so this looks like a major refactor.
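A rough sketch of the direction I have in mind, where each reader owns its parser state instead of reaching for a global. Reader, ParserState, begin and next are hypothetical names made up for this illustration; the real lefrData layout and API are different and would need their own design.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parser state; in this sketch it is a member, not a global.
struct ParserState {
  std::vector<char> buffer;
  std::size_t pos = 0;
};

// Hypothetical reader that owns its state, so two readers never share a buffer.
class Reader {
 public:
  // Rebuilds only this reader's state and loads the text into its buffer.
  void begin(const std::string& text) {
    state_ = ParserState{};
    state_.buffer.assign(text.begin(), text.end());
  }

  // Returns the next character, or -1 at end of input.
  int next() {
    if (state_.pos >= state_.buffer.size()) return -1;
    return static_cast<unsigned char>(state_.buffer[state_.pos++]);
  }

 private:
  ParserState state_;
};

// Interleaves two readers the way two threads might; neither disturbs the other.
void demoIndependentReaders() {
  Reader a;
  Reader b;
  a.begin("AB");
  assert(a.next() == 'A');
  b.begin("xy");            // with a shared global this would free a's buffer
  assert(a.next() == 'B');  // a still reads from its own buffer
  assert(b.next() == 'x');
}
```

A smaller change might be to declare the existing global as thread_local so each thread gets its own copy, but I have not checked whether the rest of the parser's globals would tolerate that.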
Additional Context
This would make random testing more practical in OpenROAD and improve parallel execution for other tools/projects.
Also, I am using something similar to this code to load the LEF, running it in parallel:
// Loop bounds elided; this runs as the body of the OpenMP parallel for loop.
for(;;) {
  odb::lefin lef_reader(openRoad_->getDb(), openRoad_->getLogger(), false);
  lib = lef_reader.createTechAndLib("lib", "some.lef");
}
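Until the global is gone, one possible stopgap is to keep the loop parallel but let only one thread at a time into the parser. The sketch below shows that pattern with an OpenMP named critical section. parseWithGlobalState is a hypothetical stand-in for the LEF read, and the sketch assumes the parser's global is the only shared state involved, which I have not verified for OpenROAD (the shared database may need its own protection).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a parser that keeps its state in one global.
// It is only safe when a single thread is inside it at a time.
static int g_lines_seen = 0;

int parseWithGlobalState(const std::string& text) {
  g_lines_seen = 0;  // reset shared state, as the start of a parse would
  for (char c : text) {
    if (c == '\n') ++g_lines_seen;
  }
  return g_lines_seen;
}

// Runs one parse per input inside an OpenMP parallel loop, but admits only
// one thread at a time into the parser through a named critical section.
std::vector<int> parseAllSerialized(const std::vector<std::string>& inputs) {
  std::vector<int> results(inputs.size(), 0);
  const int n = static_cast<int>(inputs.size());
#pragma omp parallel for
  for (int i = 0; i < n; ++i) {
    int lines = 0;
#pragma omp critical(lef_parse)
    {
      lines = parseWithGlobalState(inputs[i]);
    }
    results[i] = lines;  // each iteration writes only its own slot
  }
  return results;
}

// Small usage check: counts newlines in three inputs.
void demoSerializedParse() {
  const std::vector<int> out = parseAllSerialized({"a\n", "a\nb\n", ""});
  assert(out.size() == 3);
  assert(out[0] == 1 && out[1] == 2 && out[2] == 0);
}
```

This gives up any speedup for the parse itself, so it only helps if the rest of each test case dominates the run time. Without -fopenmp the pragmas are ignored and the loop simply runs sequentially.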