Skip to content

Commit 8719c6c

Browse files
committed
[SPARK-55240][CORE] Refactor LazyTry stacktrace handling to use wrapper instead of suppressed exception
### What changes were proposed in this pull request? This PR refactors how `doTryWithCallerStacktrace` and `getTryWithCallerStacktrace` handle stacktrace stitching, with improved behavior for `LazyTry`: **Key changes:** 1. **New internal wrapper class `TryStackTraceWrapper`**: Instead of modifying the original exception's suppressed list in `doTryWithCallerStacktrace`, we now wrap the exception in a `TryStackTraceWrapper` that carries: - The original exception - The depth of stacktrace frames to preserve (optimization: just an int, not a copy of the array) - An `OriginalTryStackTraceException` holding the original stacktrace for later use 2. **Conditional suppressed exception in `LazyTry`**: - `LazyTry` now tracks whether the lazy value has been accessed (`materialized` flag) - On **first access**: No suppressed exception is added (clean output since the caller context is the same as initialization) - On **subsequent accesses**: The original stacktrace is added as a suppressed exception to help with debugging 3. **`getTryWithCallerStacktrace` now accepts `isFirstAccess` parameter**: - Default is `true` (no suppressed exception added) - When `false`, adds the original stacktrace as suppressed if not already present (idempotent check) ### Why are the changes needed? 1. **Cleaner first-access output**: On the first access to a `LazyTry` value, the suppressed exception showing the "original" stacktrace is redundant because it's essentially the same context. Removing it makes error output cleaner. 2. **Preserved debugging info for subsequent accesses**: When accessing a failed `LazyTry` from a different call site, the suppressed exception showing where the error originally occurred is valuable for debugging. 3. **Better encapsulation**: The wrapper is internal-only and never exposed to users. The original exception type is preserved when thrown. ### Does this PR introduce _any_ user-facing change? Yes: - On first access to a failed `LazyTry`, exceptions will no longer have a suppressed `OriginalTryStackTraceException` (cleaner output) - On subsequent accesses, the behavior is unchanged (suppressed exception shows original stacktrace) ### How was this patch tested? Existing unit tests in `UtilsSuite` and `LazyTrySuite` updated and passing. ### Was this patch authored or co-authored using generative AI tooling? Yes. cursor 2.4 Closes apache#54007 from cloud-fan/error. Authored-by: Wenchen Fan <wenchen@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
1 parent 778661a commit 8719c6c

File tree

3 files changed

+93
-79
lines changed

3 files changed

+93
-79
lines changed

core/src/main/scala/org/apache/spark/util/LazyTry.scala

Lines changed: 11 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -45,17 +45,21 @@ import scala.util.Try
4545
private[spark] class LazyTry[T](initialize: => T) extends Serializable {
4646
private lazy val tryT: Try[T] = Utils.doTryWithCallerStacktrace { initialize }
4747

48+
// Track whether the lazy val has been materialized or not.
49+
@volatile private var materialized: Boolean = false
50+
4851
/**
4952
* Get the lazy value. If the initialization block threw an exception, it will be re-thrown here.
5053
* The exception will be re-thrown with the current caller's stacktrace.
51-
* An exception with stack trace from when the exception was first thrown can be accessed with
52-
* ```
53-
* ex.getSuppressed.find { e =>
54-
* e.getMessage == org.apache.spark.util.Utils.TRY_WITH_CALLER_STACKTRACE_FULL_STACKTRACE
55-
* }
56-
* ```
54+
*
55+
* On the first access, no suppressed exception is added. On subsequent accesses, the original
56+
* stacktrace is added as a suppressed exception to help with debugging.
5757
*/
58-
def get: T = Utils.getTryWithCallerStacktrace(tryT)
58+
def get: T = {
59+
val isFirstAccess = !materialized
60+
materialized = true
61+
Utils.getTryWithCallerStacktrace(tryT, isFirstAccess)
62+
}
5963
}
6064

6165
private[spark] object LazyTry {

core/src/main/scala/org/apache/spark/util/Utils.scala

Lines changed: 55 additions & 40 deletions
Original file line numberDiff line numberDiff line change
@@ -1368,20 +1368,35 @@ private[spark] object Utils
13681368
val TRY_WITH_CALLER_STACKTRACE_FULL_STACKTRACE =
13691369
"Full stacktrace of original doTryWithCallerStacktrace caller"
13701370

1371+
/**
1372+
* Exception used to preserve the original stacktrace before stitching. On subsequent accesses
1373+
* (not the first), this is added as a suppressed exception to show where the error originally
1374+
* occurred.
1375+
*/
13711376
class OriginalTryStackTraceException()
1372-
extends Exception(TRY_WITH_CALLER_STACKTRACE_FULL_STACKTRACE) {
1373-
var doTryWithCallerStacktraceDepth: Int = 0
1374-
}
1377+
extends Exception(TRY_WITH_CALLER_STACKTRACE_FULL_STACKTRACE)
1378+
1379+
/**
1380+
* Internal wrapper used to carry the original exception along with metadata needed for
1381+
* stacktrace stitching. This wrapper is never exposed to users.
1382+
*
1383+
* @param originalException The original exception
1384+
* @param depth The number of stacktrace frames below doTryWithCallerStacktrace to preserve
1385+
* @param originalStacktraceEx Exception holding the original stacktrace (for suppressed display)
1386+
*/
1387+
class TryStackTraceWrapper(
1388+
val originalException: Throwable,
1389+
val depth: Int,
1390+
val originalStacktraceEx: OriginalTryStackTraceException) extends Exception(originalException)
13751391

13761392
/**
13771393
* Use Try with stacktrace substitution for the caller retrieving the error.
13781394
*
13791395
* Normally in case of failure, the exception would have the stacktrace of the caller that
13801396
* originally called doTryWithCallerStacktrace. However, we want to replace the part above
13811397
* this function with the stacktrace of the caller who calls getTryWithCallerStacktrace.
1382-
* So here we save the part of the stacktrace below doTryWithCallerStacktrace, and
1383-
* getTryWithCallerStacktrace will stitch it with the new stack trace of the caller.
1384-
* The full original stack trace is kept in ex.getSuppressed.
1398+
* So here we wrap the exception with metadata, and getTryWithCallerStacktrace will
1399+
* unwrap it and stitch the stacktrace with the new caller's stack trace.
13851400
*
13861401
* @param f Code block to be wrapped in Try
13871402
* @return Try with Success or Failure of the code block. Use with getTryWithCallerStacktrace.
@@ -1392,29 +1407,27 @@ private[spark] object Utils
13921407
}
13931408
t match {
13941409
case Failure(ex) =>
1410+
// If already wrapped, return as-is (nested call)
1411+
if (ex.isInstanceOf[TryStackTraceWrapper]) {
1412+
return t
1413+
}
13951414
// Note: we remove the common suffix instead of e.g. finding the call to this function, to
13961415
// account for recursive calls with multiple doTryWithCallerStacktrace on the stack trace.
13971416
val origStackTrace = ex.getStackTrace
13981417
val currentStackTrace = Thread.currentThread().getStackTrace
13991418
val commonSuffixLen = origStackTrace.reverse.zip(currentStackTrace.reverse).takeWhile {
14001419
case (exElem, currentElem) => exElem == currentElem
14011420
}.length
1402-
// Add the full stack trace of the original caller as the suppressed exception.
1403-
// It may already be there if it's a nested call to doTryWithCallerStacktrace.
1404-
val origEx = ex.getSuppressed.find { e =>
1405-
e.isInstanceOf[OriginalTryStackTraceException]
1406-
}.getOrElse {
1407-
val fullEx = new OriginalTryStackTraceException()
1408-
fullEx.setStackTrace(origStackTrace)
1409-
ex.addSuppressed(fullEx)
1410-
fullEx
1411-
}.asInstanceOf[OriginalTryStackTraceException]
1412-
// Update the depth of the stack of the current doTryWithCallerStacktrace, for stitching
1413-
// it with the stack of getTryWithCallerStacktrace.
1414-
origEx.doTryWithCallerStacktraceDepth = origStackTrace.size - commonSuffixLen
1415-
case Success(_) => // nothing
1421+
// Compute the depth of the "below" portion to preserve during stitching
1422+
val depth = origStackTrace.size - commonSuffixLen
1423+
// Create exception to hold original stacktrace (for suppressed on subsequent access)
1424+
val originalStacktraceEx = new OriginalTryStackTraceException()
1425+
originalStacktraceEx.setStackTrace(origStackTrace)
1426+
// Wrap everything
1427+
val wrapper = new TryStackTraceWrapper(ex, depth, originalStacktraceEx)
1428+
Failure(wrapper)
1429+
case Success(_) => t
14161430
}
1417-
t
14181431
}
14191432

14201433
/**
@@ -1424,32 +1437,34 @@ private[spark] object Utils
14241437
* below the original doTryWithCallerStacktrace which triggered it, with the caller stack trace
14251438
* of the current caller of getTryWithCallerStacktrace.
14261439
*
1427-
* Full stack trace of the original doTryWithCallerStacktrace caller can be retrieved with
1428-
* ```
1429-
* ex.getSuppressed.find { e =>
1430-
* e.isInstanceOf[Utils.OriginalTryStackTraceException]
1431-
* }
1432-
* ```
1433-
*
1440+
* On subsequent accesses (not the first), the original stacktrace is added as a suppressed
1441+
* exception to help with debugging.
14341442
*
14351443
* @param t Try from doTryWithCallerStacktrace
1444+
* @param isFirstAccess Whether this is the first access to the Try value
14361445
* @return Result of the Try or rethrows the failure exception with modified stacktrace.
14371446
*/
1438-
def getTryWithCallerStacktrace[T](t: Try[T]): T = t match {
1439-
case Failure(ex) =>
1440-
val originalStacktraceEx = ex.getSuppressed.find { e =>
1441-
// added in doTryWithCallerStacktrace
1442-
e.isInstanceOf[OriginalTryStackTraceException]
1443-
}.getOrElse {
1444-
// If we don't have the expected stacktrace information, just rethrow
1445-
throw ex
1446-
}.asInstanceOf[OriginalTryStackTraceException]
1447-
val belowStacktrace = originalStacktraceEx.getStackTrace
1448-
.take(originalStacktraceEx.doTryWithCallerStacktraceDepth)
1447+
def getTryWithCallerStacktrace[T](t: Try[T], isFirstAccess: Boolean = true): T = t match {
1448+
case Failure(wrapper: TryStackTraceWrapper) =>
1449+
val originalEx = wrapper.originalException
1450+
// Stitch the stacktrace: keep the first `depth` frames, replace the rest with current caller
1451+
val belowStacktrace = originalEx.getStackTrace.take(wrapper.depth)
14491452
// We are modifying and throwing the original exception. It would be better if we could
14501453
// return a copy, but we can't easily clone it and preserve. If this is accessed from
14511454
// multiple threads that then look at the stack trace, this could break.
1452-
ex.setStackTrace(belowStacktrace ++ Thread.currentThread().getStackTrace.drop(1))
1455+
originalEx.setStackTrace(
1456+
belowStacktrace ++ Thread.currentThread().getStackTrace.drop(1))
1457+
// On subsequent accesses, add suppressed exception showing original stacktrace
1458+
if (!isFirstAccess) {
1459+
val alreadyAdded = originalEx.getSuppressed.exists(
1460+
_.isInstanceOf[OriginalTryStackTraceException])
1461+
if (!alreadyAdded) {
1462+
originalEx.addSuppressed(wrapper.originalStacktraceEx)
1463+
}
1464+
}
1465+
throw originalEx
1466+
case Failure(ex) =>
1467+
// Not wrapped (shouldn't happen if used correctly), just rethrow
14531468
throw ex
14541469
case Success(s) => s
14551470
}

core/src/test/scala/org/apache/spark/util/UtilsSuite.scala

Lines changed: 27 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -1640,7 +1640,7 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties {
16401640
}
16411641

16421642
private def callGetTryAgain(t: Try[String]): String = {
1643-
Utils.getTryWithCallerStacktrace(t)
1643+
Utils.getTryWithCallerStacktrace(t, isFirstAccess = false)
16441644
}
16451645

16461646
test("doTryWithCallerStacktrace and getTryWithCallerStacktrace") {
@@ -1669,26 +1669,8 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties {
16691669
assert(!st1.exists(_.getMethodName == "callDoTry"))
16701670
assert(st1.exists(_.getMethodName == "callGetTry"))
16711671

1672-
// The original stack trace with callDoTry should be in the suppressed exceptions.
1673-
// Example:
1674-
// scalastyle:off line.size.limit
1675-
// Suppressed: java.lang.Exception: Full stacktrace of original doTryWithCallerStacktrace caller
1676-
// at org.apache.spark.util.UtilsSuite.throwException(UtilsSuite.scala:1640)
1677-
// at org.apache.spark.util.UtilsSuite.$anonfun$callDoTry$1(UtilsSuite.scala:1645)
1678-
// at scala.util.Try$.apply(Try.scala:213)
1679-
// at org.apache.spark.util.Utils$.doTryWithCallerStacktrace(Utils.scala:1586)
1680-
// at org.apache.spark.util.UtilsSuite.callDoTry(UtilsSuite.scala:1645)
1681-
// at org.apache.spark.util.UtilsSuite.$anonfun$new$165(UtilsSuite.scala:1658)
1682-
// ... 56 more
1683-
// scalastyle:on line.size.limit
1684-
val origSt = e1.getSuppressed.find(_.isInstanceOf[Utils.OriginalTryStackTraceException])
1685-
assert(origSt.isDefined)
1686-
assert(origSt.get.getStackTrace.exists(_.getMethodName == "throwException"))
1687-
assert(origSt.get.getStackTrace.exists(_.getMethodName == "callDoTry"))
1688-
1689-
// Should save the depth of the stack trace under doTryWithCallerStacktrace.
1690-
assert(origSt.get.asInstanceOf[Utils.OriginalTryStackTraceException]
1691-
.doTryWithCallerStacktraceDepth == 4)
1672+
// First access (isFirstAccess = true by default) - no suppressed exception
1673+
assert(!e1.getSuppressed.exists(_.isInstanceOf[Utils.OriginalTryStackTraceException]))
16921674

16931675
val e2 = intercept[Exception] {
16941676
callGetTryAgain(t)
@@ -1715,6 +1697,24 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties {
17151697
// callGetTry that we called before shouldn't be on the stack trace.
17161698
assert(!st2.exists(_.getMethodName == "callGetTry"))
17171699

1700+
// Second access (isFirstAccess = false) - suppressed exception should be added.
1701+
// The original stack trace with callDoTry should be in the suppressed exceptions.
1702+
// Example:
1703+
// scalastyle:off line.size.limit
1704+
// Suppressed: org.apache.spark.util.Utils$OriginalTryStackTraceException: Full stacktrace of original doTryWithCallerStacktrace caller
1705+
// at org.apache.spark.util.UtilsSuite.throwException(UtilsSuite.scala:1640)
1706+
// at org.apache.spark.util.UtilsSuite.$anonfun$callDoTry$1(UtilsSuite.scala:1645)
1707+
// at scala.util.Try$.apply(Try.scala:213)
1708+
// at org.apache.spark.util.Utils$.doTryWithCallerStacktrace(Utils.scala:1586)
1709+
// at org.apache.spark.util.UtilsSuite.callDoTry(UtilsSuite.scala:1645)
1710+
// at org.apache.spark.util.UtilsSuite.$anonfun$new$165(UtilsSuite.scala:1658)
1711+
// ... 56 more
1712+
// scalastyle:on line.size.limit
1713+
val origSt = e2.getSuppressed.find(_.isInstanceOf[Utils.OriginalTryStackTraceException])
1714+
assert(origSt.isDefined, "Suppressed exception should be added on subsequent access")
1715+
assert(origSt.get.getStackTrace.exists(_.getMethodName == "throwException"))
1716+
assert(origSt.get.getStackTrace.exists(_.getMethodName == "callDoTry"))
1717+
17181718
// Unfortunately, this utility is not able to clone the exception, but modifies it in place,
17191719
// so now e1 is also pointing to "callGetTryAgain" instead of "callGetTry".
17201720
val st1Again = e1.getStackTrace
@@ -1760,20 +1760,13 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties {
17601760
// at org.apache.spark.util.UtilsSuite.callDoTryNestedNested(UtilsSuite.scala:1654)
17611761
// at org.apache.spark.util.UtilsSuite.$anonfun$new$172(UtilsSuite.scala:1674)
17621762
// ...
1763-
// Suppressed: org.apache.spark.util.Utils$OriginalTryStackTraceException: Full stacktrace of original doTryWithCallerStacktrace caller
1764-
// at org.apache.spark.util.UtilsSuite.throwException(UtilsSuite.scala:1529)
1765-
// at org.apache.spark.util.UtilsSuite.$anonfun$callDoTry$1(UtilsSuite.scala:1534)
1766-
// at scala.util.Try$.apply(Try.scala:217)
1767-
// at org.apache.spark.util.Utils$.doTryWithCallerStacktrace(Utils.scala:1377)
1768-
// at org.apache.spark.util.UtilsSuite.callDoTry(UtilsSuite.scala:1534)
1769-
// at org.apache.spark.util.UtilsSuite.$anonfun$callDoTryNested$1(UtilsSuite.scala:1631)
1770-
// ...
17711763
// scalastyle:on line.size.limit
17721764

17731765
assert(e.getStackTrace.exists(_.getMethodName == "callGetTryFromNested"))
17741766
assert(!e.getStackTrace.exists(_.getMethodName == "callGetTryFromNestedNested"))
17751767
assert(!e.getStackTrace.exists(_.getMethodName == "callGetTry"))
1776-
assert(e.getSuppressed.length == 1)
1768+
// No suppressed exception - the wrapper is used internally only
1769+
assert(e.getSuppressed.length == 0)
17771770

17781771
Utils.getTryWithCallerStacktrace(t)
17791772
}
@@ -1821,7 +1814,8 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties {
18211814
assert(e.getStackTrace.exists(_.getMethodName == "callGetTryFromNestedNested"))
18221815
assert(!e.getStackTrace.exists(_.getMethodName == "callGetTryFromNested"))
18231816
assert(!e.getStackTrace.exists(_.getMethodName == "callGetTry"))
1824-
assert(e.getSuppressed.length == 1)
1817+
// No suppressed exception - the wrapper is used internally only
1818+
assert(e.getSuppressed.length == 0)
18251819

18261820
Utils.getTryWithCallerStacktrace(t)
18271821
}
@@ -1865,7 +1859,8 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties {
18651859
assert(e.getStackTrace.exists(_.getMethodName == "callGetTry"))
18661860
assert(!e.getStackTrace.exists(_.getMethodName == "callGetTryFromNested"))
18671861
assert(!e.getStackTrace.exists(_.getMethodName == "callGetTryFromNestedNested"))
1868-
assert(e.getSuppressed.length == 1)
1862+
// No suppressed exception - the wrapper is used internally only
1863+
assert(e.getSuppressed.length == 0)
18691864
}
18701865
}
18711866

0 commit comments

Comments
 (0)