What is the problem the feature request solves?
Note: This issue was generated with AI assistance. The specification details have been extracted from Spark documentation and may need verification.
Comet does not currently support the Spark timestamp_add_ym_interval function, causing queries using this function to fall back to Spark's JVM execution instead of running natively on DataFusion.
The TimestampAddYMInterval expression adds a year-month interval to a timestamp value. This operation is timezone-aware and handles both TimestampType and TimestampNTZType inputs while preserving the original timestamp data type.
Supporting this expression would allow more Spark workloads to benefit from Comet's native acceleration.
Describe the potential solution
Spark Specification
Syntax:
timestamp_column + INTERVAL 'value' YEAR TO MONTH
// DataFrame API usage
col("timestamp_column") + expr("INTERVAL '2-3' YEAR TO MONTH")
Arguments:
| Argument | Type | Description |
| --- | --- | --- |
| timestamp | Expression | The timestamp expression to add the interval to |
| interval | Expression | The year-month interval expression to add |
| timeZoneId | Option[String] | Optional timezone identifier for timezone-aware operations |
Return Type: Returns the same data type as the input timestamp expression (TimestampType or TimestampNTZType).
Supported Data Types:
- Input timestamp: AnyTimestampType (TimestampType or TimestampNTZType)
- Input interval: YearMonthIntervalType
Edge Cases:
- Null handling: Returns null if either timestamp or interval input is null (null intolerant)
- Timezone handling: Uses session timezone for TimestampType and UTC for TimestampNTZType
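- Month overflow: Month arithmetic carries across year boundaries (for example, November 15th + 3 months lands on February 15th of the following year)
- Day adjustment: When the target month has fewer days than the source day-of-month, the day is clamped to the last day of the target month (for example, January 31st + 1 month yields the last day of February)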
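The edge cases above can be sketched in plain Rust for the TimestampNTZType (UTC) path. This is a minimal illustration, not Comet or DataFusion code: the function names (`timestamp_ntz_add_months`, `days_from_civil`, `civil_from_days`, `days_in_month`) are hypothetical, timestamps are assumed to be microseconds since the Unix epoch, and the interval is assumed to be a signed month count. A TimestampType input would additionally need conversion to and from the session time zone, and overflow checks are omitted.

```rust
const MICROS_PER_DAY: i64 = 86_400_000_000;

/// Number of days in a month of the proleptic Gregorian calendar.
fn days_in_month(year: i64, month: u32) -> u32 {
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        _ => {
            let leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
            if leap { 29 } else { 28 }
        }
    }
}

/// Days since 1970-01-01 for a civil date (standard era-based algorithm).
fn days_from_civil(year: i64, month: u32, day: u32) -> i64 {
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let mp = (month as i64 + 9) % 12; // month index with March = 0
    let doy = (153 * mp + 2) / 5 + day as i64 - 1; // day of (shifted) year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Inverse of `days_from_civil`: civil (year, month, day) for a day count.
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468;
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097);
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    let mp = (5 * doy + 2) / 153;
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let month = if mp < 10 { mp + 3 } else { mp - 9 } as u32;
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Adds a signed number of months to a UTC timestamp in microseconds.
/// Returns None if either input is None (null intolerant).
fn timestamp_ntz_add_months(micros: Option<i64>, months: Option<i32>) -> Option<i64> {
    let (micros, months) = (micros?, months?);
    // Split into whole days and time of day; Euclidean division keeps
    // the time of day non-negative for pre-epoch timestamps.
    let days = micros.div_euclid(MICROS_PER_DAY);
    let time_of_day = micros.rem_euclid(MICROS_PER_DAY);
    let (year, month, day) = civil_from_days(days);
    // Month arithmetic on a zero-based month count handles year rollover.
    let total = year * 12 + (month as i64 - 1) + months as i64;
    let new_year = total.div_euclid(12);
    let new_month = total.rem_euclid(12) as u32 + 1;
    // Clamp the day to the last valid day of the target month.
    let new_day = day.min(days_in_month(new_year, new_month));
    Some(days_from_civil(new_year, new_month, new_day) * MICROS_PER_DAY + time_of_day)
}
```

For instance, passing the microsecond value for 2024-01-31 with a month count of 1 yields the value for 2024-02-29, and the time-of-day component is carried through unchanged.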
Examples:
-- Add 2 years and 3 months to a timestamp
SELECT timestamp_col + INTERVAL '2-3' YEAR TO MONTH FROM events;
-- Add 1 year to current timestamp
SELECT current_timestamp() + INTERVAL '1' YEAR;
// DataFrame API usage
import org.apache.spark.sql.functions._
df.select(col("created_at") + expr("INTERVAL '1-6' YEAR TO MONTH"))
// Using interval column
df.select(col("timestamp_col") + col("interval_col"))
Implementation Approach
See the Comet guide on adding new expressions for detailed instructions.
- Scala Serde: Add expression handler in spark/src/main/scala/org/apache/comet/serde/
- Register: Add to appropriate map in QueryPlanSerde.scala
- Protobuf: Add message type in native/proto/src/proto/expr.proto if needed
- Rust: Implement in native/spark-expr/src/ (check if DataFusion has built-in support first)
Additional context
Difficulty: Medium
Spark Expression Class: org.apache.spark.sql.catalyst.expressions.TimestampAddYMInterval
Related:
- DateAddYMInterval - Adds year-month intervals to date values
- TimestampAddDTInterval - Adds day-time intervals to timestamps
- DateTimeUtils.timestampAddMonths() - Underlying implementation method
This issue was auto-generated from Spark reference documentation.