What is the problem the feature request solves?
Note: This issue was generated with AI assistance. The specification details have been extracted from Spark documentation and may need verification.
Comet does not currently support the Spark timestamp_add function, causing queries using this function to fall back to Spark's JVM execution instead of running natively on DataFusion.
The TimestampAdd expression adds a specified quantity of time units to a timestamp value. It supports various time units (like days, hours, minutes, seconds) and is timezone-aware, returning a timestamp of the same type as the input timestamp.
Supporting this expression would allow more Spark workloads to benefit from Comet's native acceleration.
Describe the potential solution
Spark Specification
Syntax:
TIMESTAMPADD(unit, quantity, timestamp)
Arguments:
| Argument | Type | Description |
| --- | --- | --- |
| unit | String | The time unit to add (e.g., "YEAR", "MONTH", "DAY", "HOUR", "MINUTE", "SECOND") |
| quantity | Long | The number of units to add to the timestamp |
| timestamp | AnyTimestampType | The base timestamp to which the quantity will be added |
| timeZoneId | Option[String] | Optional timezone identifier for timezone-aware calculations |
Return Type: Returns the same data type as the input timestamp parameter (preserves whether it's TimestampType or TimestampNTZType).
Supported Data Types:
- quantity: LongType only
- timestamp: Any timestamp type (TimestampType or TimestampNTZType)
- unit: A unit keyword such as DAY or HOUR, written without quotes in SQL and held as a String on the expression
Edge Cases:
- Null handling: Returns null if any input parameter (quantity or timestamp) is null (nullIntolerant = true)
- Timezone handling: Automatically selects the appropriate timezone based on the timestamp data type
- Unit validation: Invalid unit strings are handled during the expression conversion phase
- Overflow: Large quantity values may cause timestamp overflow; the behavior depends on the underlying DateTimeUtils implementation
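The null and overflow rules above can be illustrated with a minimal Rust sketch. It assumes timestamps are stored as microseconds since the epoch in an `i64`, and it covers only units with a fixed width (HOUR and smaller). The names `fixed_unit_micros`, `timestamp_add_fixed`, and `AddError` are hypothetical and are not part of Comet or DataFusion. Whether overflow should raise an error or follow some other rule needs to be checked against Spark.

```rust
const MICROS_PER_MILLI: i64 = 1_000;
const MICROS_PER_SECOND: i64 = 1_000_000;
const MICROS_PER_MINUTE: i64 = 60 * MICROS_PER_SECOND;
const MICROS_PER_HOUR: i64 = 60 * MICROS_PER_MINUTE;

/// Error cases of this sketch (hypothetical type, not a Comet API).
#[derive(Debug, PartialEq)]
enum AddError {
    UnsupportedUnit,
    Overflow,
}

/// Width in microseconds of a fixed-duration unit. Returns None for unknown
/// units and for calendar units (DAY, MONTH, YEAR, ...), which this sketch
/// assumes need timezone-aware handling and therefore does not cover.
fn fixed_unit_micros(unit: &str) -> Option<i64> {
    match unit.to_ascii_uppercase().as_str() {
        "MICROSECOND" => Some(1),
        "MILLISECOND" => Some(MICROS_PER_MILLI),
        "SECOND" => Some(MICROS_PER_SECOND),
        "MINUTE" => Some(MICROS_PER_MINUTE),
        "HOUR" => Some(MICROS_PER_HOUR),
        _ => None,
    }
}

/// Adds `quantity` units to `ts_micros` (microseconds since the epoch).
/// Ok(None) models SQL NULL: a null quantity or timestamp yields null.
/// Overflow is reported as an error instead of wrapping silently.
fn timestamp_add_fixed(
    unit: &str,
    quantity: Option<i64>,
    ts_micros: Option<i64>,
) -> Result<Option<i64>, AddError> {
    // The unit is validated first, mirroring validation at conversion time.
    let width = fixed_unit_micros(unit).ok_or(AddError::UnsupportedUnit)?;
    let (q, ts) = match (quantity, ts_micros) {
        (Some(q), Some(ts)) => (q, ts),
        _ => return Ok(None),
    };
    let delta = q.checked_mul(width).ok_or(AddError::Overflow)?;
    let sum = ts.checked_add(delta).ok_or(AddError::Overflow)?;
    Ok(Some(sum))
}
```

For example, `timestamp_add_fixed("HOUR", Some(3), Some(0))` yields `Ok(Some(10_800_000_000))`, while a null quantity yields `Ok(None)`. A real kernel would apply the same logic per row over Arrow arrays rather than over single `Option` values.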
Examples:
```sql
-- Add 5 days to a timestamp
SELECT TIMESTAMPADD(DAY, 5, TIMESTAMP '2010-01-01 01:02:03.123456');
-- Add 3 hours to current timestamp
SELECT TIMESTAMPADD(HOUR, 3, current_timestamp());
-- Subtract time by using negative quantity
SELECT TIMESTAMPADD(MINUTE, -30, TIMESTAMP '2010-01-01 12:00:00');
```

```scala
// DataFrame API usage
import org.apache.spark.sql.functions._
// Add 7 days to a timestamp column
df.select(expr("TIMESTAMPADD(DAY, 7, timestamp_col)"))
// Using column references for quantity
df.select(expr("TIMESTAMPADD(HOUR, quantity_col, timestamp_col)"))
```
Implementation Approach
See the Comet guide on adding new expressions for detailed instructions.
- Scala Serde: Add expression handler in spark/src/main/scala/org/apache/comet/serde/
- Register: Add to appropriate map in QueryPlanSerde.scala
- Protobuf: Add message type in native/proto/src/proto/expr.proto if needed
- Rust: Implement in native/spark-expr/src/ (check if DataFusion has built-in support first)
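For the Rust step, units such as MONTH cannot be expressed as a fixed number of microseconds, so some calendar arithmetic is needed unless DataFusion or Arrow already provides it. The sketch below covers only the timezone-free (TimestampNTZType) case, using the well-known days-from-civil conversion for the proleptic Gregorian calendar. All function names are hypothetical. Clamping the day of month to the end of a shorter target month (Jan 31 plus one month gives Feb 28) is an assumption that should be verified against Spark's DateTimeUtils.

```rust
const MICROS_PER_DAY: i64 = 86_400_000_000;

/// Days since 1970-01-01 for a proleptic Gregorian date (month 1..=12).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let mp = (m + 9) % 12; // month index with March = 0
    let doy = (153 * mp + 2) / 5 + d - 1; // day of year, 0..=365
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Inverse of `days_from_civil`: (year, month, day) for a day count.
fn civil_from_days(z: i64) -> (i64, i64, i64) {
    let z = z + 719_468;
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097);
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    let mp = (5 * doy + 2) / 153;
    let d = doy - (153 * mp + 2) / 5 + 1;
    let m = if mp < 10 { mp + 3 } else { mp - 9 };
    let y = yoe + era * 400 + if m <= 2 { 1 } else { 0 };
    (y, m, d)
}

fn days_in_month(y: i64, m: i64) -> i64 {
    match m {
        4 | 6 | 9 | 11 => 30,
        2 if (y % 4 == 0 && y % 100 != 0) || y % 400 == 0 => 29,
        2 => 28,
        _ => 31,
    }
}

/// Adds `months` to a timezone-free timestamp in microseconds since the
/// epoch, keeping the time of day and clamping the day of month.
/// Returns None when the result cannot be represented.
fn add_months_ntz(ts_micros: i64, months: i64) -> Option<i64> {
    let days = ts_micros.div_euclid(MICROS_PER_DAY);
    let time_of_day = ts_micros.rem_euclid(MICROS_PER_DAY);
    let (y, m, d) = civil_from_days(days);
    let total = y.checked_mul(12)?.checked_add(m - 1)?.checked_add(months)?;
    let (ny, nm) = (total.div_euclid(12), total.rem_euclid(12) + 1);
    // An i64 microsecond timestamp spans roughly 292,000 years either side
    // of 1970, so years beyond this bound can never be represented.
    if !(-300_000..=300_000).contains(&ny) {
        return None;
    }
    let nd = d.min(days_in_month(ny, nm));
    days_from_civil(ny, nm, nd)
        .checked_mul(MICROS_PER_DAY)?
        .checked_add(time_of_day)
}
```

QUARTER and YEAR could reuse the same routine with the quantity multiplied by 3 or 12. The TimestampType case additionally needs the session timezone to convert to and from local time, which this sketch leaves out.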
Additional context
Difficulty: Medium
Spark Expression Class: org.apache.spark.sql.catalyst.expressions.TimestampAdd
Related:
- DateAdd: For date-only arithmetic
- DateSub: For subtracting from dates
- Interval expressions for duration-based calculations
- TimeZoneAwareExpression trait for timezone handling
This issue was auto-generated from Spark reference documentation.