feat: add precision to IntervalDay and new IntervalCompound type (#665)
* Update IntervalDay to support multiple subsecond precisions (0-9).
This is done in a way that should be effectively compatible with old
systems. Old plans should continue to be consumable by treating
the old values as microsecond precision.
* Add new IntervalCompound type which is a combination of IntervalMonth
and IntervalDay
BREAKING CHANGE: The encoding of IntervalDay literals has changed in a
strictly backwards incompatible way. However, the logical meaning across
encodings is maintained using a oneof. Moving a field into a oneof makes
it impossible to tell an unset field from one set to zero in older
messages, but the fields are defined such that the two cases have the
same logical meaning. If neither microseconds nor precision is set, the
value can be considered a precision 6 value. If you aren't using the
IntervalDay type, you will not need to make any changes.
BREAKING CHANGE: TypeExpression and Parameterized type protobufs (used
to serialize output derivation) are updated to match the now compound
nature of IntervalDay. If you use protobuf to serialize output
derivations that refer to the IntervalDay type, you will need to rework
that logic.
fixes #664
---------
Co-authored-by: Jacques Nadeau <jacques@apache.org>
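
The compatibility rule in the first BREAKING CHANGE note can be sketched from a consumer's point of view. This is a minimal illustration, not the project's implementation: `NormalizedIntervalDay` and `normalize` are hypothetical names, and the `precision_mode` oneof is modeled with `Optional` arguments (`None` meaning "not set") rather than with generated protobuf classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NormalizedIntervalDay:
    """Hypothetical consumer-side view of an IntervalDayToSecond literal."""
    days: int
    seconds: int
    subseconds: int  # fractional seconds in units of 10**-precision seconds
    precision: int   # 0..9


def normalize(days: int, seconds: int,
              microseconds: Optional[int] = None,
              precision: Optional[int] = None,
              subseconds: int = 0) -> NormalizedIntervalDay:
    """Map the legacy and the new encoding onto a single representation."""
    if microseconds is not None and precision is not None:
        # A oneof can hold only one member at a time.
        raise ValueError("microseconds and precision are mutually exclusive")
    if precision is not None:
        if not 0 <= precision <= 9:
            raise ValueError("precision must be between 0 and 9")
        return NormalizedIntervalDay(days, seconds, subseconds, precision)
    # Legacy encoding, or neither member set: treat as a precision 6 value.
    return NormalizedIntervalDay(days, seconds, microseconds or 0, 6)
```

Under this sketch, an old plan carrying `microseconds=500` and a message with neither member set both normalize to precision 6, which is the behavior the note describes.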
proto/substrait/algebra.proto (16 additions & 1 deletion)
@@ -817,6 +817,7 @@ message Expression {
       int64 time = 17;
       IntervalYearToMonth interval_year_to_month = 19;
       IntervalDayToSecond interval_day_to_second = 20;
+      IntervalCompound interval_compound = 36;
       string fixed_char = 21;
       VarChar var_char = 22;
       bytes fixed_binary = 23;
@@ -888,7 +889,21 @@ message Expression {
     message IntervalDayToSecond {
       int32 days = 1;
       int32 seconds = 2;
-      int32 microseconds = 3;
+
+      // Consumers should expect either (microseconds) to be set or (precision and subseconds) to be set
+      oneof precision_mode {
+        int32 microseconds = 3 [deprecated = true]; // use precision and subseconds below, they cover and replace microseconds.
+        // Sub-second precision, 0 means the value given is in seconds, 3 is milliseconds, 6 microseconds, 9 is nanoseconds. Should be used with subseconds below.
+        int32 precision = 4;
+      }
+
+      // the number of fractional seconds using 1e(-precision) units. Should only be used with precision field, not microseconds.
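
To make the "1e(-precision) units" wording in the comments above concrete, here is a small sketch of how a subseconds count could be turned into an exact number of seconds. The helper names are hypothetical and exact rational arithmetic is an illustrative choice, not something the commit prescribes.

```python
from fractions import Fraction

SECONDS_PER_DAY = 86_400


def subseconds_to_fraction(subseconds: int, precision: int) -> Fraction:
    """Interpret `subseconds` as a count of 10**-precision second units."""
    if not 0 <= precision <= 9:
        raise ValueError("precision must be between 0 and 9")
    return Fraction(subseconds, 10 ** precision)


def total_seconds(days: int, seconds: int,
                  subseconds: int, precision: int) -> Fraction:
    """Exact length of the interval in seconds (hypothetical helper)."""
    return (days * SECONDS_PER_DAY + seconds
            + subseconds_to_fraction(subseconds, precision))
```

For example, 500 units at precision 3 and 500,000,000 units at precision 9 both denote half a second, so the two encodings yield the same total.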
site/docs/types/type_classes.md (2 additions & 1 deletion)
@@ -24,7 +24,6 @@ Simple type classes are those that don't support any form of configuration. For
 | date | A date within [1000-01-01..9999-12-31]. | `int32` days since `1970-01-01`
 | time | A time since the beginning of any day. Range of [0..86,399,999,999] microseconds; leap seconds need not be supported. | `int64` microseconds past midnight
 | interval_year | Interval year to month. Supports a range of [-10,000..10,000] years with month precision (= [-120,000..120,000] months). Usually stored as separate integers for years and months, but only the total number of months is significant, i.e. `1y 0m` is considered equal to `0y 12m` or `1001y -12000m`. | `int32` years and `int32` months, with the added constraint that each component can never independently specify more than 10,000 years, even if the components have opposite signs (e.g. `-10000y 200000m` is **not** allowed)
-| interval_day | Interval day to second. Supports a range of [-3,650,000..3,650,000] days with microsecond precision (= [-315,360,000,000,000,000..315,360,000,000,000,000] microseconds). Usually stored as separate integers for various components, but only the total number of microseconds is significant, i.e. `1d 0s` is considered equal to `0d 86400s`. | `int32` days, `int32` seconds, and `int32` microseconds, with the added constraint that each component can never independently specify more than 10,000 years, even if the components have opposite signs (e.g. `3650001d -86400s 0us` is **not** allowed)
 | uuid | A universally-unique identifier composed of 128 bits. Typically presented to users in the following hexadecimal format: `c48ffa9e-64f4-44cb-ae47-152b4e60e77b`. Any 128-bit value is allowed, without specific adherence to RFC4122. | 16-byte `binary`
 
 ## Compound Types
@@ -43,6 +42,8 @@ Compound type classes are type classes that need to be configured by means of a
 | MAP<K, V> | An unordered list of type K keys with type V values. Keys may be repeated. While the key type could be nullable, keys may not be null. | `repeated KeyValue` (in turn two `Literal`s), all key types matching K and all value types matching V
 | PRECISIONTIMESTAMP<P> | A timestamp with fractional second precision (P, number of digits) 0 <= P <= 9. Does not include timezone information and can thus not be unambiguously mapped to a moment on the timeline without context. Similar to naive datetime in Python. | `int64` seconds, milliseconds, microseconds or nanoseconds since 1970-01-01 00:00:00.000000000 (in an unspecified timezone)
 | PRECISIONTIMESTAMPTZ<P> | A timezone-aware timestamp, with fractional second precision (P, number of digits) 0 <= P <= 9. Similar to aware datetime in Python. | `int64` seconds, milliseconds, microseconds or nanoseconds since 1970-01-01 00:00:00.000000000 UTC
+| INTERVAL_DAY<P> | Interval day to second. Supports a range of [-3,650,000..3,650,000] days with fractional second precision (P, number of digits) 0 <= P <= 9. Usually stored as separate integers for various components, but only the total number of fractional seconds is significant, i.e. `1d 0s` is considered equal to `0d 86400s`. | `int32` days, `int32` seconds, and `int64` fractional seconds, with the added constraint that each component can never independently specify more than 10,000 years, even if the components have opposite signs (e.g. `3650001d -86400s 0us` is **not** allowed)
+| INTERVAL_COMPOUND<P> | A compound interval type that is composed of elements of the underlying elements and rules of both interval_month and interval_day to express arbitrary durations across multiple grains. Substrait gives no definition for the conversion of values between independent grains (e.g. months to days).
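
The equality and range rules in the INTERVAL_DAY<P> row can be illustrated with a short sketch. The function names are hypothetical, and reading "10,000 years" as 3,650,000 days of 86,400 seconds each is an assumption taken from the range the row itself quotes.

```python
SECONDS_PER_DAY = 86_400
MAX_DAYS = 3_650_000  # the "10,000 years" bound quoted in the table row


def total_units(days: int, seconds: int, subseconds: int, precision: int) -> int:
    """Total interval length in 10**-precision second units."""
    return (days * SECONDS_PER_DAY + seconds) * 10 ** precision + subseconds


def intervals_equal(a: tuple, b: tuple, precision: int) -> bool:
    """Only the total is significant, so compare totals, not components."""
    return total_units(*a, precision) == total_units(*b, precision)


def is_in_range(days: int, seconds: int, subseconds: int, precision: int) -> bool:
    """Each component, and the total, must stay within the bound."""
    max_seconds = MAX_DAYS * SECONDS_PER_DAY
    max_units = max_seconds * 10 ** precision
    return (abs(days) <= MAX_DAYS
            and abs(seconds) <= max_seconds
            and abs(subseconds) <= max_units
            and abs(total_units(days, seconds, subseconds, precision)) <= max_units)
```

With these helpers, `(1, 0, 0)` and `(0, 86400, 0)` compare equal, while `is_in_range(3650001, -86400, 0, 6)` is rejected because the days component alone exceeds the bound even though the total does not.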