-
-
Notifications
You must be signed in to change notification settings - Fork 95
Closed
Labels
Description
| Pandas Value | chDB Display | isNull() |
Issue |
|---|---|---|---|
2024-01-01 |
1978-10-22 |
0 | Date corrupted |
pd.NaT |
1970-01-01 |
0 | Should be NULL |
Date functions like toYear(), toMonth() return correct values,
but the displayed datetime is corrupted.
to check
import pandas as pd
import numpy as np
import chdb
print('=== Testing pandas datetime64 with chDB ===')
print()
# Create DataFrame with datetime and NaT
df = pd.DataFrame({
'date_col': pd.to_datetime(['2024-01-01', pd.NaT, '2024-03-15', pd.NaT, '2024-05-20']),
'name': ['A', 'B', 'C', 'D', 'E'],
})
print('Original pandas DataFrame:')
print(df)
print()
print('dtypes:')
print(df.dtypes)
print()
print('isna() result:')
print(df['date_col'].isna())
print()
# Query via chDB
conn = chdb.connect(':memory:')
__df__ = df
print('=== chDB Query Results ===')
print()
print('1. Basic SELECT:')
try:
result = conn.query('SELECT * FROM Python(__df__)', 'DataFrame')
print(result)
print('dtypes:', result.dtypes.to_dict())
except Exception as e:
print('Error:', e)
print()
print('2. isNull check:')
try:
result = conn.query('SELECT date_col, isNull(date_col) as is_null FROM Python(__df__)', 'DataFrame')
print(result)
except Exception as e:
print('Error:', e)
print()
print('3. Date functions:')
try:
result = conn.query('SELECT date_col, toYear(date_col) as year, toMonth(date_col) as month FROM Python(__df__)', 'DataFrame')
print(result)
except Exception as e:
print('Error:', e)
print()
print('4. ifNull for date:')
try:
result = conn.query(\"SELECT date_col, ifNull(date_col, toDate('1970-01-01')) as filled FROM Python(__df__)\", 'DataFrame')
print(result)
except Exception as e:
print('Error:', e)
print()
print('5. Date comparison:')
try:
result = conn.query(\"SELECT date_col, date_col > toDate('2024-02-01') as after_feb FROM Python(__df__)\", 'DataFrame')
print(result)
except Exception as e:
print('Error:', e)
print()
print('6. min/max aggregates:')
try:
result = conn.query('SELECT min(date_col) as min_date, max(date_col) as max_date FROM Python(__df__)', 'DataFrame')
print(result)
except Exception as e:
print('Error:', e)output
=== Testing pandas datetime64 with chDB ===
Original pandas DataFrame:
date_col name
0 2024-01-01 A
1 NaT B
2 2024-03-15 C
3 NaT D
4 2024-05-20 E
dtypes:
date_col datetime64[ns]
name object
dtype: object
isna() result:
0 False
1 True
2 False
3 True
4 False
Name: date_col, dtype: bool
=== chDB Query Results ===
1. Basic SELECT:
date_col name
0 1978-10-22 15:35:12.388288512+07:30 A
1 1970-01-01 07:30:00+07:30 B
2 1832-12-17 10:00:29.217286144+06:55 C
3 1970-01-01 07:30:00+07:30 D
4 2255-11-03 07:31:56.654886912+08:00 E
dtypes: {'date_col': datetime64[ns, Asia/Singapore], 'name': dtype('O')}
2. isNull check:
date_col is_null
0 1978-10-22 15:35:12.388288512+07:30 0
1 1970-01-01 07:30:00+07:30 0
2 1832-12-17 10:00:29.217286144+06:55 0
3 1970-01-01 07:30:00+07:30 0
4 2255-11-03 07:31:56.654886912+08:00 0
3. Date functions:
date_col year month
0 1978-10-22 15:35:12.388288512+07:30 2024 1
1 1970-01-01 07:30:00+07:30 1900 1
2 1832-12-17 10:00:29.217286144+06:55 2024 3
3 1970-01-01 07:30:00+07:30 1900 1
4 2255-11-03 07:31:56.654886912+08:00 2024 5
4. ifNull for date:
date_col filled
0 1978-10-22 15:35:12.388288512+07:30 1978-10-22 15:35:12.388288512+07:30
1 1970-01-01 07:30:00+07:30 1970-01-01 07:30:00+07:30
2 1832-12-17 10:00:29.217286144+06:55 1832-12-17 10:00:29.217286144+06:55
3 1970-01-01 07:30:00+07:30 1970-01-01 07:30:00+07:30
4 2255-11-03 07:31:56.654886912+08:00 2255-11-03 07:31:56.654886912+08:00
5. Date comparison:
date_col after_feb
0 1978-10-22 15:35:12.388288512+07:30 0
1 1970-01-01 07:30:00+07:30 0
2 1832-12-17 10:00:29.217286144+06:55 1
3 1970-01-01 07:30:00+07:30 0
4 2255-11-03 07:31:56.654886912+08:00 1
6. min/max aggregates:
min_date max_date
0 NaT 2024-05-20 08:00:00+08:00