Utils to work with our data lake.
python3 -m pip install git+https://github.com/noverde/novlake#egg=novlake
- Add config of user test to novlake-settings.yaml
- Create database user_test in Athena:
    CREATE DATABASE user_test;
- Run pytest:
    pytest
Create a .env file in your home directory with the following line:
export NOVLAKE_SETTINGS=s3://<BUCKET_NAME>/novlake-settings.yaml
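
As a quick sanity check on that variable, the sketch below reads NOVLAKE_SETTINGS from the environment and splits it into a bucket and an object key. The helper names split_s3_uri and settings_location are hypothetical and not part of novlake, and example-bucket is a placeholder; how novlake itself resolves the settings file is not shown here.

```python
import os
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key).

    Hypothetical helper for illustration; not part of novlake.
    """
    parsed = urlparse(uri)
    # Require the s3 scheme, a bucket name, and a non-empty object key.
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"not a valid S3 object URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def settings_location(environ=os.environ):
    """Return (bucket, key) for the settings file named by NOVLAKE_SETTINGS."""
    uri = environ.get("NOVLAKE_SETTINGS")
    if not uri:
        raise RuntimeError("NOVLAKE_SETTINGS is not set")
    return split_s3_uri(uri)


if __name__ == "__main__":
    # Placeholder value; replace example-bucket with your own bucket name.
    demo_env = {"NOVLAKE_SETTINGS": "s3://example-bucket/novlake-settings.yaml"}
    print(settings_location(demo_env))
```

Passing the environment in as a mapping keeps the check easy to test without touching the real shell environment.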
from novlake.lake import Lake
lake = Lake("camila")
lake.query("SELECT * FROM dumps.loans LIMIT 10")

novlake-settings.yaml shall use the following schema:
documentation_home: ""
users:
  default:
    notebook_path: s3://sample-notebooks/default/
    athena_schema_name: user_default
    s3_repo: s3://sample-repo/user_default/
    athena_output: s3://aws-athena-query-results-sample/novlake/user_default/
  test:
    notebook_path: s3://novlake-test-data/notebooks/user_test/
    athena_schema_name: user_test
    s3_repo: s3://novlake-test-data/user_test/
    athena_output: s3://novlake-test-data/athena_output/
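
A minimal sketch of how a settings file with this shape could be checked before use, working on the already-parsed mapping (for example the result of loading the YAML). The validate_settings function is hypothetical and not part of novlake. It assumes that all four per-user keys are required and that notebook_path, s3_repo, and athena_output hold s3:// URIs, which matches the sample above but is not confirmed by the library.

```python
# Assumption: every user entry needs these four keys, as in the sample schema.
REQUIRED_USER_KEYS = ("notebook_path", "athena_schema_name", "s3_repo", "athena_output")
# Assumption: these keys hold S3 locations.
S3_KEYS = ("notebook_path", "s3_repo", "athena_output")


def validate_settings(settings):
    """Return a list of problems found in a parsed settings mapping.

    Hypothetical helper for illustration; not part of novlake.
    An empty list means no problems were found.
    """
    users = settings.get("users")
    if not isinstance(users, dict) or not users:
        return ["users: expected a non-empty mapping"]

    problems = []
    for name, conf in users.items():
        if not isinstance(conf, dict):
            problems.append(f"users.{name}: expected a mapping")
            continue
        for key in REQUIRED_USER_KEYS:
            value = conf.get(key)
            if not isinstance(value, str) or not value:
                problems.append(f"users.{name}.{key}: missing or empty")
            elif key in S3_KEYS and not value.startswith("s3://"):
                problems.append(f"users.{name}.{key}: expected an s3:// URI")
    return problems


# The "test" user from the schema above, written as a Python mapping.
example = {
    "documentation_home": "",
    "users": {
        "test": {
            "notebook_path": "s3://novlake-test-data/notebooks/user_test/",
            "athena_schema_name": "user_test",
            "s3_repo": "s3://novlake-test-data/user_test/",
            "athena_output": "s3://novlake-test-data/athena_output/",
        }
    },
}

if __name__ == "__main__":
    print(validate_settings(example))
```

Returning a list of messages rather than raising on the first error makes it possible to report every misconfigured user in one pass.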