Merged
38 changes: 34 additions & 4 deletions README.md
Expand Up @@ -7,7 +7,7 @@
[![flake8](https://github.com/Curts0/PyTabular/actions/workflows/flake8.yml/badge.svg?branch=master)](https://github.com/Curts0/PyTabular/actions/workflows/flake8.yml)
### What is it?

[PyTabular](https://github.com/Curts0/PyTabular) (python-tabular in [pypi](https://pypi.org/project/python-tabular/)) is a python package that allows for programmatic execution on your tabular models! This is possible thanks to [Pythonnet](https://pythonnet.github.io/) and Microsoft's [.Net APIs on Azure Analysis Services](https://docs.microsoft.com/en-us/dotnet/api/microsoft.analysisservices?view=analysisservices-dotnet). Current, this build is tested and working on Windows Operating System only. Help is needed to expand this for other operating systems. The package should have the dll files included when you import it. See [Documentation Here](https://curts0.github.io/PyTabular/). PyTabular is still considered alpha while I'm working on building out the proper tests and testing environments, so I can ensure some kind of stability in features. Please send bugs my way! Preferably in the issues section in Github. I want to harden this project so many can use it easily. I currently have local pytest for python 3.6 to 3.10 and run those tests through a local AAS and Gen2 model.
[PyTabular](https://github.com/Curts0/PyTabular) (python-tabular in [pypi](https://pypi.org/project/python-tabular/)) is a python package that allows for programmatic execution on your tabular models! This is possible thanks to [Pythonnet](https://pythonnet.github.io/) and Microsoft's [.Net APIs on Azure Analysis Services](https://docs.microsoft.com/en-us/dotnet/api/microsoft.analysisservices?view=analysisservices-dotnet). Currently, this build is tested and working on Windows Operating System only. Help is needed to expand this for other operating systems. The package should have the dll files included when you import it. See [Documentation Here](https://curts0.github.io/PyTabular/). PyTabular is still considered alpha while I'm working on building out the proper tests and testing environments, so I can ensure some kind of stability in features. Please send bugs my way! Preferably in the issues section in Github. I want to harden this project so many can use it easily. I currently have local pytest for python 3.6 to 3.10 and run those tests through a local AAS and Gen2 model.

### Getting Started
See the [Pypi project](https://pypi.org/project/python-tabular/) for available versions.
Expand Down Expand Up @@ -92,8 +92,8 @@ model.Tables['Table Name'].Refresh()
#or
model.Tables['Table Name'].Partitions['Partition Name'].Refresh()

#Add Tracing=True for simple Traces tracking the refresh.
model.Refresh(['Table1','Table2'], Tracing=True)
#Default Tracing happens automatically, but can be removed by --
model.Refresh(['Table1','Table2'], trace = None)
```
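
The "on by default, `None` to opt out" behaviour described in the comment above can be sketched without a server. This is only an illustration of the pattern, not PyTabular's implementation: `DemoTrace`, `demo_refresh`, and the `events` list are hypothetical names made up for this example.

```python
from typing import List, Optional, Type

# Hypothetical event log so the example can show when a trace actually ran.
events: List[str] = []


class DemoTrace:
    """Hypothetical trace; records start/stop instead of talking to a server."""

    def start(self) -> None:
        events.append("start")

    def stop(self) -> None:
        events.append("stop")


def demo_refresh(
    tables: List[str], trace: Optional[Type[DemoTrace]] = DemoTrace
) -> List[str]:
    # A trace class is used by default; passing trace=None skips tracing.
    active = trace() if trace is not None else None
    if active is not None:
        active.start()
    try:
        # Stand-in for the real refresh work.
        refreshed = [f"refreshed {name}" for name in tables]
    finally:
        # Always stop the trace, even if the refresh raised.
        if active is not None:
            active.stop()
    return refreshed


with_trace = demo_refresh(["Table1", "Table2"])
events_after_default = list(events)

without_trace = demo_refresh(["Table1", "Table2"], trace=None)
events_after_opt_out = list(events)
```

The second call adds nothing to `events`, which is the effect `trace = None` is meant to have in the README example.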

It's not uncommon to need to run through some checks on specific Tables, Partitions, Columns, Etc...
Expand All @@ -118,7 +118,7 @@ This will use the function [Return_Zero_Row_Tables](https://curts0.github.io/PyT
```python
import pytabular
model = pytabular.Tabular(CONNECTION_STR)
tables = pytabular.Return_Zero_Row_Tables()
tables = pytabular.Return_Zero_Row_Tables(model)
if len(tables) > 0:
    model.Refresh(tables) #Tracing now runs by default; pass trace = None to turn it off.
```
Expand Down Expand Up @@ -180,5 +180,35 @@ for file_path in LIST_OF_FILE_PATHS:
model.Query(file_path)
```

#### Advanced Refreshing with Pre and Post Checks
Maybe you are introducing new logic to a fact table, and you need to ensure that a measure checking last month's values never changes. To do that you can take advantage of the `Refresh_Check` and `Refresh_Check_Collection` classes (Sorry, I know the documentation stinks right now). With those you can build something that first checks the result of the measure, then refreshes, then checks the result of the measure again after the refresh, and lastly performs your desired check. In this case the check is that the `pre` value matches the `post` value. If your pre does not equal post after refreshing, the check fails and gives an assertion error in your logging.
```python
from pytabular import Tabular
from pytabular.refresh import Refresh_Check, Refresh_Check_Collection

model = Tabular(CONNECTION_STR)

# This is our custom check that we want to run after refresh.
# Does the pre-refresh value match the post-refresh value?
def sum_of_sales_assertion(pre, post):
    return pre == post

# This is where we put it all together into the `Refresh_Check` class. Give it a name, give it a query to run, and give it the assertion you want to make.
sum_of_last_month_sales = Refresh_Check(
    'Last Month Sales',
    lambda: model.Query("EVALUATE {[Last Month Sales]}"),
    sum_of_sales_assertion,
)

# Here we are adding it to a `Refresh_Check_Collection` because you can have more than one `Refresh_Check` to run.
all_refresh_check = Refresh_Check_Collection([sum_of_last_month_sales])

model.Refresh(
    'Fact Table Name',
    refresh_checks=all_refresh_check,
)
```
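
To make the pre → refresh → post → assert order concrete, here is a minimal, self-contained sketch that runs without a model. It is not how `Refresh_Check` is implemented: `DemoCheck`, `run_with_checks`, and the `state` dict are hypothetical stand-ins, and raising `AssertionError` directly is an assumption made to keep the example short.

```python
from typing import Any, Callable, List


class DemoCheck:
    """Hypothetical check: a name, a value getter, and a pre/post assertion."""

    def __init__(
        self,
        name: str,
        getter: Callable[[], Any],
        assertion: Callable[[Any, Any], bool],
    ) -> None:
        self.name = name
        self.getter = getter
        self.assertion = assertion
        self.pre: Any = None
        self.post: Any = None


def run_with_checks(refresh: Callable[[], None], checks: List[DemoCheck]) -> None:
    # 1. Capture every value before the refresh.
    for check in checks:
        check.pre = check.getter()
    # 2. Run the refresh itself.
    refresh()
    # 3. Capture the values again and apply each assertion.
    for check in checks:
        check.post = check.getter()
        if not check.assertion(check.pre, check.post):
            raise AssertionError(
                f"{check.name} failed: pre={check.pre!r}, post={check.post!r}"
            )


# Toy "model": a dict standing in for a measure value.
state = {"last_month_sales": 100}


def stable_refresh() -> None:
    pass  # a refresh that leaves last month's value untouched


def breaking_refresh() -> None:
    state["last_month_sales"] = 90  # new logic accidentally changes history


check = DemoCheck(
    "Last Month Sales",
    lambda: state["last_month_sales"],
    lambda pre, post: pre == post,
)

run_with_checks(stable_refresh, [check])  # pre == post, so nothing is raised

try:
    run_with_checks(breaking_refresh, [check])
    failed = False
except AssertionError:
    failed = True
```

The getter is a callable rather than a value so it can be evaluated twice, once on each side of the refresh. That is the same reason the README example wraps `model.Query(...)` in a `lambda`.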

### Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md)
Binary file removed dist/python_tabular-0.0.35-py3-none-any.whl
Binary file not shown.
Binary file removed dist/python_tabular-0.0.35.tar.gz
Binary file not shown.
Binary file removed dist/python_tabular-0.0.40-py3-none-any.whl
Binary file not shown.
Binary file removed dist/python_tabular-0.0.40.tar.gz
Binary file not shown.
Binary file removed dist/python_tabular-0.0.50-py3-none-any.whl
Binary file not shown.
Binary file removed dist/python_tabular-0.0.50.tar.gz
Binary file not shown.
Binary file removed dist/python_tabular-0.0.60-py3-none-any.whl
Binary file not shown.
Binary file removed dist/python_tabular-0.0.60.tar.gz
Binary file not shown.
Binary file removed dist/python_tabular-0.0.70-py3-none-any.whl
Binary file not shown.
Binary file removed dist/python_tabular-0.0.70.tar.gz
Binary file not shown.
Binary file removed dist/python_tabular-0.0.80-py3-none-any.whl
Binary file not shown.
Binary file removed dist/python_tabular-0.0.80.tar.gz
Binary file not shown.
Binary file removed dist/python_tabular-0.0.90-py3-none-any.whl
Binary file not shown.
Binary file removed dist/python_tabular-0.0.90.tar.gz
Binary file not shown.
Binary file removed dist/python_tabular-0.1.0-py3-none-any.whl
Binary file not shown.
Binary file removed dist/python_tabular-0.1.0.tar.gz
Binary file not shown.
6 changes: 6 additions & 0 deletions mkgendocs.yml
Expand Up @@ -13,6 +13,12 @@ pages:
source: 'pytabular/query.py'
classes:
- Connection
- page: "Refreshes.md"
source: 'pytabular/refresh.py'
classes:
- PyRefresh
- Refresh_Check
- Refresh_Check_Collection
- page: "Table.md"
source: 'pytabular/table.py'
classes:
Expand Down
2 changes: 1 addition & 1 deletion pyproject.toml
Expand Up @@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "python_tabular"
version = "0.1.8"
version = "0.1.9"
authors = [
{ name="Curtis Stallings", email="curtisrstallings@gmail.com" },
]
Expand Down
23 changes: 13 additions & 10 deletions pytabular/__init__.py
Expand Up @@ -4,29 +4,32 @@
import sys
import platform
from rich.logging import RichHandler
from rich.theme import Theme
from rich.console import Console
from rich import pretty

pretty.install()
console = Console(theme=Theme({"logging.level.warning": "bold reverse red"}))
logging.basicConfig(
level=logging.DEBUG,
format="%(message)s",
datefmt="[%H:%M:%S]",
handlers=[RichHandler()],
handlers=[RichHandler(console=console)],
)
logger = logging.getLogger("PyTabular")
logger.setLevel(logging.INFO)
logger.info("Logging configured...")
logger.info(f"To update PyTabular logger...")
logger.info(f"To update logging:")
logger.info(f">>> import logging")
logger.info(f">>> pytabular.logger.setLevel(level=logging.INFO)")
logger.info(f"See https://docs.python.org/3/library/logging.html#logging-levels")


logger.debug(f"Python Version::{sys.version}")
logger.debug(f"Python Location::{sys.exec_prefix}")
logger.debug(f"Package Location::{__file__}")
logger.debug(f"Working Directory::{os.getcwd()}")
logger.debug(f"Platform::{sys.platform}-{platform.release()}")
logger.info(f"Python Version::{sys.version}")
logger.info(f"Python Location::{sys.exec_prefix}")
logger.info(f"Package Location::{__file__}")
logger.info(f"Working Directory::{os.getcwd()}")
logger.info(f"Platform::{sys.platform}-{platform.release()}")

dll = os.path.join(os.path.dirname(__file__), "dll")
sys.path.append(dll)
Expand All @@ -35,11 +38,11 @@
logger.debug(f"Beginning CLR references...")
import clr

logger.info("Adding Reference Microsoft.AnalysisServices.AdomdClient")
logger.debug("Adding Reference Microsoft.AnalysisServices.AdomdClient")
clr.AddReference("Microsoft.AnalysisServices.AdomdClient")
logger.info("Adding Reference Microsoft.AnalysisServices.Tabular")
logger.debug("Adding Reference Microsoft.AnalysisServices.Tabular")
clr.AddReference("Microsoft.AnalysisServices.Tabular")
logger.info("Adding Reference Microsoft.AnalysisServices")
logger.debug("Adding Reference Microsoft.AnalysisServices")
clr.AddReference("Microsoft.AnalysisServices")

logger.debug(f"Importing specifics in module...")
Expand Down
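
The `__init__.py` changes above move the CLR reference messages to `debug` and the environment messages to `info`, with the `PyTabular` logger set to `INFO`. A small standard-library sketch of what that means for a user who changes the level: the `ListHandler` class and the message strings are hypothetical, used here only so the effect can be observed without `rich` or pytabular installed.

```python
import logging

# "PyTabular" is the logger name used in the diff above.
logger = logging.getLogger("PyTabular")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger

records = []


class ListHandler(logging.Handler):
    """Hypothetical handler that collects messages in a list."""

    def emit(self, record: logging.LogRecord) -> None:
        records.append(record.getMessage())


logger.addHandler(ListHandler())

logger.debug("Adding Reference demo")  # dropped: logger is at INFO
logger.info("Python Version::demo")    # kept

# The same call the startup hint suggests, with a more verbose level.
logger.setLevel(logging.DEBUG)
logger.debug("Adding Reference demo")  # kept now that DEBUG is enabled
```

After import, `pytabular.logger.setLevel(logging.DEBUG)` should have the same effect on the package's own logger, since the startup hint in the diff shows that call with `logging.INFO`.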
150 changes: 31 additions & 119 deletions pytabular/pytabular.py
Expand Up @@ -2,7 +2,6 @@

from Microsoft.AnalysisServices.Tabular import (
Server,
RefreshType,
ColumnType,
Table,
DataColumn,
Expand All @@ -11,7 +10,7 @@
)

from Microsoft.AnalysisServices import UpdateOptions
from typing import Any, Dict, List, Union
from typing import List, Union
from collections import namedtuple
import pandas as pd
import os
Expand All @@ -20,16 +19,16 @@
from logic_utils import (
pd_dataframe_to_m_expression,
pandas_datatype_to_tabular_datatype,
ticks_to_datetime,
remove_suffix,
)
from query import Connection

from table import PyTable, PyTables
from partition import PyPartitions
from column import PyColumns
from measure import PyMeasures
from tabular_tracing import Refresh_Trace
from object import PyObject
from refresh import PyRefresh
from query import Connection

logger = logging.getLogger("PyTabular")

Expand All @@ -51,6 +50,8 @@ class Tabular(PyObject):
"""

def __init__(self, CONNECTION_STR: str):

# Connecting to model...
logger.debug("Initializing Tabular Class")
self.Server = Server()
self.Server.Connect(CONNECTION_STR)
Expand All @@ -72,10 +73,15 @@ def __init__(self, CONNECTION_STR: str):
self.CompatibilityMode: int = self.Database.CompatibilityMode.value__
self.Model = self.Database.Model
logger.info(f"Connected to Model - {self.Model.Name}")
super().__init__(self.Model)
self.Adomd = Connection(self.Server)
self.Adomd: Connection = Connection(self.Server)

# Build PyObjects
self.Reload_Model_Info()

# Run parent class (PyObject) init
super().__init__(self.Model)

# Building rich table display for repr
self._display.add_row(
"EstimatedSize",
str(round(self.Database.EstimatedSize / 1000000000, 2)) + " GB",
Expand All @@ -90,21 +96,21 @@ def __init__(self, CONNECTION_STR: str):
self._display.add_row("Database", self.Database.Name)
self._display.add_row("Server", self.Server.Name)

# Finished and registering disconnect
logger.debug("Class Initialization Completed")
logger.debug("Registering Disconnect on Termination...")
atexit.register(self.Disconnect)

pass

# def __repr__(self) -> str:
# return f"Server::{self.Server.Name}\nDatabase::{self.Database.Name}\nModel::{self.Model.Name}\nEstimated Size::{self.Database.EstimatedSize}"

def Reload_Model_Info(self) -> bool:
"""Runs on __init__ iterates through details, can be called after any model changes. Called in SaveChanges()

Returns:
bool: True if successful
"""
self.Database.Refresh()

self.Tables = PyTables(
[PyTable(table, self) for table in self.Model.Tables.GetEnumerator()]
)
Expand All @@ -117,7 +123,6 @@ def Reload_Model_Info(self) -> bool:
self.Measures = PyMeasures(
[measure for table in self.Tables for measure in table.Measures]
)
self.Database.Refresh()
return True

def Is_Process(self) -> bool:
Expand All @@ -142,117 +147,22 @@ def Disconnect(self) -> bool:
logger.debug(f"Disconnecting from - {self.Server.Name}")
return self.Server.Disconnect()

def Refresh(
self,
Object: Union[str, Table, Partition, Dict[str, Any]],
RefreshType: RefreshType = RefreshType.Full,
Tracing=False,
) -> None:
"""Refreshes table(s) and partition(s).
def Refresh(self, *args, **kwargs) -> pd.DataFrame:
"""PyRefresh Class to handle refreshes of model.

Args:
Object (Union[ str, Table, Partition, Dict[str, Any], Iterable[str, Table, Partition, Dict[str, Any]] ]): Designed to handle a few different ways of selecting a refresh.
str == 'Table_Name'
Table == Table Object
Partition == Partition Object
Dict[str, Any] == A way to specify a partition of group of partitions. For ex. {'Table_Name':'Partition1'} or {'Table_Name':['Partition1','Partition2']}. NOTE you can also change out the strings for partition or tables objects.
RefreshType (RefreshType, optional): See [RefreshType](https://docs.microsoft.com/en-us/dotnet/api/microsoft.analysisservices.tabular.refreshtype?view=analysisservices-dotnet). Defaults to RefreshType.Full.
Tracing (bool, optional): Currently just some basic tracing to track refreshes. Defaults to False.

Raises:
Exception: Raises exception if unable to find table or partition via string.

model (Tabular): Main Tabular Class
object (Union[str, PyTable, PyPartition, Dict[str, Any]]): Designed to handle a few different ways of selecting a refresh. Can be a string of 'Table Name' or dict of {'Table Name': 'Partition Name'} or even some combination with the actual PyTable and PyPartition classes.
trace (Base_Trace, optional): Set to `None` if no Tracing is desired, otherwise you can use default trace or create your own. Defaults to Refresh_Trace.
refresh_checks (Refresh_Check_Collection, optional): Add your `Refresh_Check`'s into a `Refresh_Check_Collection`. Defaults to Refresh_Check_Collection().
default_row_count_check (bool, optional): Quick built in check will fail the refresh if post check row count is zero. Defaults to True.
refresh_type (RefreshType, optional): Input RefreshType desired. Defaults to RefreshType.Full.

Returns:
WIP: WIP
pd.DataFrame
"""
logger.debug("Beginning RequestRefresh cadence...")

def _Refresh_Report(Property_Changes) -> pd.DataFrame:
logger.debug("Running Refresh Report...")
refresh_data = []
for property_change in Property_Changes:
if (
isinstance(property_change.Object, Partition)
and property_change.Property_Name == "RefreshedTime"
):
table, partition, refreshed_time = (
property_change.Object.Table.Name,
property_change.Object.Name,
ticks_to_datetime(property_change.New_Value.Ticks),
)
logger.info(
f'{table} - {partition} Refreshed! - {refreshed_time.strftime("%m/%d/%Y, %H:%M:%S")}'
)
refresh_data += [[table, partition, refreshed_time]]
return pd.DataFrame(
refresh_data, columns=["Table", "Partition", "Refreshed Time"]
)

def refresh_table(table: Table) -> None:
logging.info(f"Requesting refresh for {table.Name}")
table.RequestRefresh(RefreshType)

def refresh_partition(partition: Partition) -> None:
logging.info(
f"Requesting refresh for {partition.Table.Name}|{partition.Name}"
)
partition.RequestRefresh(RefreshType)

def refresh_dict(partition_dict: Dict) -> None:
for table in partition_dict.keys():
table_object = find_table(table) if isinstance(table, str) else table

def handle_partitions(object):
if isinstance(object, str):
refresh_partition(find_partition(table_object, object))
elif isinstance(object, Partition):
refresh_partition(object)
else:
[handle_partitions(obj) for obj in object]

handle_partitions(partition_dict[table])

def find_table(table_str: str) -> Table:
result = self.Model.Tables.Find(table_str)
if result is None:
raise Exception(f"Unable to find table! from {table_str}")
logger.debug(f"Found table {result.Name}")
return result

def find_partition(table: Table, partition_str: str) -> Partition:
result = table.Partitions.Find(partition_str)
if result is None:
raise Exception(
f"Unable to find partition! {table.Name}|{partition_str}"
)
logger.debug(f"Found partition {result.Table.Name}|{result.Name}")
return result

def refresh(Object):
if isinstance(Object, str):
refresh_table(find_table(Object))
elif isinstance(Object, Dict):
refresh_dict(Object)
elif isinstance(Object, Table):
refresh_table(Object)
elif isinstance(Object, Partition):
refresh_partition(Object)
else:
[refresh(object) for object in Object]

refresh(Object)
if Tracing:
rt = Refresh_Trace(self)
rt.Start()

m = self.SaveChanges()

if Tracing:
rt.Stop()
rt.Drop()

return _Refresh_Report(m.Property_Changes)
r = PyRefresh(self, *args, **kwargs)
return r.Run()

def Update(self, UpdateOptions: UpdateOptions = UpdateOptions.ExpandFull) -> None:
"""[Update Model](https://docs.microsoft.com/en-us/dotnet/api/microsoft.analysisservices.majorobject.update?view=analysisservices-dotnet#microsoft-analysisservices-majorobject-update(microsoft-analysisservices-updateoptions))
Expand Down Expand Up @@ -410,7 +320,8 @@ def clone_role_permissions():

clone_role_permissions()
logger.info(f"Refreshing Clone... {table.Name}")
self.Refresh([table])
self.Reload_Model_Info()
self.Refresh(table.Name, default_row_count_check=False)
logger.info(f"Updating Model {self.Model.Name}")
self.SaveChanges()
return True
Expand Down Expand Up @@ -642,6 +553,7 @@ def Create_Table(self, df: pd.DataFrame, table_name: str) -> bool:
f"Adding table: {new_table.Name} to {self.Server.Name}::{self.Database.Name}::{self.Model.Name}"
)
self.Model.Tables.Add(new_table)
self.Refresh([new_table], Tracing=True)
self.SaveChanges()
self.Reload_Model_Info()
self.Refresh(new_table.Name)
return True