diff --git a/README.md b/README.md index 6f51858..41acfc8 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,7 @@ [![flake8](https://github.com/Curts0/PyTabular/actions/workflows/flake8.yml/badge.svg?branch=master)](https://github.com/Curts0/PyTabular/actions/workflows/flake8.yml) ### What is it? -[PyTabular](https://github.com/Curts0/PyTabular) (python-tabular in [pypi](https://pypi.org/project/python-tabular/)) is a python package that allows for programmatic execution on your tabular models! This is possible thanks to [Pythonnet](https://pythonnet.github.io/) and Microsoft's [.Net APIs on Azure Analysis Services](https://docs.microsoft.com/en-us/dotnet/api/microsoft.analysisservices?view=analysisservices-dotnet). Current, this build is tested and working on Windows Operating System only. Help is needed to expand this for other operating systems. The package should have the dll files included when you import it. See [Documentation Here](https://curts0.github.io/PyTabular/). PyTabular is still considered alpha while I'm working on building out the proper tests and testing environments, so I can ensure some kind of stability in features. Please send bugs my way! Preferably in the issues section in Github. I want to harden this project so many can use it easily. I currently have local pytest for python 3.6 to 3.10 and run those tests through a local AAS and Gen2 model. +[PyTabular](https://github.com/Curts0/PyTabular) (python-tabular in [pypi](https://pypi.org/project/python-tabular/)) is a python package that allows for programmatic execution on your tabular models! This is possible thanks to [Pythonnet](https://pythonnet.github.io/) and Microsoft's [.Net APIs on Azure Analysis Services](https://docs.microsoft.com/en-us/dotnet/api/microsoft.analysisservices?view=analysisservices-dotnet). Currently, this build is tested and working on Windows Operating System only. Help is needed to expand this for other operating systems. 
The package should have the dll files included when you import it. See [Documentation Here](https://curts0.github.io/PyTabular/). PyTabular is still considered alpha while I'm working on building out the proper tests and testing environments, so I can ensure some kind of stability in features. Please send bugs my way! Preferably in the issues section in Github. I want to harden this project so many can use it easily. I currently have local pytest for python 3.6 to 3.10 and run those tests through a local AAS and Gen2 model. ### Getting Started See the [Pypi project](https://pypi.org/project/python-tabular/) for available version. @@ -92,8 +92,8 @@ model.Tables['Table Name'].Refresh() #or model.Tables['Table Name'].Partitions['Partition Name'].Refresh() -#Add Tracing=True for simple Traces tracking the refresh. -model.Refresh(['Table1','Table2'], Tracing=True) +#Default Tracing happens automatically, but can be removed by -- +model.Refresh(['Table1','Table2'], trace = None) ``` It's not uncommon to need to run through some checks on specific Tables, Partitions, Columns, Etc... @@ -118,7 +118,7 @@ This will use the function [Return_Zero_Row_Tables](https://curts0.github.io/PyT ```python import pytabular model = pytabular.Tabular(CONNECTION_STR) -tables = pytabular.Return_Zero_Row_Tables() +tables = pytabular.Return_Zero_Row_Tables(model) if len(tables) > 0: model.Refresh(tables, Tracing = True) #Add a trace in there for some fun. ``` @@ -180,5 +180,35 @@ for file_path in LIST_OF_FILE_PATHS: model.Query(file_path) ``` +#### Advanced Refreshing with Pre and Post Checks +Maybe you are introducing new logic to a fact table, and you need to ensure that a measure checking last month values never changes. To do that you can take advantage of the `Refresh_Check` and `Refresh_Check_Collection` classes (Sorry, I know the documentation stinks right now). 
But using those you can build out something that would first check the results of the measure, then refresh, then check the results of the measure after refresh, and lastly perform your desired check. In this case the desired check is that the `pre` value matches the `post` value. If the `pre` value does not equal the `post` value after the refresh, the check fails: it logs a critical message and raises an `AssertionError`. +```python +from pytabular import Tabular +from pytabular.refresh import Refresh_Check, Refresh_Check_Collection + +model = Tabular(CONNECTION_STR) + +# This is our custom check that we want to run after refresh. +# Does the pre refresh value match the post refresh value? +def sum_of_sales_assertion(pre, post): + return pre == post + +# This is where we put it all together into the `Refresh_Check` class. Give it a name, give it a query to run, and give it the assertion you want to make. +sum_of_last_month_sales = Refresh_Check( + 'Last Month Sales', + lambda: model.Query("EVALUATE {[Last Month Sales]}") + ,sum_of_sales_assertion +) + +# Here we are adding it to a `Refresh_Check_Collection` because you can have more than one `Refresh_Check` to run. 
+all_refresh_check = Refresh_Check_Collection([sum_of_last_month_sales]) + +model.Refresh( + 'Fact Table Name', + refresh_checks = all_refresh_check + +) +``` + ### Contributing See [CONTRIBUTING.md](CONTRIBUTING.md) \ No newline at end of file diff --git a/dist/python_tabular-0.0.35-py3-none-any.whl b/dist/python_tabular-0.0.35-py3-none-any.whl deleted file mode 100644 index 7d5670c..0000000 Binary files a/dist/python_tabular-0.0.35-py3-none-any.whl and /dev/null differ diff --git a/dist/python_tabular-0.0.35.tar.gz b/dist/python_tabular-0.0.35.tar.gz deleted file mode 100644 index 9c74061..0000000 Binary files a/dist/python_tabular-0.0.35.tar.gz and /dev/null differ diff --git a/dist/python_tabular-0.0.40-py3-none-any.whl b/dist/python_tabular-0.0.40-py3-none-any.whl deleted file mode 100644 index bf198e4..0000000 Binary files a/dist/python_tabular-0.0.40-py3-none-any.whl and /dev/null differ diff --git a/dist/python_tabular-0.0.40.tar.gz b/dist/python_tabular-0.0.40.tar.gz deleted file mode 100644 index 5ec2254..0000000 Binary files a/dist/python_tabular-0.0.40.tar.gz and /dev/null differ diff --git a/dist/python_tabular-0.0.50-py3-none-any.whl b/dist/python_tabular-0.0.50-py3-none-any.whl deleted file mode 100644 index e52d41b..0000000 Binary files a/dist/python_tabular-0.0.50-py3-none-any.whl and /dev/null differ diff --git a/dist/python_tabular-0.0.50.tar.gz b/dist/python_tabular-0.0.50.tar.gz deleted file mode 100644 index 5329af2..0000000 Binary files a/dist/python_tabular-0.0.50.tar.gz and /dev/null differ diff --git a/dist/python_tabular-0.0.60-py3-none-any.whl b/dist/python_tabular-0.0.60-py3-none-any.whl deleted file mode 100644 index 94fba08..0000000 Binary files a/dist/python_tabular-0.0.60-py3-none-any.whl and /dev/null differ diff --git a/dist/python_tabular-0.0.60.tar.gz b/dist/python_tabular-0.0.60.tar.gz deleted file mode 100644 index c6b809e..0000000 Binary files a/dist/python_tabular-0.0.60.tar.gz and /dev/null 
differ diff --git a/dist/python_tabular-0.0.70-py3-none-any.whl b/dist/python_tabular-0.0.70-py3-none-any.whl deleted file mode 100644 index 7e7ccab..0000000 Binary files a/dist/python_tabular-0.0.70-py3-none-any.whl and /dev/null differ diff --git a/dist/python_tabular-0.0.70.tar.gz b/dist/python_tabular-0.0.70.tar.gz deleted file mode 100644 index 470ee20..0000000 Binary files a/dist/python_tabular-0.0.70.tar.gz and /dev/null differ diff --git a/dist/python_tabular-0.0.80-py3-none-any.whl b/dist/python_tabular-0.0.80-py3-none-any.whl deleted file mode 100644 index b4319dd..0000000 Binary files a/dist/python_tabular-0.0.80-py3-none-any.whl and /dev/null differ diff --git a/dist/python_tabular-0.0.80.tar.gz b/dist/python_tabular-0.0.80.tar.gz deleted file mode 100644 index 24b9dda..0000000 Binary files a/dist/python_tabular-0.0.80.tar.gz and /dev/null differ diff --git a/dist/python_tabular-0.0.90-py3-none-any.whl b/dist/python_tabular-0.0.90-py3-none-any.whl deleted file mode 100644 index f03cc01..0000000 Binary files a/dist/python_tabular-0.0.90-py3-none-any.whl and /dev/null differ diff --git a/dist/python_tabular-0.0.90.tar.gz b/dist/python_tabular-0.0.90.tar.gz deleted file mode 100644 index e3f817b..0000000 Binary files a/dist/python_tabular-0.0.90.tar.gz and /dev/null differ diff --git a/dist/python_tabular-0.1.0-py3-none-any.whl b/dist/python_tabular-0.1.0-py3-none-any.whl deleted file mode 100644 index 5eeff65..0000000 Binary files a/dist/python_tabular-0.1.0-py3-none-any.whl and /dev/null differ diff --git a/dist/python_tabular-0.1.0.tar.gz b/dist/python_tabular-0.1.0.tar.gz deleted file mode 100644 index 0fbfc04..0000000 Binary files a/dist/python_tabular-0.1.0.tar.gz and /dev/null differ diff --git a/mkgendocs.yml b/mkgendocs.yml index 67f5d3f..8c4036f 100644 --- a/mkgendocs.yml +++ b/mkgendocs.yml @@ -13,6 +13,12 @@ pages: source: 'pytabular/query.py' classes: - Connection + - page: "Refreshes.md" + source: 'pytabular/refresh.py' + classes: + - 
PyRefresh + - Refresh_Check + - Refresh_Check_Collection - page: "Table.md" source: 'pytabular/table.py' classes: diff --git a/pyproject.toml b/pyproject.toml index ec7fcd9..b1df4f3 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta" [project] name = "python_tabular" -version = "0.1.8" +version = "0.1.9" authors = [ { name="Curtis Stallings", email="curtisrstallings@gmail.com" }, ] diff --git a/pytabular/__init__.py b/pytabular/__init__.py index afeaf34..50937f3 100644 --- a/pytabular/__init__.py +++ b/pytabular/__init__.py @@ -4,29 +4,32 @@ import sys import platform from rich.logging import RichHandler +from rich.theme import Theme +from rich.console import Console from rich import pretty pretty.install() +console = Console(theme=Theme({"logging.level.warning": "bold reverse red"})) logging.basicConfig( level=logging.DEBUG, format="%(message)s", datefmt="[%H:%M:%S]", - handlers=[RichHandler()], + handlers=[RichHandler(console=console)], ) logger = logging.getLogger("PyTabular") logger.setLevel(logging.INFO) logger.info("Logging configured...") -logger.info(f"To update PyTabular logger...") +logger.info(f"To update logging:") logger.info(f">>> import logging") logger.info(f">>> pytabular.logger.setLevel(level=logging.INFO)") logger.info(f"See https://docs.python.org/3/library/logging.html#logging-levels") -logger.debug(f"Python Version::{sys.version}") -logger.debug(f"Python Location::{sys.exec_prefix}") -logger.debug(f"Package Location::{__file__}") -logger.debug(f"Working Directory::{os.getcwd()}") -logger.debug(f"Platform::{sys.platform}-{platform.release()}") +logger.info(f"Python Version::{sys.version}") +logger.info(f"Python Location::{sys.exec_prefix}") +logger.info(f"Package Location::{__file__}") +logger.info(f"Working Directory::{os.getcwd()}") +logger.info(f"Platform::{sys.platform}-{platform.release()}") dll = os.path.join(os.path.dirname(__file__), "dll") sys.path.append(dll) @@ -35,11 +38,11 @@ 
logger.debug(f"Beginning CLR references...") import clr -logger.info("Adding Reference Microsoft.AnalysisServices.AdomdClient") +logger.debug("Adding Reference Microsoft.AnalysisServices.AdomdClient") clr.AddReference("Microsoft.AnalysisServices.AdomdClient") -logger.info("Adding Reference Microsoft.AnalysisServices.Tabular") +logger.debug("Adding Reference Microsoft.AnalysisServices.Tabular") clr.AddReference("Microsoft.AnalysisServices.Tabular") -logger.info("Adding Reference Microsoft.AnalysisServices") +logger.debug("Adding Reference Microsoft.AnalysisServices") clr.AddReference("Microsoft.AnalysisServices") logger.debug(f"Importing specifics in module...") diff --git a/pytabular/pytabular.py b/pytabular/pytabular.py index 55692fb..76f69b6 100644 --- a/pytabular/pytabular.py +++ b/pytabular/pytabular.py @@ -2,7 +2,6 @@ from Microsoft.AnalysisServices.Tabular import ( Server, - RefreshType, ColumnType, Table, DataColumn, @@ -11,7 +10,7 @@ ) from Microsoft.AnalysisServices import UpdateOptions -from typing import Any, Dict, List, Union +from typing import List, Union from collections import namedtuple import pandas as pd import os @@ -20,16 +19,16 @@ from logic_utils import ( pd_dataframe_to_m_expression, pandas_datatype_to_tabular_datatype, - ticks_to_datetime, remove_suffix, ) -from query import Connection + from table import PyTable, PyTables from partition import PyPartitions from column import PyColumns from measure import PyMeasures -from tabular_tracing import Refresh_Trace from object import PyObject +from refresh import PyRefresh +from query import Connection logger = logging.getLogger("PyTabular") @@ -51,6 +50,8 @@ class Tabular(PyObject): """ def __init__(self, CONNECTION_STR: str): + + # Connecting to model... 
logger.debug("Initializing Tabular Class") self.Server = Server() self.Server.Connect(CONNECTION_STR) @@ -72,10 +73,15 @@ def __init__(self, CONNECTION_STR: str): self.CompatibilityMode: int = self.Database.CompatibilityMode.value__ self.Model = self.Database.Model logger.info(f"Connected to Model - {self.Model.Name}") - super().__init__(self.Model) - self.Adomd = Connection(self.Server) + self.Adomd: Connection = Connection(self.Server) + + # Build PyObjects self.Reload_Model_Info() + # Run subclass init + super().__init__(self.Model) + + # Building rich table display for repr self._display.add_row( "EstimatedSize", str(round(self.Database.EstimatedSize / 1000000000, 2)) + " GB", @@ -90,21 +96,21 @@ def __init__(self, CONNECTION_STR: str): self._display.add_row("Database", self.Database.Name) self._display.add_row("Server", self.Server.Name) + # Finished and registering disconnect logger.debug("Class Initialization Completed") logger.debug("Registering Disconnect on Termination...") atexit.register(self.Disconnect) pass - # def __repr__(self) -> str: - # return f"Server::{self.Server.Name}\nDatabase::{self.Database.Name}\nModel::{self.Model.Name}\nEstimated Size::{self.Database.EstimatedSize}" - def Reload_Model_Info(self) -> bool: """Runs on __init__ iterates through details, can be called after any model changes. 
Called in SaveChanges() Returns: bool: True if successful """ + self.Database.Refresh() + self.Tables = PyTables( [PyTable(table, self) for table in self.Model.Tables.GetEnumerator()] ) @@ -117,7 +123,6 @@ def Reload_Model_Info(self) -> bool: self.Measures = PyMeasures( [measure for table in self.Tables for measure in table.Measures] ) - self.Database.Refresh() return True def Is_Process(self) -> bool: @@ -142,117 +147,22 @@ def Disconnect(self) -> bool: logger.debug(f"Disconnecting from - {self.Server.Name}") return self.Server.Disconnect() - def Refresh( - self, - Object: Union[str, Table, Partition, Dict[str, Any]], - RefreshType: RefreshType = RefreshType.Full, - Tracing=False, - ) -> None: - """Refreshes table(s) and partition(s). + def Refresh(self, *args, **kwargs) -> pd.DataFrame: + """PyRefresh Class to handle refreshes of model. Args: - Object (Union[ str, Table, Partition, Dict[str, Any], Iterable[str, Table, Partition, Dict[str, Any]] ]): Designed to handle a few different ways of selecting a refresh. - str == 'Table_Name' - Table == Table Object - Partition == Partition Object - Dict[str, Any] == A way to specify a partition of group of partitions. For ex. {'Table_Name':'Partition1'} or {'Table_Name':['Partition1','Partition2']}. NOTE you can also change out the strings for partition or tables objects. - RefreshType (RefreshType, optional): See [RefreshType](https://docs.microsoft.com/en-us/dotnet/api/microsoft.analysisservices.tabular.refreshtype?view=analysisservices-dotnet). Defaults to RefreshType.Full. - Tracing (bool, optional): Currently just some basic tracing to track refreshes. Defaults to False. - - Raises: - Exception: Raises exception if unable to find table or partition via string. - + model (Tabular): Main Tabular Class + object (Union[str, PyTable, PyPartition, Dict[str, Any]]): Designed to handle a few different ways of selecting a refresh. 
Can be a string of 'Table Name' or dict of {'Table Name': 'Partition Name'} or even some combination with the actual PyTable and PyPartition classes. + trace (Base_Trace, optional): Set to `None` if no Tracing is desired, otherwise you can use default trace or create your own. Defaults to Refresh_Trace. + refresh_checks (Refresh_Check_Collection, optional): Add your `Refresh_Check`'s into a `Refresh_Check_Collection`. Defaults to Refresh_Check_Collection(). + default_row_count_check (bool, optional): Quick built in check will fail the refresh if post check row count is zero. Defaults to True. + refresh_type (RefreshType, optional): Input RefreshType desired. Defaults to RefreshType.Full. Returns: - WIP: WIP + pd.DataFrame """ - logger.debug("Beginning RequestRefresh cadence...") - - def _Refresh_Report(Property_Changes) -> pd.DataFrame: - logger.debug("Running Refresh Report...") - refresh_data = [] - for property_change in Property_Changes: - if ( - isinstance(property_change.Object, Partition) - and property_change.Property_Name == "RefreshedTime" - ): - table, partition, refreshed_time = ( - property_change.Object.Table.Name, - property_change.Object.Name, - ticks_to_datetime(property_change.New_Value.Ticks), - ) - logger.info( - f'{table} - {partition} Refreshed! 
- {refreshed_time.strftime("%m/%d/%Y, %H:%M:%S")}' - ) - refresh_data += [[table, partition, refreshed_time]] - return pd.DataFrame( - refresh_data, columns=["Table", "Partition", "Refreshed Time"] - ) - - def refresh_table(table: Table) -> None: - logging.info(f"Requesting refresh for {table.Name}") - table.RequestRefresh(RefreshType) - - def refresh_partition(partition: Partition) -> None: - logging.info( - f"Requesting refresh for {partition.Table.Name}|{partition.Name}" - ) - partition.RequestRefresh(RefreshType) - - def refresh_dict(partition_dict: Dict) -> None: - for table in partition_dict.keys(): - table_object = find_table(table) if isinstance(table, str) else table - - def handle_partitions(object): - if isinstance(object, str): - refresh_partition(find_partition(table_object, object)) - elif isinstance(object, Partition): - refresh_partition(object) - else: - [handle_partitions(obj) for obj in object] - - handle_partitions(partition_dict[table]) - - def find_table(table_str: str) -> Table: - result = self.Model.Tables.Find(table_str) - if result is None: - raise Exception(f"Unable to find table! from {table_str}") - logger.debug(f"Found table {result.Name}") - return result - - def find_partition(table: Table, partition_str: str) -> Partition: - result = table.Partitions.Find(partition_str) - if result is None: - raise Exception( - f"Unable to find partition! 
{table.Name}|{partition_str}" - ) - logger.debug(f"Found partition {result.Table.Name}|{result.Name}") - return result - - def refresh(Object): - if isinstance(Object, str): - refresh_table(find_table(Object)) - elif isinstance(Object, Dict): - refresh_dict(Object) - elif isinstance(Object, Table): - refresh_table(Object) - elif isinstance(Object, Partition): - refresh_partition(Object) - else: - [refresh(object) for object in Object] - - refresh(Object) - if Tracing: - rt = Refresh_Trace(self) - rt.Start() - - m = self.SaveChanges() - - if Tracing: - rt.Stop() - rt.Drop() - - return _Refresh_Report(m.Property_Changes) + r = PyRefresh(self, *args, **kwargs) + return r.Run() def Update(self, UpdateOptions: UpdateOptions = UpdateOptions.ExpandFull) -> None: """[Update Model](https://docs.microsoft.com/en-us/dotnet/api/microsoft.analysisservices.majorobject.update?view=analysisservices-dotnet#microsoft-analysisservices-majorobject-update(microsoft-analysisservices-updateoptions)) @@ -410,7 +320,8 @@ def clone_role_permissions(): clone_role_permissions() logger.info(f"Refreshing Clone... 
{table.Name}") - self.Refresh([table]) + self.Reload_Model_Info() + self.Refresh(table.Name, default_row_count_check=False) logger.info(f"Updating Model {self.Model.Name}") self.SaveChanges() return True @@ -642,6 +553,7 @@ def Create_Table(self, df: pd.DataFrame, table_name: str) -> bool: f"Adding table: {new_table.Name} to {self.Server.Name}::{self.Database.Name}::{self.Model.Name}" ) self.Model.Tables.Add(new_table) - self.Refresh([new_table], Tracing=True) self.SaveChanges() + self.Reload_Model_Info() + self.Refresh(new_table.Name) return True diff --git a/pytabular/refresh.py b/pytabular/refresh.py new file mode 100644 index 0000000..5902f32 --- /dev/null +++ b/pytabular/refresh.py @@ -0,0 +1,312 @@ +from tabular_tracing import Refresh_Trace, Base_Trace +import logging +from Microsoft.AnalysisServices.Tabular import ( + RefreshType, + Table, + Partition, +) +import pandas as pd +from logic_utils import ticks_to_datetime +from typing import Union, Dict, Any +from table import PyTable +from partition import PyPartition +from abc import ABC + +logger = logging.getLogger("PyTabular") + + +class Refresh_Check(ABC): + def __init__(self, name, function, assertion=None) -> None: + """TODO DOCUMENTATION + + Args: + name (_type_): _description_ + function (_type_): _description_ + assertion (_type_, optional): _description_. Defaults to None. + """ + super().__init__() + self._name = name + self._function = function + self._assertion = assertion + self._pre = None + self._post = None + + def __repr__(self) -> str: + return f"{self.name} - {self.pre} - {self.post} - {str(self.function)}" + + @property + def name(self): + "Get your custom name of refresh check." + return self._name + + @name.setter + def name(self, name): + self._name = name + + @name.deleter + def name(self): + del self._name + + @property + def function(self): + "Get the function that is used to run a pre and post check." 
+ return self._function + + @function.setter + def function(self, func): + self._function = func + + @function.deleter + def function(self): + del self._function + + @property + def pre(self): + "Get the pre value that is the result from the pre refresh check." + return self._pre + + @pre.setter + def pre(self, pre): + self._pre = pre + + @pre.deleter + def pre(self): + del self._pre + + @property + def post(self): + "Get the post value that is the result from the post refresh check." + return self._post + + @post.setter + def post(self, post): + self._post = post + + @post.deleter + def post(self): + del self._post + + @property + def assertion(self): + "Get the assertion that is the result from the post refresh check." + return self._assertion + + @assertion.setter + def assertion(self, assertion): + self._assertion = assertion + + @assertion.deleter + def assertion(self): + del self._assertion + + def _check(self, stage): + logger.debug(f"Running {stage}-Check for {self.name}") + results = self.function() + if stage == "Pre": + self.pre = results + else: + self.post = results + logger.info(f"{stage}-Check results for {self.name} - {results}") + return results + + def Pre_Check(self): + self._check("Pre") + pass + + def Post_Check(self): + self._check("Post") + self.Assertion() + pass + + def Assertion(self): + if self.assertion is None: + logger.debug("Skipping assertion none given") + else: + test = self.assertion(self.pre, self.post) + assert_str = f"Test {self.name} - {test} - Pre Results - {self.pre} | Post Results {self.post}" + if test: + logger.info(assert_str) + else: + logger.critical(assert_str) + assert ( + test + ), f"Test failed! Pre Results - {self.pre} | Post Results {self.post}" + + +class Refresh_Check_Collection: + def __init__(self, refresh_checks: Refresh_Check = []) -> None: + """TODO Documentation + + Args: + refresh_checks (Refresh_Check, optional): _description_. Defaults to []. 
+ """ + self._refresh_checks = refresh_checks + pass + + def __iter__(self): + for refresh_check in self._refresh_checks: + yield refresh_check + + def add_refresh_check(self, refresh_check: Refresh_Check): + self._refresh_checks.append(refresh_check) + + +class PyRefresh: + def __init__( + self, + model, + object: Union[str, PyTable, PyPartition, Dict[str, Any]], + trace: Base_Trace = Refresh_Trace, + refresh_checks: Refresh_Check_Collection = Refresh_Check_Collection(), + default_row_count_check: bool = True, + refresh_type: RefreshType = RefreshType.Full, + ) -> None: + """PyRefresh Class to handle refreshes of model. + + Args: + model (Tabular): Main Tabular Class + object (Union[str, PyTable, PyPartition, Dict[str, Any]]): Designed to handle a few different ways of selecting a refresh. Can be a string of 'Table Name' or dict of {'Table Name': 'Partition Name'} or even some combination with the actual PyTable and PyPartition classes. + trace (Base_Trace, optional): Set to `None` if no Tracing is desired, otherwise you can use default trace or create your own. Defaults to Refresh_Trace. + refresh_checks (Refresh_Check_Collection, optional): Add your `Refresh_Check`'s into a `Refresh_Check_Collection`. Defaults to Refresh_Check_Collection(). + default_row_count_check (bool, optional): Quick built in check will fail the refresh if post check row count is zero. Defaults to True. + refresh_type (RefreshType, optional): Input RefreshType desired. Defaults to RefreshType.Full. 
+ """ + self.model = model + self.object = object + self.trace = trace + self.default_row_count_check = default_row_count_check + self.refresh_type = refresh_type + self._objects_to_refresh = [] + self._request_refresh(self.object) + self._checks = refresh_checks + self._pre_checks() + logger.info("Refresh Request Completed!") + pass + + def _pre_checks(self): + logger.debug("Running Pre-checks") + if self.trace is not None: + logger.debug("Getting Trace") + self.trace = self._get_trace() + if self.default_row_count_check: + logger.debug( + f"Running default row count check - {self.default_row_count_check}" + ) + tables = [ + table + for refresh_dict in self._objects_to_refresh + for table in refresh_dict.keys() + ] + + def row_count_assertion(pre, post): + return post > 0 + + for table in set(tables): + check = Refresh_Check( + f"{table.Name} Row Count", table.Row_Count, row_count_assertion + ) + self._checks.add_refresh_check(check) + for check in self._checks: + check.Pre_Check() + pass + + def _post_checks(self): + if self.trace is not None: + self.trace.Stop() + self.trace.Drop() + for check in self._checks: + check.Post_Check() + pass + + def _get_trace(self) -> Base_Trace: + return self.trace(self.model) + + def _find_table(self, table_str: str) -> Table: + try: + result = self.model.Tables[table_str] + except Exception: + raise Exception(f"Unable to find table! from {table_str}") + logger.debug(f"Found table {result.Name}") + return result + + def _find_partition(self, table: Table, partition_str: str) -> Partition: + try: + result = table.Partitions[partition_str] + except Exception: + raise Exception(f"Unable to find partition! 
{table.Name}|{partition_str}") + logger.debug(f"Found partition {result.Table.Name}|{result.Name}") + return result + + def _refresh_table(self, table: PyTable) -> None: + logging.info(f"Requesting refresh for {table.Name}") + self._objects_to_refresh += [ + {table: [partition for partition in table.Partitions]} + ] + table.RequestRefresh(self.refresh_type) + + def _refresh_partition(self, partition: PyPartition) -> None: + logging.info(f"Requesting refresh for {partition.Table.Name}|{partition.Name}") + self._objects_to_refresh += [{partition.Table: [partition]}] + partition.RequestRefresh(self.refresh_type) + + def _refresh_dict(self, partition_dict: Dict) -> None: + for table in partition_dict.keys(): + table_object = self._find_table(table) if isinstance(table, str) else table + + def handle_partitions(object): + if isinstance(object, str): + self._refresh_partition(self._find_partition(table_object, object)) + elif isinstance(object, PyPartition): + self._refresh_partition(object) + else: + [handle_partitions(obj) for obj in object] + + handle_partitions(partition_dict[table]) + + def _request_refresh(self, object): + logger.debug(f"Requesting Refresh for {object}") + if isinstance(object, str): + self._refresh_table(self._find_table(object)) + elif isinstance(object, Dict): + self._refresh_dict(object) + elif isinstance(object, PyTable): + self._refresh_table(object) + elif isinstance(object, PyPartition): + self._refresh_partition(object) + else: + [self._request_refresh(obj) for obj in object] + + def _refresh_report(self, Property_Changes) -> pd.DataFrame: + logger.debug("Running Refresh Report...") + refresh_data = [] + for property_change in Property_Changes: + if ( + isinstance(property_change.Object, Partition) + and property_change.Property_Name == "RefreshedTime" + ): + table, partition, refreshed_time = ( + property_change.Object.Table.Name, + property_change.Object.Name, + ticks_to_datetime(property_change.New_Value.Ticks), + ) + logger.info( + 
f'{table} - {partition} Refreshed! - {refreshed_time.strftime("%m/%d/%Y, %H:%M:%S")}' + ) + refresh_data += [[table, partition, refreshed_time]] + return pd.DataFrame( + refresh_data, columns=["Table", "Partition", "Refreshed Time"] + ) + + def Run(self) -> pd.DataFrame: + if self.model.Server.Connected is False: + logger.info(f"{self.model.Server.Name} - Reconnecting...") + self.model.Server.Reconnect() + + if self.trace is not None: + self.trace.Start() + + save_changes = self.model.SaveChanges() + + self._post_checks() + + return self._refresh_report(save_changes.Property_Changes) diff --git a/test/test_tabular.py b/test/test_tabular.py index efa973c..917d285 100644 --- a/test/test_tabular.py +++ b/test/test_tabular.py @@ -7,8 +7,8 @@ aas = pytabular.Tabular(local.AAS) gen2 = pytabular.Tabular(local.GEN2) testing_parameters = [(aas), (gen2)] -testingtablename = 'PyTestTable' -testingtabledf = pd.DataFrame(data={'col1': [1, 2, 3], 'col2': ['four', 'five', 'six']}) +testingtablename = "PyTestTable" +testingtabledf = pd.DataFrame(data={"col1": [1, 2, 3], "col2": ["four", "five", "six"]}) @pytest.mark.parametrize("model", testing_parameters) @@ -18,10 +18,10 @@ def test_sanity_check(model): @pytest.mark.parametrize("model", testing_parameters) def test_connection(model): - ''' + """ Does a quick check to the Tabular Class To ensure that it can connect - ''' + """ assert model.Server.Connected @@ -32,19 +32,17 @@ def test_database(model): @pytest.mark.parametrize("model", testing_parameters) def test_basic_query(model): - int_result = model.Query('EVALUATE {1}') + int_result = model.Query("EVALUATE {1}") text_result = model.Query('EVALUATE {"Hello World"}') - assert int_result == 1 and text_result == 'Hello World' + assert int_result == 1 and text_result == "Hello World" @pytest.mark.parametrize("model", testing_parameters) def test_file_query(model): singlevaltest = local.SINGLEVALTESTPATH dfvaltest = local.DFVALTESTPATH - dfdupe = pd.DataFrame({'[Value1]': (1, 3), '[Value2]': 
(2, 4)}) - assert ( - model.Query(singlevaltest) == 1 and model.Query(dfvaltest).equals(dfdupe) - ) + dfdupe = pd.DataFrame({"[Value1]": (1, 3), "[Value2]": (2, 4)}) + assert model.Query(singlevaltest) == 1 and model.Query(dfvaltest).equals(dfdupe) @pytest.mark.parametrize("model", testing_parameters) @@ -65,8 +63,7 @@ def test_query_every_column(model): def remove_testing_table(model): table_check = [ table - for table - in model.Model.Tables.GetEnumerator() + for table in model.Model.Tables.GetEnumerator() if testingtablename in table.Name ] for table in table_check: @@ -77,23 +74,23 @@ def remove_testing_table(model): @pytest.mark.parametrize("model", testing_parameters) def test_pre_table_checks(model): remove_testing_table(model) - assert len( - [ - table - for table - in model.Model.Tables.GetEnumerator() - if testingtablename in table.Name - ] - ) == 0 + assert ( + len( + [ + table + for table in model.Model.Tables.GetEnumerator() + if testingtablename in table.Name + ] + ) + == 0 + ) @pytest.mark.parametrize("model", testing_parameters) def test_create_table(model): - df = pd.DataFrame(data={'col1': [1, 2, 3], 'col2': ['four', 'five', 'six']}) + df = pd.DataFrame(data={"col1": [1, 2, 3], "col2": ["four", "five", "six"]}) model.Create_Table(df, testingtablename) - assert len( - model.Query(f"EVALUATE {testingtablename}") - ) == 3 + assert len(model.Query(f"EVALUATE {testingtablename}")) == 3 @pytest.mark.parametrize("model", testing_parameters) @@ -104,40 +101,46 @@ def test_pytables_count(model): @pytest.mark.parametrize("model", testing_parameters) def test_backingup_table(model): model.Backup_Table(testingtablename) - assert len( - [ - table - for table - in model.Model.Tables.GetEnumerator() - if f'{testingtablename}_backup' == table.Name - ] - ) == 1 + assert ( + len( + [ + table + for table in model.Model.Tables.GetEnumerator() + if f"{testingtablename}_backup" == table.Name + ] + ) + == 1 + ) @pytest.mark.parametrize("model", testing_parameters) def 
test_revert_table(model): model.Revert_Table(testingtablename) - assert len( - [ - table - for table - in model.Model.Tables.GetEnumerator() - if f'{testingtablename}' == table.Name - ] - ) == 1 + assert ( + len( + [ + table + for table in model.Model.Tables.GetEnumerator() + if f"{testingtablename}" == table.Name + ] + ) + == 1 + ) @pytest.mark.parametrize("model", testing_parameters) def test_table_removal(model): remove_testing_table(model) - assert len( - [ - table - for table - in model.Model.Tables.GetEnumerator() - if testingtablename in table.Name - ] - ) == 0 + assert ( + len( + [ + table + for table in model.Model.Tables.GetEnumerator() + if testingtablename in table.Name + ] + ) + == 0 + ) @pytest.mark.parametrize("model", testing_parameters)