Skip to content

get_query_table_aliases(query) is far from being final #97

@bandrisca

Description

@bandrisca

While I was testing on Oracle SQL:

  • missed LEFT JOINs (it was easy to add to the code)
  • missed table names starting with schema name (SCHEMA.TABLE)
  • missed aliases without AS (this is mentioned in the code but not handled)
  • missed recognizing subselects (e.g. FROM (SELECT ... ) )

I touched the code to overcome these issues and found solution for them. Tested with a huge SQL but may need to do some more testing.

New code:

def get_query_table_aliases(query: str) -> Dict[str, str]:
    """
    Returns tables aliases mapping from a given query

    E.g. SELECT a.* FROM users1 AS a JOIN users2 AS b ON a.ip_address = b.ip_address
    will give you {'a': 'users1', 'b': 'users2'}
    """
    aliases = dict()
    last_keyword_token = None
    last_table_name = None

    for token in get_query_tokens(query):
        # print(token.ttype, token, last_table_name)

        # handle "FROM foo alias" syntax (i.e, "AS" keyword is missing)
        # if last_table_name and token.ttype is Name:
        #     aliases[token.value] = last_table_name
        #     last_table_name = False

        if last_table_name:
            if token.value=='.': 
                last_table_name = last_table_name + token.value #add the dot
            if token.value==',' or token.is_keyword and token.value.upper()!='AS': #there is no alias
                aliases[''] = last_table_name
                last_table_name = False
            if prev_token.value.upper()=='AS': #previous keyword was AS then we found the alias
                aliases[token.value] = last_table_name
                last_table_name = False
            if token.ttype is Name: 
                if prev_token.value=='.': 
                    last_table_name = last_table_name + token.value #add Name to last_table_name
                else: #found alias
                    aliases[token.value] = last_table_name
                    last_table_name = False


        if last_keyword_token:
           if last_keyword_token.value.upper() in ["FROM", "JOIN", "INNER JOIN","LEFT JOIN"] and token.value!='(':
                last_table_name = token.value

        last_keyword_token = token if token.is_keyword else False
        prev_token = token

    return aliases

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions