Currently get_query_tokens returns a list of sqlparse.sql.Token objects. In all cases, the sql-metadata code has to keep track of the previous keyword itself as it iterates over the tokens.
Instead of a Token object, return a QueryTokenState dataclass with:
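import dataclasses
from typing import Optional

from sqlparse.sql import Token  # the token class currently returned


@dataclasses.dataclass
class SQLToken:
    value: str
    is_keyword: bool
    is_name: bool
    is_punctuation: bool
    is_wildcard: bool
    # and the state
    last_keyword: Optional[str]  # uppercased
    previous_token: Optional[Token]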
get_query_tokens will be responsible for keeping the state in returned tokens.
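A minimal sketch of how get_query_tokens could carry that state, using only the standard library. The regex tokenizer and the KEYWORDS and PUNCTUATION sets are stand-ins for sqlparse and are assumptions made for illustration. So are two design choices: previous_token points at the previous SQLToken rather than a sqlparse Token, and last_keyword holds the keyword seen before the current token.

```python
import dataclasses
import re
from typing import Iterator, Optional

# Assumption: a tiny keyword set for illustration; the library itself relies on sqlparse.
KEYWORDS = {"SELECT", "FROM", "WHERE", "JOIN", "ON", "AS", "AND", "OR", "GROUP", "ORDER", "BY"}
PUNCTUATION = set("(),;=<>")

# Naive tokenizer: identifiers (dots allowed), integers, the wildcard, single punctuation chars.
TOKEN_RE = re.compile(r"[A-Za-z_][\w.]*|\d+|\*|[(),;=<>]")


@dataclasses.dataclass
class SQLToken:
    value: str
    is_keyword: bool
    is_name: bool
    is_punctuation: bool
    is_wildcard: bool
    # and the state
    last_keyword: Optional[str]  # uppercased
    previous_token: Optional["SQLToken"]  # assumption: previous SQLToken, not a sqlparse Token


def get_query_tokens(query: str) -> Iterator[SQLToken]:
    """Yield tokens that already know the keyword and token that preceded them."""
    last_keyword: Optional[str] = None
    previous: Optional[SQLToken] = None
    for match in TOKEN_RE.finditer(query):
        value = match.group(0)
        is_keyword = value.upper() in KEYWORDS
        token = SQLToken(
            value=value,
            is_keyword=is_keyword,
            is_name=not is_keyword and (value[0].isalpha() or value[0] == "_"),
            is_punctuation=value in PUNCTUATION,
            is_wildcard=value == "*",
            last_keyword=last_keyword,
            previous_token=previous,
        )
        yield token
        # Update the state only after yielding, so a keyword never reports itself.
        if is_keyword:
            last_keyword = value.upper()
        previous = token


for token in get_query_tokens("SELECT a FROM foo"):
    print(token.value, token.last_keyword)
```

With the state stored on each token, a caller can ask "is this a name whose last_keyword is FROM?" without tracking anything itself.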
Token will also have sub-classes that will indicate their "function" within the query:
- table names
- keywords
- column names
- functions
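One possible shape for those sub-classes is sketched below. The class names, the TABLE_KEYWORDS set and the classify helper are hypothetical and not part of sql-metadata; the token is reduced to two fields to keep the example short.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class SQLToken:
    value: str
    last_keyword: Optional[str] = None  # uppercased


# Hypothetical sub-classes: the type itself says what role the token plays.
class Keyword(SQLToken):
    pass


class TableName(SQLToken):
    pass


class ColumnName(SQLToken):
    pass


class Function(SQLToken):
    pass


# Assumption: keywords after which a name is treated as a table.
TABLE_KEYWORDS = {"FROM", "JOIN", "INTO", "UPDATE"}


def classify(
    value: str,
    last_keyword: Optional[str],
    is_keyword: bool = False,
    followed_by_paren: bool = False,
) -> SQLToken:
    """Pick a sub-class from the token's own flags and the preceding keyword."""
    if is_keyword:
        cls = Keyword
    elif followed_by_paren:
        cls = Function
    elif last_keyword in TABLE_KEYWORDS:
        cls = TableName
    else:
        # Simplification: every other name is treated as a column.
        cls = ColumnName
    return cls(value=value, last_keyword=last_keyword)
```

Callers could then filter with isinstance(token, TableName) instead of re-deriving the role from flags. Treating every remaining name as a column is too coarse for real queries (aliases and literals, for instance) and is only meant to show the idea.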
http://datacharmer.blogspot.com/2008/03/mysql-proxy-recipes-tokenizing-query.html
MySQL Proxy ships equipped with a tokenizer, a method that, given a query, returns its components as an array of tokens. Each token contains three elements:
name, which is a human readable name of the token (e.g. TK_SQL_SELECT)
id, which is the identifier of the token (e.g. 204)
text, which is the content of the token (e.g. "select").
For example, the query SELECT 1 FROM dual will be returned as the following tokens:
1:
    text select
    token_name TK_SQL_SELECT
    token_id 204
2:
    text 1
    token_name TK_INTEGER
    token_id 11
3:
    text from
    token_name TK_SQL_FROM
    token_id 105
4:
    text dual
    token_name TK_SQL_DUAL
    token_id 87
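For comparison with the dataclass proposed above, the same three-element token can be modelled in Python. This is only a sketch of the data shape: ProxyToken is a hypothetical name, and the four values are copied from the listing above rather than produced by MySQL Proxy.

```python
from typing import List, NamedTuple


class ProxyToken(NamedTuple):
    text: str        # content of the token
    token_name: str  # human readable name
    token_id: int    # identifier of the token


# The four tokens listed above for: SELECT 1 FROM dual
tokens: List[ProxyToken] = [
    ProxyToken("select", "TK_SQL_SELECT", 204),
    ProxyToken("1", "TK_INTEGER", 11),
    ProxyToken("from", "TK_SQL_FROM", 105),
    ProxyToken("dual", "TK_SQL_DUAL", 87),
]

# In this listing the SQL words are the tokens whose name starts with TK_SQL_.
sql_words = [t.text for t in tokens if t.token_name.startswith("TK_SQL_")]
```

Each of these tokens describes only itself. The proposal above differs in that every token also carries last_keyword and previous_token.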