To-do list:
- Basic processing / tokenizing
- Word embeddings + text embeddings by aggregating word embeddings
  - Skip-gram
  - CBOW
  - Pre-trained models: GloVe
- Doc2Vec
- Term frequency-inverse document frequency (TF-IDF)
- Pre-trained Transformers
  - BERT
  - GPT
- Classification models:
  - Logistic Regression
  - Decision Tree
  - MLP (RNN)
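
As a possible starting point for the tokenizing and TF-IDF items, here is a minimal pure-Python sketch. It is not a planned implementation: the function names `tokenize` and `tf_idf`, the regex-based tokenizer, the sample sentences, and the weighting (tf = count / document length, idf = ln(N / df), no smoothing) are all assumptions made for illustration. A library implementation would likely use a different idf variant and normalization.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed weighting: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample documents, used only to exercise the functions.
docs = ["The cat sat on the mat.", "The dog sat on the log."]
vectors = tf_idf(docs)
```

With this weighting, a term that appears in every document gets weight zero, so only terms that distinguish one document from another contribute. The resulting sparse dicts could then be turned into fixed-length feature vectors for the classification models listed above.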