Active-Explainable-Classification

A set of tools for leveraging active learning and model explainability for efficient document classification.

What is this?

This is one component of my vision of FULLY AUTOMATED competitive debate case production. When I pull in large volumes of articles from a news API, I need a way to classify those documents into various buckets, and I have to generate my own labeled data to do it. That is a problem. Most people don't realize that models which use transfer learning are so sample-efficient that AI-assisted data labeling is extremely useful and can significantly shorten what is ordinarily a painful labeling process.

We need a way to quickly create word-embedding-powered document classifiers that learn with a human in the loop. For some classes, an extremely limited number of examples may be all that is necessary to get results that a user would consider successful for their task.
I also want to know what my model is learning, so I integrate the word embeddings available with Flair, combine them with classifiers in Sklearn and PyTorch, and finish it off with the LIME algorithm for model interpretability (implemented within the ELI5 library).
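To make the interpretability idea concrete, here is a minimal sketch of word-level attribution. It is not LIME and does not use ELI5 or Flair: it swaps in a much simpler leave-one-word-out (occlusion) technique, and `toy_predict_proba` is a hypothetical keyword-counting stand-in for a real classifier. The only point is to show what "which words pushed the prediction" means.

```python
from typing import Callable, List, Tuple


def toy_predict_proba(text: str) -> float:
    """Hypothetical stand-in classifier: a score in [0, 1) for the class ECON.

    This is NOT the repository's model; it only counts hand-picked keywords.
    """
    keywords = {"market", "inflation", "trade", "jobs"}
    hits = sum(1 for tok in text.lower().split() if tok in keywords)
    # 0 hits -> 0.0, 1 hit -> 0.5, 2 hits -> 0.667, and so on.
    return hits / (hits + 1)


def word_importance(
    text: str, predict_proba: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Score each word by how much the prediction drops when it is removed.

    This is leave-one-out occlusion, a simpler relative of LIME. LIME instead
    fits a local linear surrogate model on many randomly perturbed texts.
    """
    tokens = text.split()
    base = predict_proba(text)
    scores = []
    for i, tok in enumerate(tokens):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        scores.append((tok, base - predict_proba(reduced)))
    # Most influential words first; ties keep their original order.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for word, score in word_importance("inflation hurts the market", toy_predict_proba):
        print(f"{word:>10}  {score:+.3f}")
```

Any function that maps a text to a class probability can be passed as `predict_proba`, so the same helper could wrap a real embedding-based classifier.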

TODO:
1. Finish README: cite relevant technologies and papers
2. Documentation/Examples/Installation Instructions
3. More examples

Examples

Toy example of a possible debate classifier separating 11 classes:

ANB = Antiblackness, CAP = Capitalism, ECON = Economy, EDU = Education, ENV = Environment, EX = Extinction, FED = Federalism, HEG = Hegemony, NAT = Natives, POL = Politics, TOP = Topicality

The top matrix is a confusion matrix of my validation set.

The bottom matrix shows the classification probabilities for each individual example in my validation set.
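For readers unfamiliar with confusion matrices, the sketch below shows how one can be built from true and predicted labels. The labels and predictions here are made-up illustration data, not results from this repository, and the orientation (rows are true labels, columns are predicted labels) is an assumption that follows the common scikit-learn convention.

```python
from collections import Counter
from typing import List, Sequence


def confusion_matrix(
    y_true: Sequence[str], y_pred: Sequence[str], labels: Sequence[str]
) -> List[List[int]]:
    """Count (true, predicted) pairs. Rows are true labels, columns predicted."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of examples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


if __name__ == "__main__":
    # Made-up labels for illustration only.
    labels = ["CAP", "ECON", "ENV"]
    y_true = ["CAP", "CAP", "ECON", "ENV", "ENV"]
    y_pred = ["CAP", "ECON", "ECON", "ENV", "CAP"]
    matrix = confusion_matrix(y_true, y_pred, labels)
    for label, row in zip(labels, matrix):
        print(f"{label:>5} {row}")
    print("accuracy:", accuracy(matrix))
```

Off-diagonal cells show which pairs of classes the model confuses, which is useful for deciding which classes need more labeled examples.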

Takes in documents from the user via standard input. The model then classifies each document, explains why it classified it the way it did, and asks the user whether the predicted label is the ground truth. The user supplies the ground truth, the model incrementally trains on the new example, and the cycle continues. This is called active learning.
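The cycle described above can be sketched in a few lines. This is a simplified illustration and not the code in `docu_learn.py`: it swaps in a tiny bag-of-words naive Bayes model in place of Flair embeddings and the Sklearn/PyTorch classifiers, and the names `IncrementalNaiveBayes`, `active_learning_loop`, and `ask_user` are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Dict, Iterable, List, Sequence, Tuple


class IncrementalNaiveBayes:
    """Multinomial naive Bayes over word counts, updated one example at a time.

    Hypothetical stand-in for an embedding-based classifier.
    """

    def __init__(self, labels: Sequence[str]) -> None:
        self.labels = list(labels)
        self.doc_counts: Counter = Counter()
        self.word_counts: Dict[str, Counter] = {lab: Counter() for lab in self.labels}
        self.vocab: set = set()

    def learn_one(self, text: str, label: str) -> None:
        """Incrementally update the counts with a single labeled document."""
        tokens = text.lower().split()
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def predict_proba(self, text: str) -> Dict[str, float]:
        """Return a probability per label, using add-one (Laplace) smoothing."""
        tokens = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        vocab_size = max(len(self.vocab), 1)
        log_scores = {}
        for label in self.labels:
            log_p = math.log((self.doc_counts[label] + 1) / (total_docs + len(self.labels)))
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                log_p += math.log((self.word_counts[label][tok] + 1) / (total_words + vocab_size))
            log_scores[label] = log_p
        # Normalize in a numerically stable way (log-sum-exp).
        top = max(log_scores.values())
        exp_scores = {lab: math.exp(s - top) for lab, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {lab: v / norm for lab, v in exp_scores.items()}


def active_learning_loop(
    model: IncrementalNaiveBayes,
    documents: Iterable[str],
    ask_user: Callable[[str, str], str],
) -> List[Tuple[str, str]]:
    """Predict, ask the human for the ground truth, then train on that example."""
    history = []
    for doc in documents:
        probs = model.predict_proba(doc)
        predicted = max(probs, key=probs.get)
        truth = ask_user(doc, predicted)  # human in the loop
        model.learn_one(doc, truth)
        history.append((predicted, truth))
    return history
```

In an interactive session, `ask_user` could print the document and predicted label and then read the correct label with `input()`, while `documents` could iterate over lines from standard input. The explanation step is left out here to keep the sketch short.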
