
Commit c116e97

Update README.md
1 parent 51d169e commit c116e97

1 file changed

Lines changed: 10 additions & 7 deletions

@@ -3,26 +3,29 @@ A set of tools for leveraging active learning and model explainability for effec

## What is this?

One component of my vision of FULLY AUTOMATED competitive debate case production.
I want to take in massive numbers of articles from a news API and place each one in its corresponding file based on where my classifier says it should go. I have to generate my own labeled data for this. That is a *problem*. Most people don't realize that the sample efficiency of models which utilize transfer learning is so great that AI-assisted data labeling is extremely useful and can significantly shorten what is ordinarily a painful data labeling process.
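The AI-assisted labeling loop above can be sketched with uncertainty sampling. This is a hypothetical illustration, not this project's API: plain scikit-learn (TF-IDF plus logistic regression) stands in for the real embedding-based pipeline, and the toy corpus, labels, and the oracle standing in for the human annotator are all invented for the example.

```python
# Hypothetical sketch, not this project's API: uncertainty-sampling
# active learning, with TF-IDF + logistic regression standing in for
# the real word-embedding pipeline.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

docs = [
    "senate passes new trade tariff bill",
    "tariffs on steel imports raised again",
    "star wins the championship final",
    "team clinches title in overtime",
    "new trade deal lowers tariffs",
    "coach praises players after big win",
]
true_labels = ["trade", "trade", "sports", "sports", "trade", "sports"]

vec = TfidfVectorizer()
X = vec.fit_transform(docs)

labeled = [0, 2]  # seed with one hand-labeled example per class
unlabeled = [i for i in range(len(docs)) if i not in labeled]

while unlabeled:
    clf = LogisticRegression().fit(X[labeled], [true_labels[i] for i in labeled])
    # Query the document the current model is least sure about.
    probs = clf.predict_proba(X[unlabeled])
    query = unlabeled[int(np.argmin(probs.max(axis=1)))]
    labeled.append(query)   # in the real tool, a human supplies this label
    unlabeled.remove(query)
```

With real data the human only labels the queried documents, which is where the sample-efficiency win of transfer learning shows up.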
1. We need a way to quickly create word-embedding-powered document classifiers that learn with a human in the loop. For some classes, an extremely limited number of examples may be all that is necessary to get results a user would consider successful for their task.
2. I want to know what my model is learning, so I integrate the word embeddings available in [Flair](https://github.com/zalandoresearch/flair), combine them with classifiers in scikit-learn and PyTorch, and finish it off with the [LIME](https://arxiv.org/pdf/1602.04938.pdf) algorithm for model interpretability (implemented within the [ELI5](https://eli5.readthedocs.io/en/latest/index.html) library).
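To make the LIME step concrete, here is a minimal from-scratch sketch of the core idea, not the actual ELI5 `TextExplainer` API: perturb one document by dropping words at random, then fit a linear surrogate whose coefficients approximate each word's local influence on the prediction. The training corpus and the document being explained are invented for illustration, and real LIME additionally weights perturbations by proximity to the original, which is omitted here for brevity.

```python
# Minimal illustration of the LIME idea (not the ELI5 TextExplainer API):
# perturb one document, then fit a linear surrogate on word-presence masks.
# Real LIME also weights samples by proximity to the original; omitted here.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.pipeline import make_pipeline

train = ["tariffs hurt trade", "new trade agreement signed",
         "team wins the final", "star player scores again"]
labels = ["trade", "trade", "sports", "sports"]
clf = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(train, labels)

doc = "trade tariffs rise as team loses final"
words = doc.split()

rng = np.random.default_rng(0)
masks = rng.integers(0, 2, size=(200, len(words)))  # 1 = keep the word
masks[0] = 1                                        # keep the full document too
texts = [" ".join(w for w, m in zip(words, row) if m) for row in masks]

# Probability of the "trade" class for every perturbed text.
trade_idx = list(clf.classes_).index("trade")
probs = clf.predict_proba(texts)[:, trade_idx]

# Surrogate coefficients approximate each word's local influence.
surrogate = Ridge().fit(masks, probs)
influence = dict(zip(words, surrogate.coef_))       # word -> local weight
```

Positive weights push this document toward the "trade" class and negative ones toward "sports"; ELI5's `TextExplainer` automates the same loop with proper sampling and sample weighting.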
TODO:

* Finish README - cite relevant technologies and papers
* Documentation/Examples/Installation instructions
* More examples
* Figure out a better way to store embeddings (stop moving the embeddings from GPU to CPU inefficiently)
Changelog:

8/8/2019 -
* Added Keras model support - by default, uses KNN when not in Keras mode and a neural network when in Keras mode.
* Added HTML exporting of model explanations.
* Tested the possibility of doing Multilabel classification with TextExplainer... doesn't seem to work :(
* Added pictures
## Examples
