Code and data for the paper "Dialog Intent Induction with Deep Multi-View Clustering", Hugh Perkins and Yi Yang, 2019, to appear in EMNLP 2019.
Data is available in the `data` sub-directory, which has its own LICENSE file.
- decompress the `.bz2` files in the `data` folder
- download http://nlp.stanford.edu/data/glove.840B.300d.zip, and unzip `glove.840B.300d.txt` into the `data` folder
- run one of:
# no pre-training
python train.py --pre-epoch 0 --data-path data/airlines_processed.csv --num-epochs 50 --view1-col first_utterance --view2-col context
# ae pre-training
python train.py --pre-model ae --pre-epoch 20 --data-path data/airlines_processed.csv --num-epochs 50 --view1-col first_utterance --view2-col context
# qt pre-training
python train.py --pre-model qt --pre-epoch 10 --data-path data/airlines_processed.csv --num-epochs 50 --view1-col first_utterance --view2-col context
- to train on askubuntu, replace `airlines` with `askubuntu` in the above command lines
- for qt pre-training, run:
python train_qt.py --data-path data/airlines_processed.csv --pre-epoch 10 --view1-col first_utterance --view2-col context --scenarios view1
- to train on askubuntu, replace `airlines` with `askubuntu` in the above command line
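
The decompression step at the top of the list can also be scripted. The following is a minimal sketch, not part of the repository: the helper name `decompress_bz2_files` is hypothetical, and it assumes each archive in `data` is a plain bzip2 file whose name is the target filename plus a `.bz2` suffix.

```python
import bz2
import shutil
from pathlib import Path


def decompress_bz2_files(data_dir="data"):
    """Decompress every *.bz2 file found directly inside data_dir.

    Hypothetical helper, not part of the repository. Each archive is
    written next to itself with the trailing .bz2 removed, and the
    original archive is left in place.
    """
    written = []
    for archive in sorted(Path(data_dir).glob("*.bz2")):
        # Dropping the last suffix turns name.csv.bz2 into name.csv.
        target = archive.with_suffix("")
        if target.exists():
            continue  # already decompressed on an earlier run
        with bz2.open(archive, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)  # streams, so large files are fine
        written.append(target)
    return written
```

Calling `decompress_bz2_files("data")` from the repository root returns the list of files it wrote. Files that already exist are skipped, so the call is safe to repeat.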