GitHub - ljtamang/dtAssesment: This project makes answer assessment of the answer provided by student with respect to ideal answer in deeptutor system.

Branches Tags

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
DT-Gradev1.0_data		DT-Gradev1.0_data
models		models
output		output
papers		papers
prediction_result		prediction_result
feature_selection.ipynb		feature_selection.ipynb
feature_selection.py		feature_selection.py
main.ipynb		main.ipynb
main.py		main.py
main_2.ipynb		main_2.ipynb
main_2.py		main_2.py
readme.txt		readme.txt

Repository files navigation

Instruction to run the project:
----------------------------------
1. Compile main.py. 
   Output: 
   a) Doc2Vec model : model_DBOW.doc2vec (PV-DBOW model) and model_DM.doc2vec (PM-DV model)
   b) score.csv : Diff similairty scores between student and refrence answers
2. Feed socres.csv to weka and classify using logistic regression with 10 fold cross valdiation.
   On using each of those scores individually and all of them together should output different prediciotn accuracy.
   Such result on my run are inside prediciton_result folder

Project Detail:
----------------------------------
Project name: dtAssesment
Input: 
	DT-Grade_v1.0_dataset.xml inside DT-Gradev1.0_data folder
 	DT-Grade_v1.0_dataset.xml  is file containg collection of quetions, answer given by student and set of refrence asnwers provided by expert. You can detail abut the data set in the paper DT-Grade_dataset pdf (Banjade, et al., 2016) inside DT-Gradev1.0_data folder

output:
a) 2 Doc2vec model inside modles folder
   	model_DBOW.doc2vec is doc2vec model using PV-DBOW approach
   	model_DM.doc2vec is doc2vec model using PV-DM approach

   	For detail about PV-DBOW and PV-DBOW apprach, read paper doc2vec-mikolov.pdf inside papers folder


b) silairty scores between student answer and the refrences answers
   socres.csv insdie ouptut foler contains different cosine simialirty scores between answer and refrences answers.
   
   DM_max_cosSim : max cosine simialrity scores between a answer and set of refrences answers using PV-DM vectors 
   DM_max_cosSim : averagge cosine simialrity scores of scores between a answer and set of refrences answers using PV-DM vectors 
   DBOW_max_cosSim : max cosine simialrity scores between a answer and set of refrences answers using PV-DBOW vectors 
   DBOW_mean_cosSim : averagge cosine simialrity scores of scores between a answer and set of refrences answers using PV-DBOW vectors 
   DM_DBOW_max_cosSim : max cosine simialrity scores between a answer and set of refrences answers using concatination of PV-DBOW  and PV-DM vectors 
   DM_DBOW_mean_cosSim : averagge cosine simialrity scores of scores between a answer and set of refrences answers using concatination of PV-DBOW  and PV-DM vectors 
   Annotation : gold label for student answers which can be correct, correct_but_incomplete, contradictory and incorrect

c) Predictor output : output of logistic regression computed using  weka inside predcition_result folder

   The result of predictor using logsitc classifer with default setting and 10 cross fold valiation looks likes as folows:

	Features: DM_max_cosSim
	Gold Label: Annotation
	Accuracy: 41.5556


	Features: DM_mean_cosSim
	Gold Label: Annotation
	Accuracy: 43.2222

	Features: DBOW_max_cosSim
	Gold Label: Annotation
	Accuracy: 40.8889

	Features: DBOW_mean_cosSim
	Gold Label: Annotation
	Accuracy: 40.8889


	Features: DM_DBOW_max_cosSim
	Gold Label: Annotation
	Accuracy: 40.8889

	Features: DM_DBOW_mean_cosSim
	Gold Label: Annotation
	Accuracy: 40.8889


	Features: DM_max_cosSim, DM_mean_cosSim, DBOW_max_cosSim, DBOW_mean_cosSim, DM_DBOW_max_cosSim, DM_DBOW_mean_cosSim
	Gold Label: Annotation 
	Accuracy: 42.1111