Question Classification - SVM, Logistic Regression, LSTM, BERT, Doc2Vec, TF-IDF
The objective is to build a question classification model. Each question belongs to one of six categories: Description (DESC), Entity (ENTY), Abbreviation (ABBR), Human (HUM), Location (LOC), and Numeric Value (NUM).
To investigate different approaches, the following data is used (downloaded from https://cogcomp.seas.upenn.edu/Data/QA/QC/):
Training set: training set 5 (5,500 labeled questions); test set: TREC 10 questions.
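For reference, a minimal loading sketch is shown below. The file names (train_5500.label, TREC_10.label), the Latin-1 encoding, and the "LABEL:finelabel question ..." line format are assumptions about how the files at the URL above are laid out, not something stated in this post.

```python
# Minimal sketch: load the TREC question classification files and keep the coarse labels.
# File names, encoding, and line format are assumptions about the downloaded data.
def load_trec(path, encoding="latin-1"):
    questions, labels = [], []
    with open(path, encoding=encoding) as f:
        for line in f:
            label, question = line.strip().split(" ", 1)
            labels.append(label.split(":")[0])  # coarse class only, e.g. DESC, ENTY, NUM
            questions.append(question)
    return questions, labels

X_train, y_train = load_trec("train_5500.label")  # assumed training file name
X_test, y_test = load_trec("TREC_10.label")       # assumed test file name
```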
Different data analyses have been performed and four different models have been trained. The models are the following:
Tf-Idf + SVM: Tf-Idf is used for vectorizing the texts, and a linear model (i.e., an SVM with a linear kernel) is used as the classifier.
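As an illustration of this setup, a minimal scikit-learn sketch is shown below. The pipeline choices (word n-gram range, sublinear TF weighting, C) are illustrative assumptions rather than the post's actual configuration, and X_train/y_train refer to the loading sketch above.

```python
# Sketch of the Tf-Idf + linear SVM baseline with scikit-learn.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

model = Pipeline([
    # Vectorize questions with Tf-Idf over unigrams and bigrams (assumed settings).
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True)),
    # Classify the sparse Tf-Idf vectors with a linear-kernel SVM.
    ("svm", LinearSVC(C=1.0)),
])

model.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```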