Identifying Well-formed Questions using Deep Learning
Date
2020-06-30
Authors
Chhina, Navnoor
Abstract
Deep learning is dominant in the field of natural language processing, thanks to
its performance advantage over statistical methods. This high performance is driven
by recent transfer learning approaches, in which a language model is pre-trained on
a large corpus and then fine-tuned on a specific task. Recent advances in transfer
learning show that fine-tuning a pre-trained model for only a few training steps can
yield state-of-the-art results. Therefore, in this project on classifying well-formed
natural language questions, we both train a model from scratch and apply transfer
learning. Taking the pre-trained base models from BERT, ALBERT, and XLNet and
adding a simple classifier layer gives better results than training a model from
scratch, in very few epochs. We also sample a subset of queries classified by our
model, run them on the Google search engine, and confirm that a model that can
identify the well-formedness of queries would help the search engine by reducing
downstream compounding errors in the natural language processing pipeline.
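The "simple classifier layer" on top of a pre-trained encoder can be sketched as below. This is an illustrative sketch in PyTorch, not the project's actual code: the class name, hidden size of 768 (BERT-base), and label convention are assumptions for the example.

```python
import torch
import torch.nn as nn

class WellFormednessHead(nn.Module):
    """Simple classifier layer placed on top of a pre-trained encoder
    (e.g. BERT, ALBERT, or XLNet). The encoder's [CLS] representation
    (hidden size 768 for BERT-base, an assumption here) is mapped to
    two labels: 0 = not well-formed, 1 = well-formed (hypothetical)."""

    def __init__(self, hidden_size: int = 768, num_labels: int = 2,
                 dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, cls_embedding: torch.Tensor) -> torch.Tensor:
        # cls_embedding: (batch_size, hidden_size) -> logits: (batch_size, 2)
        return self.classifier(self.dropout(cls_embedding))

# During fine-tuning, this head is trained jointly with the pre-trained
# encoder for a small number of epochs.
head = WellFormednessHead()
logits = head(torch.randn(4, 768))  # stand-in for a batch of [CLS] vectors
```

Because only this thin head is new, fine-tuning converges in very few epochs, which is the advantage over training the full model from scratch that the abstract describes.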
Keywords
Deep Learning, Natural Language Processing, Data Science, Machine Learning