Identifying Well-formed Questions using Deep Learning

Date

2020-06-30

Authors

Chhina, Navnoor

Abstract

Deep Learning is dominant in the field of Natural Language Processing, thanks to its superior performance over statistical methods. Much of this performance is driven by recent transfer learning approaches, in which a language model is pre-trained on a large corpus and then fine-tuned on a specific task. Recent advances in transfer learning show that fine-tuning a pre-trained model for only a few training steps can yield state-of-the-art results. In this project, we therefore approach the classification of well-formed natural language questions both by training a model from scratch and by transfer learning. Taking the pre-trained base models of BERT, ALBERT and XLNet and adding a simple classifier layer gives us better results, in very few epochs, than training a model from scratch. We also sample a subset of queries classified by our model, run them on the Google search engine, and confirm that a model identifying the well-formedness of queries would help the search engine by reducing compounding downstream errors in the natural language processing pipeline.
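The "pre-trained encoder plus simple classifier layer" setup described above can be sketched in PyTorch. This is a minimal illustration, not the author's actual implementation: the encoder's pooled output is stood in for by a random tensor (in practice it would come from a BERT, ALBERT, or XLNet base model), and the hidden size of 768 is assumed from the standard base-model configuration.

```python
import torch
import torch.nn as nn

class WellFormednessHead(nn.Module):
    """Binary classifier layer placed on top of a pre-trained encoder's
    pooled sentence representation (well-formed vs. not well-formed)."""
    def __init__(self, hidden_size: int = 768):
        super().__init__()
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(hidden_size, 2)

    def forward(self, pooled_output: torch.Tensor) -> torch.Tensor:
        # pooled_output: (batch, hidden_size) from the frozen or
        # fine-tuned encoder; returns per-class logits.
        return self.classifier(self.dropout(pooled_output))

# Stand-in for encoder output on a batch of 4 questions.
pooled = torch.randn(4, 768)
head = WellFormednessHead()
logits = head(pooled)  # shape: (4, 2)
```

During fine-tuning, these logits would be fed to a cross-entropy loss against the well-formedness labels, and gradients would flow back through both the head and the encoder.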

Keywords

Deep Learning, Natural Language Processing, Data Science, Machine Learning
