Paper Categorization Using Naive Bayes




Cui, Man

Journal Title

Journal ISSN

Volume Title



Literature survey is a time-consuming process as researchers spend a lot of time in searching the papers of interest. While search engines can be useful in finding papers that contain a certain set of keywords, one still has to go through these papers in order to decide whether they are of interest. On the other hand, one can quickly decide which papers are of interest if each one of them is labelled with a category. The process of labelling each paper with a category is termed paper categorization, an instance of a more general problem called text classification. In this thesis, we presented a text classifier called Iris that makes use of the popular Naive Bayes algorithm. With Iris, we were able to (1) evaluate Naive Bayes using a number of popular datasets, (2) propose a GUI for assisting users with document categorization and searching, and (3) demonstrate how the GUI can be utilized for paper categorization and searching.



Text classification