A Music Virtual Assistant Based on Machine Learning




Shameli Derakhshan, Shadan




This study focuses on the development of a music chatbot designed to address a gap in the digital music services market. The chatbot provides users with a seamless and interactive experience for music-related requests. Standard UIs of popular music streaming platforms such as Spotify, Tidal, and Apple Music offer access to a vast library of songs and user-friendly interfaces, but they may lack personalized and engaging interactions. Unlike these traditional UIs, where users navigate through menus and search boxes, the chatbot engages users in a conversation, allowing them to interact naturally using text, which can later be extended to voice input. This creates a more human-like and enjoyable experience. The Music4all dataset was chosen to train, develop, and evaluate the chatbot. This dataset contains data from 15,602 anonymous users, their listening histories, and 109,269 songs represented by their audio clips, lyrics, and 16 other metadata attributes. Various techniques, including pattern-matching approaches, TF-IDF combined with machine learning algorithms, Word2vec embeddings, and the BERT model, were explored to determine the most effective methods for creating engaging and responsive music chatbots. The study began with data preparation and the creation of a JSON file containing patterns as features and artists as classes. Numerical labels representing 60 classes and TF-IDF sparse matrices of 8,027 songs were then fed into various machine learning algorithms, including decision trees, random forests, KNN, and SVM, followed by a comprehensive comparison of the metrics obtained from these algorithms. The experimental results indicate that the combination of TF-IDF and SVM yielded the best results for designing the chatbot, achieving a classification accuracy of 91%.
However, advanced methods such as the BERT model and Word2vec were found to be less useful, both because of over-fitting issues and because the task is a classification problem on labeled data. Finally, the artist classifier was integrated into a Flask app, which returns the song name and ID based on user-requested tags. This study contributes valuable insights into the development of a music chatbot and highlights the most effective methods for classification and response generation using the given dataset.
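The classification step described above (text patterns as features, artists as classes, TF-IDF vectors fed to a classifier) can be illustrated with a minimal, dependency-free sketch. It is not the study's implementation: the training patterns and artist labels are hypothetical toy data rather than Music4all records, the helper names are invented for illustration, and a 1-nearest-neighbour rule on cosine similarity stands in for the classifiers compared in the study (it is closest in spirit to the KNN baseline, not to the best-performing SVM).

```python
import math
from collections import Counter

# Hypothetical toy patterns (tag-style text -> artist label); not Music4all data.
TRAIN = [
    ("sad acoustic folk guitar", "artist_a"),
    ("mellow folk acoustic ballad", "artist_a"),
    ("heavy metal loud guitar riffs", "artist_b"),
    ("fast loud metal drums", "artist_b"),
    ("dance electronic synth beat", "artist_c"),
    ("upbeat electronic dance club", "artist_c"),
]

def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def fit_idf(docs):
    """Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf(text, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are ignored."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def cosine(a, b):
    """Dot product of two L2-normalised sparse vectors."""
    return sum(v * b.get(t, 0.0) for t, v in a.items())

def predict(text, idf, train_vecs):
    """Return the label of the most similar training pattern (1-NN)."""
    query = tfidf(text, idf)
    _, label = max(train_vecs, key=lambda item: cosine(query, item[0]))
    return label

idf = fit_idf([doc for doc, _ in TRAIN])
train_vecs = [(tfidf(doc, idf), label) for doc, label in TRAIN]

print(predict("loud metal guitar", idf, train_vecs))
print(predict("electronic dance", idf, train_vecs))
```

In a fuller pipeline, the same vectors would typically come from a library vectorizer and be passed to an SVM; the predicted artist label could then be used by a web endpoint to look up matching songs.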



Virtual Assistant, Chatbot, Machine Learning, Natural Language Processing