Audio analysis of customer calls for predicting purchase intentions: A novel approach to e-commerce insights

dc.contributor.author: Yu, Miao
dc.contributor.supervisor: Li, Kin Fun
dc.date.accessioned: 2024-11-27T16:19:14Z
dc.date.available: 2024-11-27T16:19:14Z
dc.date.issued: 2024
dc.degree.department: Department of Electrical and Computer Engineering
dc.degree.level: Master of Engineering MEng
dc.description.abstract: Client audio recordings are a valuable resource for many types of businesses. Using these recordings to identify potential customers can help increase purchase rates and reduce marketing costs, particularly when machine learning methods automatically label customer groups (positive, neutral, and negative buyers) in place of manual analysis. Previous research has predominantly focused on text content analysis for this purpose, but audio features, which capture voice nuances such as tone, pitch, rhythm, and interaction patterns between interviewers and interviewees, may also affect model performance. This project explores an innovative method based on such features. It first investigates the effectiveness of emotion detection through audio features, using two datasets: the Toronto Emotional Speech Set (TESS) and the Surrey Audio-Visual Expressed Emotion (SAVEE) dataset. Hierarchical clustering is then applied to explore the relationship between emotion-related audio features and customer categories, using audio data provided by VINN Auto, an e-commerce firm. Next, Exploratory Data Analysis (EDA) is conducted on the same dataset, after labeling, to examine the correlation between interaction-related audio features and the customer categories (positive, neutral, and negative buyers). With supervised learning, the results indicate that integrating audio features, both emotion-related and interaction-pattern features, can affect the performance of models such as Support Vector Machines (SVM), Decision Tree, and Extreme Gradient Boosting (XGBoost), particularly when these features are combined with traditional content-related features such as Term Frequency-Inverse Document Frequency (TF-IDF) scores and an adjusted weight configuration is applied to the positive class.
After this exploration, an ensemble method using a soft voting mechanism across these three models is developed to assess whether it can improve the identification of potential purchasers. Combining emotion-related audio features, interaction-pattern features, and content-based features such as TF-IDF scores with tailored weight configurations highlights the value of incorporating audio features in customer identification tasks, compared with using only content-based features. This combination could be a robust strategy for improving classification outcomes in future analyses of this kind.
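The soft voting mechanism mentioned in the abstract can be illustrated with a minimal sketch. It assumes each of the three models outputs one probability per customer category; the function names (`soft_vote`, `predict_label`) and the probability values are hypothetical and made up for illustration, not taken from the project or its results. Soft voting averages the per-class probabilities across models and picks the class with the highest average, so one confident model can outweigh two marginal ones.

```python
from typing import List, Optional, Sequence

# Customer categories named in the abstract.
LABELS = ["negative", "neutral", "positive"]


def soft_vote(model_probas: Sequence[Sequence[float]],
              model_weights: Optional[Sequence[float]] = None) -> List[float]:
    """Average per-class probabilities across models (optionally weighted)."""
    if model_weights is None:
        model_weights = [1.0] * len(model_probas)
    total = sum(model_weights)
    n_classes = len(model_probas[0])
    return [
        sum(w * p[c] for w, p in zip(model_weights, model_probas)) / total
        for c in range(n_classes)
    ]


def predict_label(avg_proba: Sequence[float]) -> str:
    """Pick the class with the highest averaged probability."""
    best = max(range(len(avg_proba)), key=lambda c: avg_proba[c])
    return LABELS[best]


# Made-up probabilities for ONE call, in LABELS order (assumption, not project data).
svm_proba = [0.05, 0.50, 0.45]   # narrowly favors "neutral"
tree_proba = [0.10, 0.48, 0.42]  # narrowly favors "neutral"
xgb_proba = [0.05, 0.15, 0.80]   # strongly favors "positive"

avg = soft_vote([svm_proba, tree_proba, xgb_proba])
label = predict_label(avg)
print([round(p, 3) for p in avg], label)
```

In this made-up case a majority (hard) vote would choose "neutral", while the averaged probabilities favor "positive". The optional `model_weights` argument weights whole models; it is separate from the positive-class weight configuration described in the abstract, which would be applied when each model is trained.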
dc.description.scholarlevel: Graduate
dc.identifier.uri: https://hdl.handle.net/1828/20805
dc.language.iso: en
dc.rights: Available to the World Wide Web
dc.subject: purchase intention
dc.subject: potential purchasers
dc.subject: audio
dc.subject: emotion-related features
dc.subject: interaction-pattern features
dc.subject: text
dc.subject: content-related features
dc.title: Audio analysis of customer calls for predicting purchase intentions: A novel approach to e-commerce insights
dc.type: project

Files

Original bundle
Name: Yu_Miao_MEng_2024.pdf
Size: 3.9 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.62 KB
Format: Item-specific license agreed upon to submission