Computer vision-based tracking and feature extraction for lingual ultrasound

dc.contributor.author: Al-Hammuri, Khalid
dc.contributor.supervisor: Branzan Albu, Alexandra
dc.contributor.supervisor: So, Poman Pok-Man
dc.date.accessioned: 2019-04-30T22:36:52Z
dc.date.available: 2019-04-30T22:36:52Z
dc.date.copyright: 2019
dc.date.issued: 2019-04-30
dc.degree.department: Department of Electrical and Computer Engineering
dc.degree.level: Master of Applied Science (M.A.Sc.)
dc.description.abstract: Lingual ultrasound is emerging as an important tool for providing visual feedback to second-language learners. In this study, ultrasound videos were recorded in the sagittal plane, which captures the full tongue surface in a single scan, unlike the transverse plane, which captures only a small portion of the tongue per scan. Data were collected from five Arabic speakers as they pronounced fourteen Arabic sounds in three different vowel contexts; each sound was repeated three times, yielding 630 ultrasound videos. The thesis algorithm consists of four steps: first, denoising the ultrasound image using a combined curvelet transform and shock filter; second, automatic selection of the tongue contour area; third, tongue contour approximation and missing-data estimation; and fourth, transformation of the tongue contour from image space into a fully concatenated signal, followed by feature extraction. The automatic tongue tracking results were validated by measuring the mean sum of distances between automatic and manual tongue contour tracking, giving an accuracy of 0.9558 mm. Validation of the feature extraction showed that the average mean squared error between the extracted tongue signatures for different repetitions of a sound was 0.000858 mm, meaning the algorithm could extract a unique signature for each sound, across different vowel contexts, with a high degree of similarity. Unlike related work, the algorithm offers an efficient and robust approach that extracts the tongue contour and salient features of dynamic tongue movement from all video frames, rather than from a single significant static frame as in conventional methods. The algorithm requires no training data, imposes no limit on video size or frame count, never failed during tongue extraction, and needed no manual re-initialization.
Even when the ultrasound recordings were missing some tongue contour information, the thesis approach could estimate the missing data with high accuracy. This approach can help linguistic researchers replace manual tongue tracking with automated tracking to save time, and then extract dynamic features of the full speech behavior, giving a better understanding of tongue movement during speech and supporting the development of language-learning tools for second-language learners.
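The abstract reports two validation metrics: the mean sum of distances between automatic and manual contour tracking, and the mean squared error between extracted signatures of repeated sounds. The thesis does not specify its exact formulations here, so the sketch below shows one common reading of each metric, assuming contours are given as (x, y) point arrays and signatures as 1-D signals of equal length; the function names are illustrative, not taken from the thesis.

```python
import numpy as np

def mean_sum_of_distances(auto_contour, manual_contour):
    """One common form of the MSD metric: for each point on the
    automatically tracked contour, take the Euclidean distance to the
    nearest point on the manually traced contour, then average."""
    auto = np.asarray(auto_contour, dtype=float)
    manual = np.asarray(manual_contour, dtype=float)
    # Pairwise distances between the two point sets, shape (len(auto), len(manual))
    d = np.linalg.norm(auto[:, None, :] - manual[None, :, :], axis=2)
    return d.min(axis=1).mean()

def signature_mse(sig_a, sig_b):
    """Mean squared error between two extracted tongue signatures
    of equal length."""
    a = np.asarray(sig_a, dtype=float)
    b = np.asarray(sig_b, dtype=float)
    return np.mean((a - b) ** 2)
```

Under this reading, two parallel contours 1 mm apart give an MSD of 1.0 mm, and identical signatures give an MSE of zero; a symmetric variant of MSD (averaging both directions) is also common in contour-tracking evaluation.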
dc.description.scholarlevel: Graduate
dc.identifier.uri: http://hdl.handle.net/1828/10812
dc.language: English
dc.language.iso: en
dc.rights: Available to the World Wide Web
dc.subject: computer vision
dc.subject: lingual ultrasound
dc.subject: tracking
dc.subject: feature extraction
dc.subject: tongue
dc.title: Computer vision-based tracking and feature extraction for lingual ultrasound
dc.type: Thesis

Files

Original bundle
Name: Al-hammuri_Khalid_MASc_2019.pdf
Size: 5.67 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.71 KB
Description: Item-specific license agreed upon to submission