Tempo modulations in English

dc.contributor.authorKirkham, Sandra Patricia
dc.contributor.supervisorEsling, John H.
dc.date.accessioned2018-09-14T18:21:18Z
dc.date.available2018-09-14T18:21:18Z
dc.date.copyright2001en_US
dc.date.issued2018-09-14
dc.degree.departmentDepartment of Linguistics
dc.degree.departmentSchool of Languages, Linguistics and Cultures
dc.degree.levelDoctor of Philosophy Ph.D.en_US
dc.description.abstractThe goal of synthetic speech is to provide speech that is both comprehensible and natural sounding. While synthetic speech is drawing nearer to its goal, it has not yet attained a truly natural quality. Naturalness can be improved by incorporating prosodic rules for duration and intonation that are representative of natural speech. While duration models are widely used, they fail to replicate the variations evident in the tempo of natural speech. This project proposes a model of tempo modulations in English based upon phrasal foci. In order to replicate this pattern, the potential phonetic locations for altering the speech rate of English synthetic speech are explored. The results of a pilot study based on the readings of one speaker suggested that tempo modulations are predictable and not random, and that they are not expressed as equal expansions and compressions across all syllable constituents. Vowels, onsets, and codas exhibited varying degrees of change. These results motivated a study of the same phenomena in data derived from the readings of multiple speakers. The data for the main study were derived from two readings of each of five Canadian English sentences. The first reading varied the position of a focused word in the sentence and the second, only the tempo. Sentences that were neutral in terms of focus and tempo were included in both readings to create experimental controls. The readings were recorded and digitized to provide waveforms for duration measurement. Comparisons of average durations of focused syllables to the respective controls revealed significant differences given an alpha level of .05, providing evidence that a pattern of tempo modulations can be predicted. This pattern involved expansion and compression within the sentence. The pattern can be replicated using the results of the investigation of sites for tempo changes. 
The results reveal that at both fast and slow tempos, the durations of syllable constituents change significantly from the control at an alpha level of .01. The vowel, particularly one that comprises a syllable, is the primary site for expansion and compression. Stressed vowels have the largest compression, while unstressed vowels have the largest expansion. The degree of segmental change varies depending on the position of the syllable constituent. In stressed CVC syllables, codas and then onsets exhibit lessening degrees of compression. The reverse is true for expansion, and the degree of change for these constituents is less than that for compression. However, only stops in these positions show a significant change from the control. It appears that expansions and compressions of segments are ranked according to syllable constituency. These ranked expansions and compressions of syllable constituents can be incorporated into an existing duration model for synthetic speech in order to replicate the observed pattern of tempo modulations in English. This tempo pattern provides variation at a sentential level and is an improvement over rules for emphasis that are specific to the emphasized word or part thereof. The pattern is expressed by duration rules, and the addition of the criterion for syllable constituency increases the natural distribution of changes in tempo, providing a model to bring synthetic speech closer to the natural goal.en_US
dc.description.scholarlevelGraduateen_US
dc.identifier.urihttp://hdl.handle.net/1828/10061
dc.languageEnglisheng
dc.language.isoenen_US
dc.rightsAvailable to the World Wide Weben_US
dc.subjectEnglish languageen_US
dc.subjectIntonationen_US
dc.subjectSpeech synthesisen_US
dc.subjectTempo (Phonetics)en_US
dc.subjectProsodic analysisen_US
dc.titleTempo modulations in Englishen_US
dc.typeThesisen_US

Files

Original bundle
Name:
Kirkham_SandraPatricia_PhD_2001.pdf
Size:
6.49 MB
Format:
Adobe Portable Document Format
License bundle
Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission