Feedback From Automatic Speech Recognition to Elicit Clear Speech in Healthy Speakers.
American Journal of Speech-Language Pathology

PURPOSE: This study assessed the effectiveness of feedback generated by automatic speech recognition (ASR) for eliciting clear speech from young, healthy individuals. As a preliminary step toward exploring a novel method for eliciting clear speech in patients with dysarthria, we investigated the effects of ASR feedback in healthy controls. If successful, ASR feedback has the potential to facilitate independent, at-home clear speech practice.

METHOD: Twenty-three healthy control speakers (ages 23-40 years) read sentences aloud in three speaking modes: Habitual, Clear (over-enunciated), and in response to ASR feedback (ASR). In the ASR condition, we used Mozilla DeepSpeech to transcribe speech samples and provide participants with a value indicating the accuracy of the ASR's transcription. For speakers who achieved sufficiently high ASR accuracy, noise was added to their speech at a participant-specific signal-to-noise ratio to ensure that each participant had to over-enunciate to achieve high ASR accuracy.

RESULTS: Compared to habitual speech, speech produced in the ASR and Clear conditions was clearer, as rated by speech-language pathologists, and more intelligible, per speech-language pathologist transcriptions. Speech in the Clear and ASR conditions aligned on several acoustic measures, particularly those associated with increased vowel distinctiveness and decreased speaking rate. However, ASR accuracy, intelligibility, and clarity were each correlated with different speech features, which may have implications for how people change their speech for ASR feedback.

CONCLUSIONS: ASR successfully elicited outcomes similar to clear speech in healthy speakers. Future work should investigate its efficacy in eliciting clear speech in people with dysarthria.

DOI: 10.1044/2023_AJSLP-23-00030
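The abstract above does not say how the accuracy value was computed or how the noise was mixed in. As a rough illustration only, the sketch below assumes accuracy is word-level (one minus the word error rate against the target sentence) and that noise is added by scaling it to reach a target signal-to-noise ratio in decibels. The function names `word_accuracy`, `mean_power`, and `mix_at_snr` are hypothetical, and the DeepSpeech transcription step itself is not shown.

```python
import math


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed metric: 1 - word error rate, floored at 0."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Word-level Levenshtein distance, computed one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return max(0.0, 1.0 - prev[-1] / len(ref))


def mean_power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that speech-to-noise power equals snr_db."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both be non-silent")
    # SNR_dB = 10 * log10(P_speech / P_noise)  =>  solve for the noise power.
    target_noise_power = p_speech / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / p_noise)
    return [s + gain * n for s, n in zip(speech, noise)]
```

Under these assumptions, a lower `snr_db` makes the transcription task harder, which is one way a participant-specific ratio could be tuned until habitual speech no longer yields high accuracy.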

Evaluation of an Automatic Speech Recognition Platform for Dysarthric Speech.
Calvo Irene, Tropea Peppino, Viganò Mauro, Scialla Maria, Cavalcante Agnieszka B, Grajzer Monika, Gilardone Marco, Corbo Massimo
Folia Phoniatrica et Logopaedica: official organ of the International Association of Logopedics and Phoniatrics (IALP)

INTRODUCTION: Commercially available automatic speech recognition (ASR) software is difficult to use when dysarthria accompanies a physical disability. To overcome this issue, a mobile and personal speech assistant (mPASS) platform was developed, using speaker-dependent ASR software.

OBJECTIVE: The aim of this study was to evaluate the performance of the proposed platform and to compare mPASS recognition accuracy with that of commercial speaker-independent ASR software. Secondary aims were to investigate the relationship between severity of dysarthria and recognition accuracy and to explore the perceptions of people with dysarthria regarding the proposed platform.

METHODS: Fifteen individuals with dysarthric speech and 20 individuals with nondysarthric speech recorded 24 words and 5 sentences in a clinical environment. Differences in recognition accuracy between the two systems were evaluated. In addition, mPASS usability was assessed with a technology acceptance model (TAM) questionnaire.

RESULTS: In both groups, mean accuracy rates were significantly higher with mPASS than with the commercial ASR, for both words and sentences. mPASS reached good levels of usefulness and ease of use according to the TAM questionnaire.

CONCLUSIONS: Practical applicability of this technology is realistic: the mPASS platform is accurate, and it could be easily used by individuals with dysarthria.

DOI: 10.1159/000511042