Klaske van Sluis

Long-term stability of tracheoesophageal voices

possible when voice recordings of the 149-word Dutch text with neutral content, “Tachtig dappere fietsers” [Eighty brave cyclists], were available from the same speaker with an interval of at least 7 years (T1 and T2, with 7-18 years in between). All speakers had undergone laryngectomy and used a Provox voice prosthesis (Atos Medical, Hörby, Sweden). In total, 27 recordings were made during the latter half of the 1990s (I), in 2007 (II), and in 2014 (III), amounting to 35 minutes of speech. No effort was made to ensure correct reading of the text, so the actual words uttered vary somewhat. On one occasion speaker UCX did not complete the full text, and on one occasion speaker K9S read a longer variant of the text. For speaker KRH, there were recordings for all three periods (T1-T3). The recordings were made as part of different studies, each using different equipment (see Table 5.2). For this study, recordings were digitized and converted to a 44.1 kHz sampling rate and 16-bit signed integer PCM encoding (RIFF/WAVE). No audio compression had been used on the recordings.

5.2.2 Perceptual evaluation

Recordings were evaluated by ten experienced SLPs (experts), including one of the authors (KvS). Experts did the evaluations at home in a self-paced online listening experiment. At the time, experts were not informed about the details of the speakers. All experts were female, with a mean age of 29.9 years (range 22-49). Eight were native speakers of Dutch. Two were native German speakers who had acquired Dutch as a second language. All experts were certified Dutch SLPs. Evaluations were done using standard web browsers.

There were two experiments. In experiment 1, the experts were asked to grade recordings of one single, long sentence as having better or worse speech intelligibility and voice quality. The experts used two sliders as computerized visual analog scales (VAS).
In experiment 2, the same experts evaluated two pairs of short sentences from each speaker. The experts were asked to judge which version of the sentence in the pair was better and to what extent. The evaluation was again done using sliders for speech intelligibility and voice quality. Experts could listen to the stimuli as often as they wanted. Stimuli were presented in pseudo-random order, different for each expert. The results were scored on a scale from 0 to 1000 (pseudo-continuous).

Table 5.1: Available patients/recordings, see text.

Period               T1   T2
I    1996-1999 [18]   8    -
II   2007 [19]        5    7
III  2014             -    6
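The presentation and scoring scheme described for the listening experiments (a pseudo-random stimulus order that differs per expert, and slider responses scored from 0 to 1000) can be illustrated with a minimal sketch. This is not the software used in the study. The function names, the seeding scheme, and the linear mapping from slider position to score are all assumptions made for illustration.

```python
import random


def presentation_order(stimuli, expert_id, base_seed=0):
    """Return a reproducible pseudo-random stimulus order for one expert.

    Hypothetical helper: seeding with the expert identifier makes the order
    repeatable for that expert while normally differing between experts.
    """
    rng = random.Random(f"{base_seed}:{expert_id}")
    order = list(stimuli)  # copy, so the caller's list is left untouched
    rng.shuffle(order)
    return order


def slider_to_score(position, max_score=1000):
    """Map a slider position in [0.0, 1.0] to an integer score.

    Assumed linear mapping; the source only states that results were
    scored between 0 and 1000.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("slider position must lie between 0.0 and 1.0")
    return round(position * max_score)


if __name__ == "__main__":
    # Hypothetical stimulus labels, not the study's actual file names.
    stimuli = [f"stimulus_{i:02d}" for i in range(1, 9)]
    for expert in ("expert_01", "expert_02"):
        print(expert, presentation_order(stimuli, expert))
    print(slider_to_score(0.5))
```

Seeding per expert keeps each expert's order reproducible across sessions, which can be convenient when an online experiment is self-paced and may be resumed later.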