The Speech and Hearing Log ...
A dog that recognises 200 spoken words
11 Jun 2004. A report in Science about a
border collie called Rico that has been tested on his ability to fetch
items by their name as spoken by his owner. Not only can Rico fetch
over 200 objects by name, the research shows that under some
circumstances he can learn the name of a new item after just a single
spoken example. Although Rico can't speak or use sentences, his
vocabulary size is similar to that of a 3-year-old child, and at the same level
as the best parrots, apes, and dolphins. What is interesting is the
idea that word acquisition could arise from fairly general pattern
recognition abilities, and of course the question of at what point that
simple process becomes insufficient to explain how a child acquires language. And why
haven't we yet got a computer that is as intelligent as Rico?
Potent Tools for Speech Research
30 Jan 2004. Release 4.5 of the Speech Filing System
(SFS) tools for speech research includes algorithms from the Entropic
Signal Processing System. If this means nothing to you, then you
probably didn't know that the pitch and formant estimation algorithms
from ESPS were considered state of the art when they were part of this
very expensive commercial package. But now these same algorithms
are available as a free download as part of a set of tools that run on
PCs. Entropic were bought by Microsoft to get access to their
speech recognition technology and have been generous enough to return
the intellectual property in the ESPS tools to the community under an
open software licence.
English speakers "spread disease" in China
22 Jan 2004. Another crazed story involving doctors
misunderstanding language, this time from China. As reported in
the UK Guardian newspaper by the author of the Annals of Improbable Research, Dr Sakae Inouye, of Otsuma Women's University in Tokyo has reported to the Lancet
on his theory that the spread of SARS (Severe Acute Respiratory
Syndrome) among Chinese and English speakers is greater than among
Japanese speakers because of the presence of aspirated consonants in
English and Chinese. Without getting into the dangerous area of
why Dr Inouye believes in the superiority of his own language, it is
pretty easy to convince yourself that aspiration can only reduce
air-flow at the lips - after all it involves adding a resistance to air
flow at the glottis. If you are comparing human cultures in the
spread of disease, there are far more potent influences on
susceptibility than language.
Sopranos compromise intelligibility for power
10 Jan 2004. Physicists in New South Wales,
Australia have measured the frequency response of the vocal tract of
operatic sopranos singing vowels at high pitches. Their findings,
reported in Nature and a BBC News story,
are that the singers shift their formant frequencies onto harmonics of
the fundamental to ensure greatest output power regardless of the fact
that such a shift will change the quality of the vowel. Although
this has been suspected for some time, what seems to be new is the use of a sound probe technique to obtain high-quality estimates of vocal tract resonant frequencies during singing.
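To make the idea of shifting a resonance onto a harmonic concrete, here is a minimal Python sketch. The function name and the example figures (a sung note at 880 Hz and a first resonance near 300 Hz) are my own illustrative assumptions, not values or methods taken from the Nature report.

```python
def nearest_harmonic(f0_hz: float, resonance_hz: float) -> tuple[int, float]:
    """Return (harmonic number, harmonic frequency) closest to a resonance.

    Harmonics of a sung note lie at integer multiples of the fundamental f0.
    The lowest available harmonic is the fundamental itself (n = 1).
    """
    n = max(1, round(resonance_hz / f0_hz))
    return n, n * f0_hz


# Illustrative values only (assumptions, not measurements).
f0 = 880.0          # fundamental of a high sung note, in Hz
resonance = 300.0   # a first vocal tract resonance typical of a close vowel, in Hz

n, harmonic_hz = nearest_harmonic(f0, resonance)
shift_hz = harmonic_hz - resonance
print(f"Nearest harmonic: n={n} at {harmonic_hz:.0f} Hz")
print(f"Resonance must move by {shift_hz:.0f} Hz to sit on it")
```

With these numbers the only harmonic anywhere near the resonance is the fundamental itself, so the resonance would have to rise by several hundred hertz to reinforce it, which is exactly the kind of shift that changes vowel quality.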
Child abuse in the name of better English pronunciation
3 Jan 2004. A shocking story in the UK Independent newspaper (copy here)
about Korean parents forcing medical surgery on their children in the
mistaken idea that it will improve their pronunciation of
English. Since Korean, like Japanese, treats [l] and [r] as
allophones of the same phoneme, speakers of those languages have
difficulty in perceiving or producing the distinction between English
phonemes /l/ and /r/. However some Korean parents have come
to believe that this difficulty in pronunciation is due to an
anatomical problem and have sent their children for surgery supposed to
make the tongue more "flexible". Another example, alongside
circumcision, of ignorance being used to justify child mutilation.
SoundBlaster MP3+ USB audio
13 Dec 2003. My first impressions of the SoundBlaster MP3+
external USB audio interface were quite positive. Installation
was straightforward on my Vaio laptop, audio output sounded fine, and
measured noise and harmonic distortion on the line-level inputs were
good. But then I tried to get a good signal from a
microphone. I found noise levels only 30 times lower than the
maximum signal level, a signal-to-noise ratio of only about 30 dB.
This is really bad, and worse even than the direct microphone input on
my Vaio. After all the good work Creative Labs did on the rest of the
box, they seem to have put some noisy pre-amplifier in the microphone
input circuit. If you're thinking of using this box for microphone
input, be prepared to buy a pre-amplifier and use the line inputs.
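The arithmetic behind that signal-to-noise figure is easy to check. This is a minimal sketch, assuming the factor of 30 is a ratio of amplitudes (so decibels are 20 times the base-10 logarithm of the ratio); the function names are mine and do not come from any audio library.

```python
import math


def amplitude_ratio_to_db(ratio: float) -> float:
    """Convert a signal/noise amplitude ratio to decibels."""
    return 20.0 * math.log10(ratio)


def db_to_amplitude_ratio(db: float) -> float:
    """Convert decibels back to an amplitude ratio."""
    return 10.0 ** (db / 20.0)


# Assumption: "30 times" refers to amplitude, not power.
snr_db = amplitude_ratio_to_db(30.0)
print(f"Amplitude ratio of 30 = {snr_db:.1f} dB")  # prints 29.5 dB
```

If the factor of 30 were a power ratio instead, the conversion would use 10 times the logarithm and the result would be roughly half as many decibels, so the amplitude reading is the one that matches the figure quoted above.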
British accent a curse of American stroke victim
25 Nov 2003. BBC News reports on an American woman who, on recovering from a stroke, found her speech had changed to a British accent. Occurrences of Foreign Accent Syndrome
have been reported before, and the evidence is that the accent is in the
minds of listeners rather than in the speech of the victims, but what I
found interesting about this report was that the woman treated her new
British English accent as a kind of disability. I'm not sure I
like my accent being described in this way! Accents help define
us as individuals - we wouldn't be the same person with a different
accent - and this is what really upset this woman, that her own image
of her own personality has been changed by how her stroke has affected
her speech.
Predicting hit songs with science
24 Nov 2003. There are some things that engineers should leave alone.
One of them is using science to find the secret of success of works of
art. Even if PolyPhonic HMI
have found statistical correlations between the physical properties of
popular music and its commercial success, that will really be of no use
in predicting the sales of new songs. What makes a hit is not to be
found in the sound alone - in the actual hertz and decibels and
milliseconds - but in our reaction to it in terms of its musicality,
its emotion and meaning, the appeal of the artist, its marketing, price
and availability, and so on. Hit Song Science is just a correlation of sound and profit, ultimately meaningless and ultimately cynical.
Waseda University Mechanical Talker speaks!
22 Nov 2003. I've just come across videos of the Waseda talking robot
actually speaking. The robot vocal tract is ingeniously operated by
wires which extend through the articulators - so that the tongue, for
example, can be pulled into position with a wire through the palate.
Here is a direct link to the robot saying "Waseda University" (in Japanese). It's an astonishing feat, but I'm unsure of the application - what does
a mechanical vocal tract teach us that we couldn't learn in computer simulation?
Speech technology helps create mobile phone for visually impaired users
21 Nov 2003. The Owasys 22C
mobile phone has no display, large keys, and can communicate with its
owner using speech - through speech synthesis and speech recognition
technology. It was designed for blind and visually impaired
users, but the company has found that it is also popular among the
sighted elderly. There is an important lesson here: interface
designs that work for the disabled are often really nothing more than good designs, interfaces where ease of use has been given the right importance in the battle against creeping functionality.
Primopuel talking doll is hit with adults
30 Oct 2003. The Primopuel phenomenon in Japan has reached the western news agencies. Primopuel is a
doll produced by Bandai
that uses a number of sensors for things like touch, vibration,
temperature and sound to "react" to its owner by producing one of 280
phrases from a stored lexicon recorded by a six-year-old child.
Worryingly, such a simulation of a child seems to have won the hearts
of many Japanese adults, who see such a device as a
companion-substitute. Well, I guess it is more vocal than a dog or a
cat - but isn't the fact that it is designed to manipulate your emotions just a little worrying?