Re: Business models and eye candy
Brandon J. Van Every wrote:
2 months ago I asked several text-to-speech researchers whether synthesis
for dramatic purposes was currently possible. As in, controlling tone of
voice, expression, etc. They said no. Estimates of when it'll be
available ranged from "that's our next generation product" to "people have
been saying they'll have this in 5 years for, like, forever."
Transplanted prosody will help: it takes the prosody (pitch, timing, and
volume) measured from a real audio recording and applies it to the
text-to-speech output. I put some examples on http://www.mxac.com.au/m3d/tts.htm.
It's still a long way from what you're probably looking for.
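To make the idea concrete, here is a rough sketch in Python. It is not how
any particular TTS engine does it. It assumes the recording has already been
aligned to the same phoneme sequence the synthesizer produces, and the names
Prosody, SynthPhoneme, and transplant_prosody are made up for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Prosody:
    pitch_hz: float      # fundamental frequency of the phoneme
    duration_ms: float   # how long the phoneme lasts
    volume: float        # relative loudness, 0.0 to 1.0


@dataclass(frozen=True)
class SynthPhoneme:
    symbol: str          # phoneme label, e.g. "HH" or "AY"
    prosody: Prosody     # prosody the synthesizer will render it with


def transplant_prosody(
    synth: list[SynthPhoneme],
    recorded: list[tuple[str, Prosody]],
) -> list[SynthPhoneme]:
    """Return the synthesizer's phonemes with the recording's prosody.

    `recorded` is a hypothetical (symbol, prosody) list measured from a
    real recording of the same sentence. The two sequences must line up
    phoneme for phoneme; producing that alignment is assumed to have
    happened elsewhere.
    """
    if len(synth) != len(recorded):
        raise ValueError("phoneme counts differ; sequences are not aligned")

    result = []
    for phoneme, (symbol, prosody) in zip(synth, recorded):
        if phoneme.symbol != symbol:
            raise ValueError(
                f"phoneme mismatch: {phoneme.symbol!r} vs {symbol!r}"
            )
        # Keep the synthesizer's phoneme, swap in the recorded prosody.
        result.append(replace(phoneme, prosody=prosody))
    return result
```

The sketch also shows the limitation: every line of dialogue still needs a
matching recording, so the result is only as flexible as the set of
recordings you have.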
I am emphasizing TTS because it's a cost issue; voice talent is expensive,
particularly if you want to create a large world. Pre-recorded speech and
transplanted prosody are also inflexible: they can only play canned
responses.
Of course, TTS may not be enough "eye candy", and might even be seen as
worse than raw text by some players.
Mike Rozak
http://www.mxac.com.au