Speech.tos

Grimmy
Atarian
Posts: 5
Joined: Wed Jan 07, 2004 12:25 am
Location: Toulouse - France

Speech.tos

Post by Grimmy »

Hi,

Does anybody have any technical info, sources or docs about this proggy (speech.tos)? It is quite short (28 KB); is the phoneme sample data compressed, in a non-PCM format, or something else?

thx
tobe
Atari God
Posts: 1459
Joined: Sat Jan 24, 2004 10:06 am
Location: Lyon, France

Post by tobe »

Hello !

This prog reminds me of the one I used to play with on the C64! It has the same voice :lol: !

I think it generates waves from meta-data; the real work is done by the sequencer/mixer (no samples!).

Some recent voice generators use samples, but the trick is to mix not only phonemes but whole words together!

Bye !
Tobe.

edit: It could be nice to have speech synthesis inside a demo screen to read the scrolltext with an ugly computer voice :D !
step 1: introduce bug, step 2: fix bug, step 3: goto step 1.
earx
Captain Atari
Posts: 353
Joined: Wed Aug 27, 2003 7:09 am

Post by earx »

i recently disassembled the speech.tos program. it seems to use 3 sine waveforms at different frequencies and amplitudes, mixed together.. the third channel can be replaced with PSG noise to simulate a 'hiss'. i think this is a way to simulate the formants present in the human voice + adding some hiss.

i also understand how it makes the PSG replay the mixed wave, i think just by the usual interrupt that sets PSG channel volumes. many sample replay routines use the same trick.
it's definitely not as natural sounding as a 'diphone' speech synth (about 3000 individual samples required), but it sounds nicely robotty :)

anyway, i have absolutely _no_ idea how the wave amplitudes and frequencies are controlled. somewhere in between phoneme conversion and the sample interrupt it got a bit blurry ;)
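
The mixing idea described above (three sine waves at different frequencies and amplitudes, with the third optionally swapped for noise) can be sketched roughly like this in Python. This is only an illustration of the principle, not the actual speech.tos routine: the sample rate, the function names and the formant values in the usage note are all assumptions.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed replay rate, not taken from speech.tos


def formant_frame(formants, n_samples, noise=None):
    """Mix up to three sine 'formants' given as (freq_hz, amplitude) pairs.

    If `noise` is a callable returning values in [-1, 1], it replaces the
    sine of the third channel to add hiss.
    """
    out = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        s = 0.0
        for i, (freq, amp) in enumerate(formants):
            if i == 2 and noise is not None:
                s += amp * noise()
            else:
                s += amp * math.sin(2.0 * math.pi * freq * t)
        out.append(s)
    return out


def to_unsigned_8bit(samples):
    """Scale to 0..255 so a replay interrupt could look each value up."""
    peak = max(1e-9, max(abs(s) for s in samples))
    return [int(round(127.5 + 127.5 * s / peak)) for s in samples]
```

For example, `to_unsigned_8bit(formant_frame([(700, 1.0), (1200, 0.5), (2600, 0.25)], 160))` gives 20 ms of a vaguely "ah"-like buzz; those three frequencies are illustrative guesses, not values read from the program.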
Grimmy
Atarian
Posts: 5
Joined: Wed Jan 07, 2004 12:25 am
Location: Toulouse - France

Post by Grimmy »

Hmm... interesting!

I've used the "record as wav" option in Steem to take a closer look at the waveform, and I've noticed that most phonemes produced by speech.tos have a main periodic signal (but modulated with 'something' else).

I will try experimenting along these lines, with some sine waves & R6 noise.
Thanx for the clue :D

If you find out how the amplitudes & frequencies are managed, I would be happy to hear about it.

That sounds robotty but would be pretty cool if used in a demo :D
tobe
Atari God
Posts: 1459
Joined: Sat Jan 24, 2004 10:06 am
Location: Lyon, France

Post by tobe »

Grimmy wrote:That sounds robotty but would be pretty cool if used in a demo :D
I can imagine such a demo :lol: ! I hope you will release it !
damo
Atariator
Posts: 25
Joined: Wed Mar 05, 2003 12:46 pm

Post by damo »

yeah i've been kicking these very ideas about for a while now.. haven't had a whole lot of luck with the phoneme concatenation angle though.. maybe this sine modulation thing is a better option.

speech synthesis ROCKS!

damo
earx
Captain Atari
Posts: 353
Joined: Wed Aug 27, 2003 7:09 am

Post by earx »

k00l! I'd enjoy seeing (hearing) a new speech synth. Anyway, I have no idea how the phonemes translate to sine freq/amp and hiss. I got lost in the code ;)

anyway, it seems stspeech has reserved words for 1040st (tenforty estee) and 520st (fivetwenty estee). =)

happy kooooaaaadddijnngg =)
Nils Schneider
Atari User
Posts: 42
Joined: Tue May 02, 2006 12:20 am
Location: Neuss, Germany

Post by Nils Schneider »

Hi coders,

Maybe some of you can help me out here?

http://www.atari-forum.com/viewtopic.php?t=9063
karlm
Atari Super Hero
Posts: 713
Joined: Thu Nov 13, 2003 4:09 am
Location: Top of the World - Australia

Post by karlm »

Man do I feel old now :)

I remembered even the filename of this baby.
Used to be c_say.zoo, had a hard time finding even an unzoo program!
Now in zip format.
Has documentation on how to interface with the speech2.tos program from C, and also the documented assembler code for speech2.tos.

Don't ask me to explain it to you all though, my vocabulary is not that expansive :)

poo, I must nearly be as old as Mug ... now that is scary :)

cheers

karlm
earx
Captain Atari
Posts: 353
Joined: Wed Aug 27, 2003 7:09 am

Post by earx »

karlm: i guess you did the same as i did but some years earlier ;)

don't know if i told this before, but speech.tos works like this:

1) convert text -> phoneme stuffs
2) magic waveform + hiss generation from phonemes
3) waveform + hiss replay using YM

(1) and (3) are relatively easy. (2) is quite elusive (magic) and not understood by anyone. that's why this is basically the only speech synth ever used on ST. I checked some PeeCee C source for a miniature speech synth but this sounded like complete nads compared to stspeech. the concept was almost the same.. converting phonemes into some sinewaves (matching formants). the speech synths in planet potion (amiga ppc demo) and windoze are pretty good, though.

i just hope someone more knowledgeable on this stuff comes to visit this board some day..
Mug UK
Administrator
Posts: 11639
Joined: Thu Apr 29, 2004 7:16 pm
Location: Stockport (UK)

Post by Mug UK »

karlm wrote:Man do I feel old now :)

poo, I must nearly be as old as Mug ... now that is scary :)

karlm
Cheeky barst .. I may look 40+ but I'm only 36 :) Mind you, the grey beard doesn't help matters!
My main site: http://www.mug-uk.co.uk - slowly digging up the bits from my past (and re-working a few): Atari ST, Sega 8-bit (game hacks) and NDS (Music ripping guide).

I develop a free Word (for Windows) add-in that's available for Word 2007 upwards. It's a fix-it toolbox that will allow power Word users to fix document errors. You can find it at: http://www.mikestoolbox.co.uk
ggn
Atari God
Posts: 1258
Joined: Sat Dec 28, 2002 4:49 pm

Post by ggn »

STSPEECH has been converted to Windows and the Nintendo DS. Check this out: http://nintendo-ds.dcemu.co.uk/DSSpeech.shtml
and go nag the authors for a source release :D

:)

Trivia: The original credits A.D.Beveridge as its co-author. Wonder if that was the same as the 'Andy Beveridge' shown in the credits of Cybercon III (assembly line games rulez ;) )

George
is 73 Falcon patched atari games enough ? ^^
tobe
Atari God
Posts: 1459
Joined: Sat Jan 24, 2004 10:06 am
Location: Lyon, France

Post by tobe »

karlm wrote:Man do I feel old now :)

I remembered even the filename of this baby.
Used to be c_say.zoo, had a hard time finding even an unzoo program!
Now in zip format.
Has documentation on how to interface with the speech2.tos program from C, and also the documented assembler code for speech2.tos.

Don't ask me to explain it to you all though, my vocabulary is not that expansive :)

poo, I must nearly be as old as Mug ... now that is scary :)

cheers

karlm
:D :D :D :D :D

Thanks a lot Karlm for the sources :)
I'm commenting the code right now; it seems it needs a few optimisations in the parsing function ;)
Maybe a demo with speech soon, who knows?
lp
Fuji Shaped Bastard
Posts: 2605
Joined: Wed Nov 12, 2003 11:09 pm
Location: GFA Headquarters

Post by lp »

Does anyone remember the commercial release 'Smooth Talker'? This app sounded really good, a touch better than speech.tos, but it was useless as it had an awkward GUI and you could not call it externally. Sounded great though. It would read text files out loud.

That one might be a good one to hack if someone could dig up a copy. :)
Gunstick
Captain Atari
Posts: 298
Joined: Thu Jun 20, 2002 6:49 pm
Location: Luxembourg

Re: Speech.tos

Post by Gunstick »

this program also has a multitude of parameters

* you can change the pitch
* modification of talking speed
* put in the phonemes directly
* insert pauses

I had quite some fun with it, but somehow forgot how it works.
The program even made it into the German charts in a techno song (U96: Das Boot)

1..2..3.. techno

Georges
karlm
Atari Super Hero
Posts: 713
Joined: Thu Nov 13, 2003 4:09 am
Location: Top of the World - Australia

Post by karlm »

muguk wrote:
karlm wrote:Man do I feel old now :)

poo, I must nearly be as old as Mug ... now that is scary :)

karlm
Cheeky barst .. I may look 40+ but I'm only 36 :) Mind you, the grey beard doesn't help matters!
lol :) Don't laugh Mug, me ol china ... You only got two years on me then :P

/me young whippersnapper :D

glad you like it Tobe

cheers

karlm
earx
Captain Atari
Posts: 353
Joined: Wed Aug 27, 2003 7:09 am

Post by earx »

gunstick:

didn't know it could change talking speed or pitch.. man.. it really is interesting stuff.. wish i had some more time..

btw a bit off-topic, but i'm using a 4->16 bit adpcm for my demos now which seems to work very nice. i made a variation on this one:

http://www.syncscroller.net/psx/depack.c

is teh rules! =)
ggn
Atari God
Posts: 1258
Joined: Sat Dec 28, 2002 4:49 pm

Post by ggn »

earx wrote:gunstick:

didn't know it could change talking speed or pitch.. man.. it really is interesting stuff.. wish i had some more time..

btw a bit off-topic, but i'm using a 4->16 bit adpcm for my demos now which seems to work very nice. i made a variation on this one:

http://www.syncscroller.net/psx/depack.c

is teh rules! =)
Hoping to hear the results in a finished project soon :)
Sarek
Captain Atari
Posts: 374
Joined: Sat Nov 20, 2004 12:30 pm

Post by Sarek »

We're all talking about speech engines!

You can always make your own speech engine out of samples of your own vocal tones. I created one of these a few years ago but the result had some imperfections (clicks between phonemes). I've never got around to fixing it:

1. Get a table of phonetic sounds.

2. Record all phonetic sounds into sampler using your own voice. (You should use constant vocal notes at the same pitch)

3. Write a program to convert text into phonetic strings.

4. The program should now sew bits of phonetic noise together and play it.

5. Playback pitch can be adjusted by expanding or compressing the wave.

You could probably do a Fourier analysis and "fade" the harmonics to create smooth transitions between phonemes.
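
Steps 4 and 5 above, plus a cure for the clicks, can be sketched in Python as follows. Note that this swaps in a plain time-domain crossfade rather than the Fourier approach suggested in the post, and uses nearest-neighbour resampling for the pitch change; the function names and the overlap length are assumptions made up for the example.

```python
def crossfade_concat(chunks, overlap):
    """Join lists of samples, blending `overlap` samples at each seam
    so the waveform does not jump (which is what causes the clicks)."""
    if not chunks:
        return []
    out = list(chunks[0])
    for chunk in chunks[1:]:
        n = min(overlap, len(out), len(chunk))
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming chunk
            j = len(out) - n + i
            out[j] = (1.0 - w) * out[j] + w * chunk[i]
        out.extend(chunk[n:])
    return out


def resample(samples, ratio):
    """Nearest-neighbour resampling: ratio > 1 compresses the wave
    (higher pitch, shorter), ratio < 1 expands it."""
    length = int(len(samples) / ratio)
    last = len(samples) - 1
    return [samples[min(last, int(i * ratio))] for i in range(length)]
```

Something like `crossfade_concat([phoneme_a, phoneme_b], 32)` would then replace a hard splice, where `phoneme_a` and `phoneme_b` are hypothetical recorded sample lists. The price is that each seam eats `overlap` samples of both neighbours.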
Grimmy
Atarian
Posts: 5
Joined: Wed Jan 07, 2004 12:25 am
Location: Toulouse - France

Post by Grimmy »

Two years later... :)

I'd be quite interested if there's anyone able to explain how speech.tos generates its phonemes.

Up to now, I've been using a set of short 4-bit phoneme samples, but they still take too much space, even compressed (to fit in a 4 KB intro, for example). That's why I'm still looking for a way to generate these samples using some simple additive synthesis (I do not care if it sounds robotic or not; if it's understandable, I will be quite happy already :).

The attachment below includes 2 speech programs I've found on the ST, if they can be of any use. Give the one in the "SMOOTH" folder a listen :)
tobe
Atari God
Posts: 1459
Joined: Sat Jan 24, 2004 10:06 am
Location: Lyon, France

Post by tobe »

I'm currently studying the source code "c_say.s", adding comments and labels. As soon as i get something readable, i will post it here.

edit

Code: Select all

   EY   AY   OY   OW   WX   YX   AE   IY   ER   AO   UX   UH   AH   AA   OH
   AX   IX   IH   EH   DH   ZH   CH   CH   LX   RX   SH   NX   TH   /H   V
   Z    J    L    R    W    Y    Q    P    T    K    B    D    G    M    N
   F    S    -    ?    .         UL   UM   UN   IL   IM   IN
Does anyone know what kind of phonemes these are?
earx
Captain Atari
Posts: 353
Joined: Wed Aug 27, 2003 7:09 am

Post by earx »

i now released a nicely fixed (and sub-optimal) version of STSPEECH for falcon (also all boosted ones: CT60, nemesis, CT2, phantom, etc): I removed the Self-Modifying Code, MOVEP instructions and YM shadow register access. The whole thing still runs in timer A and so, you can use it in parallel with most MP2/3 and MOD players.

the phonemes listed by tobe are very well disassembled in the C_SAY program. these phonemes match little programs like "noise pulse with specified frequency" or "3 formant waves at specified frequencies".

For instance, the "S" and "F" phonemes are noise-only using the pseudo-random generator of the YM2149. IIRC the "F" has a lower frequency than the "S", for the rest they are identical. And they don't even use an envelope (if i saw correctly) !
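
A noise-only phoneme of this kind can be imitated in a few lines of Python. This is a rough sketch, not the C_SAY code: the 17-bit shift register with feedback from bits 0 and 3 is how the YM/AY noise generator is commonly described and is treated as an assumption here, and the two period values are hypothetical placeholders that only preserve the "F is lower than S" relationship mentioned above.

```python
# Hypothetical noise periods: a larger period means lower-pitched hiss.
NOISE_PERIOD = {"S": 1, "F": 4}


def lfsr_noise_bits(count, period, state=1):
    """Return `count` output bits from a 17-bit LFSR that is stepped
    once every `period` ticks and held in between."""
    bits = []
    for tick in range(count):
        if tick % period == 0:
            feedback = (state ^ (state >> 3)) & 1  # assumed taps: bits 0 and 3
            state = (state >> 1) | (feedback << 16)
        bits.append(state & 1)
    return bits


def noise_phoneme(name, ticks):
    """Constant-level noise for a noise-only phoneme, with no envelope."""
    return lfsr_noise_bits(ticks, NOISE_PERIOD[name])
```

Calling `noise_phoneme("S", 2000)` and `noise_phoneme("F", 2000)` gives two bit streams that differ only in how often they flip, which matches the description of the two phonemes being otherwise identical.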

I now understand the whole ST Speech thing. The core of the thing is the phoneme table which contains little "structures" that translate phonemes into stuff a soundchip can understand.

ST Speech is big not primarily because of these, but because of the premultiplied sine wave tables that are DC.B'ed into the program. The 8-bit sample -> 3-channel YM amplitude conversion table is also DC.B'ed in there. Another costly thing is the code that translates English into phonemes, and even the prompt (user interface). If you trash some of this code and replace the tables with table generation code, you may end up with a 4Ktro yet. Or a very attractive engine for use in combination with a 96ktro.
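
To give an idea of what "table generation code" for the sample -> YM conversion could look like, here is a small Python sketch. It is not how the table in ST Speech was actually produced: the volume curve (about 3 dB per step, level 0 silent) is an assumed approximation of the chip's output, and `build_ym_table` is a name made up for the example.

```python
import math
from itertools import combinations_with_replacement

# Assumed 16-step volume curve: roughly 3 dB per step, level 0 silent.
LEVELS = [0.0] + [math.sqrt(2.0) ** (v - 15) for v in range(1, 16)]


def build_ym_table(size=256):
    """For each sample value, pick three channel volumes (0..15) whose
    summed output is closest to the target amplitude."""
    combos = sorted(
        (LEVELS[a] + LEVELS[b] + LEVELS[c], (a, b, c))
        for a, b, c in combinations_with_replacement(range(16), 3)
    )
    full_scale = combos[-1][0]
    table = []
    k = 0
    for sample in range(size):
        target = full_scale * sample / (size - 1)
        # combos is sorted by output level, so the best match
        # never moves backwards as the target grows
        while (k + 1 < len(combos)
               and abs(combos[k + 1][0] - target) <= abs(combos[k][0] - target)):
            k += 1
        table.append(combos[k][1])
    return table
```

`build_ym_table()[128]` then yields the three volume register values to write for a mid-scale sample. Generating the table at startup costs a little code but saves 768 bytes of data, which is the kind of trade that matters for a 4Ktro.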

I can post my Falcon/TT/CT60 fixed C_SAY here this evening.
earx
Captain Atari
Posts: 353
Joined: Wed Aug 27, 2003 7:09 am

Post by earx »

here it is. C_SAY.ZIP contains the compatible C_SAY + DSPMOD (DSPMOD only likes to be in STRAM), these may run in parallel :). FSPEECH.ZIP contains a compatible version of STSPEECH.
earx
Captain Atari
Posts: 353
Joined: Wed Aug 27, 2003 7:09 am

Post by earx »

eh, here they are then..
lp
Fuji Shaped Bastard
Posts: 2605
Joined: Wed Nov 12, 2003 11:09 pm
Location: GFA Headquarters

Post by lp »

If anyone happens to have a Hades, FSPEECH works. Very cool.
Nice work earx. :D

I downloaded the c_say archive and noticed there was a file called say.o in there. It appears to be a DRI format object file, so I linked it against a test program I made in GFA, but I didn't get what sounds like correct speech. I then compiled testsay.s, which worked. So I took sayspl.s, added opt l2, uncommented the .global lines and built a new object file. If I link against this new object file I get correct speech. My Hades uttered the word "Atari" from a compiled GFA program. lol

Do you plan to put the "english -> phonetic text" routine back in?