How to perform speech synthesis?
There are several approaches, and the choice depends on the task at hand. The easiest way is simply to record a person speaking the desired phrases. This works well if only a restricted set of phrases and sentences is needed, e.g. announcements in a train station or schedule information over the phone. The quality depends on how the recording is made. More sophisticated, but lower in quality, are algorithms that split speech into smaller units and recombine them. The smaller those units are, the fewer of them are needed, but the quality also decreases. A commonly used unit is the phoneme, the smallest unit of sound that distinguishes meaning in a language. Depending on the language, western European languages have about 35 to 50 phonemes, so only 35 to 50 individual recordings are required. The problem lies in combining them, because fluent speech requires smooth transitions between the elements. Intelligibility is therefore lower, but the memory required is small. A solution to this dilemma is using diphones. Instead of splitting at the transit