Saturday, October 24, 2015

Speech Recognition Over IP Networks Hong Kook Kim

This chapter introduces the basic features of speech recognition over an IP-based network. First, we review typical lossy packet channel models and several speech coders used for voice over IP, conditions under which the performance of a network speech recognition (NSR) system can degrade significantly. Second, we address several techniques for maintaining NSR performance under packet loss. These are classified into client-based and server-based techniques: the former include rate control, forward error correction, and interleaving; the latter include packet loss concealment and ASR-decoder-based concealment. The last part of the chapter explains a new framework for NSR over IP networks. In particular, it presents a speech coder optimized for automatic speech recognition (ASR) that provides speech quality comparable to the conventional standard speech coders used in IP networks. In addition, we compare the performance of NSR using the ASR-optimized speech coder with that using a conventional speech coder.
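
As a rough illustration of the interleaving idea mentioned above, the following Python sketch shows a simple block interleaver. It is not taken from the chapter: the function names, the row/column layout, and the use of `None` to mark a lost frame are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def interleave(frames: Sequence[T], rows: int, cols: int) -> List[T]:
    """Write frames row by row into a rows x cols block, read column by column."""
    if len(frames) != rows * cols:
        raise ValueError("len(frames) must equal rows * cols")
    return [frames[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(frames: Sequence[T], rows: int, cols: int) -> List[T]:
    """Invert interleave(): put each received frame back at its original index."""
    if len(frames) != rows * cols:
        raise ValueError("len(frames) must equal rows * cols")
    out: List[Optional[T]] = [None] * (rows * cols)
    i = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = frames[i]
            i += 1
    return out  # lost frames (None) stay None, but at scattered positions


def longest_loss_run(frames: Sequence[Optional[T]]) -> int:
    """Length of the longest run of consecutive lost (None) frames."""
    longest = current = 0
    for f in frames:
        current = current + 1 if f is None else 0
        longest = max(longest, current)
    return longest
```

With 12 frames in a 3 x 4 block, a burst that wipes out three consecutive transmitted frames shows up after de-interleaving as three isolated single-frame losses, which are typically easier for a concealment stage to handle than one long gap. The price is extra delay, since a full block must be buffered before transmission.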

Monday, June 8, 2015

Sherif Mohamed Abdel Monem Ph.D. Dissertation Speech Analysis Synthesis Recognition


Synopsis

     This research studies transmission line network models of the vocal tract, with applications to speech analysis, synthesis and recognition. The output response is calculated for cascaded transmission line networks representing the vocal and nasal tracts. The reflection coefficients at the junctions of the transmission line network model correspond to the shape of the vocal tract at a particular time and mode of articulation. The analysis is presented in section IV-A1b for lossless transmission line networks, in section IV-A1b for lossy networks, and in section IV-A1c for networks with segments of unequal length.
     In the course of these derivations, a new analysis technique for the transmission line network response is developed (Ch. IV). In contrast to the usual detailed analysis of the forward and backward waves using a section-by-section approach, the entire transmission line configuration is treated as a single system. This new analysis leads to the introduction of several analytical tools that facilitate the computation. Examples of these tools are loop gains (Sec. IV-B and App. Q), pseudo-loop gains (Sec. IV-B and App. R), subtree-pseudo-loop gains (Sec. IV-B and App. R), subtree-pseudo-loop gains (Sec. IV-B and App. M) and path gains (Sec. IV-A2 and Sec. IV-B).
     The synthesis of transmission line networks is treated for various given parameters: (a) for a given input-output response (Sec. IV-Fa and Fb), and (b) for a given characteristic equation polynomial (Sec. IV-Eb). The recursive solution of the autocorrelation normal equations, as applied to the synthesis of transmission line networks, is presented in Appendix M. The use of an analog signal (i.e. a triangular pulse for the excitation of a transmission line network) in the analysis and synthesis of the transmission line model is discussed in Sec. V-a.
    An illustration of the use of the transmission line model in various speech applications, together with a discussion of the adaptability of the cascaded transmission line network for simulating non-nasalized speech sounds and of the transmission line T network for simulating nasalized speech sounds, is presented in Sec. V-B.
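
The link between reflection coefficients and vocal tract shape can be sketched with the standard lossless tube relation, in which the coefficient at a junction depends only on the cross-sectional areas of the two adjoining sections. The Python below is a minimal sketch under that textbook assumption, not the dissertation's own derivation; the sign convention (positive when the tube widens toward the lips) and the function names are assumptions.

```python
from typing import List, Sequence


def reflection_coefficients(areas: Sequence[float]) -> List[float]:
    """r_k = (A_{k+1} - A_k) / (A_{k+1} + A_k) at each junction.

    Assumed convention: sections are ordered from glottis to lips, and the
    sign is positive when the next section is wider. Other texts flip the sign.
    """
    if len(areas) < 2:
        raise ValueError("need at least two sections")
    if any(a <= 0 for a in areas):
        raise ValueError("areas must be positive")
    return [(nxt - cur) / (nxt + cur) for cur, nxt in zip(areas, areas[1:])]


def areas_from_reflections(first_area: float, coeffs: Sequence[float]) -> List[float]:
    """Invert the relation: A_{k+1} = A_k * (1 + r_k) / (1 - r_k)."""
    if first_area <= 0:
        raise ValueError("first_area must be positive")
    areas = [first_area]
    for r in coeffs:
        if not -1.0 < r < 1.0:
            raise ValueError("each coefficient must lie strictly between -1 and 1")
        areas.append(areas[-1] * (1.0 + r) / (1.0 - r))
    return areas
```

A uniform tube gives all-zero coefficients, and a set of coefficients fixes the area function only up to the overall scale set by `first_area`, which is why the shape rather than the absolute size is what the coefficients encode.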
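
One widely used recursive solution of the autocorrelation normal equations is the Levinson-Durbin recursion. Whether Appendix M follows exactly this form is not stated here, so the Python below should be read as a generic sketch; the function name, the predictor sign convention, and the return values are assumptions.

```python
from typing import List, Sequence, Tuple


def levinson_durbin(r: Sequence[float], order: int) -> Tuple[List[float], List[float], float]:
    """Solve sum_{j=1..p} a_j * r[|i-j|] = r[i] for i = 1..p.

    Assumed convention: the predictor is x[n] ~ sum_j a_j * x[n-j].
    Returns (a, k, err): predictor coefficients a_1..a_p, the per-stage
    coefficients k_1..k_p, and the final prediction error energy.
    """
    if order < 1 or len(r) < order + 1:
        raise ValueError("need autocorrelation lags r[0]..r[order]")
    if r[0] <= 0:
        raise ValueError("r[0] must be positive")
    a = [0.0] * (order + 1)  # a[0] is unused padding so indices match the math
    ks: List[float] = []
    err = float(r[0])
    for i in range(1, order + 1):
        if err <= 0:
            raise ValueError("prediction error vanished; signal is fully predictable")
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        ks.append(k)
        err *= 1.0 - k * k
    return a[1:], ks, err
```

The per-stage values `k` play the role of reflection coefficients of an equivalent lossless tube, up to a sign convention that differs between texts, which is what makes this recursion a natural fit for synthesizing a cascaded transmission line model from measured autocorrelations.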
    The model
/////////////////////////////////////////////////////////////////////////////////////////////////


I-INTRODUCTION


     In many areas of engineering research, one deals with a system that is related to some physical process. One way to study the performance of such a system is to examine the physical phenomena involved. This can be achieved by finding a suitable model that best describes the system behavior with a reasonable and sufficient degree of simplicity and accuracy. A search for a model that satisfies these criteria is therefore of great importance.
     Many physical phenomena, such as those involving signal propagation through multimedia, are found to have common features and modes of propagation. For such systems, a unified model seems appropriate and highly desirable.

Wednesday, June 30, 2010

VOCAL TRACT ACOUSTICS

VOCAL TRACT ACOUSTICS USING THE TRANSMISSION LINE MATRIX (TLM) METHOD
Samir El-Masri (1,2), Xavier Pelorson (1), Pierre Saguet (2) and Pierre Badin (1)
CONCLUSION
In this paper we have described an original numerical method
to simulate the acoustics of the vocal tract. Compared with
traditional FEM methods, this technique has the main
advantage of performing time-domain simulations in a very simple
and accurate way. A study of higher acoustical modes has been
presented. We showed how, using the TLM method, it was
possible not only to reveal these modes but also to measure
their characteristics. Based on these findings, a new
propagation model has been derived and appears to be much
more accurate than the classical plane wave model. First
simulations using complex vocal tract geometries tend to
confirm the importance of these higher modes.
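
For orientation, the classical plane wave model and the onset of higher modes can be compared with two textbook formulas. The Python sketch below is not from the paper: it assumes a uniform rigid-walled circular tube, closed at the glottis and open at the lips, and a sound speed of 350 m/s. The function names are hypothetical.

```python
import math
from typing import List

SPEED_OF_SOUND = 350.0  # m/s, assumed value for warm, humid air


def quarter_wave_resonances(length_m: float, count: int,
                            c: float = SPEED_OF_SOUND) -> List[float]:
    """Plane wave resonances of a uniform closed-open tube: (2n - 1) c / (4 L)."""
    if length_m <= 0 or count < 1:
        raise ValueError("length must be positive and count at least 1")
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, count + 1)]


def first_cross_mode_cut_on(radius_m: float, c: float = SPEED_OF_SOUND) -> float:
    """Cut-on frequency of the first non-planar mode in a rigid circular duct.

    Uses k * a = 1.8412, the first zero of the derivative of the Bessel
    function J1. Below this frequency only plane waves propagate.
    """
    if radius_m <= 0:
        raise ValueError("radius must be positive")
    return 1.8412 * c / (2.0 * math.pi * radius_m)
```

Under these assumptions, a tube 17.5 cm long has plane wave resonances at odd multiples of 500 Hz, while a 2 cm radius puts the first higher mode at roughly 5 kHz. Real vocal tract geometries are neither uniform nor circular, so these numbers are only a rough guide to where the plane wave model starts to break down.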

Modeling of the Vocal Tract Using Transmission Line Network


Friday, March 12, 2010