Kadazandusun Speech Recognition: A Case Study

Mohd Shamrie Sainin; Mohd Hanafie Haris


Transactions on Science and Technology


ABSTRACT

No existing system currently provides common information and utilities for Kadazandusun speech recognition, because Kadazandusun speech has features that are not found in other languages. This paper presents a preliminary experiment using Linear Prediction Cepstral Coefficients (LPCC), a widely used feature extraction method. The extracted features are then passed to several classifier algorithms to investigate the recognition rate for Kadazandusun words. Six Kadazandusun words, collected as individual speech samples, are used to test the feature extraction and the classifiers. The objectives of this study are to investigate LPCC feature extraction and to propose a suitable classifier algorithm for Kadazandusun speech data.

KEYWORDS: Kadazandusun; Feature Extraction; Speech Recognition; Speech; LPCC
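
As an illustration of the LPCC feature extraction named in the abstract, the following is a minimal sketch of the textbook procedure for a single speech frame: Hamming windowing, autocorrelation, the Levinson-Durbin recursion for the linear prediction coefficients, and the standard recursion from prediction coefficients to cepstral coefficients. It is not the authors' implementation. The function names, the prediction order of 12, the number of cepstral coefficients, and the synthetic test frame are all assumptions chosen for the example.

```python
import math
import random


def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Predictor coefficients a[1..order], with x[n] ~ sum_k a[k] * x[n-k]."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # frame is perfectly predictable; stop early
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        error *= (1.0 - k * k)
    return a[1:]


def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of 1 / (1 - sum_k a[k] z^-k).

    The gain term c[0] is omitted.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def lpcc(frame, order=12, n_ceps=12):
    """LPCC vector for one frame (order and n_ceps are assumed defaults)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    r = autocorrelation(windowed, order)
    if r[0] == 0.0:
        return [0.0] * n_ceps  # silent frame
    return lpc_to_cepstrum(levinson_durbin(r, order), n_ceps)


# Synthetic stand-in for a 30 ms frame at 8 kHz (not real speech data).
rng = random.Random(0)
frame = [math.sin(2.0 * math.pi * 440.0 * t / 8000.0) + 0.1 * rng.gauss(0.0, 1.0)
         for t in range(240)]
features = lpcc(frame)
print(len(features))
```

In a word recognition setup of the kind the abstract describes, one such vector would be computed per frame and the per-frame vectors combined into a fixed-length input for each classifier under comparison; how the frames are combined is a design choice this sketch leaves open.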

