
A Speech Segmentation Algorithm Based on Improved Eigenvalue

Journal of Nanjing Normal University (Engineering and Technology Edition) [ISSN:1006-6977/CN:61-1281/TN]

Issue:
2011, Issue 03
Page:
73-77
Research Field:
Publishing date:

Info

Title:
A Speech Segmentation Algorithm Based on Improved Eigenvalue
Author(s):
Ren Xinshe1, Miao Hua2, Ma Qingyu3
1. Department of Educational Technology, Nanjing Normal University, Nanjing 210097, China
Keywords:
speech retrieval; speech segmentation; improved eigenvalue
PACS:
TN912.3
DOI:
-
Abstract:
With the rapid development of Internet technology and media applications, text-based retrieval cannot satisfy the requirements and auditory-visual processing cannot be applied to large amounts of data, so speech retrieval has become particularly important. An audio clip usually consists of many different types of audio segments with different meanings; audio segmentation based on audio eigenvalues has therefore become a new approach to speech retrieval for modern media. In this article, the basic eigenvalue of each audio frame is compared with the average eigenvalue of the entire audio clip to obtain an improved eigenvalue, which is then used for audio segmentation with the K-Nearest Neighbor algorithm. The experimental results show that the proposed algorithm based on the improved eigenvalue can effectively improve the accuracy of audio segmentation.
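The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which frame features are used or how a frame is "compared" with the clip average, so everything specific here is an assumption made for illustration: short-time energy and zero-crossing rate stand in for the basic eigenvalue, the improved eigenvalue is taken to be the ratio of each frame feature to the clip-wide mean, and the function names, toy training set, and labels are hypothetical. It is not the authors' implementation.

```python
import math
from collections import Counter


def frame_features(samples, frame_len):
    """Return [energy, zero-crossing rate] for each non-overlapping frame.
    (Assumed stand-ins for the 'basic eigenvalue'; the abstract names none.)"""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        feats.append([energy, crossings / (frame_len - 1)])
    return feats


def improved_features(feats, eps=1e-12):
    """Divide each frame feature by its average over the whole clip.
    (Assumed form of the frame-versus-clip comparison.)"""
    dims = range(len(feats[0]))
    means = [sum(f[i] for f in feats) / len(feats) for i in dims]
    return [[f[i] / (means[i] + eps) for i in dims] for f in feats]


def knn_predict(train_x, train_y, x, k=3):
    """Plain K-Nearest Neighbor: majority label among the k closest training points."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def segment_boundaries(labels):
    """Frame indices at which the predicted label changes."""
    return [i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]


# Toy clip: two low-energy, high-ZCR frames followed by two high-energy, low-ZCR frames.
clip = [0.01, -0.01, 0.01, -0.01] * 2 + [1.0, 1.0, 1.0, 1.0] * 2

# Hypothetical labeled examples, already in the improved-feature space.
train_x = [[0.0, 2.0], [0.1, 1.8], [2.0, 0.0], [1.9, 0.1]]
train_y = ["noise", "noise", "speech", "speech"]

feats = improved_features(frame_features(clip, frame_len=4))
labels = [knn_predict(train_x, train_y, f, k=3) for f in feats]
boundaries = segment_boundaries(labels)
```

Normalizing by the clip average makes the features relative to the recording's overall level, which is one plausible reading of why such a comparison might help segmentation; the paper itself should be consulted for the actual definition.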


Memo

Memo:
-
Last Update: 2013-03-21