End-to-end Speaker Recognition Based on MTFC-FullRes2Net, ERICDATA Higher Education Knowledge Base
Title
End-to-end Speaker Recognition Based on MTFC-FullRes2Net
Authors Li-Hong Deng, Fei Deng, Ge-Xiang Chiou, Qiang Yang
Abstract

The feature extraction ability of lightweight convolutional neural networks in speaker recognition systems is weak, and their recognition accuracy is poor. Many methods use deeper, wider, and more complex network structures to improve feature extraction, but this makes the parameter count and inference time increase exponentially. In this paper, we bring Res2Net from the object detection task to the speaker recognition task and verify its effectiveness and robustness there. We then improve it and propose FullRes2Net, which has better multi-scale feature extraction ability without increasing the number of parameters. We further propose mixed time-frequency channel (MTFC) attention, which addresses the problems of existing attention methods, compensates for the shortcomings of convolution itself, and further enhances the feature extraction ability of convolutional neural networks. Experiments were conducted on the VoxCeleb dataset. The results show that the proposed MTFC-FullRes2Net end-to-end speaker recognition system effectively improves on the feature extraction and generalization ability of Res2Net. Compared to Res2Net, MTFC-FullRes2Net improves performance by 31.5%. Compared to ThinResNet-50, RawNet, CNN+Transformer, and Y-vector, it improves performance by 56.5%, 14.1%, 16.7%, and 23.4%, respectively. It is also superior to state-of-the-art speaker recognition systems that use complex structures. MTFC-FullRes2Net is a lightweight, more efficient end-to-end architecture that is better suited to practical application.
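
The abstract does not describe the internals of FullRes2Net or of the MTFC attention, so the sketch below illustrates only the multi-scale idea behind the baseline Res2Net: channel groups are processed in a chain, and each group after the second also receives the previous group's output, so later groups see a wider context. It is a toy 1-D illustration under stated assumptions, not the authors' implementation. The names `smooth3`, `res2net_split`, and `support` are hypothetical, and a fixed 3-tap average stands in for a learned convolution.

```python
# Toy 1-D illustration of the Res2Net-style hierarchical split.
# Assumption: a fixed 3-tap average replaces each learned 3x3 convolution.
# This is NOT FullRes2Net; the abstract gives no details of that variant.


def smooth3(signal):
    """Hypothetical stand-in for a conv filter: 3-tap average, zero padding."""
    padded = [0.0] + list(signal) + [0.0]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0
            for i in range(len(signal))]


def res2net_split(groups):
    """Apply the hierarchical connections to a list of channel groups.

    y_1 = x_1
    y_2 = K(x_2)
    y_i = K(x_i + y_{i-1})   for i > 2
    """
    outputs = []
    for i, x in enumerate(groups):
        if i == 0:
            y = list(x)            # first group passes through unchanged
        elif i == 1:
            y = smooth3(x)         # second group is filtered once
        else:
            prev = outputs[-1]     # later groups also see the previous output
            y = smooth3([a + b for a, b in zip(x, prev)])
        outputs.append(y)
    return outputs


def support(signal):
    """Count non-zero positions, a rough proxy for receptive-field width."""
    return sum(1 for v in signal if v != 0.0)


# Four groups, each holding a unit impulse at the centre of 9 samples.
impulse = [0.0] * 4 + [1.0] + [0.0] * 4
outs = res2net_split([impulse] * 4)
widths = [support(y) for y in outs]
```

With this input, `widths` is `[1, 3, 5, 7]`: each successive group covers a wider span of the input, which is the sense in which a single block produces features at several scales.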

 

Pages 075-091
Keywords speaker recognition, res2net, attention mechanisms
Journal 電腦學刊 (Journal of Computers)
Issue 202306 (Vol. 34, No. 3)
DOI 10.53106/199115992023063403006