Title | A Method of Detecting Approximate Repetitive News Documents |
---|---|
Parallel title | A Method of Detecting Approximate Repetitive News Documents |
Authors | Xueping Liang, Xiaojun Wen |
Abstract | Given the large number of duplicated webpages on the Internet, this paper proposes an algorithm and system for detecting approximately duplicate webpages that combines multi-feature fingerprint cluster detection with document similarity detection. In this scheme, multi-feature fingerprint cluster detection is applied first to ensure the precision and efficiency of the algorithm; for the small portion of documents that it fails to recall, document similarity detection is then used to guarantee the recall rate. The scheme improves both precision and recall rate while keeping a good balance with performance. |
Pages | 104-109 |
Keywords | approximate repetition of documents, document clusters, multi-feature fingerprint clusters |
Journal | 電腦學刊 (Journal of Computers) |
Issue | 201804 (Vol. 29, No. 2) |