Title | Information Retrieval Using the Reduced Row Echelon Form of a Term-Document Matrix |
---|---|
Parallel title | Information Retrieval Using the Reduced Row Echelon Form of a Term-Document Matrix |
Authors | Ufuk Parali, Metin Zontul, Duygu Celik Ertugrul |
Abstract | Retrieving information relevant to a user's query is becoming more difficult because of the large amount of information on the web. This study presents a new information retrieval (IR) algorithm, the reduced row echelon form IR method (rrefIR), which achieves higher average similarity precision than conventional IR algorithms and returns more relevant, noise-free documents. For dimension reduction, singular value decomposition (SVD) is applied to the reduced row echelon form of the covariance of the term-document matrix (TDM), obtained with the Gauss-Jordan method. The rrefIR algorithm outperforms the LSI and COV algorithms in terms of average similarity precision under the Jaro-Winkler, Overlap, Tanimoto and Jaccard similarity measures. The physical reason for the better IR performance is the set of linearly independent basis vectors obtained by the Gauss-Jordan operation. This basis set can be considered the generating roots of the vector space spanned by the TDM. Utilizing these vectors increases the latent semantic characteristics of the SVD phase of the proposed IR algorithm. |
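Pages | 1037-1046 |
Keywords | Information retrieval, Gauss-Jordan, SVD, Similarity measures |
Journal | Journal of Internet Technology (網際網路技術學刊) |
Issue | July 2019 (Vol. 20, No. 4) |
Publisher | Taiwan Academic Network Management Committee (台灣學術網路管理委員會) |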
DOI | |
Previous article in this journal | Novel Attack Tree Analysis Scheme to Assess the Security Risks on the Cloud Platform |
Next article in this journal | BitTorrent Locality-Awareness Application with Colorless ONUs in an Enhanced EPON System |
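
The abstract attributes the method's performance to the linearly independent basis vectors produced by Gauss-Jordan elimination. The following is a minimal sketch of that single step, not the authors' implementation: it reduces a small matrix to reduced row echelon form and keeps the nonzero rows as a basis. The function name `rref`, the toy matrix `tdm`, and the use of exact `Fraction` arithmetic are assumptions made for illustration. The paper applies the reduction to the covariance of the term-document matrix and follows it with SVD, neither of which is shown here.

```python
from fractions import Fraction


def rref(matrix):
    """Return (reduced row echelon form, pivot column indices) via Gauss-Jordan elimination."""
    # Exact rational arithmetic sidesteps floating-point pivot tolerance in this toy example.
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        if r == n_rows:
            break
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Scale the pivot row so the pivot entry becomes 1.
        p = rows[r][c]
        rows[r] = [x / p for x in rows[r]]
        # Eliminate column c from every other row (above and below the pivot).
        for i in range(n_rows):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return rows, pivots


# Hypothetical 3x3 count matrix; the third row is the sum of the first two,
# so only two rows are linearly independent.
tdm = [
    [1, 0, 2],
    [0, 1, 1],
    [1, 1, 3],
]
reduced, pivots = rref(tdm)

# The nonzero rows of the reduced form are a linearly independent basis of the row space.
basis = [row for row in reduced if any(x != 0 for x in row)]
print(len(basis), pivots)
```

In this toy case the dependent third row reduces to zeros, so the basis has two vectors. For matrices of realistic size, a floating-point routine with a pivot tolerance would be the more practical choice.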