Title | A Study on Text Classification: Term Weighting Algorithm Analysis |
---|---|
Parallel title | A Study on Text Classification: Term Weighting Algorithm Analysis |
Authors | Kuan-Hua Tseng, Chun-Hung Richard Lin, Jain-Shing Liu, Chih-Ming Andrew Huang, Yue-Han Wang |
Abstract | With advances in digital recording and storage technology and the rapid growth of the World Wide Web, people now write and record in digital text rather than on paper. To enable more text applications, text classification has been gaining attention. To achieve automatic text classification through machine learning, five related technologies are often discussed in the literature: pre-processing, feature extraction, feature selection, term weighting, and the classification algorithm. In this paper, we explore the impact of term weighting on text classification. Term weighting is an important part of text classification: the calculated weight should directly reflect the importance of a term in the entire text so that machine learning can achieve the best classification result. We applied several common term weighting methods to several pre-defined datasets and conducted experiments. Contrary to the intuition that a weight's value represents how important a term is, the results show that a term may not actually be as important as its high weight suggests. |
Pages | 311-325 |
Keywords | Text classification, Term weighting, Supervised term weighting |
Journal | 網際網路技術學刊 (Journal of Internet Technology) |
Issue | 202103 (Vol. 22, No. 2) |
Publisher | 台灣學術網路管理委員會 (Taiwan Academic Network Management Committee) |
DOI |
|