Informatics Report Series


Report EDI-INF-RR-0121


Title: Rough Set-Aided Keyword Reduction for Text Categorisation
Authors: Alexios Chouchoulas; Qiang Shen
Date: May 2001
Publication Title: Applied Artificial Intelligence
Volume No: 15(9) Page Nos: 843-873
Abstract:
The volume of electronically stored information increases exponentially as the state of the art progresses. Automated Information Filtering (IF) and Information Retrieval (IR) systems are therefore acquiring rapidly increasing prominence. However, such systems sacrifice efficiency to boost effectiveness: they typically have to cope with sets of vectors of many tens of thousands of dimensions. Rough Set (RS) theory can be applied to reduce the dimensionality of data used in IF/IR tasks, by providing a measure of the information content of datasets with respect to a given classification. This can aid IF/IR systems that rely on the acquisition of large numbers of term weights or other measures of relevance. This paper investigates the applicability of RS theory to the IF/IR application domain and compares this applicability with respect to various existing Text Categorisation (TC) techniques. The ability of the approach to generalise given a minimum of training data is also addressed. The background of RS theory is presented, with an illustrative example to demonstrate the operation of the RS-based dimensionality reduction. A modular system is proposed that allows the integration of this technique with a large variety of different IF/IR approaches. The example application, categorisation of E-mail messages, is described. Systematic experiments and their results are reported and analysed.
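To make the idea of "a measure of the information content of datasets with respect to a given classification" concrete, the following is a minimal sketch of the standard rough-set dependency degree and a greedy attribute selection built on it. It is an illustration only: the abstract does not state which reduction algorithm the paper uses, and the function names (`dependency`, `greedy_reduct`), the toy keyword columns, and the class labels are all assumptions made for this example.

```python
from collections import defaultdict

def dependency(rows, labels, attrs):
    """Rough-set dependency degree of the class labels on `attrs`.

    Objects with identical values on `attrs` form an equivalence class.
    A class lies in the positive region if all its objects share one label.
    The result is the fraction of objects in the positive region.
    """
    if not rows:
        return 0.0
    seen_labels = defaultdict(set)   # attribute signature -> labels observed
    sizes = defaultdict(int)         # attribute signature -> number of objects
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        seen_labels[key].add(label)
        sizes[key] += 1
    consistent = sum(sizes[k] for k, ls in seen_labels.items() if len(ls) == 1)
    return consistent / len(rows)

def greedy_reduct(rows, labels):
    """Greedy forward selection (hypothetical helper, not taken from the paper).

    Repeatedly adds the attribute that raises the dependency degree most,
    until the subset is as informative as the full attribute set.
    """
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    chosen = []
    while dependency(rows, labels, chosen) < target:
        remaining = [a for a in all_attrs if a not in chosen]
        best = max(remaining, key=lambda a: dependency(rows, labels, chosen + [a]))
        chosen.append(best)
    return chosen

# Toy data (assumed): keyword presence in four e-mail messages.
# Columns: 0 = "meeting", 1 = "offer", 2 = "free", 3 = "the"
rows = [
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
labels = ["work", "work", "spam", "spam"]

if __name__ == "__main__":
    print(greedy_reduct(rows, labels))
```

In this toy table the keyword in column 3 appears in every message and so carries no information about the class, while column 0 alone separates the two classes. A greedy search of this kind is not guaranteed to find a smallest possible subset; it trades optimality for speed, which matters when the starting vocabulary has tens of thousands of terms.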
Copyright:
2002 by The University of Edinburgh. All Rights Reserved
Bibtex format
@Article{EDI-INF-RR-0121,
author = {Alexios Chouchoulas and Qiang Shen},
title = {Rough Set-Aided Keyword Reduction for Text Categorisation},
journal = {Applied Artificial Intelligence},
year = 2001,
month = {May},
volume = {15},
number = {9},
pages = {843--873},
}


Unless explicitly stated otherwise, all material is copyright The University of Edinburgh