Nikunj C. Oza's Publications

Multi-label ASRS Dataset Classification Using Semi Supervised Subspace Clustering

Multi-label ASRS Dataset Classification Using Semi Supervised Subspace Clustering. Mohammad Salim Ahmed, Latifur Khan, Nikunj C. Oza, and Mandava Rajeswari. In Conference on Intelligent Data Understanding (CIDU-2010), October 2010.

Abstract

There has been a lot of research targeting text classification. Many of them focus on a particular characteristic of text data - multi-labelity. This arises due to the fact that a document may be associated with multiple classes at the same time. The consequence of such a characteristic is the low performance of traditional binary or multi-class classification techniques on multi-label text data. In this paper, we propose a text classification technique that considers this characteristic and provides very good performance. Our multi-label text classification approach is an extension of our previously formulated [3] multi-class text classification approach called SISC (Semi-supervised Impurity based Subspace Clustering). We call this new classification model as SISC-ML (SISC Multi-Label). Empirical evaluation on real world multi-label NASA ASRS (Aviation Safety Reporting System) data set reveals that our approach outperforms state-of-the-art text classification as well as subspace clustering algorithms.

BibTeX Entry

@inproceedings{ahkh10,
	author = {Mohammad Salim Ahmed and Latifur Khan and Nikunj C. Oza and Mandava Rajeswari},
	title = {Multi-label ASRS Dataset Classification Using Semi Supervised Subspace Clustering},
	booktitle = {Conference on Intelligent Data Understanding (CIDU-2010)},
	month = {October},
	abstract = {There has been a lot of research targeting text classification. Many of them focus
on a particular characteristic of text data - multi-labelity. This arises due to the fact that a
document may be associated with multiple classes at the same time. The consequence of such a
characteristic is the low performance of traditional binary or multi-class classification techniques on
multi-label text data. In this paper, we propose a text classification technique that considers this
characteristic and provides very good performance. Our multi-label text classification approach
is an extension of our previously formulated [3] multi-class text classification approach called
SISC (Semi-supervised Impurity based Subspace Clustering). We call this new classification model
as SISC-ML (SISC Multi-Label). Empirical evaluation on real world multi-label NASA ASRS
(Aviation Safety Reporting System) data set reveals that our approach outperforms
state-of-the-art text classification as well as subspace clustering algorithms.},
	bib2html_pubtype = {Refereed Conference},
	bib2html_rescat = {Text Mining},
	year = {2010}
}

Generated by bib2html.pl (written by Patrick Riley) on Sun Mar 20, 2011 23:51:43