Nikunj C. Oza's Publications

Classification of Aeronautics System Health and Safety Documents

Classification of Aeronautics System Health and Safety Documents. Nikunj C. Oza, J. Patrick Castle, and John Stutz. IEEE Transactions on Systems, Man, and Cybernetics, Part C, 39(6):670–680, 2009.

Abstract

Most complex aerospace systems have many text reports on safety, maintenance, and associated issues. The Aviation Safety Reporting System (ASRS) spans several decades and contains over 700 000 reports. The Aviation Safety Action Plan (ASAP) contains over 12 000 reports from various airlines. Problem categorizations have been developed for both ASRS and ASAP to enable identification of system problems. However, repository volume and complexity make human analysis difficult. Multiple experts are needed, and they often disagree on classifications. Even the same person has classified the same document differently at different times due to evolving experiences. Consistent classification is necessary to support tracking trends in problem categories over time. A decision support system that performs consistent document classification quickly and over large repositories would be useful. We discuss the results of two algorithms we have developed to classify ASRS and ASAP documents. The first is Mariana, a support vector machine (SVM) whose hyperparameters are optimized by simulated annealing. The second method is classification built on top of nonnegative matrix factorization (NMF), which attempts to find a model that represents document features that add up in various combinations to form documents. We tested both methods on ASRS and ASAP documents, with the latter categorized in two different ways. We illustrate the potential of NMF to provide document features that are interpretable and indicative of topics. We also briefly discuss the tool into which we have incorporated Mariana so that human experts can provide feedback on the document categorizations.
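As a rough illustration of the hyperparameter search the abstract attributes to Mariana, the following is a minimal sketch of simulated annealing over two hyperparameters. It is not the authors' implementation: the function names, the cooling schedule, the Gaussian proposal step, and the objective are all assumptions made for this example. In particular, toy_validation_error is a hypothetical stand-in for a cross-validated SVM error measured over (log10 C, log10 gamma); a real system would train and score a classifier at that point.

```python
import math
import random


def anneal(objective, start, steps=2000, t0=1.0, cooling=0.995, scale=0.5, seed=0):
    """Minimize objective(params) by simulated annealing.

    Returns (best_params, best_value). All defaults are illustrative choices.
    """
    rng = random.Random(seed)
    current = list(start)
    current_val = objective(current)
    best, best_val = list(current), current_val
    temp = t0
    for _ in range(steps):
        # Propose a neighbour by adding Gaussian noise to every coordinate.
        candidate = [x + rng.gauss(0.0, scale) for x in current]
        cand_val = objective(candidate)
        delta = cand_val - current_val
        # Always accept improvements; accept a worse move with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_val = candidate, cand_val
            if current_val < best_val:
                best, best_val = list(current), current_val
        temp *= cooling  # geometric cooling schedule (an assumption)
    return best, best_val


def toy_validation_error(params):
    """Hypothetical stand-in for cross-validated SVM error.

    Treats params as (log10 C, log10 gamma) and has its minimum at (1, -2).
    """
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


best_params, best_error = anneal(toy_validation_error, start=[0.0, 0.0])
```

Searching in log space is a common convention for SVM hyperparameters because useful values of C and gamma typically span several orders of magnitude; whether Mariana does the same is not stated in the abstract.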

BibTeX Entry

@article{ozca09,
	author = {Nikunj C. Oza and J. Patrick Castle and John Stutz},
	title = {Classification of Aeronautics System Health and Safety Documents},
	journal = {IEEE Transactions on Systems, Man, and Cybernetics, Part C},
	volume = {39},
	number = {6},
	pages = {670--680},
	abstract = {Most complex aerospace systems have many text reports on safety, maintenance, and associated issues. The Aviation Safety Reporting System (ASRS) spans several decades and contains over 700 000 reports. The Aviation Safety Action Plan (ASAP) contains over 12 000 reports from various airlines. Problem categorizations have been developed for both ASRS and ASAP to enable identification of system problems. However, repository volume and complexity make human analysis difficult. Multiple experts are needed, and they often disagree on classifications. Even the same person has classified the same document differently at different times due to evolving experiences. Consistent classification is necessary to support tracking trends in problem categories over time. A decision support system that performs consistent document classification quickly and over large repositories would be useful. We discuss the results of two algorithms we have developed to classify ASRS and ASAP documents. The first is Mariana---a support vector machine (SVM) with simulated annealing, which is used to optimize hyperparameters for the model. The second method is classification built on top of nonnegative matrix factorization (NMF), which attempts to find a model that represents document features that add up in various combinations to form documents. We tested both methods on ASRS and ASAP documents with the latter categorized two different ways. We illustrate the potential of NMF to provide document features that are interpretable and indicative of topics. We also briefly discuss the tool that we have incorporated Mariana into in order to allow human experts to provide feedback on the document categorizations.},
	bib2html_pubtype = {Journal Article},
	bib2html_rescat = {Text Mining},
	year = {2009}
}
