A semantic kernel for text classification based on iterative higher-order relations between words and documents

Publisher

Springer Verlag

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

We propose a semantic kernel for Support Vector Machines (SVM) that takes advantage of higher-order relations between words and between documents. The conventional approach in text categorization systems is to represent documents as a "Bag of Words" (BOW), in which the relations between words and their positions are lost. Additionally, traditional machine learning algorithms assume that instances, in our case documents, are independent and identically distributed. This assumption simplifies the underlying models, but it ignores the semantic connections between words as well as the semantic relations between documents that stem from those words. In this study, we improve the semantic knowledge capture capability of a previous method, the χ-Sim algorithm [1], and use it in the SVM as a semantic kernel. The proposed approach is evaluated on several benchmark textual datasets. Experimental results show that classification performance improves over the well-known traditional kernels used in the SVM, such as the linear kernel (one of the state-of-the-art algorithms for text classification), the polynomial kernel, and the Radial Basis Function (RBF) kernel.
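To illustrate what "iterative higher-order relations between words and documents" can mean in practice, the following is a minimal sketch of a χ-Sim-style co-similarity iteration in plain Python. It is not the authors' implementation: the function names, the row-sum normalization, the simultaneous update order, and the toy matrix `example_docs` are all assumptions made for illustration. The sketch only shows how two documents that share no words can still receive a non-zero similarity through an intermediate document.

```python
def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def identity(n):
    """Return the n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def normalize(sim, sums):
    """Divide entry (i, j) by sums[i] * sums[j] and fix the diagonal at 1.

    Assumption: this row-sum normalization is one simple choice,
    not necessarily the one used in the paper.
    """
    n = len(sim)
    out = identity(n)
    for i in range(n):
        for j in range(n):
            if i != j:
                denom = sums[i] * sums[j]
                out[i][j] = sim[i][j] / denom if denom else 0.0
    return out


def iterative_similarity(docs_by_terms, iterations=2):
    """Alternately refine document (sr) and term (sc) similarity matrices.

    Each pass computes document similarity from the previous term
    similarity and term similarity from the previous document similarity.
    """
    r = docs_by_terms
    rt = transpose(r)
    doc_sums = [sum(row) for row in r]
    term_sums = [sum(col) for col in rt]
    sr = identity(len(r))   # document-document similarity
    sc = identity(len(rt))  # term-term similarity
    for _ in range(iterations):
        new_sr = normalize(matmul(matmul(r, sc), rt), doc_sums)
        new_sc = normalize(matmul(matmul(rt, sr), r), term_sums)
        sr, sc = new_sr, new_sc
    return sr, sc


# Hypothetical toy corpus: rows are documents, columns are terms a, b, c, d.
# Documents 0 and 2 share no term, but both overlap with document 1.
example_docs = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
```

With one iteration, documents 0 and 2 have zero similarity, exactly as under a plain BOW dot product; with two iterations the link through document 1 gives them a small positive similarity. A matrix like `sr` could in principle be supplied to an SVM implementation that accepts a precomputed kernel, although this sketch does not check that the matrix is positive semidefinite.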

Description

Ganiz, Murat Can (Dogus Author) -- Conference full title: 13th International Conference on Artificial Intelligence and Soft Computing, ICAISC 2014; Zakopane; Poland; 1 June 2014 through 5 June 2014

Keywords

Higher-Order Paths, Machine Learning, Semantic Kernel, Support Vector Machine, Text Classification

Source

Lecture Notes in Computer Science

Volume

8467

Issue

1

Citation

Altınel, B., Ganiz, M. C., & Diri, B. (2014). A semantic kernel for text classification based on iterative higher-order relations between words and documents. In L. Rutkowski, M. Korytkowski, R. Scherer, R. Tadeusiewicz & L. A. Zadeh (Eds.), Lecture Notes in Computer Science (Volume 8467) (pp. 505-517). Cham: Springer. https://dx.doi.org/10.1007/978-3-319-07173-2_43
