Semi-supervised Turkish Text Categorization with Word2Vec, Doc2Vec and FastText Algorithms

Publisher

IEEE

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

This study compares the performance of the Word2Vec, Doc2Vec and FastText algorithms on a text categorization problem using a semi-supervised learning technique. The impact of several preprocessing techniques is also analyzed on a corpus of approximately 5 million Turkish news documents, both labeled and unlabeled. Naive Bayes, Support Vector Machines, Artificial Neural Networks, Decision Trees and Logistic Regression are used in the classification phase, and the results obtained are reported.
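
The semi-supervised setup described in the abstract can be illustrated with a minimal sketch: word representations are built from all available text, labeled or not, while the classifier sees only the labeled documents. This is not the authors' implementation. To stay runnable on the Python standard library alone, it swaps in plain co-occurrence counts for Word2Vec, Doc2Vec or FastText embeddings, and a nearest-centroid rule for the classifiers named in the abstract. The toy sentences, the labels `sports` and `economy`, and all function names are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical toy data; the real corpus holds about 5 million news documents.
unlabeled = [
    "takım maçı kazandı",
    "oyuncu gol attı",
    "borsa bugün yükseldi",
    "dolar ve borsa düştü",
]
labeled = [
    ("takım gol attı", "sports"),
    ("oyuncu maçı kazandı", "sports"),
    ("dolar yükseldi", "economy"),
    ("borsa düştü", "economy"),
]

def tokenize(text):
    return text.split()

def build_word_vectors(docs):
    """Count co-occurrences within each document (stand-in for learned embeddings)."""
    vocab = sorted({w for d in docs for w in tokenize(d)})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = {w: [0.0] * len(vocab) for w in vocab}
    for d in docs:
        tokens = tokenize(d)
        for w in tokens:
            for c in tokens:
                if c != w:
                    vectors[w][index[c]] += 1.0
    return vectors

def doc_vector(text, vectors, dim):
    """Average the vectors of known words; unknown words are skipped."""
    known = [vectors[w] for w in tokenize(text) if w in vectors]
    if not known:
        return [0.0] * dim
    return [sum(col) / len(known) for col in zip(*known)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def train_centroids(labeled_docs, vectors, dim):
    """Supervised step: one mean document vector per class, labeled data only."""
    grouped = defaultdict(list)
    for text, label in labeled_docs:
        grouped[label].append(doc_vector(text, vectors, dim))
    return {
        label: [sum(col) / len(vs) for col in zip(*vs)]
        for label, vs in grouped.items()
    }

def predict(text, centroids, vectors, dim):
    v = doc_vector(text, vectors, dim)
    return max(centroids, key=lambda label: cosine(v, centroids[label]))

# Unsupervised step: representations come from labeled and unlabeled text together.
all_text = unlabeled + [text for text, _ in labeled]
vectors = build_word_vectors(all_text)
dim = len(vectors)
centroids = train_centroids(labeled, vectors, dim)

print(predict("bugün", centroids, vectors, dim))
```

In this toy setup the word "bugün" occurs only in the unlabeled sentences, yet it is assigned to `economy`, because its representation was shaped by the unlabeled text it appears in. That is the benefit a semi-supervised approach aims for when labeled data is scarce.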

Keywords

word2vec, doc2vec, fasttext, text categorization

Source

2019 27th Signal Processing and Communications Applications Conference (SIU)
27th Signal Processing and Communications Applications Conference (SIU) -- APR 24-26, 2019 -- Sivas Cumhuriyet Univ, Sivas, TURKEY
