Semi-supervised Turkish text categorization with Word2Vec, Doc2Vec and FastText Algorithms [Word2Vec, Doc2Vec ve FastText Algoritmaları ile Yarı Öğreticili Türkçe Metin Sınıflandırma]
Date
Authors
Journal Title
Journal ISSN
Volume Title
Publisher
Institute of Electrical and Electronics Engineers Inc.
Access Rights
info:eu-repo/semantics/closedAccess
Abstract
In this study, the performance of the Word2Vec, Doc2Vec and FastText algorithms is compared on the text categorization problem using a semi-supervised learning technique. The impact of some preprocessing techniques is also analyzed on a corpus of approximately 5 million Turkish news documents, both labeled and unlabeled. Naive Bayes, Support Vector Machines, Artificial Neural Networks, Decision Trees and Logistic Regression are used in the classification phase, and the results obtained are reported. © 2019 IEEE.
Description
27th Signal Processing and Communications Applications Conference, SIU 2019 -- 24 April 2019 through 26 April 2019 -- 151073
Keywords
Doc2Vec, FastText, Text categorization, Word2Vec, Decision trees, Machine learning, Neural networks, Supervised learning, Support vector machines, Text processing, Classification algorithm, Logistic regressions, Preprocessing techniques, Semi-supervised learning techniques, Signal processing
Source
27th Signal Processing and Communications Applications Conference, SIU 2019












