Improve offensive language detection with ensemble classifiers

dc.contributor.author: Ekinci E.
dc.contributor.author: Omurca S.I.
dc.contributor.author: Sevim S.
dc.date.accessioned: 2021-06-14T20:25:10Z
dc.date.available: 2021-06-14T20:25:10Z
dc.date.issued: 2020
dc.department: [0-To be determined] [en_US]
dc.description.abstract: Easy content sharing on social media has become an important means of communication in today's world. However, alongside the conveniences it provides, problems have emerged because content sharing is not bounded by predefined rules. Consequently, offensive language has become a major problem for both social media platforms and their users. This article aims to detect offensive language in short text messages on Twitter. Short texts pose difficulties because they do not contain sufficient statistical information. To cope with these difficulties, semantic word expansion based on concept and word-embedding vectors is proposed. For the classification task, a decision tree and decision-tree-based ensemble classifiers such as Adaptive Boosting, Bootstrap Aggregating, Random Forest, Extremely Randomized Trees, and Extreme Gradient Boosting are used. The imbalanced dataset problem is addressed by oversampling. Experiments on the datasets show that extremely randomized trees taking word-embedding vectors as input are the most successful, with an F-score of 85.66%. © 2020, Ismail Saritas. All rights reserved. [en_US]
dc.identifier.doi: 10.18201/ijisae.2020261592
dc.identifier.endpage: 115 [en_US]
dc.identifier.issn: 2147-6799
dc.identifier.issue: 2 [en_US]
dc.identifier.scopus: 2-s2.0-85091535148 [en_US]
dc.identifier.scopusquality: Q3 [en_US]
dc.identifier.startpage: 109 [en_US]
dc.identifier.trdizinid: 371464 [en_US]
dc.identifier.uri: https://doi.org/10.18201/ijisae.2020261592
dc.identifier.uri: https://hdl.handle.net/11376/3707
dc.identifier.volume: 8 [en_US]
dc.indekslendigikaynak: Scopus [en_US]
dc.indekslendigikaynak: TR-Dizin [en_US]
dc.institutionauthor: [0-To be determined]
dc.language.iso: en [en_US]
dc.publisher: Ismail Saritas [en_US]
dc.relation.ispartof: International Journal of Intelligent Systems and Applications in Engineering [en_US]
dc.relation.publicationcategory: Article - International Refereed Journal - Institutional Faculty Member [en_US]
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.subject: BabelNet; Ensemble classifiers; Offensive language; Short text classification; Twitter; Word2Vec [en_US]
dc.title: Improve offensive language detection with ensemble classifiers [en_US]
dc.type: Article [en_US]
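
The abstract mentions two steps that are easy to illustrate in isolation: balancing an imbalanced dataset by oversampling, and scoring predictions with an F-score. The sketch below is only an illustration under stated assumptions. The abstract does not say which oversampling method or which F-score variant the authors used, so random duplication of minority-class samples and the binary F1 score are assumed here. The function names `random_oversample` and `f_score` and the toy data are hypothetical and do not come from the article.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    is as large as the largest class (assumed method, not the article's)."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw extra copies with replacement to reach the target size.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def f_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): label 1 = offensive, label 0 = not offensive.
tweets = ["t1", "t2", "t3", "t4", "t5", "t6"]
labels = [0, 0, 0, 0, 1, 1]

bal_x, bal_y = random_oversample(tweets, labels)
class_counts = Counter(bal_y)

# Hypothetical true labels and predictions, used only to exercise f_score.
score = f_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(class_counts, round(score, 4))
```

In practice, oversampling is applied to the training split only, so that duplicated samples do not leak into the evaluation data.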
