Table 3. 10-fold cross-validation performance for different feature set combinations. Feature sets: (a) word n-grams; (b) POS tags; (c) clusters. F: F-1 score; P: precision; R: recall. For categories with no metric indicated, the F-1 score is reported.

From: Optimization on machine learning based approaches for sentiment analysis on HPV vaccines related tweets

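| Feature sets | Metric | (a) | (a) + (b) | (a) + (c) | (a) + (b) + (c) |
|---|---|---|---|---|---|
| Micro-averaging | F | 0.7208 | 0.7263 | 0.7255 | 0.73 |
| Macro-averaging | P | 0.5402 | 0.5438 | 0.5396 | 0.5477 |
| Macro-averaging | R | 0.4386 | 0.4468 | 0.4442 | 0.4576 |
| Macro-averaging | F | 0.4841 | 0.4905 | 0.4872 | 0.4986 |
| Unrelated | | 0.8599 | 0.864 | 0.859 | 0.8618 |
| Neutral | | 0.6181 | 0.6226 | 0.625 | 0.6231 |
| Positive | | 0.7021 | 0.7098 | 0.7123 | 0.7136 |
| NegSafety | | 0.7277 | 0.734 | 0.7357 | 0.7542 |
| NegEfficacy | | 0.2593 | 0.3214 | 0.2593 | 0.3793 |
| NegCost | | 0 | 0 | 0 | 0 |
| NegResistant | | 0 | 0 | 0 | 0 |
| NegOthers | | 0.4645 | 0.4614 | 0.4724 | 0.4753 |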
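
As a reading aid for the macro-averaging rows of Table 3, the following minimal sketch recomputes F-1 as the harmonic mean of the reported macro-averaged precision and recall and compares it with the reported macro F. The function names `f1` and `check`, the dictionary `macro_rows`, and the tolerance are illustrative assumptions; the numbers are copied from the table. That the reported macro F matches this formula is an observation from the figures, not a statement of how the authors computed it.

```python
# Illustrative sketch: names and tolerance are assumptions, values come from Table 3.

def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# feature set -> (macro P, macro R, reported macro F)
macro_rows = {
    "(a)": (0.5402, 0.4386, 0.4841),
    "(a) + (b)": (0.5438, 0.4468, 0.4905),
    "(a) + (c)": (0.5396, 0.4442, 0.4872),
    "(a) + (b) + (c)": (0.5477, 0.4576, 0.4986),
}

def check(rows, tol=2e-4):
    """Return, per feature set, whether the recomputed F is within tol of the reported F."""
    return {
        name: abs(f1(p, r) - reported) <= tol
        for name, (p, r, reported) in rows.items()
    }

if __name__ == "__main__":
    for name, (p, r, reported) in macro_rows.items():
        print(f"{name}: recomputed F = {f1(p, r):.4f}, reported F = {reported}")
```

The tolerance allows for the table values being rounded to four decimal places before the recomputation.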