
Table 5 Summary of the performance of each indicator

From: Identification of conclusive association entities in biomedical articles

| Indicator | Potential in CAE identification | Limitation in CAE identification |
| --- | --- | --- |
| TF | TF works well for entities with TF = 1 or TF ≥ 4: non-CAEs tend to have TF = 1, and few of them have TF ≥ 4. | Many CAEs and non-CAEs have TF values of 2 or 3. |
| IDF | IDF values of non-CAEs are spread across the whole IDF spectrum, while almost no CAEs have IDF values in the lower 30% of the spectrum, so IDF helps filter out non-CAEs with low IDF values. | Many CAEs and non-CAEs have IDF values falling in the middle part of the spectrum (i.e., between 35% and 65%). |
| CoOcc | None. | CAEs and non-CAEs tend to have similar CoOcc values. |
| AvgTF | CAEs tend to have AvgTF > 10% of the maximum AvgTF, while non-CAEs tend to have AvgTF ≤ 10% of the maximum. | None. |
| TITLE | Compared with non-CAEs, CAEs are more likely to appear in article titles. | Most CAEs do not appear in article titles. |
| AbstractX | None. | CAEs and non-CAEs have similar, roughly uniform distributions across positions in the abstract. |
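
The thresholds summarized in the table can be read as simple per-indicator rules. The sketch below is only an illustration of that reading, not the method used in the article: the names `EntityStats` and `indicator_signals` are hypothetical, and it assumes that `idf_position` is the entity's normalized position within the observed IDF spectrum (0 = lowest, 1 = highest), which the table does not specify. CoOcc and AbstractX are omitted because the table reports no discriminative potential for them.

```python
from dataclasses import dataclass


@dataclass
class EntityStats:
    """Hypothetical container for the indicator values of one entity."""
    tf: int               # term frequency of the entity in the article
    idf_position: float   # assumed: normalized position in the IDF spectrum, 0..1
    avg_tf: float         # AvgTF value of the entity
    in_title: bool        # whether the entity appears in the article title


def indicator_signals(entity: EntityStats, max_avg_tf: float) -> dict:
    """Map each indicator to 'CAE', 'non-CAE', or 'inconclusive'."""
    signals = {}

    # TF: TF = 1 leans non-CAE, TF >= 4 leans CAE, 2..3 is ambiguous.
    if entity.tf == 1:
        signals["TF"] = "non-CAE"
    elif entity.tf >= 4:
        signals["TF"] = "CAE"
    else:
        signals["TF"] = "inconclusive"

    # IDF: almost no CAEs fall in the lower 30% of the spectrum.
    if entity.idf_position < 0.30:
        signals["IDF"] = "non-CAE"
    else:
        signals["IDF"] = "inconclusive"

    # AvgTF: CAEs tend to exceed 10% of the maximum AvgTF.
    if entity.avg_tf > 0.10 * max_avg_tf:
        signals["AvgTF"] = "CAE"
    else:
        signals["AvgTF"] = "non-CAE"

    # TITLE: a title occurrence leans CAE; absence says little,
    # since most CAEs do not appear in titles.
    signals["TITLE"] = "CAE" if entity.in_title else "inconclusive"

    return signals


if __name__ == "__main__":
    example = EntityStats(tf=5, idf_position=0.8, avg_tf=3.0, in_title=False)
    print(indicator_signals(example, max_avg_tf=10.0))
```

Each rule returns a tendency rather than a decision, which mirrors the table: every indicator except AvgTF has either a listed limitation or no listed potential, so a single signal is weak evidence on its own.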