Table 5 Summary of the performance of each indicator

From: Identification of conclusive association entities in biomedical articles

| Indicator | Potential in CAE identification | Limitation in CAE identification |
| --- | --- | --- |
| TF | TF works well for entities with TF = 1 or TF ≥ 4: non-CAEs tend to have TF = 1, and few of them have TF ≥ 4. | Many CAEs and non-CAEs have TF values of 2 or 3. |
| IDF | Non-CAEs are spread across the IDF spectrum, while almost no CAEs have IDF values in the lowest 30% of the spectrum, so IDF helps filter out non-CAEs with low IDF values. | Many CAEs and non-CAEs have IDF values in the middle of the spectrum (i.e., between 35% and 65%). |
| CoOcc | None. | CAEs and non-CAEs tend to have similar CoOcc values. |
| AvgTF | CAEs tend to have AvgTF > 10% of the maximum AvgTF, while non-CAEs tend to have AvgTF ≤ 10% of the maximum. | None. |
| TITLE | CAEs are more likely than non-CAEs to appear in article titles. | Most CAEs do not appear in article titles. |
| AbstractX | None. | CAEs and non-CAEs have fairly uniform and similar distributions across positions in the abstract. |
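
The thresholds in Table 5 can be read as per-indicator hints. The following is a minimal sketch of that reading, not the article's method: the names `EntityStats` and `indicator_hints` are hypothetical, and it assumes that "the lowest 30% of the IDF spectrum" means the lowest 30% of the range between the minimum and maximum observed IDF values. Indicators with no reported potential (CoOcc, AbstractX) are left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class EntityStats:
    """Hypothetical container for one entity's indicator values."""
    tf: int          # term frequency of the entity
    idf: float       # inverse document frequency of the entity
    avg_tf: float    # AvgTF value of the entity
    in_title: bool   # whether the entity appears in the article title


def indicator_hints(
    e: EntityStats, idf_min: float, idf_max: float, max_avg_tf: float
) -> Dict[str, Optional[str]]:
    """Map each indicator to 'CAE', 'non-CAE', or None (no clear hint)."""
    hints: Dict[str, Optional[str]] = {}

    # TF: non-CAEs tend to have TF = 1 and rarely TF >= 4; 2 and 3 are ambiguous.
    if e.tf == 1:
        hints["TF"] = "non-CAE"
    elif e.tf >= 4:
        hints["TF"] = "CAE"
    else:
        hints["TF"] = None

    # IDF: almost no CAEs fall in the lowest 30% of the spectrum.
    # Assumption: the spectrum is the [idf_min, idf_max] range.
    span = idf_max - idf_min
    position = (e.idf - idf_min) / span if span > 0 else 0.5
    hints["IDF"] = "non-CAE" if position < 0.30 else None

    # AvgTF: CAEs tend to exceed 10% of the maximum AvgTF.
    hints["AvgTF"] = "CAE" if e.avg_tf > 0.10 * max_avg_tf else "non-CAE"

    # TITLE: a title occurrence favours CAE; absence says little,
    # because most CAEs do not appear in titles.
    hints["TITLE"] = "CAE" if e.in_title else None

    return hints
```

For example, `indicator_hints(EntityStats(tf=2, idf=5.0, avg_tf=0.5, in_title=False), 0.0, 10.0, 1.0)` yields a hint only from AvgTF, which reflects the limitations column: TF and IDF are uninformative in their middle ranges. How the hints should be weighted or combined is not something Table 5 specifies.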