From: Identification of conclusive association entities in biomedical articles
Indicator | Potential in CAE identification | Limitation in CAE identification |
---|---|---|
TF | TF works well for entities with TF = 1 or TF ≥ 4, as non-CAEs tend to have TF = 1 and few of them have TF ≥ 4. | Many CAEs and non-CAEs have TF values of 2 or 3. |
IDF | IDF values of non-CAEs are spread across the IDF spectrum, while almost no CAEs have IDF values in the lower 30% of the spectrum, making IDF helpful for filtering out non-CAEs with lower IDF values. | Many CAEs and non-CAEs have IDF values that fall in the middle part of the spectrum (i.e., between 35% and 65%). |
CoOcc | None. | CAEs and non-CAEs tend to have similar CoOcc values. |
AvgTF | CAEs tend to have AvgTF > 10% of the maximum AvgTF, while non-CAEs tend to have AvgTF ≤ 10% of the maximum. | None. |
TITLE | When compared with non-CAEs, CAEs are more likely to appear in titles of articles. | Most CAEs do not appear in the titles of articles. |
AbstractX | None. | CAEs and non-CAEs are distributed fairly uniformly, and similarly to each other, across positions in the abstract. |
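
The thresholds in the table can be read as simple per-indicator hints. The sketch below is only an illustration of that reading, not the method used in the article: the `Entity` class, the `indicator_hints` function, and the hint labels are hypothetical names, and it assumes that "the lower 30% part" of the IDF spectrum means the lower 30% of the range between the minimum and maximum observed IDF values. CoOcc and AbstractX are left out because the table lists no potential for them.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical container for the indicator values of one entity."""
    name: str
    tf: int         # TF of the entity
    idf: float      # IDF of the entity
    avg_tf: float   # AvgTF of the entity
    in_title: bool  # TITLE indicator: does the entity appear in the title?


def indicator_hints(entity, idf_min, idf_max, avg_tf_max):
    """Return a hint per indicator: 'CAE', 'non-CAE', or 'unclear'.

    The cut-offs follow the table; how the hints are combined into a
    final decision is deliberately left open.
    """
    hints = {}

    # TF: informative at TF = 1 or TF >= 4, ambiguous at 2 or 3.
    if entity.tf == 1:
        hints["TF"] = "non-CAE"
    elif entity.tf >= 4:
        hints["TF"] = "CAE"
    else:
        hints["TF"] = "unclear"

    # IDF: almost no CAEs lie in the lower 30% of the spectrum.
    # Assumption: the spectrum is the min-max range of observed IDF values.
    span = idf_max - idf_min
    position = (entity.idf - idf_min) / span if span > 0 else 0.5
    hints["IDF"] = "non-CAE" if position < 0.30 else "unclear"

    # AvgTF: CAEs tend to exceed 10% of the maximum AvgTF.
    hints["AvgTF"] = "CAE" if entity.avg_tf > 0.10 * avg_tf_max else "non-CAE"

    # TITLE: a title occurrence points towards a CAE, but most CAEs
    # are absent from titles, so absence says little.
    hints["TITLE"] = "CAE" if entity.in_title else "unclear"

    return hints
```

A call such as `indicator_hints(Entity("geneA", 5, 8.0, 0.5, False), idf_min=2.0, idf_max=10.0, avg_tf_max=2.0)` returns one hint per indicator. Keeping the hints separate, rather than collapsing them into a single label, reflects the table's point that each indicator is reliable only in part of its value range.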