
Table 1 Importance of each feature for the prediction according to the Information Gain measure

From: Large scale biomedical texts classification: a kNN and an ESA-based approaches

Feature | Description | Information gain
Feature 1 | Number of neighbours to which the label is assigned | 0.16
Feature 2 | Sum of the similarity scores between the target document and all neighbour documents in which the label appears | 0.17
Feature 3 | Whether all constituent tokens of the label appear in the target document | 0.01
Feature 4 | Whether one of the label's entries appears in the target document | 0.03
Feature 5 | Frequency of the label in the target document, if the document contains it | 0.03
Feature 6 | Whether the label appears in the document title | 0.02
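
As a rough illustration of how such features might be computed for one candidate label, here is a minimal Python sketch. It is not the authors' implementation: the function names, the `(similarity, labels)` neighbour layout, the regex tokenisation, and the choice to count the title as part of the document are all assumptions made for the example. Feature 4 is omitted because it depends on how label entries are stored.

```python
import re
from typing import List, Set, Tuple

# Hypothetical layout: (similarity to the target document, labels of the neighbour).
Neighbour = Tuple[float, Set[str]]


def knn_label_features(label: str, neighbours: List[Neighbour]) -> Tuple[int, float]:
    """Return (Feature 1, Feature 2) for one candidate label."""
    # Similarity scores of the neighbours that carry the label.
    hits = [sim for sim, labels in neighbours if label in labels]
    return len(hits), sum(hits)


def text_label_features(label: str, title: str, body: str) -> Tuple[bool, int, bool]:
    """Return (Feature 3, Feature 5, Feature 6) for one candidate label.

    Assumption: the "document" is the title plus the body, compared case-insensitively.
    """
    text = (title + " " + body).lower()
    doc_tokens = set(re.findall(r"\w+", text))
    label_tokens = re.findall(r"\w+", label.lower())

    all_tokens_present = all(tok in doc_tokens for tok in label_tokens)  # Feature 3
    label_frequency = text.count(label.lower())                          # Feature 5
    label_in_title = label.lower() in title.lower()                      # Feature 6
    return all_tokens_present, label_frequency, label_in_title


neighbours = [(0.5, {"Humans", "Mice"}), (0.25, {"Humans"}), (0.125, {"Rats"})]
print(knn_label_features("Humans", neighbours))
print(text_label_features(
    "Breast Neoplasms",
    "Breast neoplasms in mice",
    "We study breast neoplasms and other neoplasms of the breast.",
))
```

The two neighbour-based values correspond to the features with the highest information gain in the table (0.16 and 0.17); the text-based checks correspond to the lower-scoring ones.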