From: Large scale biomedical texts classification: a kNN and an ESA-based approaches
Feature | Description | Information gain |
---|---|---|
Feature 1 | Number of neighbours to which the label is assigned | 0.16 |
Feature 2 | Sum of the similarity scores between the target document and every neighbouring document to which the label is assigned | 0.17 |
Feature 3 | Check whether all constituent tokens of the label appear in the target document | 0.01 |
Feature 4 | Check whether at least one of the label's entries appears in the target document | 0.03 |
Feature 5 | Frequency of the label in the target document, if the label occurs there | 0.03 |
Feature 6 | Check if the label is contained in the document title | 0.02 |
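
The six features in the table can be computed for one candidate label roughly as follows. This is a minimal sketch rather than the authors' implementation: the names `Neighbour`, `tokenize` and `label_features` are hypothetical, tokenization is assumed to be lowercase whitespace splitting, the label's "entries" are assumed to be a list of synonym strings, and Feature 5 is assumed to count exact occurrences of the label string in the title plus body.

```python
from dataclasses import dataclass


@dataclass
class Neighbour:
    """One retrieved neighbour (hypothetical structure)."""
    labels: set        # labels assigned to the neighbouring document
    similarity: float  # similarity score between neighbour and target


def tokenize(text):
    # Assumption: lowercase whitespace tokenization.
    return text.lower().split()


def label_features(label, entries, title, body, neighbours):
    """Return [Feature 1, ..., Feature 6] for one candidate label."""
    text = f"{title} {body}".lower()
    doc_tokens = set(tokenize(text))
    # Neighbours to which the candidate label is assigned.
    voters = [n for n in neighbours if label in n.labels]

    f1 = len(voters)                                          # Feature 1
    f2 = sum(n.similarity for n in voters)                    # Feature 2
    f3 = int(all(t in doc_tokens for t in tokenize(label)))   # Feature 3
    f4 = int(any(e.lower() in text for e in entries))         # Feature 4
    f5 = text.count(label.lower())                            # Feature 5
    f6 = int(label.lower() in title.lower())                  # Feature 6
    return [f1, f2, f3, f4, f5, f6]


# Illustrative input only; the documents and scores are made up.
neighbours = [
    Neighbour({"Lung Neoplasms", "Humans"}, 0.8),
    Neighbour({"Humans"}, 0.5),
    Neighbour({"Lung Neoplasms"}, 0.4),
]
feats = label_features(
    label="Lung Neoplasms",
    entries=["lung cancer", "pulmonary neoplasms"],
    title="Screening for lung cancer",
    body="Lung neoplasms are common. Early detection of lung neoplasms helps.",
    neighbours=neighbours,
)
```

The resulting vector would then be passed, one per candidate label, to whatever ranking or classification model is used. The information gain column suggests that the two neighbour-based features (Features 1 and 2) carry most of the signal.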