Journal of Biomedical Semantics

Table 2 The data sets used in our experiments

From: A cascade of classifiers for extracting medication information from discharge summaries

Data Sets	# of Summaries	# of Entries	# of Fields	# of Name	# of Dose	# of Freq	# of Mode	# of Duration	# of Reason
Training set	110	5970 (54.3)	14886 (135.3)	5684 (51.7)	2929 (26.6)	2740 (24.9)	2146 (19.5)	302 (2.7)	1085 (9.9)
Dev set	35	2401 (68.6)	5988 (171.1)	2302 (65.8)	1163 (33.2)	1096 (31.3)	880 (25.1)	111 (3.2)	436 (12.5)
Test set	251	8936 (35.6)	22041 (87.8)	8495 (33.8)	4387 (17.5)	3999 (15.9)	3307 (13.2)	511 (2.0)	1342 (5.3)

The numbers in parentheses are per-summary averages: each count divided by the number of discharge summaries in that data set.
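
As a quick check of how the parenthesized values relate to the raw counts, the following minimal sketch recomputes the training-set averages from the table. The names `per_summary`, `TRAINING_SUMMARIES`, and `TRAINING_COUNTS` are illustrative and do not come from the article. The sketch assumes the averages are rounded to one decimal place, and it relies on the observation that the six field-type counts in each row add up to the "# of Fields" column.

```python
# Counts copied from the "Training set" row of Table 2.
TRAINING_SUMMARIES = 110
TRAINING_ENTRIES = 5970
TRAINING_COUNTS = {
    "Name": 5684,
    "Dose": 2929,
    "Freq": 2740,
    "Mode": 2146,
    "Duration": 302,
    "Reason": 1085,
}


def per_summary(count, n_summaries):
    """Average count per discharge summary, rounded to one decimal place."""
    return round(count / n_summaries, 1)


# The six field-type counts sum to the "# of Fields" column.
total_fields = sum(TRAINING_COUNTS.values())

print(per_summary(TRAINING_ENTRIES, TRAINING_SUMMARIES))  # 54.3
print(total_fields, per_summary(total_fields, TRAINING_SUMMARIES))  # 14886 135.3
```

The same function applied to the Dev set and Test set rows (35 and 251 summaries) reproduces the other parenthesized values in the table.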

ISSN: 2041-1480