Table 2 The data sets used in our experiments

From: A cascade of classifiers for extracting medication information from discharge summaries

| Data Sets | # of Summaries | # of Entries | # of Fields | # of Name | # of Dose | # of Freq | # of Mode | # of Duration | # of Reason |
|---|---|---|---|---|---|---|---|---|---|
| Training set | 110 | 5970 (54.3) | 14886 (135.3) | 5684 (51.7) | 2929 (26.6) | 2740 (24.9) | 2146 (19.5) | 302 (2.7) | 1085 (9.9) |
| Dev set | 35 | 2401 (68.6) | 5988 (171.1) | 2302 (65.8) | 1163 (33.2) | 1096 (31.3) | 880 (25.1) | 111 (3.2) | 436 (12.5) |
| Test set | 251 | 8936 (35.6) | 22041 (87.8) | 8495 (33.8) | 4387 (17.5) | 3999 (15.9) | 3307 (13.2) | 511 (2.0) | 1342 (5.3) |
  1. The numbers in parentheses are the average numbers of entries or fields per discharge summary.
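
The parenthesized averages can be reproduced from the raw counts. The following is a minimal sketch, not part of the original article: the dictionary layout and the helper name `per_summary` are illustrative assumptions, and the counts are copied from the table. It divides each count by the number of discharge summaries, and it also checks that the six per-field counts add up to the "# of Fields" column, which holds for the figures as printed.

```python
# Counts copied from Table 2. The structure and names are illustrative only.
FIELD_NAMES = ["name", "dose", "freq", "mode", "duration", "reason"]

DATA = {
    "training": {"summaries": 110, "entries": 5970, "fields": 14886,
                 "per_field": [5684, 2929, 2740, 2146, 302, 1085]},
    "dev": {"summaries": 35, "entries": 2401, "fields": 5988,
            "per_field": [2302, 1163, 1096, 880, 111, 436]},
    "test": {"summaries": 251, "entries": 8936, "fields": 22041,
             "per_field": [8495, 4387, 3999, 3307, 511, 1342]},
}


def per_summary(count, summaries):
    """Average count per discharge summary, rounded to one decimal place."""
    return round(count / summaries, 1)


for set_name, row in DATA.items():
    n = row["summaries"]
    # The per-field counts should add up to the total number of fields.
    assert sum(row["per_field"]) == row["fields"]
    averages = {f: per_summary(c, n)
                for f, c in zip(FIELD_NAMES, row["per_field"])}
    print(set_name,
          "entries/summary:", per_summary(row["entries"], n),
          "fields/summary:", per_summary(row["fields"], n),
          averages)
```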