Table 7 Analysis of three best runs on the MEDLINE data

From: Thematic clustering of text documents using an EM-based approach

Dataset                 Number of clusters    p-value
Parkinson's Disease     46.0                  2.56E-10
Huntington's Disease    21.5                  4.11E-11
  1. For each MEDLINE dataset, clustering was performed 500 times and the best run was selected. For that run, the number of clusters and the average p-value of the 10 strongest MeSH terms in each cluster were recorded. This procedure was repeated three times, and the averages of the resulting values are given in this table.
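
The protocol described in the note can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: `RunResult`, `summarize_best_run`, `table_entry`, and the `run_once` callback are hypothetical names, the "best" run is assumed to be the one with the highest score, and the p-values of the strongest terms are assumed to be pooled across clusters before averaging. The source specifies neither the selection criterion nor the exact averaging order.

```python
from statistics import mean
from typing import Callable, NamedTuple, Sequence


class RunResult(NamedTuple):
    """Outcome of one clustering run (hypothetical structure)."""
    score: float  # run quality; higher is ASSUMED to be better
    term_p_values: Sequence[Sequence[float]]  # per cluster: p-values of its MeSH terms


def summarize_best_run(run: RunResult, top_k: int = 10) -> tuple[int, float]:
    """Return (number of clusters, mean p-value of the top_k strongest terms)."""
    # The "strongest" terms are taken to be those with the smallest p-values.
    strongest = [sorted(ps)[:top_k] for ps in run.term_p_values]
    # ASSUMPTION: pool the selected p-values over all clusters, then average.
    pooled = [p for cluster in strongest for p in cluster]
    return len(run.term_p_values), mean(pooled)


def table_entry(
    run_once: Callable[[], RunResult],
    n_runs: int = 500,
    n_repeats: int = 3,
    top_k: int = 10,
) -> tuple[float, float]:
    """Average the best-run summaries over n_repeats batches of n_runs runs."""
    summaries = []
    for _ in range(n_repeats):
        # Keep only the best of n_runs independent clustering runs.
        best = max((run_once() for _ in range(n_runs)), key=lambda r: r.score)
        summaries.append(summarize_best_run(best, top_k))
    avg_clusters = mean(n for n, _ in summaries)
    avg_p_value = mean(p for _, p in summaries)
    return avg_clusters, avg_p_value
```

Here `run_once` stands in for a single clustering run on one dataset. Averaging over three best runs would explain how a non-integer cluster count such as 21.5 can appear in the table.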