Table 1 HCLS core statistics [32] of evaluated datasets, vocabularies and extracted schemata

From: Enabling ad-hoc reuse of private data repositories through schema extraction

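| HCLS metric      |                   |                       | 6.6.1.1 |        6.6.1.2 |  6.6.1.3 |    6.6.1.4 | 6.6.1.5 | 6.6.1.6 |  6.6.1.7 |
| Number of unique |                   |                       | triples | typed entities | subjects | properties | objects | classes | literals |
|------------------|-------------------|-----------------------|---------|----------------|----------|------------|---------|---------|----------|
| LOV              | Vocabulary Corpus |                       |  833834 |         129827 |   171168 |       1209 |  145498 |    1469 |   180680 |
| PRs              | Dataset           |                       |  150000 |          20000 |    20000 |         20 |   10003 |       3 |    70717 |
|                  | Vocabulary        | schema.org (1)        |    8427 |           1617 |     1619 |         15 |     476 |      31 |     3193 |
|                  |                   | foaf (2)              |     631 |             84 |       86 |         15 |      38 |       9 |      154 |
|                  |                   | merge of (1), (2)     |    9058 |           1701 |     1705 |         23 |     508 |      38 |     3335 |
|                  | Schema            | directly instantiated |      47 |             23 |       23 |          3 |       5 |       2 |        0 |
|                  |                   | locally inferred      |     576 |             95 |       95 |         13 |      71 |       9 |      118 |
|                  |                   | LOV inferred          |    2345 |            208 |      208 |         87 |     379 |      16 |      850 |
| GenDR            | Dataset           |                       |   11609 |           1123 |     1123 |         24 |    1232 |      13 |     5158 |
|                  | Vocabulary        | GenDR Vocabulary      |     192 |             20 |       20 |          8 |       6 |       5 |      116 |
|                  | Schema            | directly instantiated |     361 |             37 |       37 |         10 |      16 |       7 |      105 |
|                  |                   | locally inferred      |     380 |             37 |       37 |         10 |      16 |       7 |      124 |
|                  |                   | LOV inferred          |     911 |             71 |       71 |         58 |     127 |      12 |      370 |
| Orphanet         | Dataset           |                       |  377947 |          28871 |    28871 |         38 |   42891 |      29 |   144773 |
|                  | Vocabulary        | Orphanet Vocabulary   |     402 |             40 |       40 |          9 |       7 |       5 |      239 |
|                  | Schema            | directly instantiated |     799 |             67 |       67 |         12 |      41 |       7 |      217 |
|                  |                   | locally inferred      |     840 |             68 |       68 |         12 |      41 |       7 |      256 |
|                  |                   | LOV inferred          |    1380 |            102 |      102 |         59 |     153 |      12 |      506 |
| Homologene       | Dataset           |                       | 7189742 |         869981 |   869981 |         14 | 1420471 |      10 |  2865019 |
|                  | Vocabulary        | Homologene Vocabulary |      62 |              7 |        7 |          8 |       6 |       5 |       38 |
|                  | Schema            | directly instantiated |     184 |             24 |       24 |         10 |      13 |       7 |       40 |
|                  |                   | locally inferred      |     190 |             24 |       24 |         10 |      13 |       7 |       46 |
|                  |                   | LOV inferred          |     721 |             58 |       58 |         58 |     124 |      12 |      292 |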