Table 1 HCLS core statistics [32] of evaluated datasets, vocabularies and extracted schemata

From: Enabling ad-hoc reuse of private data repositories through schema extraction

 

| | | HCLS Metric: number of unique | 6.6.1.1 triples | 6.6.1.2 typed entities | 6.6.1.3 subjects | 6.6.1.4 properties | 6.6.1.5 objects | 6.6.1.6 classes | 6.6.1.7 literals |
|---|---|---|---|---|---|---|---|---|---|
| LOV | Vocabulary Corpus | | 833834 | 129827 | 171168 | 1209 | 145498 | 1469 | 180680 |
| PRs | Dataset | | 150000 | 20000 | 20000 | 20 | 10003 | 3 | 70717 |
| | Vocabulary | schema.org (1) | 8427 | 1617 | 1619 | 15 | 476 | 31 | 3193 |
| | | foaf (2) | 631 | 84 | 86 | 15 | 38 | 9 | 154 |
| | | merge of (1), (2) | 9058 | 1701 | 1705 | 23 | 508 | 38 | 3335 |
| | Schema | directly instantiated | 47 | 23 | 23 | 3 | 5 | 2 | 0 |
| | | locally inferred | 576 | 95 | 95 | 13 | 71 | 9 | 118 |
| | | LOV inferred | 2345 | 208 | 208 | 87 | 379 | 16 | 850 |
| GenDR | Dataset | | 11609 | 1123 | 1123 | 24 | 1232 | 13 | 5158 |
| | Vocabulary | GenDR Vocabulary | 192 | 20 | 20 | 8 | 6 | 5 | 116 |
| | Schema | directly instantiated | 361 | 37 | 37 | 10 | 16 | 7 | 105 |
| | | locally inferred | 380 | 37 | 37 | 10 | 16 | 7 | 124 |
| | | LOV inferred | 911 | 71 | 71 | 58 | 127 | 12 | 370 |
| Orphanet | Dataset | | 377947 | 28871 | 28871 | 38 | 42891 | 29 | 144773 |
| | Vocabulary | Orphanet Vocabulary | 402 | 40 | 40 | 9 | 7 | 5 | 239 |
| | Schema | directly instantiated | 799 | 67 | 67 | 12 | 41 | 7 | 217 |
| | | locally inferred | 840 | 68 | 68 | 12 | 41 | 7 | 256 |
| | | LOV inferred | 1380 | 102 | 102 | 59 | 153 | 12 | 506 |
| Homologene | Dataset | | 7189742 | 869981 | 869981 | 14 | 1420471 | 10 | 2865019 |
| | Vocabulary | Homologene Vocabulary | 62 | 7 | 7 | 8 | 6 | 5 | 38 |
| | Schema | directly instantiated | 184 | 24 | 24 | 10 | 13 | 7 | 40 |
| | | locally inferred | 190 | 24 | 24 | 10 | 13 | 7 | 46 |
| | | LOV inferred | 721 | 58 | 58 | 58 | 124 | 12 | 292 |
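
As a rough illustration of what the seven metrics in Table 1 count, the following minimal sketch computes them for a small in-memory set of triples. It is not the tooling used for the table. The `Literal` wrapper, the `core_statistics` function and the example.org data are hypothetical, and the metric definitions are an assumed reading of the column names: "typed entities" are taken to be subjects that have an rdf:type statement, "classes" the objects of rdf:type statements, and "objects" only non-literal objects.

```python
from dataclasses import dataclass

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class Literal:
    """Hypothetical marker type that distinguishes literals from IRIs."""
    value: str


def core_statistics(triples):
    """Count the seven metrics over an iterable of (s, p, o) tuples.

    Assumed definitions (not confirmed by the source):
    duplicates are dropped before counting, and literal objects are
    counted separately from non-literal objects.
    """
    triples = set(triples)  # unique triples only
    return {
        "triples": len(triples),
        "typed entities": len({s for s, p, o in triples if p == RDF_TYPE}),
        "subjects": len({s for s, p, o in triples}),
        "properties": len({p for s, p, o in triples}),
        "objects": len({o for s, p, o in triples
                        if not isinstance(o, Literal)}),
        "classes": len({o for s, p, o in triples if p == RDF_TYPE}),
        "literals": len({o for s, p, o in triples
                         if isinstance(o, Literal)}),
    }


# Made-up example data, for illustration only.
EX = "http://example.org/"
example_triples = [
    (EX + "alice", RDF_TYPE, EX + "Person"),
    (EX + "alice", EX + "name", Literal("Alice")),
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "bob", RDF_TYPE, EX + "Person"),
    (EX + "bob", EX + "name", Literal("Bob")),
]

stats = core_statistics(example_triples)
for metric, count in stats.items():
    print(f"{metric}: {count}")
```

For the five example triples this counts 2 subjects, 3 properties, 2 non-literal objects (the class and `bob`), 1 class and 2 literals. On real data the same counts would more typically be obtained with SPARQL `COUNT(DISTINCT ...)` queries against the store holding the dataset.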