Table 2 Data sets used in evaluation

From: FlexiTerm: a flexible term recognition method

Data set  Size (KB)  Documents  Sentences  Tokens  Distinct tokens  Distinct stems
1         145        100        906        24,096  3,430            2,720
2         150        100        949        26,174  3,837            3,049
3         169        100        1,949      40,461  4,404            3,422
4         300        100        3,022      55,845  5,402            4,504
5         73         100        960        13,093  946              824

Quantitative description of the corpora.