Active learning for ontological event extraction incorporating named entity recognition and unknown word handling

Table 2 Proposed algorithm of active learning with TEES

Input: labeled document pool \(L\), unlabeled document pool \(U\), batch size \(b\)
// Initialization
\(ER_0\) = the set of events/relations annotated on \(L\)
Learn a TEES model \(M_0\) from \(ER_0\)
\(i = 0\) // the index of the current round
// Active learning loop
while \(U\) is not empty:
    \(i\) += 1
    for each document \(D_{ij}\) in \(U\):
        Document informativity score \(I(D_{ij}) = 0\)
        for each sentence \(S_k\) in \(D_{ij}\):
            Apply \(M_{i-1}\) to \(S_k\) and collect the resultant events/relations set \(ER_{S_k}\)
            for each event/relation \(er\) s.t. \(er \in ER_{S_k}\):
                \(I(D_{ij})\) += informativity score \(I(S_k, er)\)
        \(I(D_{ij}) = I(D_{ij}) / \mathrm{sizeOf}(D_{ij})\)
    Rank the documents \(D_{ij}\) in \(U\) by \(I(D_{ij})\) and select the top \(b\) documents, designated as \(B\)
    Remove \(B\) from \(U\), add \(B\) to \(L\), and add the annotations on \(B\) to \(ER_{i-1}\), designated as \(ER_i\)
    Learn a new model \(M_i\) from \(ER_i\)
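
The loop in Table 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `train`, `informativity`, and `annotate` are hypothetical callables standing in for TEES training, the informativity score \(I(S_k, er)\), and the human annotator, and sizeOf is assumed here to be the number of sentences in a document, which the table does not specify.

```python
from typing import Callable, List, Sequence, Tuple

Sentence = str
Document = Tuple[Sentence, ...]           # a document is a tuple of sentences
Event = str                               # placeholder for an event/relation
Model = Callable[[Sentence], List[Event]]  # a model maps a sentence to predictions


def document_score(doc: Document, model: Model,
                   informativity: Callable[[Sentence, Event], float]) -> float:
    """Sum I(S_k, er) over predicted events, normalized by document size.

    Assumption: sizeOf(D) is taken to be the number of sentences.
    """
    if not doc:
        return 0.0
    total = 0.0
    for sentence in doc:
        for er in model(sentence):        # er ranges over ER_{S_k}
            total += informativity(sentence, er)
    return total / len(doc)


def active_learning(labeled: Sequence[Document],
                    unlabeled: Sequence[Document],
                    batch_size: int,
                    train: Callable[[List[Event]], Model],
                    informativity: Callable[[Sentence, Event], float],
                    annotate: Callable[[Document], List[Event]]):
    """Run the loop until the unlabeled pool is empty.

    Returns the final model and the list of batches selected in each round.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    pool = list(unlabeled)
    # ER_0: annotations on the initially labeled documents (hypothetical oracle).
    annotations = [a for doc in labeled for a in annotate(doc)]
    model = train(annotations)            # M_0
    rounds = []
    while pool:
        scores = [document_score(d, model, informativity) for d in pool]
        # Rank documents by score, highest first, and take the top b.
        ranked = sorted(range(len(pool)), key=lambda j: scores[j], reverse=True)
        chosen = set(ranked[:batch_size])
        batch = [pool[j] for j in ranked[:batch_size]]
        # Remove B from U and add its annotations to ER_{i-1}, giving ER_i.
        pool = [d for j, d in enumerate(pool) if j not in chosen]
        for d in batch:
            annotations.extend(annotate(d))
        model = train(annotations)        # M_i
        rounds.append(batch)
    return model, rounds
```

Because each round retrains the model before the remaining documents are rescored, a document that looked informative in one round may rank lower in the next once similar material has been annotated.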

ISSN: 2041-1480