Table 2 Token-specific orthographic features extracted by regular expressions

From: Ambiguity and variability of database and software names in bioinformatics

| Name | Description |
| --- | --- |
| isAcronym | token is an acronym |
| containsAllCaps | all the letters in the token are capitalised |
| isCapitalised | token is capitalised |
| containsCapLetter | token contains at least one capital letter |
| containsDigits | token contains at least one digit |
| isAllDigits | token is made up of digits only |
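
The table names the features but does not give the regular expressions behind them. As a rough illustration only, the following Python sketch shows one way such features could be computed. Every pattern here is an assumption, as are the names `FEATURE_PATTERNS` and `orthographic_features`. In particular, "acronym" is taken to mean two or more capital letters, each optionally followed by a full stop, and "capitalised" is taken to mean that the first character is a capital letter. The patterns cover ASCII letters and digits only.

```python
import re

# Hypothetical patterns: the source lists the feature names and descriptions
# only, so each expression below is an illustrative guess, not the authors' own.
FEATURE_PATTERNS = {
    # Assumed: two or more capitals, each optionally followed by a full stop.
    "isAcronym": re.compile(r"\A(?:[A-Z]\.?){2,}\Z"),
    # At least one capital letter and no lowercase letters anywhere.
    "containsAllCaps": re.compile(r"\A[^a-z]*[A-Z][^a-z]*\Z"),
    # Assumed: the first character is a capital letter.
    "isCapitalised": re.compile(r"\A[A-Z]"),
    "containsCapLetter": re.compile(r"[A-Z]"),
    "containsDigits": re.compile(r"[0-9]"),
    "isAllDigits": re.compile(r"\A[0-9]+\Z"),
}


def orthographic_features(token):
    """Return a dict mapping each feature name to True or False for one token."""
    return {
        name: pattern.search(token) is not None
        for name, pattern in FEATURE_PATTERNS.items()
    }


if __name__ == "__main__":
    for token in ("BLAST", "ClustalW2", "2010"):
        print(token, orthographic_features(token))
```

Under these assumed definitions, a token such as "BLAST" sets isAcronym, containsAllCaps, isCapitalised and containsCapLetter, whereas "ClustalW2" sets only isCapitalised, containsCapLetter and containsDigits.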