From: A framework for ontology-based question answering with application to parasite immunology
Orthographic feature | Regular expression |
---|---|
HASDASH | .*-.* |
INITDASH | -.* |
ENDDASH | .*- |
INITCAPS | [A-Z].* |
INITCAPSALPHA | [A-Z][a-z].* |
REALNUMBERS | [-0-9]+[.,]+[0-9.,]+ |
NATURALNUMBER | [0-9]+ |
ALLCAPS | [A-Z]+ |
CAPSMIX | [A-Za-z]+ |
DIGIT | .*[0-9].* |
SINGLEDIGIT | [0-9] |
DOUBLEDIGIT | [0-9][0-9] |
GENEPATT | .*[tbglmjfrnix0-9]+[.][0-9]+.* |
DNASEQUENCE | [ACTG]+ |
HASROMAN | .*\b[IVXDLCM]+\b.* |
ROMAN | [IVXDLCM]+ |
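
As a minimal sketch of how the features in this table might be computed, the Python snippet below tests a token against each pattern. It assumes that a pattern must match the entire token, which the leading and trailing `.*` in rows such as HASDASH suggest but the table does not state. The names `ORTHOGRAPHIC_PATTERNS` and `orthographic_features` are hypothetical, and HASROMAN is written here with single-backslash word boundaries and a trailing `.*`, which is an assumption about the intended pattern.

```python
import re

# Feature name -> pattern, transcribed from the table above.
# HASROMAN uses \b word boundaries and a trailing .* (assumed intent).
ORTHOGRAPHIC_PATTERNS = {
    "HASDASH": r".*-.*",
    "INITDASH": r"-.*",
    "ENDDASH": r".*-",
    "INITCAPS": r"[A-Z].*",
    "INITCAPSALPHA": r"[A-Z][a-z].*",
    "REALNUMBERS": r"[-0-9]+[.,]+[0-9.,]+",
    "NATURALNUMBER": r"[0-9]+",
    "ALLCAPS": r"[A-Z]+",
    "CAPSMIX": r"[A-Za-z]+",
    "DIGIT": r".*[0-9].*",
    "SINGLEDIGIT": r"[0-9]",
    "DOUBLEDIGIT": r"[0-9][0-9]",
    "GENEPATT": r".*[tbglmjfrnix0-9]+[.][0-9]+.*",
    "DNASEQUENCE": r"[ACTG]+",
    "HASROMAN": r".*\b[IVXDLCM]+\b.*",
    "ROMAN": r"[IVXDLCM]+",
}

# Compile once so repeated lookups over many tokens stay cheap.
_COMPILED = {name: re.compile(p) for name, p in ORTHOGRAPHIC_PATTERNS.items()}


def orthographic_features(token):
    """Return, in table order, the features whose pattern matches the whole token."""
    return [name for name, rx in _COMPILED.items() if rx.fullmatch(token)]


if __name__ == "__main__":
    for tok in ["42", "XIV", "ACTG"]:
        print(tok, orthographic_features(tok))
```

Several features can fire for the same token. Under these assumptions, `XIV` matches both ROMAN and ALLCAPS, so a downstream classifier would see the features as overlapping binary indicators rather than mutually exclusive classes.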