From: Exploiting document graphs for inter sentence relation extraction
| Information | Configuration | Parameters | # Params |
|---|---|---|---|
| Dependency embeddings | Dependency type | LUT \(\mathbf{W}^{e}_{typ}\) of size 72×150 | 10800 |
| | Dependency direction | LUT \(\mathbf{W}^{e}_{dir}\) of size 2×150 | 300 |
| Token embeddings | FastText embeds | Pre-trained 300-dim vectors | − |
| | Character embeddings | LUT \(\mathbf{W}^{e}_{c}\) of size 85×50 | 4250 |
| | | biLSTM with 50 units | 40400 |
| | POS tag | LUT \(\mathbf{W}^{e}_{t}\) of size 57×50 | 2850 |
| | WordNet embeds | Fixed sparse 45-dim vectors | − |
| Augmented information | Base distance embeds | 32-dim vector | 32 |
| | Self-attention score | \(\mathbf{W}_{e}, b_{e}\) transform from 832 dim to a scalar | 833 |
| | Heuristic attention | Linear | − |
| | Kernel filters | 100 filters of size 832×1 | 83300 |
| Shared-weight CNN | | 128 filters for each region size (1, 2, 3) | 2056320 |
| Classifier | Fully-connected MLP | Not used | − |
| | Softmax | 2 classes | 768 |
| Total number of parameters | | | 2199853 |
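The per-component counts in the table follow from standard formulas: a lookup table (LUT) contributes vocab × dim weights, and a biLSTM contributes two directions of four gates each with input, recurrent, and bias weights. A minimal sketch verifying the arithmetic (the CNN and softmax counts are taken as reported, since their exact input dimensions are not stated in the table):

```python
def lut_params(vocab, dim):
    # Embedding lookup table: one dim-sized vector per vocabulary entry.
    return vocab * dim

def bilstm_params(input_dim, hidden):
    # One LSTM direction: 4 gates, each with (input_dim + hidden) weights
    # plus a bias per hidden unit; doubled for forward + backward.
    one_direction = 4 * (hidden * (input_dim + hidden) + hidden)
    return 2 * one_direction

counts = {
    "dep_type":    lut_params(72, 150),   # 10800
    "dep_dir":     lut_params(2, 150),    # 300
    "char_lut":    lut_params(85, 50),    # 4250
    "char_bilstm": bilstm_params(50, 50), # 40400
    "pos_lut":     lut_params(57, 50),    # 2850
    "distance":    32,                    # 32-dim base distance embedding
    "self_attn":   832 + 1,               # W_e (832 weights) + scalar bias b_e
    "kernels":     100 * 832 + 100,       # 100 filters of size 832×1, plus biases
    "cnn":         2056320,               # shared-weight CNN, as reported
    "softmax":     768,                   # 2-class softmax layer, as reported
}

total = sum(counts.values())
print(total)  # 2199853, matching the table's total
```

Every closed-form entry (LUTs, biLSTM, attention, kernel filters) reproduces the table's value exactly, and the grand total matches 2199853.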