From: De-identifying free text of Japanese electronic health records
Dataset name | MedNLP | Dummy-EHRs | Pathology Reports |
---|---|---|---|
# of documents | 50 reports | 32 pairs of records and summaries | 1000 reports |
# of sentences | 2244 | 8183 | 3012 |
# of tokens | 42,621 | 154,132 | 194,449 |
# of all tags | 490 | 3017 | 295 |
# of age tags | 56 | 39 | 0 |
# of hospital tags | 75 | 170 | 31 |
# of person tags | 0 | 135 | 224 |
# of sex tags | 4 | 16 | 0 |
# of time tags | 355 | 2657 | 40 |
Example in original Japanese text | 工場に勤めている<a>64歳</a>の<x>男性</x>。 | 施設入所中で寝たきりの<a>86歳</a> | <<院外標本 <h>静大皮フ科クリニック</h>、<p>桑田 智</p> |
Example translated into English | A <a>64-year-old</a> <x>man</x> works in a factory | An <a>86-year-old</a> <x>woman</x> bedridden in a nursing home. Total assistance required | <<Ex-hospital sample <h>Shizudai Dermatology Clinic</h>, <p>Satoshi Kuwata</p> |
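
The inline annotation in the example rows can be tallied mechanically, which is one way counts like those in the table could be reproduced from tagged text. The following is a minimal Python sketch, not the authors' tooling. It assumes tags are written without internal spaces (`<a>…</a>`) and that the single-letter tags seen in the examples (`a`, `x`, `h`, `p`) mark age, sex, hospital, and person spans, a mapping inferred from the examples rather than stated in the table. The names `TAG_SPAN`, `count_tags`, and `mask_spans` are hypothetical.

```python
import re
from collections import Counter

# Matches one well-formed inline span such as <a>64歳</a>.
# The backreference \1 requires the closing tag to match the opening tag.
# Assumption: tags are single lowercase letters with no internal spaces.
TAG_SPAN = re.compile(r"<([a-z])>(.*?)</\1>")


def count_tags(text: str) -> Counter:
    """Count annotated spans per tag letter in one inline-tagged string."""
    return Counter(match.group(1) for match in TAG_SPAN.finditer(text))


def mask_spans(text: str, mask: str = "*") -> str:
    """Replace the content of every annotated span with a mask string,
    keeping the tags so the span type stays visible."""
    return TAG_SPAN.sub(
        lambda match: f"<{match.group(1)}>{mask}</{match.group(1)}>", text
    )


if __name__ == "__main__":
    example = "A <a>64-year-old</a> <x>man</x> works in a factory"
    print(count_tags(example))
    print(mask_spans(example))
```

Summing `count_tags` over every sentence of a corpus would give per-tag totals comparable to the rows above, provided the corpus uses the same tag letters. Text such as the leading `<<` in the pathology example is left untouched, because it does not form an opening tag.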