
Defining health data elements under the HL7 development framework for metadata management



Health data from different specialties or domains generally have diverse formats and meanings, which can cause semantic communication barriers when these data are exchanged among heterogeneous systems. This study therefore aims to develop a national health concept data model (HCDM) and a corresponding system to facilitate healthcare data standardization and centralized metadata management.


Based on 55 data sets (4640 data items) from 7 health business domains in China, a bottom-up approach was employed to build the structure and metadata for HCDM by referencing HL7 RIM. According to ISO/IEC 11179, a top-down approach was used to develop and standardize the data elements.


HCDM adopted a three-level architecture of class, attribute and data type, and consisted of 6 classes and 15 sub-classes. Each class had a set of descriptive attributes and every attribute was assigned a data type. 100 initial data elements (DEs) were extracted from HCDM and 144 general DEs were derived from the corresponding initial DEs. Domain DEs were obtained by specializing general DEs using 12 controlled vocabularies, which were developed from HL7 vocabularies and actual health demands. A model-based system was successfully established to evaluate and manage the national health data dictionary (NHDD).


HCDM provided a unified metadata reference for multi-source data standardization and management. This approach of defining health data elements was a feasible solution in healthcare information standardization to enable healthcare interoperability in China.


Accurate and comprehensive information structures are key to biomedical and healthcare information exchange. To realize information sharing, there must be a standardized method to represent the information. Novel patterns developed for this representation make semantic information sharing a reality. Ontology is the most popular method and provides the basis for the information model classes [1, 2]. Information models that express the relationships among classes can provide an accurate context for expressing data semantics [3, 4]. The Health Level Seven International (HL7) standards have become universal for the exchange, integration, sharing and retrieval of health information [5,6,7]. The HL7 Development Framework (HDF) is a framework of modelling and administrative processes and deliverables used by HL7 to produce specifications that the healthcare information management community uses to overcome challenges and barriers to interoperability among computerized healthcare-related information systems [8,9,10,11]. HL7 version 3 (v3) is based on the HDF methodology and generates messages and electronic documents for clinical information exchange [12,13,14,15]. The HL7 Reference Information Model (RIM), which is the core of HL7 v3, covers all aspects of healthcare information and is compatible with existing data standards and knowledge models, so it can serve as the foundation for information integration across platforms and systems [9, 16, 17]. RIM defines a series of classes and subclasses, attributes, data types and value domains related to medical activities; furthermore, RIM provides a clear, common context and semantics that all standards and norms can cohere with [6, 18]. RIM was introduced to China and released as a national standard in 2013 [19].
There have been ongoing efforts in RIM modelling and application, most of which focus on ontological engineering of RIM [20,21,22], clinical data interoperability [23,24,25,26], domain knowledge representation [27,28,29,30], database development [31], and knowledge and data integration [29, 32], while few studies seek to implement and validate RIM for data collection and management on the countrywide level.

The Chinese Health Standards Commission developed and issued a health data element dictionary in 2011 as a national health data standard [33]. The dictionary gathers data elements (DEs) recorded and collected in various domains of the health sector. DEs are described through six properties: data element identifier, name, definition, permitted values, data type and format [34, 35]. However, some DEs are mutually inclusive, intersect or overlap because they usually come from different business collection forms (e.g. chronic disease management, planned immunization, women’s healthcare), so the consistency and comparability needed for data exchange and sharing cannot be guaranteed [36]. Moreover, as health service demands and information technologies develop further, more DEs will be created by different fields, projects and organizations. This unbounded growth of DEs poses a challenge for their centralized management and standardization.

Healthcare data management is a domain in which many solutions have been proposed and knowledge has accumulated through years of research. Many efforts to facilitate semantic interoperability of information have already been made. HL7 Fast Healthcare Interoperability Resources (FHIR) takes a modular approach and represents atomic/granular healthcare data (e.g., heart rate, procedure, medication, allergies) as independent modular entities. The main advantage of FHIR is that it is easier to implement, as it uses an API-based approach and offers a choice of JSON, XML or RDF for representing the data [37, 38]. The IHE Data Element Exchange (DEX) profile proposed a metadata registry to search and retrieve metadata definitions, and flexible mapping between clinical research and patient care data elements [39]. The ISO/IEC 11179 model provides a standard metadata model for the representation of data elements and a methodology for registering the descriptions of data elements in metadata registries through this standard model [40].

Although these standards provide a good foundation for enabling semantic interoperability of healthcare data, we continue to use the methodology of HL7 v3 when building the NHDD for three main reasons. Firstly, HL7 v3 adopts a series of information models and graphical modeling methods to ensure standard coding and implementation, and enables semantic interoperability through defined terms and data types. Secondly, RIM is the core of HL7 v3 and is highly abstract. It is an internationally shared information model and is also the root of all information models and structures in the v3 development process. Lastly, and most importantly, HL7 RIM has been adopted in China as a national standard and is now widely used in the construction of many Chinese medical information systems. To avoid large changes and maintain consistency with the existing series of standards and applications in China, we continue to use the methodology of HL7 v3 and customize the metadata.

In view of international experiences and general applicability of HL7 methodology in healthcare fields, this study is intended to develop a Health Concept Data Model (HCDM) and National Health Data Dictionary (NHDD) based on HL7 RIM and HDF methodology, and then to develop a model-based information system for convenient metadata collection and management, with the aim to facilitate healthcare information standardization and healthcare interoperability in China.

Implementation and result

HCDM structure and definition

The HCDM adopted the three-level architecture of HL7 RIM: class, attribute and data type. A Class describes an aspect of the health and care business with its significant characteristics through its Attributes and its relationships to other Classes. An Attribute describes a property of a Class and provides common data definitions for classes. A Data type defines the allowable values of attributes and what these values “mean”.
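The three levels can be pictured as nested records. The following is a minimal sketch, not the HCDM standard itself: the Python class names (`DataType`, `Attribute`, `HcdmClass`) and the two-class fragment (an Entity with a name, a Person with an address) are illustrative assumptions chosen to show how attributes bind to data types and are inherited from a parent class.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class DataType:
    """Third level: the allowable values of an attribute and their meaning."""
    code: str          # e.g. "AD", "EN", "BL"
    description: str


@dataclass(frozen=True)
class Attribute:
    """Second level: a property of a class, bound to exactly one data type."""
    name: str
    data_type: DataType


@dataclass
class HcdmClass:
    """First level: an aspect of the health and care business."""
    name: str
    attributes: list[Attribute] = field(default_factory=list)
    parent: Optional["HcdmClass"] = None

    def all_attributes(self) -> list[Attribute]:
        """Own attributes plus those inherited from parent classes."""
        inherited = self.parent.all_attributes() if self.parent else []
        return inherited + self.attributes


# Illustrative fragment only; not the published HCDM content.
AD = DataType("AD", "postal address")
EN = DataType("EN", "entity name")

entity = HcdmClass("Entity", [Attribute("name", EN)])
person = HcdmClass("Person", [Attribute("address", AD)], parent=entity)

names = [(a.name, a.data_type.code) for a in person.all_attributes()]
print(names)
```

Keeping the data type on the attribute, rather than on the class, mirrors the statement that every attribute is assigned a data type.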

HCDM metadata and comparison with HL7 RIM

HCDM was mainly derived from HL7 RIM and adapted to the needs of the national health system (Table 1). Firstly, six classes and their attributes were taken directly from HL7 RIM. Then 4640 data items from 55 data sets of the national health system were classified into these six classes of HCDM, using the Chinese text classification toolkit THUCTC released by the Natural Language Processing Laboratory of Tsinghua University [41, 42]. Lastly, sub-classes and attributes of the HL7 classes were adjusted and optimized according to the actual classification results.

Table 1 55 data sets and 7 health business domains


HCDM shares its backbone of six major classes with HL7 RIM: Entity, Role, RoleLink, Participation, Act and ActRelationship. In HCDM, Entity represents the physical things and beings that are of interest to, and take part in, health care. Role establishes the roles that entities play as they participate in health care acts. RoleLink represents relationships between individual roles. Participation expresses the context for an act in terms such as who performed it, for whom it was done, where it was done, etc. Act represents the actions that are executed and must be documented as health care is managed and provided. ActRelationship represents the binding of one act to another, such as the relationship between an order for an observation and the observation event as it occurs.

Based on the classification results, HCDM removed 11 subclasses (Entity-living subject, Role-patient, Role-LicensedEntity, Role-Access, Participation-ManagedParticipation, Act-Observation-diagnosticImage, Act-Supply-Diet, Act-Account, Act-ControlAct, Act-Device Task and Act-Working list) from RIM and added one subclass (Act-Exposure), because currently no data is essentially attributed to the removed subclasses (e.g., Act-ControlAct, Act-Device Task, Role-patient, Role-LicensedEntity). The added subclass (Act-Exposure), which is not listed separately in RIM, is currently indispensable for health data management. The classes RoleLink and ActRelationship have no subclasses in either HCDM or RIM. As a result, HCDM has 14 subclasses (secondary classes) and 1 tertiary class, while RIM has 21 subclasses and 5 tertiary classes (Table 2).

Table 2 Class comparison and reasons for differences between HCDM and HL7 RIM


Attributes of classes in HL7 RIM were also adjusted and trimmed according to the data classifications, and some attributes of classes and subclasses were added or removed in HCDM. For example, administrative division code (used for identifying national administrative districts) and housing type code (used for differentiating family housing types) were added, and RiskCode in class “Entity” was removed because the collected data sets contain no risk information about entities. Eventually, compared with HL7 RIM, 8 attributes that meet current needs of different health fields were added in HCDM: person-nationality code, person-household type code, organization-administrative division code, organization-level code, organization-type code, employee-family income per capita, financial transaction-payer code and financial transaction-way of payment code. The comparison of the attributes of class “Entity” between HCDM and RIM is shown in Table 3.

Table 3 Attributes of class Entity between HCDM and HL7 RIM

Data type

The data types of the metadata follow the Data Types Specification (R2) [43] used by HL7 RIM, with some adjustments. HL7 v3 data types are purely semantic, and both the hierarchical structure and the data types of attributes sit at a relatively high level of abstraction. In HCDM, the abstraction principle is to use the lower (more specific) rather than the higher (more general) level under the same conditions, in order to facilitate the formal expression of DEs. As a result, HCDM has 16 data types: II, ED, BL, INT, PQ, Real, MO, URLST, TS, AD, EN, CS, CV, CE, CD and ANY. Some data types are so fundamental that they have no distinguishable semantic components (e.g. BL). Composite data types contain additional data types that are referenced as components or subcomponents (e.g. PQ: value and unit). The data type ANY is avoided where possible because it does not specify how an attribute is expressed. The data type of the same attribute can also differ between HCDM and HL7 RIM.

In total, HCDM was developed with 6 classes, 15 sub-classes, 100 attributes and 100 data types. Its framework is expressed in the Unified Modelling Language and shown in Fig. 1, and it was issued as a Chinese health industry standard in May 2020 [44].

Fig. 1 Framework of HCDM. HCDM has 6 classes, 15 sub-classes, 100 attributes and 100 data types. Each class has several attributes and data types to represent its semantics. The green rectangles represent parent classes and the blue ones represent sub-classes. Hollow arrows represent the inheritance relationship from parent class to child class

Data elements derived from HCDM and their description

Data elements were derived by constraining metadata (Class, Attribute and Data type) in HCDM and described according to the ISO/IEC 11179 metamodel, which defines how a data element can be classified and semantically described, named, identified, stored, retrieved and managed [45, 46]. In the ISO/IEC 11179 metamodel, a data element comprises two parts: a Data Element Concept and a Value Domain. A Data Element Concept joins an Object Class (like a person) with its Property (like sex) [47]. The Value Domain is the set of permissible values for one or more data elements. The ISO/IEC 11179 metamodel maps to HCDM as follows: the Object Class in the ISO/IEC 11179 metamodel corresponds to the Class in HCDM, the Property of the Object Class corresponds to the Attribute of the Class, and the data type of the Value Domain corresponds to the Data Type of the attribute in HCDM (Table 4).
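The mapping can be made concrete with a small sketch. It is an illustration under stated assumptions rather than the registry schema used in this work: the Python type names and the function `data_element_from_hcdm` are hypothetical, the data type "CE" for a person's sex is assumed for the example, and the permissible values are deliberately left empty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementConcept:
    object_class: str      # corresponds to an HCDM Class
    property_name: str     # corresponds to an HCDM Attribute


@dataclass(frozen=True)
class ValueDomain:
    data_type: str                            # corresponds to an HCDM Data Type
    permissible_values: tuple[str, ...] = ()  # left empty in this sketch


@dataclass(frozen=True)
class DataElement:
    concept: DataElementConcept
    value_domain: ValueDomain


def data_element_from_hcdm(hcdm_class: str, attribute: str, data_type: str) -> DataElement:
    """Apply the mapping: Class -> Object Class, Attribute -> Property,
    Data Type -> data type of the Value Domain."""
    return DataElement(
        concept=DataElementConcept(object_class=hcdm_class, property_name=attribute),
        value_domain=ValueDomain(data_type=data_type),
    )


de = data_element_from_hcdm("person", "sex", "CE")  # "CE" is an assumed data type
print(de.concept.object_class, de.concept.property_name, de.value_domain.data_type)
```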

Table 4 Mapping relationship between ISO/IEC 11179 metamodel and HCDM

Based on the HCDM, the national health data dictionary (NHDD), which includes three types of DEs (initial DE, general DE and domain DE), was developed and was also issued as a Chinese health industry standard in May 2020 [48]. Initial DEs were formed by combining classes, attributes and data types in HCDM. General DEs were generated by decomposing the semantic components of the data types of initial DEs. Domain DEs were defined by constraining general DEs through terms in controlled vocabularies.

Initial data elements

100 initial DEs were extracted from HCDM and represented through data types (foundation, basic and quantities). The initial DEs serve as a bridge between the HCDM and general DEs, so they have no corresponding specification of their semantic expression. As shown in Fig. 2, the initial DE person’s address is formed by constraining the Class (DE: Object class) “person”, the Attribute (DE: Property) “address” of person and the Data type (DE: data type) “AD”.
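As a rough illustration of this extraction, the sketch below emits one initial DE per (class, attribute, data type) triple. Only the person/address/AD triple comes from the text; the `hcdm_fragment` dictionary layout, the organization/name/EN entry and the helper names are assumptions made for the example.

```python
from typing import NamedTuple


class InitialDE(NamedTuple):
    object_class: str   # HCDM Class (ISO/IEC 11179 Object Class)
    attribute: str      # HCDM Attribute (ISO/IEC 11179 Property)
    data_type: str      # HCDM Data Type

    def label(self) -> str:
        return f"{self.object_class}'s {self.attribute}"


# Hypothetical HCDM fragment: class -> {attribute: data type}.
# Only person/address/AD is taken from the text.
hcdm_fragment = {
    "person": {"address": "AD"},
    "organization": {"name": "EN"},
}


def extract_initial_des(model: dict) -> list:
    """Emit one initial DE per (class, attribute, data type) triple."""
    return [
        InitialDE(cls, attr, dtype)
        for cls, attrs in model.items()
        for attr, dtype in attrs.items()
    ]


initial_des = extract_initial_des(hcdm_fragment)
for de in initial_des:
    print(de.label(), "->", de.data_type)
```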

Fig. 2 Abstract process of initial data elements. The left side indicates the initial data elements abstract process, and the right side shows an example for initial data element person’s address, which is formed by constraining the object class “person”, the attribute “address” of person and data type “AD” in the Health Concept Data Model

General data element

General DEs are independent of any specific domain context so that they can be maintained at a higher level. 144 general DEs were developed from initial DEs. The mapping from the ISO/IEC 11179 metamodel to the HCDM was the same as for the derivation of initial DEs, but the data types of general DEs were developed by further specializing the data types of initial DEs. Based on the data types of initial DEs, we unfolded the components of the HCDM data types. Each general DE was then formed by combining an initial DE with one unfolded component of its data type.
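The unfolding step explains why there are more general DEs than initial DEs: a composite data type contributes one general DE per component. A minimal sketch follows, assuming a simple lookup table of components. Only the PQ components (value and unit) are stated in this paper; the CV entry, the DE names and the function `unfold` are illustrative assumptions.

```python
# Components per composite data type. Only PQ (value, unit) is stated in the
# text; the CV entry is an illustrative assumption.
COMPONENTS = {
    "PQ": ("value", "unit"),
    "CV": ("code", "code system identifier", "code system name"),
}


def unfold(initial_de_name: str, data_type: str) -> list:
    """Return general DE names, one per data type component.

    Data types without distinguishable components (e.g. BL) yield the
    initial DE unchanged.
    """
    parts = COMPONENTS.get(data_type, ())
    if not parts:
        return [initial_de_name]
    return [f"{initial_de_name} {part}" for part in parts]


# Hypothetical initial DE names, used only to exercise the function.
pq_des = unfold("observation result", "PQ")
bl_des = unfold("person deceased indicator", "BL")
print(pq_des)
print(bl_des)
```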

The specialization of data types mainly targeted ANY, which is the data type for values from medical observations. ANY can be specialized into quantitative measurements, titers, index values, ranges, ordinals, nominals, etc. Based on actual demand, 19 metadata items were adopted from ISO/IEC 11179 to describe general DEs. Table 5 presents the standardized description of a general DE, taking Person Nationality Code as an example.

Table 5 Standardized description of general DE Person Nationality Code

In addition, six categories of representation format for general DEs were defined according to ISO/IEC 11179-3: text, symbols, values, date, time and code. When similar DEs appeared repeatedly, only one was retained. For example, code system identifiers and code system names recurred in all general DEs with a coded attribute (entity class code, entity code, role code, act code, etc.), so only one code system identifier and one code system name were retained in NHDD.

Domain DE and Controlled vocabulary

General DEs are largely independent of specific domain context and usually need to be localized before being adopted by domain data developers. Such localization should follow a unified rule to avoid semantic confusion in information sharing. Controlled vocabularies were developed on the basis of the standard Health Information Value Codes (standard number: WS 364) and by referring to HL7 vocabularies [50]. There are currently 12 controlled vocabularies in NHDD: Entity classCode and Entity code, Entity determinerCode, Entity URLScheme, Entity telecommunicationAddressUse, Person addressType, Role classCode and Role code, Rolelink code, Participation typeCode, Act classCode and Act code, Act moodCode, Act relationshipCode, and Act statusCode.

The Entity classCode for each object class provides all possible subtypes (which can be further subdivided) or instances (which cannot be further subdivided) of the object class for localization of the general DEs. The controlled vocabulary Entity classCode restricts how general DEs can be specialized into one or more domain DEs. Entity is specialized into instances of human, microorganisms, animals and plants, which are listed in the controlled vocabularies for the general DEs Entity classCode and Entity code. The link between the controlled vocabularies Entity Class Code and Entity Code is shown in Table 6, in which the codes are the permissible value set for classCode and code of “Entity” in Fig. 1.

Table 6 Controlled vocabularies Entity Class Code and Entity Code

Consequently, related general DEs can be constrained into specific domain DEs. As shown in Fig. 3, “Entity name” of general DE can be constrained to a domain DE “doctor’s name” based on the term “human”, “doctor”, and to a domain DE “surgeon’s name” based on the term “human”, “surgeon” (subtype of “doctor”) in the vocabulary of “Entity Code” and “Role Code”, and to “operator’s name” based on the term “human”, “operator” in the vocabulary of “Entity Code” and “Participation Code Type”. The “Entity name” of general DE can also be constrained to a domain DE “operation doctor’s name” based on the vocabularies combination (pre-coordinated) of the “Entity Code (term: human)”, “Role Code (term: doctor)” and “Participation Code Type (term: operator)”.
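The constraint step can be sketched as a validation against vocabulary terms. This is a simplified illustration, not the NHDD implementation: the terms are those used in the example above, while the dictionary layout, the `DomainDE` record and the `constrain` function are assumptions, and the subtype relation between “surgeon” and “doctor” is not modelled.

```python
from dataclasses import dataclass

# Tiny excerpts of controlled vocabularies; the terms come from the example
# in the text, the dictionary layout is an assumption.
VOCABULARIES = {
    "Entity Code": {"human"},
    "Role Code": {"doctor", "surgeon"},
    "Participation Code Type": {"operator"},
}


@dataclass(frozen=True)
class DomainDE:
    name: str
    general_de: str
    constraints: tuple   # (vocabulary, term) pairs


def constrain(general_de: str, name: str, terms: dict) -> DomainDE:
    """Specialize a general DE; reject terms outside the vocabularies."""
    for vocabulary, term in terms.items():
        if term not in VOCABULARIES.get(vocabulary, set()):
            raise ValueError(f"{term!r} is not in vocabulary {vocabulary!r}")
    return DomainDE(name, general_de, tuple(terms.items()))


doctor_name = constrain(
    "Entity name", "doctor's name",
    {"Entity Code": "human", "Role Code": "doctor"},
)
# Pre-coordinated combination of three vocabularies.
operation_doctor_name = constrain(
    "Entity name", "operation doctor's name",
    {"Entity Code": "human", "Role Code": "doctor",
     "Participation Code Type": "operator"},
)
print(doctor_name)
print(operation_doctor_name)
```

Because every domain DE keeps a reference to its general DE and to the vocabulary terms that constrain it, two domain DEs from different sources can be compared by their constraints rather than by their names.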

Fig. 3 The relationship of general DE, controlled vocabulary and domain DE. “Entity name” of general DE can be constrained to the domain DE “doctor’s name” based on the term “doctor” and the domain DE “surgeon’s name” based on the term “surgeon” (subtype of “doctor”) in the vocabulary of “roleCode”, and to “operator’s name” based on the term “operator” in the vocabulary of “participationCodeType”. The “entity name” of general DE can also be constrained to the domain DE “operation doctor’s name” based on the vocabularies combination (pre-coordinated) of the “roleCode (term: doctor)” and “participationCodeType (term: operator)”

In total, domain DEs are standardized through 22 metadata items, including 14 data element attributes and 6 value domain attributes, all of which come from the ISO/IEC 11179 model. Among them, the metadata item named “Metadata Reference” can be related to NHDD and the “Relation Type” can be constrained to the class in HCDM. Value domain attributes indicate the relationship between domain DEs and controlled vocabularies.

The relationships of HCDM, initial DEs, general DEs and domain DEs are shown in Table 7.

Table 7 The process of forming initial data elements, general data elements and domain data elements in the class “Entity”

The web-based system for HCDM

Based on HCDM and NHDD, a web-based system was developed to facilitate centralized management of healthcare metadata. The main functions of the system include data element management (input, search, browse, edit, etc. for data elements and other metadata items, such as data element concepts, value domains and data sets), import and export of DEs and data sets (Excel, Word, PDF and XML formats), and system maintenance. Users can be authorized to browse or edit the content of the system. A user who needs to add a new metadata item or update an existing one must first apply for user permission, and the added or updated metadata must be inspected and approved by the authorized organization before publishing.
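The permission and approval rule can be read as a small state machine in which publishing is reachable only through approval. The sketch below is a hypothetical Python illustration of that rule; it is not the code of the system, and the status names and the `MetadataItem` class are assumptions.

```python
from enum import Enum


class Status(Enum):
    SUBMITTED = "submitted"
    APPROVED = "approved"
    PUBLISHED = "published"


# Allowed transitions: publishing is only reachable through approval.
TRANSITIONS = {
    Status.SUBMITTED: {Status.APPROVED},
    Status.APPROVED: {Status.PUBLISHED},
    Status.PUBLISHED: set(),
}


class MetadataItem:
    def __init__(self, name: str, editor_has_permission: bool):
        # A user must hold edit permission before adding or updating an item.
        if not editor_has_permission:
            raise PermissionError("user must obtain edit permission first")
        self.name = name
        self.status = Status.SUBMITTED

    def move_to(self, new_status: Status) -> None:
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(
                f"cannot go from {self.status.value} to {new_status.value}"
            )
        self.status = new_status


item = MetadataItem("patient's marital status code", editor_has_permission=True)
item.move_to(Status.APPROVED)    # inspected and approved by the authorized organization
item.move_to(Status.PUBLISHED)
print(item.status.value)
```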

The system was built on a cloud architecture using Java 2 Platform, Enterprise Edition (J2EE). It supports cross-platform, cross-region and cross-network access, and also supports standards such as the Simple Object Access Protocol (SOAP), eXtensible Markup Language (XML) and those of the Workflow Management Coalition. A distributed transaction processing mechanism was adopted to keep distributed transactions and information consistent and to prevent data inconsistency caused by partial server or network failures at runtime.

The relationships among HCDM, data elements and value domains are connected through web links in the system. The value sets of general DEs are linked to the classification scheme which contains the value codes of general DEs and domain DEs. Figure 4 is a display interface of initial DE in the system, including DE’s Chinese name, English name, data type and edit function. The input and interface of domain DEs are shown in Fig. 5. For instance, by constraining “entity” and “role” (from HCDM) to “person” and “patient” (from controlled vocabularies), general DE “person’s marital status code” will be constrained to the domain DE “patient’s marital status code” accordingly.

Fig. 4 A display interface of initial DE in the system, including initial DE’s Chinese name, English name, data type, edit and delete function

Fig. 5 The input and revise interface of domain DE in the system. Domain DEs are standardized through 22 metadata descriptions, including 14 data element attributes and 6 value domain attributes. Among them, data element attributes reflect relationships among domain DEs, HCDM and NHDD. Value domain attributes reflect the relationship between domain DEs and controlled vocabularies


Our research focused on developing the HCDM and NHDD to manage healthcare metadata. The approach has several advantages. Firstly, the approach of constraining metadata has the potential to be used with other projects, such as HL7 FHIR and the IHE DEX profile, to enable semantic interoperability, because our domain-specific metadata differs little from the ISO/IEC 11179 metadata registry approach.

Secondly, when other healthcare organizations want to develop their own information systems based on HCDM and NHDD, general DEs can be specialized or localized in those systems for data collection, representation, storage and exchange. Through data element specialization, the definitions of general data elements in the dictionary are constrained consistently to fit specific scenarios by complying with the controlled vocabularies. In this process the dictionary serves as a unified reference for data element specifications across domains, so that the meaning of data from multiple sources is consistent or at least comparable.

Thirdly, the object classes in the model can be specified step by step following the hierarchy of classes. The volume increase of domain DEs becomes manageable through the constraint of controlled vocabularies, and furthermore, domain DEs have a high degree of semantic consistency by these metadata.

Lastly, HCDM and NHDD can be extended and improved according to future information needs. Compared with HL7 RIM, the HCDM is better suited to practical needs of health data standards management in China. The classes and attributes of HCDM can be appropriately adjusted and extended with the growth or change of health metadata, but the core class will be stable to ensure consistency with related standards. In addition, domain metadata items can be added or revised along with the changes in the health data itself.

The study in [51] achieves syntactic and semantic interoperability between clinical care and research domains by developing a federated semantic metadata registry framework. Although our research also aims to develop a metadata framework that enables semantic interoperability, their mechanism is mainly based on ISO/IEC 11179, whereas ours is mainly based on HL7 RIM for developing the national HCDM and uses ISO/IEC 11179 for the standardized description of metadata.

Some limitations of this work must be acknowledged. One is that some emerging standards, such as HL7 FHIR, have not yet been adopted in our development process, so there will be challenges in maintaining consistency with existing standards and achieving interoperability with other international projects in the future. In subsequent work, we will consider standards such as FHIR and IHE DEX when updating the standard according to actual needs. The other is that, despite the availability of the web-based system, creating standardized domain DEs is relatively complex, so we need to strengthen staff training and advance the implementation process.


In summary, based on HL7 RIM and actual health service demands, we built the HCDM to provide a unified metadata reference for multi-source data standardization and management, and then developed a web-based system for its implementation and evaluation. After a period of practical use, the project has proved feasible for its designed functions.


Health data standards were adapted based on the needs of the national health system. 55 data sets (4640 data items) were used as the main data source to establish HCDM, which are currently categorized into 7 health business domains (Table 1). Data sets are related to medical activities enacted by the Chinese National Health Information Standard Committee [52]. We are mainly concerned with the health information of individuals, so data sets of health supervision which are more about information of groups were removed from the data source.

The development and implementation of this work comprised 6 main steps (Fig. 6):

Fig. 6 The work process of HCDM, NHDD and their implementation. The work process has 6 main steps: step 1 establishes the HCDM, step 2 extracts the initial DEs, step 3 constructs the general DEs, step 4 develops controlled vocabularies and domain DEs, step 5 develops the web-based system and step 6 evaluates and optimizes HCDM and NHDD

Step 1: Establish the HCDM. The HCDM was mainly derived from HL7 RIM and actual Chinese health information needs, and was adjusted and optimized based on the classification results of data items. Firstly, six classes and their attributes directly used the contents of HL7 RIM’s classes. Secondly, 4640 data items from 55 data sets were classified into the six classes of HCDM. Lastly, subclasses and attributes of HL7 classes were adjusted and trimmed according to the actual classification results.

Step 2: Extract the initial DEs according to the ontological representation of the ISO/IEC 11179 metamodel and the HCDM. Mapping relationships between the ISO/IEC 11179 metamodel and HCDM were established to describe data elements.

Step 3: Construct the general DE. The generation of general data elements was constrained by the HCDM and the initial DE. The normalized description of general DEs adopted ISO/IEC 11179 metamodel.

Step 4: Develop controlled vocabularies and domain DEs. Based on standard WS 364 and HL7 vocabularies, controlled vocabularies (value sets) were developed so that, in developing domain DEs, all data items in the selected data sets were covered. In this way, all general DEs and their value sets were standardized to form NHDD.

Step 5: Develop the web-based system. Based on HCDM and NHDD, a web-based system was developed to implement the centralized management for healthcare metadata, and also to evaluate and optimize the HCDM and NHDD. The system is running on the Chinese Health Information Standard Portal and is managed by the national health statistics and information centre.

Step 6: Evaluate and optimize HCDM and NHDD. Based on problems that occurred during the system’s construction and implementation, the model and the DEs in NHDD were further adjusted and optimized to meet actual requirements for health information interoperability.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.



Abbreviations

HL7: Health Level Seven
HDF: HL7 Development Framework
HL7 v3: HL7 version 3
DE: Data element
RIM: Reference Information Model
HCDM: Health Concept Data Model
NHDD: National Health Data Dictionary
UCUM: Unified Code for Units of Measure
ISO/IEC: International Organization for Standardization/International Electrotechnical Commission
WS: Wei Sheng (Standard)
IHE DEX: Integrating the Healthcare Enterprise Data Element Exchange
HL7 FHIR: HL7 Fast Healthcare Interoperability Resources


  1. Moner D, Maldonado JA, Robles M. Archetype modeling methodology. J Biomed Inform. 2018;79:71–81.

  2. Shvaiko P, Euzenat J. Ontology matching: state of the art and future challenges. IEEE Trans Knowl Data Eng. 2013;25(1):158–76.

  3. Topaz M, Seger DL, Goss F, Lai K, Slight SP, Lau JJ, et al. Standard information models for representing adverse sensitivity information in clinical documents. Methods Inf Med. 2016;55(2):151–7.

  4. Gesner E, Collins SA, Rocha R. Pain documentation: validation of a reference model. Stud Health Technol Inform. 2015;216:805–9. PMID: 26262163.

  5. Health Level Seven. HL7 Standards. Accessed 23 Feb 2022.

  6. Priyatna F, Alonso-Calvo R, Paraiso-Medina S, Corcho O. Querying clinical data in HL7 RIM based relational model with morph-RDB. J Biomed Semantics. 2017;8(1):49.

  7. Martínez-García JA, Escalona MJ, Parra-Calderón CL. Working with the HL7 metamodel in a Model Driven Engineering context. J Biomed Inform. 2015;57:415–24.

  8. Health Level Seven. HL7 Development Framework. Accessed 23 Feb 2022.

  9. Cruz WA, Garcia R. Modeling of ubiquitous technology integration process in health services. Annu Int Conf IEEE Eng Med Biol Soc. 2010;2010:446–9.

  10. Meehan RA, Mon DT, Kelly DNPMK, Rocca M, Dickinson G, MSc JR, et al. Increasing EHR system usability through standards: conformance criteria in the HL7 EHR-system functional model. J Biomed Inform. 2016;63:169–73.

  11. McClay J, Park P, Marr SD, Langford LH. The HL7 standards-based model of emergency care information. Stud Health Technol Inform. 2013;192:1180. PMID: 23920954.

  12. Slavov V, Rao P, Paturi S, Swami TK, Barnes M, Rao D, et al. A new tool for sharing and querying of clinical documents modeled using HL7 version 3 standard. Comput Methods Prog Biomed. 2013;112(3):529–52.

  13. Beeler GW. HL7 version 3 – an object-oriented methodology for collaborative standards development. Int J Med Inform. 1998;48(1):151–61.

  14. Kuo JW, Kuo AM. Integration of health information systems using HL7: a case study. Stud Health Technol Inform. 2017;234:188–94. PMID: 28186039.

  15. Ott S, Rinner C, Duftschmid G. Expressing patient selection criteria based on HL7 V3 templates within the open-source tool ART-DECOR. Stud Health Technol Inform. 2019;260:226–33. PMID: 31118342.

  16. Cosío-León MA, Ojeda-Carreño D, Nieto-Hipólito JI, Ibarra-Hernández JA. The use of standards in embedded devices to achieve end to end semantic interoperability on health systems. Comp Stand Inter. 2018;57:68–73.

  17. Health Level Seven. HL7 Reference Information Model. Accessed 23 Feb 2022.

  18. Orgun B, Vu J. HL7 ontology and mobile agents for interoperability in heterogeneous medical information systems. Comput Biol Med. 2006;36(7-8):817–36.

  19. General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China. GB/T 30107–2013 HL7 V3 Reference Information Model [S].

  20. Iqbal AM. An OWL-DL Ontology for the HL7 Reference Information Model. In: Toward Useful Services for Elderly and People with Disabilities. ICOST 2011. Lect Notes Comput Sci. 2011;6719:168–75. Springer, Berlin, Heidelberg.

  21. Calvo RA, Rey DP, Medina SP, Claerhout B, Hennebert P, Bucur A. Enabling semantic interoperability in multi-centric clinical trials on breast cancer. Comput Methods Prog Biomed. 2015;118(3):322–9.

  22. Blobel BG, Engel K, Pharow P. Semantic interoperability – HL7 version 3 compared to advanced architecture standards. Methods Inf Med. 2006;45(4):343–53. PMID: 16964348.

  23. Alonso-Calvo R, Paraiso-Medina S, Perez-Rey D, Alonso-Oset E, Stiphout RV, Yu S, et al. A semantic interoperability approach to support integration of gene expression and clinical data in breast cancer. Comput Biol Med. 2017;87:179–86.

  24. Ellouze AS, Bouaziz R, Ghorbel H. Integrating semantic dimension into openEHR archetypes for the management of cerebral palsy electronic medical records. J Biomed Inform. 2016;63:307–24.

  25. Viangteeravat T, Anyanwu MN, Nagisetty VR, Kuscu E, Sakauye ME, Wu DJ. Clinical data integration of distributed data sources using health level seven (HL7) v3-RIM mapping. J Clin Bioinformatics. 2011;1(1):32.

  26. Rico-Diez A, Aso S, Perez-Rey D, Alonso-Calvo R, Bucur A, Claerhout B, Maojo V. SNOMED CT normal form and HL7 RIM binding to normalize clinical data from cancer trials. Int Conf BioInform BioEng. 2013.

  27. Goossen WT, Ozbolt JG, Coenen A, Park HA, Mead C, Ehnfors M, et al. Development of a provisional domain model for the nursing process for use within the health level 7 reference information model. J Am Med Inform Assoc. 2004;11(3):186–94.

  28. Goossen W. Model once, use multiple times: reusing HL7 domain models from one domain to the other. Stud Health Technol Inform. 2004;107(Pt 1):366–70. PMID: 15360836.

  29. Perez-Rey D, Alonso-Calvo R, Paraiso-Medina S, Munteanu CR, Garcia-Remesal M. SNOMED2HL7: a tool to normalize and bind SNOMED CT concepts to the HL7 reference information model. Comput Methods Prog Biomed. 2017;149:1–9.

  30. Moreira MWL, Rodrigues JJPC, Sangaiah AK, Al-Muhtadi J, Korotaev V. Semantic interoperability and pattern classification for a service-oriented architecture in pregnancy care. Future Gener Comp Sy. 2018;89:137–47.

  31. Bouaud J, Guézennec G, Séroussi B. Combining the generic entity-attribute-value model and terminological models into a common ontology to enable data integration and decision support. Stud Health Technol Inform. 2018;247:541–5. PMID: 29678019.

  32. Zhang YF, Tian Y, Zhou TS, Araki K, Li JS. Integrating HL7 RIM and ontology for unified knowledge and data representation in clinical decision support systems. Comput Methods Prog Biomed. 2016;123:94–108.

  33. National Health and Family Planning Commission of the People’s Republic of China. WS 363.1-WS 363.17, Health data element dictionary, 2011. Standards Press of China.

  34. National Health and Family Planning Commission of the People’s Republic of China. WS 363.1–2011, Health data element dictionary Part 1: General specification. Standards Press of China. Accessed 23 Feb 2022.

  35. Liu DH, Xu YY. Analysis of HL7 services-aware interoperability framework and standard requirements for semantic interoperability. Chin J Health Inform Manage. 2014;11(4):376–80.

  36. Lou MM, Yang Z, Liu DH, Cao Y, Li X, Jiang K. The development of conceptual health data model based on domain information. China Digital Med. 2015;10(1):74–7.

  37. Khalifa A, Mason CC, Garvin JH, Williams MS, del Fiol G, Jackson BR, et al. Interoperable genetic lab test reports: mapping key data elements to HL7 FHIR specifications and professional reporting guidelines. J Am Med Inform Assoc. 2021;28(12):2617–25.

  38. Shivers J, Amlung J, Ratanaprayul N, Rhodes B, Biondich P. Enhancing narrative clinical guidance with computer-readable artifacts: authoring FHIR implementation guides based on WHO recommendations. J Biomed Inform. 2021;122:103891.

  39. Integrating the Healthcare Enterprise. IHE Data Exchange. Accessed 23 Feb 2022.

  40. Ulrich H, Kern J, Tas D, Kock-Schoppenhauer AK, Ückert F, Ingenerf J, et al. QL4MDR: a GraphQL query language for ISO 11179-based metadata repositories. BMC Med Inform Decis Mak. 2019;19(1):45.

  41. Guo Z, Zhao Y, Zheng Y, Si X, Liu Z, Sun M. THUCTC: An Efficient Chinese Text Classifier. 2016. Accessed 23 Feb 2022.

  42. Li JY, Sun MS. Scalable Term Selection for Text Categorization. Proc. of the 2007 Joint conference on empirical methods in natural language processing and computational natural language learning (EMNLP-CoNLL). Prague: Association for Computational Linguistics; 2007. p. 774–82.

  43. Health Level Seven. Data Types - Abstract Specification, Release 2. Accessed 23 Feb 2022.

  44. National Health and Family Planning Commission of the People’s Republic of China. WS/T 672–2020, National conceptual data model for health and population information [S]. Accessed 23 Feb 2022.

  45. International Organization for Standardization. ISO/IEC 11179, Information Technology -- Metadata registries (MDR) [S]. Accessed 23 Feb 2022.

  46. International Organization for Standardization. ISO/IEC 11179–3, Information Technology -- Metadata registries (MDR) - Part 3: Registry metamodel and basic attributes [S]. 2013. Accessed 23 Feb 2022.

  47. Stausberg J, Harkener S. Metadata of registries: results from an initiative in health services research. Stud Health Technol Inform. 2021;281:18–22.

  48. National Health and Family Planning Commission of the People’s Republic of China. WS/T 671–2020, National data dictionary for health and population information [S]. Accessed 28 Oct 2021.

  49. National Bureau of Quality and Technical Supervision of China. Codes for the representation of names of countries and regions [S]. 2011. Accessed 23 Feb 2022.

  50. Health Level Seven. HL7 Vocabulary. Accessed 23 Feb 2022.

  51. Sinaci AA, Laleci Erturkmen GB. A federated semantic metadata registry framework for enabling interoperability across clinical research and care domains. J Biomed Inform. 2013;46(5):784–94.

  52. National Health and Family Planning Commission of the People’s Republic of China. Health information standards. Accessed 23 Feb 2022.


Not applicable.


This work was supported by the National Natural Science Foundation of China (Grant No. 81471757), the Key R&D Program of Shaanxi Province (2021SF-193, 2020SF-246), and the Logistics Science and Technology Youth Cultivation Program (20QNPY047).

Author information

Authors and Affiliations



Yang Z and Jiang K completed the main information modelling, vocabulary building and article drafting. Lou M and Liu J participated in data collection and analysis. Gong Y guided the research methods and provided helpful comments. Zhang LL was responsible for the organization and management of the system development. Bao XY developed the web-based system based on the HCDM and NHDD. Liu DH obtained funding, assisted in the conception and design of this study, and refined and standardized the data elements. Yang P obtained funding, contributed to partial model construction, and assisted in writing, revising and refining the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Danhong Liu or Peng Yang.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Yang, Z., Jiang, K., Lou, M. et al. Defining health data elements under the HL7 development framework for metadata management. J Biomed Semant 13, 10 (2022).
