Re-use of health data to train algorithms: European Union EHDS Proposal

The European Health Data Space Regulation (still a proposal) will substantially change how electronic health records are handled, and what can be done with them, in the European Union. It will create a common environment with interoperable standards and easier access to health data, both for primary use (providing healthcare services) and for secondary uses.

One of the envisaged secondary purposes for processing electronic health records is the training, testing and evaluation of algorithms. However, the requirements for this secondary use of health data are strict.

Use of health data for secondary purposes

Health data is collected in many different health care settings. This data is also invaluable for purposes that are not directly related to those for which it was originally collected. The intention of this legislation is to enable health data to be re-used more widely for research, innovation, policy making, regulatory purposes, and patient safety.

Envisaged secondary uses in relation to algorithms

The European Health Data Space Regulation (the “EHDS”) proposal expressly states that health data contained in electronic health records (“EHRs”) can be processed for training, testing and evaluating of algorithms, including in medical devices, AI systems and digital health applications.

However, there is a condition: this secondary use must serve at least one of the following purposes:

  • contributing to the public health or social security;
  • ensuring high levels of quality and safety of health care; 
  • ensuring high levels of quality and safety of medicinal products; or 
  • ensuring high levels of quality and safety of medical devices. 

The requester (data user) shall be in a position to prove that the access is necessary for any of those purposes. 

The data user must also bear in mind that certain purposes are prohibited. Any algorithm that has benefited from the accessed data shall not be used for (among others):

  • taking decisions detrimental to a natural person based on their electronic health data;
  • excluding natural persons from the benefit of an insurance contract or modifying their contributions and insurance premiums;
  • developing products or services that may harm individuals and societies at large, such as tobacco or alcoholic beverages; or
  • advertising or marketing activities towards health professionals, organisations in health or natural persons.
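
As an illustration only, the two-step test described above (at least one permitted purpose, and no prohibited use) could be encoded in an internal pre-screening tool along the following lines. The enum names and the screen_algorithm_use function are hypothetical and are not terms from the EHDS proposal; the list of prohibited uses in the proposal is not exhaustive, so a check like this can only flag obvious problems and does not replace legal review.

```python
from enum import Enum, auto
from typing import List, Set, Tuple


class PermittedPurpose(Enum):
    # Hypothetical labels for the four purposes listed in the article.
    PUBLIC_HEALTH_OR_SOCIAL_SECURITY = auto()
    HEALTHCARE_QUALITY_AND_SAFETY = auto()
    MEDICINAL_PRODUCT_QUALITY_AND_SAFETY = auto()
    MEDICAL_DEVICE_QUALITY_AND_SAFETY = auto()


class ProhibitedUse(Enum):
    # Hypothetical labels for the prohibited uses listed in the article
    # (the article presents this list as non-exhaustive).
    DETRIMENTAL_DECISIONS = auto()
    INSURANCE_EXCLUSION_OR_PRICING = auto()
    HARMFUL_PRODUCTS = auto()
    ADVERTISING_OR_MARKETING = auto()


def screen_algorithm_use(
    claimed_purposes: Set[PermittedPurpose],
    flagged_uses: Set[ProhibitedUse],
) -> Tuple[bool, List[str]]:
    """Return (passes, reasons). Passing requires at least one permitted
    purpose and no flagged prohibited use."""
    reasons = []
    if not claimed_purposes:
        reasons.append("no permitted purpose claimed")
    for use in sorted(flagged_uses, key=lambda u: u.name):
        reasons.append("prohibited use: " + use.name)
    return (not reasons, reasons)
```

A project that trains a diagnostic model for a medical device and has no flagged use would pass; the same project would fail as soon as the resulting model is also intended for marketing towards health professionals.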

Which categories of data shall be available for training / testing the algorithm?

The EHDS envisages a fairly broad list of categories of data that shall be available for re-use, including: (i) EHRs, (ii) pathogen genomic data, (iii) genetic data, (iv) identification data related to health professionals, (v) electronic health data from clinical trials, and (vi) electronic health data from biobanks.

Anonymized data vs. pseudonymized data

The data user shall lay down in the data access request whether it needs to access anonymized or pseudonymized data. If the data user only needs anonymized data, the process is easier, as the health data access body (if the request is granted) will provide that non-personal dataset. 

However, if the data user requires pseudonymized data (meaning data where any information which could be used to identify an individual has been replaced with a pseudonym or other value which does not allow the individual to be directly identified), the applicant should explain why this is necessary and why anonymous data would not suffice. In addition, the applicant shall identify the legal basis for processing under Article 6 GDPR (either the exercise of a task in the public interest assigned by law or legitimate interest). In this case, an ethical assessment may be requested based on national law.

The EHDS does not change what qualifies as personal data or as anonymized data: general GDPR principles will apply.
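
To make the notion of pseudonymized data more concrete, the following is a minimal sketch of one common technique, replacing direct identifiers with a keyed hash. It is an assumption for illustration: the EHDS proposal does not prescribe any particular pseudonymization method, and the field names (patient_id, name) and the pseudonymize function are hypothetical.

```python
import hashlib
import hmac

# Hypothetical names of the fields that directly identify a patient.
DIRECT_IDENTIFIERS = ("patient_id", "name")


def pseudonymize(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record with direct identifiers removed and a
    keyed pseudonym added, so records of one patient can still be linked."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    digest = hmac.new(
        secret_key,
        str(record["patient_id"]).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    # Truncated for readability; the full digest could be kept instead.
    out["pseudonym"] = digest[:16]
    return out
```

Because whoever holds the key can recompute the pseudonym for a known patient, the output of a function like this remains personal data under GDPR principles, which is why the stricter requirements described above apply to it.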

Who is the relevant body to grant access to data?

Access applications will be managed by a single body: the health data access body. Data users seeking access to electronic health data from more than one Member State shall submit a single application to one of the concerned health data access bodies of their choice. However, where an applicant requests access to electronic health data from a single data holder only, that applicant may file a data access application or a data request directly with that data holder.
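
The routing rules in the paragraph above can be summarised in a short sketch. The filing_options function and the wording of its return values are hypothetical and simply restate the rules as described in this article.

```python
from typing import List, Set


def filing_options(member_states: Set[str], data_holders: Set[str]) -> List[str]:
    """List where a data access application may be filed, following the
    routing rules summarised in the article."""
    options = []
    if len(data_holders) == 1:
        # Optional shortcut: a single data holder may be approached directly.
        options.append("directly with the data holder")
    if len(member_states) > 1:
        options.append(
            "single application to one concerned health data access body"
        )
    else:
        options.append("health data access body")
    return options
```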

Process to obtain the data

The applicant shall submit a data access application detailing: 

  • the intended use of the health data;
  • the description of the data (format, source, geographical coverage…);
  • whether anonymous or pseudonymous data is required (with the explanations detailed above);
  • a description of the safeguards adopted;
  • an estimation of the period during which the electronic health data is needed;
  • a description of the tools and computing resources needed for a secure environment.
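
For teams preparing an application, the items above could be tracked as a simple checklist. The sketch below is an assumption for illustration: the class and field names are hypothetical and do not reproduce any official form. It also applies the rule, described earlier, that a request for pseudonymized data needs a justification and a GDPR legal basis.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataAccessApplication:
    # Hypothetical field names mirroring the items listed in the article.
    intended_use: str
    data_description: str
    data_form: str  # "anonymized" or "pseudonymized"
    safeguards: str
    access_period_months: int
    tools_and_computing_resources: str
    pseudonymization_justification: Optional[str] = None
    gdpr_legal_basis: Optional[str] = None

    def missing_items(self) -> List[str]:
        """Return the names of items that still need to be completed."""
        missing = []
        for name in ("intended_use", "data_description", "safeguards",
                     "tools_and_computing_resources"):
            if not getattr(self, name).strip():
                missing.append(name)
        if self.access_period_months <= 0:
            missing.append("access_period_months")
        if self.data_form not in ("anonymized", "pseudonymized"):
            missing.append("data_form")
        elif self.data_form == "pseudonymized":
            # Extra explanations are only needed for pseudonymized data.
            if not self.pseudonymization_justification:
                missing.append("pseudonymization_justification")
            if not self.gdpr_legal_basis:
                missing.append("gdpr_legal_basis")
        return missing
```

An application for anonymized data with all descriptive fields completed would return an empty list; switching data_form to "pseudonymized" without further input would return the two additional items.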

The health data access body shall issue or refuse a data permit within 2 months of receiving the data access application (a period that can be extended by another 2 months). Where a health data access body fails to provide a decision within the time limit, the data permit shall be issued.

The data permit shall set out the general conditions applicable to the data user. A data permit shall be issued for the duration necessary to fulfil the requested purposes, which shall not exceed 5 years (and may be extended once).

Once the request has been granted, the health data access body will request the data from the data holder, which will need to provide the data to the data user within 2 months.

Data users shall make public the results or output of the secondary use of electronic health data, including information relevant for the provision of healthcare, no later than 18 months after the completion of the electronic health data processing.
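
The time limits in this section can be sketched as simple date arithmetic. The example below is an illustration under stated assumptions: the add_months helper counts calendar months and clamps to the last day of shorter months, which may not match the legally correct way of computing time limits, and all class and function names are hypothetical.

```python
import calendar
from dataclasses import dataclass
from datetime import date


def add_months(start: date, months: int) -> date:
    """Add calendar months, clamping to the last day of the target month
    (an assumption; legal rules on computing periods may differ)."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


@dataclass
class PermitTimeline:
    application_received: date
    decision_extended: bool = False  # True if the 2-month extension applies

    def decision_deadline(self) -> date:
        # 2 months, or 4 months in total if the period was extended.
        months = 4 if self.decision_extended else 2
        return add_months(self.application_received, months)

    def permit_issued_by_default(self, today: date, decision_taken: bool) -> bool:
        # No decision within the time limit: the permit shall be issued.
        return (not decision_taken) and today > self.decision_deadline()


def latest_initial_permit_expiry(issued: date) -> date:
    # A permit shall not exceed 5 years (before any one-off extension).
    return add_months(issued, 60)


def publication_deadline(processing_completed: date) -> date:
    # Results must be made public no later than 18 months after completion.
    return add_months(processing_completed, 18)
```

For example, under these assumptions an application received on 31 December 2024 would have a decision deadline of 28 February 2025, or 30 April 2025 if the period is extended.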

How is the data to be provided to the data user?

The health data access bodies shall provide access to electronic health data only through a secure processing environment. The data users shall only be able to download non-personal electronic health data from the secure processing environment.

Can data holders and health data access bodies charge access fees for secondary use?

Yes. Where the data in question are held by a non-public body (e.g. a private healthcare provider), the fees may also include compensation for part of the costs of collecting the electronic health data.

Fees shall be transparent and proportionate to the cost of collecting and making electronic health data available.

What is the data protection relationship between data users and data holders and data access bodies?

The EHDS is clear about this. Health data access bodies and data users shall be deemed joint controllers. Data holders do not have any data processor or joint controllership role vis-à-vis data users, except where there is a single data holder and the request is handled directly by that data holder, in which case they will be considered joint controllers.

Next steps

  • Companies interested in using health data to train / test algorithms should keep an eye on this proposed regulation, as it will bring business opportunities.
  • A robust compliance program to ensure protection of information and accountability would be necessary.
  • We recommend monitoring the legislative development of the EHDS.

Authored by Gonzalo F. Gallego, Juan Ramón Robles, and Cristina Baron.

