What do companies need to know about the Data Mining Exception of the Copyright Directive?

Last year the European Union enacted Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market (the “Copyright Directive”). Member States must transpose the Directive by 7 June 2021. The countdown has started.

The Directive aims to allow different players to apply text and data mining techniques to information that is publicly available online, without the permission of the relevant rights holders. However, rights holders retain the possibility of protecting their IP assets against mining.

According to the Copyright Directive itself, these techniques allow the analysis of large amounts of data in different areas of life and for various purposes, including government services, complex business decisions and the development of new applications. At the same time, their use is generally considered innocuous: any harm caused to rights holders by the exception would be minimal. The balance between the benefits to society of these techniques and the IP rights of rights holders therefore leans towards the benefits to society.

What is text and data mining?

Text and data mining is defined as an “automated analytical technique aimed at analyzing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations”. Part of the concept is that the collected data is analyzed at scale under certain parameters (using algorithms, AI, etc.) in order to generate new data that differs from the data originally extracted.

As stated in some preparatory papers of the European Institutions (for instance this and this), data mining has some common steps:

  • Access to content;
  • Extraction and/or copying of content; and
  • Mining of text and/or data and knowledge discovery. This requires pre-processing the relevant text and data and extracting structured data, then analyzing that output and recombining it to identify patterns in the final output.
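To make these steps more concrete, the following is a minimal Python sketch of the third step only, assuming the content has already been lawfully accessed and copied into memory as plain strings. The function names (`extract_tokens`, `mine_cooccurrences`), the stopword list and the sample documents are illustrative assumptions, not anything prescribed by the Directive or the preparatory papers.

```python
import re
from collections import Counter
from itertools import combinations

# Assumption: a tiny illustrative stopword list; real pipelines use larger ones.
STOPWORDS = frozenset({"the", "a", "of", "and", "in", "to", "is"})


def extract_tokens(document: str) -> list[str]:
    """Pre-processing: lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", document.lower())


def mine_cooccurrences(documents: list[str]) -> Counter:
    """Count how often two distinct terms appear in the same document.

    The result is new information (a correlation pattern) that differs
    from the text that was originally copied.
    """
    pair_counts: Counter = Counter()
    for document in documents:
        # Structured data: the sorted set of meaningful terms per document.
        terms = sorted(set(extract_tokens(document)) - STOPWORDS)
        pair_counts.update(combinations(terms, 2))
    return pair_counts


if __name__ == "__main__":
    sample = [
        "Copyright and data mining",
        "Data mining of text",
        "Text and copyright",
    ]
    for pair, count in mine_cooccurrences(sample).most_common(3):
        print(pair, count)
```

The sketch shows why the output of mining is usually distinct from the works mined: what is retained is a table of term pairs and counts, not the underlying documents.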

And what is the new exception about?

The amount of information available online is huge, and it can generate added value when processed with complex techniques that produce new and valuable information in different forms, such as patterns and new relations. However, the information that is usually available on the internet may be protected by IP laws. The purpose of the Copyright Directive is to establish a new exception to existing IP laws so that any player can use text and data mining techniques.

For instance, the new exception would apply to the following IP rights:

However, the Copyright Directive explicitly requires Member States to further specify the exception to the abovementioned IP rights. As of today, the exception has not yet been defined in Spain.

Moreover, the exceptions do not apply:

  • If the work has not been accessed lawfully by the beneficiary. This means that a company that accesses content without authorization cannot rely on the exception.
  • In relation to other rights that may exist. For instance, if by applying data and text mining techniques a company collects EU personal data, it may be subject to the rules of the General Data Protection Regulation.

Finally, reproductions and extractions may be retained for as long as is necessary for the purposes of text and data mining.

If a rights holder does not want his / her data to be extracted by third parties using data mining techniques, what can he / she do?

Rights holders are allowed to put mechanisms in place to prevent third parties from extracting information on the basis of the exceptions of the Copyright Directive. The existence of a new exception to general copyright law (as explained above) does not mean that a rights holder who wants to protect its data against extraction by third parties has no options.

What can a reluctant rights holder do to avoid the extraction of its information?

  • Reserve those rights by the use of machine-readable means, including metadata and terms and conditions of a website or a service;
  • Restrict valuable information to registered users only, so that third parties cannot “lawfully access” it without credentials; and
  • Include information that falls outside the scope of the Copyright Directive, so that its extraction remains restricted by other laws.

If a company is interested in extracting information for data and text mining techniques, what can it do to ensure compliance?

A company that wants to rely on the exception under the Copyright Directive for its text and data mining should:

  • Verify that the information it is accessing is protected only by the IP rights subject to the exception, and that no other laws may be breached (e.g. data protection laws, trade secret laws, etc.);
  • Verify that only publicly available information is accessed, and that the access is conducted in a way that is not contrary to good faith; and
  • Verify that the owner of the website / app etc. has not reserved his / her rights by the use of machine-readable means, including metadata and terms and conditions of a website or a service.
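As an illustration of the last check, the sketch below tests one possible machine-readable signal, a site's robots.txt file, using Python's standard `urllib.robotparser` module. It rests on assumptions: the Directive does not name a specific format for reserving rights, so treating robots.txt as a reservation is only one interpretation, and the function name `crawling_allowed`, the user agent `example-miner` and the sample rules are hypothetical. Terms and conditions and other metadata would still need separate review.

```python
from urllib.robotparser import RobotFileParser


def crawling_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url.

    Assumption: a Disallow rule is read here as a sign that the site owner
    may have reserved rights against automated extraction.
    """
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules a site owner might publish.
SAMPLE_ROBOTS_TXT = """\
User-agent: example-miner
Disallow: /

User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for agent, url in [
        ("example-miner", "https://example.com/articles/1"),
        ("other-bot", "https://example.com/articles/1"),
        ("other-bot", "https://example.com/private/report"),
    ]:
        print(agent, url, crawling_allowed(SAMPLE_ROBOTS_TXT, agent, url))
```

A negative result is a reason to stop and seek permission. A positive result is not, on its own, proof that no reservation exists.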

Is there anything else?

Yes. The Copyright Directive sets a standard, but each Member State has some leeway in how it transposes it. This means that in many respects the rules will not be homogeneous across the EU. We recommend monitoring the final text in each Member State.

In addition, the Directive contains exceptions for cultural and non-profit organizations, as well as other provisions on the protection of press publications concerning online uses, the use of protected content by online content-sharing service providers, etc., but we will talk about those another day…



Authored by Victor Mella and Juan Ramón Robles.

