Author: Maryna Manteghi, PhD researcher, University of Turku, Finland.
Photo credit: rjcastillo, via Wikimedia Commons
Introduction
The EU’s new Data Act, a cornerstone of its data strategy, entered into force on January 11, 2024, and will become applicable in September 2025. The regulation seeks to break down barriers that hinder data access for individuals and companies and to promote a balanced and fair distribution of data within society. It emphasizes easier access to, and use of, large digital datasets, particularly those generated by sensors and machines within the Internet of Things (IoT). To unlock data concentrated in the hands of a few, the Act revisits the relevance of the Database Directive in a data-driven world, though without directly amending it.
Specifically, Article 43 of the Data Act states that the sui generis right granted to database makers who make a substantial investment in obtaining, verifying, or presenting data (Article 7(1) of the Database Directive) does not apply to data obtained from or generated by connected products or their related services. The Data Act defines a “connected product” as an item that obtains, generates, or collects data about its use or environment and can communicate that data via an electronic communications service, a physical connection, or on-device access, and whose primary function is not the storage, processing, or transmission of data. A “related service” is a digital service other than an electronic communications service, including software, that is connected with the product at the time of purchase, rent, or lease in such a way that its absence would prevent the product from performing one or more of its functions.
Article 43, read together with Recital 112, excludes databases containing machine-generated data from the sui generis database right in order to safeguard users’ rights to access, use, and share such data (Articles 4 and 5 of the Data Act). While this could prevent excessive intellectual property protection for certain types of databases, some aspects require clarification to guarantee fair data access and use in the digital age.
Potential Limitations of Article 43 in Scientific Research
Examining Article 43 from a research standpoint raises concerns. Excluding databases of machine-generated data from the sui generis database right does not in itself give researchers any right to access or use that data. Database owners could still block or limit access through contracts or through technical measures such as passwords or robots.txt files. Recital 5 does state that the regulation aims to prevent the exploitation of contractual imbalances that hinder fair data access and use, but it refers only to third-party rights and data sharing, so it remains unclear how such contractual and technical restrictions interact with the exclusion from the sui generis right.
Another concern arises with mixed databases that combine data within the Data Act’s scope with “derived” or “inferred” data outside it. For example, researchers working with data from health monitoring devices would still need a license or another legal basis to access and use databases containing information derived from that data, such as statistics. These derivative databases could be protected by the sui generis database right, which would compel researchers to seek authorization from database owners whenever their research involves copying a substantial part of a database’s contents. Yet researchers may find it difficult to tell which data the regulation covers and which it does not. Excluding derived or inferred data from Article 43 also lacks strong justification, since such data could meet the description of machine-generated data in Recital 15 of the Data Act. Specifically, the recital describes such data as reflecting the digitized form of user actions and events, holding value for the user, and supporting innovation and the development of digital and other services that benefit the environment, health, and the circular economy.
Furthermore, the regulation seeks to make machine-generated data more accessible to users, businesses, and, in exceptional cases, public sector bodies that need such data, but it places no explicit emphasis on scientific research. Researchers could benefit from the provisions that allow users to share machine-generated data with third parties, including research or non-profit organizations. They could also rely on Article 14, which obliges data holders to make machine-generated data available to public sector bodies, potentially including research organizations, in exceptional cases.
Research organizations can share this data with individuals or organizations for scientific research, provided the recipients operate on a non-profit basis or under a state-recognized public-interest mission. As a result, independent researchers and private institutions involved in public-private research partnerships might struggle to access machine-generated data, even indirectly. In practice, it is often difficult to distinguish commercial from non-commercial activities within such collaborations.
Concluding Remarks
In summary, Article 43 could be improved by explicitly preventing contracts or technological measures from overriding its provisions at the expense of user rights, thereby ensuring better access to and use of machine-generated data. To guarantee efficient and broad access to raw machine-generated datasets for research, it is crucial to address researchers’ needs explicitly by naming them as beneficiaries. Furthermore, bringing derived or inferred data within the Data Act’s scope would improve data availability and integrity for research purposes. Adopting these recommendations could create a research-friendly environment and strengthen the EU’s research capacity globally.
