AJ Richter, Author at TechGDPR

Is an IP address considered personal data?

AJ Richter — Tue, 24 Mar 2026 07:33:49 +0000

The concept of personal data lies at the heart of the General Data Protection Regulation (GDPR), shaping the scope of its protections and obligations. Among the most debated examples of potentially personal data are IP addresses. While often perceived as neutral technical data, regulatory authorities and courts within the European Union have clarified that IP addresses can constitute personal data when they enable identification, directly or indirectly. Understanding why IP addresses fall within the GDPR’s scope requires examining legal interpretation, regulatory guidance, and practical realities of online data processing.

What qualifies as personal data?

Article 4.1 of the GDPR defines personal data as “any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.”

The EDPB explicitly identifies IP addresses as being personal data due to their ability to identify individual data subjects. If an IP address is successfully anonymized, then under the GDPR it is no longer considered personal data.

The French Data Protection Authority (CNIL) ruled on a case concerning the transfer of personal data to a company outside the EU. In its decision, the CNIL wrote:

“It should be noted that online identifiers, such as IP addresses or information stored in cookies can commonly be used to identify a user, particularly when combined with other similar types of information. This is illustrated by Recital 30 GDPR, according to which the assignment of online identifiers such as IP addresses and cookie identifiers to natural persons or their devices may “leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the natural persons and identify them.” In the particular case where the controller would claim to not have the ability to identify the user through the use (alone or combined with other data points) of such identifiers, he would be expected to disclose the specific means deployed to ensure the anonymity of the collected identifiers. Without such details, they cannot be considered anonymous.”

What is an IP address?

An IP address identifies a device, and often indirectly its user, on the Internet. It is a numeric label that determines where the device’s requests come from and where responses are delivered. The two main formats are IPv4 and IPv6. Originally, IPv4 was the sole way of identifying devices, but it does not allow for as many unique addresses as the modern Internet needs.

IPv4 addresses take the form xxx.xxx.xxx.xxx, where each xxx is a decimal number from 0 to 255. IPv6 addresses are written in hexadecimal (e.g., 2001:db8::ff00:42:8329), which means each digit can be 0-9 or A-F. Static IP addresses remain constant, while dynamic IP addresses can change over time. Combined with other information, such as an ISP’s subscriber records, an IP address can be linked to a specific connection or to the approximate location of a device.
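
As a minimal sketch of the two formats, Python’s standard `ipaddress` module can parse both and report which version a given string is. The helper name `describe` is hypothetical and chosen for illustration only; the sample addresses come from ranges reserved for documentation.

```python
import ipaddress


def describe(raw: str) -> str:
    """Return a short description of an IPv4 or IPv6 address string."""
    # ip_address() raises ValueError if the string is not a valid address.
    addr = ipaddress.ip_address(raw)
    # max_prefixlen is 32 for IPv4 and 128 for IPv6.
    return f"IPv{addr.version}, {addr.max_prefixlen}-bit address: {addr.exploded}"


print(describe("192.0.2.123"))
print(describe("2001:db8::ff00:42:8329"))
```

The `exploded` form writes out every group of an IPv6 address in full, which makes it easier to see that the shorthand `::` stands for a run of zero groups.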

The GDPR perspective on IP addresses

The GDPR explicitly lists “online identifiers” (e.g., IP addresses) among the identifiers that can make a person identifiable. Even if the controller doesn’t hold the identifying data itself, an IP address qualifies as personal data if there are means reasonably likely to be used (e.g., legal processes to obtain ISP logs) to link it to a person. This logic comes from the CJEU case Breyer (C-582/14), which was decided under Recital 26 of Directive 95/46/EC, the GDPR’s predecessor. Recital 26 of the GDPR carries the same test forward: “To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly.”

IP addresses can be personal data if the controller has legal means of obtaining additional information from an ISP to identify someone. What matters is the objective possibility of identifying a data subject: the GDPR is less concerned with whether identification is probable or has actually happened than with whether it is reasonably possible. Given an IP address and such additional information, an individual can be identified. EDPB decisions affirm that online identifiers like IP addresses are often treated as personal data because they can be combined with other information to profile or identify a data subject.

Personal data vs PII

Personal data, in the context of the GDPR, covers a much wider range of information than personally identifiable information (PII), commonly used in North America. In other words, while all PII is considered personal data, not all personal data is PII. For more information about PII vs personal data, read our blog post on the matter.

Device IDs, IP addresses and cookies are considered personal data under the GDPR. Under the usual definition of PII, however, they are not PII because they are treated as anonymous and cannot be used on their own to identify or trace a person.

PII includes any information that can be used to re-identify anonymous data. Information that is anonymous and cannot be used to trace the identity of an individual is non-PII. Device IDs, cookies and IP addresses are not considered PII in most of the United States. But some states, like California, do classify this data as PII. California classifies aliases and account names as personal information as well.

Controllers must treat IP addresses as personal data

For organizations, this means IP addresses cannot be treated as neutral technical data. Controllers must:

Identify a lawful basis for processing (e.g. consent, legitimate interest, contract performance).
Provide transparency in privacy notices, clearly explaining why IP addresses are collected, who receives them (e.g., third-party providers), and how long they are retained.
Apply data minimisation and storage limitation, ensuring IP data is only collected when necessary and retained for no longer than required.

In practice, this is highly relevant when embedding third-party services such as Google Fonts or analytics tools. Whenever a website loads resources from Google servers, the user’s IP address is transmitted to Google by default. Even when using Google Analytics with IP anonymisation enabled, the IP address is initially collected before truncation. The anonymisation feature represents a commitment by Google not to further process the full IP address, but technically, the IP is still transmitted during the request phase. From a strict GDPR perspective, this transmission itself constitutes processing.

ePrivacy Directive

IP address collection via cookies or similar tracking technologies also engages the ePrivacy Directive. Where IP processing is linked to tracking or storing information on a user’s device, prior consent is generally required unless the processing is “strictly necessary” for providing the requested service. This creates a dual compliance requirement: organizations must assess both a GDPR lawful basis and ePrivacy consent obligations.

Anonymisation, pseudonymisation & risks

Pseudonymisation can reduce risks and demonstrate accountability, but it does not remove GDPR applicability. Organizations must still implement appropriate technical and organisational safeguards. Pseudonymizing an IP address means obscuring part of it. This is often done as follows:

For IPv4 addresses, the last segment is replaced with a zero or removed.
- Example: 192.0.2.123 → 192.0.2.0
For IPv6 addresses, a similar approach is applied, truncating the last portion.

Guidance from the European Data Protection Board makes clear that true anonymization must be irreversible. Simple IP truncation or masking reduces identifiability, but it is typically considered pseudonymization, not anonymization, because re-identification may still be possible, especially when the truncated address is combined with other data points. GDPR obligations therefore still apply unless re-identification is demonstrably impossible.
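
The truncation approach described above can be sketched with Python’s standard `ipaddress` module. This is an illustration under stated assumptions, not a recommended implementation: the function name `truncate_ip` is hypothetical, and the prefix lengths (the first 24 bits kept for IPv4, the first 48 bits kept for IPv6) are assumed values that each organization would need to choose and justify for itself.

```python
import ipaddress


def truncate_ip(raw: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Zero out the trailing bits of an IP address.

    The default prefix lengths are assumptions for illustration:
    /24 drops the last IPv4 segment, /48 keeps the first three IPv6 groups.
    """
    addr = ipaddress.ip_address(raw)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


print(truncate_ip("192.0.2.123"))             # 192.0.2.0
print(truncate_ip("2001:db8::ff00:42:8329"))  # 2001:db8::
```

The output is still an address range that a limited number of users share, which is why a result like this is generally treated as reduced in identifiability rather than anonymous.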

Real-world examples

Analytics and server logs: IP addresses used for traffic analysis remain personal data.
Security and abuse detection: Legitimate interest may apply, but retention must be limited.
Advertising and profiling: IP-based tracking combined with cookies generally requires prior consent and careful transparency measures.

Conclusion

Under the GDPR, personal data encompasses far more than obvious identifiers such as names or identification numbers. It includes any information that can reasonably be linked to an individual. IP addresses, whether static or dynamic, fall within this definition when identification is objectively possible, even if that identification is indirect or requires additional data from third parties. Reach out to TechGDPR for help with understanding the nuances of data protection requirements.

The post Is an IP address considered personal data? appeared first on TechGDPR.

Does the GDPR apply to my US company?

AJ Richter — Tue, 10 Feb 2026 09:35:09 +0000

Introduction

Many US businesses assume that “the GDPR is an EU regulation, hence it does not impact my organisation.” This belief often results in unnecessary risk. The US equivalent of this misconception would be a company registered in Texas thinking its services don’t fall under the scope of the CCPA.

The GDPR has extraterritorial effect: it can, and often does, apply to organisations outside the European Union.

Note that since Brexit, the UK has retained the GDPR’s provisions and adapted them to its own body of law. The result, known as the UK GDPR, adds a small additional layer of complexity for transfers of data outside the UK. For the sake of simplicity, the term GDPR used in this article will also apply to the UK.

What is the GDPR and why it has global reach

The GDPR is the short name for the EU’s General Data Protection Regulation, which the UK has retained in adapted form. It protects the personal data of individuals who are within the European Union, gives rights to those individuals (the data subjects) and lays out obligations for the organisations handling that data. Its territorial scope is broad enough that it may apply to organisations outside of the EU if certain conditions are fulfilled.

A US company may be subject to the GDPR if it is:

Providing goods or services to data subjects in the European Union (EEA and UK)

This trigger is independent of payment or contractual terms. A business will be deemed to be targeting or envisaging an EU audience if it engages in any of the following activities:

Sending physical goods or providing access to digital services into a member state of the EU/EEA/UK;
Taking payments in a European currency such as Euros;
Running campaigns that market to email recipients in the EU/EEA/UK; and
Providing a website or service in a language that is widely spoken across the EU/EEA/UK.

Tracking the behavior of users in the European Union

This trigger is particularly relevant to digital-first companies today. If your business is tracking or profiling users in the European Union, the GDPR will most likely apply. This includes practices like:

Tracking European Union website and app users with analytics tools;
Placing cookies or other tracking tags on the devices of users in the European Union, which triggers additional requirements under the ePrivacy Directive and other local laws; and
Running targeted advertisement campaigns against users within the European Union on the basis of their online behavior.

Article 3 of the GDPR expressly sets out these conditions, and the European Data Protection Board explains them further in its Guidelines 3/2018 on territorial scope. Being registered outside of the EU does not necessarily remove a business from scope.

What constitutes personal data under the GDPR?

The GDPR defines personal data as any information relating to an identified or identifiable natural person. This definition is deliberately broad, so that it encompasses a wider range of data than the concept of “personally identifiable information” (PII) used in other jurisdictions. It is critical for any organisation to understand what information falls under this comprehensive definition to determine its compliance obligations.

Personal data includes, but is not limited to:

Direct identifiers: A person’s name, email address, physical address, or telephone number.
Online identifiers: An individual’s Internet Protocol (IP) address, browser cookies, and device identifiers (IP/MAC addresses, IMEIs, …).
Pseudonyms: User IDs, vehicle identification numbers (VINs), randomly chosen usernames, hashes, …
Metadata in context: Timestamps, for example.
Special categories of data: Biometric data, such as fingerprints or facial recognition information. Sensitive data is addressed in Art. 9 of the GDPR and in our blog article detailing the differences between PII and personal data.
Other information: Video or photo recordings, and an individual’s location data.
IoT data: Data associated with a device purchaser, owner, user, maintenance person, etc.

If your organization collects any of this information from individuals in the European Union, it is processing personal data and must assess its compliance obligations under the GDPR.

What if my business doesn’t comply?

Non-compliance with the GDPR can result in substantial financial and reputational losses. Supervisory authorities can impose fines of up to twenty million euros or four percent of an organization’s annual global turnover, whichever is greater. The GDPR structures its administrative fines in two tiers:

Tier 1: Up to €10 million, or 2% of the company’s total annual turnover worldwide in the preceding financial year, whichever is greater.
Tier 2: Up to €20 million, or 4% of the company’s total annual turnover worldwide in the preceding financial year, whichever is greater.
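
To make the “whichever is greater” rule concrete, here is a minimal sketch that computes the upper limit for each tier from a given turnover figure. It calculates only the statutory ceiling, not the fine a supervisory authority would actually impose. The names `FINE_TIERS` and `max_fine_eur` are hypothetical, and the turnover figures in the example are invented for illustration.

```python
# Tier -> (fixed cap in euros, percentage of worldwide annual turnover)
FINE_TIERS = {
    1: (10_000_000, 2),
    2: (20_000_000, 4),
}


def max_fine_eur(tier: int, annual_turnover_eur: int) -> int:
    """Return the upper limit of a fine: the greater of the fixed cap
    and the turnover-based cap for the given tier."""
    fixed_cap, percent = FINE_TIERS[tier]
    turnover_cap = annual_turnover_eur * percent // 100
    return max(fixed_cap, turnover_cap)


# A company with EUR 1 billion turnover: 4% (EUR 40 million) exceeds EUR 20 million.
print(max_fine_eur(2, 1_000_000_000))  # 40000000
# A company with EUR 100 million turnover: 2% (EUR 2 million) is below EUR 10 million.
print(max_fine_eur(1, 100_000_000))    # 10000000
```

The second example shows why smaller companies are not protected by the percentage: the fixed amount applies whenever it is the larger of the two.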

Enforcement is also a legitimate concern for U.S. companies. For example, Clearview AI, a U.S.-based firm, was the subject of enforcement action and fines by multiple EU data protection authorities for processing EU individuals’ personal data without a sufficient legal basis.

Along with fines, organizations can anticipate loss of customer trust, damage to their reputation, and legal restrictions on their data processing activities. Enforcement action against household names demonstrates that regulators are willing to act against organizations outside the European Union when the GDPR applies.

A simple checklist for your U.S. company

To allow you to consider at a glance whether the GDPR applies to your business, ask yourself the following questions:

Does your company’s website, app, or service deliver goods or services to individuals in the European Union?
Do you use tools that monitor the online behavior of individuals in the European Union?
Does your company process the personal data of any of your staff members working in the European Union?
Do you use any vendor tool to carry out any of that data processing for you?

If you answered yes to any of these questions, then it is highly likely your company is subject to the GDPR.

Real-life examples of when the GDPR applies

An online store in the United States accepting payment in euros and shipping goods to customers in the European Union;
A company processing payroll for a remote employee working in the European Union;
A marketing company running targeted campaigns aimed at audiences within the European Union.

Conversely, a strictly internal website with no European customer targeting and only incidental EU visits generally will not be subject to the GDPR.

Special Case: United States companies with EU-Based employees

The processing of employees’ personal data in the European Union triggers GDPR obligations. Some examples are maintaining personal records, processing sensitive information, and monitoring work performance. Paying an employee in the European Union without additional data processing might not necessarily trigger full GDPR compliance requirements. Even so, HR processes need to be carefully reviewed. Please check out our blog article on how the GDPR affects HR data for non-EU companies for further information.

Your next steps toward compliance

If your business is subject to the GDPR, it’s essential to be proactive about compliance.

Carry out a data mapping exercise: Record all personal data your organization gathers and processes, the reason for processing it, and where it is stored. This will feed into your Records of Processing Activities, the details of which are outlined in Art. 30 of the GDPR;
Determine a lawful basis for all your data processing activities: This provides a documented and valid legal rationale for collecting and using personal data, e.g., user consent, a contract with the individual, a legitimate interest of your organization, or a legal obligation under EU law;
Draft accessible privacy notices: Provide an intelligible and accessible privacy notice describing data collection, purposes, storage, and data sharing practices;
Respect the rights of data subjects: Enable individuals to exercise their rights under the GDPR. These rights include access, rectification, erasure, restriction, and objection;
Appoint a Data Protection Officer (DPO): Appoint a DPO where required, for example because you process large volumes of sensitive personal data or conduct systematic monitoring of individuals;
Consider an EU Representative: If your business is established outside of the European Union, you may need to have a representative within one of the member states under Article 27; and/or
Seek expert advice: The GDPR is complex. For complete compliance, it would be ideal to obtain a professional GDPR compliance audit.

Conclusion

Whether the GDPR affects an American business is not a matter of the business’s physical presence, but of whether it has a connection with individuals in the European Union. If your business offers goods or services to EU residents or monitors their activities, then it is very likely the GDPR will affect you. The penalty for failure to comply can be extremely high, both financially and with regard to one’s reputation.

All U.S. businesses should conduct an internal examination of their data processing operations. If unsure, a professional GDPR compliance assessment can provide a clear and secure path forward.


AI Data Retention Strategy under the GDPR and the EU AI Act: Reconciling the Regulatory Clock

AJ Richter — Wed, 26 Nov 2025 15:11:23 +0000

Artificial Intelligence (AI) is reshaping industries, but organizations developing AI systems face a critical, often overlooked strategic risk: managing the retention of training data in compliance with European Union (EU) law. The GDPR emphasizes rapid deletion of personal data, while the EU AI Act requires long-term archival of system documentation. Navigating these conflicting requirements is essential for legal compliance, operational efficiency, and risk mitigation. An effective AI data retention strategy under the GDPR and the EU AI Act is now essential for organisations developing, deploying, or governing artificial intelligence systems in the European Union.

Executive Summary: The Dual Compliance Imperative and Strategic Findings

Organisations that leverage advanced data processing, particularly those developing complex Artificial Intelligence (AI) systems, face a critical and often unrecognized strategic risk: the prolonged retention of training data. European Union (EU) law establishes conflicting imperatives regarding data lifecycle management, creating a fundamental compliance challenge. The General Data Protection Regulation (GDPR) mandates personal data erasure as soon as the data is no longer required for its established purpose, while the newly implemented EU AI Act demands lengthy archival of system documentation.

The GDPR is the primary constraint on personal data, and the AI Act governs long-term retention of non-personal audit and system records.

The Inescapable Regulatory Conflict: Delete Now vs. Document for a Decade

The core of the conflict lies in the tension between personal data protection and system accountability. The GDPR is clear: personal data must be erased once its specific processing purpose is fulfilled. This is enforced by the Storage Limitation Principle (Article 5(1)(e)). Retention beyond this defined necessity, even if the data might be useful for future research or system retraining, is deemed a direct violation unless a new, distinct, and lawful purpose is established.

Conversely, the EU AI Act introduces stringent requirements for system traceability, particularly for High-Risk AI Systems (HRAS). Providers of HRAS must maintain comprehensive technical documentation, quality management system records, and conformity declarations for 10 years after the system is placed on the market or put into service (Article 18, EU AI Act). This requirement applies to system records, ensuring long-term accountability, but does not override the fundamental protection afforded to individuals’ data under the GDPR.

The GDPR Foundation: The “Storage Limitation” Principle

The entire framework of data retention under EU law rests on the GDPR’s Storage Limitation Principle (Article 5(1)(e)). This foundational rule dictates that personal data must be kept “for no longer than is necessary for the purposes for which the personal data are processed.” This is the core principle driving all retention decisions.

Personal data shall be:
(e) kept in a form which permits identification of data subjects for no longer than is necessary for the purposes for which the personal data are processed; personal data may be stored for longer periods insofar as the personal data will be processed solely for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) subject to implementation of the appropriate technical and organisational measures required by this Regulation in order to safeguard the rights and freedoms of the data subject (‘storage limitation’);
GDPR Article 5(1)(e)

The GDPR does not set generic retention times, instead placing the full burden on the data controller to define, document, and justify a specific deletion timeline for every category of data. If personal data (which is defined broadly to include information beyond PII, like cookie IDs) is used to train a system, the retention clock starts ticking. Organisations leveraging advanced data processing face a critical strategic risk: retaining training data for too long. The GDPR is unambiguous: personal data must be erased once its specific processing purpose is fulfilled. Retention beyond that, even for potential future research, is a direct violation unless a new, distinct, and lawful purpose is established.

Defining the Critical Strategic Risk for GDPR non-compliance

The strategic risk is precisely defined by failing to establish, document, and legally justify a specific deletion timeline for every category of personal data used in the training process. The absence of generic retention times in the GDPR places the full burden of definition and justification squarely upon the data controller.

This environment forces organizations to confront a critical trade-off: is the unproven, speculative future value of raw personal data worth the risk of fines and potential data breaches? The calculation strongly favors deletion, because:

Failing to define and document specific deletion timelines exposes organizations to GDPR violations.
Retaining data for future retraining or academic purposes is legally indefensible once the initial training purpose is fulfilled.
Financial penalties for non-compliance can exceed the cost of implementing compliant, minimal-data systems.

The EU AI Act Layer: Traceability and Documentation

The EU AI Act introduces a layered approach to retention centered on system accountability rather than individual personal data. The rules are tied to the system’s risk profile, with High-Risk AI Systems (HRAS) (EU AI Act, Chapter 3) having the most stringent obligations.

Data Governance (Article 10) for HRAS requires that training, validation, and testing data sets be relevant, sufficiently representative and, to the best extent possible, free of errors. While not a direct retention rule, this implicitly requires maintaining data sets for a period necessary for auditing and quality checks during the development phase.

The most critical requirement is Documentation Retention (Article 18): HRAS providers must keep key records (Technical Documentation, Quality Management System records, conformity declarations, etc.) for 10 years after the system is placed on the market. This 10-year rule applies to documentation and metadata, not to the raw personal data itself. It is vital to understand that it does not override the GDPR’s Storage Limitation Principle (Article 5(1)(e)).

Raw personal data used for training must still be deleted sooner. However, the requirement for Record-Keeping (Logging) (Article 12) means that systems must automatically record events and usage logs. While these logs should ideally be anonymised, their retention period must be “appropriate”, which extends the non-personal data record-keeping timeline. This mandates a long-term, non-personal data retention strategy that must be carefully integrated with the strict, short deletion cycles required by the GDPR for raw personal data.
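
The end of the 10-year documentation period can be worked out with a short date calculation. This is a sketch under assumptions: the function name `documentation_retention_end` is hypothetical, and the handling of a 29 February start date (falling back to 28 February) is an illustrative choice, not something the EU AI Act prescribes.

```python
from datetime import date


def documentation_retention_end(placed_on_market: date, years: int = 10) -> date:
    """Return the date on which the documentation retention period ends."""
    target_year = placed_on_market.year + years
    try:
        return placed_on_market.replace(year=target_year)
    except ValueError:
        # Only reached for 29 February when the target year is not a leap year.
        # Falling back to 28 February is an assumption made for this sketch.
        return placed_on_market.replace(year=target_year, day=28)


print(documentation_retention_end(date(2026, 3, 1)))   # 2036-03-01
print(documentation_retention_end(date(2024, 2, 29)))  # 2034-02-28
```

A date like this applies only to the documentation and other non-personal records; the deletion date for raw personal data is set separately by the controller’s documented purpose.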

Blending the GDPR and EU AI Act Requirements

The intersection of the GDPR and the EU AI Act necessitates a blended compliance strategy, particularly concerning purpose and identification. The GDPR’s Purpose Limitation principle (Article 5(1)(b)) demands that the purpose for processing, such as system training, be explicitly defined. This definition directly dictates the maximum legal retention period for personal data.

Personal data shall be:
(b) collected for specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes; further processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes shall, in accordance with Article 89(1), not be considered to be incompatible with the initial purposes (‘purpose limitation’);
GDPR Article 5(1)(b)

Implementing De-Identification in Your AI Data Retention Strategy under the GDPR and the EU AI Act

The best path for long-term data use is de-identification:

Pseudonymisation only reduces identifiability; the data remains personal data under the GDPR and the Storage Limitation Principle still applies.
Anonymisation is the only legal release valve. If the data is permanently and irreversibly stripped of identifiers, it is no longer considered personal data (GDPR Recital 26). Therefore, it can be retained indefinitely.

It’s critical to remember that while the raw personal data must be deleted, the trained system itself (the output) can be retained.

Reconciling the GDPR’s Right to Erasure with the EU AI Act Traceability

The most direct legal challenge is reconciling the GDPR’s Right to Erasure (Article 17) with the ongoing need for system traceability under the AI Act. If a system is trained on personal data, the controller must maintain the technical ability to honor an erasure request.

This is the Purpose Limitation Conflict: if the initial purpose (training) is complete, retaining the raw personal data is a violation of the GDPR. Developers must implement technical solutions like secure deletion protocols immediately after a system is finalised. Using robust, irreversible anonymisation is the only way to retain data sets without triggering the GDPR’s strict retention clock.

When facing overlapping regulations, the GDPR always acts as the primary constraint on personal data. Its Storage Limitation Principle sets the hard ceiling for raw personal data retention. This is regardless of the EU AI Act’s documentation rules.

The crucial legal distinction is that PII and other personal data used to create the system must be subject to rigorous deletion procedures the moment the training purpose ends. The technical documentation, metadata, and system logs (which should contain no personal data) are then subject to the EU AI Act’s extended 10-year retention rules. This hierarchy demands that the deletion process (the GDPR) must happen first, leaving only the audit trail (EU AI Act) behind.

The documentation required under the EU AI Act must serve dual purposes: it must confirm the system’s data quality (EU AI Act) and must also provide evidence of the deletion or robust anonymization event, confirming that the GDPR timeline was honored.

Table: Comparison of differences

Aspect	GDPR (Personal Data Protection)	EU AI Act (HRAS Accountability)
Asset	Raw PII, Pseudonymous Data, Identifiable Metadata.	Technical Documentation, QMS, System Logs (Non-Personal), Conformity Records.
Core Principle	Storage Limitation (Delete when purpose ends).	Accountability & Traceability (Document for 10 years).
Max Retention Period	Defined by Controller’s Justified Purpose (Short/Medium Term).	10 years after the system is placed on the market.
Legal Hierarchy	Primary binding constraint on identifiability.	Governs the necessary audit trail after GDPR constraints are met.
Highest Penalty Risk	4% Global Annual Turnover (Financial).	Operational disruption, market access denial.

The Financial & Operational Cost of AI Data

Compliance is not just a cost, but a powerful risk mitigator. Storing raw personal data beyond the necessary period is a direct violation of the GDPR’s Storage Limitation Principle. This exposes an organisation to fines of up to 4% of global annual turnover (GDPR Article 83).

Beyond the fines, excessive data retention creates massive operational liability. Longer storage times mean higher infrastructure costs and a larger surface area for security breaches. Every day the data is held, the probability of a costly Data Subject Request (DSR) increases, demanding expensive legal and technical personnel to fulfill. Compliant, timely deletion is ultimately the most financially responsible strategy.

Should you store raw personal data for training?

Organisations often retain raw data for perceived future utility, perhaps for retraining a system. The GDPR forces a hard strategic trade-off: is the speculative future value of that raw personal data worth the immediate, tangible risk of massive fines and data breaches?

The EU AI Act demands auditable records, but these should be built from fully anonymised data or non-personal metadata. The cost calculation is simple: the threat of a financial penalty for retaining personal data too long is a much greater potential cost than developing a compliant, data-minimal system. A mature data strategy prioritises de-identification and deletion over retention, significantly reducing the organisation’s regulatory and financial exposure.

Data Type	Legal Status	Retention Requirement	Effect on AI Systems
Raw Personal Data (PII)	Personal data under the GDPR	Must be deleted as soon as the training purpose ends (Article 5(1)(e))	Limits availability for retraining; requires technical deletion pipelines; increases compliance complexity if data spans multiple systems
Pseudonymised Data	Still personal data under the GDPR	Same as raw personal data; cannot retain for 10-year audit	Provides limited utility for internal processing, but retention beyond purpose is legally risky; still triggers Data Subject Requests and fines if not deleted
Irreversibly Anonymised Data	Non-personal data (Recital 26)	Can be retained indefinitely	Supports long-term model auditing, retraining, bias checks, and the EU AI Act traceability; safe to store for 10-year audit requirements
Metadata / Technical Documentation	Non-personal data	Retention required up to 10 years under the EU AI Act (Articles 10, 18)	Supports HRAS compliance; ensures traceability without exposing personal data; must be designed to avoid inclusion of PII
System Logs	Non-personal / anonymized	Retention period must be “appropriate,” often aligned with the EU AI Act 10-year audit	Enables audit and monitoring; must be anonymized to avoid GDPR violations; operational impact includes storage and secure access management

Strategic Recommendations

The regulatory landscape governing AI development in the EU is defined by a critical tension:

the immediate obligation to protect individual privacy (GDPR) and
the extended obligation to ensure system safety and traceability (EU AI Act).

Compliant data management requires recognizing the GDPR’s Storage Limitation Principle as the absolute constraint on personal data retention, regardless of the EU AI Act’s documentation timelines. The solution is architectural separation, where raw personal data is subject to automated deletion, and the audit trail is constructed exclusively from non-personal, irreversibly anonymized assets.
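The architectural separation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the names `RawRecord`, `AuditEntry` and `purge_expired` are hypothetical, and whether a given piece of metadata is truly non-personal still has to be assessed case by case.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RawRecord:
    """Raw personal data held for a specific, documented purpose (hypothetical shape)."""
    record_id: str
    payload: dict
    purpose_end: date  # date on which the justified purpose ends


@dataclass
class AuditEntry:
    """Aggregate metadata kept for the long-term audit trail (assumed non-personal)."""
    dataset_version: str
    deleted_count: int
    deleted_on: date


def purge_expired(store: list[RawRecord], today: date, dataset_version: str):
    """Drop records whose purpose has ended and log only aggregate metadata."""
    kept = [r for r in store if r.purpose_end > today]
    entry = AuditEntry(dataset_version, len(store) - len(kept), today)
    return kept, entry
```

The raw store and the audit trail never share fields: the audit entry records that a deletion happened and how many records it covered, but nothing about the individuals concerned.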

TL;DR

Under the GDPR, personal data must be deleted once its specific purpose is fulfilled. This limits how long raw training data can be stored.
For AI developers, this means models cannot indefinitely rely on historical raw personal data. This can potentially impact retraining strategies and model evolution.

The post AI Data Retention Strategy under the GDPR and the EU AI Act: Reconciling the Regulatory Clock appeared first on TechGDPR.

Introducing the Blockchain DPIA Template for GDPR Compliance

AJ Richter — Tue, 21 Oct 2025 13:26:40 +0000

The Blockchain DPIA Template: Ensuring GDPR Compliance in a Decentralized World

Blockchain is transforming industries by enabling transparency, trust, and decentralization. However, when it comes to handling personal data, blockchain presents significant challenges. The GDPR places strict requirements on data processing, many of which are difficult to reconcile with blockchain’s core characteristics. The European Data Protection Board (EDPB) recently issued draft guidance (Guidelines 02/2025 on processing of personal data through blockchain technologies, for public consultation) suggesting that a Data Protection Impact Assessment (DPIA) has to be carried out when personal data is processed on a blockchain. Because the threshold for data being ‘personal’ is low, even transactions would be personal data in many cases.

We created a comprehensive Blockchain DPIA Template that helps organisations meet these requirements by providing a structure and toolkit to assess, document, and manage privacy risks in blockchain systems.

Request our Blockchain DPIA Free Template

Why Blockchain Needs a Data Protection Impact Assessment

A Data Protection Impact Assessment, or DPIA, is a crucial process mandated by the GDPR for processing activities that pose a high risk to the rights and freedoms of individuals, or that appear on a supervisory authority’s blacklist of processing operations requiring a DPIA. It helps organizations identify and minimize the data protection risks of a project. For emerging technologies like blockchain, which often involve novel data processing methods, conducting a thorough DPIA is not just a legal requirement but a fundamental step towards responsible innovation. This article introduces our new blockchain-specific DPIA template, designed to help navigate the complexities of GDPR compliance in decentralized environments.

The challenge of the GDPR in decentralized systems

Blockchain technology introduces features that directly affect privacy and data protection. The GDPR requires organisations to uphold data subject rights, such as the right to erasure, the right to rectification, and the right to access. These rights can be difficult to enforce on an immutable and distributed ledger.

In a typical blockchain network, data is stored across many nodes, sometimes in different legal jurisdictions. This raises questions about international data transfers and how organisations can maintain control over the information they process.

Blockchain’s inherent characteristics present unique challenges for GDPR compliance. Its immutability, for instance, clashes with the fundamental right to erasure. The global distribution of blockchain nodes also complicates data transfers and jurisdictional oversight.

Risks of non-compliance

If an organisation fails to adequately assess and mitigate data protection risks, it may face regulatory action, reputational harm, or loss of user trust. A blockchain DPIA is a critical step to show accountability and demonstrate compliance with the GDPR.

Failing to comply with the GDPR can result in significant fines and severe reputational damage. For blockchain projects, where trust and transparency are paramount, avoiding such risks is critical for long-term success.

About the Blockchain DPIA Template

Who is it for?

The blockchain DPIA template is designed for privacy professionals, compliance officers, legal teams, blockchain developers, and project leads. It provides a structured way to assess the data protection implications of blockchain-based processing.

This template is an invaluable resource for privacy professionals, blockchain developers, and data protection officers, or DPOs, who are grappling with GDPR compliance in the blockchain space.

What does it include?

The template guides users through all required areas of a DPIA under the GDPR:

Description of the processing operations
Legal basis and necessity assessment
Identification of risks
Safeguards and technical measures
Data subject rights and governance structures

It focuses on blockchain-specific concerns such as data immutability, public ledger transparency, pseudonymisation, and decentralised accountability.

The template provides a comprehensive framework covering various aspects of a blockchain project. It systematically addresses processing operations, establishes the appropriate legal basis, facilitates thorough risk assessment, and outlines necessary safeguards to uphold data subject rights.

Alignment with GDPR Article 35 and privacy by design principles

Our template is meticulously aligned with Article 35 of the GDPR, which mandates DPIAs for high-risk processing. It also strongly promotes privacy by design principles, encouraging privacy considerations from the very initial stages of development.

Key Features and Structure of the Template

Comprehensive processing description

The template helps users map how personal data flows through blockchain systems. This includes both on-chain and off-chain components, data categories, infrastructure models, and participating entities. The template offers a structured approach to mapping how personal data flows and is processed within blockchain environments, a critical first step in any DPIA.

Risk identification tailored to blockchain

The template includes a detailed risk taxonomy specifically designed for blockchain environments. It highlights risks such as:

Immutability preventing data deletion
Broad visibility of data on public chains
International data transfers to unknown jurisdictions
Difficulties in exercising data subject rights

It specifically addresses the unique risks posed by blockchain technology, including issues related to immutability, transparency, and decentralized governance.

Measures to reduce risk and demonstrate compliance

The template includes practical tools and suggestions for implementing effective risk mitigation strategies and technical safeguards, such as encryption, pseudonymization, and the appropriate use of off-chain storage solutions. These are aligned with the GDPR principles of data protection by design and by default.
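One commonly discussed way to combine off-chain storage with an immutable ledger is to keep the personal data and a random salt off-chain and write only a salted digest on-chain. The sketch below illustrates that pattern under stated assumptions: `OffChainStore` is a hypothetical in-memory class, no real blockchain API is involved, and whether the digest left on-chain still counts as personal data is a legal question the DPIA itself has to answer.

```python
import hashlib
import secrets


class OffChainStore:
    """Hypothetical off-chain store holding personal data and per-record salts."""

    def __init__(self):
        self._records = {}

    def put(self, record_id: str, personal_data: bytes) -> str:
        """Store the data off-chain and return the digest to anchor on-chain."""
        salt = secrets.token_bytes(32)
        self._records[record_id] = (salt, personal_data)
        # Only this salted SHA-256 digest would be written to the ledger.
        return hashlib.sha256(salt + personal_data).hexdigest()

    def verify(self, record_id: str, on_chain_digest: str) -> bool:
        """Check that the off-chain record still matches the on-chain digest."""
        if record_id not in self._records:
            return False  # record erased: nothing left to link the digest to
        salt, data = self._records[record_id]
        return hashlib.sha256(salt + data).hexdigest() == on_chain_digest

    def erase(self, record_id: str) -> None:
        """Delete the data and its salt together; the ledger is left untouched."""
        self._records.pop(record_id, None)
```

Calling `erase` removes both the data and the salt, so the digest that remains on the ledger can no longer be verified against any record held off-chain.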

Benefits of Using This Template

Saves time and ensures completeness

The blockchain DPIA template includes ready-to-use sections, prompts, and examples. It reduces the risk of overlooking key aspects of the GDPR and ensures all critical issues are addressed. Using a pre-designed template significantly saves time and helps ensure that no critical aspect of your DPIA is overlooked.

Builds trust with regulators and stakeholders

A well-documented DPIA shows that your organisation takes data protection seriously. It provides a clear record of decisions, risk mitigation strategies, and safeguards, which can be shared with regulators or partners. Demonstrating a commitment to data protection through a thorough DPIA builds trust with regulators and enhances user confidence in your blockchain project.

Supports privacy-respecting innovation

The template helps teams think about data protection from the start. It supports innovation that respects individual rights and meets the expectations of users and regulators alike. Ultimately, this template supports and promotes responsible innovation, allowing blockchain projects to thrive while respecting individual privacy rights.

How to Use the Template Effectively

Integrate it early in the blockchain development lifecycle.
Take a collaborative approach involving legal, technical, and compliance teams, which is essential for a holistic and accurate DPIA.
Review and update it periodically as the project evolves.

The TechGDPR Blockchain DPIA Template

Our blockchain DPIA template provides a practical solution for navigating these complexities. It helps ensure that blockchain projects are built with privacy and accountability in mind. DPIAs are not merely a bureaucratic hurdle; they are an indispensable tool for ensuring that blockchain technology develops in a privacy-respecting manner. By proactively identifying and mitigating data protection risks, we can foster a future where decentralized systems empower individuals while upholding their fundamental rights.

Our Blockchain DPIA Template is available for free and can be requested below.

Request our Blockchain DPIA Free Template

The post Introducing the Blockchain DPIA Template for GDPR Compliance appeared first on TechGDPR.

GDPR Compliance for AI: Managing Cross-Border Data Transfers

AJ Richter — Wed, 23 Jul 2025 07:33:02 +0000

Artificial intelligence (AI) relies on large and varied datasets to train models and enhance functionality. Though AI often works across borders, data protection regulations such as the EU General Data Protection Regulation (GDPR) impose stringent controls on transferring personal data abroad.

The question is evident: how can businesses deploy global AI systems and still comply with the GDPR’s cross-border data transfer rules? Answering it requires understanding the link between AI and personal data, and the legal landscape governing cross-border transfers.

Understanding the AI and the GDPR Landscape

Artificial intelligence systems typically need very large amounts of data, some of which may be personal data. This data is typically obtained from various jurisdictions and processed using cloud platforms, data centers, and development teams in various countries. This worldwide infrastructure complicates compliance with the GDPR, which restricts the transfer of personal data beyond the European Economic Area (EEA) and the United Kingdom.

The GDPR is grounded in the fundamental principles of lawfulness, fairness, transparency, purpose limitation, and data minimization. It also requires accuracy, storage limitation, integrity, confidentiality, and accountability. Any AI system that involves personal data should adhere to these principles, even when the data is transferred across borders.

Cross-border data transfers happen when personal data is moved from the EEA to a third country. These are addressed by Chapter V of the GDPR, which sets out the legal frameworks organisations must follow. Since most AI systems involve international data processing, virtually all of them are confronted with this regulatory challenge.

Focal Compliance Challenges in Cross-Border AI Projects

There are a few challenges that make it hard to regulate cross-border data in AI:

Terabytes of information: AI systems read text, images, video, audio, and behavior data in volumes that older compliance procedures find difficult to keep up with. It’s no small challenge to collect, categorize, and safeguard these datasets across borders.
Pseudonymization risks: So-called anonymized data can in fact facilitate re-identification, particularly when combined with additional datasets. It is important to understand the difference between pseudonymized and anonymized data.
Lack of transparency: Most AI systems, especially deep learning-based systems, are “black boxes.” This lack of interpretability may hinder the ability of organizations to show compliance with the GDPR, especially with purpose limitation and data minimization.
Shifting rules: National authorities and the European Data Protection Board (EDPB) regularly update their guidance on AI, transfers abroad, and the way the two interact. Requirements mount further with the arrival of legislation such as the EU AI Act.
Third-party risk: Third-party data suppliers, cloud vendors, and outsourced data processors are increasingly part of the AI supply chain. Unless they are properly managed, they bring inherent third-party risk through non-compliance, data loss, or unauthorized transfers.

Legal Frameworks for GDPR-Compliant Cross-Border Transfers

The GDPR provides a range of legal frameworks for cross-border transfers of personal data beyond the EEA, depending on conditions and limitations.

Adequacy decisions are among them. The European Commission can determine that a non-EEA country ensures “adequate” protection for personal data, in which case data can flow freely. These decisions have been granted to Japan and Switzerland, and to the United States under the new EU–U.S. Data Privacy Framework. Adequacy decisions are not absolute, however, and can be invalidated, as happened with Privacy Shield.
For transfers to countries without an adequacy decision, Standard Contractual Clauses (SCCs) are the most used mechanism. These contractual clauses keep the protection of internationally transferred data from falling below EU levels. Since the Schrems II judgment, organizations must perform Transfer Impact Assessments and introduce additional safeguards in order to lawfully use SCCs.
Binding Corporate Rules (BCRs) are a further possibility for multinationals. They are internal codes of conduct that have to be approved by a data protection authority and are legally enforceable against the corporate group. They are a scalable solution for intragroup data transfers, but obtaining approval may be time-consuming and costly.
The GDPR also has limited derogations for certain situations, including where the individual provides explicit consent or where a transfer is necessary for the performance of a contract. These exceptions are narrow and are not to be used for routine or bulk transfers.
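The order in which these mechanisms are usually considered can be written down as a small decision helper. This is a simplified sketch, not legal advice: the function and enum names are hypothetical, the inputs are assumed to come from a prior legal assessment, and preferring approved BCRs over SCCs is an illustrative choice rather than a rule stated in the GDPR.

```python
from enum import Enum


class Mechanism(Enum):
    NOT_RESTRICTED = "destination is inside the EEA; Chapter V does not apply"
    ADEQUACY = "adequacy decision"
    BCR = "Binding Corporate Rules"
    SCC_WITH_TIA = "Standard Contractual Clauses plus Transfer Impact Assessment"


def choose_mechanism(destination_in_eea: bool,
                     adequacy_covers_recipient: bool,
                     approved_bcrs_cover_transfer: bool) -> Mechanism:
    """Walk the transfer mechanisms in a fixed (assumed) order of preference."""
    if destination_in_eea:
        return Mechanism.NOT_RESTRICTED
    if adequacy_covers_recipient:
        return Mechanism.ADEQUACY
    if approved_bcrs_cover_transfer:
        return Mechanism.BCR
    # Derogations are deliberately never returned: they are narrow exceptions
    # that need case-by-case legal review, not an automated default.
    return Mechanism.SCC_WITH_TIA
```

For example, `choose_mechanism(False, False, False)` returns `Mechanism.SCC_WITH_TIA`, reflecting that SCCs are the fallback when neither an adequacy decision nor approved BCRs apply.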

Practical Steps to Remain Compliant

To effectively administer cross-border data transfers, follow these best practices:

Map data flows: Determine where personal data comes from, is processed, and travels.
Perform Data Protection Impact Assessments (DPIAs): For higher-risk AI projects, DPIAs help identify risks in the areas of discrimination, bias, data protection, and data transfers.
Improve data governance: Establish policies and roles that assign accountability across operational, technical, and legal teams. This ensures consistency when dealing with personal data.
Enforce security controls: Both organizational and technical controls are needed. These include secure development of AI models, access controls, pseudonymization, and encryption. Regular security audits and penetration tests help counter threats that arise during cross-border transfers.
Manage third parties: Secure robust data processing terms and ensure all suppliers comply with the GDPR. Any AI supplier or cloud provider handling personal data on your behalf must be subject to rigorous due diligence. This includes negotiating robust data processing agreements (DPAs) and ensuring vendors apply GDPR-level controls.
Train your staff: Make sure staff understand their role in AI and international data processing. A specific incident response plan also needs to be created to handle any AI system-related breaches.
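The first step in the list above, mapping data flows, lends itself to a simple inventory that can be filtered for transfers leaving the EEA. The sketch below is illustrative only: the `DataFlow` fields and the example systems are hypothetical, countries are identified by ISO 3166-1 alpha-2 codes, and the United Kingdom is treated as a third country from the EEA's perspective.

```python
from dataclasses import dataclass

# The 27 EU member states plus Iceland, Liechtenstein and Norway.
EEA = frozenset({
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR",
    "DE", "GR", "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL",
    "PL", "PT", "RO", "SK", "SI", "ES", "SE", "IS", "LI", "NO",
})


@dataclass(frozen=True)
class DataFlow:
    """One entry in a (hypothetical) data flow inventory."""
    system: str
    data_category: str
    origin: str        # country where the data is collected
    destination: str   # country where the data is processed or stored


def flows_leaving_eea(flows):
    """Return flows that start inside the EEA and end in a third country."""
    return [f for f in flows if f.origin in EEA and f.destination not in EEA]


inventory = [
    DataFlow("training-cluster", "user text", "DE", "US"),
    DataFlow("crm", "contact details", "FR", "NL"),
]
flagged = flows_leaving_eea(inventory)
```

Each flagged flow is then a candidate for a transfer mechanism and, where relevant, a Transfer Impact Assessment.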

Readiness and Regulation

Regulatory requirements are changing. The EU AI Act and industry-specific guidelines from the EDPB and others will keep changing what compliance looks like for AI. Leading businesses are already building governance structures in line with the GDPR and these new rules. Practices such as automated data flow mapping, real-time risk management, and regularly run Transfer Impact Assessments are becoming standard. Legal, technical, and compliance staff need to work together so that AI innovation stays aligned with regulatory requirements.

Conclusion

Cross-border transfers of AI data under the GDPR are difficult, but not impossible. With a good understanding of the regulatory frameworks, a focus on high-risk areas, and sound mitigations, organizations can deploy effective AI technologies in full compliance.

Creating AI responsibly involves creating it legally. Now is the time to audit your cross-border data transfer processes, enhance your governance structure, and embed compliance in all areas of your AI work.

The post GDPR Compliance for AI: Managing Cross-Border Data Transfers appeared first on TechGDPR.

How to build trustworthy AI from the ground up with Privacy by Design?

AJ Richter — Wed, 25 Jun 2025 12:15:30 +0000

We now live in a time where technologies such as artificial intelligence are increasingly woven into the fabric of everyday life. AI is invisibly present, performing an array of functions such as showing recommendations, detecting fraud, predicting disease, and navigating traffic. However, concern about privacy is growing along with the benefits of these technologies. Who owns the data the model is trained on? Can users consent to algorithmic choices that are beyond their comprehension? How do we avoid harm before it happens? These are some of the most pressing questions.

Privacy by Design (PbD) is crucial here. Calling it a good idea undersells it; ‘critical’ is much closer to the mark. The framework developed by Dr. Ann Cavoukian is integral to embedding privacy in AI infrastructures. It is important to understand how AI developers can put PbD into practice, and why preserving user privacy matters.

Understanding PbD starts from the belief that privacy should not be something users have to look for or configure themselves, but something set as a default feature.

Understanding Privacy by Design: Principles at the Core

Privacy by Design is based upon the notion that privacy should be the natural default and not an optional feature one must find or switch on. Instead of responding to privacy violations, PbD has companies anticipate them and prevent them from occurring in the first place. Its seven design principles are not idealistic goals; they are pragmatic recommendations for integrating ethical data handling at every stage of the design process.

Picture Privacy by Design as baking privacy into a cake rather than adding it on top as sprinkles. PbD builds privacy into systems from the start.

Here are the seven main principles in more detail:

Proactive not reactive; preventive not remedial: Anticipate risks before they arise. Don’t wait for a breach to act.
Privacy as the default setting: Individuals shouldn’t have to request privacy. It should be automatic.
Privacy embedded into design: Build systems that make it impossible to forget privacy because it’s built in, not added later.
Full functionality by being positive-sum, not zero-sum: Achieve both privacy and innovation; one shouldn’t come at the expense of the other.
End-to-end security and lifecycle protection: Protect data from the moment it’s collected until it’s deleted.
Visibility and transparency: Systems must be open to inspection, review, and explanation.
Respect for user privacy: Keep the user at the center with simple controls and clear, honest communication.

The Unique Privacy Challenges in AI

AI is different from typical software. Its reliance on enormous collections of data and its capacity to infer sensitive material from ostensibly harmless data points make it highly invasive. Models trained on voice, text, image, or behavioral data can identify not only user tendencies but also mood, political orientation, or state of health.

This poses a sequence of privacy threats:

Over-collection: AI is hungry for data, and developers therefore tend to overcollect.
Inferred data: Models can make highly accurate predictions, often inferring more than users have explicitly disclosed.
Opacity: Most AI models are “black boxes,” where even the developers aren’t necessarily sure how the decisions are being made.

Ignoring privacy can result in:

Fines and lawsuits under legislation such as the GDPR, the EU AI Act and the CCPA.
Loss of customer and user trust.
PR disasters that bury your brand.

Good privacy is not only good business, but good ethics as well.

Best Practices for Integrating PbD in AI Development

To implement Privacy by Design properly in AI systems, developers need to be strategic as well as practical. Below are crucial steps to follow:

Begin with Privacy Impact Assessments (PIAs): Before creating anything, perform a PIA to discover privacy threats and analyze how your AI system processes information. This way, threats are identified and addressed upfront, instead of once it is deployed. Begin your AI project by questioning:

What information is required?
What are the threats?
How are users safeguarded?

Adopt data minimization and purpose limitation: Collect data only if it’s needed to accomplish a precise, well-defined purpose. This minimizes risk and simplifies handling of privacy obligations. Refrain from the temptation to “collect now, decide later.”
Take advantage of privacy-enhancing technologies: Differential privacy adds noise to statistics, preventing data from being traced back to individuals. Federated learning trains models on user devices, reducing central data aggregation. These technologies maintain utility while keeping user identities secure.
Encourage transparency and explainability: Transparency does not solely involve open-sourcing code but, more importantly, explaining in simple terms how the system functions, what information is used, and what the model is deciding. Model interpretation techniques and tools such as model cards can assist.
Ensure secure access and data encryption: Both in transit and at rest, data should be encrypted. Controls on access must be strong, restricting access to data by role and need. Regular audits should be performed to ensure compliance.
Build ethical oversight: Develop cross-disciplinary review boards consisting of technologists, legal specialists, ethicists, and community members. Such bodies can review projects for privacy, fairness, and unintended effects.
Design for user empowerment: Provide users with the ability to see, control, and remove their information. Provide privacy controls that are understandable and accessible. Opt-in should be the norm, not sneaky default options or unclear text.
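To make the differential privacy item in the list above concrete, here is a minimal sketch of the Laplace mechanism applied to a counting query. The function name `dp_count` and the example data are hypothetical, and the sketch uses only the Python standard library; production systems would normally rely on a vetted differential privacy library, since naive floating-point noise has known weaknesses.

```python
import random


def dp_count(values, predicate, epsilon, rng=None):
    """Return a noisy count of the items matching predicate (Laplace mechanism)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    # A counting query has sensitivity 1: adding or removing one person
    # changes the result by at most 1, so the noise scale is 1 / epsilon.
    # The difference of two Exp(epsilon) draws follows Laplace(0, 1 / epsilon).
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


ages = [25, 31, 47, 52, 38, 29]
noisy_over_30 = dp_count(ages, lambda age: age >= 30, epsilon=0.5)
```

A smaller `epsilon` means more noise and stronger privacy; a larger one means a more accurate but less protective answer, which is the utility trade-off the technique is built around.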

Lessons from the real world

Let’s see who’s doing it right and who didn’t:

Apple has been a leader in on-device computing and differential privacy. Their health features, for instance, store personal data locally and in anonymized form.
Google applies federated learning in its Gboard keyboard to allow for predictive text without ever transmitting what users are typing.
But Clearview AI and Cambridge Analytica are cautionary tales. Both firms failed to respect user privacy and faced lawsuits, penalties, and long-term public distrust.
- Clearview AI scraped billions of images without permission and was met with worldwide outrage.
- Cambridge Analytica harvested Facebook data for political campaigns and sparked worldwide alarm about AI and privacy.

The Trade-Offs and Challenges Ahead

Even with the best of intentions, it’s hard to implement PbD for AI. There are compromises:

Data minimization vs. performance: Minimizing the data you hold about people restricts how much data you can process, and fewer data points can result in lower-performing models.
Anonymity vs. fairness: Reducing bias relies on demographic information, which introduces new privacy issues. Ensuring fairness often requires data on race or gender, which is sensitive.
Technical expertise: Techniques such as federated learning and differential privacy call for expert know-how as well as computational resources.

These challenges are worth overcoming. With privacy as both a competitive advantage and a legal requirement, businesses embracing PbD will be well ahead of their competitors in the long term.

What’s coming next?

Regulations are solidifying. The EU AI Act and other initiatives are establishing new norms. Meanwhile, technologies such as homomorphic encryption (so computation can be performed on encrypted information) and synthetic data (which simulates real data without revealing real users) are opening up new paths for privacy-led innovation. These technologies will help AI developers create systems that safeguard people.

As AI reshapes society, privacy must not be treated as an afterthought. It’s a design choice that reflects an organization’s values, foresight, and respect for its users. Integrating Privacy by Design isn’t just about avoiding penalties; it’s about building systems that are ethical, resilient, and worthy of trust. If you’re building AI, you’re shaping the future. Make it one where people feel safe and respected. By using Privacy by Design, you’re not just avoiding trouble; you’re building trust, improving outcomes, and showing users you’ve got their back.

Every line of code and every product decision is an opportunity to do better. Start now. Make privacy the foundation, not the fix.

The post How to build trustworthy AI from the ground up with Privacy by Design? appeared first on TechGDPR.

How Privacy Enhancing Technologies (PETs) Can Help Organizations Stay GDPR Compliant

AJ Richter — Tue, 13 May 2025 09:22:00 +0000

Safeguarding personal information is now more important than ever. 95% of customers will not engage with companies that cannot offer adequate safeguards for their data. With data protection regulations like the General Data Protection Regulation (GDPR), organizations are under constant pressure to protect sensitive data while ensuring compliance. Privacy Enhancing Technologies (PETs) have emerged as powerful tools to achieve this balance. These technologies not only help secure personal data but also support GDPR compliance by minimizing risks and enhancing confidentiality.

But what are PETs exactly, and how can they help organizations meet GDPR standards? PETs are crucial to securing data and play a critical role in modern data privacy.

What Are Privacy Enhancing Technologies (PETs)?

Privacy Enhancing Technologies (PETs) are a set of tools and techniques designed to protect personal data throughout its lifecycle. PETs can help reduce the risk to individuals while enabling further analysis of personal data without a controller necessarily sharing it, or a processor having access to it. They aim to minimize the exposure of sensitive information while still enabling data processing. PETs can be categorized based on their primary function: minimization, confidentiality, and control.

Some of the key types of PETs are as follows:

Anonymization: This technique removes or alters personal identifiers so data cannot be traced back to an individual. Under the GDPR, true anonymization is considered irreversible, allowing the data to be stored and used without further GDPR constraints.
Pseudonymization: Unlike anonymization, pseudonymization replaces private identifiers with artificial labels. Although it is reversible under strict controls, it adds a layer of protection by decoupling personal identifiers from the dataset. It is very important to understand that pseudonymized data is not the same as anonymized data.
Encryption: Encryption converts data into a coded format, accessible only with a specific decryption key. This ensures that even if the data is intercepted, it remains unreadable to unauthorized parties.
Synthetic data: This allows organizations to create artificial data that mimics real data but preserves user privacy. Synthetic data is often used in AI and machine learning as well as software testing and development.
Differential privacy: This is a mathematical concept that adds randomness or noise to data analysis, making it more difficult to identify individuals.

Confidential computing: This form of data processing prevents unauthorized access to data during computation. It is often used in cloud computing and for healthcare and financial services.
Federated learning: This machine learning approach allows multiple organizations to train algorithms collaboratively without sharing raw data, enhancing both privacy and compliance.
Trusted execution environments: Secure hardware or software environments within a system that provide an isolated area of execution of sensitive operations and protect code and data from external tampering.
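The difference between pseudonymization and anonymization in the list above is easy to see in code. The sketch below uses a keyed hash (HMAC-SHA256) as one possible pseudonymization technique; the function names are hypothetical and key management is left out. The `relink` function shows why the output remains personal data: anyone holding the key can map a pseudonym back to a known identifier.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, key: bytes) -> str:
    """Replace an identifier with a stable, keyed artificial label."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def relink(pseudonym: str, candidates: list[str], key: bytes):
    """With the key, a pseudonym can be matched back to a known identifier."""
    for candidate in candidates:
        if hmac.compare_digest(pseudonymize(candidate, key), pseudonym):
            return candidate
    return None
```

Because the same identifier always yields the same label, records can still be joined for analysis, which is useful but is also exactly what keeps the data within the scope of the GDPR.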

By using these technologies, organizations can significantly reduce the risk of data breaches and support GDPR’s core principles. PETs help to ensure that an individual’s data is better protected to avoid any potential data breaches or misuse of data.

GDPR Principles Supported by PETs

The GDPR is built around principles that prioritize data protection at every stage of processing. PETs offer a practical path to compliance by reinforcing these key principles.

The following key GDPR principles can be reinforced through the use of PETs:

Data Minimization (Article 5): PETs like anonymization and pseudonymization ensure that only necessary personal data is processed, reducing exposure. Techniques like differential privacy also enable organizations to analyze data sets without exposing individual identities, aligning with GDPR’s minimization principle.
Integrity and Confidentiality (Article 5): Technologies such as encryption protect data against unauthorized access, maintaining its confidentiality and integrity. Homomorphic encryption, for instance, allows for computations on encrypted data without revealing its contents, offering enhanced protection.
Technical and Organizational Measures (Article 25): Implementing PETs as part of system design supports privacy by design, a core requirement of the GDPR. This includes pseudonymizing or encrypting data by default, ensuring that privacy safeguards are active even before processing begins.

Organizations can further strengthen their compliance by incorporating PETs into Data Protection Impact Assessments (DPIAs), identifying and addressing potential risks before processing begins. DPIAs help document how PETs mitigate risks by offering a transparent view of data processing activities.

PETs and International Data Transfers

Cross-border data transfers are a major concern under the GDPR, especially after the Schrems II ruling. PETs help address these challenges by adding layers of security to data during transit. Technologies like encryption and federated learning help sensitive information remain protected even during international exchanges. PETs can act as supplementary measures that support the requirements of GDPR Chapter V (Articles 44-50), reducing risks during cross-border transfers and maintaining compliance with European standards.

For example, federated learning allows machine learning models to be trained across multiple locations without sharing raw data, which reduces exposure and facilitates compliance with strict European data protection laws. Encryption further ensures that even if data is intercepted during transfer, it remains unreadable without the right decryption keys.
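
The federated learning idea can be sketched in a few lines. In this simplified illustration, each site computes an update on its own data and only the resulting model weights are sent to a coordinator, which averages them weighted by the number of local samples (the approach commonly called federated averaging). The function names and the plain-list representation of weights are assumptions for illustration; production systems use dedicated frameworks and usually add further safeguards, since model updates can themselves leak information.

```python
from typing import List, Sequence


def local_update(
    weights: Sequence[float],
    local_gradient: Sequence[float],
    learning_rate: float = 0.1,
) -> List[float]:
    """One gradient step computed at a single site; raw data never leaves it."""
    return [w - learning_rate * g for w, g in zip(weights, local_gradient)]


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine per-site weights into a global model, weighted by sample count."""
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("client_sizes must sum to a positive number")
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]
```

Only the weight vectors cross the border in this sketch; the training records stay with each participating organization.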

Real-World Applications of PETs

PETs are already being used across various industries to maintain privacy and GDPR compliance.

Here are some core examples of PET usage:

Healthcare: Differential privacy allows hospitals to share patient data for research while protecting confidentiality.
Technology: Companies like Google and Apple use federated learning to improve their services without centralizing user data. Apple also uses differential privacy.
Finance: Secure computation enables financial institutions to analyze sensitive data while maintaining strict confidentiality.
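
As a rough illustration of how differential privacy works in a case like the healthcare example above, the following sketch adds Laplace noise to a count before it is released. The function names and the choice of a simple counting query are assumptions for illustration. The parameter epsilon is the privacy budget: smaller values mean more noise and stronger privacy.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace distribution centred on zero."""
    # The difference of two independent exponential draws with the same rate
    # follows a Laplace distribution with scale 1 / rate.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise calibrated to the privacy budget."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # Adding or removing one person changes a count by at most 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

A hospital could, for instance, call private_count(true_count=128, epsilon=0.5, rng=random.Random()) and publish the noisy result; the values 128 and 0.5 are made up for this example.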

Implementing PETs requires careful planning and collaboration across IT, legal, and privacy teams. Legal ambiguities around anonymization, integration with legacy systems, and the complexity of deployment can pose challenges. However, conducting DPIAs, aligning strategies with GDPR Article 32, and ongoing training for staff help smooth the integration process. Regular audits and collaborative cross-functional efforts also contribute to effective implementation.

PETs as a Strategic Enabler for GDPR Compliance

Privacy Enhancing Technologies are not just compliance tools; they are strategic assets that enable secure, responsible data processing. For organizations striving to meet GDPR standards, PETs offer a practical path to data minimization, enhanced confidentiality, and secure international transfers.

Implementing PETs as part of your data privacy strategy not only reduces compliance risks but also fosters trust with clients and partners. By embracing these technologies, businesses can navigate the complexities of GDPR with confidence and accountability.

The post How Privacy Enhancing Technologies (PETs) Can Help Organizations Stay GDPR Compliant appeared first on TechGDPR.

How does the GDPR govern retention periods for businesses?

AJ Richter — Tue, 01 Apr 2025 09:52:50 +0000

The General Data Protection Regulation (GDPR) establishes clear guidelines to prevent unnecessary data storage and ensure that personal information is retained only for as long as it serves a legitimate purpose. Storage limitation requires that companies justify and set out data retention periods while considering all legal obligations. Navigating legal requirements and transforming them into practical, actionable measures can be complex. A structured approach makes implementation more seamless.

Understanding GDPR Data Retention Requirements

The GDPR does not specify a fixed period of time for which personal data may be stored. Instead, Article 5(1)(e), part of the principles relating to processing of personal data, states that

Personal data shall be: …kept in a form which permits identification of data subjects for no longer than is necessary for the purposes for which the personal data are processed; personal data may be stored for longer periods insofar as the personal data will be processed solely for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) subject to implementation of the appropriate technical and organisational measures required by this Regulation in order to safeguard the rights and freedoms of the data subject (‘storage limitation’);

This principle establishes that personal data should not be stored longer than necessary. Article 5(1)(e) itself provides an exception: data may be stored for longer when it is processed solely for archiving purposes in the public interest, scientific or historical research purposes, or statistical purposes, subject to appropriate safeguards. In practice, companies must also take anonymisation and other legal storage requirements into account. Since the GDPR actively requires companies to follow the principle of storage limitation, it is best practice to delete the information when the retention period has run out.

However, personal data could also be anonymized instead, as properly anonymized data can no longer be linked to a person. Alternatively, one should consider whether other legislation applies. For instance, German finance law requires that companies maintain records of certain documents, mostly tax records, for 6 to 10 years. So even if the records contain personal data and are no longer necessary for the processing activity they were initially collected for, they are retained to meet these other legal requirements.

Determining Retention Periods

The GDPR defines two main roles in relation to data: data controller and data processor. The data controller decides the purposes and the means of processing personal data. As a result, the data controller is also responsible for determining how long the data is retained. The Dutch Data Protection Authority (Autoriteit Persoonsgegevens) released guidance on questions to ask when a company is determining the retention period of personal data.

Do you have statutory retention periods that must be followed, such as those required by tax laws or the Public Records Act? Are there any ongoing legal proceedings? If so, you are also obligated to retain the personal data.
How long is the data necessary for its intended purpose? Consider your company policy when determining this. For instance, you may need certain data to track outstanding invoices.
The fundamental principle of the law is to keep personal data for the shortest possible duration. Can the retention period be reduced?
Are you a member of a sector organization? If so, they may provide guidance on standard retention periods in your industry, which might be outlined in a code of conduct.

Following the guidance above can help in determining the best retention period for your business needs. The key requirement when choosing a retention period is that the chosen duration must be justifiable and the decision must be documented.
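
The questions above can be captured in a small, documented retention schedule. The sketch below is one possible way to do this; the class name, field names, and the idea of counting the period from the collection date are assumptions for illustration, and the actual periods must come from your own legal analysis. Each rule stores its justification, takes the longer of the business need and any statutory period, and respects a legal hold for ongoing proceedings.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class RetentionRule:
    category: str
    business_need_days: int
    statutory_days: Optional[int]  # None when no statutory period applies
    justification: str  # documented reason for the chosen duration

    @property
    def retention_days(self) -> int:
        # A statutory period overrides a shorter business need.
        return max(self.business_need_days, self.statutory_days or 0)


def is_due_for_disposal(
    rule: RetentionRule,
    collected_on: date,
    today: date,
    legal_hold: bool = False,
) -> bool:
    """Return True when a record has outlived its documented retention period."""
    if legal_hold:
        # Ongoing legal proceedings block deletion regardless of age.
        return False
    return today > collected_on + timedelta(days=rule.retention_days)
```

For example, a hypothetical rule such as RetentionRule("invoices", 365, 3650, "tax record keeping") would keep invoices for roughly ten years even though the business need alone ends after one year.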

Best Actionable Practices for Retention Periods

Drawing on guidance from various DPAs, here is a list of actionable best practices for data retention:

Conducting an audit to regularly assess what personal data your company collects, stores, and processes.
Minimizing data collection by only gathering personal data that is strictly necessary for your specified purposes. Be sure to avoid excessive or irrelevant information.
Implementing a data retention policy and reviewing retention periods regularly. This establishes clear retention schedules for different data types, ensuring compliance with industry standards and legal obligations.
Justifying retention periods by basing them on business needs, legal obligations, and potential future claims, avoiding indefinite data retention without a valid reason.
Documenting retention deviations by recording justifications whenever data is retained for longer or shorter periods than specified.
Regularly reviewing data processing activities to assess current processes and update retention schedules as new data processing activities emerge.
Following legal and regulatory requirements by retaining data in compliance with industry regulations, tax laws, and professional guidelines. Delete data as soon as it is no longer necessary.
Responding to data subject requests by ensuring that unnecessary data is promptly deleted or anonymized when individuals request erasure.

Training staff on retention policies to ensure they understand retention schedules, deletion procedures, and the risks of premature or improper data deletion.
Archiving data properly by storing older data in clearly labeled, separate electronic folders or indexing paper records for easy identification and disposal.
Ensuring secure disposal of data once retention periods expire, using confidential waste providers or cross-cut shredders for paper records and complete deletion or anonymization for electronic data.

How do you ensure compliance through effective data retention?

Effectively managing data retention under the GDPR requires a careful balance between compliance, business needs, and legal obligations. By implementing structured retention policies, businesses can ensure they are not holding onto personal data longer than necessary while also meeting statutory requirements. Regular audits, clear documentation, and staff training are essential to maintaining compliance and mitigating risks. Adhering to the principle of storage limitation not only protects individuals’ data rights but also strengthens organizational data governance and security.

The post How does the GDPR govern retention periods for businesses? appeared first on TechGDPR.

Self-Hosting AI: For Privacy, Compliance, and Cost Efficiency

AJ Richter — Wed, 12 Mar 2025 11:12:08 +0000

Self-hosting AI models is the future of privacy and compliance. By hosting AI models on their own hardware, individuals and businesses can improve data security while meeting strict regulations like the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA). Most people use hosted artificial intelligence (AI) services such as ChatGPT by OpenAI or Gemini by Google. These are known as cloud-based AI models, and the computation is done on servers operated by the AI providers. Self-hosting your AI means that you are the controller of all of the data. Unlike cloud-based AI services, self-hosting keeps all data within the user’s direct control. This significantly reduces the risks of unauthorized access, data breaches, and non-compliance with regulatory frameworks.

What does self-hosting an AI model mean?

To be explicit: self-hosting means that the AI model runs directly on hardware you own (e.g. one can run Ollama on a laptop). This control allows for enhanced privacy and security. If you host an AI model on your own device, there is arguably no need for the data to ever leave that device, so the risk of data breaches or unauthorized access decreases drastically. The data also does not need to travel over a network to a distant server, which can decrease latency and give a faster response (this aspect of speed is hardware dependent). Latency can best be understood as how much time passes between when a question is asked to an AI model and when a response is received.

Most modern computers can run smaller AI models with no issue, but larger models tend to be more resource intensive. There are many resources available that allow one to examine the free open-source models and their hardware compatibility. The benefits of using an open-source model can be greater privacy and transparency. Keeping processing local also reduces the risk of data breaches and supports a better level of compliance when processing sensitive data with AI models.
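
To show what talking to a self-hosted model can look like, here is a minimal sketch that sends a prompt to a locally running Ollama server and measures the latency as defined above. It assumes Ollama is installed, is listening on its default local port, and that a model has already been downloaded; the model name used as the default is only a placeholder. The request goes to localhost, so the prompt does not leave the machine.

```python
import json
import time
import urllib.request
from typing import Tuple

# Ollama's default local endpoint; adjust if your installation differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str) -> urllib.request.Request:
    """Prepare a non-streaming generate request for the local server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt: str, model: str = "llama3.2") -> Tuple[str, float]:
    """Return the model's answer and the latency in seconds."""
    request = build_request(prompt, model)
    started = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    latency_seconds = time.perf_counter() - started
    return body["response"], latency_seconds
```

Calling ask_local_model("What is personal data?") returns the generated text together with the measured latency, which makes it easy to compare different models or hardware.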

Why and how to invest in self-hosting AI models?

Hardware plays a crucial role in running usable AI models. Self-hosted AI models need a graphics processing unit (GPU) for optimal performance, as running AI solely on a central processing unit (CPU) leads to slower computations and, as mentioned above, higher latency.

The key benefits of self-hosting AI models are:

Improved Performance: GPUs significantly enhance processing speed, allowing AI models to generate responses faster.
Cost Savings Over Time: While the initial investment in hardware may be high, self-hosting eliminates recurring cloud subscription fees—leading to long-term financial benefits.
Data Control & Privacy: Self-hosting removes dependence on third-party cloud providers, ensuring full control over sensitive data.
Regulatory Compliance: Self-hosting reduces the risk of breaches and helps meet strict regulations like the GDPR and HIPAA.
Avoids External Policy Changes: Cloud-based AI providers frequently update pricing models, governance rules, and data policies. Self-hosting AI models provide stability and predictability in data management.
Eliminates Token Costs: Using AI services from major providers (e.g., OpenAI, Google) requires purchasing tokens, making usage costs unpredictable. Self-hosting avoids reliance on fluctuating pricing. As the included chart demonstrates, these prices keep changing, and anyone using AI that is not self-hosted is at the whim of the price dictated by the service provider.

Fluctuating AI Token Costs

By investing in local AI infrastructure, businesses and individuals regain autonomy over AI processing, ensuring cost efficiency, data privacy, and long-term stability. Owning the hardware means that one is no longer at the whim of a service provider for a virtual cloud instance. It allows for complete control over the data and, over time, lowers the cost of running AI.
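
The cost argument can be checked with a simple break-even calculation: the one-off hardware cost is recovered once the monthly savings add up to it. The sketch below is a rough illustration only; the function name is hypothetical, and it deliberately ignores factors such as hardware depreciation, maintenance time, and changes in usage.

```python
import math
from typing import Optional


def break_even_months(
    hardware_cost: float,
    monthly_cloud_cost: float,
    monthly_running_cost: float,
) -> Optional[int]:
    """Months until self-hosting has paid for itself, or None if it never does."""
    monthly_saving = monthly_cloud_cost - monthly_running_cost
    if monthly_saving <= 0:
        # Running locally costs at least as much as the cloud service.
        return None
    return math.ceil(hardware_cost / monthly_saving)
```

With made-up figures of 2400 for hardware, 250 per month for cloud usage, and 50 per month for electricity and upkeep, the function returns 12, meaning the investment would pay off after a year under these assumptions.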

How can using self-hosting AI help with regulatory compliance?

Self-hosting AI models is a crucial step toward ensuring compliance with data protection regulations such as the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA), while also reducing reliance on big tech companies. Under Article 9 of the GDPR, special categories of personal data, such as health data, biometric data, and data revealing racial or ethnic origin, may not be processed unless one of the exceptions in Article 9(2) applies, such as the data subject’s explicit consent. By self-hosting AI models, organizations retain full control over such data, minimizing the risk of unauthorized access and third-party breaches.

Studies have shown that developing AI models within institutional boundaries, particularly in healthcare, enhances privacy and regulatory compliance. It allows for more ethical and secure AI deployment. Furthermore, reliance on centralized AI models controlled by major corporations raises concerns about monopolized access to data. This can potentially lead to biased decision-making and limited transparency. Self-hosting AI fosters greater ethical responsibility, ensuring that data governance aligns with user interests rather than corporate agendas.

Case study: DeepSeek

At the beginning of 2025, there was a huge shock in the AI sphere with the introduction of DeepSeek R1. DeepSeek, a Chinese startup, was able to create and train an open-source AI model for a fraction of the cost of its competitors. It is free to download and use. Since DeepSeek is based in China, there were growing concerns about using chat.deepseek.com or the application because of where the data is sent. However, if one self-hosts DeepSeek R1, the data is not sent anywhere beyond the controller’s own hardware. Running DeepSeek as a self-hosted AI model is a simple and cost-effective way to explore the benefits of self-hosted AI, including privacy, performance, and cost savings.

Why is DeepSeek good for privacy?

But do self-hosted AI models perform worse?

Short answer: No. A Swiss study showed that using a small local Deep Neural Net (DNN) alongside a remote large-scale AI model can help reduce the prediction cost by half without affecting the system’s accuracy. In 2022, GPT-3 models cost $0.48 per request. The study worked by first passing the input to a locally hosted DNN. If the local response was deemed trustworthy, the request was not forwarded to GPT. If it was not, GPT computed the response. The local DNN was able to generate a correct prediction or response for 48% of the inputs and lost very little accuracy. Self-hosted AI models can therefore save money by saving tokens and avoiding expensive remote calls, with very little loss of accuracy.
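
The local-first routing described above can be sketched as a simple cascade. This is an illustration of the general pattern, not the method used in the study: how the study decided whether a local response was trustworthy is not described here, so the confidence score and the threshold below are assumptions, as are the function and parameter names.

```python
from typing import Callable, Tuple


def cascade(
    prompt: str,
    local_model: Callable[[str], Tuple[str, float]],
    remote_model: Callable[[str], str],
    threshold: float = 0.8,
) -> Tuple[str, str]:
    """Answer locally when confident enough, otherwise fall back to the remote model.

    local_model returns (answer, confidence between 0 and 1).
    The second element of the result says which model produced the answer.
    """
    answer, confidence = local_model(prompt)
    if confidence >= threshold:
        # Trusted local answer: no paid remote call and no data leaves the device.
        return answer, "local"
    return remote_model(prompt), "remote"
```

Every request that stays on the local path avoids one paid remote call, which is where the cost reduction in such a setup comes from.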

Why should businesses adopt self-hosting AI?

In a world where AI is increasingly intertwined with daily life, the decision to self-host AI models offers a powerful alternative to cloud-based solutions. By self-hosting AI models on personal hardware, one can improve:

Data Security: Eliminates external risks by keeping information in-house.
Regulatory Compliance: Easier to meet industry-specific privacy laws.
Cost Efficiency: Reduces long-term expenses related to cloud computing and API usage.
Customization & Flexibility: Empowers users to fine-tune models to their specific needs, ensuring greater transparency and understanding of how AI systems operate.
Improved Performance: Faster response times and reduced latency lead to better user experiences.

With advancements in open-source models like DeepSeek R1, running self-hosted AI models is more accessible than ever. This allows users to benefit from high-performance models without sacrificing privacy or autonomy. As AI continues to evolve, self-hosting AI models stands as a viable and increasingly necessary choice for those who prioritize control, security, and ethical responsibility in their AI usage.

The post Self-Hosting AI: For Privacy, Compliance, and Cost Efficiency appeared first on TechGDPR.

Upcoming Webinar: The Trump Effect on EU-US Data Transfers

AJ Richter — Tue, 04 Feb 2025 13:09:08 +0000

TechGDPR invites you to another insightful live discussion, The Trump Effect. Join our new Senior Consultant and former Information Commissioner, Stewart Haynes, alongside our Managing Partner, Silvan Jongerius, for an in-depth examination of how U.S. policies under the Trump administration have influenced EU-US data transfers and the broader regulatory landscape.

Webinar Sign Up

Date: Tuesday, February 11, 2025
Time: 14:00 CET
Where: LinkedIn Live

Why You Should Attend

Transatlantic data transfers remain a hot-button issue, and understanding their legal, political, and business implications is more critical than ever. This session will provide expert insights into the historical, legal, and strategic dimensions of EU-US data transfers, including:

Overview of EU-US Data Transfer Mechanisms: A deep dive into Privacy Shield, Standard Contractual Clauses (SCCs), and the evolution of cross-border data frameworks.
Impact of U.S. Policies Under the Trump Administration: Analyzing shifts in surveillance, national security, and international data flow policies, along with their ramifications for European privacy laws.
Legal and Regulatory Developments: Exploring key rulings such as Schrems II, the invalidation of Privacy Shield, and how the EU has responded to protect its data sovereignty.
Business and Compliance Implications: Examining the challenges organizations face when transferring data across the Atlantic, along with strategies to mitigate risks and remain compliant.
Geopolitical and Diplomatic Considerations: Understanding the balance between national security interests and data privacy, and how these concerns shape transatlantic relations.
Future Outlook and Strategic Considerations: Predictions on upcoming reforms, potential new frameworks under different U.S. administrations, and best practices for staying ahead in a shifting regulatory landscape.

Key Topics Covered

The evolving state of EU-US data transfer agreements;
How businesses can prepare for legal and compliance risks;
Lessons from Schrems II and other landmark decisions;
Strategic considerations for navigating geopolitical tensions; and
Best practices for ensuring secure and lawful data transfers.

This session is designed to provide decision-makers, compliance officers, and privacy professionals with a comprehensive understanding of how past U.S. policies have shaped today’s regulatory challenges—and what the future may hold. Stewart Haynes brings his knowledge as a former information commissioner about what to expect as the regulatory landscape changes with the new US presidency.

Sign Up Now to Secure Your Spot!

Don’t miss this opportunity to gain exclusive insights from a former regulator and a leading privacy expert. Whether you are a legal professional, business executive, or privacy enthusiast, this webinar will equip you with the knowledge needed to navigate the complexities of transatlantic data flows with confidence.

We look forward to seeing you on February 11, 2025!

The post Upcoming Webinar: The Trump Effect on EU-US Data Transfers appeared first on TechGDPR.