
Build for CPRA Compliance and Leverage Generative AI

October 6, 2023

CPRA mandates stringent data handling processes. Learn how you can build for CPRA compliance and address privacy concerns with generative AI by using a data privacy vault.

California, Colorado, Connecticut, Utah, Virginia, and several other states have all adopted comprehensive data protection regulations that are in effect now, or that take effect later in 2023, increasing the need for businesses to find comprehensive data privacy solutions to ease compliance.

In this post, we’ll discuss California’s CPRA, the key changes you need to make in your business to comply with CPRA, and how you can use a data privacy vault to ease compliance and avoid the fines and reputational damage that can result from CPRA non-compliance. We’ll also look at how to take LLMs out of compliance scope using the same technology.

What Is the California Privacy Rights Act (CPRA)?

In November 2020, the majority of California citizens voted in favor of amending the existing California Consumer Privacy Act (CCPA). The resulting California Privacy Rights Act (CPRA) gives consumers more control over their personal data and limits how businesses can use sensitive personal information.

CPRA is quite similar to GDPR with regard to:

  • Protecting the personal information of users
  • Recognizing consent as a legal basis to process personal data
  • Offering special opt-in rights to protect minors under the age of 16
  • Encouraging businesses to pseudonymize personal data and prevent attribution
  • Allowing transfer of personal data only to a country or organization that has adequate protection
  • Providing consumers with the right to delete personal information
  • Offering individual consumers the right to data portability
  • Enforcing monetary non-compliance fines

In addition to these similarities, CPRA considers all biometric and geo-location data to be sensitive personal information. CPRA also lets consumers access or delete their data, correct inaccurate data, and opt out of having their data shared with or sold to other businesses.

Who Must Comply with CPRA?

Businesses that buy, sell, or share the personal data of 100,000 or more California residents must comply with the CPRA. Businesses that either have a gross annual revenue of over $25 million or that earn over half their revenue from selling or sharing the personal data of California residents must also comply with CPRA.
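
As a rough sketch, the three thresholds above can be expressed as a single check. The class and field names here are hypothetical, and the numbers are the ones stated in this post; confirm them against the statute before relying on them.

```python
from dataclasses import dataclass

# Thresholds as stated in this post (assumption: verify against the statute).
CONSUMER_THRESHOLD = 100_000
REVENUE_THRESHOLD_USD = 25_000_000
DATA_REVENUE_SHARE = 0.5


@dataclass
class BusinessProfile:
    """Hypothetical summary of the facts the thresholds depend on."""
    ca_residents_with_data: int               # residents whose data is bought, sold, or shared
    gross_annual_revenue_usd: float
    revenue_share_from_personal_data: float   # fraction of revenue, 0.0 to 1.0


def must_comply_with_cpra(profile: BusinessProfile) -> bool:
    """Return True if any one of the three thresholds is met."""
    return (
        profile.ca_residents_with_data >= CONSUMER_THRESHOLD
        or profile.gross_annual_revenue_usd > REVENUE_THRESHOLD_USD
        or profile.revenue_share_from_personal_data > DATA_REVENUE_SHARE
    )
```

Meeting any single threshold is enough, which is why the conditions are joined with `or` rather than `and`.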

If your business falls into any of these categories, then one thing you must do is update your privacy notice to detail how you will collect, use, and store consumers’ personal information.

Businesses that fail to comply with the CPRA risk facing compliance fines from the newly-formed California Privacy Protection Agency (CPPA) and the Office of the Attorney General. As of July 1, 2023, consumers are able to file complaints with the CPPA.

What Do You Need to Know About CPRA?

From a business perspective, the most critical components of CPRA are increased rights for consumers, new compliance standards, financial penalties for non-compliance, and enforcement by the CPPA.

Increased Rights for Consumers

In addition to managing user data responsibly, businesses also need to develop processes and frameworks that give their users control over their data. So to comply with CPRA, your business needs to give all customers who reside in California the option to correct, delete, or request a copy of any records containing their personal information. CPRA has also extended its scope to protect all personal information in the context of B2B relationships. So you’ll need a process to detect and remove sensitive data from any systems that include this type of information, including databases of legal agreements and contracts, analytics databases, and any LLM-based AI tools.
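
To make the three consumer requests concrete, here is a minimal sketch of a request handler. The in-memory `records` store, the `RequestType` enum, and `handle_request` are all hypothetical; a real system would also need to verify the requester's identity and propagate changes to every downstream system that holds a copy.

```python
from enum import Enum
from typing import Optional


class RequestType(Enum):
    ACCESS = "access"     # request a copy of the record
    CORRECT = "correct"   # fix inaccurate fields
    DELETE = "delete"     # remove the record


# Hypothetical in-memory store keyed by customer ID.
records = {"cust-1": {"email": "old@example.com", "state": "CA"}}


def handle_request(
    customer_id: str,
    request_type: RequestType,
    updates: Optional[dict] = None,
) -> Optional[dict]:
    """Apply one consumer request and return the resulting record, if any."""
    if request_type is RequestType.ACCESS:
        return dict(records[customer_id])
    if request_type is RequestType.CORRECT:
        records[customer_id].update(updates or {})
        return dict(records[customer_id])
    if request_type is RequestType.DELETE:
        records.pop(customer_id, None)
        return None
    raise ValueError(f"unsupported request type: {request_type}")
```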

New Compliance Standards

Your business also needs to implement data governance policies that control how you collect, use, and share personal information. You also need to secure the personal information you collect and tell your users how it will be used. This disclosure must include any plans to share California residents’ personal information with third-party services or external organizations, or to use it in analytics tools.

Also, if you want to use this personal information for a new purpose that wasn’t a part of the existing data use agreement, then you’ll need to obtain updated consent. And you’ll need to provide users with an easy way to opt out of the sale of their personal information.

Non-Compliance Fines and CPPA Enforcement

In case of a CPRA compliance issue, the CPPA can serve a notice to businesses, giving them 30 days to address the complaint. If a business doesn’t resolve the reported CPRA violation, then each violation is grounds for a penalty of up to $2,500. In some cases, such as when the violation is proven to be intentional or involves the personal information of minors under 16, the fine can increase to $7,500 per violation.
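
Because fines apply per violation, exposure grows linearly with the number of violations. The sketch below shows the arithmetic using the two amounts from this post; the function name and the violation counts are hypothetical inputs, and how violations are actually counted is a legal question this code doesn't answer.

```python
BASE_FINE_USD = 2_500       # per violation, as stated in this post
ELEVATED_FINE_USD = 7_500   # per violation at the higher tier


def max_exposure_usd(standard_violations: int, elevated_violations: int = 0) -> int:
    """Upper bound on fines for a given number of violations at each tier."""
    return (
        standard_violations * BASE_FINE_USD
        + elevated_violations * ELEVATED_FINE_USD
    )
```

For instance, under these assumptions, 10 standard violations plus 2 higher-tier violations would work out to `max_exposure_usd(10, 2)`, or $40,000.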

Build for CPRA Compliance with Skyflow Data Privacy Vault

A data privacy vault isolates sensitive data and secures it with advanced encryption, tokenization, and fine-grained access controls. Skyflow Data Privacy Vault protects sensitive data against bad actors and helps you future-proof your data protection strategy, so you can achieve compliance quickly as data privacy laws evolve and go beyond the minimum compliance requirements to deliver effective data privacy.

Here are four ways that your business can use Skyflow to build for compliance with CPRA and similar regulations.

1. Isolate Sensitive Personal Information to Reduce Your Compliance Footprint

Reduce CPRA compliance complexities by isolating all sensitive personal information in a centrally located, secure vault.

For example, you can use Skyflow to isolate and protect personal information, such as user names, phone numbers, email IDs, and contact addresses, separating it from other non-sensitive data records. This greatly reduces the surface area that you need to protect and monitor, as shown below:

Isolate and Protect Personal Information with Skyflow Data Privacy Vault

Without using Skyflow Data Privacy Vault, your mobile apps, web apps, analytics, and other systems suffer from sensitive data sprawl. This creates the need for expensive data monitoring tools and creates an unnecessarily large CPRA compliance footprint.

2. Secure Data with Polymorphic Encryption and Tokenization

Skyflow uses polymorphic encryption to support read-only operations on fully encrypted personal information records like names, ages, phone numbers, ID numbers, and credit scores. So if you want to find the aggregate age of users in a table or get all the users who have a credit score greater than 650, you can perform these analytic operations directly on the encrypted records that reside within your Skyflow vault, with no need to decrypt these records.

Tokenization is another technique that Skyflow uses to keep sensitive data safe. It swaps personal information, such as customer names, payment card details, and email IDs, with non-sensitive tokens that can’t be exploited by bad actors — even when the data gets compromised. You can use these tokens in any data pipeline that may get shared with external entities or third-party tools without worrying about compliance violations.
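
To illustrate the general idea of tokenization, here is a toy in-memory sketch. It is not Skyflow's API: the `TokenVault` class, its methods, and the `tok_` prefix are hypothetical, and a production vault would add encryption, persistence, and access controls.

```python
import secrets


class TokenVault:
    """Toy in-memory vault: maps random tokens to the original values."""

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)  # random, not derived from the value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # In a real system this call would sit behind access controls.
        return self._token_to_value[token]


vault = TokenVault()
record = {"name": "Ada Lovelace", "email": "ada@example.com", "plan": "pro"}
SENSITIVE_FIELDS = {"name", "email"}

# Only tokens leave the vault; non-sensitive fields pass through unchanged.
safe_record = {
    key: vault.tokenize(value) if key in SENSITIVE_FIELDS else value
    for key, value in record.items()
}
```

Because the tokens are random rather than derived from the data, someone who obtains `safe_record` learns nothing about the original name or email without access to the vault.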

3. Implement Granular Data Governance Controls

Comply with existing and future CPRA regulations by controlling who can access what data.

For example, a customer support agent (CSA) at a health insurance company doesn’t require a patient’s entire Social Security number (SSN) to verify a patient’s identity. So, instead of allowing CSAs to access the unredacted SSN records of all patients, with Skyflow, you can easily enforce a policy to redact the first five digits of all SSNs.
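
The effect of such a redaction policy can be sketched in a few lines of Python. This is an illustration of the outcome only, not Skyflow's policy syntax; the function name and the `XXX-XX-` mask format are assumptions.

```python
import re

# Assumes SSNs are stored in the common NNN-NN-NNNN format.
SSN_PATTERN = re.compile(r"^(\d{3})-(\d{2})-(\d{4})$")


def redact_ssn(ssn: str) -> str:
    """Mask the first five digits of an SSN, keeping only the last four."""
    match = SSN_PATTERN.match(ssn)
    if match is None:
        raise ValueError("expected an SSN in the form NNN-NN-NNNN")
    return f"XXX-XX-{match.group(3)}"
```

With this rule in place, a support agent would see `XXX-XX-6789` instead of `123-45-6789`, which is still enough to confirm the last four digits with the caller.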

To comply with CPRA, you can also make this policy specific to residents of California by adding a qualifier to the policy expression.
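
Conceptually, a state-specific qualifier is just a condition on the record's attributes. The sketch below is a hypothetical Python stand-in, not Skyflow's policy expression language; the `state` argument and the function name are assumptions made for illustration.

```python
def ssn_view_for_agent(ssn: str, state: str) -> str:
    """Return the SSN as a support agent would see it under a state-scoped rule."""
    if state.upper() == "CA":
        # California residents: show only the last four digits.
        return "XXX-XX-" + ssn[-4:]
    # Other states fall through to whatever default policy applies;
    # in this sketch the value is returned unchanged.
    return ssn
```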

Skyflow Data Privacy Vault lets you establish new data governance policies, implement fine-grained access controls, and assign roles and attributes for each resource without any custom engineering effort.

4. Automatically Maintain Detailed Audit Logs

Maintaining a searchable audit trail showing how, when, and where personal data is used makes it easy to demonstrate data privacy compliance.

Skyflow automatically logs all events that take place in the vault. So all of your sensitive data flows get documented in detailed audit logs. As a result, whenever you need to audit or investigate a data event, you can easily check the Skyflow audit logs to aid your investigation or to prove compliance.
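
To show what a searchable audit trail might contain, here is a minimal sketch of structured audit entries. The field names, actor names, and resource paths are hypothetical and don't reflect Skyflow's actual log format.

```python
import json
from datetime import datetime, timezone

audit_log: list[dict] = []


def record_access(actor: str, action: str, resource: str) -> dict:
    """Append one structured audit entry and return it."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "resource": resource,
    }
    audit_log.append(entry)
    return entry


def events_for(resource: str) -> list[dict]:
    """Find every logged event that touched a given record."""
    return [e for e in audit_log if e["resource"] == resource]


record_access("support-agent-17", "read_redacted", "customers/42/ssn")
record_access("billing-service", "detokenize", "customers/42/card")
print(json.dumps(events_for("customers/42/ssn"), indent=2))
```

Recording who did what to which record, and when, is what makes it possible to answer an auditor's question with a simple query.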

Now that you’ve seen how Skyflow helps you build for compliance with CPRA and similar data privacy laws, let’s look at how it helps you remove generative AI LLMs from your compliance scope so you can harness this exciting new technology to drive business results. 

Build for Compliance to Unblock Adoption of Generative AI LLMs

Skyflow provides comprehensive privacy-preserving solutions that let you prevent the leakage of sensitive data into large language models (LLMs) like GPT. This addresses privacy and compliance concerns around LLM training (including multi-party model training) and inference from user inputs, and removes LLM-based AI tools from your scope of compliance for CPRA and other data privacy laws. 

Skyflow LLM Privacy Vault seamlessly integrates into your existing data infrastructure to add an effective layer of protection for sensitive data. It protects this data by preventing plaintext sensitive data from flowing into LLMs, replacing this data with vault-generated tokens. Later, sensitive data is only revealed to authorized users as model outputs are shared with those users.
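
The de-identify, infer, then re-identify flow described above can be sketched as follows. This is a simplified illustration rather than Skyflow's implementation: it only detects email addresses with a basic regular expression, `fake_llm` is a stand-in for a real model call, and all function names and the `<EMAIL_n>` token format are hypothetical.

```python
import re
from typing import Callable

# Deliberately simple pattern; real detection would cover many more data types.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def deidentify(text: str, mapping: dict[str, str]) -> str:
    """Replace each email address with a placeholder token and remember the mapping."""
    def swap(match: re.Match) -> str:
        value = match.group(0)
        for token, original in mapping.items():
            if original == value:
                return token  # same value, same token
        token = f"<EMAIL_{len(mapping) + 1}>"
        mapping[token] = value
        return token

    return EMAIL_PATTERN.sub(swap, text)


def reidentify(text: str, mapping: dict[str, str]) -> str:
    """Restore original values in model output for an authorized user."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; it only ever sees tokens."""
    return f"Drafted reply for: {prompt}"


def private_completion(prompt: str, llm: Callable[[str], str]) -> tuple[str, str]:
    """Return the de-identified prompt sent to the model and the restored output."""
    mapping: dict[str, str] = {}
    safe_prompt = deidentify(prompt, mapping)
    return safe_prompt, reidentify(llm(safe_prompt), mapping)
```

The key property is that the mapping between tokens and real values never leaves your side of the boundary, so the model provider only processes placeholders.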

You can see how Skyflow fits into your LLM toolchain below:

How Skyflow LLM Privacy Vault Integrates with Your LLM Toolchain

Skyflow preserves data privacy throughout the lifecycle of LLMs, including the following workflows:

  • Model Training: Skyflow enables privacy-safe model training by excluding sensitive data from datasets used in the model training process.
  • Multi-party Model Training: Skyflow supports multi-party training so that multiple parties (i.e., multiple businesses or individuals) can de-identify sensitive data from their datasets and then build shared datasets that preserve data privacy.
  • Inference: Skyflow also protects the privacy of sensitive data by preventing it from flowing into LLMs during inference, whether through prompts or user-provided files.

To learn more, see our blog about Skyflow LLM Privacy Vault.

Future-Proof Your Business Against Current and New Privacy Regulations

In addition to the five states that have recently enacted comprehensive data privacy regulations, at least 25 other US states and territories are either introducing or considering nearly 140 consumer privacy bills as of 2023.

As a result, you face a conundrum where you must comply with CPRA and other regulations without disrupting your existing business operations. And, if your business operates internationally, you might be subject to a variety of data protection laws in other regions, each with its own data protection requirements – which sometimes include data residency requirements. This complex and ever-shifting global compliance landscape can hinder your ability to innovate by adopting new technologies like LLMs.
