Unlocking Privacy-Preserving AI with Skyflow’s Secure AI Functionality

December 12, 2024

Discover how Skyflow’s Secure AI Functionality empowers businesses to build privacy-preserving AI applications with enhanced usability, advanced privacy controls, and seamless data management—unlocking innovation while safeguarding sensitive information.

As businesses accelerate AI adoption through training, fine-tuning, building applications, or developing agents, data emerges as both a key enabler and a significant risk. Each approach brings its own challenges, with one critical hurdle standing out: leveraging data for innovation while safeguarding user privacy. Skyflow has enabled customers to discover and de-identify sensitive data, empowering them to build privacy-preserving AI applications. Its widespread adoption has integrated privacy into MLOps pipelines for numerous businesses.

Building on our customer feedback, we’re excited to announce new enhancements designed to improve usability, expand functionality, and deliver advanced privacy controls.

Exploring What’s New in Skyflow’s Secure AI Functionality

1. Enhanced Developer Experience

We’ve simplified and streamlined how developers can get started with Skyflow’s Secure AI Functionality.

  • Vault Template: Deploy a fully configured Skyflow Vault for Detect in seconds, with a single click in the UI or a single API call, using a pre-built template. This lets you start consuming the APIs right away instead of spending time designing the vault schema. The template includes column configurations for all supported entity types, along with recommended tokenization settings for securely storing sensitive data.
  • Dedicated File-Type Endpoints: Processing unstructured data in files such as PDFs, audio files, and images is among the features our customers use most heavily. To make this more intuitive, we now offer dedicated endpoints tailored to each file type, enabling optimized processing with format-specific options:
    • PDFs: Optimized handling for density and resolution.
    • Audio Files: Customizable redaction options like bleep frequency.
    • Images: Enhanced transformations for privacy enforcement.

For a deeper dive into how these endpoints work, refer to the API reference here.

These changes make integrating Skyflow Detect into existing workflows seamless while maintaining flexibility for diverse data formats.

2. Automated Entity-Column Mapping

Say goodbye to manual, error-prone configuration when storing detected entities in the vault! Skyflow Detect now automatically maps detected entities to the pre-configured columns in your vault. This removes a common source of configuration errors and makes it easier to get started quickly. Note that this applies only to the text endpoint for now. The API call below is an example of storing names and dates as entities in the vault without needing any configuration.

curl --location '{{url}}/v1/detect/deidentify/string' \
--header 'X-SKYFLOW-ACCOUNT-ID: {{account_id}}' \
--header 'Content-Type: application/json' \
--header 'Authorization: ' \
--data '{
    "text": "Alice Smith and Bob Evans are 43 and 52 years old respectively and they live at Villa d’Este Via Regina, 40 22012 Cernobbio – Italy.",
    "entity_types": ["all"],
    "token_type":{
        "vault_token":["name","date"]
    },
    "transformations":{
        "shift_dates":{
            "min_days": 1,
            "max_days": 5,
            "entity_types": ["date"]
        }
    },
    "vault_id": "<VAULT_ID>"
}'
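For readers calling the endpoint from Python rather than curl, the same request can be sketched with the standard library. This is a minimal sketch, not an official client: the host, account ID, vault ID, and Authorization value are placeholders the caller must supply (the sample leaves the Authorization value blank), and build_deidentify_request is a hypothetical helper name. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_deidentify_request(base_url, account_id, authorization, vault_id, text):
    """Build (but do not send) the de-identify request from the curl sample."""
    payload = {
        "text": text,
        "entity_types": ["all"],
        # Names and dates are stored in the vault; no column mapping is supplied.
        "token_type": {"vault_token": ["name", "date"]},
        "transformations": {
            "shift_dates": {"min_days": 1, "max_days": 5, "entity_types": ["date"]}
        },
        "vault_id": vault_id,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/detect/deidentify/string",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-SKYFLOW-ACCOUNT-ID": account_id,
            "Content-Type": "application/json",
            "Authorization": authorization,  # placeholder; value not given in the sample
        },
        method="POST",
    )


request = build_deidentify_request(
    base_url="https://vault.example.invalid",  # hypothetical host
    account_id="<ACCOUNT_ID>",
    authorization="<AUTHORIZATION_VALUE>",
    vault_id="<VAULT_ID>",
    text="Alice Smith and Bob Evans are 43 and 52 years old respectively.",
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect or log the payload before any sensitive text leaves your environment.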

3. Fine-Grained, Context-Preserving Tokenization Options

Need flexibility in the types of tokens generated for the same dataset? We’ve got you covered! The new API gives you fine-grained control over the token type used for each entity, so you can choose how each entity is de-identified based on your use case. For example:

  • Vault Tokens: Ideal for scenarios requiring re-identification later on, such as inference applications.
  • Entity-Only Tokens: Best for one-way anonymization, commonly used in model training and fine-tuning workflows.
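One way to picture per-entity control is a small helper that turns an entity-to-token-type mapping into the token_type part of a request. The vault_token key appears in the request sample earlier in this post; the entity_only key, the location entity name, and the helper itself are assumptions made for illustration and should be checked against the API reference.

```python
from collections import defaultdict


def build_token_type(entity_to_token_kind):
    """Group entities by the kind of token each one should receive."""
    grouped = defaultdict(list)
    for entity, token_kind in entity_to_token_kind.items():
        grouped[token_kind].append(entity)
    return dict(grouped)


token_type = build_token_type(
    {
        "name": "vault_token",      # re-identifiable later, e.g. for inference
        "date": "vault_token",
        "location": "entity_only",  # assumed key and entity name: one-way anonymization
    }
)
print(token_type)
```

Keeping the choice in one mapping makes it straightforward to use different settings for an inference pipeline and a training pipeline over the same dataset.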

4. Flexible Date Transformations

Dates play a critical role in many applications, from personalized services to compliance reporting. However, not all dates are equally sensitive, and treating them as such can create unnecessary complexity or data loss. With the new APIs, you gain fine-grained control over how date transformations are handled, enabling you to balance privacy with functionality. For example:

  • Previously challenging: Randomizing sensitive dates (e.g., Date of Birth) while leaving non-sensitive date intervals (e.g., warranty periods) untouched was not possible. Blanket transformations of all dates often disrupted analysis or workflows.
  • Now possible: You can tailor date transformations on a granular basis for different date types, such as Date of Birth and Date Intervals.

This flexibility empowers you to anonymize data without losing its analytical utility, ensuring privacy compliance while maintaining the integrity of your applications.
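The idea of shifting only selected date types can be illustrated locally. The sketch below is not Skyflow's implementation: it borrows the min_days, max_days, and entity_types parameter names from the request sample, while the dob and date_interval labels and the forward-only shift are assumptions made for illustration.

```python
import random
from datetime import date, timedelta


def shift_dates(records, min_days, max_days, entity_types, rng=None):
    """Shift only the dates whose entity type is listed in entity_types."""
    rng = rng or random.Random()
    shifted = []
    for entity_type, value in records:
        if entity_type in entity_types:
            # randint is inclusive on both ends, so the shift is min_days..max_days.
            value = value + timedelta(days=rng.randint(min_days, max_days))
        shifted.append((entity_type, value))
    return shifted


records = [
    ("dob", date(1981, 3, 14)),          # sensitive: randomized
    ("date_interval", date(2026, 1, 1)), # non-sensitive: left as-is
]
result = shift_dates(records, min_days=1, max_days=5, entity_types={"dob"})
for entity_type, value in result:
    print(entity_type, value.isoformat())
```

Because the warranty-style date passes through unchanged, any analysis that depends on it keeps working, while the date of birth no longer matches the original value.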

In addition to streamlining usability and improving developer experience, we're thrilled to introduce several exciting new features to our APIs that elevate functionality and provide even greater value to your applications. 

Further Improvements to Access Control

Access control is a cornerstone of data security, especially when managing sensitive information. Many customers struggle with privilege creep, where users are assigned unnecessary or excessive access rights that then become vectors for security vulnerabilities. To mitigate this, we have introduced the Detect Invoker Role, whose permissions are scoped to invoking the Detect API and nothing else. Vault Administrators can use this role to assign just the required level of access, so users can interact with the vault securely without over-provisioned permissions. This enhancement simplifies operational workflows and bolsters your organization’s security posture by adhering to the principle of least privilege.

Extending Fine-Grained Re-Identification to RAG-Based Applications

As businesses adopt AI-powered conversational interfaces, maintaining fine-grained control over data visibility is critical. For instance, organizations building RAG-based chat agents must dynamically reveal or redact sensitive data based on user roles and context. Consider this: your marketing team may only require access to the last four digits of a phone number, while your support team needs full access to assist customers effectively. With the enhanced APIs, you can now enforce precise access controls for sensitive data, tailored to your specific needs.

  • The API now supports format maps to define how entities are revealed, masked, or redacted based on column configurations in the vault.
  • You can customize the re-identification of entities on a per-use-case and per-end-user basis for greater flexibility in conversational AI applications.

This ensures that customer interactions remain compliant and secure without compromising user experience.
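The marketing versus support scenario can be sketched as a role-to-format lookup applied at re-identification time. This is a local illustration only, not the Skyflow API: the role names, the plain_text, last_four, and redacted format labels, and the reveal_phone helper are all hypothetical.

```python
# Hypothetical mapping from end-user role to how a phone number is revealed.
FORMAT_BY_ROLE = {
    "support": "plain_text",   # full value
    "marketing": "last_four",  # only the last four digits
}


def reveal_phone(phone, role):
    """Return the view of a phone number that the given role may see."""
    fmt = FORMAT_BY_ROLE.get(role, "redacted")  # unknown roles see nothing
    if fmt == "plain_text":
        return phone
    if fmt == "last_four":
        digits_seen = 0
        masked = []
        # Walk from the right, keeping the last four digits and masking the rest.
        for ch in reversed(phone):
            if ch.isdigit():
                digits_seen += 1
                masked.append(ch if digits_seen <= 4 else "*")
            else:
                masked.append(ch)
        return "".join(reversed(masked))
    return "[REDACTED]"


for role in ("support", "marketing", "intern"):
    print(role, reveal_phone("415-555-0132", role))
```

Defaulting unknown roles to full redaction keeps the sketch consistent with the principle of least privilege: access has to be granted explicitly.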

Final Thoughts

Building privacy-preserving AI applications is a journey, and these updates represent another step in making that journey seamless and impactful. As businesses increasingly turn to AI, protecting sensitive data remains a critical challenge. With our revamped APIs, Skyflow reaffirms its commitment to being the trust layer for the modern AI stack, empowering you to innovate responsibly.

Getting started doesn’t have to be complex. Check out this repository of examples showing how Skyflow’s Data Privacy Platform ensures the secure handling of sensitive data on the Databricks Lakehouse. Explore the repository here.

Ready to transform the way you manage sensitive data? Explore the API here or book a demo here.
