Emerging Data Security Challenges In the LLM Era

February 20, 2024
3 Min Read
Data Security

In April 2023, several Samsung employees reportedly leaked sensitive data via OpenAI’s chatbot, ChatGPT. The leaked data included the source code of software responsible for measuring semiconductor equipment. The incident underscores the importance of taking preventive measures against future breaches associated with Large Language Models (LLMs).

LLMs are built to generate responses to questions using the data they continuously receive, and that data can unintentionally expose confidential information. Even though OpenAI specifically tells users not to share “any sensitive information in your conversations”, ChatGPT and other LLMs are simply too useful to ban for security reasons. You wouldn’t ban an employee from using Google or an engineer from using GitHub. Business productivity (almost) always comes first.

This means that the risks of spilling company secrets and sharing sensitive data with LLMs are not going anywhere. And you can be sure that more generative AI tools will be introduced to the workplace in the near future.

“Banning chatbots one by one will start feeling ‘like playing whack-a-mole’ really soon.”

  • Joe Payne, the CEO of insider risk software solutions provider Code42.


In many ways, the effect of LLMs on data security is similar to the changes we saw 10-15 years ago when companies started moving their data to the cloud.

Broadly speaking, there have been three ‘eras’ of data and data security:

The Era of On-Prem Data

The first was the era of on-prem data. For most of the history of computing, enterprises stored their data in on-prem data centers, and secured access to sensitive data by fortifying the perimeter. The data also wasn’t going anywhere on its own. It lived on company servers and was managed by company IT teams, who controlled who accessed anything that lived on those systems.

The Era of the Cloud

Then came the next era - the cloud. Suddenly, corporate data wasn’t static anymore. Data was free and could be shared anywhere - engineers, BI tools, and data scientists were accessing and moving this free-flowing data to drive the business forward. How a company leveraged its data became an integral part of its success. While the business benefits were clear, this created a number of concerns - particularly around privacy, compliance, and security. Data needed to move quickly and securely, with the proper security posture at all times.

The challenge was that security teams were now struggling with basic questions about their data, like:

  • Where is my data? 
  • Who has access to it? 
  • How can I comply with regulations? 

It was during this era that Data Security Posture Management (DSPM) emerged as a solution to this problem - by ensuring that data always had proper access controls wherever it traveled, this solution promised to address security and compliance issues for enterprises with fast-moving cloud data.

And while we were answering these questions, a new era emerged, with a host of new challenges. 

The Era of AI

The rise of Large Language Models (LLMs) as indispensable business tools in just the past few years has introduced a new dimension to data security challenges. It has significantly amplified the existing issues of the cloud era, creating a problem that is growing at an unprecedented rate. While LLMs have accelerated business operations, they have also taken the risks and challenges of the cloud to another level.

While securing data in the cloud was a challenge, at least you controlled your cloud, to some degree. You could decide who could access it, and when. You could decide what data to keep and what to remove. That has all changed as LLMs and AI play a larger role in company operations.

Globally, and specifically in the US, organizations are facing the challenge of managing these new AI technology initiatives efficiently while maintaining speed and ensuring regulatory compliance. CEOs and boards are increasingly urging companies to leverage LLMs and AI and use them as databases. However, there is limited understanding of the associated risks, and controlling the data fed into these models is difficult. The ultimate goal is to mitigate these risks and prevent sensitive data exposure effectively.


LLMs are a black box. You don't know what data your engineers are feeding into them, and you can’t be sure that users won’t be able to manipulate your LLMs into disclosing sensitive information. For example, an engineer training a model might accidentally use real customer data that now exists somewhere in the LLM and might be inadvertently disclosed. Or an LLM-powered chatbot might have a vulnerability that leads it to respond to an inquiry with sensitive company data. This is the challenge facing the data security team in this new era.

How can you know what the LLM has access to, how it’s using that data, and who it’s sharing that data with?
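One small part of that question, what goes into a model in the first place, can be illustrated with a pre-training check. The following is only a minimal sketch: the two regular expressions and the function names are assumptions made for illustration, and real classification covers far more data types than pattern matching can.

```python
import re

# Illustrative patterns only (assumptions); real classifiers cover many more data types.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_sensitive(text):
    """Return a sorted list of pattern names that match the given text."""
    return sorted(
        name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)
    )


def split_training_records(records):
    """Separate records that look clean from those that need review."""
    clean, flagged = [], []
    for record in records:
        hits = find_sensitive(record)
        if hits:
            flagged.append((record, hits))
        else:
            clean.append(record)
    return clean, flagged


records = [
    "Reset the sensor after calibration.",
    "Customer jane.doe@example.com reported a fault.",
    "SSN on file: 123-45-6789",
]
clean, flagged = split_training_records(records)
```

Only the `clean` records would move on to training here; the `flagged` ones are held back with the reason attached, so someone can decide whether to mask or drop them.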

Solving The Challenges of the Cloud and AI Eras at the Same Time

Adding to the complexity for security and compliance professionals is that we’re still dealing with the challenges from the cloud era. Fortunately, Data Security Posture Management (DSPM) has adapted to solve the primary data security headaches of both eras.

For data in the cloud, DSPM can discover your sensitive data anywhere in the cloud environment, understand who can access this data, and assess its vulnerability to security threats and risk of regulatory non-compliance. Organizations can harness advanced technologies while keeping privacy and compliance seamlessly integrated into their processes. Further, DSPM tackles issues such as finding shadow data, identifying sensitive information with inadequate security postures, discovering duplicate data, and ensuring proper access control.
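To make the idea of a posture check concrete, here is a minimal sketch over a hand-written inventory. The `DataStore` fields and the two rules are assumptions chosen for illustration; they are not how any particular DSPM product models or evaluates data stores.

```python
from dataclasses import dataclass


@dataclass
class DataStore:
    # Hypothetical inventory entry; a real inventory would be discovered, not typed in.
    name: str
    contains_sensitive: bool
    publicly_accessible: bool
    encrypted: bool


def posture_findings(store):
    """Return human-readable findings for a single data store."""
    findings = []
    if store.contains_sensitive and store.publicly_accessible:
        findings.append("sensitive data is publicly accessible")
    if store.contains_sensitive and not store.encrypted:
        findings.append("sensitive data is not encrypted")
    return findings


inventory = [
    DataStore("customer-exports", True, True, False),
    DataStore("marketing-assets", False, True, False),
    DataStore("payroll-db", True, False, True),
]

# Keep only stores that have at least one finding.
report = {}
for store in inventory:
    findings = posture_findings(store)
    if findings:
        report[store.name] = findings
```

In this sample, only `customer-exports` appears in `report`, because it is the only store where sensitive data meets a weak control.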

For the LLM data challenges, DSPMs can automatically secure LLM training data, facilitating swift AI application development, and letting the business run as smoothly as possible.

Any DSPM solution that integrates with platforms like AWS SageMaker and GCP Vertex AI, as well as other AI IDEs, can ensure secure data handling during ML training. Full integration with features like Data Access Governance (DAG) and Data Detection and Response (DDR) provides a robust approach to data security and privacy.

AI has the remarkable capacity to reshape our world, yet this must be balanced with a firm dedication to maintaining data integrity and privacy. Ensuring data integrity and privacy in LLMs is crucial for the creation of ethical and responsible AI applications. By utilizing DSPM, organizations are equipped to apply best practices in data protection, thereby reducing the dangers of data breaches, unauthorized access, and bias. This approach is key to fostering a safe and ethical digital environment as we advance in the LLM era.

To learn more about DSPM, request a demo today.

Yoav Regev has over two decades of experience in the world of cybersecurity, cloud, big data, and machine learning. He was the Head of Cyber Department (Colonel) in the Israeli Military Intelligence (Unit 8200) for nearly 25 years. Reflecting on this experience, it was clear to him that sensitive data had become the most important asset in the world. In the private sector, enterprises that were leveraging data to generate new insights, develop new products, and provide better experiences, were separating themselves from the competition. As data becomes more valuable, it becomes a bigger target, and as the amount of sensitive data grows, so does the importance of finding the most effective way to secure it. That’s why he co-founded Sentra, together with accomplished co-founders, Asaf Kochan, Ron Reiter, and Yair Cohen.

Latest Blog Posts

Yoav Regev
April 23, 2025
3 Min Read
Data Security

Your AI Is Only as Secure as Your Data: Celebrating a $100M Milestone

Over the past year, we’ve seen an incredible surge in enterprise AI adoption. Companies across industries are integrating AI agents and generative AI into their operations to move faster, work smarter, and unlock innovation. But behind every AI breakthrough lies a foundational truth: AI is only as secure as the data behind it.

At Sentra, securing that data has always been our mission, not just to prevent breaches and data leaks, but to empower prosperity and innovation with confidence and control.

Data Security: The Heartbeat of Your Organization

As organizations push forward with AI, massive volumes of data, often sensitive, regulated, or business-critical, are being used to train models or power AI agents. Too often, this happens without full visibility or governance.


The explosion of the data security market reflects how critical this challenge has become. At Sentra, we’ve long believed that a Data Security Platform (DSP) must be cloud-native, scalable, and adaptable to real-world enterprise environments. We’ve been proud to lead the way, and our continued growth, especially among Fortune 500 customers, is a testament to the urgency and relevance of our approach.

Scaling for What's Next

With the announcement of our $50 million Series B funding round, bringing our total funding to over $100 million, we’re scaling Sentra to meet the moment. With strong customer momentum and revenue more than tripling year-over-year, we’re using this investment to grow our team, strengthen our platform, and continue defining what modern data security looks like.

We’ve always said security shouldn’t slow innovation - it should fuel it. And that’s exactly what we’re enabling.

It's All About the People


At the end of the day, it’s people who build it, scale it, and believe in it. I want to extend a heartfelt thank you to our investors, customers, and, most importantly, our team. It’s all about you! Your belief in Sentra and your relentless execution make everything possible. We couldn’t make it without each and every one of you.

We’re not just building a product; we’re setting the gold standard for data security, because securing your data is the heartbeat of your organization!

Innovation without security isn’t progress. Let’s shape a future where both go together!

Meni Besso
April 21, 2025
Compliance

How to Scale DSAR Compliance (Without Breaking Your Team)

Privacy regulations such as GDPR (EU), CCPA/CPRA (California), and others are not just about legal checkboxes; they’re about building trust. In today’s data-driven world, customers expect organizations to be transparent about how their personal information is collected, used, and protected. When companies take privacy seriously, they demonstrate respect for their users, which in turn fosters loyalty and long-term engagement.

But among the many privacy requirements, Data Subject Access Requests (DSARs) can be the most complex to support. DSARs give individuals the right to request access to the personal data that an organization holds about them, often with a firm deadline of just 30 days to respond. For large enterprises with data scattered across multiple systems, both in the cloud and on-premises, even a single request can trigger a chaotic search across different platforms, manual reviews, and legal oversight. It quickly becomes a race against the clock, with compliance, trust, and reputation on the line.

Key Challenges in Responding to DSARs

Data Discovery & Inventory
For large organizations, pinpointing where personal data resides across a diverse ecosystem of information systems, including databases, SaaS applications, data lakes, and legacy environments, is a complex challenge. The presence of fragmented IT infrastructure and third-party platforms often leads to limited visibility, which not only slows down the DSAR response process but also increases the likelihood of missing or overlooking critical personal data.

Linking Identities Across Systems
A single individual may appear in multiple systems under different identifiers, especially if systems have been acquired or integrated over time. Accurately correlating these identities to compile a complete DSAR response requires sophisticated identity resolution and often manual effort.
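One common way to approach this kind of correlation is to treat records as nodes and shared identifiers as links, then group everything that is connected. The sketch below uses a small union-find structure for that; the record ids and identifier values are made up, and exact matching on shared values is a simplifying assumption (real identity resolution also has to deal with typos, aliases, and conflicting data).

```python
from collections import defaultdict


def link_identities(records):
    """Group record ids that share at least one identifier value, transitively.

    records: {record_id: set of identifier values}
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    first_seen = {}  # identifier value -> first record id that carried it
    for record_id, identifiers in records.items():
        find(record_id)  # register the record even if it links to nothing
        for value in identifiers:
            if value in first_seen:
                union(record_id, first_seen[value])
            else:
                first_seen[value] = record_id

    groups = defaultdict(set)
    for record_id in records:
        groups[find(record_id)].add(record_id)
    return sorted(sorted(group) for group in groups.values())


records = {
    "crm:1001": {"jane@example.com", "cust-42"},
    "billing:77": {"cust-42", "jdoe"},
    "support:5": {"jdoe"},
    "crm:2002": {"sam@example.com"},
}
linked = link_identities(records)
```

Here the support ticket never mentions the email address, yet it lands in the same group as the CRM record because both connect through the billing entry.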


Unstructured Data Handling
Unlike structured databases, where data is organized into labeled fields and can be efficiently queried, unstructured data (like PDFs, documents, and logs) is free-form and lacks consistent formatting. This makes it much harder to search, classify, or extract relevant personal information.
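The difference shows up quickly in practice: with no fields to query, the fallback is to scan the text itself. The following is a minimal sketch, assuming the documents have already been extracted to plain text and that case-insensitive literal matching is good enough; the file names and identifiers are invented for the example.

```python
import re


def find_identifier_hits(documents, identifiers):
    """Return {doc_name: [(line_number, identifier), ...]} for literal matches.

    Matching is case-insensitive; identifiers are escaped so they are never
    interpreted as regular expressions.
    """
    patterns = {
        ident: re.compile(re.escape(ident), re.IGNORECASE) for ident in identifiers
    }
    hits = {}
    for name, text in documents.items():
        doc_hits = []
        for line_number, line in enumerate(text.splitlines(), start=1):
            for ident, pattern in patterns.items():
                if pattern.search(line):
                    doc_hits.append((line_number, ident))
        if doc_hits:
            hits[name] = doc_hits
    return hits


documents = {
    "app.log": "2025-01-03 login ok user=jdoe\n2025-01-03 email sent to Jane@Example.com",
    "notes.txt": "Nothing personal here.",
}
hits = find_identifier_hits(documents, ["jane@example.com", "jdoe"])
```

Even this toy version has to read every line of every document, which hints at why unstructured sources dominate the cost of a search.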

Response Timeliness
Regulatory deadlines force organizations to respond quickly, even when data must be gathered from multiple sources and reviewed by legal teams. Manual processes can lead to delays, risking non-compliance and fines.
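Tracking the clock itself is simple date arithmetic. The sketch below assumes a flat 30-day window, taken from the figure mentioned earlier, and a 7-day warning threshold chosen arbitrarily for illustration; actual deadlines and permitted extensions depend on the regulation that applies.

```python
from datetime import date, timedelta

RESPONSE_WINDOW_DAYS = 30  # assumption: flat 30-day window
WARN_WITHIN_DAYS = 7       # assumption: arbitrary warning threshold


def due_date(received, window_days=RESPONSE_WINDOW_DAYS):
    """Date by which the response must be sent."""
    return received + timedelta(days=window_days)


def days_remaining(received, today, window_days=RESPONSE_WINDOW_DAYS):
    """Days left before the due date; negative once the deadline has passed."""
    return (due_date(received, window_days) - today).days


def is_at_risk(received, today, warn_within_days=WARN_WITHIN_DAYS):
    """True when the request is overdue or close to its deadline."""
    return days_remaining(received, today) <= warn_within_days


received = date(2025, 4, 1)
deadline = due_date(received)
remaining = days_remaining(received, today=date(2025, 4, 25))
at_risk = is_at_risk(received, today=date(2025, 4, 25))
```

A request received on April 1 is due on May 1 under these assumptions, so by April 25 it is already inside the warning window.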

Volume & Scalability
While most organizations can handle an occasional DSAR manually, spikes in request volume — driven by events like regulatory campaigns or publicized incidents — can overwhelm privacy and legal teams. Without scalable automation, organizations face mounting operational costs, missed deadlines, and an increased risk of inconsistent or incomplete responses.


The Role of Data Security Platforms in DSAR Automation

Sentra is a modern data security platform dedicated to helping organizations gain complete visibility and control over their sensitive data. By continuously scanning and classifying data across all environments (including cloud, SaaS, and on-premises systems), Sentra maintains an always up-to-date data map, giving organizations a clear understanding of where sensitive data resides, how it flows, and who has access to it. This data map forms the foundation for efficient DSAR automation, enabling Sentra’s DSAR module to search for user identifiers only in locations where relevant data actually exists - ensuring high accuracy, completeness, and fast response times.

[Figure: Data Security Platform example of a US SSN finding]

Another key factor in managing DSAR requests is ensuring that sensitive customer PII doesn’t end up in unauthorized or unintended environments. When data is copied between systems or environments, it’s essential to apply tokenization or masking to prevent unintentional sprawl of PII. Sentra helps identify misplaced or duplicated sensitive data and alerts when it isn’t properly protected. This allows organizations to focus DSAR processing within authorized operational environments, significantly reducing both risk and response time.

Smart Search of Individual Data

To initiate the generation of a Data Subject Access Request (DSAR) report, users can submit one or more unique identifiers—such as email addresses, Social Security numbers, usernames, or other personal identifiers—corresponding to the individual in question. Sentra then performs a targeted scan across the organization’s data ecosystem, focusing on data stores known to contain personally identifiable information (PII). This includes production databases, data lakes, cloud storage services, file servers, and both structured and unstructured data sources.

Leveraging its advanced classification and correlation capabilities, Sentra identifies all relevant records associated with the provided identifiers. Once the scan is complete, it compiles a comprehensive DSAR report that consolidates all discovered personal data linked to the data subject. The report can be downloaded as a PDF for manual review or securely retrieved via Sentra’s API.
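The benefit of a targeted scan is easiest to see in a toy model. The sketch below is not Sentra’s implementation or API; `data_map`, `store_contents`, and both functions are hypothetical stand-ins that show the general idea of consulting a data map first and searching only the stores known to hold the relevant kind of identifier.

```python
def stores_to_search(data_map, identifier_types):
    """Pick only stores whose known data classes overlap the identifier types."""
    wanted = set(identifier_types)
    return sorted(name for name, classes in data_map.items() if wanted & set(classes))


def targeted_search(data_map, store_contents, identifiers):
    """Search only relevant stores.

    identifiers: {identifier_type: value}
    Returns {store_name: [matching records]}.
    """
    results = {}
    for store in stores_to_search(data_map, identifiers.keys()):
        matches = [
            record
            for record in store_contents.get(store, [])
            if any(record.get(kind) == value for kind, value in identifiers.items())
        ]
        if matches:
            results[store] = matches
    return results


# Hypothetical data map: which data classes each store is known to contain.
data_map = {
    "orders_db": {"email", "address"},
    "metrics_lake": set(),
    "hr_share": {"ssn", "email"},
}
store_contents = {
    "orders_db": [
        {"email": "jane@example.com", "order": 17},
        {"email": "sam@example.com", "order": 18},
    ],
    "metrics_lake": [{"cpu": 0.4}],
    "hr_share": [{"ssn": "123-45-6789", "email": "jane@example.com"}],
}
searched = stores_to_search(data_map, ["email"])
results = targeted_search(data_map, store_contents, {"email": "jane@example.com"})
```

`metrics_lake` is never opened, because the map already says it holds no personal data; in a large estate that pruning is where most of the time is saved.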

[Figure: DSAR requests]

Establishing a DSAR Processing Pipeline

Large organizations that receive a high volume of DSAR (Data Subject Access Request) submissions typically implement a robust, end-to-end DSAR processing pipeline. This pipeline is often initiated through a self-service privacy portal, allowing individuals to easily submit requests for access or deletion of their personal data. Once a request is received, an automated or semi-automated workflow is triggered to handle the request efficiently and in compliance with regulatory timelines. The workflow typically includes the following steps:

  1. Requester Identity Verification: Confirm the identity of the data subject to prevent unauthorized access (e.g., via email confirmation or secure login).

  2. Mapping Identifiers: Collect and map all known identifiers for the individual across systems (e.g., email, user ID, customer number).

  3. Environment-Wide Data Discovery (via Sentra): Use Sentra to search all relevant environments (cloud, SaaS, on-prem) for personal data tied to the individual. Because its automated discovery and classification keep track of where personal data resides, Sentra can automatically identify where to search.

  4. DSAR Report Generation (via Sentra): Compile a detailed report listing all personal data found and where it resides.

  5. Data Deletion & Verification: Remove or anonymize personal data as required, then rerun a search to verify deletion is complete.

  6. Final Response to Requester: Send a confirmation to the requester, outlining the actions taken and closing the request.

Sentra plays a key role in the DSAR pipeline by exposing a powerful API that enables automated, organization-wide searches for personal data. The search results can be programmatically used to trigger downstream actions like data deletion. After removal, the API can initiate a follow-up scan to verify that all data has been successfully deleted.
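The search, delete, and verify loop described above can be sketched end to end with an in-memory stand-in. Nothing here reflects Sentra’s actual API: `InMemoryDiscovery`, its `search` and `delete` methods, and `process_deletion_request` are all hypothetical names, and identity verification is reduced to a boolean supplied by the caller.

```python
class InMemoryDiscovery:
    """Hypothetical stand-in for a discovery service; not a real API."""

    def __init__(self, stores):
        self.stores = stores  # {store_name: [record dicts]}

    def search(self, identifiers):
        """Return {store_name: [records containing any identifier value]}."""
        wanted = set(identifiers)
        found = {}
        for name, rows in self.stores.items():
            matches = [row for row in rows if wanted & set(row.values())]
            if matches:
                found[name] = matches
        return found

    def delete(self, identifiers):
        """Drop every record that contains any of the identifier values."""
        wanted = set(identifiers)
        for name in list(self.stores):
            self.stores[name] = [
                row for row in self.stores[name] if not wanted & set(row.values())
            ]


def process_deletion_request(discovery, verified, identifiers):
    """Run the pipeline steps for a deletion request."""
    if not verified:  # step 1: identity verification
        return {"status": "rejected", "reason": "identity not verified"}
    found = discovery.search(identifiers)  # steps 2-3: identifiers mapped, then discovery
    report = {name: len(rows) for name, rows in found.items()}  # step 4: report
    discovery.delete(identifiers)  # step 5: deletion
    leftover = discovery.search(identifiers)  # step 5: verification scan
    status = "incomplete" if leftover else "completed"
    return {"status": status, "report": report}  # step 6: response payload


discovery = InMemoryDiscovery(
    {
        "crm": [{"email": "jane@example.com"}, {"email": "sam@example.com"}],
        "logs": [{"user": "jdoe"}],
    }
)
result = process_deletion_request(discovery, True, ["jane@example.com", "jdoe"])
```

The second `search` call is the important design choice: the request is only marked `completed` when a fresh scan finds nothing, rather than when the delete call returns.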

Benefits of DSAR Automation 

With privacy regulations multiplying and DSAR volumes continuing to rise, building an automated, scalable pipeline is no longer a luxury - it’s a necessity.


  • Automated and Cost-Efficient: Replaces costly, error-prone manual processes with a streamlined, automated approach.
  • High-Speed, High-Accuracy: Sentra leverages its knowledge of where PII resides to perform targeted searches across all environments and data types, delivering comprehensive reports in hours—not days.
  • Seamless Integration: A powerful API allows integration with workflow systems, enabling a fully automated, end-to-end DSAR experience for end users.

By using Sentra to intelligently locate PII across all environments, organizations can eliminate manual bottlenecks and accelerate response times. Sentra’s powerful API and deep data awareness make it possible to automate every step of the DSAR journey - from discovery to deletion - enabling privacy teams to operate at scale, reduce costs, and maintain compliance with confidence. 

Turning DSAR Compliance into a Scalable Advantage

As privacy expectations grow and regulatory pressure intensifies, DSARs are no longer just a checkbox. They are a reflection of how seriously an organization takes user trust. Manual, reactive processes simply can’t keep up with the scale and complexity of modern data environments.

By automating DSAR workflows with tools like Sentra, organizations can achieve faster response times, lower operational costs, and sustained compliance - while freeing up teams to focus on higher-value privacy initiatives.

David Stuart
April 3, 2025
3 Min Read
Data Security

The Rise of Next-Generation DSPs

Recently there has been a significant shift from standalone Data Security Posture Management (DSPM) solutions to comprehensive Data Security Platforms (DSPs). These platforms integrate DSPM functionality, but also encompass access governance, threat detection, and data loss prevention capabilities to provide a more holistic data protection solution. Additionally, the critical role of data in AI and LLM training requires holistic data security platforms that can manage data sensitivity, ensure security and compliance, and maintain data integrity.

This consolidation will improve security effectiveness and help organizations manage the growing complexity of their IT environments. Originally more of a governance and compliance tool, the DSP has evolved into a critical necessity for organizations managing sensitive data in sprawling cloud environments. With the explosion of cloud adoption, stricter regulatory landscapes, and the increasing sophistication of cyber threats, DSPs will continue to evolve to address the monumental data scale expected.

DSP Addressing Modern Challenges in 2025

As the threat landscape evolves, DSP is shifting to address modern challenges. New trends such as AI integration, real-time threat detection, and cloud-native architectures are transforming how organizations approach data security. DSPM is no longer just about assuring compliance and proper data governance; it’s about mitigating all data risks, monitoring for new threats, and proactively resolving them in real time.

Must-Have DSP Features for 2025

Over the years, Data Security Platforms (DSPs) have evolved significantly, with a range of providers emerging to address the growing need for robust data security in cloud environments. Initially, smaller startups began offering innovative solutions, and in 2024, several of these providers were acquired, signaling the increasing demand for comprehensive data protection. As organizations continue to prioritize securing their cloud data, it's essential to carefully evaluate DSP solutions to ensure they meet key security needs. When assessing DSP options for 2025, certain features stand out as critical for ensuring a comprehensive and effective approach to data security.

The must-have features for any DSP solution in the coming year are outlined below:

  1. Cloud-Native Architecture

Modern DSPs are built for the cloud and address vast data scale with cloud-native technologies that leverage provider APIs and functions. This allows data discovery and classification to occur autonomously within the customer cloud environment, leveraging existing compute resources. Agentless approaches reduce administrative burdens as well.

  2. AI-Based Classification

AI has revolutionized data classification, providing context-aware accuracy exceeding 95%. By understanding data in its unique context, AI-driven DSP solutions ensure the right security measures are applied without overburdening teams with false positives.

  3. Anomaly Detection and Real-Time Threat Detection

Anomaly detection, powered by Data Detection and Response (DDR), identifies unusual patterns in data usage to spotlight risks such as ransomware and insider threats. Combined with real-time, data-aware detection of suspicious activities, modern DSP solutions proactively address cloud-native vulnerabilities, stopping breaches before they unfold and ensuring swift, effective action.
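A very simple form of this kind of detection compares today’s activity against a historical baseline. The sketch below flags a daily access count that sits far above the mean, measured in sample standard deviations; the threshold of 3 and the sample numbers are assumptions, and production DDR systems use much richer signals than a single count.

```python
from statistics import mean, stdev


def is_anomalous(history, today, threshold=3.0):
    """Flag today's count if it is more than `threshold` sample standard
    deviations above the historical mean.
    """
    if len(history) < 2:
        return False  # not enough data to estimate spread
    baseline, spread = mean(history), stdev(history)
    if spread == 0:
        return today > baseline  # flat history: any increase stands out
    return (today - baseline) / spread > threshold


# Hypothetical daily read counts for one service account against one data store.
history = [100, 110, 95, 105, 90, 100]
spike = is_anomalous(history, today=400)
normal = is_anomalous(history, today=108)
```

A jump to 400 reads against a baseline near 100 is flagged, while 108 falls within ordinary day-to-day variation.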

  4. Automatic Labeling

Manual tagging is too cumbersome and time-consuming. When choosing a DSP solution, it’s critical to pick one that automates data tagging and labeling and integrates seamlessly with Data Loss Prevention (DLP), Secure Access Service Edge (SASE), and governance platforms. This reduces errors and accelerates compliance processes.

  5. Data Zones and Perimeters

As data moves across cloud environments, maintaining control is paramount. Leading DSP solutions monitor data movement, alerting teams when data crosses predefined perimeters or storage zones, ensuring compliance with internal and external policies.
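A perimeter rule of this kind boils down to checking each observed movement against an allow-list. The following is a minimal sketch; the data classes, region names, and the shape of a movement event are assumptions for illustration only.

```python
# Hypothetical policy: which zones each data class is allowed to live in.
ALLOWED_ZONES = {
    "customer_pii": {"eu-west-1", "eu-central-1"},
}


def perimeter_violations(movements, allowed_zones):
    """Return the movements whose destination falls outside the allowed zones.

    Data classes with no policy entry are treated as unrestricted.
    """
    violations = []
    for move in movements:
        allowed = allowed_zones.get(move["data_class"])
        if allowed is not None and move["destination"] not in allowed:
            violations.append(move)
    return violations


movements = [
    {"data_class": "customer_pii", "source": "eu-west-1", "destination": "us-east-1"},
    {"data_class": "customer_pii", "source": "eu-west-1", "destination": "eu-central-1"},
    {"data_class": "public_docs", "source": "eu-west-1", "destination": "us-east-1"},
]
violations = perimeter_violations(movements, ALLOWED_ZONES)
```

Only the first movement is reported: it takes customer PII out of the permitted regions, while the other two either stay inside the perimeter or involve data with no zone restriction.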

  6. Automatic Remediation and Enforcement

Automation extends to remediation, with DSPs swiftly addressing data risks like excessive permissions or misconfigurations. By enforcing protection policies across cloud environments, organizations can prevent breaches before they occur.
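For excessive permissions, one simple heuristic is to compare what each principal has been granted with what it has actually used, and propose revoking the difference. This is a sketch under that assumption; the principal and resource names are invented, and real enforcement would also need review windows and approval steps.

```python
def unused_grants(granted, used):
    """Return {principal: sorted resources granted but never accessed}.

    granted, used: {principal: set of resource names}
    """
    proposals = {}
    for principal, resources in granted.items():
        unused = resources - used.get(principal, set())
        if unused:
            proposals[principal] = sorted(unused)
    return proposals


granted = {
    "analyst": {"sales_db", "hr_share"},
    "etl_job": {"sales_db"},
}
used = {
    "analyst": {"sales_db"},
    "etl_job": {"sales_db"},
}
revocations = unused_grants(granted, used)
```

The output is a list of proposed revocations rather than an action, which keeps the decision to enforce separate from the detection logic.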

The Business Case for DSP in 2025

Proactive Security

Cloud-native DSP represents a shift from reactive to proactive security practices. By identifying and addressing risks early, and across their entire data estate from cloud to on-premises, organizations can mitigate potential threats and strengthen their security posture.

Regulatory Compliance

As regulations such as GDPR and CCPA continue to evolve, DSPM solutions play a crucial role in simplifying compliance by automating data discovery and labeling. This automation reduces the manual effort required to meet regulatory requirements. In fact, 84% of security and IT professionals consider data protection frameworks like GDPR and CCPA to be mandatory for their industries, emphasizing the growing need for automated solutions to ensure compliance.

The Rise of Gen AI

The rise of Gen AI is expected to be a main theme in 2025. Gen AI is a driver for data proliferation in the cloud and for a transition from legacy data technologies to modern ones that require an updated data security program.

Operational Efficiency

By automating repetitive tasks, DSPM significantly reduces the workload for security teams. This efficiency allows teams to focus on strategic initiatives rather than firefighting. According to a 2024 survey, organizations using DSPM reported a 40% reduction in time spent on manual data management tasks, demonstrating its impact on operational productivity.

Future-Proofing Your Organization with Cloud-Native DSP

To thrive in the evolving security landscape, organizations must adopt forward-looking strategies. Cloud-native DSP tools integrate seamlessly with broader security frameworks, ensuring resilience and adaptability. As technology advances, features like predictive analytics and deeper AI integration will further enhance capabilities.

Conclusion

Data security challenges are only becoming more complex, but new Data Security Platforms (DSPs) provide the tools to meet them head-on. Now is the time for organizations to take a hard look at their security posture and consider how DSPs can help them stay protected, compliant, and trusted. DSPs are quickly becoming essential to business operations, influencing strategic decisions and enabling faster, more secure innovation.

Ready to see it in action?

Request a demo to discover how a modern DSP can strengthen your security and support your goals.
