All Resources
In this article:
minus iconplus icon
Share the Blog

Transforming Data Security with Large Language Models (LLMs): Sentra’s Innovative Approach

September 12, 2023
5
Min Read
AI and ML

In today's data-driven world, the success of any data security program hinges on the accuracy, speed, and scalability of its data classification efforts. Why? Because not all data is created equal, and precise data classification lays the essential groundwork for security professionals to understand the context of data-related risks and vulnerabilities. Armed with this knowledge, security operations (SecOps) teams can remediate in a targeted, effective, and prioritized manner, with the ultimate aim of proactively reducing an organization's data attack surface and risk profile over time.

Sentra is excited to introduce Large Language Models (LLMs) into its classification engine. This development empowers enterprises to proactively reduce the data attack surface while accurately identifying and understanding sensitive unstructured data such as employee contracts, source code, and user-generated content at scale.

Many enterprises today grapple with a multitude of data regulations and privacy frameworks while navigating the intricate world of cloud data. Sentra's announcement of adding LLMs to its classification engine is redefining how enterprise security teams understand, manage, and secure their sensitive and proprietary data on a massive scale. Moreover, as enterprises eagerly embrace AI's potential, they must also address unauthorized access or manipulation of Language Model Models (LLMs) and remain vigilant in detecting and responding to security risks associated with AI model training. Sentra is well-equipped to guide enterprises through this multifaceted journey.

A New Era of Data Classification 

Identifying and managing unstructured data has always been a headache for organizations,  whether it's legal documents buried in email attachments, confidential source code scattered across various folders, or user-generated content strewn across collaboration platforms. Imagine a scenario where an enterprise needs to identify all instances of employee contracts within its vast data repositories. Previously, this would have involved painstaking manual searches, leading to inefficiency, potential oversight, and increased security risks.

Sentra’s LMM-powered classification engine can now comprehend the context, sentiment, and nuances of unstructured data, enabling it to classify such data with a level of accuracy and granularity that was previously unimaginable. The model can analyze the content of documents, emails, and other unstructured data sources, not only identifying employee contracts but also providing valuable insights into their context. It can understand contract clauses, expiration dates, and even flag potential compliance issues. Similarly, for source code scattered across diverse folders, Sentra can recognize programming languages, identify proprietary code, and ensure that sensitive code is adequately protected.

When it comes to user-generated content on collaboration platforms, Sentra can analyze and categorize this data, making it easier for organizations to monitor and manage user interactions, ensuring compliance with their policies and regulations. This new classification approach not only aids in understanding the business context of unstructured customer data but also aligns seamlessly with compliance standards such as GDPR, CCPA, and HIPAA. Ensuring the highest level of security, Sentra exclusively scans data with LLM-based classifiers within the enterprise's cloud premises. The assurance that the data never leaves the organization’s environment reduces an additional layer of risk.

Quantifying Risk: Prioritized Data Risk Scores 

Automated data classification capabilities provide a solid foundation for data security management practices. What’s more, data classification speed and accuracy are paramount when striving for an in-depth comprehension of sensitive data and quantifying risk. 

Sentra offers data risk scoring that considers multiple layers of data, including sensitivity scores, access permissions, user activity, data movement, and misconfigurations. This unique technology automatically scores the most critical data risks, providing security teams and executives with a clear, prioritized view of all their sensitive data at-risk, with the option to drill down deeply into the root cause of the vulnerability (often at a code level). 

Having a clear, prioritized view of high-risk data at your fingertips empowers security teams to truly understand, quantify, and prioritize data risks while directing targeted remediation efforts.

The Power of Accuracy and Efficiency

One of the most significant advantages of Sentra's LLM-powered data classification is the unprecedented accuracy it brings to the table. Inaccurate or incomplete data classification can lead to costly consequences, including data breaches, regulatory fines, and reputational damage. With LLMs, Sentra ensures that your data is classified with  precision, reducing the risk of errors and omissions. Moreover, this enhanced accuracy translates into increased efficiency. Sentra's LLM engine can process vast volumes of data in a fraction of the time it would take a human workforce. This not only saves valuable resources but also enables organizations to proactively address security and compliance challenges.

Key developments of Sentra's classification engine encompass:

  • Automatic classification of proprietary customer data with additional context to comply with regulations and privacy frameworks.
  • LLM-powered scanning of data asset content and analysis of metadata, including file names, schemas, and tags.
  • The capability for enterprises to train their LLMs and seamlessly integrate them into Sentra's classification engine for improved proprietary data classification.

We are excited about the possibilities that this advancement will unlock for our customers as we continue to innovate and redefine cloud data security. To learn more about Sentra’s LMM-powered classification engine, request a demo today.

Discover Ron’s expertise, shaped by over 20 years of hands-on tech and leadership experience in cybersecurity, cloud, big data, and machine learning. As a serial entrepreneur and seed investor, Ron has contributed to the success of several startups, including Axonius, Firefly, Guardio, Talon Cyber Security, and Lightricks, after founding a company acquired by Oracle.

Subscribe

Latest Blog Posts

Gilad Golani
Gilad Golani
January 18, 2026
3
Min Read

False Positives Are Killing Your DSPM Program: How to Measure Classification Accuracy

False Positives Are Killing Your DSPM Program: How to Measure Classification Accuracy

As more organizations move sensitive data to the cloud, Data Security Posture Management (DSPM) has become a critical security investment. But as DSPM adoption grows, a big problem is emerging: security teams are overwhelmed by false positives that create too much noise and not enough useful insight. If your security program is flooded with unnecessary alerts, you end up with more risk, not less.

Most enterprises say their existing data discovery and classification solutions fall short, primarily because they misclassify data. False positives waste valuable analyst time and deteriorate trust in your security operation. Security leaders need to understand what high-quality data classification accuracy really is, why relying only on regex fails, and how to use objective metrics like precision and recall to assess potential tools. Here’s a look at what matters most for accuracy in DSPM.

What Does Good Data Classification Accuracy Look Like?

To make real progress with data classification accuracy, you first need to know how to measure it. Two key metrics - precision and recall - are at the core of reliable classification. Precision tells you the share of correct positive results among everything identified as positive, while recall shows the percentage of actual sensitive items that get caught. You want both metrics to be high. Your DSPM solution should identify sensitive data, such as PII or PCI, without generating excessive false or misclassified results.

The F1-score adds another perspective, blending precision and recall for a single number that reflects both discovery and accuracy. On the ground, these metrics mean fewer false alerts, quicker responses, and teams that spend their time fixing problems rather than chasing noise. "Good" data classification produces consistent, actionable results, even as your cloud data grows and changes.

The Hidden Cost of Regex-Only Data Discovery

A lot of older DSPM tools still depend on regular expressions (regex) to classify data in both structured and unstructured systems. Regex works for certain fixed patterns, but it struggles with the diverse, changing data types common in today’s cloud and SaaS environments. Regex can't always recognize if a string that “looks” like a personal identifier is actually just a random bit of data. This results in security teams buried by alerts they don’t need, leading to alert fatigue.

Far from helping, regex-heavy approaches waste resources and make it easier for serious risks to slip through. As privacy regulations become more demanding and the average breach hit $4.4 million according to the annual "Cost of a Data Breach Report" by IBM and the Ponemon Institute, ignoring precision and recall is becoming increasingly costly.

How to Objectively Test DSPM Accuracy in Your POC

If your current DSPM produces more noise than value, a better method starts with clear testing. A meaningful proof-of-value (POV) process uses labeled data and a confusion matrix to calculate true positives, false positives, and false negatives. Don’t rely on vendor promises. Always test their claims with data from your real environment. Ask hard questions: How does the platform classify unstructured data? How much alert noise can you expect? Can it keep accuracy high even when scanning huge volumes across SaaS, multi-cloud, and on-prem systems? The best DSPM tool cuts through the clutter, surfacing only what matters.

Sentra Delivers Highest Accuracy with Small Language Models and Context

Sentra’s DSPM platform raises the bar by going beyond regex, using purpose-built small language models (SLMs) and advanced natural language processing (NLP) for context-driven data classification at scale. Customers and analysts consistently report that Sentra achieves over the highest classification accuracy for PII and PCI, with very few false positives.

Gartner Review - Sentra received 5 stars

How does Sentra get these results without data ever leaving your environment? The platform combines multi-cloud discovery, agentless install, and deep contextual awareness - scanning extensive environments and accurately discerning real risks from background noise. Whether working with unstructured cloud data, ever-changing SaaS content, or traditional databases, Sentra keeps analysts focused on real issues and helps you stay compliant. Instead of fighting unnecessary alerts, your team sees clear results and can move faster with confidence.

Want to see Sentra DSPM in action? Schedule a Demo.

Reducing False Positives Produces Real Outcomes

Classification accuracy has a direct impact on whether your security is efficient or overwhelmed. With compliance rules tightening and threats growing, security teams cannot afford DSPM solutions that bury them in false positives. Regex-only tools no longer cut it - precision, recall, and truly reliable results should be standard.

Sentra’s SLM-powered, context-aware classification delivers the trustworthy performance businesses need, changing DSPM from just another alert engine to a real tool for reducing risk. Want to see the difference yourself? Put Sentra’s accuracy to the test in your own environment and finally move past false positive fatigue.

<blogcta-big>

Read More
Ward Balcerzak
Ward Balcerzak
January 14, 2026
4
Min Read

The Real Business Value of DSPM: Why True ROI Goes Beyond Cost Savings

The Real Business Value of DSPM: Why True ROI Goes Beyond Cost Savings

As enterprises scale cloud usage and adopt AI, the value of Data Security Posture Management (DSPM) is no longer just about checking a tool category box. It’s about protecting what matters most: sensitive data that fuels modern business and AI workflows.

Traditional content on DSPM often focuses on cost components and deployment considerations. That’s useful, but incomplete. To truly justify DSPM to executives and boards, security leaders need a holistic, outcome-focused view that ties data risk reduction to measurable business impact.

In this blog, we unpack the real, measurable benefits of DSPM, beyond just cost savings, and explain how modern DSPM strategies deliver rapid value far beyond what most legacy tools promise. 

1. Visibility Isn’t Enough - You Need Context

A common theme in DSPM discussions is that tools help you see where sensitive data lives. That’s important, but it’s only the first step. Real value comes from understanding context. Who can access the data, how it’s being used, and where risk exists in the wider security posture. Organizations that stop at discovery often struggle to prioritize risk and justify spend.

Modern DSPM solutions go further by:

  • Correlating data locations with access rights and usage patterns
  • Mapping sensitive data flows across cloud, SaaS, and hybrid environments
  • Detecting shadow data stores and unmanaged copies that silently increase exposure
  • Linking findings to business risk and compliance frameworks

This contextual intelligence drives better decisions and higher ROI because teams aren’t just counting sensitive data, they’re continuously governing it.

2. DSPM Saves Time and Shrinks Attack Surface Fast

One way DSPM delivers measurable business value is by streamlining functions that used to be manual, siloed, and slow:

  • Automated classification reduces manual tagging and human error
  • Continuous discovery eliminates periodic, snapshot-alone inventories
  • Policy enforcement reduces time spent reacting to audit requests

This translates into:

  • Faster compliance reporting
  • Shorter audit cycles
  • Rapid identification and remediation of critical risks

For security leaders, the speed of insight becomes a competitive advantage, especially in environments where data volumes grow daily and AI models can touch every corner of the enterprise.

3. Cost Benefits That Matter, but with Context

Lately I’m hearing many DSPM discussions break down cost components like scanning compute, licensing, operational expenses, and potential cloud savings. That’s a good start because DSPM can reduce cloud waste by identifying stale or redundant data, but it’s not the whole story.

 

Here’s where truly strategic DSPM differs:

Operational Efficiency

When DSPM tools automate discovery, classification, and risk scoring:

  • Teams spend less time on manual reports
  • Alert fatigue drops as noise is filtered
  • Engineers can focus on higher-value work

Breach Avoidance

Data breaches are expensive. According to industry studies, the average cost of a data breach runs into millions, far outweighing the cost of DSPM itself. A DSPM solution that prevents even one breach or major compliance failure pays for itself tenfold

Compliance as a Value Center

Rather than treating compliance as a cost center consider that:

  • DSPM reduces audit overhead
  • Provides automated evidence for frameworks like GDPR, HIPAA, PCI DSS
  • Improves confidence in reporting accuracy

That’s a measurable business benefit CFOs can appreciate and boards expect.

4. DSPM Reduces Risk Vector Multipliers Like AI

One benefit that’s often under-emphasized is how DSPM reduces risk vector multipliers, the factors that amplify risk exponentially beyond simple exposure counts.

In 2026 and beyond, AI systems are increasingly part of the risk profile. Modern DSPM help reduce the heightened risk from AI by:

  • Identifying where sensitive data intersects with AI training or inference pipelines
  • Governing how AI tools and assistants can access sensitive content
  • Providing risk context so teams can prevent data leakage into LLMs

This kind of data-centric, contextual, and continuous governance should be considered a requirement for secure AI adoption, no compromise.

5. Telling the DSPM ROI Story

The most convincing DSPM ROI stories aren’t spreadsheets, they’re narratives that align with business outcomes. The key to building a credible ROI case is connecting metrics, security impact, and business outcomes:

Metric Security Impact Business Outcome
Faster discovery & classification Fewer blind spots Reduced breach likelihood
Consistent governance enforcement Fewer compliance issues Lower audit cost
Contextual risk scoring Better prioritization Efficient resource allocation
AI governance Controlled AI exposure Safe innovation

By telling the story this way, security leaders can speak in terms the board and executives care about: risk reduction, compliance assurance, operational alignment, and controlled growth.

How to Evaluate DSPM for Real ROI

To capture tangible return, don’t evaluate DSPM solely on cost or feature checklists. Instead, test for:

1. Scalability Under Real Load

Can the tool discover and classify petabytes of data, including unstructured content, without degrading performance?

2. Accuracy That Holds Up

Poor classification undermines automation. True ROI requires consistent, top-performing accuracy rates.

3. Operational Cost Predictability

Beware of DSPM solutions that drive unexpected cloud expenses due to inefficient scanning or redundant data reads.

4. Integration With Enforcement Workflows

Visibility without action isn’t ROI. Your DSPM should feed DLP, IAM/CIEM, SIEM/SOAR, and compliance pipelines (ticketing, policy automation, alerts).

ROI Is a Journey, Not a Number

Costs matter, but value lives in context. DSPM is not just a cost center, it’s a force multiplier for secure cloud operations, AI readiness, compliance, and risk reduction. Instead of seeing DSPM as another tool, forward-looking teams view it as a fundamental decision support engine that changes how risk is measured, prioritized, and controlled.

Ready to See Real DSPM Value in Your Environment?

Download Sentra’s “DSPM Dirty Little Secrets” guide, a practical roadmap for evaluating DSPM with clarity, confidence, and production reality in mind.

👉 Download the DSPM Dirty Little Secrets guide now

Want a personalized walkthrough of how Sentra delivers measurable DSPM value?
👉 Request a demo

<blogcta-big>

Read More
Ofir Yehoshua
Ofir Yehoshua
January 13, 2026
3
Min Read

Why Infrastructure Security Is Not Enough to Protect Sensitive Data

Why Infrastructure Security Is Not Enough to Protect Sensitive Data

For years, security programs have focused on protecting infrastructure: networks, servers, endpoints, and applications. That approach made sense when systems were static and data rarely moved. It’s no longer enough.

Recent breach data shows a consistent pattern. Organizations detect incidents, restore systems, and close tickets, yet remain unable to answer the most important question regulators and customers often ask:

Where does my sensitive data reside?

Who or what has access to this data and are they authorized?

Which specific sensitive datasets were accessed or exfiltrated?

Infrastructure security alone cannot answer that question.

Infrastructure Alerts Detect Events, Not Impact

Most security tooling is infrastructure-centric by design. SIEMs, EDRs, NDRs, and CSPM tools monitor hosts, processes, IPs, and configurations. When something abnormal happens, they generate alerts.

What they do not tell you is:

  • Which specific datasets were accessed
  • Whether those datasets contained PHI or PII
  • Whether sensitive data was copied, moved, or exfiltrated

Traditional tools monitor the "plumbing" (network traffic, server logs, etc.) While they can flag that a database was accessed by an unauthorized IP, they often cannot distinguish between an attacker downloading a public template or downloading a table containing 50,000 Social Security numbers. An alert is not the same as understanding the exposure of the data stored inside it. Without that context, incident response teams are forced to infer impact rather than determine it.

The “Did They Access the Data?” Problem

This gap becomes pronounced during ransomware and extortion incidents.

In many cases:

  • Operations are restored from backups
  • Infrastructure is rebuilt
  • Access is reduced
  • (Hopefully!) attackers are removed from the environment

Yet organizations still cannot confirm whether sensitive data was accessed or exfiltrated during the dwell time.

Without data-level visibility:

  • Legal and compliance teams must assume worst-case exposure
  • Breach notifications expand unnecessarily
  • Regulatory penalties increase due to uncertainty, not necessarily damage

The inability to scope an incident accurately is not a tooling failure during the breach, it is a visibility failure that existed long before the breach occurred. Under regulations like GDPR or CCPA/CPRA, if an organization cannot prove that sensitive data wasn’t accessed during a breach, they are often legally required to notify all potentially affected parties. This ‘over-notification’ is costly and damaging to reputation.

Data Movement Is the Real Attack Vulnerability

Modern environments are defined by constant data movement:

  • Cloud migrations
  • SaaS integrations
  • App dev lifecycles
  • Analytics and ETL pipelines
  • AI and ML workflows

Each transition creates blind spots.

Legacy platforms awaiting migration often exist in a “wait state” with reduced monitoring. Data copied into cloud storage or fed into AI pipelines frequently loses lineage and classification context. Posture may vary and traditional controls no longer apply consistently. From an attacker’s perspective, these environments are ideal. From a defender’s perspective, they are blind spots.

Policies Are Not Proof

Most organizations can produce policies stating that sensitive data is encrypted, access-controlled, and monitored. Increasingly, regulators are moving from point-in-time audits to requiring continuous evidence of control.  

Regulators are asking for evidence:

  • Where does PHI live right now?
  • Who or what can access it?
  • How do you know this hasn’t changed since the last audit?

Point-in-time audits cannot answer those questions. Neither can static documentation. Exposure and access drift continuously, especially in cloud and AI-driven environments.

Compliance depends on continuous control, not periodic attestation.

What Data-Centric Security Actually Requires

Accurately proving compliance and scoping breach impact requires security visibility that is anchored to the data itself, not the infrastructure surrounding it.

At a minimum, this means:

  • Continuous discovery and classification of sensitive data
  • Consistent compliance reporting and controls across cloud, SaaS, On-Prem, and migration states
  • Clear visibility into which identities, services, and AI tools can access specific datasets
  • Detection and response signals tied directly to sensitive data exposure and movement

This is the operational foundation of Data Security Posture Management (DSPM) and Data Detection and Response (DDR). These capabilities do not replace infrastructure security controls; they close the gap those controls leave behind by connecting security events to actual data impact.

This is the problem space Sentra was built to address.

Sentra provides continuous visibility into where sensitive data lives, how it moves, and who or what can access it, and ties security and compliance outcomes to that visibility. Without this layer, organizations are forced to infer breach impact and compliance posture instead of proving it.

Why Data-Centric Security Is Required for Today's Compliance and Breach Response

Infrastructure security can detect that an incident occurred, but it cannot determine which sensitive data was accessed, copied, or exfiltrated. Without data-level evidence, organizations cannot accurately scope breaches, contain risk, or prove compliance, regardless of how many alerts or controls are in place. Modern breach response and regulatory compliance require continuous visibility into sensitive data, its lineage, and its access paths. Infrastructure-only security models are no longer sufficient.

Want to see how Sentra provides complete visibility and control of sensitive data?

Schedule a Demo

<blogcta-big>

Read More
Expert Data Security Insights Straight to Your Inbox
What Should I Do Now:
1

Get the latest GigaOm DSPM Radar report - see why Sentra was named a Leader and Fast Mover in data security. Download now and stay ahead on securing sensitive data.

2

Sign up for a demo and learn how Sentra’s data security platform can uncover hidden risks, simplify compliance, and safeguard your sensitive data.

3

Follow us on LinkedIn, X (Twitter), and YouTube for actionable expert insights on how to strengthen your data security, build a successful DSPM program, and more!

Before you go...

Get the Gartner Customers' Choice for DSPM Report

Read why 98% of users recommend Sentra.

White Gartner Peer Insights Customers' Choice 2025 badge with laurel leaves inside a speech bubble.