Transforming Data Security with Large Language Models (LLMs): Sentra’s Innovative Approach

September 12, 2023 · 5 min read · AI and ML

In today's data-driven world, the success of any data security program hinges on the accuracy, speed, and scalability of its data classification efforts. Why? Because not all data is created equal, and precise data classification lays the essential groundwork for security professionals to understand the context of data-related risks and vulnerabilities. Armed with this knowledge, security operations (SecOps) teams can remediate in a targeted, effective, and prioritized manner, with the ultimate aim of proactively reducing an organization's data attack surface and risk profile over time.

Sentra is excited to introduce Large Language Models (LLMs) into its classification engine. This development empowers enterprises to proactively reduce the data attack surface while accurately identifying and understanding sensitive unstructured data such as employee contracts, source code, and user-generated content at scale.

Many enterprises today grapple with a multitude of data regulations and privacy frameworks while navigating the intricate world of cloud data. By adding LLMs to its classification engine, Sentra is redefining how enterprise security teams understand, manage, and secure their sensitive and proprietary data on a massive scale. Moreover, as enterprises eagerly embrace AI's potential, they must also address unauthorized access to or manipulation of Large Language Models (LLMs) and remain vigilant in detecting and responding to security risks associated with AI model training. Sentra is well-equipped to guide enterprises through this multifaceted journey.

A New Era of Data Classification 

Identifying and managing unstructured data has always been a headache for organizations, whether it's legal documents buried in email attachments, confidential source code scattered across various folders, or user-generated content strewn across collaboration platforms. Imagine a scenario where an enterprise needs to identify all instances of employee contracts within its vast data repositories. Previously, this would have involved painstaking manual searches, leading to inefficiency, potential oversights, and increased security risks.

Sentra’s LLM-powered classification engine can now comprehend the context, sentiment, and nuances of unstructured data, enabling it to classify such data with a level of accuracy and granularity that was previously unimaginable. The model can analyze the content of documents, emails, and other unstructured data sources, not only identifying employee contracts but also providing valuable insights into their context. It can understand contract clauses and expiration dates, and even flag potential compliance issues. Similarly, for source code scattered across diverse folders, Sentra can recognize programming languages, identify proprietary code, and ensure that sensitive code is adequately protected.

When it comes to user-generated content on collaboration platforms, Sentra can analyze and categorize this data, making it easier for organizations to monitor and manage user interactions and to ensure compliance with their policies and regulations. This new classification approach not only aids in understanding the business context of unstructured customer data but also aligns with compliance standards such as GDPR, CCPA, and HIPAA. To keep data secure, Sentra runs its LLM-based classifiers exclusively within the enterprise's own cloud environment. Because the data never leaves the organization’s environment, an additional layer of risk is removed.

Quantifying Risk: Prioritized Data Risk Scores 

Automated data classification capabilities provide a solid foundation for data security management practices. What’s more, data classification speed and accuracy are paramount when striving for an in-depth comprehension of sensitive data and quantifying risk. 

Sentra offers data risk scoring that considers multiple layers of data, including sensitivity scores, access permissions, user activity, data movement, and misconfigurations. This technology automatically scores the most critical data risks, providing security teams and executives with a clear, prioritized view of all their sensitive data at risk, with the option to drill down into the root cause of the vulnerability (often at a code level).
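To make the idea of a layered risk score concrete, here is a minimal sketch in Python. The factor names mirror the layers listed above, but the weights, the 0-to-1 normalization, and the `risk_score` and `prioritize` helpers are illustrative assumptions, not Sentra's actual scoring model.

```python
from dataclasses import dataclass

# Hypothetical weights, chosen only for illustration; they sum to 1.0.
WEIGHTS = {
    "sensitivity": 0.35,
    "access_permissions": 0.25,
    "user_activity": 0.15,
    "data_movement": 0.15,
    "misconfigurations": 0.10,
}


@dataclass
class DataAsset:
    name: str
    factors: dict  # factor name -> normalized value in [0, 1]


def risk_score(asset: DataAsset) -> float:
    """Weighted sum of the normalized factors, scaled to 0-100."""
    total = 0.0
    for factor, weight in WEIGHTS.items():
        value = asset.factors.get(factor, 0.0)  # a missing factor counts as 0
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{factor} must be in [0, 1], got {value}")
        total += weight * value
    return round(100 * total, 1)


def prioritize(assets):
    """Return assets sorted from highest to lowest risk."""
    return sorted(assets, key=risk_score, reverse=True)
```

A weighted sum is the simplest possible design; it keeps each factor's contribution easy to explain when someone drills down into why an asset ranked where it did.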

Having a clear, prioritized view of high-risk data at your fingertips empowers security teams to truly understand, quantify, and prioritize data risks while directing targeted remediation efforts.

The Power of Accuracy and Efficiency

One of the most significant advantages of Sentra's LLM-powered data classification is the accuracy it brings to the table. Inaccurate or incomplete data classification can lead to costly consequences, including data breaches, regulatory fines, and reputational damage. With LLMs, Sentra classifies your data with precision, reducing the risk of errors and omissions. Moreover, this enhanced accuracy translates into increased efficiency. Sentra's LLM engine can process vast volumes of data in a fraction of the time it would take a human workforce. This not only saves valuable resources but also enables organizations to proactively address security and compliance challenges.

Key developments of Sentra's classification engine encompass:

  • Automatic classification of proprietary customer data with additional context to comply with regulations and privacy frameworks.
  • LLM-powered scanning of data asset content and analysis of metadata, including file names, schemas, and tags.
  • The capability for enterprises to train their own LLMs and integrate them into Sentra's classification engine for improved proprietary data classification.

We are excited about the possibilities that this advancement will unlock for our customers as we continue to innovate and redefine cloud data security. To learn more about Sentra’s LLM-powered classification engine, request a demo today.

Discover Ron’s expertise, shaped by over 20 years of hands-on tech and leadership experience in cybersecurity, cloud, big data, and machine learning. As a serial entrepreneur and seed investor, Ron has contributed to the success of several startups, including Axonius, Firefly, Guardio, Talon Cyber Security, and Lightricks, after founding a company acquired by Oracle.

Latest Blog Posts

Yair Cohen · February 5, 2026 · 3 min read

OpenClaw (MoltBot): The AI Agent Security Crisis Enterprises Must Address Now

OpenClaw, previously known as MoltBot, isn't just another cybersecurity story - it's a wake-up call for every organization. With over 150,000 GitHub stars and more than 300,000 users in just two months, OpenClaw’s popularity signals a huge change: autonomous AI agents are spreading quickly and dramatically broadening the attack surface in businesses. This is far beyond the risks of a typical ChatGPT plugin or a staff member pasting data into a chatbot. These agents live on user machines and servers with shell-level access, file system privileges, live memory control, and broad integration abilities, usually outside IT or security’s purview.

Older perimeter and endpoint security tools weren’t built to find or control agents that can learn, store information, and act independently in all kinds of environments. As organizations face this shadow AI risk, the need for real-time, data-level visibility becomes critical. Enter Data Security Posture Management (DSPM): a way for enterprises to understand, monitor, and respond to the unique threats that OpenClaw and its next-generation kin pose.

What makes OpenClaw different - and uniquely dangerous - for security teams?

OpenClaw runs by setting up a local HTTP server and agent gateway on endpoints. It provides shell access, automates browsers, and links with over 50 messaging platforms. But what really sets it apart is how it combines these features with persistent memory. That means agents can remember actions and data far better than any script or bot before. Palo Alto Networks describes this as the 'lethal trifecta' of direct access to private data, exposure to untrusted content, and communication outside the organization, made worse by a fourth factor: persistent memory.

This risk isn't hypothetical. OpenClaw’s skill ecosystem functions like an unguarded software supply chain. Any third-party 'skill' a user adds to an agent can run with full privileges, opening doors to vulnerabilities that original developers can’t foresee. While earlier concerns focused on employees leaking information to public chatbots, tools like OpenClaw operate quietly at system level, often without IT noticing.

From theory to reality: OpenClaw exploitation is active and widespread

This threat is already real. OpenClaw’s design has exposed thousands of organizations to actual attacks. For instance, CVE-2026-25253 is a severe remote code execution flaw caused by a WebSocket validation error, with a CVSS score of 8.8. It lets attackers compromise an agent with a single click (critical OpenClaw vulnerability).

Attackers wasted no time. The ClawHavoc malware campaign, for example, spread over 341 malicious 'skills', using OpenClaw’s official marketplace to push info-stealers and RATs directly into vulnerable environments. Over 21,000 exposed OpenClaw instances have turned up on the public internet, often protected by nothing stronger than a weak password, or by no authentication at all. Researchers even found plaintext password storage in the code. The risk is both immediate and persistent.

The shadow AI dimension: why you’re likely exposed

One of the trickiest parts of OpenClaw and MoltBot is how easily they run outside official oversight. Research shows that more than 22% of enterprise customers have found MoltBot operating without IT approval. Agents connect with personal messaging apps, making it easy for employees to use them on devices IT doesn’t manage, creating blind spots in endpoint management.

This reflects a bigger shift: 68% of employees now access free AI tools using personal accounts, and 57% still paste sensitive data into these services. The risks tied to shadow AI keep rising, and so does the cost of breaches: incidents involving unsanctioned AI tools now cost, on average, $670,000 more than those without. No wonder experts at Palo Alto, Straiker, Google Cloud, and Intruder strongly advise enterprises to block or at least closely watch OpenClaw deployments.

Why classic security tools are defenseless - and why DSPM is essential

Despite many advances in endpoint, identity, and network defense, these tools fall short against AI agents such as OpenClaw. Agents often run code with system privileges and communicate independently, sometimes over encrypted or unfamiliar channels. This blinds existing security tools to what internal agent 'skills' are doing or what data they touch and process. The attack surface now includes prompt injection through emails and documents, poisoning of agent memory, delayed attacks, and natural language input that bypasses static scans.

The missing link is visibility: understanding what data any AI agent - sanctioned or shadow - can access, process, or send out. Data Security Posture Management (DSPM) responds to this by mapping what data AI agents can reach, tracing sensitive data to and from agents everywhere they run. Newer DSPM features such as real-time risk scoring, shadow AI discovery, and detailed flow tracking help organizations see and control risks from AI agents at the data layer (Sentra DSPM for AI agent security).
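One way to picture "mapping what data AI agents can reach" is as a graph traversal: start from an agent, follow the credentials and roles it holds, and collect every data store reachable through them. The sketch below uses a hand-written adjacency map. All node names, the `data:` prefix convention, and the `reachable_data` and `sensitive_exposure` helpers are hypothetical; a real DSPM would build the graph from cloud IAM and configuration metadata rather than a hard-coded dictionary.

```python
from collections import deque

# Hypothetical access graph: an edge A -> B means "A can use or read B".
ACCESS_GRAPH = {
    "agent:openclaw-laptop-17": ["cred:dev-api-token", "cred:slack-bot"],
    "cred:dev-api-token": ["role:ci-deployer"],
    "role:ci-deployer": ["data:s3://build-artifacts", "data:s3://customer-exports"],
    "cred:slack-bot": ["data:slack://hr-channel"],
}

# Hypothetical set of data stores already classified as sensitive.
SENSITIVE = {"data:s3://customer-exports", "data:slack://hr-channel"}


def reachable_data(graph, start):
    """Breadth-first search returning every data node reachable from `start`."""
    seen = {start}
    queue = deque([start])
    found = set()
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            if neighbor.startswith("data:"):
                found.add(neighbor)
            queue.append(neighbor)
    return found


def sensitive_exposure(graph, agent, sensitive):
    """Sensitive data stores an agent can reach, sorted for stable output."""
    return sorted(reachable_data(graph, agent) & sensitive)
```

Calling `sensitive_exposure(ACCESS_GRAPH, "agent:openclaw-laptop-17", SENSITIVE)` answers the question the paragraph above poses for a single agent: which sensitive stores are within its reach, including those reached indirectly through a role.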

Immediate enterprise action plan: detection, mapping, and control

Security teams need to move quickly. Start by scanning for OpenClaw, MoltBot, and other shadow AI agents across endpoints, networks, and SaaS apps. Once you know where agents are, check which sensitive data they can access by using DSPM tools with AI agent awareness, such as those from Sentra (Sentra’s AI asset discovery). Treat unauthorized installations as active security incidents: reset credentials, investigate activity, and prevent agents from running on your systems following expert recommendations.
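As a starting point for the detection step, a team could match an existing software or process inventory against a watchlist of agent names. The sketch below assumes an inventory exported as a mapping of hostname to process or package names; that format and the `find_shadow_agents` helper are hypothetical, and the only watchlist terms used are the two names mentioned in this article.

```python
# Watchlist terms taken from this article; extend with other agent names as needed.
WATCHLIST = ("openclaw", "moltbot")


def find_shadow_agents(inventory, watchlist=WATCHLIST):
    """Return (host, process) pairs whose name contains a watchlist term.

    `inventory` maps hostname -> list of process or package names, as exported
    from whatever endpoint tooling is already in place (hypothetical format).
    """
    hits = []
    for host, processes in sorted(inventory.items()):
        for proc in processes:
            lowered = proc.lower()
            if any(term in lowered for term in watchlist):
                hits.append((host, proc))
    return hits
```

Name matching is deliberately naive: a renamed binary or a containerized install would slip past it, so treat hits as leads for investigation rather than a complete inventory.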

For long-term defense, add continuous shadow AI tracking to your operations. Let DSPM keep your data inventory current, trace possible leaks, and set the right controls for every workflow involving AI. Sentra gives you a single place to find all agent activity, see your actual AI data exposure, and take fast, business-aware action.

Conclusion

OpenClaw is simply the first sign of what will soon be a string of AI agent-driven security problems for enterprises. As companies use AI more to boost productivity and automate work, the chance of unsanctioned agents acting with growing privileges and integrations will continue to rise. Gartner expects that by 2028, one in four cyber incidents will stem from AI agent misuse - and attacks have already started to appear in the news.

Success with AI is no longer about whether you use agents like OpenClaw; it’s about controlling how far they reach and what they can do. Old-school defenses can’t keep up with how quickly shadow AI spreads. Only data-focused security, with total AI agent discovery, risk mapping, and ongoing monitoring, can provide the clarity and controls needed for this new world. Sentra's DSPM platform offers precisely that. Take the first steps now: identify your shadow AI risks, map out where your data can go, and make AI agent security a top priority.

David Stuart, Nikki Ralston · February 4, 2026 · 3 min read

DSPM Dirty Little Secrets: What Vendors Don’t Want You to Test

Discover What DSPM Vendors Try to Hide

Your goal in running a data security/DSPM proof of value (POV) is to evaluate all important performance and cost parameters so you can make the best decision and avoid unpleasant surprises. Vendors, on the other hand, are looking for a ‘quick win’ and will often suggest shortcuts like using a limited test data set and copying your data to their environment.

On the surface this might sound like a reasonable approach, but if you don’t test real data types and volumes in your own environment, the POV process may hide costly failures or compliance violations that will quickly become apparent in production. A recent evaluation of Sentra versus another top emerging DSPM vendor exposed how the other solution’s performance dropped and costs skyrocketed when deployed at petabyte scale. Worse, the emerging DSPM removed data from the customer environment - a clear controls violation.

If you want to run a successful POV and avoid DSPM buyers' remorse you need to look out for these "dirty little secrets".

Dirty Little Secret #1:
‘Start small’ can mean ‘fails at scale’

The biggest 'dirty secret' is that scalability limits are hidden behind the 'start small' suggestion. Many DSPM platforms cannot scale to modern petabyte-sized data environments. Vendors try to conceal this architectural weakness by encouraging small, tightly scoped POVs that never stress the system and create false confidence. Upon broad deployment, this weakness is quickly exposed as scans slow and refresh cycles stretch, forcing teams to drastically reduce scope or frequency. The failure is fundamentally architectural: without parallel orchestration and elastic execution, the platform hits a bottleneck that the 'start small' advice was meant to keep hidden.

In a recent POV, Sentra scanned 10x more data than the alternative in approximately the same time.

Dirty Little Secret #2:
High cloud cost breaks continuous security

Another reason some vendors try to limit the scale of POVs is to hide the real cloud cost of running them in production. They often use brute-force scanning that reads excessive data, consumes massive compute resources, and is architecturally inefficient. This is easy to mask during short, limited POVs, but quickly drives up cloud bills in production. The resulting cost pressure forces organizations to reduce scan frequency and scope, quietly shifting the platform from continuous security control to periodic inventory. Ultimately, tools that cannot scale scanners efficiently on demand, or that can only afford to scan infrequently, trade essential security for cost: they are affordable only when they are not fully utilized.

In a recent POV run on 100 petabytes of data, Sentra proved to be 10x more cost effective to operate.

Dirty Little Secret #3:
‘Good enough’ accuracy degrades security

Accuracy is fundamental to Data Security Posture Management (DSPM) and should not be compromised. While a difference of a few points may not seem like a deal breaker, every percentage point of classification accuracy can dramatically affect all downstream security controls. Costs increase as manual intervention is required to address false positives. When organizations automate controls based on these inaccuracies, the DSPM platform becomes a source of risk, and confidence is lost. The secret is kept safe because the POV never validates the platform's accuracy against known sensitive data.

In a recent POV, Sentra demonstrated false positive and false negative rates of less than one percent.

DSPM POV Red Flags 

  • Copy data to the vendor environment for a “quick win”
  • Limit features or capabilities to simplify testing
  • Artificially reduce the size of scanned data
  • Restrict integrations to avoid “complications”
  • Limit or avoid API usage

These shortcuts don’t make a POV easier - they make it misleading.

Four DSPM POV Requirements That Expose the Truth

If you want a DSPM POV that reflects production reality, insist on these requirements:

1. Scalability

Run discovery and classification on at least 1 petabyte of real data, including unstructured object storage. Completion time must be measured in hours or days - not weeks.

2. Cost Efficiency

Operate scans continuously at scale and measure actual cloud resource consumption. If cost forces reduced frequency or scope, the model is unsustainable.

3. Accuracy

Validate results against known sensitive data. Measure false positives and false negatives explicitly. Accuracy must be quantified and repeatable.

4. Unstructured Data Depth

Test long-form, heterogeneous, real-world unstructured data including audio, video, etc. Classification must demonstrate contextual understanding, not just keyword matches.
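The first three requirements can each be reduced to numbers computed from POV measurements. The helpers below are a rough sketch under stated assumptions: scans read every byte, scanner throughput and cost scale linearly, and the validation set is labeled by hand. All function names are hypothetical and any figures passed to them are placeholders to be replaced with your own measurements.

```python
import math

TERABYTE = 10**12  # decimal units, in bytes
PETABYTE = 10**15


def required_throughput_gb_per_s(total_bytes, hours):
    """Sustained scan rate (GB/s) needed to finish within `hours`."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return total_bytes / (hours * 3600) / 10**9


def scanners_needed(total_bytes, hours, per_scanner_mb_per_s):
    """Parallel scanners required, assuming throughput scales linearly."""
    bytes_per_second = total_bytes / (hours * 3600)
    return math.ceil(bytes_per_second / (per_scanner_mb_per_s * 10**6))


def projected_monthly_cost(pov_cost_usd, pov_bytes, estate_bytes, scans_per_month):
    """Scale measured POV cloud spend linearly to the full estate and cadence."""
    if pov_bytes <= 0:
        raise ValueError("pov_bytes must be positive")
    cost_per_tb = pov_cost_usd / (pov_bytes / TERABYTE)
    return cost_per_tb * (estate_bytes / TERABYTE) * scans_per_month


def error_rates(predictions, ground_truth):
    """False positive and false negative rates against a labeled validation set.

    Both arguments map an object id to True (sensitive) or False (not sensitive).
    Objects missing from `predictions` are treated as "not sensitive".
    """
    tp = fp = tn = fn = 0
    for obj_id, truth in ground_truth.items():
        predicted = predictions.get(obj_id, False)
        if predicted and truth:
            tp += 1
        elif predicted:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return {
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }
```

For a sense of scale, `required_throughput_gb_per_s(PETABYTE, 24)` shows that finishing 1 petabyte in a day requires a sustained rate of roughly 11.6 GB/s, which is why parallel, elastic scanning matters. Running `error_rates` on the same labeled set for each vendor keeps the accuracy comparison quantified and repeatable.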

A DSPM solution that only performs well in a limited POV will lead to painful, costly buyer’s regret. Once in production, the failures in scalability, cost efficiency, accuracy, and unstructured data depth quickly become apparent.

Getting ready to run a DSPM POV? Schedule a demo.

David Stuart · January 28, 2026 · 3 min read

Data Privacy Day: Why Discovery Isn’t Enough

Data Privacy Day is a good reminder for all of us in the tech world: finding sensitive data is only the first step. In today’s environment, data is constantly moving across cloud platforms, SaaS applications, and AI workflows. The challenge isn’t just knowing where your sensitive data lives; it’s also understanding who or what can touch it, whether that access is still appropriate, and how it changes as systems evolve.

I’ve seen firsthand that privacy breaks down not because organizations don’t care, but because access decisions are often disconnected from how data is actually being used. You can have the best policies on paper, but if they aren’t continuously enforced, they quickly become irrelevant.

Discovery is Just the Beginning

Most organizations start with data discovery. They run scans, identify sensitive files, and map out where data lives. That’s an important first step, and it’s necessary, but it’s far from sufficient. Data is not static. It moves, it gets copied, it’s accessed by humans and machines alike. Without continuously governing that access, all the discovery work in the world won’t stop privacy incidents from happening.

The next step, and the one that matters most today, is real-time governance. That means understanding and controlling access as it happens. 

Who can touch this data? Why do they have access? Is it still needed? And crucially, how do these permissions evolve as your environment changes?

Take, for example, a contractor who needs temporary access to sensitive customer data. Or an AI workflow that processes internal HR information. If those access rights aren’t continuously reviewed and enforced, a small oversight can quickly become a significant privacy risk.
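The contractor example can be expressed as a simple recurring check. The sketch below flags access grants that have expired, have no expiry at all, or have gone unused; the `AccessGrant` record, the `review_grant` helper, and the 30-day idle threshold are assumptions made for illustration rather than a description of any particular product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AccessGrant:
    principal: str                 # a person, service account, or AI workflow
    dataset: str
    expires_on: Optional[date]     # None means the grant never expires
    last_used: Optional[date]      # None means the grant was never exercised


def review_grant(grant, today, max_idle_days=30):
    """Return the reasons a grant needs review (empty list if none)."""
    reasons = []
    if grant.expires_on is None:
        reasons.append("no expiry set")
    elif grant.expires_on < today:
        reasons.append("expired")
    if grant.last_used is None:
        reasons.append("never used")
    elif (today - grant.last_used).days > max_idle_days:
        reasons.append(f"unused for more than {max_idle_days} days")
    return reasons
```

Run on a schedule against every grant, a check like this turns "is it still needed?" from an annual audit question into a routine signal.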

Privacy in an AI and Automation Era

AI and automation are changing the way we work with data, but they also change the privacy equation. Automated processes can move and use data in ways that are difficult to monitor manually. AI models can generate insights using sensitive information without us even realizing it. This isn’t a hypothetical scenario; it’s happening right now in organizations of all sizes.

That’s why privacy cannot be treated as a once-a-year exercise or a checkbox in an audit report. It has to be embedded into daily operations, into the way data is accessed, used, and monitored. Organizations that get this right build systems that automatically enforce policies and flag unusual access - before it becomes a problem.

Beyond Compliance: Continuous Responsibility

The companies that succeed in protecting sensitive data are those that treat privacy as a continuous responsibility, not a regulatory obligation. They don’t wait for audits or compliance reviews to take action. Instead, they embed privacy into how data is accessed, shared, and used across the organization.

This approach delivers real results. It reduces risk by catching misconfigurations before they escalate. It allows teams to work confidently with data, knowing that sensitive information is protected. And it builds trust, both internally and with customers, because people know their data is being handled responsibly.

A New Mindset for Data Privacy Day

So this Data Privacy Day, I challenge organizations to think differently. The question is no longer “Do we know where our sensitive data is?” Instead, ask:

“Are we actively governing who can touch our data, every moment, everywhere it goes?”

In a world where cloud platforms, AI systems, and automated workflows touch nearly every piece of data, privacy isn’t a one-time project. It’s a continuous practice, a mindset, and a responsibility that needs to be enforced in real time.

Organizations that adopt this mindset don’t just meet compliance requirements; they gain a competitive advantage. They earn trust, strengthen security, and maintain a dynamic posture that adapts as systems and access needs evolve.

Because at the end of the day, true privacy isn’t something you achieve once a year. It’s something you maintain every day, in every process, with every decision. This Data Privacy Day, let’s commit to moving beyond discovery and audits, and make continuous data privacy the standard.
