All Resources
In this article:
minus iconplus icon
Share the Blog

How SLMs (Small Language Models) Make Sentra’s AI Faster and More Accurate

November 6, 2025
4
Min Read

The LLM Hype, and What’s Missing

Over the past few years, large language models (LLMs) have dominated the AI conversation. From writing essays to generating code, LLMs like GPT-4 and Claude have proven that massive models can produce human-like language and reasoning at scale.

But here's the catch: not every task needs a 70-billion-parameter model. Parameters are computationally expensive - they require both memory and processing time.

At Sentra, we discovered early on that the work our customers rely on for accurate, scalable classification of massive data flows - isn’t about writing essays or generating text. It’s about making decisions fast, reliably, and cost-effectively across dynamic, real-world data environments. While large language models (LLMs) are excellent at solving general problems, it creates a lot of unnecessary computational overhead.

That’s why we’ve shifted our focus toward Small Language Models (SLMs) - compact, specialized models purpose-built for a single task - understanding and classifying data efficiently. By running hundreds of SLMs in parallel on regular CPUs, Sentra can deliver faster insights, stronger data privacy, and a dramatically lower total cost of AI-based classification that scales with their business, not their cloud bill.

What Is an SLM?

An SLM is a smaller, domain-specific version of a language model. Instead of trying to understand and generate any kind of text, an SLM is trained to excel at a particular task, such as identifying the topic of a document (what the document is about or what type of document it is), or detecting sensitive entities within documents, such as passwords, social security numbers, or other forms of PII.

In other words: If an LLM is a generalist, an SLM is a specialist. At Sentra, we use SLMs that are tuned and optimized for security data classification, allowing them to process high volumes of content with remarkable speed, consistency, and precision. These SLMs are based on standard open source models, but trained with data that was curated by Sentra, to achieve the level of accuracy that only Sentra can guarantee.

From LLMs to SLMs: A Strategic Evolution

Like many in the industry, we started by testing LLMs to see how well they could classify and label data. They were powerful, but also slow, expensive, and difficult to scale. Over time, it became clear: LLMs are too big and too expensive to run on customer data for Sentra to be a viable, cost effective solution for data classification.

Each SLM handles a focused part of the process: initial categorization, text extraction from documents and images, and sensitive entity classification. The SLMs are not only accurate (even more accurate than LLMs classifying using prompts) - they can run on standard CPUs efficiently, and they run inside the customer’s environment, as part of Sentra’s scanners.

The Benefits of SLMs for Customers

a. Speed and Efficiency

SLMs process data faster because they’re lean by design. They don’t waste cycles generating full sentences or reasoning across irrelevant contexts. This means real-time or near-real-time classification, even across millions of data points.

b. Accuracy and Adaptability

SLMs are pre-trained “zero-shot” language models that can categorize and classify generically, without the need to pre-train on a specific task in advance. This is the meaning of “zero shot” - it means that regardless of the data it was trained on, the model can classify an arbitrary set of entities and document labels without training on each one specifically. This is possible due to the fact that language models are very advanced, and they are able to capture deep natural language understanding at the training stage.

Regardless of that, Sentra fine tunes these models to further increase the accuracy of the classification, by curating a very large set of tagged data that resembles the type of data that our customers usually run into.

Our feedback loops ensure that model performance only gets better over time - a direct reflection of our customers’ evolving environments.

c. Cost and Sustainability

Because SLMs are compact, they require less compute power, which means lower operational costs and a smaller carbon footprint. This efficiency allows us to deliver powerful AI capabilities to customers without passing on the heavy infrastructure costs of running massive models.

d. Security and Control

Unlike LLMs hosted on external APIs, SLMs can be run within Sentra’s secure environment, preserving data privacy and regulatory compliance. Customers maintain full control over their sensitive information - a critical requirement in enterprise data security.

A Quick Comparison: SLMs vs. LLMs

The difference between SLMs and LLMs becomes clear when you look at their performance across key dimensions:

Factor SLMs LLMs
Speed Fast, optimized for classification throughput Slower and more compute-intensive for large-scale inference
Cost Cost-efficient Expensive to run at scale
Accuracy (for simple tasks) Optimized for classification Comparable but unnecessary overhead
Deployment Lightweight, easy to integrate Complex and resource-heavy
Adaptability (with feedback) Continuously fine-tuned, ability to fine tune per customer Harder to customize, fine-tuning costly
Best Use Case Classification, tagging, filtering Reasoning and analysis, generation, synthesis

Continuous Learning: How Sentra’s SLMs Grow

One of the most powerful aspects of our SLM approach is continuous learning. Each Sentra customer project contributes valuable insights, from new data patterns to evolving classification needs. These learnings feed back into our training workflows, helping us refine and expand our models over time.

While not every model retrains automatically, the system is built to support iterative optimization: as our team analyzes feedback and performance, models can be fine-tuned or extended to handle new categories and contexts.

The result is an adaptive ecosystem of SLMs that becomes more effective as our customer base and data diversity grow, ensuring Sentra’s AI remains aligned with real-world use cases.

Sentra’s Multi-SLM Architecture

Sentra’s scanning technology doesn’t rely on a single model. We run many SLMs in parallel, each specializing in a distinct layer of classification:

  1. Embedding models that convert data into meaningful vector representations
  2. Entity Classification models that label sensitive entities
  3. Document Classification models that label documents by type
  4. Image-to-text and speech-to-text models that are able to process non-textual data into textual data

This layered approach allows us to operate at scale - quickly, cheaply, and with great results. In practice, that means faster insights, fewer errors, and a more responsive platform for every customer.

The Future of AI Is Specialized

We believe the next frontier of AI isn’t about who can build the biggest model, it’s about who can build the most efficient, adaptive, and secure ones.

By embracing SLMs, Sentra is pioneering a future where AI systems are purpose-built, transparent, and sustainable. Our approach aligns with a broader industry shift toward task-optimized intelligence - models that do one thing extremely well and can learn continuously over time.

Conclusion: The Power of Small

At Sentra, we’ve learned that in AI, bigger isn’t always better. Our commitment to SLMs reflects our belief that efficiency, adaptability, and precision matter most for customers. By running thousands of small, smart models rather than a single massive one, we’re able to classify data faster, cheaper, and with greater accuracy - all while ensuring customer privacy and control.

In short: Sentra’s SLMs represent the power of small, and the future of intelligent classification.

<blogcta-big>

Explore Gilad’s insights, drawn from his extensive experience in R&D, software engineering, and product management. With a strategic mindset and hands-on expertise, he shares valuable perspectives on bridging development and product management to deliver quality-driven solutions.

Subscribe

Latest Blog Posts

Yair Cohen
Yair Cohen
February 5, 2026
3
Min Read

OpenClaw (MoltBot): The AI Agent Security Crisis Enterprises Must Address Now

OpenClaw (MoltBot): The AI Agent Security Crisis Enterprises Must Address Now

OpenClaw, previously known as MoltBot, isn't just another cybersecurity story - it's a wake-up call for every organization. With over 150,000 GitHub stars and more than 300,000 users in just two months, OpenClaw’s popularity signals a huge change: autonomous AI agents are spreading quickly and dramatically broadening the attack surface in businesses. This is far beyond the risks of a typical ChatGPT plugin or a staff member pasting data into a chatbot. These agents live on user machines and servers with shell-level access, file system privileges, live memory control, and broad integration abilities, usually outside IT or security’s purview.

Older perimeter and endpoint security tools weren’t built to find or control agents that can learn, store information, and act independently in all kinds of environments. As organizations face this shadow AI risk, the need for real-time, data-level visibility becomes critical. Enter Data Security Posture Management (DSPM): a way for enterprises to understand, monitor, and respond to the unique threats that OpenClaw and its next-generation kin pose.

What makes OpenClaw different - and uniquely dangerous - for security teams?

OpenClaw runs by setting up a local HTTP server and agent gateway on endpoints. It provides shell access, automates browsers, and links with over 50 messaging platforms. But what really sets it apart is how it combines these features with persistent memory. That means agents can remember actions and data far better than any script or bot before. Palo Alto Networks calls this the 'lethal trifecta': direct access to private data, exposure to untrusted content, communication outside the organization, and persistent memory.

This risk isn't hypothetical. OpenClaw’s skill ecosystem functions like an unguarded software supply chain. Any third-party 'skill' a user adds to an agent can run with full privileges, opening doors to vulnerabilities that original developers can’t foresee. While earlier concerns focused on employees leaking information to public chatbots, tools like OpenClaw operate quietly at system level, often without IT noticing.

From theory to reality: OpenClaw exploitation is active and widespread

This threat is already real. OpenClaw’s design has exposed thousands of organizations to actual attacks. For instance, CVE-2026-25253 is a severe remote code execution flaw caused by a WebSocket validation error, with a CVSS score of 8.8. It lets attackers compromise an agent with a single click (critical OpenClaw vulnerability).

Attackers wasted no time. The ClawHavoc malware campaign, for example, spread over 341 malicious 'skills’, using OpenClaw’s official marketplace to push info-stealers and RATs directly into vulnerable environments. Over 21,000 exposed OpenClaw instances have turned up on the public internet, often protected by nothing stronger than a weak password, or no authentication at all. Researchers even found plaintext password storage in the code. The risk is both immediate and persistent.

The shadow AI dimension: why you’re likely exposed

One of the trickiest parts of OpenClaw and MoltBot is how easily they run outside official oversight. Research shows that more than 22% of enterprise customers have found MoltBot operating without IT approval. Agents connect with personal messaging apps, making it easy for employees to use them on devices IT doesn’t manage, creating blind spots in endpoint management.

This reflects a bigger shift: 68% of employees now access free AI tools using personal accounts, and 57% still paste sensitive data into these services. The risks tied to shadow AI keep rising, and so does the cost of breaches: incidents involving unsanctioned AI tools now average $670,000 higher than those without. No wonder experts at Palo Alto, Straiker, Google Cloud, and Intruder strongly advise enterprises to block or at least closely watch OpenClaw deployments.

Why classic security tools are defenseless - and why DSPM is essential

Despite many advances in endpoint, identity, and network defense, these tools fall short against AI agents such as OpenClaw. Agents often run code with system privileges and communicate independently, sometimes over encrypted or unfamiliar channels. This blinds existing security tools to what internal agent 'skills' are doing or what data they touch and process. The attack surface now includes prompt injection through emails and documents, poisoning of agent memory, delayed attacks, and natural language input that bypasses static scans.

The missing link is visibility: understanding what data any AI agent - sanctioned or shadow - can access, process, or send out. Data Security Posture Management (DSPM) responds to this by mapping what data AI agents can reach, tracing sensitive data to and from agents everywhere they run. Newer DSPM features such as real-time risk scoring, shadow AI discovery, and detailed flow tracking help organizations see and control risks from AI agents at the data layer (Sentra DSPM for AI agent security).

Immediate enterprise action plan: detection, mapping, and control

Security teams need to move quickly. Start by scanning for OpenClaw, MoltBot, and other shadow AI agents across endpoints, networks, and SaaS apps. Once you know where agents are, check which sensitive data they can access by using DSPM tools with AI agent awareness, such as those from Sentra (Sentra’s AI asset discovery). Treat unauthorized installations as active security incidents: reset credentials, investigate activity, and prevent agents from running on your systems following expert recommendations.

For long-term defense, add continuous shadow AI tracking to your operations. Let DSPM keep your data inventory current, trace possible leaks, and set the right controls for every workflow involving AI. Sentra gives you a single place to find all agent activity, see your actual AI data exposure, and take fast, business-aware action.

Conclusion

OpenClaw is simply the first sign of what will soon be a string of AI agent-driven security problems for enterprises. As companies use AI more to boost productivity and automate work, the chance of unsanctioned agents acting with growing privileges and integrations will continue to rise. Gartner expects that by 2028, one in four cyber incidents will stem from AI agent misuse - and attacks have already started to appear in the news.

Success with AI is no longer about whether you use agents like OpenClaw; it’s about controlling how far they reach and what they can do. Old-school defenses can’t keep up with how quickly shadow AI spreads. Only data-focused security, with total AI agent discovery, risk mapping, and ongoing monitoring, can provide the clarity and controls needed for this new world. Sentra's DSPM platform offers precisely that. Take the first steps now: identify your shadow AI risks, map out where your data can go, and make AI agent security a top priority.

Read More
David Stuart
David Stuart
Nikki Ralston
Nikki Ralston
February 4, 2026
3
Min Read

DSPM Dirty Little Secrets: What Vendors Don’t Want You to Test

DSPM Dirty Little Secrets: What Vendors Don’t Want You to Test

Discover  What DSPM Vendors Try to Hide 

Your goal in running a data security/DSPM POV is to evaluate all important performance and cost parameters so you can make the best decision and avoid unpleasant surprises. Vendors, on the other hand, are looking for a ‘quick win’ and will often suggest shortcuts like using a limited test data set and copying your data to their environment.

 On the surface this might sound like a reasonable approach, but if you don’t test real data types and volumes in your own environment, the POV process may hide costly failures or compliance violations that will quickly become apparent in production. A recent evaluation of Sentra versus another top emerging DSPM exposed how the other solution’s performance dropped and costs skyrocketed when deployed at petabyte scale. Worse, the emerging DSPM removed data from the customer environment - a clear controls violation.

If you want to run a successful POV and avoid DSPM buyers' remorse you need to look out for these "dirty little secrets".

Dirty Little Secret #1:
‘Start small’ can mean ‘fails at scale’

The biggest 'dirty secret' is that scalability limits are hidden behind the 'start small' suggestion. Many DSPM platforms cannot scale to modern petabyte-sized data environments. Vendors try to conceal this architectural weakness by encouraging small, tightly scoped POVs that never stress the system and create false confidence. Upon broad deployment, this weakness is quickly exposed as scans slow and refresh cycles stretch, forcing teams to drastically reduce scope or frequency. This failure is fundamentally architectural, lacking parallel orchestration and elastic execution, proving that the 'start small' advice was a deliberate tactic to avoid exposing the platform’s inevitable bottleneck.In a recent POV, Sentra successfully scanned 10x more data in approximately the same time than the alternative:

Dirty Little Secret #2:
High cloud cost breaks continuous security

Another reason some vendors try to limit the scale of POVs is to hide the real cloud cost of running them in production. They often use brute-force scanning that reads excessive data, consumes massive compute resources, and is architecturally inefficient. This is easy to mask during short, limited POVs, but quickly drives up cloud bills in production. The resulting cost pressure forces organizations to reduce scan frequency and scope, quietly shifting the platform from continuous security control to periodic inventory. Ultimately, tools that cannot scale scanners efficiently on-demand or scan infrequently trade essential security for cost, proving they are only affordable when they are not fully utilized. In a recent POV run on 100 petabytes of data, Sentra proved to be 10x more operationally cost effective to run:

Dirty Little Secret #3:
‘Good enough’ accuracy degrades security

Accuracy is fundamental to Data Security Posture Management (DSPM) and should not be compromised. While a few points difference may not seem like a deal breaker, every percentage point of classification accuracy can dramatically affect all downstream security controls. Costs increase as manual intervention is required to address FPs. When organizations automate controls based on these inaccuracies, the DSPM platform becomes a source of risk. Confidence is lost. The secret is kept safe because the POV never validates the platform's accuracy against known sensitive data.

In a recent POV Sentra was able to prove less than one percent rate of false positives and false negatives:

DSPM POV Red Flags 

  • Copy data to the vendor environment for a “quick win”
  • Limit features or capabilities to simplify testing
  • Artificially reduce the size of scanned data
  • Restrict integrations to avoid “complications”
  • Limit or avoid API usage

These shortcuts don’t make a POV easier - they make it misleading.

Four DSPM POV Requirements That Expose the Truth

If you want a DSPM POV that reflects production reality, insist on these requirements:

1. Scalability

Run discovery and classification on at least 1 petabyte of real data, including unstructured object storage. Completion time must be measured in hours or days - not weeks.

2. Cost Efficiency

Operate scans continuously at scale and measure actual cloud resource consumption. If cost forces reduced frequency or scope, the model is unsustainable.

3. Accuracy

Validate results against known sensitive data. Measure false positives and false negatives explicitly. Accuracy must be quantified and repeatable.

4. Unstructured Data Depth

Test long-form, heterogeneous, real-world unstructured data including audio, video, etc. Classification must demonstrate contextual understanding, not just keyword matches.

A DSPM solution that only performs well in a limited POV will lead to painful, costly buyer’s regret. Once in production, the failures in scalability, cost efficiency, accuracy, and unstructured data depth quickly become apparent.

Getting ready to run a DSPM POV? Schedule a demo.

<blogcta-big>

Read More
David Stuart
David Stuart
January 28, 2026
3
Min Read

Data Privacy Day: Why Discovery Isn’t Enough

Data Privacy Day: Why Discovery Isn’t Enough

Data Privacy Day is a good reminder for all of us in the tech world: finding sensitive data is only the first step. But in today’s environment, data is constantly moving -across cloud platforms, SaaS applications, and AI workflows. The challenge isn’t just knowing where your sensitive data lives; it’s also understanding who or what can touch it, whether that access is still appropriate, and how it changes as systems evolve.

I’ve seen firsthand that privacy breaks down not because organizations don’t care, but because access decisions are often disconnected from how data is actually being used. You can have the best policies on paper, but if they aren’t continuously enforced, they quickly become irrelevant.

Discovery is Just the Beginning

Most organizations start with data discovery. They run scans, identify sensitive files, and map out where data lives. That’s an important first step, and it’s necessary, but it’s far from sufficient. Data is not static. It moves, it gets copied, it’s accessed by humans and machines alike. Without continuously governing that access, all the discovery work in the world won’t stop privacy incidents from happening.

The next step, and the one that matters most today, is real-time governance. That means understanding and controlling access as it happens. 

Who can touch this data? Why do they have access? Is it still needed? And crucially, how do these permissions evolve as your environment changes?

Take, for example, a contractor who needs temporary access to sensitive customer data. Or an AI workflow that processes internal HR information. If those access rights aren’t continuously reviewed and enforced, a small oversight can quickly become a significant privacy risk.

Privacy in an AI and Automation Era

AI and automation are changing the way we work with data, but they also change the privacy equation. Automated processes can move and use data in ways that are difficult to monitor manually. AI models can generate insights using sensitive information without us even realizing it. This isn’t a hypothetical scenario, it’s happening right now in organizations of all sizes.

That’s why privacy cannot be treated as a once-a-year exercise or a checkbox in an audit report. It has to be embedded into daily operations, into the way data is accessed, used, and monitored. Organizations that get this right build systems that automatically enforce policies and flag unusual access - before it becomes a problem.

Beyond Compliance: Continuous Responsibility

The companies that succeed in protecting sensitive data are those that treat privacy as a continuous responsibility, not a regulatory obligation. They don’t wait for audits or compliance reviews to take action. Instead, they embed privacy into how data is accessed, shared, and used across the organization.

This approach delivers real results. It reduces risk by catching misconfigurations before they escalate. It allows teams to work confidently with data, knowing that sensitive information is protected. And it builds trust - both internally and with customers because people know their data is being handled responsibly.

A New Mindset for Data Privacy Day

So this Data Privacy Day, I challenge organizations to think differently. The question is no longer “Do we know where our sensitive data is?” Instead, ask:

“Are we actively governing who can touch our data, every moment, everywhere it goes?”

In a world where cloud platforms, AI systems, and automated workflows touch nearly every piece of data, privacy isn’t a one-time project. It’s a continuous practice, a mindset, and a responsibility that needs to be enforced in real time.

Organizations that adopt this mindset don’t just meet compliance requirements, they gain a competitive advantage. They earn trust, strengthen security, and maintain a dynamic posture that adapts as systems and access needs evolve.

Because at the end of the day, true privacy isn’t something you achieve once a year. It’s something you maintain every day, in every process, with every decision. This Data Privacy Day, let’s commit to moving beyond discovery and audits, and make continuous data privacy the standard.

<blogcta-big>

Read More
Expert Data Security Insights Straight to Your Inbox
What Should I Do Now:
1

Get the latest GigaOm DSPM Radar report - see why Sentra was named a Leader and Fast Mover in data security. Download now and stay ahead on securing sensitive data.

2

Sign up for a demo and learn how Sentra’s data security platform can uncover hidden risks, simplify compliance, and safeguard your sensitive data.

3

Follow us on LinkedIn, X (Twitter), and YouTube for actionable expert insights on how to strengthen your data security, build a successful DSPM program, and more!

Before you go...

Get the Gartner Customers' Choice for DSPM Report

Read why 98% of users recommend Sentra.

White Gartner Peer Insights Customers' Choice 2025 badge with laurel leaves inside a speech bubble.