Sentra Launches Breakthrough AI Classification Capabilities!
All Resources
In this article:
minus iconplus icon
Share the Blog

Empowering Users to Self-Protect Their Data

March 27, 2025
3
Min Read
Sentra Case Study

In today’s fast-evolving cybersecurity landscape, organizations must not only deploy sophisticated security tools but also empower users to self-protect. Operationalizing this data security requires a proactive approach that integrates automation, streamlined processes, and user education. A recent discussion with Sapir Gottdiner, Cyber Security Architect at Global-e, highlighted key strategies to enhance data security by addressing alert management, sensitive data exposure, and user-driven security measures.

As a provider of end-to-end e-commerce solutions that combine localization capabilities, business intelligence, and logistics for streamlined international expansion, Global-e makes cross-border sales as simple as domestic ones. The chosen partner of leading brands and retailers across the USA, Europe and Asia, Global-e sets the standard of global e-commerce. This requires a strong commitment to security and compliance, and Global-e must comply with a number of strict regulations.

Automating Security Tasks for Efficiency

“One of the primary challenges faced by any security team is keeping pace with the volume of security alerts and the effort required to address them”, said Sapir. Automating human resource-constrained tasks is crucial for efficiency. For example, sensitive data should only exist in certain controlled environments, as improper data handling can lead to vulnerabilities. By leveraging DSPM which acts as a validation tool, organizations can automate the detection of sensitive information stored in incorrect locations and initiate remediation processes without human intervention.

Strengthening Sensitive Data Protection

A concern identified in the discussion was data accessible to unauthorized personnel in Microsoft OneDrive, that may contain sensitive information. To mitigate this, organizations should automate the creation of support tickets (in Jira, for instance) for security incidents, ensuring critical and high-risk alerts are addressed immediately. Assigning these incidents to the relevant departments and data owners ensures accountability and prompt resolution. Additionally, identifying the type and location of sensitive data enables organizations to implement precise fixes, reducing exposure risks.

Risk Management and Process Improvement

Permissioning is equally important and organizations must establish clear procedures and policies for managing authentication credentials. Different actions for different levels of risk to ensure no business interruption is applicable in most cases. This can vary from easy, quick access revocation for low-risk cases while requiring manual verification for critical credentials.

Furthermore, proper data storage is an important protection factor, given sovereignty regulations, data proliferation, etc. Implementing well-defined data mapping strategies and systematically applying proper hygiene and ensuring correct locations will minimize security gaps. For the future, Sapir envisions smart data mapping within O365 and deeper integrations with automated remediation workflow tools to further enhance security posture.

Continuous Review and Training

Sapir also suggests that to ensure compliance and effective security management, organizations should conduct monthly security reviews. These reviews help define when to close or suppress alerts, preventing unnecessary effort on minor issues. Additionally, policies should align with infrastructure security and regulatory compliance requirements such as GDPR, PCI and SOC2. Expanding security training programs is another essential step, equipping users with the knowledge on proper storage and handling of controlled data and how to avoid common security missteps. Empowering users to self-police/self-remediate allows lean security teams to scale data protection operations more efficiently.

Enhancing Communication and Future Improvements

Streamlined communication between security platforms, such as Jira and Microsoft Teams, can significantly improve incident resolution. Automating alert closures based on predefined criteria will reduce the workload on security teams. Addressing existing bugs, such as shadow IT detection issues, will further refine security processes. By fostering a culture of proactive security and leveraging automation, organizations can empower users to self-protect, ensuring a robust defense against evolving cyber threats.

Operationalizing data security is an ongoing effort that blends automation, user education, and process refinement. By taking a strategic user-enablement approach, organizations can create a security-aware culture while minimizing risks and optimizing their security response. Since implementing Sentra’s DSPM solution, Global-e has seen significant improvement in the strength of its data security posture. The company is now able to protect its cloud data more effectively, saving its security, IT, DevOps and engineering teams time, and ensuring it remains compliant with regulatory requirements. Empowering users and data owners to take responsibility for their data security, and providing the right tools to do so easily, is a game changer to the organization.

<blogcta-big>

Ran is a passionate product and customer success leader with over 12 years of experience in the cybersecurity sector. He combines extensive technical knowledge with a strong passion for product innovation, research and development (R&D), and customer success to deliver robust, user-centric security solutions. His leadership journey is marked by proven managerial skills, having spearheaded multidisciplinary teams towards achieving groundbreaking innovations and fostering a culture of excellence. He started at Sentra as a Senior Product Manager and is currently the Head of Technical Account Management, located in NYC.

Subscribe

Latest Blog Posts

Gilad Golani
Gilad Golani
November 27, 2025
3
Min Read

Unstructured Data Is 80% of Your Risk: Why DSPM 1.0 Vendors, Like Varonis and Cyera, Fail to Protect It at Petabyte Scale

Unstructured Data Is 80% of Your Risk: Why DSPM 1.0 Vendors, Like Varonis and Cyera, Fail to Protect It at Petabyte Scale

Unstructured data is the fastest-growing, least-governed, and most dangerous class of enterprise data. Emails, Slack messages, PDFs, screenshots, presentations, code repositories, logs, and the endless stream of GenAI-generated content — this is where the real risk lives.

The Unstructured data dilemma is this: 80% of your organization’s data is essentially invisible to your current security tools, and the volume is climbing by up to 65% each year. This isn’t just a hypothetical - it’s the reality for enterprises as unstructured data spreads across cloud and SaaS platforms. Yet, most Data Security Posture Management (DSPM) solutions - often called DSPM 1.0 - were never built to handle this explosion at petabyte scale. Especially legacy vendors and first-generation players like Cyera — were never designed to handle unstructured data at scale. Their architectures, classification engines, and scanning models break under real enterprise load.

Looking ahead to 2026, unstructured data security risk stands out as the single largest blind spot in enterprise security. If overlooked, it won’t just cause compliance headaches and soaring breach costs - it could put your organization in the headlines for all the wrong reasons.

The 80% Problem: Unstructured Data Dominates Your Risk

The Scale You Can’t Ignore - Over 80% of enterprise data is unstructured

  • Unstructured data is growing 55-65% per year; by 2025, the world will store more than 180 zettabytes of it.
  • 95% of organizations say unstructured data management is a critical challenge but less than 40% of data security budgets address this high-risk area. Unstructured data is everywhere: cloud object stores, SaaS apps, collaboration tools, and legacy file shares. Unlike structured data in databases, it often lacks consistent metadata, access controls, or even basic visibility. This “dark data” is behind countless breaches, from accidental file exposures and overshared documents to sensitive AI training datasets left unmonitored.

The Business Impact - The average breach now costs $4-4.9M, with unstructured data often at the center.

  • Poor data quality, mostly from unstructured sources, costs the U.S. economy $3.1 trillion each year.
  • More than half of organizations report at least one non-compliance incident annually, with average costs topping $1M. The takeaway: Unstructured data isn’t just a storage problem.

Why DSPM 1.0 Fails: The Blind Spots of Legacy Approaches

Traditional Tools Fall Short in Cloud-First, Petabyte-Scale Environments

Legacy DSPM and DCAP solutions, such as Varonis or Netwrix - were built for an era when data lived on-premises, followed predictable structures, and grew at a manageable pace.

In today’s cloud-first reality, their limitations have become impossible to ignore:

  • Discovery Gaps: Agent-based scanning can’t keep up with sprawling, constantly changing cloud and SaaS environments. Shadow and dark data across platforms like Google Drive, Dropbox, Slack, and AWS S3 often go unseen.
  • Performance Limits: Once environments exceed 100 TB, and especially as they reach petabyte scale—these tools slow dramatically or miss data entirely.
  • Manual Classification: Most legacy tools rely on static pattern matching and keyword rules, causing them to miss sensitive information hidden in natural language, code, images, or unconventional file formats.
  • Limited Automation: They generate alerts but offer little or no automated remediation, leaving security teams overwhelmed and forcing manual cleanup.
  • Siloed Coverage: Solutions designed for on-premises or single-cloud deployments create dangerous blind spots as organizations shift to multi-cloud and hybrid architectures.

Example: Collaboration App Exposure

A global enterprise recently discovered thousands of highly sensitive files—contracts, intellectual property, and PII—were unintentionally shared with “anyone with the link” inside a cloud collaboration platform. Their legacy DSPM tool failed to identify the exposure because it couldn’t scan within the app or detect real-time sharing changes.

Further, even Emerging DSPM tools often rely on pattern matching or LLM-based scanning. These approaches also fail for three reasons:

  • Inaccuracy at scale: LLMs hallucinate, mislabel, and require enormous compute.
  • Cost blow-ups: Vendors pass massive cloud bills back to customers or incur inordinate compute cost.
  • Architectural limitations: Without clustering and elastic scaling, large datasets overwhelm the system.

This is exactly where Cyera and legacy tools struggle - and where Sentra’s SLM-powered classifier thrives with >99% accuracy at a fraction of the cost.

The New Mandate: Securing Unstructured Data in 2026 and Beyond

GenAI, and stricter privacy laws (GDPR, CCPA, HIPAA) have raised the stakes for unstructured data security. Gartner now recommends Data Access Governance (DAG) and AI-driven classification to reduce oversharing and prepare for AI-centric workloads.

What Modern Security Leaders Need - Agentless, Real-Time Discovery: No deployment hassles, continuous visibility, and coverage for unstructured data stores no matter where they live.

  • Petabyte-Scale Performance: Scan, classify, and risk-score all data, everywhere it lives.
  • AI-Driven Deep Classification: Use of natural language processing (NLP), Domain-specific  Small Language Models (SLMs), and context analysis for every unstructured format.
  • Automated Remediation: Playbooks that fix exposures, govern permissions, and ensure compliance without manual work.
  • Multi-Cloud & SaaS Coverage: Security that follows your data, wherever it goes.

Sentra: Turning the 80% Blind Spot into a Competitive Advantage

Sentra was built specifically to address the risks of unstructured data in 2026 and beyond. There are nuances involved in solving this.  Selecting an appropriate solution is key to a sustainable approach. Here’s what sets Sentra apart:
 

  • Agentless Discovery Across All Environments:Instantly scans and classifies unstructured data across AWS, Azure, Google, M365, Dropbox, legacy file shares, and more - no agents required, no blind spots left behind.
  • Petabyte-Tested Performance:Designed for Fortune 500 scale, Sentra keeps speed and accuracy high across petabytes, not just terabytes.
  • AI-Powered Deep Classification:Our platform uses advanced NLP, SLMs, and context-aware algorithms to classify, label, and risk-score every file - including code, images, and AI training data, not just structured fields.
  • Continuous, Context-Rich Visibility:Real-time risk scoring, identity and access mapping, and automated data lineage show not just where data lives, but who can access it and how it’s used.
  • Automated Remediation and Orchestration: Sentra goes beyond alerts. Built-in playbooks fix permissions, restrict sharing, and enforce policies within seconds.
  • Compliance-First, Audit-Ready: Quickly spot compliance gaps, generate audit trails, and reduce regulatory risk and reporting costs.     

During a recent deployment with a global financial services company, Sentra uncovered 40% more exposed sensitive files than their previous DSPM tool. Automated remediation covered over 10 million documents across three clouds, cutting manual investigation time by 80%.

Actionable Takeaways for Security Leaders 

1. Put Unstructured Data at the Center of Your 2026 Security Plan: Make sure your DSPM strategy covers all data, especially “dark” and shadow data in SaaS, object stores, and collaboration platforms.

2.  Choose Agentless, AI-Driven Discovery: Legacy, agent-based tools can’t keep up. And underperforming emerging tools may not adequately scale.  Look for continuous, automated scanning and classification that scales with your data.

3.  Automate Remediation Workflows: Visibility is just the start; your platform should fix exposures and enforce policies in real time.

4.  Adopt Multi-Cloud, SaaS-Agnostic Solutions: Your data is everywhere, and your security should be too. Ensure your solution supports all of your unstructured data repositories.

5.  Make Compliance Proactive: Use real-time risk scoring and automated reporting to stay ahead of auditors and regulators.

    

Conclusion: Ready for the 80% Challenge?

With petabyte-scale, cloud-first data, ignoring unstructured data risk is no longer an option. Traditional DSPM tools can’t keep up, leaving most of your data - and your business - vulnerable. Sentra’s agentless, AI-powered platform closes this gap, delivering the discovery, classification, and automated response you need to turn your biggest blind spot into your strongest defense. See how Sentra uncovers your hidden risk - book an instant demo today.

Don’t let unstructured data be your organization’s Achilles’ heel. With Sentra, enterprises finally have a way to secure the data that matters most.

<blogcta-big>

Read More
David Stuart
David Stuart
Nikki Ralston
Nikki Ralston
November 24, 2025
3
Min Read

Third-Party OAuth Apps Are the New Shadow Data Risk: Lessons from the Gainsight/Salesforce Incident

Third-Party OAuth Apps Are the New Shadow Data Risk: Lessons from the Gainsight/Salesforce Incident

The recent exposure of customer data through a compromised Gainsight integration within Salesforce environments is more than an isolated event - it’s a sign of a rapidly evolving class of SaaS supply-chain threats. Even trusted AppExchange partners can inadvertently create access pathways that attackers exploit, especially when OAuth tokens and machine-to-machine connections are involved. This post explores what happened, why today’s security tooling cannot fully address this scenario, and how data-centric visibility and identity governance can meaningfully reduce the blast radius of similar breaches.

A Recap of the Incident

In this case, attackers obtained sensitive credentials tied to a Gainsight integration used by multiple enterprises. Those credentials allowed adversaries to generate valid OAuth tokens and access customer Salesforce orgs, in some cases with extensive read capabilities. Neither Salesforce nor Gainsight intentionally misconfigured their systems. This was not a product flaw in either platform. Instead, the incident illustrates how deeply interconnected SaaS environments have become and how the security of one integration can impact many downstream customers.

Understanding the Kill Chain: From Stolen Secrets to Salesforce Lateral Movement

The attackers’ pathway followed a pattern increasingly common in SaaS-based attacks. It began with the theft of secrets; likely API keys, OAuth client secrets, or other credentials that often end up buried in repositories, CI/CD logs, or overlooked storage locations. Once in hand, these secrets enabled the attackers to generate long-lived OAuth tokens, which are designed for application-level access and operate outside MFA or user-based access controls.

What makes OAuth tokens particularly powerful is that they inherit whatever permissions the connected app holds. If an integration has broad read access, which many do for convenience or legacy reasons, an attacker who compromises its token suddenly gains the same level of visibility. Inside Salesforce, this enabled lateral movement across objects, records, and reporting surfaces far beyond the intended scope of the original integration. The entire kill chain was essentially a progression from a single weakly-protected secret to high-value data access across multiple Salesforce tenants.

Why Traditional SaaS Security Tools Missed This

Incident response teams quickly learned what many organizations are now realizing: traditional CASBs and CSPMs don’t provide the level of identity-to-data context necessary to detect or prevent OAuth-driven supply-chain attacks.

CASBs primarily analyze user behavior and endpoint connections, but OAuth apps are “non-human identities” - they don’t log in through browsers or trigger interactive events. CSPMs, in contrast, focus on cloud misconfigurations and posture, but they don’t understand the fine-grained data models of SaaS platforms like Salesforce. What was missing in this incident was visibility into how much sensitive data the Gainsight connector could access and whether the privileges it held were appropriate or excessive. Without that context, organizations had no meaningful way to spot the risk until the compromise became public.

Sentra Helps Prevent and Contain This Attack Pattern

Sentra’s approach is fundamentally different because it starts with data: what exists, where it resides, who or what can access it, and whether that access is appropriate. Rather than treating Salesforce or other SaaS platforms as black boxes, Sentra maps the data structures inside them, identifies sensitive records, and correlates that information with identity permissions including third-party apps, machine identities, and OAuth sessions.

One key pillar of Sentra’s value lies in its DSPM capabilities. The platform identifies sensitive data across all repositories, including cloud storage, SaaS environments, data warehouses, code repositories, collaboration platforms, and even on-prem file systems. Because Sentra also detects secrets such as API keys, OAuth credentials, private keys, and authentication tokens across these environments, it becomes possible to catch compromised or improperly stored secrets before an attacker ever uses them to access a SaaS platform.

OAuth 2.0 Access Token

Another area where this becomes critical is the detection of over-privileged connected apps. Sentra continuously evaluates the scopes and permissions granted to integrations like Gainsight, identifying when either an app or an identity holds more access than its business purpose requires. This type of analysis would have revealed that a compromised integrated app could see far more data than necessary, providing early signals of elevated risk long before an attacker exploited it.

Sentra further tracks the health and behavior of non-human identities. Service accounts and connectors often rely on long-lived credentials that are rarely rotated and may remain active long after the responsible team has changed. Sentra identifies these stale or overly permissive identities and highlights when their behavior deviates from historical norms. In the context of this incident type, that means detecting when a connector suddenly begins accessing objects it never touched before or when large volumes of data begin flowing to unexpected locations or IP ranges.

Finally, Sentra’s behavior analytics (part of DDR) help surface early signs of misuse. Even if an attacker obtains valid OAuth tokens, their data access patterns, query behavior, or geography often diverge from the legitimate integration. By correlating anomalous activity with the sensitivity of the data being accessed, Sentra can detect exfiltration patterns in real time—something traditional tools simply aren’t designed to do.

The 2026 Outlook: More Incidents Are Coming

The Gainsight/Salesforce incident is unlikely to be the last of its kind. The speed at which enterprises adopt SaaS integrations far exceeds the rate at which they assess the data exposure those integrations create. OAuth-based supply-chain attacks are growing quickly because they allow adversaries to compromise one provider and gain access to dozens or hundreds of downstream environments. Given the proliferation of partner ecosystems, machine identities, and unmonitored secrets, this attack vector will continue to scale.

Prediction:
Unless enterprises add data-centric SaaS visibility and identity-aware DSPM, we should expect three to five more incidents of similar magnitude before summer 2026.

Conclusion

The real lesson from the Gainsight/Salesforce breach is not to reduce reliance on third-party SaaS providers as modern business would grind to a halt without them. The lesson is that enterprises must know where their sensitive data lives, understand exactly which identities and integrations can access it, and ensure those privileges are continuously validated. Sentra provides that visibility and contextual intelligence, making it possible to identify the risks that made this breach possible and help to prevent the next one.

<blogcta-big>

Read More
David Stuart
David Stuart
November 24, 2025
3
Min Read

Securing Unstructured Data in Microsoft 365: The Case for Petabyte-Scale, AI-Driven Classification

Securing Unstructured Data in Microsoft 365: The Case for Petabyte-Scale, AI-Driven Classification

The modern enterprise runs on collaboration and nothing powers that more than Microsoft 365. From Exchange Online and OneDrive to SharePoint, Teams, and Copilot workflows, M365 hosts a massive and ever-growing volume of unstructured content: documents, presentations, spreadsheets, image files, chats, attachments, and more.

Yet unstructured = harder to govern. Unlike tidy database tables with defined schemas, unstructured repositories flood in with ambiguous content types, buried duplicates, or unused legacy files. It’s in these stacks that sensitive IP, model training data, or derivative work can quietly accumulate, and then leak.

Consider this: one recent study found that more than 81 % of IT professionals report data-loss events in M365 environments. And to make matters worse, according to the International Data Corporation (IDC), 60% of organizations do not have a strategy for protecting their critical business data that resides in Microsoft 365.

Why Traditional Tools Struggle

  • Built-in classification tools (e.g., M365’s native capabilities) often rely on pattern matching or simple keywords, and therefore struggle with accuracy, context, scale and derivative content.

  • Many solutions only surface that a file exists and carries a type label - but stop short of mapping who or what can access it, its purpose, and what its downstream exposure might be.

  • GenAI workflows now pump massive volumes of unstructured data into copilots, knowledge bases, training sets - creating new blast radii that legacy DLP or labeling tools weren’t designed to catch.

What a Modern Platform Must Deliver

  1. High-accuracy, petabyte-scale classification of unstructured data (so you know what you have, where it sits, and how sensitive it is). And it must keep pace with explosive data growth and do so cost efficiently.

  2. Unified Data Access Governance (DAG) - mapping identities (users, service principals, agents), permissions, implicit shares, federated/cloud-native paths across M365 and beyond.
  3. Data Detection & Response (DDR) - continuous monitoring of data movement, copies, derivative creation, AI agent interactions, and automated response/remediation.

How Sentra addresses this in M365

Assets contain plain text credit card numbers

At Sentra, we’ve built a cloud-native data-security platform specifically to address this triad of capabilities - and we extend that deeply into M365 (OneDrive, SharePoint, Teams, Exchange Online) and other SaaS platforms.

  • A newly announced AI Classifier for Unstructured Data accelerates and improves classification across M365’s unstructured repositories (see: Sentra launches breakthrough unstructured-data AI classification capabilities).

  • Petabyte-scale processing: our architecture supports classification and monitoring of massive file estates without astronomical cost or time-to-value.

  • Seamless support for M365 services: read/write access, ingestion, classification, access-graph correlation, detection of shadow/unmanaged copies across OneDrive and SharePoint—plus integration into our DAG and DDR layers (see our guide: How to Secure Regulated Data in Microsoft 365 + Copilot).

  • Cost-efficient deployment: designed for high scale without breaking the budget or massive manual effort.

The Bottom Line

In today’s cloud/AI era, saying “we discovered the PII in my M365 tenant” isn’t enough.

The real question is: Do I know who or what (user/agent/app) can access that content, what its business purpose is, and whether it’s already been copied or transformed into a risk vector?


If your solution can’t answer that, your unstructured data remains a silent, high-stakes liability and resolving concerns becomes a very costly, resource-draining burden. By embracing a platform that combines classification accuracy, petabyte-scale processing, unified DSPM + DAG + DDR, and deep M365 support, you move from “hope I’m secure” to “I know I’m secure.”

Want to see how it works in a real M365 setup? Check out our video or book a demo.

<blogcta-big>

Read More
decorative ball
Expert Data Security Insights Straight to Your Inbox
What Should I Do Now:
1

Get the latest GigaOm DSPM Radar report - see why Sentra was named a Leader and Fast Mover in data security. Download now and stay ahead on securing sensitive data.

2

Sign up for a demo and learn how Sentra’s data security platform can uncover hidden risks, simplify compliance, and safeguard your sensitive data.

3

Follow us on LinkedIn, X (Twitter), and YouTube for actionable expert insights on how to strengthen your data security, build a successful DSPM program, and more!

Before you go...

Get the Gartner Customers' Choice for DSPM Report

Read why 98% of users recommend Sentra.

Gartner Certificate for Sentra