
Automating Sensitive Data Classification in Audio, Image and Video Files

January 13, 2025
4 Min Read
Data Security

Technology is advancing at an unprecedented pace. Yet amid all this progress, vast amounts of critical data are still stored in a wide variety of formats, often scattered across network file shares or cloud storage. This goes beyond familiar documents such as PDFs, text files, and PowerPoint presentations: it includes audio recordings, video files, X-ray images, engineering charts, and much more.

How do you truly understand the content hidden within these formats? 

After all, many of these files could contain your organization’s crown jewels—sensitive data, intellectual property, and proprietary information—that must be carefully protected.

Importance of Extracting and Understanding Unstructured Data

Extracting and analyzing data from audio, image and video files is crucial in a data-driven world. Media files often contain valuable and sensitive information that, when processed effectively, can be leveraged for various applications.

  • Accessibility: Transcribing audio into text helps make content accessible to people with hearing impairments and improves usability across different languages and regions, ensuring compliance with accessibility regulations.
  • Searchability: Text extraction enables indexing of media content, making it easier to search and categorize based on keywords or topics. This becomes critical when managing sensitive data, ensuring that privacy and security standards are maintained while improving data discoverability.
  • Insights and Analytics: Understanding the content of audio, video, or images can help derive actionable insights for fields like marketing, security, and education. This includes identifying sensitive data that may require protection, ensuring compliance with privacy regulations, and protecting against unauthorized access.
  • Automation: Automated analysis of multimedia content supports workflows like content moderation, fraud detection, and automated video tagging. This helps prevent exposure of sensitive data and strengthens security measures by identifying potential risks or breaches in real-time.
  • Compliance and Legal Reasons: Accurate transcription and content analysis are essential for meeting regulatory requirements and conducting audits, particularly when dealing with sensitive or personally identifiable information (PII). Proper extraction and understanding of media data help ensure that organizations comply with privacy laws such as GDPR or HIPAA, safeguarding against data breaches and potential legal issues.

Effective extraction and analysis of media files unlocks valuable insights while also playing a critical role in maintaining robust data security and ensuring compliance with evolving regulations.

Cases Where Sensitive Data Can Be Found in Audio & MP4 Files

In industries such as retail and consumer services, call centers frequently record customer calls for quality assurance purposes. These recordings often contain sensitive information like personally identifiable information (PII) and payment card data (PCI), which need to be safeguarded. In the media sector, intellectual property often consists of unpublished or licensed videos, such as films and TV shows, which are copyrighted and require protection with rights management technology. However, it's common for employees or apps to extract snippets or screenshots from these videos and store them on personal drives or in unsecured environments, exposing valuable content to unauthorized access.

Another example is when intellectual property or trade secrets are inadvertently shared through unsecured audio or video files, putting sensitive business information at risk. Confidential information, such as non-public sales figures for a publicly traded company, can leak the same way. A public company could suffer serious damage if a bad actor got hold of an internal audio or video call recording in which forecasts or other non-public sales figures are discussed ahead of their release. This would likely be a material disclosure requiring regulatory reporting (i.e., for compliance with the SEC's four-business-day disclosure rule for material cybersecurity incidents).

Discover Sensitive Data in MP4s and Audio with Sentra

AI-powered technologies that extract text from images, audio, and video are built on machine learning techniques such as Optical Character Recognition (OCR) and Automatic Speech Recognition (ASR).

OCR converts visual text in images or videos into editable, searchable formats, while ASR transcribes spoken language from audio and video into text. These systems are fueled by deep learning algorithms trained on vast datasets, enabling them to recognize diverse fonts, handwriting, languages, accents, and even complex layouts. At scale, cloud computing enables the deployment of these AI models by leveraging powerful GPUs and scalable infrastructure to handle high volumes of data efficiently. 
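Once OCR or ASR has turned a media file into text, classification can proceed much as it would for any document. The following is a minimal sketch of that downstream step, not Sentra's implementation: the patterns, the function names `luhn_valid` and `classify_transcript`, and the sample transcript are all assumptions made for illustration. It finds email addresses and uses the Luhn checksum to separate plausible payment card numbers from other long digit runs.

```python
import re

# Hypothetical patterns for illustration; production classifiers are far more robust.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify_transcript(text: str) -> dict[str, list[str]]:
    """Find email addresses and Luhn-valid card numbers in extracted text."""
    findings = {"email": EMAIL_RE.findall(text), "payment_card": []}
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            findings["payment_card"].append(digits)
    return findings


# Assumed sample: text that an ASR model might produce from a recorded call.
transcript = (
    "My card is 4111 1111 1111 1111 and my email is jane.doe@example.com. "
    "Order 1234 5678 9012 3456 shipped."
)
result = classify_transcript(transcript)
```

The checksum step matters in call recordings, where order numbers and phone numbers are common: the second digit run in the sample fails the Luhn check and is not reported as a card number.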

The Sentra Cloud-Native Platform integrates tools like serverless computing, distributed processing, and API-driven architectures, allowing it to access these advanced capabilities that run ML models on-demand. This seamless scaling capability ensures fast, accurate text extraction across the global user base.

Sentra is rapidly adopting advancements in AI-driven text extraction. A few examples of recent advancements are Optical Character Recognition (OCR) that works seamlessly on dynamic video streams and robust Automatic Speech Recognition (ASR) models capable of transcribing multilingual and domain-specific content with high accuracy. Additionally, innovations in pre-trained transformer models, like Vision-Language and Speech-Language models, enable context-aware extractions, such as identifying key information from complex layouts or detecting sentiment in spoken text. These breakthroughs are pushing the boundaries of accessibility and automation across industries, and enable data security and privacy teams to achieve what was previously thought impossible.

Figure: Data at Risk, Data Activity Overview (alert shown: large volume of sensitive data was copied into a shared drive)

Sentra: An Innovator in Sensitive Data Discovery within Video & Audio

Sentra’s innovative approach to sensitive data discovery goes beyond traditional text-based formats, leveraging advanced ML and AI algorithms to extract and classify data from audio, video, and images. Extracting and understanding unstructured data from media files is increasingly critical in today’s data-driven world. These files often contain valuable and sensitive information that, when properly processed, can unlock powerful insights and drive better decision-making across industries. Sentra’s solution contextualizes multimedia content to highlight what matters most for your unique needs, delivering instant answers with a single click—capabilities we believe set us apart as the only DSPM solution offering this level of functionality.

As threats continue to evolve across multiple vectors, including text, audio, and video, solution providers must constantly adopt new techniques for accurate classification and detection. AI plays a critical role in enhancing these capabilities, offering powerful tools to improve precision and scalability. Sentra is committed to driving innovation by leveraging these advanced technologies to keep data secure.

Want to see it in action? Request a demo today and discover how Sentra can help you protect sensitive data wherever it resides, even in image and audio formats.

Yair brings a wealth of experience in cybersecurity and data product management. In his previous role, Yair led product management at Microsoft and Datadog. With a background as a member of the IDF's Unit 8200 for five years, he possesses over 18 years of expertise in enterprise software, security, data, and cloud computing. Yair has held senior product management positions at Datadog, Digital Asset, and Microsoft Azure Protection.

Latest Blog Posts

Ward Balcerzak
October 20, 2025
3 Min Read
Data Security

2026 Cybersecurity Budget Planning: Make Data Visibility a Priority

Why Data Visibility Belongs in Your 2026 Cybersecurity Budget

As the fiscal year winds down and security leaders tackle cybersecurity budget planning for 2026, you need to decide how to use every remaining 2025 dollar wisely and how to plan smarter for next year. The question isn’t just what to cut or keep; it’s what creates measurable impact. Across programs, data visibility and DSPM deliver provable risk reduction, faster audits, and clearer ROI, making them priority line items whether you’re spending down this year or shaping next year’s plan. Some teams discover unspent funds after project delays, postponed renewals, or slower-than-expected hiring. Others are already deep in planning mode, mapping next year’s security priorities across people, tools, and processes. Either way, one question looms large: where can a limited security budget make the biggest impact, both right now and next year?

Across the industry, one theme is clear: data visibility is no longer a “nice-to-have” line item, it’s a foundational control. Whether you’re allocating leftover funds before year-end or shaping your 2026 strategy, investing in Data Security Posture Management (DSPM) should be part of the plan.

As Bitsight notes, many organizations look for smart ways to use remaining funds that don’t roll over. The goal isn’t simply to spend, it’s to invest in initiatives that improve posture and provide measurable, lasting value. And according to Applied Tech, “using remaining IT funds strategically can strengthen your position for the next budget cycle.”

That same principle applies in cybersecurity. Whether you’re closing out this year or planning for 2026, the focus should be on spending that improves security maturity and tells a story leadership understands. Few areas achieve that more effectively than data-centric visibility.

(For additional background, see Sentra’s article on why DSPM should take a slice of your cybersecurity budget.)

Where to Allocate Remaining Year-End Funds (Without Hurting Next Year’s Budget)

It’s important to utilize all of your 2025 budget allocations because finance departments frequently view underspending as a sign of overfunding, leading to smaller allocations next year. Instead, strategic security teams look for ways to convert every remaining dollar into evidence of progress.

That means focusing on investments that:

  • Produce measurable results you can show to leadership.
  • Strengthen core program foundations: people, visibility, and process.
  • Avoid new recurring costs that stretch future budgets.

Top Investments That Pay Off

1. Invest in Your People

One of the strongest points echoed by security professionals across industry communities: the best investment is almost always your people. Security programs are built on human capability. Certifications, practical training, and professional growth not only expand your team’s skills but also build morale and retention, two things that can’t be bought with tooling alone.

High-impact options include:

  • Hands-on training platforms like Hack The Box, INE Skill Dive, or Security Blue Team, which develop real-world skills through simulated environments.
  • Professional certifications (SANS GIAC, OSCP, or cloud security credentials) that validate expertise and strengthen your team’s credibility.
  • Conference attendance for exposure to new threat perspectives and networking with peers.
  • Cross-functional training between SOC, GRC, and AppSec to create operational cohesion.

In practitioner discussions, one common sentiment stood out: training isn’t just an expense, it’s proof of leadership maturity.

As one manager put it, “If you want your analysts to go the extra mile during an incident, show you’ll go the extra mile for them when things are calm.”

2. Invest in Data Visibility (DSPM)

While team capability drives execution, data visibility drives confidence. In recent conversations among mid-market and enterprise security teams, Data Security Posture Management (DSPM) repeatedly surfaced as one of the most valuable investments made in the past year, especially for hybrid-cloud environments.

One security leader described it this way:

“After implementing DSPM, we finally had a clear picture of where sensitive data actually lived. It saved our team hours of manual chasing and made the audit season much easier.”

That feedback reflects a growing consensus: without visibility into where sensitive data resides, who can access it, and how it’s secured, every other layer of defense operates partly in the dark.

Tip: If your remaining 2025 budget won’t suffice for a full DSPM deployment, you can scope an initial implementation with the remaining budget, then expand to full coverage in 2026.

DSPM solutions provide that clarity by helping teams:

  • Map and classify sensitive data across multi-cloud and SaaS environments.
  • Identify access misconfigurations or risky sharing patterns.
  • Detect policy violations or overexposure before they become incidents.
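To make the idea concrete, the sketch below cross-references what a data store contains with who can reach it. It assumes a classification inventory already exists; the `DataStore` record, the class labels, and the principal names are hypothetical and do not reflect any particular DSPM product's data model.

```python
from dataclasses import dataclass, field


@dataclass
class DataStore:
    """Hypothetical inventory record: what a store holds and who can read it."""
    name: str
    data_classes: set[str] = field(default_factory=set)  # e.g. {"PII", "PCI"}
    principals: set[str] = field(default_factory=set)    # identities with read access


# Assumed labels for illustration only.
SENSITIVE = {"PII", "PCI", "PHI"}
BROAD_PRINCIPALS = {"public", "all_users"}


def find_overexposed(stores):
    """Return (name, sensitive classes, broad principals) for each risky store."""
    findings = []
    for store in stores:
        sensitive = store.data_classes & SENSITIVE
        broad = store.principals & BROAD_PRINCIPALS
        if sensitive and broad:  # sensitive content AND overly broad access
            findings.append((store.name, sorted(sensitive), sorted(broad)))
    return findings


stores = [
    DataStore("crm-exports", {"PII", "PCI"}, {"all_users"}),
    DataStore("marketing-assets", {"public_docs"}, {"public"}),
    DataStore("hr-records", {"PII"}, {"hr_team"}),
]
overexposed = find_overexposed(stores)
```

Only the first store is flagged: the second is broadly shared but holds nothing sensitive, and the third holds PII but is restricted to a single team.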

Beyond security operations, DSPM delivers something finance and leadership appreciate: measurable proof. Dashboards and reports make risk tangible, allowing CISOs to demonstrate progress in data protection and compliance.

The takeaway: DSPM isn’t just a good way to use remaining funds, it’s a baseline investment every forward-looking security program should plan for in 2026 and beyond.

3. Invest in Testing

Training builds capability. Visibility builds understanding. Testing builds credibility.

External red team, purple team, or security posture assessments continue to be among the most effective ways to validate your defenses and generate actionable findings.

Security practitioners often point out that testing engagements create outcomes leadership understands:

“Training is great, but it’s hard to quantify. An external assessment gives you findings, metrics, and a roadmap you can point to when defending next year’s budget.”

Well-scoped assessments do more than uncover vulnerabilities—they benchmark performance, expose process gaps, and generate data-backed justification for continued investment.

4. Preserve Flexibility with a Retainer

If your team can’t launch a new project before year-end, a retainer with a trusted partner is an efficient way to preserve funds without waste. Retainers can cover services like penetration testing, incident response, or advisory hours, providing flexibility when unpredictable needs arise. This approach, often recommended by veteran CISOs, allows teams to close their books responsibly while keeping agility for the next fiscal year.

5. Strengthen Your Foundations

Not every valuable investment requires new tools. Several practitioners emphasized the long-term returns from process improvements and collaboration-focused initiatives:

  • Threat modeling workshops that align development and security priorities.
  • Framework assessments (like NIST CSF or ISO 27001) that provide measurable baselines.
  • Automation pilots to eliminate repetitive manual work.
  • Internal tabletop exercises that enhance cross-team coordination.

These lower-cost efforts improve resilience and efficiency, two metrics that always matter in budget conversations.

How to Decide: A Simple, Measurable Framework

When evaluating where to allocate remaining or future funds, apply a simple framework:

  1. Identify what’s lagging. Which pillar (people, visibility, or process) most limits your current effectiveness?
  2. Choose something measurable. Prioritize initiatives that produce clear, demonstrable outputs: reports, dashboards, certifications.
  3. Aim for dual impact. Every investment should strengthen both your operations and your ability to justify next year’s funding.

Final Thoughts

A strong security budget isn’t just about defense, it’s about direction. Every spend tells a story about how your organization prioritizes resilience, efficiency, and visibility.

Whether you’re closing out this year’s funds or preparing your 2026 plan, focus on investments that create both operational value and executive clarity. Because while technologies evolve and threats shift, understanding where your data is, who can access it, and how it’s protected remains the cornerstone of a mature security program.

Or, as one practitioner summed it up: “Spend on the things that make next year’s budget conversation easier.”

DSPM fits that description perfectly.

Meni Besso
October 15, 2025
3 Min Read
Compliance

Hybrid Environments: Expand DSPM with On-Premises Scanners

Data Security Posture Management (DSPM) has quickly become a must-have for organizations moving to the cloud. By discovering, classifying, and protecting sensitive data across SaaS apps and cloud services, DSPM gives security teams visibility into data risks they never knew they had.

But here’s the reality: most enterprises aren’t 100% cloud. Legacy file shares, private databases, and hybrid workloads still hold massive amounts of sensitive data. Without visibility into these environments, even the most advanced DSPM platforms leave critical blind spots.

That’s why DSPM platform support is evolving - from cloud-only to truly hybrid.

The Evolution of DSPM

DSPM emerged as a response to the visibility problem created by rapid cloud adoption. As organizations moved to cloud services, SaaS applications, and collaboration platforms, sensitive data began to sprawl across environments at a pace traditional security tools couldn’t keep up with. Security teams suddenly faced oversharing, inconsistent access controls, and little clarity on where critical information actually lived.

DSPM helped fill this gap by delivering a new level of insight into cloud data. It allowed organizations to map sensitive information across their environments, highlight risky exposures, and begin enforcing least-privilege principles at scale. For cloud-native companies, this represented a huge leap forward: finally, there was a way to keep up with constant data changes and movements. Customers could adopt the cloud safely, maintaining data security best practices and compliance without slowing innovation.

But for large enterprises, the model was incomplete. Decades of IT infrastructure meant that vast amounts of sensitive information still lived in legacy databases, file shares, and private cloud environments. While DSPM gave them visibility in the cloud, it left everything else in the dark.

The Blind Spot of On-Prem & Private Data

Despite rapid cloud adoption and digital transformation progress, large organizations still rely heavily on hybrid and on-prem environments, since moving data to the cloud can be a years-long process. On-premises file shares such as NetApp ONTAP, SMB, and NTFS, alongside enterprise databases like Oracle, SQL Server, and MySQL, remain central to operations. Private cloud applications are especially common in regulated industries like healthcare, finance, and government, where compliance demands keep critical data on-premises.

To scan on-premises data, many DSPM providers offer partial solutions that take ephemeral "snapshots" of the data and temporarily move them to the cloud for classification analysis (either within the customer's environment, as Sentra does, or to the vendor's cloud, as some others do). This can satisfy some requirements, but it is often seen as a compliance risk for very sensitive or private data that must remain on-premises. What's left are two untenable alternatives: ignoring the data, which leaves serious visibility gaps, or relying on manual techniques, which do not scale.

These approaches were clearly not built for today’s security or operational requirements. Sensitive data is created and proliferates rapidly, which means it may be unclassified, unmonitored, and overexposed, but how do you even know? From a compliance and risk standpoint, DSPM without on-prem visibility is like watching only half the field, and leaving the other half open to attackers or accidental exposure.

Expanding with On-Prem Scanners

Sentra is changing the equation. With the launch of its on-premises scanners, the platform now extends beyond the cloud to hybrid and private environments, giving organizations a single pane of glass for all their data security.

With Sentra, organizations can:

  • Discover and classify sensitive data across traditional file shares (SMB, NFS, CIFS, NTFS) and enterprise databases (Oracle, SQL Server, MySQL, PostgreSQL, MongoDB, MariaDB, IBM DB2, Teradata).
  • Detect and protect critical data as it moves between on-prem and cloud environments.
  • Apply AI-powered classification and enforce Microsoft Purview labeling consistently across environments.
  • Strengthen compliance with frameworks that demand full visibility across hybrid estates.
  • Choose the deployment model that best fits their security, compliance, and operational requirements.

Crucially, Sentra’s architecture allows customers to ensure private data always remains in their own environment. They need not move data outside their premises and nothing is ever copied into Sentra’s cloud, making it a trusted choice for enterprises that require secure, private data processing.
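The general principle of scanning in place can be sketched as follows. This is an illustration of the idea only, not Sentra's scanner: the function name `scan_share`, the single email pattern, and the report shape are assumptions. The scan reads files where they live and emits only metadata (a relative path and a count), so no file content leaves the scanned environment.

```python
import os
import re

# Hypothetical single-pattern classifier; a real scanner would use many detectors.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def scan_share(root: str) -> list[dict]:
    """Walk a directory tree and report finding counts, never the content itself."""
    report = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    text = fh.read()  # read locally; fine for a sketch, not for huge files
            except OSError:
                continue  # unreadable file: skip rather than fail the whole scan
            hits = len(EMAIL_RE.findall(text))
            if hits:
                report.append(
                    {"path": os.path.relpath(path, root), "email_count": hits}
                )
    return report
```

Calling `scan_share` on a mounted share path returns one entry per file with findings. Because the report carries counts rather than matched values, it can be forwarded to a central console without copying the sensitive data itself.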

Extending the Hybrid Vision

This milestone builds on Sentra’s proven track record as the only cloud-native data security platform that guarantees data always remains within the customer’s cloud environments - never copied or stored in Sentra’s cloud.

Now, Sentra’s AI-powered classification and governance engine can also be deployed in organizations that require onsite data processing, giving them the flexibility to protect both structured and unstructured data across cloud and on-premises systems.

By unifying visibility and governance across all environments while maintaining complete data sovereignty, Sentra continues to lead the next phase of DSPM, one built for modern, hybrid enterprises.

Real-World Impact

Picture a global bank with modern customer-facing websites and mobile applications hosted in the public cloud, providing agility and scalability for digital services. At the same time, the bank continues to rely on decades-old operational databases running in its private cloud, systems that power core banking functions such as transactions and account management. Without visibility into both, security teams can’t fully understand the risks these stores may pose, enforce least privilege, prevent oversharing, or ensure compliance.

With hybrid DSPM powered by on-prem scanners, that same bank can unify classification and governance across every environment, cloud or on-prem, and close the gaps that attackers or AI systems could otherwise exploit.

Conclusion

DSPM solved the cloud problem. But enterprises aren’t just in the cloud, they’re hybrid. Legacy systems and private environments still hold critical data, and leaving them out of your security posture is no longer an option.

Sentra’s on-premises scanners mark the next stage of DSPM evolution: one unified platform for cloud, on-prem, and private environments. With full visibility, accurate classification, and consistent governance, enterprises finally have the end-to-end data security they need for the AI era. Because protecting half your data is no longer enough.

Shiri Nossel
September 28, 2025
4 Min Read
Compliance

The Hidden Risks Metadata Catalogs Can’t See

In today’s data-driven world, organizations are dealing with more information than ever before. Data pours in from countless production systems and applications, and data analysts are tasked with making sense of it all - fast. To extract valuable insights, teams rely on powerful analytics platforms like Snowflake, Databricks, BigQuery, and Redshift. These tools make it easier to store, process, and analyze data at scale.

But while these platforms are excellent at managing raw data, they don't solve one of the most critical challenges organizations face: understanding and securing that data.

That’s where metadata catalogs come in.

Metadata Catalogs Are Essential But They’re Not Enough

Metadata catalogs such as AWS Glue, Hive Metastore, and Apache Iceberg are designed to bring order to large-scale data ecosystems. They offer a clear inventory of datasets, making it easier for teams to understand what data exists, where it’s stored, and who is responsible for it.

This organizational visibility is essential. With a good catalog in place, teams can collaborate more efficiently, minimize redundancy, and boost productivity by making data discoverable and accessible.

But while these tools are great for discovery, they fall short in one key area: security. They aren’t built to detect risky permissions, identify regulated data, or prevent unintended exposure. And in an era of growing privacy regulations and data breach threats, that’s a serious limitation.

Different Data Tools, Different Gaps

It’s also important to recognize that not all tools in the data stack work the same way. For example, platforms like Snowflake and BigQuery come with fully managed infrastructure, offering seamless integration between storage, compute, and analytics. Others, like Databricks or Redshift, are often layered on top of external cloud storage services like S3 or ADLS, providing more flexibility but also more complexity.

Metadata tools have similar divides. AWS Glue is tightly integrated into the AWS ecosystem, while tools like Apache Iceberg and Hive Metastore are open and cloud-agnostic, making them suitable for diverse lakehouse architectures.

This variety introduces fragmentation, and with fragmentation comes risk. Inconsistent access policies, blind spots in data discovery, and siloed oversight can all contribute to security vulnerabilities.

The Blind Spots Metadata Can’t See

Even with a well-maintained catalog, organizations can still find themselves exposed. Metadata tells you what data exists, but it doesn’t reveal when sensitive information slips into the wrong place or becomes overexposed.

This problem is particularly severe in analytics environments. Unlike production environments, where permissions are strictly controlled, or SaaS applications, which have clear ownership and structured access models, data lakes and warehouses function differently. They are designed to collect as much information as possible, allowing analysts to freely explore and query it.

In practice, this means data often flows in without a clear owner and frequently without strict permissions. Anyone with warehouse access, whether users or automated processes, can add information, and analysts typically have broad query rights across all data. This results in a permissive, loosely governed environment where sensitive data such as PII, financial records, or confidential business information can silently accumulate. Once present, it can be accessed by far more individuals than appropriate.

The good news is that remediation doesn't require a heavy-handed approach. Often, it's not about managing complex permission models or building elaborate remediation workflows. The crucial step is the ability to continuously identify sensitive data, understand where it lives, and then take the correct action, whether that involves removal, masking, or locking it down.
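Of the three actions, masking is the easiest to show in a few lines. The sketch below is a minimal example under stated assumptions: the patterns, the masking rules (keep the first character and domain of an email, keep the last four digits of a US Social Security number), and the names `mask_text` and `mask_row` are illustrative choices, not a description of any product's behavior.

```python
import re

# Hypothetical patterns; group 1 and group 2 capture the parts that stay visible.
EMAIL_RE = re.compile(r"([\w.+-])[\w.+-]*(@[\w-]+(?:\.[\w-]+)+)")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-(\d{4})\b")


def mask_text(value: str) -> str:
    """Mask emails (keep first char and domain) and US SSNs (keep last 4 digits)."""
    value = EMAIL_RE.sub(r"\1***\2", value)
    return SSN_RE.sub(r"***-**-\1", value)


def mask_row(row: dict) -> dict:
    """Return a copy of the row with string fields masked; other types untouched."""
    return {k: mask_text(v) if isinstance(v, str) else v for k, v in row.items()}


row = {"id": 7, "contact": "jane.doe@example.com", "note": "SSN 123-45-6789 on file"}
masked = mask_row(row)
```

Partial masking like this keeps records usable for analysts (the domain and last four digits often suffice for joins and support lookups) while removing the full identifier from broad view.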

How Sentra Bridges the Gap Between Data Visibility & Security

This is where Sentra comes in.

Sentra’s Data Security Posture Management (DSPM) platform is designed to complement and extend the capabilities of metadata catalogs, not just to address their limitations, but to elevate your entire data security strategy. Instead of replacing your metadata layer, Sentra works alongside it, enhancing your visibility with real-time insights and powerful security controls.

Sentra scans across modern data platforms like Snowflake, S3, BigQuery, and more. It automatically classifies and tags sensitive data, identifies potential exposure risks, and detects compliance violations as they happen.

With Sentra, your metadata becomes actionable.

From Static Maps to Live GPS

Think of your metadata catalog as a map. It shows you what’s out there and how things are connected. But a map is static. It doesn’t tell you when there’s a roadblock, a detour, or a collision. Sentra transforms that map into a live GPS. It alerts you in real time, enforces the rules of the road, and helps you navigate safely no matter how fast your data environment is moving.

Conclusion: Visibility Without Security Is a Risk You Can’t Afford

Metadata catalogs are indispensable for organizing data at scale. But visibility alone doesn’t stop a breach. It doesn’t prevent sensitive data from slipping into the wrong place, or from being accessed by the wrong people.

To truly safeguard your business, you need more than a map of your data—you need a system that continuously detects, classifies, and secures it in real time. Without this, you’re leaving blind spots wide open for attackers, compliance violations, and costly exposure.

Sentra turns static visibility into active defense. With real-time discovery, context-rich classification, and automated protection, it gives you the confidence to not only see your data, but to secure it.

See clearly. Understand fully. Protect confidently with Sentra.
