
Email DLP Beyond the Gateway: Why Email Archive Scanning Has to Be Part of Your DSPM

May 1, 2026 · 3 Min Read

Key takeaway: Gateway DLP only inspects email at send time. MSG, PST, EML, and OST archives — stored on file shares, desktops, and cloud storage — contain years of PII, PHI, and financial data that most DSPM tools never scan. Email archive scanning is a required component of any complete data security posture management strategy.

If you walk into most security teams today and ask how they “protect email,” you’ll hear a familiar story: secure gateway, phishing filters, transport DLP, maybe some sandboxing. All of that matters. But it’s solving the wrong half of the problem.

The real risk is not email in transit. It’s email at rest.

The Email Data Security Gap: What Lives in PST, MSG, and EML Archives

Every organization I’ve worked with has the same pattern: MSG files saved to desktops, PST archives dumped onto file shares, EML files zipped and uploaded to cloud storage. Those archives contain years of attachments, forwarded threads, and exported mailboxes. They also contain some of the densest concentrations of PII, PHI, financial data, and confidential conversations anywhere in the company — and for most data security tools, they’re completely invisible.

Gateway DLP inspects a message once, at send time. It has no idea what happens when that message is saved, exported, forwarded, archived, or bundled into a PST file on someone’s last day at the company.  If your data security posture management (DSPM) strategy doesn’t include deep, format‑aware email archive scanning, you’re blind to where email data actually lives.

How Sentra Scans Email Archives: MSG, EML, PST, and OST

At Sentra, we treat MSG, EML, PST, and OST as composite data stores that deserve the same depth of analysis as a database or a data lake table. Our extraction engine understands Outlook message files, standard RFC 822 emails, and full mailbox data files. We pull out headers, HTML and plain‑text bodies, and every attachment, then recursively follow the chain as far as it goes — attached emails, nested ZIPs, the spreadsheets and PDFs hiding inside those ZIPs, and so on.  All of that processing happens in memory, so we’re not creating new, unmanaged copies of sensitive content while we scan.
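Sentra's extraction engine is proprietary, but the recursive idea above can be sketched with Python's standard `email` and `zipfile` modules. This is a minimal illustration under our own naming (`iter_items` is not Sentra's API), covering EML-style messages and nested ZIPs only:

```python
import email
import email.policy
import io
import zipfile

def iter_items(raw_bytes: bytes):
    """Yield (name, payload_bytes) for every body part and attachment in an
    RFC 822 message, recursing into attached emails and nested ZIP archives.
    Everything stays in memory -- no temporary files are written."""
    msg = email.message_from_bytes(raw_bytes, policy=email.policy.default)
    for part in msg.walk():          # walk() also descends into attached
        if part.is_multipart():      # message/rfc822 parts automatically
            continue
        name = part.get_filename() or part.get_content_type()
        payload = part.get_payload(decode=True) or b""
        is_zip = (name.lower().endswith(".zip")
                  or part.get_content_type() == "application/zip")
        if is_zip:
            with zipfile.ZipFile(io.BytesIO(payload)) as zf:
                for info in zf.infolist():
                    inner = zf.read(info)
                    if info.filename.lower().endswith(".eml"):
                        # an email inside a ZIP: recurse the whole chain
                        yield from iter_items(inner)
                    else:
                        yield (f"{name}/{info.filename}", inner)
        else:
            yield (name, payload)
```

Note that `Message.walk()` already descends into attached `message/rfc822` parts, so only container formats like ZIP need explicit recursion; a production scanner would add depth limits, PST/OST readers, and many more container types.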

Three Risks That Email Archive Scanning Directly Addresses

From a risk perspective, this matters in three concrete ways.

First, insider exfiltration doesn’t always look like a big transfer to an external file‑sharing service. More often, it looks like months of forwarding sensitive files to a personal account, followed by a mailbox export to PST. That one file now contains everything they walked out with, in a format most tools can’t inspect.

Second, accidental exposure is endemic: people send spreadsheets with customer PII, lab results, or financial reports to the wrong recipients all the time. Those messages live in archives long after anyone remembers they exist.

Third, every major privacy and sectoral framework — GDPR, HIPAA, SEC/FINRA rules — assumes you can actually find personal and regulated data in email when you need to respond to a deletion request, an investigation, or legal discovery.

Email archives are one of the largest ungoverned data lakes in most enterprises. Treating them as “solved” because you have a good gateway is how you end up explaining to regulators why a PST on a public share contained ten years of customer attachments. Deep email archive scanning is exactly the kind of capability we built Sentra’s DSPM platform to deliver. If you’re serious about closing real‑world data gaps, you have to go where the data actually lives — and a staggering amount of it still lives in email.

Learn more about how Sentra discovers and classifies sensitive data across your cloud — including inside email archives — at sentra.io.

Ron Reiter brings over 20 years of hands-on technology and leadership experience in cybersecurity, cloud, big data, and machine learning. As a serial entrepreneur and seed investor, he has contributed to the success of several startups, including Axonius, Firefly, Guardio, Talon Cyber Security, and Lightricks, after founding a company that was acquired by Oracle.


Latest Blog Posts

Ron Reiter · May 1, 2026 · 3 Min Read

Source Code Secrets Scanning: The Missing Half of Your Cloud Data Security Strategy


Key takeaway: Scanning only your Git repositories for secrets misses the majority of exposures. API keys, credentials, and private keys routinely escape into cloud storage, laptops, and CI pipelines — where no SCM scanner can find them. Comprehensive source code secrets scanning must cover your entire cloud estate, not just version control.

If you look at the root cause of most modern breaches, a depressingly common pattern appears: someone left a secret where it didn’t belong. An API key in a script. A database password in a config file. An SSH private key in a shared folder. We’ve all seen it, and we all know better — but knowing and seeing are two very different things.

Why Repository-Level Scanning Is Not Enough

The uncomfortable reality is that source code secrets scanning is still treated as a repository problem in most organizations. You wire up scanners to GitHub or GitLab, plug something into the CI pipeline, and feel like you’re covered. But that’s not where the real blind spot is.

Code spreads. Secrets spread with it.

Developers clone repos to laptops. They sync whole project directories — including .env files you carefully excluded from version control — to Box, Google Drive, or OneDrive. They drop configuration bundles into S3 for deployment scripts. They zip up “old” services and park them in cold storage “just in case.” None of your branch protection rules or repository‑level scanners apply to those copies anymore.

What Comprehensive Cloud-Wide Secrets Scanning Looks Like

That’s the gap we designed Sentra to close. Our DSPM platform doesn’t limit itself to SCMs; it treats code, configs, and secrets as data spread across your cloud estate. We natively support 600+ source file extensions across mainstream and niche languages — Python, JavaScript/TypeScript, Java, Go, C/C++, C#, Rust, Ruby, PHP, Swift, Kotlin, Scala, R, MATLAB, and hundreds more — because secrets don’t care what language you wrote them in.  We read those files with smart encoding detection and process them entirely in memory so scanning doesn’t create new copies of the very content you’re trying to protect.
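Sentra's actual detectors and classifier stack are much broader, but the pattern-matching side of secrets scanning can be illustrated with a few well-known shapes. The pattern list below is a tiny, hypothetical sample, not Sentra's rule set:

```python
import re

# A handful of recognizable secret shapes (illustrative only).
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token":      re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "generic_api_key":   re.compile(
        r"""(?i)\b(api[_-]?key|secret)\b\s*[:=]\s*['"]?([A-Za-z0-9/+=_-]{16,})"""
    ),
}

def find_secrets(text: str):
    """Return (pattern_name, line_number, match) for every hit in the text,
    regardless of what language or file format the text came from."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            for m in pattern.finditer(line):
                hits.append((name, lineno, m.group(0)))
    return hits
```

A real scanner layers entropy analysis, contextual validation, and live-credential checks on top of patterns like these to keep false positives down; the point here is simply that the same detection logic applies to any file, wherever it was copied.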

We also go after the places secrets are supposed to live and still end up exposed. Environment files like .env, .prod, .dev, .qa are intentionally dense collections of connection strings, API keys, OAuth tokens, and cloud credentials.  They’re also routinely copied into CI buckets, checked into repos “temporarily,” synced from laptops to personal cloud storage, and left behind in old deployment folders. Sentra parses these as structured key–value stores and treats every value as a potential secret, not just as generic text.
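That key–value treatment can be sketched as follows, assuming a simple `KEY=VALUE` format and a hypothetical list of suspicious key-name hints (a real classifier would inspect the values themselves, not just the key names):

```python
# Key-name fragments that suggest the value is a credential (illustrative).
SUSPICIOUS_KEY_HINTS = ("key", "secret", "token", "password", "passwd", "credential")

def parse_env(text: str) -> dict:
    """Parse a .env-style file into a key/value dict, skipping comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values

def flag_env_secrets(text: str) -> dict:
    """Return entries whose key name suggests the value is a secret."""
    return {
        k: v for k, v in parse_env(text).items()
        if v and any(hint in k.lower() for hint in SUSPICIOUS_KEY_HINTS)
    }
```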

On the higher‑impact end of the spectrum, we identify cryptographic keys and certificates — .pem, .ppk, .crt, .id_rsa, Java KeyStores, and more — wherever they show up in your cloud.  A single private key on a shared file system can be the difference between a contained incident and full cluster compromise; pretending those files don’t exist outside your “keys” repo is wishful thinking.

We apply the same lens to infrastructure‑as‑code and config files: Terraform (.tf, .hcl), Kubernetes YAML manifests, Helm charts, Dockerfiles, .config, .conf, .ini, .cfg. Those are exactly the artifacts that get copied into S3 for ops, packaged into artifacts, or left in CI logs. They frequently embed credentials, service account tokens, and internal endpoints.

Even “documentation” isn’t off the hook. I’ve lost count of README files with “example” API keys that turned out to be real, markdown runbooks with production connection strings, or onboarding guides that still contain “temporary” passwords issued months ago. Sentra scans these right alongside code, because attackers don’t care whether a secret lives in .py or .md.

And it’s not just secrets. Source trees are full of embedded PII and regulated data: test data seeded with real customer records, SQL seed scripts with actual phone numbers and SSNs, debug dumps committed alongside the code that created them.  Sentra’s classifiers treat this like any other data source and flag those exposures so compliance teams can act.

Secrets Scanning and Compliance: SOC 2, ISO 27001, and Supply Chain Security

Frameworks like SOC 2 and ISO 27001 already expect you to have serious secrets management; supply‑chain security expectations are pushing in the same direction.  But you can’t manage what you can’t see. There’s a huge difference between “we scan our main repos” and “we know where every secret lives across our cloud.” That gap — all the code, configs, and keys that leaked into storage outside of Git — is where real breaches happen.

If you want to see what comprehensive source code secrets scanning looks like when it’s treated as part of data security, not just DevSecOps hygiene, you can request a demo or explore our DSPM overview at sentra.io.

Ron Reiter · May 1, 2026 · 3 Min Read

Jupyter Notebook Scanning: The Data Science Blind Spot Leaking Your Sensitive Data


Key takeaway: Jupyter notebooks silently embed query results, PII, credentials, and model training data directly into .ipynb files — making them a high-risk, largely invisible data exposure vector that traditional DSPM tools miss entirely.

As a CTO, I love what Jupyter notebooks have done for data science. They made experimentation faster and more accessible. But they also created a data security problem almost nobody in the industry wanted to talk about — and one that most DSPM platforms still don’t address.

Why Jupyter Notebooks Are a Hidden Data Security Risk

A notebook is not just “some JSON.” It’s a living environment where data scientists write code, run queries against production systems, visualize results, and document what they did — all in a single .ipynb file. Crucially, notebooks persist their outputs. Every DataFrame you print, every SQL query you run, every chart you render is embedded back into the notebook and travels with it when you commit to Git, upload to S3, or share it through JupyterHub.

That means a quick “SELECT * FROM customers LIMIT 1000” during an exploration session can turn into a permanent snapshot of real customer data — names, emails, addresses, account IDs — now stored in a file that’s often outside your formal data governance boundary. Multiply that by thousands of notebooks spread across repos and buckets, and you get a very large, largely invisible problem.

Why Traditional Data Security Scanning Misses Notebook Content

Traditional scanning approaches don’t help much here. If you treat notebooks as raw JSON and run regexes over them, you’ll drown in false positives from code syntax and structural noise, while still missing sensitive data rendered as HTML tables, base64‑encoded images, or attachments in cell outputs.  Effective Jupyter notebook scanning for data security has to understand the format and the different kinds of content it holds.

How Sentra Scans Jupyter Notebooks for Sensitive Data

In Sentra, we built a dedicated Jupyter reader that decomposes notebooks into code cells, markdown cells, and outputs, then processes each with the right extraction strategy.  Code cells are analyzed as text so we can detect hard‑coded database credentials, API keys, cloud tokens, and connection strings — all the “just for testing” shortcuts that never got cleaned up.  Markdown cells go through a markdown‑aware reader, because they often contain commentary about datasets, customers, or experiments that’s sensitive in its own right.

Most importantly, we treat cell outputs as a first‑class data source. We scan text and HTML outputs for PII, PHI, and financial data; we decode embedded images and run them through OCR to catch sensitive content in charts and screenshots; and we extract and analyze any attachments sitting inside outputs using the full Sentra parsing stack.  Everything is done in memory, and we support both v3 and v4 notebook formats so legacy notebooks aren’t exempt.
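To make the cell-by-cell decomposition concrete, here is a minimal sketch that walks a v4 `.ipynb` document using Python's standard `json` module. The email regex stands in for a real classifier, and stream outputs, embedded images, OCR, and attachments are omitted for brevity; none of this is Sentra's actual code:

```python
import json
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")   # stand-in classifier

def scan_notebook(ipynb_json: str):
    """Walk a v4 .ipynb document and return findings per cell, covering
    code sources, markdown sources, and persisted text/HTML outputs."""
    nb = json.loads(ipynb_json)
    findings = []
    for idx, cell in enumerate(nb.get("cells", [])):
        # Cell sources may be a string or a list of lines; join handles both.
        source = "".join(cell.get("source", []))
        for hit in EMAIL_RE.findall(source):
            findings.append((idx, cell["cell_type"], "source", hit))
        # Only code cells carry outputs; outputs persist query results.
        for out in cell.get("outputs", []):
            data = out.get("data", {})
            for mime in ("text/plain", "text/html"):
                text = "".join(data.get(mime, []))
                for hit in EMAIL_RE.findall(text):
                    findings.append((idx, cell["cell_type"], mime, hit))
    return findings
```

Even this toy version shows why outputs matter: a finding in an output cell means real data was materialized into the notebook file itself, not merely referenced by code.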

Jupyter Notebooks, AI Governance, and Compliance Risk

This isn’t just a nice‑to‑have. Notebooks are often the only place where you can see which data was used to train a model, how it was accessed, and what transformations were applied. As AI governance and regulations tighten, having a way to systematically scan and catalog notebook content becomes a prerequisite for answering basic questions about your ML pipelines.  From a compliance perspective, notebooks that contain EU customer data and end up in a US‑hosted Git repo can also create data residency problems you’ll never spot without automated discovery.

At the end of the day, the Jupyter notebook problem is a visibility problem. Security teams can’t protect data they can’t see, and notebooks have historically been invisible to DSPM tools.  Our goal with Sentra is to make notebooks as governable as any other data store — so your data scientists don’t have to choose between moving fast and staying compliant. You can see how this fits into our broader AI data readiness story at sentra.io.

Mark Kiley · April 10, 2026 · 4 Min Read

South Carolina Insurance Data Security Act: Data Security Requirements and How to Prove “Reasonable Security”


If you’re an insurer or insurance licensee doing business in South Carolina, you now have two layers of data security law to worry about:

  • The general South Carolina data breach notification law (S.C. Code § 39‑1‑90), and
  • The South Carolina Insurance Data Security Act (Title 38, Chapter 99), which imposes sector‑specific cybersecurity and incident reporting requirements on insurance licensees.

Directors, CISOs, and compliance leaders keep asking the same core questions:

  • Exactly who does the Insurance Data Security Act apply to?
  • What does “reasonable security” actually mean under this law?
  • How does it interact with the general breach notification statute?
  • And how can we prove compliance when an exam or incident happens?

This post walks through the law in plain language and, more importantly, shows how data‑centric security and DSPM make it far easier to demonstrate that you’re doing the right things.

What is the South Carolina Insurance Data Security Act?

The South Carolina Insurance Data Security Act (SCIDSA) is codified at Title 38, Chapter 99 of the South Carolina Code of Laws. It was enacted in 2018 and is modeled closely on the NAIC Insurance Data Security Model Law.

In short, it:

  • Requires insurance licensees to develop, implement, and maintain a comprehensive written information security program based on a risk assessment.
  • Imposes specific obligations around incident response, investigation, and reporting of cybersecurity events to the SC Department of Insurance.
  • Mandates oversight of third‑party service providers that access nonpublic information.
  • Requires annual certification of compliance to the Director of Insurance (for domestic insurers).

Think of it as the insurance‑specific overlay on top of South Carolina’s general breach law and any federal obligations you may have (GLBA, HIPAA for certain products, etc.).

Who does the Act apply to?

The Act applies to “licensees” that are licensed, authorized, or registered (or required to be) under South Carolina’s insurance laws, with certain exceptions.

That includes, for example:

  • Insurers (life, P&C, health, specialty carriers)
  • HMOs and many health plans regulated as insurers
  • Producers, agencies, and certain intermediaries
  • Other entities holding a license from the SC Department of Insurance

There are some exemptions, such as licensees with fewer than a specified number of employees or licensees already compliant with equivalent requirements in another state, but they are narrow, and you should confirm applicability with counsel.

If you’re writing policies in South Carolina or handling SC policyholder/insured data, you should assume SCIDSA applies until you’ve proven otherwise.

What does the law require? (High‑level overview)

At a high level, the Insurance Data Security Act requires licensees to do five big things:

  1. Build and maintain a written information security program that is risk‑based and appropriate to your size, complexity, and the sensitivity of your data.
  2. Conduct regular risk assessments to identify reasonably foreseeable threats to nonpublic information and your information systems.
  3. Implement controls across administrative, technical, and physical domains—covering access control, encryption, monitoring, incident response, and more.
  4. Oversee third‑party service providers that handle nonpublic information on your behalf, ensuring they maintain appropriate security.
  5. Investigate and report cybersecurity events to the SC Department of Insurance within defined timelines, and maintain documentation and records for exam and enforcement purposes.

What follows is a closer look at the pieces that matter most from a data‑security and breach‑readiness perspective.

Nonpublic information: what are you actually protecting?

The Act uses the term “nonpublic information” rather than just PII. That typically includes:

  • Business‑related information that, if tampered with or disclosed, would materially impact your operations or security.
  • Consumer information that can identify a person when combined with data elements like SSNs, financial account numbers, or driver’s license numbers—similar to but often broader than the general breach law’s definition.
  • Certain health or medical information associated with insurance products.

For insurers, this often spans:

  • Policyholder and applicant records
  • Claims data and adjuster notes
  • Payment and bank details
  • Underwriting models and internal risk scoring data
  • Credentials and MFA secrets granting access to core systems

This “nonpublic information” is what your information security program, and your incident response obligations, are ultimately organized around.

“Reasonable security” under SCIDSA: more than a buzzword

Many laws talk about “reasonable” or “appropriate” safeguards. The Insurance Data Security Act goes further by spelling out what a risk‑based program must include, such as:

  • Designating one or more responsible individuals (or an outside vendor) to oversee the information security program.
  • Conducting and documenting periodic risk assessments of threats to nonpublic information and information systems.
  • Implementing controls for:
    • Access controls and authentication
    • Physical and environmental security
    • Encryption or equivalent protections for data in transit and at rest
    • Secure development practices for internally built applications
    • Monitoring, logging, and testing for unauthorized access or changes
    • Incident response planning and disaster recovery

For licensees with a board of directors, executive management must regularly report to the board (or a committee) on the overall status of the information security program, material risks, and recommended changes.

This is the statutory backbone behind a regulator or plaintiff saying, “Did you have reasonable security in place?”

Cybersecurity events and reporting: what triggers notice?

The Insurance Data Security Act introduces the defined concept of a “cybersecurity event”—broadly, an event resulting in unauthorized access to or disruption of an information system or nonpublic information, with some carve‑outs (such as access that is determined not to have been used or released and has been returned or destroyed).

When a licensee learns that a cybersecurity event has occurred or may have occurred, it must conduct a prompt investigation to determine:

  • Whether a cybersecurity event actually occurred
  • The nature and scope of the event
  • What nonpublic information may have been involved
  • What measures are needed to restore system security and prevent recurrence

Department of Insurance notification

A licensee must notify the Director of the Department of Insurance of a cybersecurity event no later than 72 hours after determining that such an event has occurred, when either:

  • South Carolina is the licensee’s state of domicile (for insurers) or home state (for producers), or
  • The event involves the nonpublic information of at least 250 South Carolina consumers and either:
    • The event must be reported to another governmental body or regulator, or
    • The event is reasonably likely to materially harm a South Carolina consumer or a material part of the licensee’s normal operations.
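Teams that encode these triggers into incident-response runbooks often capture the decision logic above as a small, testable helper. The following is an illustration only, not legal advice, and the field names are our own:

```python
from dataclasses import dataclass

@dataclass
class CyberEvent:
    licensee_home_state: str          # "SC" if SC is state of domicile / home state
    sc_consumers_affected: int        # nonpublic info of this many SC consumers
    reportable_to_other_regulator: bool
    likely_material_harm: bool        # to an SC consumer or material operations

def must_notify_sc_director(e: CyberEvent) -> bool:
    """Mirror the SCIDSA 72-hour Director-notification triggers described above."""
    if e.licensee_home_state == "SC":
        return True
    if e.sc_consumers_affected >= 250 and (
        e.reportable_to_other_regulator or e.likely_material_harm
    ):
        return True
    return False
```

Expressing the statute this way makes the thresholds explicit during an incident, when the 72-hour clock leaves no time to re-read the code of laws.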

The notice to the Director must include, to the extent the information is available at the time:

  • Date and nature of the event
  • How information was exposed, lost, or accessed
  • Types of nonpublic information involved
  • Number of SC consumers affected
  • Law enforcement involvement
  • Steps taken to investigate, remediate, and notify consumers

Practitioner summaries, such as those from the Constangy cyber team, emphasize that these requirements layer on top of, not in place of, your obligations to consumers under § 39‑1‑90 and any federal laws.

Interaction with the general breach notification statute

SCIDSA is explicit that licensees must still comply with S.C. Code § 39‑1‑90 for consumer notice, where applicable. In other words:

  • SCIDSA: Notice to the Director of Insurance (regulator) within 72 hours once certain thresholds and conditions are met.
  • § 39‑1‑90: Notice to South Carolina residents, the Department of Consumer Affairs, and credit bureaus based on the scope of the breach and harm threshold.

For insurance CISOs and compliance officers, the practical takeaway is that you must design an incident response program that simultaneously satisfies both statutes, plus any other state or federal obligations in the jurisdictions where you operate.

Why this is hard in 2026: the data sprawl reality

On paper, the Insurance Data Security Act reads like a classic security program checklist. In practice, the hardest parts are:

  • Knowing where nonpublic information actually resides across policy admin platforms, claims systems, data warehouses, email, collaboration tools, and third‑party SaaS.
  • Understanding real‑world access—not just IAM roles on paper, but which users, service accounts, vendors, and AI tools can actually touch sensitive data.
  • Quantifying impact quickly during an incident so you can meet the 72‑hour Department of Insurance clock and the “most expedient time possible” obligation for consumer notice.

Most insurers have grown through acquisition, product launches, and third‑party integrations. That means policyholder and claimant data often exists in:

  • Multiple core systems and data lakes
  • Historical archives and backups nobody has looked at for years
  • Ad‑hoc exports shared with vendors or internal analytics teams
  • Email and collaboration workspaces, often with full Excel dumps attached

Without continuous, accurate visibility into that sprawl, your ability to prove “reasonable security” and respond within statutory timelines depends more on luck than design.

How data‑centric security and DSPM help prove compliance

This is where Data Security Posture Management (DSPM) and data‑centric security come into play.

A modern DSPM platform like Sentra is designed to answer, continuously and at scale, the core questions baked into the Insurance Data Security Act:

  • What nonpublic information do we actually hold?
  • Where does it reside across cloud, SaaS, and on‑prem?
  • How is it protected (encryption, masking, labeling, access controls)?
  • Who or what can access it—humans, APIs, AI agents, third‑party vendors?

Sentra does this by:

  • Discovering and classifying sensitive data across your clouds, SaaS, and on‑prem data stores, including policy/claims systems, warehouses, and collaboration tools.
  • Building a live, context‑rich inventory of regulated data (PII, PCI, health, credentials, etc.) and mapping it to your environments and business units.
  • Continuously analyzing exposure and effective access, so you see where nonpublic information is overshared, unencrypted, or accessible from risky identities.
  • Integrating with your incident response workflows, so that when a security event occurs, you can quickly scope which data sets and how many consumers are truly in play.

A real‑world parallel: SoFi’s DSPM story

While SoFi is a financial services leader rather than an insurer, their story closely mirrors the challenges SC insurers face.

In our webinar and case study with SoFi, their security leaders describe how they used Sentra to:

  • Create a centralized data catalog of sensitive customer data across a complex cloud environment.
  • Improve compliance mapping by aligning data classes with regulatory frameworks and internal policies.
  • Tighten data access governance, reducing false positives and focusing on the exposures that matter most.

Their experience moving from scattered, manual efforts to automated, high‑confidence visibility and classification is exactly what SC insurers need to align day‑to‑day operations with SCIDSA’s “reasonable security” and incident reporting expectations.

Making the Insurance Data Security Act manageable

If you’re an insurance licensee in South Carolina, the path forward looks something like this:

  1. Confirm applicability and scope with legal counsel, but assume SCIDSA and § 39‑1‑90 both matter if you hold South Carolina policyholder or claimant data.
  2. Inventory your nonpublic information using automated discovery and classification—it’s no longer realistic to do this with spreadsheets.
  3. Align your risk assessment and controls with what the Act explicitly requires: access control, encryption, monitoring, incident response, and third‑party oversight.
  4. Instrument your incident response so that every suspected cybersecurity event immediately pulls in DSPM context: data types, consumers affected, exposure level, and cross‑jurisdiction triggers.
  5. Document everything: risk assessments, board reports, incident investigations, and rationales for notification or non‑notification decisions. This is what “reasonable security” looks like under exam.

With that in place, the next early‑morning call from the SOC won’t feel like a blind scramble against a 72‑hour clock. You’ll be operating from a place of data‑driven confidence, not guesswork.

Call to action

If you’re responsible for data security or compliance under the South Carolina Insurance Data Security Act, now is the time to get ahead of the next exam—or the next incident.

See how Sentra helps insurers continuously map nonpublic information, reduce exposure, and accelerate investigations so you can meet SCIDSA and § 39‑1‑90 obligations with confidence.

Request a Sentra demo
