All Resources
In this article:
minus iconplus icon
Share the Blog

Use Redshift Data Scrambling for Additional Data Protection

May 3, 2023
8
Min Read

According to IBM, a data breach in the United States cost companies an average of 9.44 million dollars in 2022. It is now more important than ever for organizations to place high importance on protecting confidential information. Data scrambling, which can add an extra layer of security to data, is one approach to accomplish this. 

In this post, we'll analyze the value of data protection, look at the potential financial consequences of data breaches, and talk about how Redshift Data Scrambling may help protect private information.

The Importance of Data Protection

Data protection is essential to safeguard sensitive data from unauthorized access. Identity theft, financial fraud,and other serious consequences are all possible as a result of a data breach. Data protection is also crucial for compliance reasons. Sensitive data must be protected by law in several sectors, including government, banking, and healthcare. Heavy fines, legal problems, and business loss may result from failure to abide by these regulations.

Hackers employ many techniques, including phishing, malware, insider threats, and hacking, to get access to confidential information. For example, a phishing assault may lead to the theft of login information, and malware may infect a system, opening the door for additional attacks and data theft. 

So how to protect yourself against these attacks and minimize your data attack surface?

What is Redshift Data Masking?

Redshift data masking is a technique used to protect sensitive data in Amazon Redshift; a cloud-based data warehousing and analytics service. Redshift data masking involves replacing sensitive data with fictitious, realistic values to protect it from unauthorized access or exposure. It is possible to enhance data security by utilizing Redshift data masking in conjunction with other security measures, such as access control and encryption, in order to create a comprehensive data protection plan.

What is Redshift Data Masking

What is Redshift Data Scrambling?

Redshift data scrambling protects confidential information in a Redshift database by altering original data values using algorithms or formulas, creating unrecognizable data sets. This method is beneficial when sharing sensitive data with third parties or using it for testing, development, or analysis, ensuring privacy and security while enhancing usability. 

The technique is highly customizable, allowing organizations to select the desired level of protection while maintaining data usability. Redshift data scrambling is cost-effective, requiring no additional hardware or software investments, providing an attractive, low-cost solution for organizations aiming to improve cloud data security.

Data Masking vs. Data Scrambling

Data masking involves replacing sensitive data with a fictitious but realistic value. However, data scrambling, on the other hand, involves changing the original data values using an algorithm or a formula to generate a new set of values.

In some cases, data scrambling can be used as part of data masking techniques. For instance, sensitive data such as credit card numbers can be scrambled before being masked to enhance data protection further.

Setting up Redshift Data Scrambling

Having gained an understanding of Redshift and data scrambling, we can now proceed to learn how to set it up for implementation. Enabling data scrambling in Redshift requires several steps.

To achieve data scrambling in Redshift, SQL queries are utilized to invoke built-in or user-defined functions. These functions utilize a blend of cryptographic techniques and randomization to scramble the data.

The following steps are explained using an example code just for a better understanding of how to set it up:

Step 1: Create a new Redshift cluster

Create a new Redshift cluster or use an existing cluster if available. 

Redshift create cluster

Step 2: Define a scrambling key

Define a scrambling key that will be used to scramble the sensitive data.

 
SET session my_scrambling_key = 'MyScramblingKey';

In this code snippet, we are defining a scrambling key by setting a session-level parameter named <inlineCode>my_scrambling_key<inlineCode> to the value <inlineCode>MyScramblingKey<inlineCode>. This key will be used by the user-defined function to scramble the sensitive data.

Step 3: Create a user-defined function (UDF)

Create a user-defined function in Redshift that will be used to scramble the sensitive data. 


CREATE FUNCTION scramble(input_string VARCHAR)
RETURNS VARCHAR
STABLE
AS $$
DECLARE
scramble_key VARCHAR := 'MyScramblingKey';
BEGIN
-- Scramble the input string using the key
-- and return the scrambled output
RETURN ;
END;
$$ LANGUAGE plpgsql;

Here, we are creating a UDF named <inlineCode>scramble<inlineCode> that takes a string input and returns the scrambled output. The function is defined as <inlineCode>STABLE<inlineCode>, which means that it will always return the same result for the same input, which is important for data scrambling. You will need to input your own scrambling logic.

Step 4: Apply the UDF to sensitive columns

Apply the UDF to the sensitive columns in the database that need to be scrambled.


UPDATE employee SET ssn = scramble(ssn);

For example, applying the <inlineCode>scramble<inlineCode> UDF to a column saying, <inlineCode>ssn<inlineCode> in a table named <inlineCode>employee<inlineCode>. The <inlineCode>UPDATE<inlineCode> statement calls the <inlineCode>scramble<inlineCode> UDF and updates the values in the <inlineCode>ssn<inlineCode> column with the scrambled values.

Step 5: Test and validate the scrambled data

Test and validate the scrambled data to ensure that it is unreadable and unusable by unauthorized parties.


SELECT ssn, scramble(ssn) AS scrambled_ssn
FROM employee;

In this snippet, we are running a <inlineCode>SELECT<inlineCode> statement to retrieve the <inlineCode>ssn<inlineCode> column and the corresponding scrambled value using the <inlineCode>scramble<inlineCode> UDF. We can compare the original and scrambled values to ensure that the scrambling is working as expected. 

Step 6: Monitor and maintain the scrambled data

To monitor and maintain the scrambled data, we can regularly check the sensitive columns to ensure that they are still rearranged and that there are no vulnerabilities or breaches. We should also maintain the scrambling key and UDF to ensure that they are up-to-date and effective.

Different Options for Scrambling Data in Redshift

Selecting a data scrambling technique involves balancing security levels, data sensitivity, and application requirements. Various general algorithms exist, each with unique pros and cons. To scramble data in Amazon Redshift, you can use the following Python code samples in conjunction with a library like psycopg2 to interact with your Redshift cluster. Before executing the code samples, you will need to install the psycopg2 library:


pip install psycopg2

Random

Utilizing a random number generator, the Random option quickly secures data, although its susceptibility to reverse engineering limits its robustness for long-term protection.


import random
import string
import psycopg2

def random_scramble(data):
    scrambled = ""
    for char in data:
        scrambled += random.choice(string.ascii_letters + string.digits)
    return scrambled

# Connect to your Redshift cluster
conn = psycopg2.connect(host='your_host', port='your_port', dbname='your_dbname', user='your_user', password='your_password')
cursor = conn.cursor()
# Fetch data from your table
cursor.execute("SELECT sensitive_column FROM your_table;")
rows = cursor.fetchall()

# Scramble the data
scrambled_rows = [(random_scramble(row[0]),) for row in rows]

# Update the data in the table
cursor.executemany("UPDATE your_table SET sensitive_column = %s WHERE sensitive_column = %s;", [(scrambled, original) for scrambled, original in zip(scrambled_rows, rows)])
conn.commit()

# Close the connection
cursor.close()
conn.close()

Shuffle

The Shuffle option enhances security by rearranging data characters. However, it remains prone to brute-force attacks, despite being harder to reverse-engineer.


import random
import psycopg2

def shuffle_scramble(data):
    data_list = list(data)
    random.shuffle(data_list)
    return ''.join(data_list)

conn = psycopg2.connect(host='your_host', port='your_port', dbname='your_dbname', user='your_user', password='your_password')
cursor = conn.cursor()

cursor.execute("SELECT sensitive_column FROM your_table;")
rows = cursor.fetchall()

scrambled_rows = [(shuffle_scramble(row[0]),) for row in rows]

cursor.executemany("UPDATE your_table SET sensitive_column = %s WHERE sensitive_column = %s;", [(scrambled, original) for scrambled, original in zip(scrambled_rows, rows)])
conn.commit()

cursor.close()
conn.close()

Reversible

By scrambling characters in a decryption key-reversible manner, the Reversible method poses a greater challenge to attackers but is still vulnerable to brute-force attacks. We’ll use the Caesar cipher as an example.


def caesar_cipher(data, key):
    encrypted = ""
    for char in data:
        if char.isalpha():
            shift = key % 26
            if char.islower():
                encrypted += chr((ord(char) - 97 + shift) % 26 + 97)
            else:
                encrypted += chr((ord(char) - 65 + shift) % 26 + 65)
        else:
            encrypted += char
    return encrypted

conn = psycopg2.connect(host='your_host', port='your_port', dbname='your_dbname', user='your_user', password='your_password')
cursor = conn.cursor()

cursor.execute("SELECT sensitive_column FROM your_table;")
rows = cursor.fetchall()

key = 5
encrypted_rows = [(caesar_cipher(row[0], key),) for row in rows]
cursor.executemany("UPDATE your_table SET sensitive_column = %s WHERE sensitive_column = %s;", [(encrypted, original) for encrypted, original in zip(encrypted_rows, rows)])
conn.commit()

cursor.close()
conn.close()

Custom

The Custom option enables users to create tailor-made algorithms to resist specific attack types, potentially offering superior security. However, the development and implementation of custom algorithms demand greater time and expertise.

Best Practices for Using Redshift Data Scrambling

There are several best practices that should be followed when using Redshift Data Scrambling to ensure maximum protection:

Use Unique Keys for Each Table

To ensure that the data is not compromised if one key is compromised, each table should have its own unique key pair. This can be achieved by creating a unique index on the table.


CREATE UNIQUE INDEX idx_unique_key ON table_name (column_name);

Encrypt Sensitive Data Fields 

Sensitive data fields such as credit card numbers and social security numbers should be encrypted to provide an additional layer of security. You can encrypt data fields in Redshift using the ENCRYPT function. Here's an example of how to encrypt a credit card number field:


SELECT ENCRYPT('1234-5678-9012-3456', 'your_encryption_key_here');

Use Strong Encryption Algorithms

Strong encryption algorithms such as AES-256 should be used to provide the strongest protection. Redshift supports AES-256 encryption for data at rest and in transit.


CREATE TABLE encrypted_table (  sensitive_data VARCHAR(255) ENCODE ZSTD ENCRYPT 'aes256' KEY 'my_key');

Control Access to Encryption Keys 

Access to encryption keys should be restricted to authorized personnel to prevent unauthorized access to sensitive data. You can achieve this by setting up an AWS KMS (Key Management Service) to manage your encryption keys. Here's an example of how to restrict access to an encryption key using KMS in Python:


import boto3

kms = boto3.client('kms')

key_id = 'your_key_id_here'
grantee_principal = 'arn:aws:iam::123456789012:user/jane'

response = kms.create_grant(
    KeyId=key_id,
    GranteePrincipal=grantee_principal,
    Operations=['Decrypt']
)

print(response)

Regularly Rotate Encryption Keys 

Regular rotation of encryption keys ensures that any compromised keys do not provide unauthorized access to sensitive data. You can schedule regular key rotation in AWS KMS by setting a key policy that specifies a rotation schedule. Here's an example of how to schedule annual key rotation in KMS using the AWS CLI:

 
aws kms put-key-policy \\
    --key-id your_key_id_here \\
    --policy-name default \\
    --policy
    "{\\"Version\\":\\"2012-10-17\\",\\"Statement\\":[{\\"Effect\\":\\"Allow\\"
    "{\\"Version\\":\\"2012-10-17\\",\\"Statement\\":[{\\"Effect\\":\\"Allow\\"
    \\":\\"kms:RotateKey\\",\\"Resource\\":\\"*\\"},{\\"Effect\\":\\"Allow\\",\
    \"Principal\\":{\\"AWS\\":\\"arn:aws:iam::123456789012:root\\"},\\"Action\\
    ":\\"kms:CreateGrant\\",\\"Resource\\":\\"*\\",\\"Condition\\":{\\"Bool\\":
    {\\"kms:GrantIsForAWSResource\\":\\"true\\"}}}]}"

Turn on logging 

To track user access to sensitive data and identify any unwanted access, logging must be enabled. All SQL commands that are executed on your cluster are logged when you activate query logging in Amazon Redshift. This applies to queries that access sensitive data as well as data-scrambling operations. Afterwards, you may examine these logs to look for any strange access patterns or suspect activities.

You may use the following SQL statement to make query logging available in Amazon Redshift:

ALTER DATABASE  SET enable_user_activity_logging=true;

The stl query system table may be used to retrieve the logs once query logging has been enabled. For instance, the SQL query shown below will display all queries that reached a certain table:

Monitor Performance 

Data scrambling is often a resource-intensive practice, so it’s good to monitor CPU usage, memory usage, and disk I/O to ensure your cluster isn’t being overloaded. In Redshift, you can use the <inlineCode>svl_query_summary<inlineCode> and <inlineCode>svl_query_report<inlineCode> system views to monitor query performance. You can also use Amazon CloudWatch to monitor metrics such as CPU usage and disk space.

Amazon CloudWatch

Establishing Backup and Disaster Recovery

In order to prevent data loss in the case of a disaster, backup and disaster recovery mechanisms should be put in place. Automated backups and manual snapshots are only two of the backup and recovery methods offered by Amazon Redshift. Automatic backups are taken once every eight hours by default. 

Moreover, you may always manually take a snapshot of your cluster. In the case of a breakdown or disaster, your cluster may be restored using these backups and snapshots. Use this SQL query to manually take a snapshot of your cluster in Amazon Redshift:

CREATE SNAPSHOT ; 

To restore a snapshot, you can use the <inlineCode>RESTORE<inlineCode> command. For example:


RESTORE 'snapshot_name' TO 'new_cluster_name';

Frequent Review and Updates

To ensure that data scrambling procedures remain effective and up-to-date with the latest security requirements, it is crucial to consistently review and update them. This process should include examining backup and recovery procedures, encryption techniques, and access controls.

In Amazon Redshift, you can assess access controls by inspecting all roles and their associated permissions in the <inlineCode>pg_roles<inlineCode> system catalog database. It is essential to confirm that only authorized individuals have access to sensitive information.

To analyze encryption techniques, use the <inlineCode>pg_catalog.pg_attribute<inlineCode> system catalog table, which allows you to inspect data types and encryption settings for each column in your tables. Ensure that sensitive data fields are protected with robust encryption methods, such as AES-256.

The AWS CLI commands <inlineCode>aws backup plan<inlineCode> and <inlineCode>aws backup vault<inlineCode> enable you to review your backup plans and vaults, as well as evaluate backup and recovery procedures. Make sure your backup and recovery procedures are properly configured and up-to-date.

Decrypting Data in Redshift

There are different options for decrypting data, depending on the encryption method used and the tools available; the decryption process is similar to of encryption, usually a custom UDF is used to decrypt the data, let’s look at one example of decrypting data scrambling with a substitution cipher.

Step 1: Create a UDF with decryption logic for substitution


CREATE FUNCTION decrypt_substitution(ciphertext varchar) RETURNS varchar
IMMUTABLE AS $$
    alphabet = 'abcdefghijklmnopqrstuvwxyz'
    substitution = 'ijklmnopqrstuvwxyzabcdefgh'
    reverse_substitution = ''.join(sorted(substitution, key=lambda c: substitution.index(c)))
    plaintext = ''
    for i in range(len(ciphertext)):
        index = substitution.find(ciphertext[i])
        if index == -1:
            plaintext += ciphertext[i]
        else:
            plaintext += reverse_substitution[index]
    return plaintext
$$ LANGUAGE plpythonu;

Step 2: Move the data back after truncating and applying the decryption function


TRUNCATE original_table;
INSERT INTO original_table (column1, decrypted_column2, column3)
SELECT column1, decrypt_substitution(encrypted_column2), column3
FROM temp_table;

In this example, encrypted_column2 is the encrypted version of column2 in the temp_table. The decrypt_substitution function is applied to encrypted_column2, and the result is inserted into the decrypted_column2 in the original_table. Make sure to replace column1, column2, and column3 with the appropriate column names, and adjust the INSERT INTO statement accordingly if you have more or fewer columns in your table.

Conclusion

Redshift data scrambling is an effective tool for additional data protection and should be considered as part of an organization's overall data security strategy. In this blog post, we looked into the importance of data protection and how this can be integrated effectively into the  data warehouse. Then, we covered the difference between data scrambling and data masking before diving into how one can set up Redshift data scrambling.

Once you begin to accustom to Redshift data scrambling, you can upgrade your security techniques with different techniques for scrambling data and best practices including encryption practices, logging, and performance monitoring. Organizations may improve their data security posture management (DSPM) and reduce the risk of possible breaches by adhering to these recommendations and using an efficient strategy.

<blogcta-big>

Veronica is the security researcher at Sentra. She brings a wealth of knowledge and experience as a cybersecurity researcher. Her main focuses are researching the main cloud provider services and AI infrastructures for Data related threats and techniques.

Subscribe

Latest Blog Posts

Yair Cohen
Yair Cohen
April 27, 2026
4
Min Read

Sentra Q2 2026 Product Updates: Data Security in the Age of AI

Sentra Q2 2026 Product Updates: Data Security in the Age of AI

Every quarter I get asked some version of the same question: "What's the biggest shift you're seeing in enterprise data security right now?" My answer hasn't changed in the past year, but the urgency behind it keeps growing.

AI is no longer a side project. Copilots, agents, and LLM-powered apps are spinning up across Microsoft 365, AWS, Databricks, Azure, and beyond; often faster than security teams can track. At the same time, most large enterprises still have critical regulated data living on file shares and databases in their own data centers, largely invisible to cloud-first tools. And the DLP stacks organizations spent years building? They're only as smart as the labels and context they can see, which, for most companies, isn't very much.

These aren't new problems. But they've collided in a way that makes 2026 a genuinely pivotal year for data security. Read this post (or watch the on-demand webinar) for a walk through of what we shipped in Q2 and where we're taking Sentra for the rest of the year.

The Three Problems We Kept Hearing

Before I walk through our Q2 updates, it's worth naming the friction points that drove them. Across our customer conversations, three questions kept coming up without clean answers:

"What AI assets do we actually have, and what data do they touch?" Organizations know they're deploying copilots and agents. They often have no unified view of what those assets are connected to.

"We have critical data on-prem that never moved to the cloud. What do we do about it?" Almost every large enterprise we work with still has regulated data sitting in data centers. Historically, the choices were. 1) ignore it, 2) try to move it to the cloud just to scan it, which is usually a non-starter for compliance and operations.

"Our DLP stack isn't working the way it should. Is that a classification problem?" Almost always, yes. Enforcement agents, whether it's Microsoft Purview, Google DLP, SASE, CASB, or endpoint DLP, are only as good as the labels and context they see. If data isn't classified accurately and consistently, policies either never trigger or they trigger constantly and generate noise.

These three problems shaped our Q2 investments directly.

Q2 Update #1: AI Security - Turning AI Chaos Into a Governable Surface

The real risk with enterprise AI isn't the models themselves. It's that no one has a clean answer to three basic questions: What AI assets do we have? What data do they touch? And are they using that data in a way that would pass an audit?

In Q2, we took the first concrete step toward answering all three.

Unified AI Asset Inventory. We now give you a single view of your agents, models, and endpoints - with owners and environments - instead of having them scattered across different consoles. If you're running Copilot in M365, SageMaker models on AWS, and custom agents on Bedrock or Azure, they all show up in one place.

Data Lineage Into AI. For each agent, we map which knowledge bases and data stores it relies on and roll up the sensitive data classes and business context to the AI asset level. This is the part that matters most. Until now, people thought about data security in terms of how employees accessed files and permissions. With GenAI, data flows much faster through agents, so understanding the data at rest, and which AI assets touch it, is the critical control point.

Govern Data Use in AI. Once you have that lineage, you can start making real policy decisions. These are the data classes we're comfortable using for copilots and agents; these are the ones that must never be touched. We flag high-risk agents, those with access to regulated data or broad permissions, before they roll out, not after something leaks.

This is the first step toward our broader 2026 AI readiness vision: treating AI assets the same way we treat any other sensitive data store, with inventory, lineage, posture assessment, and policy enforcement. The goal is that when your organization wants to move faster with GenAI, Sentra gives you the map, the policies, and the evidence you need to say yes - safely.

Q2 Update #2: On-Prem & Hybrid Coverage - Securing the Data That Never Moved to the Cloud

Almost every large enterprise we work with still has critical regulated data on file shares and databases in their own data centers. It's often the riskiest and least visible part of the estate.

In Q2, we introduced local on-premise scanners that run inside your environment, scan file shares and data stores where they live, and send us only the metadata and classifications, not the sensitive data itself. You get the same AI-powered discovery, classification, sensitivity mapping, and posture analytics you're used to in cloud and SaaS. Your data never leaves your data center.

"How realistic is full coverage?" - very realistic. We essentially took the technology we built for our cloud scanners and packaged it for any private data center or on-premise environment. We ship lightweight local scanners, support all types of SMB and NFS file shares, and cover databases including MySQL, Oracle, Postgres, and more. Sentra also connects to your Active Directory to map access levels across identities, file shares, and databases.

All of that feeds into a single map across on-prem, cloud, and SaaS, so security teams can finally reason about all their sensitive data everywhere, instead of managing separate point solutions for each island. And critically, this isn't a POC exercise. We focused on easy, secure deployment; lightweight collectors, quick rollout, and alignment with enterprise network and security requirements. This is something you can actually put into production.

Q2 Update #3: Automatic Labeling & Tagging - Making Your Existing DLP Stack Actually Smart

Most organizations aren't looking to rip and replace their DLP stack. The real pain is that enforcement is flying blind. DLP, SSE, CASB, and endpoint tools are like muscles without a brain. They can be powerful, but only if the underlying classification is accurate and consistent.

Sentra's role is to be the data security and classification brain that makes those existing tools actually smart.

In Q2, we doubled down on cross-platform auto-labeling. Automatically applying Microsoft Purview Information Protection (MPIP) labels in M365 and Google sensitivity labels in Google Drive, based on our high-accuracy discovery and classification. Those labels then become the control plane for everything downstream; email DLP, endpoint and web proxies, SaaS DLP, and even AI and Copilot controls that decide which data can be surfaced in responses.

Instead of authoring hundreds of brittle regex rules, you're keying policies off rich business context; HR compensation documents, customer financial statements, high-sensitivity intellectual property. The result is fewer false positives, better enforcement, and a classification foundation that scales.

Strategically, this is how we move from DSPM-plus-alerts to cloud-native DLP and automated remediation at scale. Sentra discovers and understands the data, stamps it with the right labels, and your existing enforcement stack, plus our own remediation, ensures data is only used, shared, and accessed in ways that match its true sensitivity.

Classification Is Still the Core of Everything

One thing I want to leave you with, because I don't think it gets said enough: classification is the foundation that makes all of this work. It's still where we invest the most at Sentra, and with advances in AI, we're making our capabilities more ambitious and more automatic.

We're building classifiers that are specific to each organization's proprietary data. Sentra learns your specific environment, and for every piece of data found, whether it's a file, a column, or a table, we know what it is and what its business context means. Beyond that, we're evolving our sensitivity scoring engine so security teams can bring their own definitions of what's sensitive, and our engine automatically translates that using AI into rules that ensure every piece of data gets the right label.

The goal is to make the effort of classifying and labeling data as easy as describing it to another human being. And to remove the manual research and validation work that doesn't scale in the AI era.

The Bottom Line

The challenge of enterprise data security in 2026 isn't a lack of tools. It's that the tools organizations have - DLP, CASB, SSE, endpoint controls - are only as effective as the data intelligence feeding them. At the same time, AI is creating an entirely new attack surface that most security teams can't see clearly yet. And on-premise data, the part of the estate that never moved to the cloud, remains the riskiest and least visible.

Sentra is building toward a single platform that addresses all three: a data-first security platform that discovers your critical data, understands its context, and drives the controls in your existing tools and in ours, so data stays safe, compliant, and usable for the business.

We'll see you next quarter with more updates. In the meantime, reach out if you have questions or schedule a demo if you want to go deeper on any of this.

Read More
Team Sentra
Team Sentra
April 24, 2026
3
Min Read
AI and ML

Patchwork AI Security vs. Purpose-Built Protection: Thoughts on Cyera’s Ryft Acquisition

Patchwork AI Security vs. Purpose-Built Protection: Thoughts on Cyera’s Ryft Acquisition

Yesterday’s news that Cyera is acquiring Ryft, a two-year-old startup building automated data lakes for AI agents, is the latest sign of how fast the agentic AI security market is moving. It’s also Cyera’s fourth acquisition in five years, on the heels of Trail Security and Otterize, a clear signal that the company is trying to buy its way into new narratives as quickly as they emerge.

For security and data leaders, the question isn’t “Is agentic AI important?” It absolutely is. The question is: What’s the real cost of stitching together yet another acquisition into an already complex platform?

The hidden cost of rapid, piecemeal integrations

On paper, adding Ryft gives Cyera a new story around “agentic AI security.” In practice, it creates a familiar set of integration problems:

  • Multiple architectures to reconcile
    Trail Security, Otterize, and now Ryft were all built as independent products with their own data models, UX patterns, and engineering roadmaps. Four acquisitions in five years means customers are effectively buying an integration project that’s still in progress, not a single, mature platform.

  • Gaps, overlaps, and inconsistent controls
    Every acquired module has its own blind spots and strengths. Until they’re truly unified, you get overlapping coverage in some areas, gaps in others, and policy engines that don’t behave consistently across cloud, SaaS, and on-prem.

  • Slower time-to-value for AI initiatives
    AI programs move quickly; integrations do not. Each acquisition has to be wired into discovery, classification, policy, reporting, access control, and remediation workflows before it delivers real value. That’s measured in quarters and years, not weeks.

  • Operational drag on security teams
    When you tie together multiple acquired engines, you often see scan-based coverage, noisy false positives, and limited self-serve reporting that still depends on the vendor’s team to interpret results. That’s the opposite of what already stretched security teams need as they take on AI data risk.

The Ryft deal fits this pattern. It’s a high-priced bet on an early-stage team with a small set of digital-native customers, not a proven, enterprise-scale AI data security engine. That’s fine as a venture bet. It’s more problematic when packaged as an answer for Fortune 500 AI governance.

Why agentic AI security can’t be bolted on

Agentic AI changes the risk profile of enterprise data:

  • Agents traverse structured and unstructured data across cloud, SaaS, and on-prem.
  • They act on behalf of identities, often chaining tools and APIs in ways that are hard to predict.
  • The blast radius of a misconfiguration or over-permissioned identity grows dramatically once agents are in the loop.

Trying to solve that by bolting an AI data lake acquisition onto a legacy, scan-based DSPM engine is risky. You’re adding another moving part on top of a system that already struggles with:

  • Point-in-time scans instead of real-time, continuous coverage
  • High false positives without strong prioritization
  • Shallow support for hybrid and on-prem environments
  • Vendor-controlled workflows instead of customer-controlled, self-serve reporting

If the underlying platform can’t continuously understand where sensitive data lives, which identities can touch it, and how that access is used, then adding an “AI data lake” on the side doesn’t fix the fundamentals. It just adds another place for risk to hide.

A different path: Sentra’s purpose-built, real-time platform

At Sentra, we took a different approach from day one: build a single, in-place, real-time data security platform, not a patchwork of stitched-together acquisitions.

A few principles guide the way we think about AI and data security:

  • One unified architecture
    Sentra is a purpose-built, unified platform, not an assortment of logos held together by integration roadmaps. There’s one architecture, one data model, one roadmap, and one team focused entirely on DSPM and AI data security, rather than a set of acquired point products that still need to be woven together.

  • Proven for real AI workloads today
    Our platform is already securing real AI workloads in production environments, rather than depending on the future maturation of a seed-stage acquisition. AI data security for us is not a sidecar story. It's built into how we discover, classify, govern, and remediate risk across your estate.

  • Higher-precision signal, not more noise
    Sentra delivers higher classification precision (4.9 vs. 4.7 stars on Gartner) and couples that with workflows your team controls, not processes that require vendor intervention every time you need a new report or policy tweak.

  • Complete coverage for complex environments
    Modern enterprises aren’t cloud-only. Sentra provides full coverage across IaaS, PaaS, SaaS, and on-premises from a single platform, built for hybrid and legacy-heavy environments as much as for cloud-native stacks.

In other words, while some vendors are racing to acquire their way into the next AI buzzword, Sentra is focused on delivering trustworthy, real-time, identity-aware data security that you can put in front of a CISO and a data platform owner today.

What to ask your vendors now

If you’re evaluating Cyera (or any vendor riding the latest AI acquisition wave), a few concrete questions can cut through the noise:

  1. How many acquisitions have you done in the last five years, and which parts of my deployment depend on those integrations actually working?
  2. What’s fully integrated and running in production today vs. what’s still on the roadmap?
  3. Are my AI and non-AI data risks handled by the same platform, policies, and reporting, or by separate acquired modules?
  4. Do you provide continuous coverage and identity-aware controls across cloud, SaaS, and on-prem, or am I still relying on periodic scans and partial visibility?

The AI security market doesn’t need more logos; it needs fewer moving parts, better signals, and real-time control over how data is used by humans and agents alike.

That’s the standard Sentra is building for and the lens through which we view every new acquisition announcement in this space.

Read More
Ron Reiter
Ron Reiter
April 24, 2026
3
Min Read
Data Security

Sentra Now Supports Solidworks 3D CAD Files – Protecting the Digital Blueprint in the Age of AI

Sentra Now Supports Solidworks 3D CAD Files – Protecting the Digital Blueprint in the Age of AI

Walk into any advanced manufacturing, aerospace, defense, or industrial design shop and you’re just as likely to see Solidworks as you are AutoCAD. The models, assemblies, and drawings built in Solidworks are the digital blueprints for everything from turbine blades and medical devices to satellites and weapons systems.

Earlier this year we announced native support for AutoCAD DWG files, making an entire class of previously opaque CAD data visible to security and compliance teams for the first time. Now we’re extending that same deep visibility to Solidworks 3D CAD files, so you can protect the IP and regulated technical data hiding inside your .sldprt, .sldasm, and related content—without slowing engineering down.

And as AI accelerates design cycles, that visibility is no longer optional.

AI is Supercharging Design – and Expanding the Blast Radius

Design teams are pushing faster than ever:

  • Generative design tools propose entire families of parts and assemblies.
  • Copilots summarize requirements, suggest changes, and draft documentation off CAD models.
  • PLM-integrated agents automatically create downstream artifacts—quotes, NC programs, service manuals—based on 3D designs.
  • RAG-style internal assistants answer questions using a mix of project docs, CAD files, and simulation outputs.

All of this is powerful. It also multiplies the ways sensitive CAD data can leak:

  • Entire assemblies uploaded to unmanaged AI tools “just to explore options.”
  • Export-controlled models referenced in prompts and ending up in long‑lived AI data lakes.
  • Supplier and customer CAD shared into external copilots with little visibility into who—or what agent—can access it.
  • Rich metadata from CAD (usernames, project codes, server paths, partner names) silently turned into reconnaissance material.

If you don’t understand what’s inside your CAD, where it lives, and which identities and AI agents can reach it, AI doesn’t just speed up design—it speeds up IP disclosure, compliance failures, and supply‑chain exposure.

CAD Has Been a Blind Spot for Security

Most traditional DSPM and DLP tools still treat specialized engineering formats as a big binary blob: “probably sensitive, treat with caution.” That may have been acceptable when CAD lived on a handful of on‑prem engineering servers.

It’s not acceptable when:

  • Decades of CAD history have been lifted and shifted into S3, Azure Blob, or SharePoint.
  • ITAR/EAR “technical data” now lives side‑by‑side with everyday project files in cloud object stores.
  • Those same repositories feed downstream systems—PLM, MES, AI assistants—where traditional security tools have little or no visibility.

We built native DWG parsing into Sentra to break that stalemate, making CAD content as transparent to security teams as a Word document. Solidworks 3D CAD support is the next logical step.

What’s Really Inside a Solidworks 3D CAD File?

Like DWG, a Solidworks file is far more than geometry. It’s a container for rich metadata, text, and structural context that describes both what you’re building and how it fits into regulated programs and commercial IP. Our Solidworks support is designed to surface that security‑relevant context—without requiring CAD tools, manual exports, or data movement.

Similar to what we do for DWG, Sentra can extract and analyze key elements, including:

  • Document properties
    Authors, “last saved by,” creation and modification timestamps, total editing time, and revision counters—signals that help you understand who is touching sensitive designs and when.

  • Custom properties and configuration metadata
    Project IDs, part and assembly numbers, revision codes, program names, business units, and export‑control or classification markings encoded as custom properties or notes.

  • Text content and annotations
    Notes, callouts, PMI, and embedded text that often contain material specifications, tolerances, customer names, contract IDs, and phrases like “COMPANY CONFIDENTIAL,” “EXPORT CONTROLLED,” or ITAR statements.

  • Assembly structure and component names
    Which parts roll up into which assemblies, and how those components are named—critical when you need to understand which physical systems a given sensitive model belongs to.

  • File dependencies and paths
    References to drawings, configurations, libraries, and external resources that routinely expose server names, share paths, usernames, and department structures—goldmine context for attackers, but also for incident response and insider‑risk investigations.

For organizations operating under ITAR and EAR, this is where truly export‑controlled technical data actually lives—not in the folder name, but in the title blocks, annotations, and metadata attached to models and drawings.

Turning Solidworks Models into Actionable Security Signals

By parsing Solidworks 3D CAD files in place, inside your own cloud accounts or VPCs, Sentra can now treat them as first‑class citizens in your data security program—just like we do for DWG and other specialized formats.

That unlocks concrete use cases, such as:

  • Finding export‑controlled or highly sensitive designs in cloud storage
    Automatically surface Solidworks files whose metadata, annotations, or custom properties contain ITAR statements, ECCN codes, proprietary markings, or customer‑confidential labels—so you can focus remediation on the drawings and models that are actually regulated.

  • Mapping who (and what) can access critical designs
    Combine CAD‑aware classification with Sentra’s DSPM and DAG capabilities to answer:
    Where are our most sensitive Solidworks assemblies stored, and which identities, service principals, and AI agents can currently reach them?

  • Monitoring AI and collaboration workflows for IP exposure
    Track when Solidworks files that contain regulated or high‑value IP are moved into AI data lakes, shared via collaboration platforms, or accessed by non‑human identities—so DDR policies can flag, quarantine, or route for review before they turn into public incidents.

  • Building a defensible audit trail for CAD‑resident technical data
    Maintain an inventory of Solidworks files that contain export‑control markings or IP‑critical content, tie each file to its exact storage location and access controls, and surface any out‑of‑policy placements—so when auditors ask “Where is your technical data?”, you can answer with data, not slideware.

Closing the Gap Between “Stored” and “Understood” for 3D CAD

As workloads like EDA, PLM, simulation, and AI‑assisted design move deeper into the cloud, the number of specialized formats in your environment explodes. Most tools still only truly understand emails, office documents, and a narrow slice of structured data.

The reality is simple: you cannot secure data you don’t understand. Understanding means being able to answer, at scale, not just “Where is this file?” but “What is inside this file, how sensitive is it, and how is AI amplifying its risk?”

For organizations whose crown‑jewel IP and export‑controlled technical data live in Solidworks 3D CAD, that’s the gap Sentra is now closing.

If you want to see what’s actually hiding inside your own Solidworks models and assemblies, the easiest next step is to run a focused assessment: pick a few representative buckets or repositories, let Sentra scan those CAD files in place, and review the inventory of regulated and high‑value designs that surfaces.

Chances are, once you’ve seen that map—and how it connects to your AI initiatives—you’ll never look at “just another CAD file” the same way again.

Read More
Expert Data Security Insights Straight to Your Inbox
What Should I Do Now:
1

Get the latest GigaOm DSPM Radar report - see why Sentra was named a Leader and Fast Mover in data security. Download now and stay ahead on securing sensitive data.

2

Sign up for a demo and learn how Sentra’s data security platform can uncover hidden risks, simplify compliance, and safeguard your sensitive data.

3

Follow us on LinkedIn, X (Twitter), and YouTube for actionable expert insights on how to strengthen your data security, build a successful DSPM program, and more!

Before you go...

Get the Gartner Customers' Choice for DSPM Report

Read why 98% of users recommend Sentra.

White Gartner Peer Insights Customers' Choice 2025 badge with laurel leaves inside a speech bubble.