
Use Redshift Data Scrambling for Additional Data Protection

May 3, 2023 · 8 Min Read

According to IBM, a data breach in the United States cost companies an average of $9.44 million in 2022. It is now more important than ever for organizations to prioritize protecting confidential information. Data scrambling, which can add an extra layer of security to data, is one approach to accomplishing this.

In this post, we'll analyze the value of data protection, look at the potential financial consequences of data breaches, and talk about how Redshift Data Scrambling may help protect private information.

The Importance of Data Protection

Data protection is essential to safeguard sensitive data from unauthorized access. Identity theft, financial fraud, and other serious consequences are all possible results of a data breach. Data protection is also crucial for compliance reasons. Sensitive data must be protected by law in several sectors, including government, banking, and healthcare. Failure to abide by these regulations can result in heavy fines, legal problems, and loss of business.

Attackers employ many techniques, including phishing, malware, insider threats, and direct system compromise, to gain access to confidential information. For example, a phishing attack may lead to the theft of login credentials, and malware may infect a system, opening the door for additional attacks and data theft.

So how can you protect yourself against these attacks and minimize your data attack surface?

What is Redshift Data Masking?

Redshift data masking is a technique used to protect sensitive data in Amazon Redshift, a cloud-based data warehousing and analytics service. It involves replacing sensitive data with fictitious yet realistic values to protect it from unauthorized access or exposure. Combining Redshift data masking with other security measures, such as access control and encryption, creates a comprehensive data protection plan.
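As a simple illustration (a sketch of one common masking pattern; the table and column names are examples, not from the original), masking might keep only the last four digits of a social security number:

SELECT 'XXX-XX-' || RIGHT(ssn, 4) AS masked_ssn
FROM employee;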


What is Redshift Data Scrambling?

Redshift data scrambling protects confidential information in a Redshift database by altering original data values using algorithms or formulas, creating unrecognizable data sets. This method is beneficial when sharing sensitive data with third parties or using it for testing, development, or analysis, ensuring privacy and security while enhancing usability. 

The technique is highly customizable, allowing organizations to select the desired level of protection while maintaining data usability. Redshift data scrambling is cost-effective, requiring no additional hardware or software investments, providing an attractive, low-cost solution for organizations aiming to improve cloud data security.

Data Masking vs. Data Scrambling

Data masking involves replacing sensitive data with fictitious but realistic values. Data scrambling, on the other hand, changes the original data values using an algorithm or a formula to generate a new set of values.

In some cases, data scrambling can be used as part of data masking techniques. For instance, sensitive data such as credit card numbers can be scrambled before being masked to enhance data protection further.

Setting up Redshift Data Scrambling

Now that we understand Redshift and data scrambling, we can look at how to set it up. Enabling data scrambling in Redshift involves several steps.

To achieve data scrambling in Redshift, SQL queries are utilized to invoke built-in or user-defined functions. These functions utilize a blend of cryptographic techniques and randomization to scramble the data.

The following steps walk through an example setup:

Step 1: Create a new Redshift cluster

Create a new Redshift cluster or use an existing cluster if available. 


Step 2: Define a scrambling key

Define a scrambling key that will be used to scramble the sensitive data.

 
SET app_context.scrambling_key TO 'MyScramblingKey';

In this code snippet, we define a scrambling key by setting a session context variable named `app_context.scrambling_key` to the value `MyScramblingKey` (Redshift session context variables require a two-part, dot-separated name). This key can then be referenced when scrambling the sensitive data.
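If you use session context variables as above, you can read the key back in SQL with `current_setting`; a quick check (a sketch, assuming the variable set in Step 2):

SELECT current_setting('app_context.scrambling_key');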

Step 3: Create a user-defined function (UDF)

Create a user-defined function in Redshift that will be used to scramble the sensitive data. 


CREATE FUNCTION scramble(input_string VARCHAR)
RETURNS VARCHAR
STABLE
AS $$
    import hashlib
    if input_string is None:
        return None
    scramble_key = 'MyScramblingKey'
    # Derive a deterministic replacement character from the key,
    # the character's position, and the character itself
    scrambled = ''
    for i, ch in enumerate(input_string):
        digest = hashlib.sha256((scramble_key + str(i) + ch).encode('utf-8')).hexdigest()
        scrambled += digest[0]
    return scrambled
$$ LANGUAGE plpythonu;

Here, we create a Python UDF named `scramble` that takes a string input and returns the scrambled output (note that Redshift scalar UDFs are written in SQL or Python, not PL/pgSQL). The function is declared `STABLE`, telling Redshift it returns the same result for the same input, which matters for consistent scrambling. The SHA-256-based character substitution above is only a placeholder; swap in your own scrambling logic as needed.

Step 4: Apply the UDF to sensitive columns

Apply the UDF to the sensitive columns in the database that need to be scrambled.


UPDATE employee SET ssn = scramble(ssn);

Here we apply the `scramble` UDF to a column named `ssn` in a table named `employee`. The `UPDATE` statement calls the `scramble` UDF and overwrites the values in the `ssn` column with the scrambled values.
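Because the `UPDATE` overwrites values in place, a common precaution (our suggestion, not a required step from the original) is to keep an unscrambled copy first:

CREATE TABLE employee_backup AS SELECT * FROM employee;  -- keep this copy in a tightly restricted schema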

Step 5: Test and validate the scrambled data

Test and validate the scrambled data to ensure that it is unreadable and unusable by unauthorized parties.


SELECT ssn, scramble(ssn) AS scrambled_ssn
FROM employee;

In this snippet, we run a `SELECT` statement to retrieve the `ssn` column alongside the value produced by the `scramble` UDF, letting us compare original and scrambled values to confirm the scrambling works as expected. Note that this check only makes sense before the in-place `UPDATE` from Step 4 (or against a backup copy), since afterwards `ssn` already holds scrambled values.

Step 6: Monitor and maintain the scrambled data

To monitor and maintain the scrambled data, regularly check the sensitive columns to ensure that they are still scrambled and that there are no vulnerabilities or breaches. Also maintain the scrambling key and UDF to keep them up-to-date and effective.
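One simple spot check (a sketch; it assumes raw SSNs follow the NNN-NN-NNNN pattern) is to count rows that still look unscrambled:

SELECT COUNT(*) AS suspicious_rows
FROM employee
WHERE ssn ~ '^[0-9]{3}-[0-9]{2}-[0-9]{4}$';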

Different Options for Scrambling Data in Redshift

Selecting a data scrambling technique involves balancing security levels, data sensitivity, and application requirements. Various general algorithms exist, each with unique pros and cons. To scramble data in Amazon Redshift, you can use the following Python code samples in conjunction with a library like psycopg2 to interact with your Redshift cluster. Before executing the code samples, you will need to install the psycopg2 library:


pip install psycopg2

Random

Utilizing a random number generator, the Random option quickly secures data, although its susceptibility to reverse engineering limits its robustness for long-term protection.


import random
import string
import psycopg2

def random_scramble(data):
    scrambled = ""
    for char in data:
        scrambled += random.choice(string.ascii_letters + string.digits)
    return scrambled

# Connect to your Redshift cluster
conn = psycopg2.connect(host='your_host', port='your_port', dbname='your_dbname', user='your_user', password='your_password')
cursor = conn.cursor()
# Fetch data from your table
cursor.execute("SELECT sensitive_column FROM your_table;")
rows = cursor.fetchall()

# Pair each scrambled value with the original so the UPDATE
# can locate rows by their current value
params = [(random_scramble(row[0]), row[0]) for row in rows]

# Update the data in the table
cursor.executemany("UPDATE your_table SET sensitive_column = %s WHERE sensitive_column = %s;", params)
conn.commit()

# Close the connection
cursor.close()
conn.close()

Shuffle

The Shuffle option enhances security by rearranging data characters. However, it remains prone to brute-force attacks, despite being harder to reverse-engineer.


import random
import psycopg2

def shuffle_scramble(data):
    data_list = list(data)
    random.shuffle(data_list)
    return ''.join(data_list)

conn = psycopg2.connect(host='your_host', port='your_port', dbname='your_dbname', user='your_user', password='your_password')
cursor = conn.cursor()

cursor.execute("SELECT sensitive_column FROM your_table;")
rows = cursor.fetchall()

params = [(shuffle_scramble(row[0]), row[0]) for row in rows]

cursor.executemany("UPDATE your_table SET sensitive_column = %s WHERE sensitive_column = %s;", params)
conn.commit()

cursor.close()
conn.close()

Reversible

By scrambling characters in a way that can be reversed with a key, the Reversible method poses a greater challenge to attackers but is still vulnerable to brute-force attacks. We’ll use the Caesar cipher as an example.


import psycopg2

def caesar_cipher(data, key):
    encrypted = ""
    for char in data:
        if char.isalpha():
            shift = key % 26
            if char.islower():
                encrypted += chr((ord(char) - 97 + shift) % 26 + 97)
            else:
                encrypted += chr((ord(char) - 65 + shift) % 26 + 65)
        else:
            encrypted += char
    return encrypted

conn = psycopg2.connect(host='your_host', port='your_port', dbname='your_dbname', user='your_user', password='your_password')
cursor = conn.cursor()

cursor.execute("SELECT sensitive_column FROM your_table;")
rows = cursor.fetchall()

key = 5
params = [(caesar_cipher(row[0], key), row[0]) for row in rows]
cursor.executemany("UPDATE your_table SET sensitive_column = %s WHERE sensitive_column = %s;", params)
conn.commit()

cursor.close()
conn.close()

Custom

The Custom option enables users to create tailor-made algorithms to resist specific attack types, potentially offering superior security. However, the development and implementation of custom algorithms demand greater time and expertise.
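As a minimal sketch of what a custom approach might look like (entirely illustrative; the function name, key, and design are our assumptions), here is a keyed, format-preserving substitution in Python:

import hashlib

def custom_scramble(data, key):
    # Keyed, format-preserving scramble: digits map to digits,
    # letters map to letters, punctuation is left untouched
    out = []
    for i, ch in enumerate(data):
        # Deterministic per-position byte derived from the key
        b = hashlib.sha256((key + str(i) + ch).encode()).digest()[0]
        if ch.isdigit():
            out.append(str(b % 10))
        elif ch.isalpha():
            base = 'a' if ch.islower() else 'A'
            out.append(chr(ord(base) + b % 26))
        else:
            out.append(ch)
    return ''.join(out)

print(custom_scramble('123-45-6789', 'MyScramblingKey'))  # keeps the NNN-NN-NNNN shape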

Best Practices for Using Redshift Data Scrambling

There are several best practices that should be followed when using Redshift Data Scrambling to ensure maximum protection:

Use Unique Keys for Each Table

To limit the damage if one key is compromised, each table should be scrambled with its own unique key. One lightweight way to manage this (an illustrative sketch using session context variables; the names are examples) is to set a separate key per table:

SET app_context.employee_key TO 'KeyForEmployeeTable';
SET app_context.customer_key TO 'KeyForCustomerTable';

Encrypt Sensitive Data Fields 

Sensitive data fields such as credit card numbers and social security numbers should be encrypted to provide an additional layer of security. Redshift does not expose a SQL-level ENCRYPT function for individual fields, so field-level encryption is typically done client-side before the data is loaded. A minimal sketch using Python's cryptography library (key handling simplified for illustration):

from cryptography.fernet import Fernet

key = Fernet.generate_key()  # in practice, store and retrieve this key from AWS KMS
cipher = Fernet(key)
token = cipher.encrypt(b'1234-5678-9012-3456')  # load `token` into Redshift instead of the raw value

Use Strong Encryption Algorithms

Strong encryption algorithms such as AES-256 should be used to provide the strongest protection. Redshift supports AES-256 encryption for data at rest; it is enabled at the cluster level (for example, with an AWS KMS key) rather than per column. A sketch using the AWS CLI:

aws redshift create-cluster \
    --cluster-identifier my-encrypted-cluster \
    --node-type ra3.xlplus \
    --master-username admin \
    --master-user-password 'YourStrongPassword1' \
    --encrypted \
    --kms-key-id your_kms_key_id

Control Access to Encryption Keys 

Access to encryption keys should be restricted to authorized personnel to prevent unauthorized access to sensitive data. You can achieve this by setting up an AWS KMS (Key Management Service) to manage your encryption keys. Here's an example of how to restrict access to an encryption key using KMS in Python:


import boto3

kms = boto3.client('kms')

key_id = 'your_key_id_here'
grantee_principal = 'arn:aws:iam::123456789012:user/jane'

response = kms.create_grant(
    KeyId=key_id,
    GranteePrincipal=grantee_principal,
    Operations=['Decrypt']
)

print(response)

Regularly Rotate Encryption Keys 

Regular rotation of encryption keys ensures that any compromised keys do not provide unauthorized access to sensitive data. AWS KMS supports automatic annual rotation of key material, which you can enable with the AWS CLI:

aws kms enable-key-rotation --key-id your_key_id_here

You can verify the setting with `aws kms get-key-rotation-status --key-id your_key_id_here`.

Turn on logging 

To track user access to sensitive data and identify any unwanted access, logging must be enabled. When you activate query logging in Amazon Redshift, all SQL commands executed on your cluster are logged, including queries that access sensitive data and the data-scrambling operations themselves. You can then examine these logs for unusual access patterns or suspicious activity.

Query logging is controlled by the `enable_user_activity_logging` parameter in your cluster's parameter group rather than by a SQL statement. A sketch using the AWS CLI (assuming a parameter group named my-param-group):

aws redshift modify-cluster-parameter-group \
    --parameter-group-name my-param-group \
    --parameters ParameterName=enable_user_activity_logging,ParameterValue=true

Once query logging has been enabled, the `stl_query` system table can be used to retrieve the logs. For instance, a query like the following (filtering on the `employee` table from our earlier example) displays recent queries that touched it:
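SELECT query, userid, starttime, querytxt
FROM stl_query
WHERE querytxt ILIKE '%employee%'
ORDER BY starttime DESC;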

Monitor Performance 

Data scrambling is often a resource-intensive practice, so it’s good to monitor CPU usage, memory usage, and disk I/O to ensure your cluster isn’t being overloaded. In Redshift, you can use the `svl_query_summary` and `svl_query_report` system views to monitor query performance. You can also use Amazon CloudWatch to monitor metrics such as CPU usage and disk space.
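For example, a quick look at the heaviest recent statements via `svl_query_summary` (a sketch; `maxtime` is reported in microseconds):

SELECT query, maxtime, rows, is_diskbased
FROM svl_query_summary
ORDER BY maxtime DESC
LIMIT 10;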


Establishing Backup and Disaster Recovery

To prevent data loss in the event of a disaster, backup and disaster recovery mechanisms should be put in place. Amazon Redshift offers several backup and recovery methods, including automated backups and manual snapshots. Automated snapshots are taken approximately every eight hours by default.

Moreover, you can always take a manual snapshot of your cluster, and in the event of a failure or disaster, restore your cluster from these backups and snapshots. Snapshots are created through the console or the AWS CLI rather than SQL; for example:

aws redshift create-cluster-snapshot \
    --cluster-identifier my-cluster \
    --snapshot-identifier my-manual-snapshot

To restore from a snapshot, create a new cluster from it:

aws redshift restore-from-cluster-snapshot \
    --cluster-identifier new-cluster-name \
    --snapshot-identifier my-manual-snapshot

Frequent Review and Updates

To ensure that data scrambling procedures remain effective and up-to-date with the latest security requirements, it is crucial to consistently review and update them. This process should include examining backup and recovery procedures, encryption techniques, and access controls.

In Amazon Redshift, you can assess access controls by inspecting all roles and their associated permissions in the `pg_roles` system catalog view. It is essential to confirm that only authorized individuals have access to sensitive information.
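For a quick spot check, the PostgreSQL-style `pg_user` view lists users and key privileges:

SELECT usename, usesuper, usecreatedb
FROM pg_user;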

To review column definitions, use the `pg_table_def` system view, which shows the data type and compression encoding of each column in your tables. Ensure that sensitive data fields are protected with robust encryption methods, such as AES-256, whether at the cluster level or client side.

The AWS CLI commands `aws backup list-backup-plans` and `aws backup list-backup-vaults` let you review your backup plans and vaults and evaluate backup and recovery procedures. Make sure your backup and recovery procedures are properly configured and up-to-date.

Decrypting Data in Redshift

There are different options for decrypting data, depending on the encryption method used and the tools available. The decryption process mirrors encryption: usually a custom UDF is used to decrypt the data. Let’s look at one example of reversing a substitution-cipher scramble.

Step 1: Create a UDF with decryption logic for substitution


CREATE FUNCTION decrypt_substitution(ciphertext varchar) RETURNS varchar
IMMUTABLE AS $$
    alphabet = 'abcdefghijklmnopqrstuvwxyz'
    substitution = 'ijklmnopqrstuvwxyzabcdefgh'
    plaintext = ''
    # Map each ciphertext character back to the alphabet
    # position it was substituted from
    for ch in ciphertext:
        index = substitution.find(ch)
        if index == -1:
            plaintext += ch
        else:
            plaintext += alphabet[index]
    return plaintext
$$ LANGUAGE plpythonu;

Step 2: Move the data back after truncating and applying the decryption function


TRUNCATE original_table;
INSERT INTO original_table (column1, decrypted_column2, column3)
SELECT column1, decrypt_substitution(encrypted_column2), column3
FROM temp_table;

In this example, `encrypted_column2` is the encrypted version of `column2` in the `temp_table`. The `decrypt_substitution` function is applied to `encrypted_column2`, and the result is inserted into `decrypted_column2` in the `original_table`. Make sure to replace `column1`, `column2`, and `column3` with the appropriate column names, and adjust the `INSERT INTO` statement accordingly if you have more or fewer columns in your table.

Conclusion

Redshift data scrambling is an effective tool for additional data protection and should be considered as part of an organization's overall data security strategy. In this blog post, we looked into the importance of data protection and how it can be integrated effectively into the data warehouse. Then, we covered the difference between data scrambling and data masking before diving into how to set up Redshift data scrambling.

Once you become accustomed to Redshift data scrambling, you can strengthen your defenses with different scrambling techniques and best practices, including encryption, logging, and performance monitoring. By adhering to these recommendations and using an efficient strategy, organizations can improve their data security posture management (DSPM) and reduce the risk of breaches.


Veronica is a security researcher at Sentra. She brings a wealth of knowledge and experience as a cybersecurity researcher, focusing on the major cloud providers' services and AI infrastructures and the data-related threats and techniques that target them.
