Use Redshift Data Scrambling for Additional Data Protection

May 3, 2023 · 8 Min Read

According to IBM, a data breach in the United States cost companies an average of $9.44 million in 2022. It is now more important than ever for organizations to prioritize protecting confidential information. Data scrambling, which adds an extra layer of security to data, is one approach to accomplishing this.

In this post, we'll examine the value of data protection, look at the potential financial consequences of data breaches, and discuss how Redshift data scrambling can help protect private information.

The Importance of Data Protection

Data protection is essential to safeguard sensitive data from unauthorized access. Identity theft, financial fraud, and other serious consequences can all result from a data breach. Data protection is also crucial for compliance reasons: in several sectors, including government, banking, and healthcare, sensitive data must be protected by law, and failure to abide by these regulations can result in heavy fines, legal problems, and loss of business.

Attackers employ many techniques, including phishing, malware, insider threats, and direct exploitation of vulnerabilities, to gain access to confidential information. For example, a phishing attack may lead to the theft of login credentials, and malware may infect a system, opening the door for additional attacks and data theft.

So how can you protect yourself against these attacks and minimize your data attack surface?

What is Redshift Data Masking?

Redshift data masking is a technique used to protect sensitive data in Amazon Redshift, a cloud-based data warehousing and analytics service. Redshift data masking involves replacing sensitive data with fictitious, realistic values to protect it from unauthorized access or exposure. You can enhance data security by using Redshift data masking in conjunction with other security measures, such as access control and encryption, to create a comprehensive data protection plan.
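
As a simple illustration, masking can be as lightweight as replacing every digit of a value with a placeholder character. The following is a minimal sketch using Redshift's REGEXP_REPLACE function on a sample literal:

-- Returns 'XXX-XX-XXXX': every digit is replaced, but the format is preserved
SELECT REGEXP_REPLACE('123-45-6789', '[0-9]', 'X') AS masked_ssn;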

What is Redshift Data Scrambling?

Redshift data scrambling protects confidential information in a Redshift database by altering original data values using algorithms or formulas, creating unrecognizable data sets. This method is beneficial when sharing sensitive data with third parties or using it for testing, development, or analysis, ensuring privacy and security while enhancing usability. 

The technique is highly customizable, allowing organizations to select the desired level of protection while maintaining data usability. It also requires no additional hardware or software investment, making it an attractive, low-cost option for organizations aiming to improve cloud data security.

Data Masking vs. Data Scrambling

Data masking involves replacing sensitive data with fictitious but realistic values. Data scrambling, on the other hand, transforms the original data values using an algorithm or formula to generate a new set of values.

In some cases, data scrambling can be used as part of data masking techniques. For instance, sensitive data such as credit card numbers can be scrambled before being masked to enhance data protection further.
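To make the combination concrete, here is a minimal Python sketch (the function names and logic are illustrative, not a library API) that first scrambles a credit card number's digits and then masks all but the last four:

import random

def scramble_digits(card_number):
    # Scramble: shuffle the digits so the original number is unrecognizable
    digits = [c for c in card_number if c.isdigit()]
    random.shuffle(digits)
    return ''.join(digits)

def mask_all_but_last4(value):
    # Mask: keep only the last four characters visible
    return '*' * (len(value) - 4) + value[-4:]

print(mask_all_but_last4(scramble_digits('1234-5678-9012-3456')))  # e.g. ************4321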

Setting up Redshift Data Scrambling

Having covered what Redshift data scrambling is, we can now walk through how to set it up. Enabling data scrambling in Redshift involves several steps.

To achieve data scrambling in Redshift, SQL queries are utilized to invoke built-in or user-defined functions. These functions utilize a blend of cryptographic techniques and randomization to scramble the data.

The following steps use example code to illustrate the setup:

Step 1: Create a new Redshift cluster

Create a new Redshift cluster or use an existing cluster if available. 

Step 2: Define a scrambling key

Define a scrambling key that will be used to scramble the sensitive data.

 
SET app_context.my_scrambling_key = 'MyScramblingKey';

In this code snippet, we define a scrambling key by setting a session context variable named <inlineCode>app_context.my_scrambling_key</inlineCode> (Redshift requires the two-part <inlineCode>namespace.variable</inlineCode> format for custom session variables) to the value <inlineCode>MyScramblingKey</inlineCode>. This key will be used by the user-defined function to scramble the sensitive data.

Step 3: Create a user-defined function (UDF)

Create a user-defined function in Redshift that will be used to scramble the sensitive data. 


CREATE FUNCTION scramble(input_string VARCHAR)
RETURNS VARCHAR
STABLE
AS $$
    # Example logic only: shift each printable character by the key length.
    # Replace this placeholder with your own scrambling algorithm.
    key = 'MyScramblingKey'
    shift = len(key)
    return ''.join(chr((ord(c) - 32 + shift) % 95 + 32) for c in input_string)
$$ LANGUAGE plpythonu;

Here, we create a UDF named <inlineCode>scramble</inlineCode> that takes a string input and returns the scrambled output. Redshift scalar UDFs are written in SQL or Python, so this example uses Python (<inlineCode>plpythonu</inlineCode>). The function is declared <inlineCode>STABLE</inlineCode>, which tells Redshift that it returns the same result for the same input within a statement; consistent output is important for data scrambling. The character-shift body above is only a placeholder, and you will need to supply your own scrambling logic.

Step 4: Apply the UDF to sensitive columns

Apply the UDF to the sensitive columns in the database that need to be scrambled.


UPDATE employee SET ssn = scramble(ssn);

Here, we apply the <inlineCode>scramble</inlineCode> UDF to a column named <inlineCode>ssn</inlineCode> in a table named <inlineCode>employee</inlineCode>. The <inlineCode>UPDATE</inlineCode> statement calls the <inlineCode>scramble</inlineCode> UDF and replaces the values in the <inlineCode>ssn</inlineCode> column with their scrambled versions.
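
If several columns are sensitive, the same pattern extends naturally. In the sketch below, the <inlineCode>email</inlineCode> column and the <inlineCode>WHERE</inlineCode> filter are hypothetical, included only to show how the update can be scoped:

UPDATE employee
SET ssn = scramble(ssn),
    email = scramble(email)
WHERE ssn IS NOT NULL;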

Step 5: Test and validate the scrambled data

Test and validate the scrambled data to ensure that it is unreadable and unusable by unauthorized parties.


SELECT ssn, scramble(ssn) AS scrambled_ssn
FROM employee;

In this snippet, we run a <inlineCode>SELECT</inlineCode> statement that retrieves the <inlineCode>ssn</inlineCode> column alongside the output of the <inlineCode>scramble</inlineCode> UDF. Run this check against the original values (for example, before the <inlineCode>UPDATE</inlineCode> in Step 4, or against a backup copy) so you can compare each original value with its scrambled counterpart and confirm that the scrambling works as expected.

Step 6: Monitor and maintain the scrambled data

To monitor and maintain the scrambled data, regularly check the sensitive columns to ensure that they are still scrambled and that there are no vulnerabilities or breaches. You should also maintain the scrambling key and UDF to ensure that they remain up-to-date and effective.
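
One lightweight check, sketched below, is to scan for values that still look like plaintext; the regular expression assumes SSNs in the usual NNN-NN-NNNN format:

-- Rows counted here may still contain unscrambled SSNs
SELECT COUNT(*) AS suspicious_rows
FROM employee
WHERE ssn ~ '^[0-9]{3}-[0-9]{2}-[0-9]{4}$';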

Different Options for Scrambling Data in Redshift

Selecting a data scrambling technique involves balancing security levels, data sensitivity, and application requirements. Various general algorithms exist, each with unique pros and cons. To scramble data in Amazon Redshift, you can use the following Python code samples in conjunction with a library like psycopg2 to interact with your Redshift cluster. Before executing the code samples, you will need to install the psycopg2 library:


pip install psycopg2

Random

Utilizing a random number generator, the Random option quickly obscures data. However, because the random output has no repeatable relationship to the input, scrambled values can be neither validated nor recovered, which limits its robustness for long-term protection.


import random
import string
import psycopg2

def random_scramble(data):
    # Replace every character with a random letter or digit
    return ''.join(random.choice(string.ascii_letters + string.digits) for _ in data)

# Connect to your Redshift cluster
conn = psycopg2.connect(host='your_host', port='your_port', dbname='your_dbname', user='your_user', password='your_password')
cursor = conn.cursor()
# Fetch data from your table
cursor.execute("SELECT sensitive_column FROM your_table;")
rows = cursor.fetchall()

# Scramble the data
scrambled_values = [random_scramble(row[0]) for row in rows]

# Update each original value with its scrambled replacement
cursor.executemany(
    "UPDATE your_table SET sensitive_column = %s WHERE sensitive_column = %s;",
    [(scrambled, row[0]) for scrambled, row in zip(scrambled_values, rows)]
)
conn.commit()

# Close the connection
cursor.close()
conn.close()

Shuffle

The Shuffle option enhances security by rearranging data characters. However, it remains prone to brute-force attacks, despite being harder to reverse-engineer.


import random
import psycopg2

def shuffle_scramble(data):
    data_list = list(data)
    random.shuffle(data_list)
    return ''.join(data_list)

conn = psycopg2.connect(host='your_host', port='your_port', dbname='your_dbname', user='your_user', password='your_password')
cursor = conn.cursor()

cursor.execute("SELECT sensitive_column FROM your_table;")
rows = cursor.fetchall()

scrambled_values = [shuffle_scramble(row[0]) for row in rows]

cursor.executemany(
    "UPDATE your_table SET sensitive_column = %s WHERE sensitive_column = %s;",
    [(scrambled, row[0]) for scrambled, row in zip(scrambled_values, rows)]
)
conn.commit()

cursor.close()
conn.close()

Reversible

The Reversible method scrambles characters in a way that can be undone with a decryption key, posing a greater challenge to attackers while remaining vulnerable to brute-force attacks. We'll use the Caesar cipher as an example.


import psycopg2

def caesar_cipher(data, key):
    # Shift alphabetic characters by `key` positions; leave other characters unchanged
    encrypted = ""
    for char in data:
        if char.isalpha():
            shift = key % 26
            if char.islower():
                encrypted += chr((ord(char) - 97 + shift) % 26 + 97)
            else:
                encrypted += chr((ord(char) - 65 + shift) % 26 + 65)
        else:
            encrypted += char
    return encrypted

conn = psycopg2.connect(host='your_host', port='your_port', dbname='your_dbname', user='your_user', password='your_password')
cursor = conn.cursor()

cursor.execute("SELECT sensitive_column FROM your_table;")
rows = cursor.fetchall()

key = 5
encrypted_values = [caesar_cipher(row[0], key) for row in rows]
# To reverse later, apply caesar_cipher with the negated key (e.g., -5)
cursor.executemany(
    "UPDATE your_table SET sensitive_column = %s WHERE sensitive_column = %s;",
    [(encrypted, row[0]) for encrypted, row in zip(encrypted_values, rows)]
)
conn.commit()

cursor.close()
conn.close()

Custom

The Custom option enables users to create tailor-made algorithms to resist specific attack types, potentially offering superior security. However, the development and implementation of custom algorithms demand greater time and expertise.
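
As a sketch of what a custom algorithm might look like, the function below derives a deterministic scrambled value from a secret key using HMAC-SHA256; the function name and truncation scheme are illustrative choices, not a standard:

import hashlib
import hmac

def custom_scramble(data, key):
    # Keyed, deterministic scrambling: the same input and key always
    # produce the same output, but the output cannot be reversed
    digest = hmac.new(key.encode(), data.encode(), hashlib.sha256).hexdigest()
    return digest[:max(len(data), 16)]

print(custom_scramble('123-45-6789', 'MyScramblingKey'))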

Best Practices for Using Redshift Data Scrambling

There are several best practices that should be followed when using Redshift Data Scrambling to ensure maximum protection:

Use Unique Keys for Each Table

To limit the damage if a single key is compromised, each table should be scrambled with its own unique key. If you track per-table keys in a key-metadata table, you can enforce the one-key-per-table rule with a unique index:


CREATE UNIQUE INDEX idx_unique_key ON table_name (column_name);

Encrypt Sensitive Data Fields 

Sensitive data fields such as credit card numbers and social security numbers should be encrypted to provide an additional layer of security. Redshift has no built-in column-level ENCRYPT function, so field-level encryption is typically implemented with a user-defined function or performed client-side before loading. Assuming you have defined such a UDF, encrypting a credit card number field might look like this (the function name <inlineCode>encrypt_field</inlineCode> is illustrative):


SELECT encrypt_field('1234-5678-9012-3456', 'your_encryption_key_here');

Use Strong Encryption Algorithms

Strong encryption algorithms such as AES-256 should be used to provide the strongest protection. Redshift encrypts data at rest with AES-256, but this is configured at the cluster level rather than per column. For example, you can create an encrypted cluster with the AWS CLI (identifiers and credentials are placeholders):


aws redshift create-cluster \
    --cluster-identifier my-encrypted-cluster \
    --node-type dc2.large \
    --master-username admin \
    --master-user-password 'YourPassword1' \
    --encrypted \
    --kms-key-id your_kms_key_id_here

Control Access to Encryption Keys 

Access to encryption keys should be restricted to authorized personnel to prevent unauthorized access to sensitive data. You can achieve this by using AWS KMS (Key Management Service) to manage your encryption keys. Here's an example of how to grant decrypt-only access to a key using KMS in Python:


import boto3

# Create a KMS client
kms = boto3.client('kms')

key_id = 'your_key_id_here'
grantee_principal = 'arn:aws:iam::123456789012:user/jane'

# Grant the principal permission to use this key for decryption only
response = kms.create_grant(
    KeyId=key_id,
    GranteePrincipal=grantee_principal,
    Operations=['Decrypt']
)

print(response)

Regularly Rotate Encryption Keys 

Regular rotation of encryption keys ensures that a compromised key does not provide indefinite access to sensitive data. In AWS KMS, automatic annual rotation is enabled per key rather than through the key policy. Here's how to enable it with the AWS CLI:

aws kms enable-key-rotation --key-id your_key_id_here

You can verify the setting at any time with <inlineCode>aws kms get-key-rotation-status --key-id your_key_id_here</inlineCode>.

Turn on logging 

To track user access to sensitive data and identify any unwanted access, logging must be enabled. When you activate user activity logging in Amazon Redshift, all SQL commands executed on your cluster are logged, including queries that access sensitive data as well as the data-scrambling operations themselves. You can then examine these logs for unusual access patterns or suspicious activity.

User activity logging in Amazon Redshift is controlled by the <inlineCode>enable_user_activity_logging</inlineCode> parameter, which is set on the cluster's parameter group (with audit logging enabled on the cluster) rather than with an ALTER DATABASE statement. For example, with the AWS CLI:

aws redshift modify-cluster-parameter-group \
    --parameter-group-name my-parameter-group \
    --parameters ParameterName=enable_user_activity_logging,ParameterValue=true

Once logging is enabled, the logged queries can be retrieved from the <inlineCode>stl_query</inlineCode> system table. For instance, the SQL query shown below displays recent queries that touched a certain table (using the <inlineCode>employee</inlineCode> table from the earlier examples as an illustration):
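
SELECT query, userid, starttime, querytxt
FROM stl_query
WHERE querytxt ILIKE '%employee%'
ORDER BY starttime DESC;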

Monitor Performance 

Data scrambling is often a resource-intensive practice, so it’s good to monitor CPU usage, memory usage, and disk I/O to ensure your cluster isn’t being overloaded. In Redshift, you can use the <inlineCode>svl_query_summary</inlineCode> and <inlineCode>svl_query_report</inlineCode> system views to monitor query performance. You can also use Amazon CloudWatch to monitor metrics such as CPU usage and disk space.
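
For example, a quick way to check whether a scrambling query spilled to disk is to inspect <inlineCode>svl_query_summary</inlineCode> for the most recent query in your session; this is a minimal sketch:

-- Check timing, row counts, and disk spill for the last query in this session
SELECT query, maxtime, rows, is_diskbased
FROM svl_query_summary
WHERE query = pg_last_query_id();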

Establishing Backup and Disaster Recovery

In order to prevent data loss in the event of a disaster, backup and disaster recovery mechanisms should be put in place. Amazon Redshift offers several backup and recovery methods, including automated backups and manual snapshots. By default, automated snapshots are taken roughly every eight hours or after about 5 GB per node of data changes.

Moreover, you can always take a manual snapshot of your cluster. In the event of a failure or disaster, your cluster can be restored from these backups and snapshots. Snapshots are created through the console or the AWS CLI rather than SQL. For example, to take a manual snapshot:

aws redshift create-cluster-snapshot \
    --cluster-identifier my-cluster \
    --snapshot-identifier my-manual-snapshot

To restore a snapshot, create a new cluster from it with the <inlineCode>restore-from-cluster-snapshot</inlineCode> command. For example:


aws redshift restore-from-cluster-snapshot \
    --cluster-identifier new-cluster-name \
    --snapshot-identifier my-manual-snapshot

Frequent Review and Updates

To ensure that data scrambling procedures remain effective and up-to-date with the latest security requirements, it is crucial to consistently review and update them. This process should include examining backup and recovery procedures, encryption techniques, and access controls.

In Amazon Redshift, you can assess access controls by inspecting users and groups and their associated permissions in system catalog views such as <inlineCode>pg_user</inlineCode> and <inlineCode>pg_group</inlineCode>. It is essential to confirm that only authorized individuals have access to sensitive information.
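
A minimal starting point, sketched below, is to list users with elevated privileges so that each one can be confirmed as expected:

-- List users, flagging superusers and users who can create databases
SELECT usename, usesuper, usecreatedb
FROM pg_user
ORDER BY usename;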

To analyze encryption and encoding settings, use the <inlineCode>pg_catalog.pg_attribute</inlineCode> system catalog table, which lets you inspect the data type and compression encoding of each column in your tables. Ensure that sensitive data fields are protected with robust encryption methods, such as AES-256.
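
For a more readable view of the same information, Redshift's <inlineCode>pg_table_def</inlineCode> view surfaces each column's type and encoding; the table name below is illustrative:

-- Inspect column types and encodings for the employee table
SELECT "column", type, encoding
FROM pg_table_def
WHERE tablename = 'employee';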

The AWS CLI commands <inlineCode>aws backup list-backup-plans</inlineCode> and <inlineCode>aws backup list-backup-vaults</inlineCode> enable you to review your backup plans and vaults, as well as evaluate backup and recovery procedures. Make sure your backup and recovery procedures are properly configured and up-to-date.

Decrypting Data in Redshift

There are different options for decrypting data, depending on the encryption method used and the tools available. The decryption process mirrors encryption: usually a custom UDF is used to decrypt the data. Let's look at an example of decrypting data that was scrambled with a substitution cipher.

Step 1: Create a UDF with decryption logic for substitution


CREATE FUNCTION decrypt_substitution(ciphertext varchar) RETURNS varchar
IMMUTABLE AS $$
    alphabet = 'abcdefghijklmnopqrstuvwxyz'
    substitution = 'ijklmnopqrstuvwxyzabcdefgh'
    plaintext = ''
    for ch in ciphertext:
        index = substitution.find(ch)
        if index == -1:
            # Characters outside the substitution alphabet pass through unchanged
            plaintext += ch
        else:
            # Map the character back to its position in the original alphabet
            plaintext += alphabet[index]
    return plaintext
$$ LANGUAGE plpythonu;
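
Note that the next step assumes the encrypted rows were previously staged in <inlineCode>temp_table</inlineCode>. If they were not, a staging copy along these lines (column names are illustrative) would come first:

CREATE TEMP TABLE temp_table AS
SELECT column1, encrypted_column2, column3
FROM original_table;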

Step 2: Move the data back after truncating and applying the decryption function


TRUNCATE original_table;
INSERT INTO original_table (column1, decrypted_column2, column3)
SELECT column1, decrypt_substitution(encrypted_column2), column3
FROM temp_table;

In this example, encrypted_column2 is the encrypted version of column2 in the temp_table. The decrypt_substitution function is applied to encrypted_column2, and the result is inserted into the decrypted_column2 in the original_table. Make sure to replace column1, column2, and column3 with the appropriate column names, and adjust the INSERT INTO statement accordingly if you have more or fewer columns in your table.

Conclusion

Redshift data scrambling is an effective tool for additional data protection and should be considered as part of an organization's overall data security strategy. In this blog post, we looked at the importance of data protection and how data scrambling can be integrated effectively into the data warehouse. Then we covered the difference between data scrambling and data masking before diving into how to set up Redshift data scrambling.

Once you become accustomed to Redshift data scrambling, you can build on it with different scrambling algorithms and with best practices including encryption, logging, and performance monitoring. By adhering to these recommendations and adopting an efficient strategy, organizations can improve their data security posture management (DSPM) and reduce the risk of possible breaches.

<blogcta-big>

Veronica is a security researcher at Sentra. She brings a wealth of knowledge and experience as a cybersecurity researcher. Her main focus is researching major cloud provider services and AI infrastructures for data-related threats and techniques.
