Ariel Rimon
Ariel is a Software Engineer on Sentra’s Data Engineering team, where he works on building scalable systems for securing and governing sensitive data. He brings deep experience from previous roles at Unit 8200, Aidoc, and eToro, with a strong background in data-intensive and production-grade systems.
Ariel's Data Security Posts


How Modern Data Security Discovers Sensitive Data at Cloud Scale
Modern cloud environments contain vast amounts of data stored in object storage services such as Amazon S3, Google Cloud Storage, and Azure Blob Storage. In large organizations, a single data store can contain billions (or even tens of billions) of objects. In this reality, traditional approaches that rely on scanning every file to detect sensitive data quickly become impractical.
Full object-level inspection is expensive, slow, and difficult to sustain over time. It increases cloud costs, extends onboarding timelines, and often fails to keep pace with continuously changing data. As a result, modern data security platforms must adopt more intelligent techniques to build accurate data inventories and sensitivity models without scanning every object.
Why Object-Level Scanning Fails at Scale
Object storage systems expose data as individual objects, but treating each object as an independent unit of analysis does not reflect how data is actually created, stored, or used.
In large environments, scanning every object introduces several challenges:
- Cost amplification from repeated content inspection at massive scale
- Long time to actionable insights during the first scan
- Operational bottlenecks that prevent continuous scanning
- Diminishing returns, as many objects contain redundant or structurally identical data
The goal of data discovery is not exhaustive inspection, but rather accurate understanding of where sensitive data exists and how it is organized.
The Dataset as the Correct Unit of Analysis
Although cloud storage presents data as individual objects, most data is logically organized into datasets. These datasets often follow consistent structural patterns such as:
- Time-based partitions
- Application or service-specific logs
- Data lake tables and exports
- Periodic reports or snapshots
For example, the following objects are separate files but collectively represent a single dataset:
logs/2026/01/01/app_events_001.json
logs/2026/01/02/app_events_002.json
logs/2026/01/03/app_events_003.json
While these objects differ by date, their structure, schema, and sensitivity characteristics are typically consistent. Treating them as a single dataset enables more accurate and scalable analysis.
Analyzing Storage Structure Without Reading Every File
Modern data discovery platforms begin by analyzing storage metadata and object structure, rather than file contents.
This includes examining:
- Object paths and prefixes
- Naming conventions and partition keys
- Repeating directory patterns
- Object counts and distribution
By identifying recurring patterns and natural boundaries in storage layouts, platforms can infer how objects relate to one another and where dataset boundaries exist. This analysis does not require reading object contents and can be performed efficiently at cloud scale.
Configurable by Design
Sampling can be disabled for specific data sources, and the dataset grouping algorithm can be adjusted by the user. This allows teams to tailor the discovery process to their environment and needs.
Automatic Grouping into Dataset-Level Assets
Using structural analysis, objects are automatically grouped into dataset-level assets. Clustering algorithms identify related objects based on path similarity, partitioning schemes, and organizational patterns. This process requires no manual configuration and adapts as new objects are added. Once grouped, these datasets become the primary unit for further analysis, replacing object-by-object inspection with a more meaningful abstraction.
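As a concrete illustration of this grouping step, the sketch below collapses volatile path segments (dates and sequence numbers) into placeholders, then groups objects that share the resulting signature. This is a deliberately simplified heuristic, not Sentra's actual clustering algorithm; the log keys are the example paths from earlier, and the report path is invented.

```python
import re
from collections import defaultdict

def dataset_signature(key: str) -> str:
    """Collapse volatile path segments (dates, sequence numbers) into a
    placeholder so objects from the same dataset share one signature.
    A toy heuristic, not a production clustering method."""
    return re.sub(r"\d+", "<n>", key)

def group_objects(keys):
    """Group object keys into candidate dataset-level assets by signature."""
    datasets = defaultdict(list)
    for key in keys:
        datasets[dataset_signature(key)].append(key)
    return dict(datasets)

keys = [
    "logs/2026/01/01/app_events_001.json",
    "logs/2026/01/02/app_events_002.json",
    "logs/2026/01/03/app_events_003.json",
    "reports/monthly/summary_2025.csv",  # invented second dataset
]
groups = group_objects(keys)
# The three log objects share the signature
# "logs/<n>/<n>/<n>/app_events_<n>.json" and form one dataset asset.
```

Note that only object keys are consumed here: the grouping runs entirely on listing metadata, with no object contents read.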
Representative Sampling for Sensitivity Inference
After grouping, sensitivity analysis is performed using representative sampling. Instead of inspecting every object, the platform selects a small, statistically meaningful subset of files from each dataset.
Sampling strategies account for factors such as:
- Partition structure
- File size and format
- Schema variation within the dataset
By analyzing these samples, the platform can accurately infer the presence of sensitive data across the entire dataset. This approach preserves accuracy while dramatically reducing the amount of data that must be scanned.
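A minimal, hypothetical sampler along these lines might keep the largest object from each partition, so the sample spans every partition while favoring content-rich files. Real strategies also weigh file format and schema variation; the tuples and sizes below are invented.

```python
from collections import defaultdict

def representative_sample(objects, per_partition=1):
    """Pick a small, partition-aware sample from one dataset.
    `objects` are (key, partition, size_bytes) tuples; this toy
    strategy keeps the largest file(s) in each partition."""
    by_partition = defaultdict(list)
    for obj in objects:
        by_partition[obj[1]].append(obj)
    sample = []
    for _, members in sorted(by_partition.items()):
        members.sort(key=lambda o: o[2], reverse=True)  # largest first
        sample.extend(members[:per_partition])
    return sample

objects = [
    ("logs/2026/01/01/a.json", "2026/01/01", 10_000),
    ("logs/2026/01/01/b.json", "2026/01/01", 500),
    ("logs/2026/01/02/c.json", "2026/01/02", 7_000),
]
picked = representative_sample(objects)
# Two files scanned instead of three: one per partition.
```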
Handling Non-Standard Storage Layouts
In some environments, storage layouts may follow unconventional or highly customized naming schemes that automated grouping cannot fully interpret. In these cases, manual grouping provides additional precision. Security analysts can define logical dataset boundaries, often supported by LLM-assisted analysis to better understand complex or ambiguous structures. Once defined, the same sampling and inference mechanisms are applied, ensuring consistent sensitivity assessment even in edge cases.
Scalability, Cost, and Operational Impact
By combining structural analysis, grouping, and representative sampling, this approach enables:
- Scalable data discovery across millions or billions of objects
- Predictable and significantly reduced cloud scanning costs
- Faster onboarding and continuous visibility as data changes
- High-confidence sensitivity models without exhaustive inspection
This model aligns with the realities of modern cloud environments, where data volume and velocity continue to increase.
From Discovery to Classification and Continuous Risk Management
Dataset-level asset discovery forms the foundation for scalable classification, access governance, and risk detection. Once assets are defined at the dataset level, classification becomes more accurate and easier to maintain over time. This enables downstream use cases such as identifying over-permissioned access, detecting risky data exposure, and managing AI-driven data access patterns.
Applying These Principles in Practice
Platforms like Sentra apply these principles to help organizations discover, classify, and govern sensitive data at cloud scale, without relying on full object-level scans. By focusing on dataset-level discovery and intelligent sampling, Sentra enables continuous visibility into sensitive data while keeping costs and operational overhead under control.
<blogcta-big>

Cloud Security 101: Essential Tips and Best Practices
Cloud security in 2026 is about protecting sensitive data, identities, and workloads across increasingly complex cloud and multi-cloud environments. As organizations continue moving critical systems to the cloud, security challenges have shifted from basic perimeter defenses to visibility gaps, identity risk, misconfigurations, and compliance pressure. Following proven cloud security best practices helps organizations reduce risk, prevent data exposure, and maintain continuous compliance as cloud environments scale and evolve.
Cloud Security 101
At its core, cloud security aims to protect the confidentiality, integrity, and availability of data and services hosted in cloud environments. This requires a clear grasp of the shared responsibility model, where cloud providers secure the underlying physical infrastructure and core services, while customers remain responsible for configuring settings, protecting data and applications, and managing user access.
Understanding how different service models affect your level of control is crucial:
- Software as a Service (SaaS): Provider manages most security controls; you manage user access and data
- Platform as a Service (PaaS): Shared responsibility for application security and data protection
- Infrastructure as a Service (IaaS): You control most security configurations, from OS to applications
Modern cloud security demands cloud-native strategies and automation. Leveraging tools like infrastructure as code, Cloud Security Posture Management (CSPM), and Cloud Workload Protection Platforms helps organizations keep pace with the dynamic, scalable nature of cloud environments. Integrating security into the development process through a "shift left" approach enables teams to detect and remediate vulnerabilities early, before they reach production.
Cloud Security Tips for Beginners
For those new to cloud security, starting with foundational practices builds a strong defense against common threats.
Control Access with Strong Identity Management
- Use multi-factor authentication on every login to add an extra layer of security
- Apply the principle of least privilege by granting users and applications only the permissions they need
- Implement role-based access control across your cloud environment
- Regularly review and audit identity and access policies
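To make least privilege concrete, here is a minimal policy document in the general shape AWS IAM uses, together with a tiny lint that flags wildcard grants. The bucket name and ARN are invented, and the check is a sketch, not a substitute for a real policy analyzer.

```python
# Hypothetical least-privilege policy: read-only access to one prefix.
POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-app-bucket/reports/*",
        }
    ],
}

def wildcard_grants(policy):
    """Flag Allow statements that grant '*' actions or resources."""
    findings = []
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if any(a == "*" or a.endswith(":*") for a in actions):
            findings.append("wildcard action")
        if stmt.get("Resource") == "*":
            findings.append("wildcard resource")
    return findings

issues = wildcard_grants(POLICY)  # clean: the policy is narrowly scoped
```

A lint like this fits naturally into the regular review cadence from the last bullet: run it over every policy in the account and triage anything it flags.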
Secure Your Cloud Configurations
Regularly audit your cloud settings and use automated tools like CSPM to continuously scan for misconfigurations and risky exposures. Protecting sensitive data requires encrypting information both at rest and in transit using strong standards such as AES-256, ensuring that even if data is intercepted, it remains unreadable. Follow proper key management practices by regularly rotating keys and avoiding hard-coded credentials.
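A CSPM-style configuration audit can be sketched as a rule pass over configuration records. The fields below (`public_read`, `encryption_at_rest`) are simplified stand-ins for real provider settings, not an actual cloud API:

```python
def audit_buckets(buckets):
    """Flag buckets that are publicly readable or unencrypted at rest."""
    findings = []
    for b in buckets:
        if b.get("public_read"):
            findings.append((b["name"], "public read access"))
        if not b.get("encryption_at_rest"):
            findings.append((b["name"], "no at-rest encryption"))
    return findings

# Invented example configuration records.
buckets = [
    {"name": "internal-logs", "public_read": False, "encryption_at_rest": True},
    {"name": "marketing-site", "public_read": True, "encryption_at_rest": False},
]
findings = audit_buckets(buckets)
# Only the misconfigured bucket is flagged, once per violated rule.
```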
Monitor and Detect Threats Continuously
- Consolidate logs from all cloud services into a centralized system
- Set up real-time monitoring with automated alerts to quickly identify unusual behavior
- Employ behavioral analytics and threat detection tools to continuously assess your security posture
- Develop, document, and regularly test an incident response plan
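As a toy stand-in for the behavioral analytics mentioned above, the sketch below flags identities whose event volume is far above the mean. The actor names and threshold are invented; real systems build much richer per-identity baselines.

```python
from collections import Counter

def unusual_actors(events, threshold=2.0):
    """Flag identities whose event count exceeds `threshold` times the
    mean across all actors. A deliberately naive anomaly rule."""
    counts = Counter(e["actor"] for e in events)
    mean = sum(counts.values()) / len(counts)
    return sorted(a for a, c in counts.items() if c > threshold * mean)

# Synthetic event log: one service account is far noisier than the rest.
events = (
    [{"actor": "alice"}] * 2
    + [{"actor": "bob"}] * 3
    + [{"actor": "svc-backup"}] * 40
)
flagged = unusual_actors(events)
```

In practice a rule like this would run over the centralized log stream from the first bullet and feed the automated alerting from the second.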
Security Considerations in Cloud Computing
Before adopting or expanding cloud computing, organizations must evaluate several critical security aspects. First, clearly define which security controls fall under the provider's responsibility versus your own. Review contractual commitments, service level agreements, and compliance with data privacy regulations to ensure data sovereignty and legal requirements are met.
Data protection throughout its lifecycle is paramount. Evaluate how data is collected, stored, transmitted, and protected with strong encryption both in transit and at rest. Establish robust identity and access controls, including multi-factor authentication and role-based access, to guard against unauthorized access.
Conducting a thorough pre-migration security assessment is essential:
- Inventory workloads and classify data sensitivity
- Map dependencies and simulate attack vectors
- Deploy CSPM tools to continuously monitor configurations
- Apply Zero Trust principles—always verify before granting access
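The "classify data sensitivity" step in the checklist above can be sketched as a rule-based pass over sampled text. The two regex rules are illustrative only; production classifiers combine many detectors with validation logic.

```python
import re

# Toy sensitivity rules -- illustrative, not exhaustive or precise.
RULES = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def classify(text):
    """Return the set of sensitive data types detected in a text sample."""
    return {label for label, rx in RULES.items() if rx.search(text)}

labels = classify("Contact jane.doe@example.com, SSN 123-45-6789")
# Both detectors fire on this synthetic sample.
```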
Finally, evaluate the provider's internal security measures such as vulnerability management, routine patching, security monitoring, and incident response capabilities. Ensure that both the provider's and your organization's incident response and disaster recovery plans are coordinated, guaranteeing business continuity during security events.
Cloud Security Policies
Organizations should implement a comprehensive set of cloud security policies that cover every stage of data and workload protection.
| Policy Type | Key Requirements |
|---|---|
| Data Protection & Encryption | Classify data (public, internal, confidential, sensitive) and enforce encryption standards for data at rest and in transit; define key management practices |
| Access Control & Identity Management | Implement role-based access controls, enforce multi-factor authentication, and regularly review permissions to prevent unauthorized access |
| Incident Response & Reporting | Establish formal processes to detect, analyze, contain, and remediate security incidents with clearly defined procedures and communication guidelines |
| Network Security | Define secure architectures including firewalls, VPNs, and native cloud security tools; restrict and monitor network traffic to limit lateral movement |
| Disaster Recovery & Business Continuity | Develop strategies for rapid service restoration including regular backups, clearly defined roles, and continuous testing of recovery plans |
| Governance, Compliance & Auditing | Define program scope, specify roles and responsibilities, and incorporate continuous assessments using CSPM tools to enforce regulatory compliance |
Cloud Computing and Cyber Security
Cloud computing fundamentally shifts cybersecurity away from protecting a single, static perimeter toward securing a dynamic, distributed environment. Traditional practices that once focused on on-premises defenses, such as firewalls and isolated data centers, must now adapt to an infrastructure where applications and data are continuously deployed and managed across multiple platforms.
Security responsibilities are now shared between cloud providers and client organizations. Providers secure the core physical and virtual components, while clients must focus on configuring services effectively, managing identity and access, and monitoring for vulnerabilities. This dual responsibility model demands clear communication and proactive management to prevent issues like misconfigurations or exposure of sensitive data.
The cloud's inherent flexibility and rapid scaling require automated and adaptive security measures. Traditional manual monitoring can no longer keep pace with the speed at which applications and resources are provisioned or updated. Organizations are increasingly relying on AI-driven monitoring, multi-factor authentication, machine learning, and other advanced techniques to continuously detect and remediate threats in real time.
Cloud environments expand the attack surface by eliminating the traditional network boundary. With data distributed across multiple redundant sites and accessed via numerous APIs, new vulnerabilities emerge that require robust identity- and data-centric protections. Security measures must now encompass everything from strict encryption and access controls to comprehensive logging and incident response strategies that address the unique risks of multi-tenant and distributed architectures. For additional insights on protecting your cloud data, visit our guide on cloud data protection.
Securing Your Cloud Environment with AI-Ready Data Governance
As enterprises increasingly adopt AI technologies in 2026, securing sensitive data while maintaining complete visibility and control has become a critical challenge. Sentra's cloud-native data security platform addresses these challenges by delivering AI-ready data governance and compliance at petabyte scale. Unlike traditional approaches that require data to leave your environment, Sentra discovers and governs sensitive data inside your own infrastructure, ensuring data never leaves your control.
Cost Savings: By eliminating shadow and redundant, obsolete, or trivial (ROT) data, Sentra not only secures your organization for the AI era but also typically reduces cloud storage costs by approximately 20%.
The platform enforces strict data-driven guardrails while providing complete visibility into your data landscape: where sensitive data lives, how it moves, and who can access it. This "in-environment" architecture replaces opaque data sprawl with a regulator-friendly system that maps data movement and prevents unauthorized AI access, enabling enterprises to confidently adopt AI technologies without compromising security or compliance.
Implementing effective cloud security tips requires a holistic approach that combines foundational practices with advanced strategies tailored to your organization's unique needs. From understanding the shared responsibility model and securing configurations to implementing robust access controls and continuous monitoring, each element plays a vital role in protecting your cloud environment. As we move further into 2026, the integration of AI-driven security tools, automated governance, and comprehensive data protection measures will continue to define successful cloud security programs. By following these cloud security tips and maintaining a proactive, adaptive security posture, organizations can confidently leverage the benefits of cloud computing while minimizing risk and ensuring compliance with evolving regulatory requirements.
<blogcta-big>
