Why You Must Protect Sensitive Data in Cloud and Data Platform Environments

Learn how to protect sensitive data in cloud environments (AWS, Azure, GCP) and managed platforms, including why traditional encryption and BYOK fall short, how BYOE works, and how to achieve true data security through separation of control.

Executive Summary

Enterprises have moved critical systems to cloud infrastructure (AWS, Azure, GCP) and consolidated sensitive data into centralized platforms such as Snowflake, Databricks, BigQuery, and Redshift. These environments now power applications, analytics, and AI. They also introduce a structural security problem that most organizations underestimate.

Sensitive data is no longer just stored in the cloud. It is continuously processed inside execution environments that organizations do not fully control.

Encryption is widely assumed to solve this. It does not.

In most architectures, data is encrypted at rest, keys are managed through cloud KMS or BYOK models, and the platform invokes those keys during execution. The moment data is queried, joined, or processed, it is transformed into usable form inside the same environment. The system behaves exactly as designed, and in doing so, becomes capable of revealing the data it is supposed to protect.

This is not a failure of cryptography. It is a failure of architecture.

The issue is simple. If the same environment can store data, access keys, and execute decryption, then that environment is a point of compromise.

This problem is most acute in cloud and managed environments where organizations still control data architecture but not execution, including CSP-based applications, integrator-operated systems such as Infosys-managed platforms, and modern data platforms. It is not primarily a SaaS problem. It is an execution control problem.

Regulators are beginning to reflect this shift. The distinction is no longer just key ownership (BYOK). It is whether the environment processing the data can independently reveal it. That distinction separates compliance posture from actual security.

This paper explains why this condition exists, how it manifests in real systems, and what it takes to remove it.

The Real Scope of the Problem

This paper is intentionally focused on environments where the organization still has meaningful control over architecture and data handling. That includes applications deployed on AWS, Azure, or GCP, managed or integrator-operated platforms such as Infosys-managed systems, and modern analytics or AI data platforms such as Snowflake and Databricks.

These are the environments that matter because they sit in an uncomfortable middle ground. They are not pure infrastructure in the old sense, because the organization does not directly own or operate every layer. But they are not fully closed SaaS either, because the organization still makes meaningful design decisions about how data moves, how it is queried, how users interact with it, and where controls are applied.

That middle ground creates false confidence. Teams think, correctly, that they control schemas, queries, pipelines, and access policies. They then infer, incorrectly, that this means they control the exposure of sensitive data during execution. They do not.

In a cloud-native application or data platform, logical control and execution control are different things. The customer defines intent. The platform executes that intent. During execution, the platform must gain access to usable data. The system, not the customer, becomes the point at which sensitive data is transformed from protected form into plaintext.

That is why this problem is so dangerous. It hides inside normal system behavior.

Why Traditional Encryption Does Not Remove the Risk

Most organizations start from a familiar mental model. Encrypt the data, control the keys, enforce role-based access, and the problem is solved. That model works well for protecting data at rest or in transit. It does not fully work when the same environment that stores the data is also responsible for using it.

The reason is straightforward. Computation requires data to be usable. Queries cannot run on ciphertext unless the system is specifically designed for that mode of operation. Dashboards, joins, machine learning pipelines, ETL workflows, customer-facing applications, and reporting engines all expect the platform to access usable values at some point during processing.

This means that the execution environment needs some combination of the following:

  1. access to the protected data,
  2. access to a key or key usage path,
  3. the ability to decrypt or transform the data into usable form.

Once those three conditions exist inside the same environment, encryption stops being a control against that environment. It remains a control against outsiders or unauthorized storage access, but it no longer protects data from the system that is legitimately designed to use it.
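The convergence of the three conditions can be made concrete with a minimal sketch. Everything here is illustrative: the class names are hypothetical, and the HMAC-based XOR "cipher" is a toy stand-in for real encryption, used only to show control flow. The point is that once one environment holds stored ciphertext, a key usage path, and a decrypt-on-query code path, any actor who can execute through it obtains plaintext without attacking the cryptography at all.

```python
import hashlib
import hmac


def keystream(key: bytes, n: int) -> bytes:
    # Toy keystream for illustration only -- NOT production cryptography.
    out, counter = b"", 0
    while len(out) < n:
        out += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:n]


def xor(data: bytes, key: bytes) -> bytes:
    # XOR against the keystream; applying it twice round-trips the data.
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


class ExecutionEnvironment:
    """One environment holding all three capabilities at once."""

    def __init__(self, key: bytes):
        self._key = key        # condition 2: key (or key usage path)
        self._storage = {}     # condition 1: access to the protected data

    def store(self, record_id: str, plaintext: bytes) -> None:
        self._storage[record_id] = xor(plaintext, self._key)

    def query(self, record_id: str) -> bytes:
        # condition 3: the normal query path decrypts, exactly as designed
        return xor(self._storage[record_id], self._key)


env = ExecutionEnvironment(key=b"platform-held-key")
env.store("acct-1", b"PAN=4111111111111111")

# Any actor able to run code through the environment -- a privileged user,
# a compromised notebook, a support tool -- gets plaintext via the front door:
print(env.query("acct-1"))  # b'PAN=4111111111111111'
```

Note that the data at rest is genuinely encrypted; the exposure comes entirely from the co-location of storage, key access, and decryption in one trust zone.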

That is why so many cloud security conversations get stuck in the wrong place. They focus on whether data is encrypted, when the more important question is whether the execution environment can independently reveal it.

BYOK, BYOE, and the Difference Between Ownership and Separation

This distinction is why regulatory language around cloud security has become more nuanced. A useful way to frame the issue is the difference between key ownership and control separation.

BYOK improves ownership. The enterprise originates or controls the keys, which is helpful for governance, audit posture, lifecycle management, and in some cases revocation. But in most implementations, BYOK does not move decryption outside the cloud or data platform execution path. The platform can still use those keys, directly or indirectly, to process data. In practical terms, the execution environment still has access to both the data and the decryption mechanism.

BYOE points in a stronger direction. Encrypting data before it enters the cloud or platform begins to create actual separation. The cloud provider or platform no longer receives plaintext by default. But even BYOE can stop short if decryption is later reintroduced inside the same environment through application workflows, API mediation, or loosely governed service paths.

The lesson is simple. Owning the keys is not the same thing as preventing the platform from using them. And moving encryption earlier in the data flow is not the same thing as guaranteeing that plaintext never becomes available inside the environment.

The only model that fully changes the risk equation is one in which the environment processing the data cannot independently decrypt it.
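What that separation looks like structurally can be sketched as follows. The names (`KeyAuthority`, `Platform`) and the toy HMAC-XOR cipher are hypothetical illustrations, not any vendor's API: the essential property is that the reveal decision lives in a customer-operated service outside the platform, so the platform holds ciphertext but has no independent path to plaintext.

```python
import hashlib
import hmac


def keystream(key: bytes, n: int) -> bytes:
    # Toy keystream for illustration only -- NOT production cryptography.
    out, counter = b"", 0
    while len(out) < n:
        out += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:n]


def xor(data: bytes, key: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


class KeyAuthority:
    """Customer-operated reveal service. It runs OUTSIDE the platform;
    the platform never holds the key, only ciphertext."""

    def __init__(self, key: bytes, policy: dict):
        self._key = key
        self._policy = policy  # identity -> set of permitted purposes

    def encrypt(self, plaintext: bytes) -> bytes:
        return xor(plaintext, self._key)

    def reveal(self, ciphertext: bytes, identity: str, purpose: str) -> bytes:
        if purpose not in self._policy.get(identity, set()):
            raise PermissionError(f"{identity} may not reveal for {purpose!r}")
        return xor(ciphertext, self._key)


class Platform:
    """The cloud or data platform: stores and moves ciphertext,
    but cannot independently decrypt it."""

    def __init__(self):
        self.storage = {}


authority = KeyAuthority(b"customer-held-key",
                         policy={"payments-svc": {"settlement"}})
platform = Platform()
platform.storage["acct-1"] = authority.encrypt(b"PAN=4111111111111111")

# A reveal now requires an explicit, externally checked identity and purpose;
# nothing inside the platform can produce plaintext on its own:
pt = authority.reveal(platform.storage["acct-1"], "payments-svc", "settlement")
```

Under this structure, a compromised workload inside the platform no longer inherits decryption power; it can only make reveal requests that the external authority evaluates and logs.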

Threat Modeling the System Properly

A useful threat model does not begin with a laundry list of generic attacker ideas. It begins with an explicit model of the system, its trust zones, and the boundary crossings that matter. Once the structure of the system is clear, the threat paths become much easier to reason about.

System Model

The system in scope is a modern cloud and data platform environment in which sensitive structured data is stored and used for operational or analytical purposes. The relevant components are the storage layer, the execution layer, the key control layer, and the user and service interfaces that trigger workloads.

The storage layer contains the protected records. The execution layer includes query engines, compute clusters, application runtimes, and processing jobs. The key control layer includes KMS services, external key providers, encryption policy services, or other mechanisms used to authorize transformation of data into usable form. Above all of that sit the interfaces through which users, applications, services, and administrators interact with the platform.

This relationship can be represented at a high level as follows:

| Layer | Function | Typical Examples | Security Relevance |
| --- | --- | --- | --- |
| Data Storage Layer | Stores protected records | S3, cloud databases, Snowflake storage, Delta Lake | Holds high-value data at scale |
| Execution Layer | Runs queries, jobs, joins, transforms, model pipelines | Databricks compute, Snowflake query engine, application runtime | Where plaintext is exposed during use |
| Key Control Layer | Authorizes or enables transformation into usable form | KMS, BYOK integration, external key service, policy engine | Determines whether decryption can occur |
| Access and Control Layer | Triggers workloads, administers systems, configures policies | SQL clients, APIs, notebooks, IAM roles, support tools | Defines who can act through the system |

The most important observation is that these layers are logically separate but operationally linked. During execution, they converge.

Assets

The threat model is about protecting specific things of value, not abstract “data.”

| Asset | Description | Why It Matters |
| --- | --- | --- |
| Sensitive Fields | PII, PAN, account numbers, balances, regulated financial records | Direct regulatory and business impact if exposed |
| Derived Sensitive Results | Query outputs, aggregates, joined datasets, model features | Often more revealing than source data alone |
| Key Usage Authority | Ability to invoke decryption or transformation operations | Equivalent to decryption power in many systems |
| Access Policies | Rules that determine which identities may access usable data | Weak policy turns control separation into fiction |
| Processing Context | Runtime memory, temporary files, intermediate tables, caches, logs | Common leakage points during legitimate execution |

These assets matter because attackers do not always need the original table. In many cases, intermediate outputs, cached results, enriched datasets, or repeated query access are just as damaging.

Trust Zones

The system must be divided into trust zones to show where responsibility and control actually differ.

| Trust Zone | Controlled By | What Lives There | Why the Zone Matters |
| --- | --- | --- | --- |
| Customer Logic Zone | Enterprise | Schemas, query intent, application code, business rules | Defines what should happen |
| Platform Execution Zone | CSP, platform provider, shared runtime | Query engine, compute nodes, application execution, temporary state | Defines what actually happens during processing |
| Key Authority Zone | Customer, cloud provider, or external control service | Key material, key invocation path, policy service | Determines whether data can become usable |
| Administrative / Operational Zone | Internal ops, vendor ops, support teams, automation | Debug access, notebooks, consoles, support tooling, orchestration systems | Common source of privileged exposure paths |

The trust model begins to break down when the Platform Execution Zone and the Key Authority Zone are effectively fused during runtime. That fusion can happen even if the customer “owns” the keys on paper.

Trust Boundaries

A threat model becomes useful when it shows the exact boundaries where risk changes.

| Boundary ID | Boundary Crossing | What Changes at This Point | Why It Is Critical |
| --- | --- | --- | --- |
| B1 | Protected data enters execution workflow | Stored data becomes available to processing engine | Processing context is created |
| B2 | Execution environment invokes key usage | System gains ability to transform data into usable form | Decryption authority is activated |
| B3 | Plaintext or usable values exist in runtime | Sensitive values become observable in memory, temp state, output buffers, logs, caches | Exposure becomes possible without “breaking” encryption |
| B4 | Results are returned or materialized | Sensitive or derived sensitive data exits core processing path | Amplifies blast radius through output channels |
| B5 | Administrative tools interact with runtime or output | Operators can inspect state, replay jobs, or observe results | Privileged access path becomes high impact |

These boundaries matter more than static architecture diagrams because they show where control is lost. Data is not compromised merely because it is stored in the cloud. It becomes vulnerable when the system crosses from protected storage into usable execution.

Threat Actors and Their Real Capabilities

A useful model also needs realistic attacker types. Not every actor needs to steal keys or compromise the entire stack. Many only need to exploit the fact that the system is already designed to reveal plaintext during normal operation.

| Threat Actor | Realistic Capabilities | What They Usually Need | Why They Are Dangerous |
| --- | --- | --- | --- |
| Privileged Internal User | Can run broad queries, access notebooks, inspect outputs | Legitimate platform access | Can extract large amounts of sensitive data without breaking controls |
| Platform Operator or Integrator | Can troubleshoot workloads, inspect execution state, access support tooling | Operational privileges | May observe or access data during processing |
| Cloud Administrator Equivalent | Can influence infrastructure, storage, snapshots, runtime state, or service behavior | Infra-level privilege or compromise | Can potentially access system behavior below normal app controls |
| Compromised Workload or Service Principal | Can execute code inside trusted environment | Application compromise, notebook compromise, stolen token | Can use the system as designed to obtain plaintext |
| External Attacker with Partial Access | Can pivot from app/API, abuse permissions, exfiltrate outputs | Access to workload, credentials, or misconfigured role | Often needs far less than full system takeover |

This is why “but our admins are trusted” is not a complete answer. The issue is not just intent. It is concentration of capability.

Threat Scenario Matrix

The following matrix is the core of the threat model. It ties together the system, the actors, the trust boundaries, and the actual failure modes.

| Scenario ID | Threat Scenario | Primary Actor | Boundary Crossed | What the Actor Exploits | Result | Why Encryption Alone Fails |
| --- | --- | --- | --- | --- | --- | --- |
| T1 | Legitimate query returns sensitive plaintext or sensitive derived results | Privileged internal user | B1, B2, B3, B4 | Normal query path and authorized execution | Exposure of raw or derived sensitive data | The platform decrypts during normal processing |
| T2 | Operator or integrator inspects runtime, temp state, or job outputs during support activity | Platform operator / integrator | B2, B3, B5 | Operational tooling and privileged access | Plaintext exposure during debugging or support | Data is already usable inside execution layer |
| T3 | Compromised notebook, job, or service principal runs inside trusted platform context | Compromised workload | B1, B2, B3, B4 | Existing execution permissions and platform trust | Bulk data extraction or silent exfiltration | Encryption does not protect against trusted runtime |
| T4 | Mis-scoped role or policy allows broad access to decrypted outputs | Internal user or compromised identity | B2, B4 | Over-permissioned access policy | Large-scale exposure through legitimate interfaces | Key ownership is irrelevant if policy allows broad use |
| T5 | Platform or cloud-level access observes memory, snapshots, caches, or intermediate state | Cloud or platform privileged actor | B3, B5 | Infra or operational visibility below app layer | Exposure of plaintext during processing | Plaintext must exist somewhere to enable computation |
| T6 | Sensitive output is materialized into downstream tables, features, dashboards, or logs | Internal user, service, or downstream system | B4 | Normal output channels and secondary storage | Persistent spread of sensitive information | Encryption at source does not constrain derived outputs |
| T7 | Attacker compromises application path that can request and return sensitive data | External attacker with partial foothold | B1, B2, B4 | Existing app logic and decryption workflow | Data theft without key theft | The app acts as the decryption proxy |

The pattern is consistent. The attacker does not usually need to “steal the keys” in the traditional sense. The attacker only needs access to the system or workflow that can already use them.

Threat Analysis and Architectural Findings

The value of the matrix is not the number of scenarios. It is the pattern it reveals.

First, the execution environment is the decisive control point. That is where protected data is transformed into usable data. If that environment can independently perform that transformation, then it becomes the point of compromise for nearly every realistic scenario.

Second, the system’s normal features are the attacker’s best tools. Queries, notebooks, dashboards, processing jobs, support consoles, and runtime diagnostics all exist for legitimate business reasons. That is precisely why they are so hard to defend once the platform has decryption authority.

Third, output paths are often more dangerous than source tables. Sensitive information rarely remains in one place. Once decrypted data is queried, joined, aggregated, or materialized, it proliferates into downstream results, temporary tables, feature stores, logs, dashboards, extracts, and caches. This means the blast radius is often much larger than teams expect.

Fourth, policy alone is not enough. Identity and access management are necessary, but they do not fix the core problem if the system itself can reveal plaintext broadly whenever policy allows it. A single over-broad entitlement, a compromised token, or an abused runtime becomes sufficient to expose data at scale.

The threat model therefore points to one architectural conclusion: as long as the same environment both accesses the data and can independently decrypt or transform it into usable form, that environment remains a systemic point of compromise.

What Security Properties Are Actually Required

The answer is not “more encryption” in the abstract. The answer is a system design that changes the threat model itself.

A secure architecture for cloud and data platforms needs to satisfy several specific properties.

First, data should enter shared environments in protected form. This reduces dependence on platform-native storage controls as the primary line of defense.

Second, the authority to reveal usable values should not be embedded in the same environment that stores and processes the data. That authority must be separated, tightly scoped, and externally governed.

Third, access to usable values should depend on explicit identity and policy decisions, not merely on the fact that a workload is running inside a trusted platform context.

Fourth, the architecture should minimize downstream spread by limiting when, where, and for whom sensitive values become available.

These principles can be expressed more concretely as follows:

| Required Security Property | Why It Matters | What It Prevents |
| --- | --- | --- |
| Separation of data from decryption authority | Removes unilateral exposure capability from platform | T1, T2, T3, T5, T7 |
| Identity- and policy-bound reveal control | Prevents generic platform context from granting plaintext access | T1, T3, T4, T7 |
| Scoped reveal with minimal blast radius | Limits downstream spread and output misuse | T4, T6 |
| Externalized control over usable value access | Ensures the platform cannot act as universal decryption engine | T2, T3, T5 |
| Reduced processing-time plaintext exposure | Shrinks runtime observation opportunities | T2, T5 |

This is where the conversation moves from compliance language to real architecture. The question is not merely whether the key is customer-owned. The question is whether the platform can still use that key path in ways that make plaintext broadly available.
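The "scoped reveal" property in particular can be sketched in a few lines. This is a hypothetical service, not a real product API; the vault dictionary stands in for decrypt-on-demand storage. The idea it demonstrates is that the purpose of a request, not just the requester's identity, determines how much of a value becomes usable, which directly shrinks the blast radius of any single entitlement.

```python
def mask_pan(pan: str) -> str:
    """Show only the last four digits of a card number."""
    return "*" * (len(pan) - 4) + pan[-4:]


class ScopedRevealService:
    """Hypothetical customer-side reveal service: each permitted purpose
    maps to a transformation that limits what is actually disclosed."""

    SCOPES = {
        "support": mask_pan,        # support staff see last four only
        "settlement": lambda v: v,  # payment processing sees the full value
    }

    def __init__(self, vault: dict, policy: dict):
        self._vault = vault      # stand-in for decrypt-on-demand storage
        self._policy = policy    # identity -> set of permitted purposes

    def reveal(self, identity: str, purpose: str, record_id: str) -> str:
        if purpose not in self._policy.get(identity, set()):
            raise PermissionError(f"{identity} may not reveal for {purpose!r}")
        return self.SCOPES[purpose](self._vault[record_id])


svc = ScopedRevealService(
    vault={"acct-1": "4111111111111111"},
    policy={"support-desk": {"support"}, "payments-svc": {"settlement"}},
)

print(svc.reveal("support-desk", "support", "acct-1"))  # ************1111
```

A mis-scoped support identity in this model exposes masked values, not the full field, which is exactly the containment of scenarios T4 and T6 described above.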

Why This Matters Most in Snowflake, Databricks, and Similar Platforms

The reason this issue is especially important in data platforms is that they centralize both value and access. These systems do not hold isolated application rows. They hold the institution’s aggregated analytical truth. Customer profiles, account history, transactions, model features, service interactions, and operational telemetry all come together in one place.

That concentration is what makes these platforms powerful. It is also what makes their execution environments so dangerous if they are allowed to function as universal decryption engines.

A compromise in a cloud-hosted application may expose one workflow or one table. A compromise in a central data platform can expose years of historical records, sensitive joins across domains, high-value derived insights, and AI or analytics artifacts that are even more revealing than the original data.

That is why treating Snowflake or Databricks as “just another encrypted system” is a mistake. They are not just systems of storage. They are systems of reveal.

Regulatory Perspective: MAS Cloud Advisory and the Limits of Traditional Encryption Models

Regulators are already moving in the direction this paper outlines. The Monetary Authority of Singapore (MAS), in its Cloud Advisory on Migration, Risk Management, and Data Security, makes it clear that financial institutions cannot assume that native cloud controls are sufficient simply because encryption is in place.

The guidance repeatedly emphasizes that sensitive data in cloud environments must be protected through a combination of controls, including encryption, tokenization, and strong key management. More importantly, it implicitly recognizes that how encryption is implemented matters just as much as whether it exists.

This becomes clear in MAS’s treatment of different key management and encryption models.

BYOK: Improved Governance, Not Separation

MAS identifies Bring Your Own Key (BYOK) as a model where the financial institution retains ownership and lifecycle control of cryptographic keys, while allowing those keys to be used within the cloud environment.

This improves governance in meaningful ways. It gives institutions control over:

  • key generation and rotation
  • revocation and lifecycle management
  • audit and compliance posture

However, BYOK does not change where decryption occurs.

In a typical BYOK implementation:

  • data remains stored in the cloud platform
  • the platform can request or invoke key usage
  • decryption occurs inside the execution environment

From an architectural standpoint, this means the same condition still exists:

the environment that processes the data can also decrypt it

Ownership of keys has changed. Control separation has not.

BYOE: A Step Toward Separation

MAS introduces Bring Your Own Encryption (BYOE) as a stronger model. In this approach, data is encrypted before it enters the cloud, and the encryption mechanism is not delegated to the cloud provider.

This begins to change the structure of the system.

Instead of relying on the cloud to protect data, the organization ensures that:

  • the cloud receives data in protected form
  • encryption is applied outside the execution environment
  • key material is not inherently exposed to the platform
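At the field level, this pattern can be sketched as follows. The field names, the key, and the HMAC-XOR "cipher" are all illustrative toys, not production cryptography; the structural point is that sensitive fields are encrypted before a record ever leaves the enterprise, while non-sensitive fields remain usable for partitioning and filtering inside the platform.

```python
import hashlib
import hmac


def keystream(key: bytes, n: int) -> bytes:
    # Toy keystream for illustration only -- NOT production cryptography.
    out, counter = b"", 0
    while len(out) < n:
        out += hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:n]


def xor(data: bytes, key: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


SENSITIVE = {"pan", "national_id"}


def protect_record(record: dict, key: bytes) -> dict:
    """Encrypt sensitive fields client-side, before upload to the cloud.
    Non-sensitive fields stay in the clear for partitioning and filtering."""
    return {
        field: xor(value.encode(), key).hex() if field in SENSITIVE else value
        for field, value in record.items()
    }


row = {"customer_id": "c-17", "region": "SG", "pan": "4111111111111111"}
protected = protect_record(row, key=b"enterprise-held-key")

# What the warehouse receives: usable operational fields, opaque PAN.
print(protected["region"])             # SG
print(protected["pan"] == row["pan"])  # False
```

The gap MAS flags still applies here: if application workflows later call back into the same environment to decrypt these fields, the separation achieved at ingest is quietly undone.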

At a high level, the progression MAS is pointing to can be summarized as follows:

| Model | What Changes | What Remains |
| --- | --- | --- |
| Native Cloud Encryption | Basic protection at rest | Cloud controls keys and decryption |
| BYOK | Customer owns keys | Cloud still performs decryption |
| BYOE | Encryption moves outside cloud | Decryption may still occur inside execution |

Where These Models Still Fall Short

Even with BYOE, most real-world implementations do not fully eliminate risk.

In practice:

  • applications and data platforms still need to process usable data
  • decryption is often reintroduced inside the runtime
  • key usage may still be accessible through system-integrated APIs

This means the execution environment can still function as a de facto decryption engine, even if encryption was applied earlier in the data flow.

The structural condition remains unchanged:

the system that processes the data can still reveal it

What MAS Is Actually Driving Toward

MAS does not explicitly prescribe a single architecture, but the direction is clear.

The evolution is not just:

  • from provider-managed keys to customer-managed keys
  • or from in-cloud encryption to pre-cloud encryption

It is toward true separation of control.

The real question MAS is pushing institutions to answer is:

Can any single environment, including the cloud or data platform, independently access and decrypt sensitive data?

If the answer is yes, then that environment remains a point of compromise, regardless of how strong the encryption appears on paper.

Why This Matters in Practice

This distinction becomes critical in environments such as AWS-based applications, Snowflake, and Databricks, where:

  • data is continuously processed, not just stored
  • decryption is part of normal execution
  • multiple actors can act through the same system

In these environments, encryption without separation does not prevent exposure. It simply defines how the system reveals data.

Summary

MAS guidance reinforces a key principle: Encryption is necessary, but it is not sufficient.

BYOK improves ownership.
BYOE improves placement.

But neither guarantees that the execution environment cannot act as a universal decryption point.

The only model that fully addresses the risk is one in which:

  • data can be processed by the platform
  • but the platform cannot independently reveal it

This is the difference between managing encryption and actually controlling data exposure.
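One established way to get "processable but not revealable" data is deterministic tokenization, sketched below with stdlib HMAC. The key name and data are illustrative. Because equal inputs produce equal tokens, the platform can still join, group, and aggregate, yet inverting a token requires a key held outside the platform.

```python
import hashlib
import hmac
from collections import defaultdict


def tokenize(value: str, key: bytes) -> str:
    # Deterministic keyed token: equal inputs yield equal tokens, so joins
    # and group-bys still work -- but inversion requires the external key.
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]


KEY = b"held-outside-the-platform"  # illustrative; never stored in the platform

payments = [
    {"cust": tokenize("alice", KEY), "amt": 40},
    {"cust": tokenize("bob", KEY), "amt": 10},
    {"cust": tokenize("alice", KEY), "amt": 5},
]

# The platform aggregates per customer without ever seeing a customer name.
totals = defaultdict(int)
for p in payments:
    totals[p["cust"]] += p["amt"]

print(totals[tokenize("alice", KEY)])  # 45
```

Deterministic tokens do leak equality patterns, so in practice they are reserved for fields where join and group semantics justify that trade-off; the broader principle stands, though: the analytics ran, and the platform never held a reveal capability.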

Conclusion

Cloud and modern data platforms have changed the risk model for sensitive data. The main problem is no longer whether stored data is encrypted. The main problem is whether the environment processing that data can independently reveal it.

Threat modeling makes the issue unambiguous. Across realistic actors and realistic scenarios, the same pattern appears again and again. Data enters an execution environment. That environment gains or invokes decryption authority. Plaintext or usable values become available during runtime. An actor who can act through that environment can access the data without breaking cryptography.

That is the structural weakness.

The path forward is not to abandon cloud or analytics platforms. It is to adopt an architecture in which those environments are no longer trusted to function as universal reveal points for sensitive data. Once that condition is removed, the threat model changes fundamentally. Until it is removed, encryption alone will remain an incomplete answer.

The difference between strong cloud security and weak cloud security is not the presence of encryption. It is whether the environment processing the data can expose it on its own.


© 2026 Ubiq Security, Inc. All rights reserved.