Concepts

Introduction to Concepts

As you embark on your journey with Ubiq, you’ll encounter various terms, methodologies, and concepts that form the backbone of our solution and data encryption in general. To help you better understand and navigate through these, we’ve created this “Concepts” section. Here, we’ve explained core ideas like structured and unstructured data, encryption keys, key rotation, and data re-keying, among others. These concepts will be referenced throughout our documentation and your user experience. Familiarizing yourself with them will enable you to leverage Ubiq to its full potential and ensure your data is encrypted, stored, and managed securely and efficiently. Happy exploring!

Structured Data

Structured data refers to any data that is organized and formatted in such a way that it's easily searchable by simple, straightforward search engine algorithms or other search operations.

At its core, structured data is data that is arranged according to a defined model or schema. It is typically tabular with columns and rows where each column represents a certain variable (like an ID, name, or timestamp), and each row corresponds to a certain record.

Common examples of structured data include relational (SQL) databases, where data is organized into tables, and CSV files, where data is presented in clear, delimited text format. Structured data is great for queries that need precise, complex conditionals because the schema is consistent across all records, allowing for accurate and speedy data retrieval.

While structured data is highly organized and easily searchable, it does come with limitations. One of the most common constraints developers face is the character-set and length restrictions placed on structured fields, which stem largely from how and where the data is stored.

For instance, if a database column is configured to only support 12 characters, any data that exceeds this limit cannot be stored in that column. This restriction necessitates careful planning and structuring of data, especially when dealing with larger text fields or unique identifiers.
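
To make this concrete, here is a minimal sketch using SQLite (chosen only because it ships with Python; the table and column names are hypothetical). The CHECK constraint plays the role of the 12-character limit described above:

```python
import sqlite3

# Hypothetical schema: customer_id is limited to 12 characters, so longer
# values are rejected at insert time.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customers (
        customer_id TEXT CHECK (length(customer_id) <= 12),
        full_name   TEXT
    )
""")

conn.execute("INSERT INTO customers VALUES (?, ?)", ("CUST00000001", "Alice"))      # 12 chars: accepted
try:
    conn.execute("INSERT INTO customers VALUES (?, ?)", ("CUST0000000001", "Bob"))  # 14 chars: rejected
except sqlite3.IntegrityError as exc:
    print("insert rejected:", exc)
```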

Let's delve into a few real-world examples of structured data, focusing on sensitive information that requires careful handling and robust security measures.

  1. Credit Card Information: This includes data like card number, cardholder name, expiry date, and CVV. Given the highly sensitive nature of this information, it is often stored in highly structured, encrypted formats. For example, a card number field in a database might have a specific character limit to accommodate a standard 16-digit credit card number.
  2. Customer Information: This is another form of structured, sensitive data which includes fields like name, address, contact number, and email. Each of these fields will have their own unique data and character limitations, depending on the system's specifications.
  3. Health Records: Medical records often contain structured data such as patient IDs, visit timestamps, diagnosis codes, treatment codes, and more. Each of these fields requires stringent structuring rules, including character limitations, to ensure accurate, secure, and efficient data management.

In each of these cases, structured data allows for easy data querying and manipulation while also facilitating strict control over data format and size, which is especially important when handling sensitive data.

Due to the nature of structured data, we recommend using structured data encryption, which leverages deterministic encryption techniques to protect sensitive data while keeping it searchable.
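
To illustrate what "deterministic" means here, the sketch below uses AES-SIV from the Python cryptography package purely as a stand-in: under the same key, the same plaintext always produces the same ciphertext, which is what makes exact-match lookups on encrypted values possible. (Ubiq's structured data encryption uses the format-preserving FF1 algorithm described later; this is not its implementation.)

```python
from cryptography.hazmat.primitives.ciphers.aead import AESSIV

# AES-SIV used without a nonce is deterministic: equal plaintexts under the same
# key yield equal ciphertexts, so an equality search against stored ciphertexts works.
key = AESSIV.generate_key(bit_length=512)
cipher = AESSIV(key)

ct1 = cipher.encrypt(b"123-45-6789", None)
ct2 = cipher.encrypt(b"123-45-6789", None)
assert ct1 == ct2                                    # same input, same output -> searchable

assert cipher.decrypt(ct1, None) == b"123-45-6789"   # and still reversible with the key
```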

Unstructured Data

Unstructured data refers to data that doesn't conform to a specific data model or isn't organized in a pre-defined way. This data can be either textual or non-textual and is generally more challenging to process, analyze, and interpret than structured data due to its irregular and complex formats.

Examples of unstructured data commonly found in a business setting include:

  1. Business Documents: Files like Word documents, PDFs, and Excel spreadsheets (particularly when used for non-tabular data or mixed content) represent unstructured data. For instance, a company policy document in a PDF format is unstructured data - while the content is valuable, it isn't readily searchable or analyzable without additional processing.
  2. Images and Multimedia: This category includes graphics used for business branding, images within reports, or promotional videos. While these files contain valuable information, they don't adhere to a conventional, structured data model.
  3. IoT Sensor Data: Internet of Things (IoT) devices generate a large amount of unstructured data. For example, a weather station might produce data about temperature, humidity, wind speed, and more. This data, while incredibly valuable for trend analysis and prediction, is considered unstructured because it doesn't adhere to a pre-defined structure. Typically, this data must undergo processing to transform it into a structured form for easier analysis.
  4. Medical Imaging Data: In the healthcare sector, medical images like X-rays, MRIs, and CT scans are non-textual unstructured data. These images are vital for diagnosis and treatment but aren't easily categorized or analyzed without specific tools and software.
  5. Audio/Video Files: Customer service call recordings or security camera footage are other examples of unstructured data. They can provide valuable insights but require specialized processing to transcribe or analyze the content.

While the processing and analysis of unstructured data might be challenging, with the right approach and tools, it can offer valuable insights that might not be captured with structured data alone.

Due to the nature of unstructured data, we recommend using unstructured data encryption, which leverages randomized encryption techniques to protect sensitive data.
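
As a contrast with the deterministic example above, here is a minimal sketch of randomized encryption using AES-256-GCM from the Python cryptography package (an illustration of the technique, not Ubiq's implementation). Because a fresh random nonce is used for every call, encrypting the same content twice produces two different ciphertexts:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
cipher = AESGCM(key)
document = b"contents of a PDF, image, or other file ..."

# A fresh random 96-bit nonce per encryption makes the output non-deterministic.
nonce1, nonce2 = os.urandom(12), os.urandom(12)
ct1 = cipher.encrypt(nonce1, document, None)
ct2 = cipher.encrypt(nonce2, document, None)

assert ct1 != ct2                                     # same file, different ciphertexts
assert cipher.decrypt(nonce1, ct1, None) == document  # still decrypts with the key and nonce
```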

Dataset and Dataset Group

A Dataset in the Ubiq UI is a logical structure representing data, classified into two categories:

  1. Structured: This refers to data stored in a database column of fixed length and type, such as names, addresses, or social security numbers.
  2. Unstructured: This encompasses files like audio, video, PDFs, text, etc., stored in unstructured data storage, such as AWS S3, Google Cloud Storage, or a Data Lake.

Datasets allow a flexible and logical representation of various data elements and types for encryption within an application.

A Dataset Group in the Ubiq UI visually clusters different Datasets sharing specific attributes, enabling efficient management and tracking. A Dataset can belong to multiple Dataset Groups. However, Datasets with the same name cannot coexist within a single Dataset Group.

Symmetric Encryption Algorithm

An encryption algorithm, specifically discussing symmetric encryption in this context, is a set of mathematical procedures that converts plaintext data into a scrambled ciphertext, thereby ensuring the data's confidentiality. The same key is used to both encrypt and decrypt data, ensuring that only those possessing the correct key can decipher the ciphertext back into its original plaintext. Symmetric encryption algorithms are a fundamental pillar of data security, particularly when transmitting data over insecure networks or storing sensitive information.

There are numerous symmetric encryption algorithms available, each with different strengths and considerations:

  1. AES-256-GCM*: The Advanced Encryption Standard (AES) with a 256-bit key size in Galois/Counter Mode (GCM) is a widely-used symmetric encryption algorithm. It offers strong encryption and includes built-in authentication, ensuring the integrity of both the encrypted data and any associated data (illustrated in the sketch after this list).

  2. FF1*: FF1 is a Format-Preserving Encryption (FPE) algorithm in which the output (the ciphertext) has the same format as the input (the plaintext). This is particularly valuable when the validity of data formats needs to be maintained, such as encrypting credit card numbers into other valid credit card numbers.

  3. Blowfish: Blowfish is a symmetric block cipher that can be used as a drop-in replacement for DES or IDEA. It takes a variable-length key, from 32 bits to 448 bits, making it ideal for both domestic and exportable use. Blowfish is known for its incredible speed and effectiveness.

  4. Twofish: Twofish is a symmetric key block cipher with a block size of 128 bits and key sizes up to 256 bits. It's related to the earlier Blowfish algorithm and was one of the five finalists of the Advanced Encryption Standard contest, but it was not selected for standardization. Twofish is considered to be among the fastest of encryption algorithms and is free for any use.

    *Currently supported by Ubiq
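
The sketch below illustrates the two properties called out for AES-256-GCM above: the same key performs both encryption and decryption, and the authentication tag protects the ciphertext plus any associated data. It uses the Python cryptography package and is an illustration only:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.exceptions import InvalidTag

key = AESGCM.generate_key(bit_length=256)   # one symmetric key for both directions
cipher = AESGCM(key)
nonce = os.urandom(12)

aad = b"record-id=42"                       # associated data: authenticated, not encrypted
ciphertext = cipher.encrypt(nonce, b"sensitive payload", aad)

# Decryption requires the same key, nonce, and untampered ciphertext/associated data.
assert cipher.decrypt(nonce, ciphertext, aad) == b"sensitive payload"

try:
    cipher.decrypt(nonce, ciphertext, b"record-id=43")   # altered associated data
except InvalidTag:
    print("integrity check failed, as expected")
```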

When using Ubiq, users can easily switch between different encryption algorithms as per their specific needs, just by updating a setting in our UI. This functionality means that no changes are required to their applications or previously encrypted data.

This ease of use is particularly valuable in the context of quantum readiness. As the world prepares for the advent of quantum computing, there is an urgent need to develop quantum-resistant algorithms, which are algorithms that remain secure against potential quantum computing threats. Once NIST approves these quantum-resistant algorithms, users of our encryption solution will be able to painlessly update their encryption settings, further safeguarding their data without disrupting their existing systems or workflows. This flexibility offers users peace of mind knowing they can readily adapt to evolving security standards and requirements.

Some additional thoughts:

Advanced Encryption Standard (AES) is currently the industry standard encryption algorithm used worldwide. It has key sizes of 128, 192, and 256 bits, with AES-256 providing the highest level of security. It's used in many protocols such as HTTPS, SSH, IPsec, and is even approved by the National Security Agency (NSA) for encrypting top-secret information.

ChaCha20 is a stream cipher that, along with its associated authenticated encryption construction (ChaCha20-Poly1305), is getting increased attention in applications due to its high speed and security. It's used in some versions of TLS and in secure protocols like WireGuard.

3DES (Triple DES) is an older encryption standard that applies the older DES algorithm three times to each data block. While still used in some systems, it's generally being phased out due to its lower security relative to newer algorithms and its relatively slow speed.

Blowfish and Twofish are both recognized symmetric key block cipher algorithms, known for their speed and effectiveness. However, the suitability and safety of these algorithms largely depend on the context and specific requirements of their usage. Let's delve into each one a bit more:

  1. Blowfish: Developed in 1993, Blowfish has been considered secure for many applications. However, it has a block size of 64 bits, which can present security concerns for applications requiring encryption of large amounts of data. Furthermore, better alternatives such as Twofish and AES are now available.
  2. Twofish: An evolution of Blowfish, Twofish has a block size of 128 bits, making it suitable for encryption of larger data sets. It was a finalist in the NIST's Advanced Encryption Standard (AES) contest, where it was well-regarded for its security and speed. It didn't win the contest, but it's still considered a robust and secure algorithm for many applications.

While these algorithms are secure in many cases, it's always important to consider the specific needs and requirements of your application. The Advanced Encryption Standard (AES), particularly with a 256-bit key size, has generally superseded these older algorithms and is currently the most widely accepted and secure standard for data encryption.

Primary Encryption Key

A primary encryption key, which is also commonly referred to as a master encryption key, root key, or key derivation key, is a critical component in a data encryption scheme. It is a cryptographic key that is used to encrypt other keys, usually referred to as data encryption keys or DEKs.

The primary encryption key sits at the top of the key hierarchy and is stored securely within our key management infrastructure inside of tamper-proof and FIPS 140-2 compliant hardware security modules (HSMs), in a completely separate location from the data and the data encryption keys it protects. Its purpose is to add an extra layer of security to the encryption infrastructure.

In a typical encryption process, the actual data is encrypted with a data encryption key. This key is then further encrypted with the primary encryption key. This layered approach provides two primary benefits:

  1. Enhanced Security: Since the data encryption key, which is directly used for encrypting and decrypting data, is itself encrypted with the primary encryption key, an extra layer of security is added. Even if a malicious actor gains access to the encrypted data and the data encryption key, they cannot decrypt the data without the primary encryption key.
  2. Key Management: Primary encryption keys simplify the process of key management. Instead of having to securely store and manage every data encryption key, only the primary encryption key needs to be stringently protected. Data encryption keys can be created as needed, and when they are no longer needed, they can be safely discarded without affecting other data encryption processes, since the primary encryption key remains secure and unchanged.

It’s important to note that a primary encryption key does NOT directly create other data encryption keys, but it plays a crucial role in their lifecycle.

Here's a simplified summary of how it works:

  1. Generation: Data encryption keys are generated using a cryptographically secure random process. The primary encryption key doesn't directly "create" these keys, but it provides the basis for their secure usage.
  2. Encryption: Once a data encryption key is generated, it's used to encrypt the actual data. The resulting encrypted data can only be decrypted using the same data encryption key.
  3. Key Encryption: After the data encryption key has done its job of encrypting the data, a primary encryption key or a public key is used to encrypt the data encryption key itself. This adds an extra layer of security. Now, even if someone were to gain unauthorized access to the encrypted data and the encrypted data encryption key, they would still need the primary encryption key or the private key to decrypt the data encryption key, and subsequently, the data.
  4. Storage: The encrypted data and the encrypted data encryption key are then stored, typically in a database or data store (depending on your use case). The primary encryption key is stored separately within our key management infrastructure and is not exportable or accessible, to prevent unauthorized access.
  5. Decryption: When the data needs to be decrypted, the primary encryption key is used to decrypt the data encryption key, which in turn is used to decrypt the actual data.

So while the primary encryption key doesn't directly create data encryption keys, it is critical in their secure usage. It's also essential in key management, as the system only needs to protect the primary encryption key stringently. The data encryption keys, once they are encrypted with the primary key, can be safely stored, even in less secure environments.
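
The five steps above can be sketched with two layers of AES-256-GCM using the Python cryptography package. This is purely illustrative: in Ubiq's design the primary key is held inside an HSM and is never exportable, whereas here a locally generated key stands in for it so the layering is visible:

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

primary_key = AESGCM.generate_key(bit_length=256)   # stand-in for the HSM-held primary key

# 1. Generation: a fresh data encryption key (DEK)
dek = AESGCM.generate_key(bit_length=256)

# 2. Encryption: the DEK encrypts the actual data
data_nonce = os.urandom(12)
ciphertext = AESGCM(dek).encrypt(data_nonce, b"patient record #1001", None)

# 3. Key encryption: the primary key encrypts ("wraps") the DEK
dek_nonce = os.urandom(12)
wrapped_dek = AESGCM(primary_key).encrypt(dek_nonce, dek, None)

# 4. Storage: ciphertext and wrapped DEK travel together; the primary key lives elsewhere
stored = {"ciphertext": ciphertext, "data_nonce": data_nonce,
          "wrapped_dek": wrapped_dek, "dek_nonce": dek_nonce}

# 5. Decryption: unwrap the DEK with the primary key, then decrypt the data with the DEK
recovered_dek = AESGCM(primary_key).decrypt(stored["dek_nonce"], stored["wrapped_dek"], None)
plaintext = AESGCM(recovered_dek).decrypt(stored["data_nonce"], stored["ciphertext"], None)
assert plaintext == b"patient record #1001"
```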

Data Encryption Key

A data encryption key (DEK) is a randomly generated key used to encrypt and decrypt data in the process of securing sensitive information. The primary role of the DEK is to convert plaintext data into ciphertext, rendering it unreadable without the correct key.

Data encryption keys are typically generated using a secure random process to ensure that they are as unpredictable as possible, increasing the security of the encryption. Here's a high-level view of how they're used:

  1. Generation: A data encryption key is generated. This process must be cryptographically secure to prevent the key from being predictable.
  2. Encryption: The DEK is used to encrypt the plaintext data. The resulting ciphertext is virtually impossible to convert back into plaintext without the correct DEK.
  3. Primary Key Encryption or Public Key: To further secure the DEK, a primary encryption key, also known as a master key or a root key, or a public key that is part of a public-private key pair may be used to encrypt the DEK. This creates an extra layer of security and ensures that even if someone gains access to the encrypted data and the encrypted DEK, they cannot decrypt the data without also having the primary encryption key or the private key.
  4. Storage: The encrypted DEK is stored alongside the encrypted data, often in a database or data store (depending on your use case).
  5. Decryption: When the encrypted data needs to be accessed, the process is reversed. The primary encryption key decrypts the DEK, which in turn is used to decrypt the data back into plaintext.

DEKs are central to the encryption process, transforming sensitive data into unreadable ciphertext and helping to ensure that even if unauthorized individuals gain access to the encrypted data, they cannot decipher it without the correct keys. Their lifecycle is closely tied to the primary encryption key, which provides an additional layer of security and simplifies key management in the encryption infrastructure.
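
As noted in step 1 above, the strength of a DEK rests entirely on the quality of its randomness. A brief sketch of acceptable sources in Python (illustration only):

```python
import os
import secrets
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

dek = AESGCM.generate_key(bit_length=256)   # library helper for a 256-bit key
dek_alt = os.urandom(32)                    # 32 bytes straight from the OS CSPRNG
dek_alt2 = secrets.token_bytes(32)          # stdlib wrapper around the same source

# A non-cryptographic PRNG (e.g. Python's `random` module) must never be used here,
# because its output is predictable once its internal state is known.
```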

Key Encryption Key

A Key Encryption Key (KEK) plays a pivotal role in encryption key management. In essence, a KEK’s primary purpose is to protect other encryption keys, primarily Data Encryption Keys (DEKs). The use of KEKs adds an extra layer of security to the encryption process.

There are two generally accepted approaches to “creating” KEKs:

  1. Asymmetric approach: use the public key in a public/private key pair as a KEK
  2. Symmetric approach: leverage a Hardware Security Module (HSM) to create a KEK

For illustrative purposes, consider a scenario where you have sensitive data encrypted with a DEK. The DEK, needed for future data decryption, must be stored or transmitted securely. Storing this DEK in plaintext exposes it to potential unauthorized access. To mitigate this risk, you could use an HSM to create a KEK that is then used to encrypt the DEK. In this setup, only systems authorized to access the HSM can use the KEK to decrypt the DEK and, subsequently, gain access to the data. Hence, even if an attacker were to get unauthorized access to the stored keys, they would still need access to the KEK in the HSM to utilize them. Please note, this is a generic illustration and not indicative of Ubiq's specific approach.
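
The asymmetric approach (item 1 above) can be sketched in software with RSA-OAEP from the Python cryptography package. The names and key sizes are illustrative, and this is a generic example rather than Ubiq's specific implementation:

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# The public half of a key pair acts as the KEK: anyone may wrap a DEK with it,
# but only the holder of the private key can unwrap it.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=3072)
public_key = private_key.public_key()

dek = AESGCM.generate_key(bit_length=256)

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

wrapped_dek = public_key.encrypt(dek, oaep)            # safe to store or transmit
assert private_key.decrypt(wrapped_dek, oaep) == dek   # recoverable only with the private key
```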

At Ubiq, we adapt to the customer's use case and use both symmetric and asymmetric encryption methods to protect DEKs both at rest within the customer's environment and during transit. Furthermore, we secure the transmission channel with Transport Layer Security (TLS) when transmitting between our backend infrastructure and the customer's environment.

Our approach employs symmetric and asymmetric encryption for DEK protection, coupled with TLS for secure transmission. This robust, dual-layered security approach safeguards the integrity and confidentiality of the DEKs throughout the entire process.

Key Rotation

Encryption key rotation is a vital security practice that involves periodically changing encryption keys. By frequently updating keys, you reduce the amount of data protected under any single key and limit the potential damage if a key is compromised.

Key rotation comes in two primary forms: primary encryption key rotation and data key rotation.

Primary Encryption Key Rotation

The primary encryption key, sometimes referred to as a root key, is the central key used to protect data keys. Rotating the primary key means generating a new primary key and re-encrypting existing data keys with the new primary key.

Primary key rotation doesn't involve re-encrypting the data itself but only the data keys, making the process relatively quick and less resource-intensive. The primary advantage is that even if a primary key is compromised, once it's rotated, the compromised key can no longer decrypt the data keys, effectively safeguarding data encrypted under those data keys.
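
A minimal sketch of that re-wrapping step, with AES-256-GCM standing in for the primary keys (in practice these would be HSM-held; the names are illustrative):

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def rotate_primary_key(wrapped_dek: bytes, dek_nonce: bytes,
                       old_primary: bytes, new_primary: bytes) -> tuple[bytes, bytes]:
    """Re-encrypt a stored DEK under a new primary key; the data ciphertext is untouched."""
    dek = AESGCM(old_primary).decrypt(dek_nonce, wrapped_dek, None)  # unwrap with the old key
    new_nonce = os.urandom(12)
    return AESGCM(new_primary).encrypt(new_nonce, dek, None), new_nonce
```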

Data Key Rotation

Data key rotation, on the other hand, involves re-encrypting the actual data. Each piece of data or a set of data is encrypted with a unique data key, which is then encrypted with the primary key. When a data key is rotated, a new key is generated, and the data is re-encrypted with this new key.

Data key rotation is more resource-intensive than primary key rotation, as it involves re-encrypting potentially large volumes of data. However, it can enhance security by limiting the amount of data accessible with a single key.
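
By contrast, data key rotation touches the data itself. A sketch of re-encrypting one record under a fresh DEK (illustrative names, AES-256-GCM as the cipher):

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def rotate_data_key(ciphertext: bytes, nonce: bytes,
                    old_dek: bytes, new_dek: bytes) -> tuple[bytes, bytes]:
    """Decrypt a record with its old DEK and re-encrypt it with a new DEK."""
    plaintext = AESGCM(old_dek).decrypt(nonce, ciphertext, None)
    new_nonce = os.urandom(12)
    return AESGCM(new_dek).encrypt(new_nonce, plaintext, None), new_nonce
```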

Key Rotation and PCI-DSS

The Payment Card Industry Data Security Standard (PCI-DSS), which sets the security standards for businesses that handle card payments, includes key rotation as part of its requirements. Specifically, Requirement 3.6.4 of PCI-DSS states that cryptographic keys must be changed at the end of their defined cryptoperiod, which is a timespan during which a specific key is authorized for use.

The defined cryptoperiod for a key is influenced by the sensitivity of the data, the potential risks, and the security controls in place. The PCI-DSS does not specify an exact timeline for key rotation, but it's generally recommended to perform key rotation at least annually.

Data Re-Keying

Data re-keying is another important aspect of managing encryption keys. It refers to the process of decrypting data that was encrypted with an old key and then re-encrypting that same data with a new key. This process is also commonly referred to as key rotation.

The primary motivation behind data re-keying is to limit the potential damage if an encryption key is compromised. By periodically changing the keys used to encrypt data, you limit the amount of data that could be decrypted with a compromised key.

Data re-keying can also be used as part of an update process. For instance, if a newer, more secure encryption algorithm becomes available (e.g. quantum-resistant algorithms), data that was encrypted with an old algorithm can be re-keyed using the new algorithm. This is an essential aspect of maintaining data security as cryptographic techniques and standards evolve.

Use Case: Re-keying as part of a Data Breach

In the event of a confirmed or suspected data breach, or if there's evidence suggesting that an attacker has accessed or compromised your data or encryption keys, it's highly recommended to perform data re-keying. This process involves decrypting the affected data that was encrypted with the compromised key and then re-encrypting it with a new, secure key.

Re-keying in this context helps to mitigate potential damage by ensuring that the compromised keys can no longer be used to access more data than has already been exposed. This proactive approach enhances your overall data security, restricts the access of unauthorized users, and aids in the recovery process following a data breach.

EncryptForSearch

EncryptForSearch is a technical process that allows searching within a database for data that has been encrypted, such as an employee's credit card number. To perform these searches, the method EncryptForSearchAsync() is employed. This method takes an original value (e.g., a credit card number) and generates a set of all possible encrypted values for that original value, considering the various encryption keys that might have been used over different time periods.

Example:

Consider a credit card number “1234 5678 9012 3456”. Initially, this credit card number was encrypted with Key A, resulting in an encrypted value “EncA 9876 5432 1098”. After some time, for security reasons, the encryption key was rotated to Key B, which would theoretically encrypt the credit card number as “EncB 9876 5432 1098”.

Now, if someone needs to search for this particular credit card number in the encrypted database, simply searching for the value “EncA 9876 5432 1098” won’t yield results, as the database now holds the credit card number in the “EncB 9876 5432 1098” form due to the key rotation.

In this situation, the EncryptForSearchAsync() method is used. It takes the original credit card number “1234 5678 9012 3456” and creates a collection of potential encrypted representations based on different key rotations—in this example, it generates “EncA 9876 5432 1098” from Key A and “EncB 9876 5432 1098” from Key B.

Once this set of potential encrypted values is generated by EncryptForSearchAsync(), the database can be queried with each of these values to find matches. Thus, the database would be searched for both “EncA 9876 5432 1098” and “EncB 9876 5432 1098”, ensuring that the credit card number can be effectively found, irrespective of which encryption key was used at any point in time.

📘

In a real-world implementation, the encrypted values would likely appear as a seemingly random sequence of characters and numbers, rather than a prefixed and clearly recognizable form as presented in this illustrative example. The purpose of the example is to illustrate how the method might function rather than to depict actual encryption results.
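
The overall flow can also be sketched in code. The sketch below is conceptual: historical_keys and encrypt_for_search are hypothetical names (not the Ubiq SDK), and AES-SIV stands in for a deterministic, searchable encryption scheme:

```python
from cryptography.hazmat.primitives.ciphers.aead import AESSIV

# Keys that have protected this dataset over time (Key A before rotation, Key B after).
historical_keys = {
    "key_a": AESSIV.generate_key(bit_length=512),
    "key_b": AESSIV.generate_key(bit_length=512),
}

def encrypt_for_search(plaintext: bytes) -> list[bytes]:
    """Return the deterministic ciphertext of `plaintext` under every key ever used."""
    return [AESSIV(key).encrypt(plaintext, None) for key in historical_keys.values()]

candidates = encrypt_for_search(b"1234 5678 9012 3456")

# The database is then queried for rows whose encrypted column matches ANY candidate, e.g.
#   SELECT * FROM payments WHERE card_number_enc IN (:candidate_a, :candidate_b)
```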