Datasets
Step-by-step instructions for creating Datasets
Introduction
In the Ubiq Dashboard, a Dataset is the primary building block for describing the data that you choose to encrypt.
A Dataset can be configured as one of two types:
1. Structured
- Example: Data stored in a database column with a fixed length and type, such as a name, address, or SSN.
2. Unstructured
- Example: Files (audio, video, PDF, text, etc.) stored in an unstructured data store such as AWS S3, Google Cloud Storage, or a Data Lake.
Because an application can have multiple data elements and data types that you'd like to encrypt, Datasets give you a logical and flexible way to represent each one.
Create a Dataset
- Prepare a secure location for storage of Ubiq API Key Credentials. The process of creating a Dataset will create cryptographic API Key Credentials for your application that will only be shown once in the Ubiq UI. To ensure confidentiality of encrypted data, it is important to keep these API Key Credentials secret. They should not be stored in standard files or checked into source code repositories. Additionally, the availability of these API Key Credentials is paramount. If lost or destroyed, they cannot be restored and data encrypted with those Credentials may be irrecoverable.
To keep API Key Credentials secure, store them in a well-managed, backed-up secret management server or password vault.
- In the left-side menu, click Datasets.
- The Datasets panel appears.
- Click on the + New Dataset button to enter the Dataset Creation Wizard.
- Input the following information:
a) Dataset Name - An internal name that helps you identify what data you're encrypting.
b) Description - A short description to keep track of your Dataset definitions.
c) Tags - Tags can be used to mark a Dataset's purpose or intended audience.
d) Primary Key - Create a new Primary Key, or manually select an existing one.
e) Click Continue to input the Data Type that you will be encrypting.
- Enter Data Type information.
- Select Structured or Unstructured.
Structured has multiple sub-types available for commonly used data patterns. Choose the one that best suits your data. (The most common is Formatted String.)
Depending on your selected Data Type, the configuration that follows may vary.
- Click Continue to go to the next step.
- If you selected Unstructured Data, you will skip the Formatted String Definition page and go straight to the review step. If you selected Structured Data and the Formatted String type, you will be presented with the Formatted String Definition page.
If this is your first time creating a Structured Formatted String Dataset, here are some suggested values, using a U.S. Social Security Number (SSN) as the example:
Input character set: 0123456789
Output character set: 0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
Passthrough: - (dash and space)
Min Input Length: 9
Max Input Length: 9
- Review all the settings for your new Dataset.
- Click Create. Your new Dataset is displayed in the Datasets panel.
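To make the suggested SSN values for a Formatted String Dataset more concrete, the sketch below checks whether a sample value would fit them before you rely on the definition. It is an illustration only and does not use any Ubiq SDK: the `FormattedStringRule` class and its field names are hypothetical, and it assumes that passthrough characters are not counted toward the minimum and maximum input length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormattedStringRule:
    """Hypothetical local model of a Formatted String definition."""

    input_charset: str   # characters allowed in the data itself
    passthrough: str     # characters left as-is (assumed not counted in length)
    min_len: int         # Min Input Length
    max_len: int         # Max Input Length

    def check(self, value: str) -> bool:
        """Return True if value fits the rule under the stated assumptions."""
        # Drop passthrough characters before checking length and character set.
        core = [ch for ch in value if ch not in self.passthrough]
        if not (self.min_len <= len(core) <= self.max_len):
            return False
        return all(ch in self.input_charset for ch in core)


# Values taken from the SSN example: digits only, dash and space pass through.
SSN_RULE = FormattedStringRule(
    input_charset="0123456789",
    passthrough="- ",
    min_len=9,
    max_len=9,
)

if __name__ == "__main__":
    for sample in ("123-45-6789", "123 45 6789", "123-45-678", "12a-45-6789"):
        print(sample, SSN_RULE.check(sample))
```

A check like this can help catch a character set or length setting that is too narrow for your real data before the Dataset is put to use.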
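As an illustration of keeping API Key Credentials out of standard files and source code, the sketch below reads them from the process environment at startup and fails early if any are missing. It assumes your secret management server or password vault injects the values as environment variables at deploy time, which may not match your setup. The `load_credentials` function and the variable names in the usage example are hypothetical.

```python
import os
from typing import Dict, Mapping, Sequence


def load_credentials(
    names: Sequence[str],
    env: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Read the named credentials from the environment.

    Raises RuntimeError listing the missing variable names. The error
    message never includes credential values.
    """
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing credential variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


if __name__ == "__main__":
    # Hypothetical variable names, chosen for this sketch only.
    try:
        creds = load_credentials(["MY_APP_KEY_ID", "MY_APP_SECRET_KEY"])
        print("Loaded", len(creds), "credentials")
    except RuntimeError as err:
        print(err)
```

Failing at startup makes a missing credential visible right away, rather than at the first attempt to encrypt or decrypt data.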
