BigQuery Integration Overview

The Ubiq BigQuery integration supports data-focused roles and use cases where report writers, data analysts, data scientists, and even data pipelines need to protect data without having an “application” in between the user and the data. Encrypt, decrypt, and protect your data with granular access controls natively in BigQuery without changing how your users work with their data. This overview focuses specifically on how the BigQuery library works and how to use it.

How does Ubiq Work on BigQuery?

For some background, our whitepaper gets into the nitty gritty on how Ubiq works when integrated into an application. The native integration with the BigQuery library works just the same… but without the need for an application.

To achieve the same architectural design and data protection approach, BigQuery requires a few additional components to empower your data users to encrypt and decrypt data directly from their BigQuery SQL:

  1. GCP Cloud Functions that broker calls between BigQuery and the Ubiq backend
  2. BigQuery UDFs that enable encrypt() and decrypt() in SQL
  3. Ubiq libraries that get stored in GCP Cloud Storage
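
To make the broker role in item 1 more concrete, here is a minimal sketch of what such a Cloud Function handler could look like. It assumes the Cloud Function is exposed to BigQuery as a remote function, which this overview does not state; under that assumption BigQuery sends a JSON body with a "calls" array and expects a "replies" array of the same length. The names handle_remote_call, ask_backend, and fake_backend are hypothetical, and the real Ubiq broker may work quite differently.

```python
from typing import Any, Callable, Dict, Tuple


def handle_remote_call(
    body: Dict[str, Any],
    ask_backend: Callable[[str, str], str],
) -> Tuple[Dict[str, Any], int]:
    """Answer one batched BigQuery remote-function request.

    Returns (JSON-serializable response body, HTTP status code).
    """
    calls = body.get("calls")
    if not isinstance(calls, list):
        return {"errorMessage": "request has no 'calls' array"}, 400

    # BigQuery includes the identity of the user running the query.
    session_user = str(body.get("sessionUser", ""))
    try:
        # One reply per call, in the same order BigQuery sent them.
        # ask_backend is a hypothetical stand-in for the request to Ubiq.
        replies = [ask_backend(session_user, str(args[0])) for args in calls]
    except Exception as exc:
        # A non-200 status with "errorMessage" surfaces the failure in SQL.
        return {"errorMessage": str(exc)}, 400
    return {"replies": replies}, 200


def fake_backend(user: str, dataset: str) -> str:
    """Hypothetical stub so the sketch runs without any network access."""
    return f"material-for-{dataset}-as-{user}"
```

Passing the backend call in as a parameter keeps the sketch runnable offline and makes the batching contract (one reply per call, order preserved) easy to check.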

Integration instructions are available in our public docs, and once you’re ready to use it, you call the same simple code as with our other libraries.
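
As a rough sketch of what calling the UDFs from SQL might look like, the Python helpers below build two BigQuery statements as strings. The function names ubiq_encrypt() and ubiq_decrypt() come from this overview; everything else is an assumption made for illustration: the two-argument form (column value, Ubiq dataset name), the ubiq_udfs dataset that holds the UDFs, and the table and column names. The public docs are the authority on the real signatures.

```python
def build_encrypt_update(table: str, column: str, ubiq_dataset: str,
                         udf_dataset: str = "ubiq_udfs") -> str:
    """Build SQL that replaces a column's plaintext with ciphertext in place.

    Identifiers are interpolated directly, so pass only trusted names.
    The (value, dataset name) argument order is an assumption.
    """
    return (
        f"UPDATE {table}\n"
        f"SET {column} = {udf_dataset}.ubiq_encrypt({column}, '{ubiq_dataset}')\n"
        f"WHERE TRUE"
    )


def build_decrypt_select(table: str, column: str, ubiq_dataset: str,
                         udf_dataset: str = "ubiq_udfs") -> str:
    """Build SQL that returns the column decrypted at query time."""
    return (
        f"SELECT {udf_dataset}.ubiq_decrypt({column}, '{ubiq_dataset}') AS {column}\n"
        f"FROM {table}"
    )


if __name__ == "__main__":
    # Hypothetical table, column, and Ubiq dataset names.
    print(build_encrypt_update("hr.customers", "ssn", "ssn"))
    print(build_decrypt_select("hr.customers", "ssn", "ssn"))
```

Because the column keeps its name and type in both statements, the surrounding queries stay the same apart from the UDF call.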

Using with Visualization or Other Tools

Once you have the power to encrypt/decrypt at the SQL level, you can leverage that from any tool that’s accessing your data. Take, for example, the scenario where you have data encrypted with Ubiq and you don’t want a DBA that’s querying data to be able to see it, but you have a report in Tableau or PowerBI or some other visualization tool that needs to show the data decrypted in plain text.

Without encryption in place, your data is exposed through the entire ETL and reporting flow:

  • Stored in plain text in BigQuery
  • Shown in plain text when a BigQuery user or DBA queries it
  • Retrieved in plain text from the reporting / visualization tool
  • Shown in plain text on a report to all users

Once data is encrypted in BigQuery itself, however, you can choose, at each point where it is queried, whether or not to expose the plain text. In this example, where the sensitive column holds Social Security numbers (SSNs), we would choose not to allow a DBA or regular BigQuery user to decrypt the SSNs, and we would also choose not to expose SSNs by default in any existing reports. We can then create a report that explicitly decrypts and shows that sensitive data.
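
The scenario above can be modeled with a small toy simulation of who sees what. This is only an illustration: toy_transform is a stand-in that shifts digits and is not real encryption (and not Ubiq's algorithm), and the Caller class and visible_ssn function are hypothetical names. It shows the stored value keeping its format while only a caller with decrypt access sees the plain text.

```python
from dataclasses import dataclass

DIGITS = "0123456789"


def toy_transform(value: str) -> str:
    """Stand-in for encrypt/decrypt: shift each digit by 5 (self-inverse).

    NOT real encryption; it only mimics a format-preserving result.
    """
    return "".join(
        str((int(ch) + 5) % 10) if ch in DIGITS else ch for ch in value
    )


@dataclass(frozen=True)
class Caller:
    name: str
    can_decrypt: bool  # in practice this would be decided by Ubiq access controls


def visible_ssn(caller: Caller, stored_value: str) -> str:
    """Return what a given caller sees when reading the stored column."""
    return toy_transform(stored_value) if caller.can_decrypt else stored_value


stored = toy_transform("123-45-6789")          # value persisted in the table
dba = Caller("dba", can_decrypt=False)
existing_report = Caller("existing_report", can_decrypt=False)
sensitive_report = Caller("sensitive_report", can_decrypt=True)

if __name__ == "__main__":
    for caller in (dba, existing_report, sensitive_report):
        print(caller.name, visible_ssn(caller, stored))
```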

BigQuery Integration vs. Other Ubiq Libraries

It’s all the same - the BigQuery integration and its encrypt/decrypt functions are completely cross-compatible with all of the other libraries and languages.

All of the same features and values still hold true for using Ubiq on BigQuery:

  • Data never leaves your environment to be encrypted or decrypted
  • No changes needed to your BigQuery schema to store data encrypted vs. plaintext
  • Data is encrypted before it gets persisted - so anyone else who accesses your data will see only the ciphertext unless they have access to decrypt
  • No key management required - just like our application library usage, keys “follow the data” and are managed in the Ubiq SaaS UI for seamless key rotation and revocation
  • Flexible access controls; Ubiq API keys are used to authenticate the BigQuery user to Ubiq, and that gives them access (or not) to encrypt or decrypt various sets of data
  • Flexible key association - your key and dataset design in the Ubiq SaaS UI can enable granular key usage (like a unique key per table, per column or per BigQuery database) without any implementation complexity - the SQL queries don’t need to change or even know about the keys
  • Cross-library compatibility - our BigQuery integration uses the same NIST-approved structured (format-preserving) encryption algorithm, so you can encrypt/decrypt with BigQuery and then encrypt/decrypt with any other Ubiq library

Summary

The BigQuery library works the same way as our data warehouse integrations and application-language-specific libraries:

  • Completely self-contained - no external dependencies or customizations in BigQuery
  • No change to your data flow (data never leaves BigQuery to encrypt/decrypt)
  • No encryption knowledge required to implement - the integration simply exposes ubiq_encrypt() and ubiq_decrypt() functions (UDFs) that are directly callable from SQL
  • Similar performance profile to application-language libraries and similar performance design considerations for authenticating to our backend and key caching
  • Cross-compatible encryption/decryption with every other Ubiq library
  • Total feature parity - including key management and key rotation features delivered and managed through the Ubiq UI