Simplifying Data Warehouse Governance

Introduction

Managing the cost and usage of resources and data assets is becoming a core responsibility for any IT organization, especially when an economic downturn is a possibility. This matters even more in the cloud, where resources are spun up and down on demand and access and cost management needs change just as quickly. In this blog, we introduce a simple yet flexible Firebolt object model that organizes your data warehouse resources and aligns them with your operating model.

Today, companies are challenged to establish a solid governance model for their cloud data warehouse resources. Data security, cost management, resource isolation, and observability are among the challenges that such a model must address.

For example, development, staging, and production environments require isolation from each other to limit the blast radius of development changes spilling into production by mistake, or to limit a developer’s access to only their own code base and data. Departments in a company need to isolate their resources, granting access to their own teams while limiting access from other departments. Another requirement may be consolidated billing with the ability to understand consumption by department or development environment.

Organizations, accounts and the Firebolt object model

To address these requirements, Firebolt supports the concepts of organizations and accounts. These two concepts are represented as two distinct objects within the object model hierarchy.

But first, let’s introduce some core principles that our object model follows.
The Firebolt object model is hierarchical and comes with strong containment properties: parent objects can contain one or more child objects, and each child object belongs to exactly one parent and cannot be shared. Furthermore, there are two classes of objects: global and regional. As the names suggest, global objects are managed globally and can contain objects that are deployed and grouped regionally.
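
To make the containment rules concrete, here is a minimal Python sketch of them. It is an illustration only, not Firebolt code: the class names, the `add_account` method, and the region value are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Account:
    """Regional object: created in one region, owned by exactly one parent."""
    name: str
    region: str
    databases: List[str] = field(default_factory=list)
    engines: List[str] = field(default_factory=list)
    # Back-reference to the owning organization (excluded from repr/eq).
    parent: Optional["Organization"] = field(default=None, repr=False, compare=False)


@dataclass
class Organization:
    """Global, top-level object that contains regional accounts."""
    name: str
    accounts: Dict[str, Account] = field(default_factory=dict)

    def add_account(self, account: Account) -> None:
        # Strong containment: a child has a single parent and cannot be shared.
        if account.parent is not None:
            raise ValueError(f"{account.name} already belongs to {account.parent.name}")
        # Account names are unique within one organization.
        if account.name in self.accounts:
            raise ValueError(f"account {account.name} already exists in {self.name}")
        account.parent = self
        self.accounts[account.name] = account


# Hypothetical usage: one organization with one regional account.
acme = Organization("Acme")
acme.add_account(Account("AcmeBI", region="us-east-1"))
```

Trying to add the same `Account` instance to a second organization raises a `ValueError`, which mirrors the rule that child objects cannot be shared between parents.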

[Figure: the Firebolt-supported object model]

An organization is the top-level, global object that groups all other objects together. It contains a few organizational entities that are also global, and it can contain one or more accounts that are deployed regionally. Conceptually, the organization enforces global policies for system-level authentication and for the principals accessing the system: it manages Single Sign On (SSO) and network-related security, and it is also used to consolidate billing. An organization also simplifies management by consolidating multiple accounts under a single logical entity.

The following Firebolt functionality is tied to the organization:

  • Authentication: Firebolt handles user authentication and access control at the organization level. A login (represented by an email) is created for each user accessing Firebolt.
  • Programmatic access: Service accounts enable programmatic access to Firebolt.
  • Network policy enforcement: Network policies provide fine-grained control over the IP ranges that are allowed to access, or blocked from accessing, an organization.
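
As an illustration of how an allow/block policy over IP ranges behaves, the sketch below checks a client address against two lists of CIDR ranges using Python's standard `ipaddress` module. The function name, the sample ranges, and the rule that a blocked range takes precedence over an allowed one are assumptions for this example; it does not show how Firebolt defines or evaluates network policies.

```python
import ipaddress
from typing import List


def is_allowed(client_ip: str, allowed: List[str], blocked: List[str]) -> bool:
    """Return True if client_ip is inside an allowed range and no blocked range."""
    ip = ipaddress.ip_address(client_ip)
    in_allowed = any(ip in ipaddress.ip_network(cidr) for cidr in allowed)
    in_blocked = any(ip in ipaddress.ip_network(cidr) for cidr in blocked)
    # Assumption for this sketch: a blocked range wins over an allowed range.
    return in_allowed and not in_blocked


# Hypothetical policy using documentation-only address ranges.
allowed_ranges = ["203.0.113.0/24"]
blocked_ranges = ["203.0.113.128/25"]

office_ok = is_allowed("203.0.113.10", allowed_ranges, blocked_ranges)    # inside allowed only
office_no = is_allowed("203.0.113.200", allowed_ranges, blocked_ranges)   # inside blocked range
outsider = is_allowed("198.51.100.7", allowed_ranges, blocked_ranges)     # in neither range
```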

Accounts, on the other hand, represent the physical implementation of your data warehouse. As regional objects, Firebolt accounts can be created within any of the regions Firebolt supports (for more information about supported regions, refer to our documentation). All child objects of an account are contained within the region in which the account was created.

Accounts contain objects that help you implement access control and authorization policies within your data warehouse. They also house your data model, which is managed and deployed through objects such as databases, tables, views, and indexes. Finally, the compute that supports your data warehouse operation is represented by engine objects. Firebolt engines are a unit of economics that help you govern the spend and price/performance characteristics of your data warehouse. Accounts are identified and referenced by name, and account names are unique within an organization. Accounts enable the following functionality:

  • Access control: Firebolt implements role-based access control (RBAC). Every object in the Firebolt object model is a securable that comes with a set of permissions. Permissions allow administrators to control which functionality Firebolt users can exercise when logged in.
  • Data modeling: Through objects such as databases, tables, views, and indexes, developers and architects can design their data warehouses and properly describe business entities without compromising on demanding performance needs.
  • Cost control: System administrators can deploy engines that fit the need while achieving the desired price-performance characteristics. Engines can scale vertically (up and down) and horizontally (out and in) to meet demanding business needs while allowing granular cost control.
  • Workload management: With compute, data, and metadata separation, Firebolt offers full workload isolation. Users can deploy separate engines to support heterogeneous workloads while accessing the same data. Whether you have a data-intensive application that requires instantaneous access to data, a complex business-critical dashboard that requires timely refresh, or a complex Extract-Load-Transform (ELT) process to ingest data, Firebolt engines support all of these needs.

In summary, organizations and accounts provide the flexibility and structure to fit your needs. They allow better resource allocation, fine-grained access control, and easier overall management of your data warehouse. During registration, Firebolt creates your organization and your first account, and you can create additional accounts either manually or programmatically as you go.

The sections below describe some common use cases and scenarios that organizations and accounts help address.

Use cases for organizations and accounts

Separating development, staging and production environments

When data applications are being built, engineers typically develop in environments other than the one where the end product will run. IT organizations typically follow the best practice of creating one environment for development, one for staging, and one for production. This is done to separate and isolate environments, protect production from unintended changes, orchestrate Continuous Integration / Continuous Delivery (CI/CD) pipelines, and streamline deployment processes.

As the development cycle progresses from development to staging and production, engineering teams need to run the corresponding workloads in the relevant environment to validate that data applications behave as intended. Furthermore, development environments typically do not have the stringent performance needs and Service Level Agreements (SLAs) of production environments. Similarly, development sandboxes may come with less restrictive permissions so that developers can experiment with new features, whereas production environments need to be tightly managed and protected.

Let’s explore each environment and its needs.

  • Development: During development, the freedom to iterate and experiment is essential. A dedicated development account (or accounts) can serve as a space for development efforts while promoting collaboration and reuse of the artifacts produced. This enables quick, iterative development with the access control it needs.
  • Staging: To validate developed features or fixes before the code is moved to production, engineers run functional tests as well as stress and performance validation in staging environments. A dedicated account provides a structured environment for this rigorous testing, using real-life data to validate the application fully while maintaining appropriate security and regulatory compliance.
  • Production: Production environments, where applications are deployed and scaled, require careful consideration and handling. Organizations need a seamless transition of new code and changes from staging to production. With a separate production account, IT organizations can build CI/CD pipelines that replicate the configurations and settings used in the staging accounts. This minimizes the potential for unforeseen issues during deployment and ensures a smooth end-user experience. Best of all, because Firebolt offers a SQL-first developer experience, all changes to production can be managed via SQL: you can orchestrate the necessary SQL changes and deploy them in the production account.

To support all of this, each environment requires different resources and configurations to serve its needs and purpose. With organizations and accounts, it is easy to support these needs by creating a separate account for each environment. Engineering teams now have full control and can navigate the intricacies of data applications rollout to production with precision and efficiency.
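
The staging-before-production flow can be sketched in a few lines of Python. This is a simplified illustration, not a Firebolt deployment tool: two in-memory SQLite databases stand in for the staging and production accounts, and the `promote` function and the sample statements are hypothetical.

```python
import sqlite3


def promote(statements, staging, production):
    """Run the same SQL changes in staging, then in production.

    Returns True only if every statement succeeded in both environments.
    Production is never touched when staging fails.
    """
    for conn in (staging, production):
        try:
            for sql in statements:
                conn.execute(sql)
            conn.commit()
        except sqlite3.Error:
            return False  # stop here; later environments are left unchanged
    return True


# Stand-ins for the two accounts (assumption: a real pipeline would connect
# to each account through whatever client or driver it already uses).
staging = sqlite3.connect(":memory:")
production = sqlite3.connect(":memory:")

changes = [
    "CREATE TABLE rankings (gameid INTEGER, playerid INTEGER)",
    "CREATE INDEX rankings_by_game ON rankings (gameid)",
]
deployed = promote(changes, staging, production)
```

Because the statements run in staging first, a change that fails there never reaches the production stand-in.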

Achieving departmental separation

It’s very common for a company, irrespective of its size, to have multiple departments: finance, IT, marketing, and so on. In bigger organizations, departments may be managed differently, with dedicated IT organizations supporting independent department operations. Furthermore, different departments may require data to be physically segregated. For example, sensitive financial data may be visible only to the finance department, while market insights may be visible only to marketing. Or departments may be geographically dispersed, requiring physical data separation and access for compliance reasons. EU privacy regulations, which enforce data locality and require complete geographical data isolation, fall into this last category.

For situations like these where companies want to achieve physical departmental separation and manageability while enjoying a unified view over operations of all those departments, an organization and accounts can be leveraged. Given that Firebolt accounts can be deployed regionally, companies using Firebolt can enjoy deployment flexibility while satisfying stringent compliance requirements. This way, departments get needed independence while the organization enforces a global security policy, such as Single Sign On, across all departments.

With organizations and multiple accounts in Firebolt, it is easy to deliver on these requirements: simply create a separate account for each department. Within each account, create users and databases, load data, and grant privileges on the new objects so your teams can work independently and efficiently.

Granular billing to support chargebacks

Quick and accurate cost management is a critical concern for businesses. As Firebolt introduces the concept of organizations, it’s important to understand how this new framework can support billing and chargeback policies.

Managing costs across multiple accounts and teams can be a complex and time-consuming endeavor. With the organization object, Firebolt simplifies monitoring consumption and allocating costs across those accounts. A Firebolt organization is linked to AWS Marketplace for billing purposes. By linking accounts with an organization, you get unified billing while preserving the ability to identify the charges for each account. This brings several benefits:

  • Simplified cost tracking: With all charges consolidated into a single billing source, monitoring and tracking expenses across the organization becomes considerably easier. This enhances transparency and provides insights into overall resource consumption.
  • Efficient budgeting: Organizations enable you to set comprehensive budgets for the entire group. This allows you to allocate funds strategically, preventing overspending and ensuring that financial resources are appropriately distributed.
  • Accurate cost allocation to support chargebacks: For companies with multiple teams or departments, charges per team/department can be accurately identified to support proper chargebacks. The costs incurred by each account are accurately attributed, facilitating transparent cost-sharing arrangements.
  • Centralized reporting: A unified billing model simplifies reporting by providing a consolidated breakdown of expenses. This aids in generating clear, detailed reports that show the distribution of costs among accounts. Reporting is available at both the organization level and the account level.

The introduction of organizations opens the door to implementing flexible chargeback policies that align with your company’s structure and goals. Here are some key considerations to keep in mind as you start tackling this need:

  • Resource usage metrics: To ensure fair chargeback, consider using resource usage as a basis for allocating costs. This approach ties expenses directly to the resources consumed by each account.
  • Cost breakdown by engine and storage: Firebolt provides a cost breakdown per account, per engine, and for the storage used. This makes it easy to understand the exact consumption within each account.
  • Regular review and adjustment: Chargeback policies should not remain static. Regularly review the effectiveness of your policies and adjust them as needed to align with changing business needs and resource consumption patterns using the cost breakdowns provided.
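
As a rough sketch of usage-based chargeback, the example below totals cost per account and expresses it as a share of the consolidated bill. It runs against an in-memory SQLite table from Python; the `usage_costs` table, its columns, and all figures are invented for illustration and do not reflect Firebolt's billing schema or real prices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical per-account cost lines (one row per engine or storage charge).
conn.execute("CREATE TABLE usage_costs (account TEXT, item TEXT, cost_usd REAL)")
conn.executemany(
    "INSERT INTO usage_costs VALUES (?, ?, ?)",
    [
        ("finance", "engine:reporting", 600.0),
        ("finance", "storage", 150.0),
        ("marketing", "engine:dashboards", 200.0),
        ("marketing", "storage", 50.0),
    ],
)

# Total per account, plus each account's percentage of the whole bill.
chargeback = conn.execute(
    """
    SELECT account,
           SUM(cost_usd) AS total_usd,
           ROUND(100.0 * SUM(cost_usd)
                 / (SELECT SUM(cost_usd) FROM usage_costs), 1) AS pct_of_bill
    FROM usage_costs
    GROUP BY account
    ORDER BY account
    """
).fetchall()
# chargeback → [('finance', 750.0, 75.0), ('marketing', 250.0, 25.0)]
```

Grouping by `account, item` instead of `account` alone would give the finer engine and storage breakdown within each account.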

By leveraging unified billing and implementing flexible chargeback policies, your company can establish a robust framework for accurate cost allocation, streamlined reporting, and optimized resource usage.

Get started with organizations

Getting started with organizations requires simply registering with a valid business email address. Once that’s done, an organization is created with your first account. Shown below are the basic steps to establishing the foundational elements of your organization and accounts.

Let’s say John from Acme has registered with Firebolt. During the registration process, the Acme organization and his first account are automatically created. Now, John wants to create an additional account named AcmeBI. He runs the following SQL command to do so:

CREATE ACCOUNT AcmeBI; 

To verify the account was created, he runs:

SELECT * FROM INFORMATION_SCHEMA.ACCOUNTS
WHERE account_name = 'AcmeBI';
-- Running the statement above lets John see that an account named AcmeBI exists in the system.

Now, John wants to invite his team members to join the Acme organization and take Firebolt for a spin. Since John’s company is using Okta as an identity provider, he can simply configure his Firebolt organization to integrate with Okta so that eligible users from Okta can access Firebolt. To do so, John goes into the ‘Configure’ space in the Firebolt Workspace and sets up SSO by executing the following statement:

ALTER ORGANIZATION SET SSO = '{
  "signOnUrl": "https://abc.okta.com/app/okta_firebolt_app_id/sso/saml",
  "signOutUrl": "https://myapp.acme.com/saml/logout",
  "issuer": "Okta",
  "provider": "Okta",
  "label": "Okta",
  "fieldMapping": "mapping",
  "certificate": "XXXXXXXXXXXXXXXX" }';

Now, John wants to grant the team members access to the test account. To do so, he creates a login and a user for each team member. At this point, John’s team members can log in using SSO, and they can easily access the test account.

Here’s an example of how to create a login:

CREATE LOGIN 'kate@acme.com' WITH FIRST_NAME = 'Kate' LAST_NAME = 'Peterson';

To create users and link them to appropriate logins, John runs the following command:

CREATE USER kate WITH LOGIN_NAME = 'kate@acme.com';

Now, John creates his first database in the test account:

CREATE DATABASE my_db;
USE DATABASE my_db;

In addition to that, he creates an internal table called “rankings” and loads it with data using the COPY statement.

CREATE TABLE IF NOT EXISTS rankings (
    GameID INTEGER,
    PlayerID INTEGER,
    MaxLevel INTEGER,
    TotalScore BIGINT,
    PlaceWon INTEGER,
    TournamentID INTEGER
  ) PRIMARY INDEX GameID, TournamentID, PlayerID;

COPY INTO rankings (
    gameid $1,
    playerid $2,
    maxlevel $3,
    totalscore $4,
    placewon $5,
    tournamentid $6
)
FROM 's3://firebolt-sample-datasets-public-us-east-1/gaming/parquet/rankings/'
WITH PATTERN = '*' TYPE = PARQUET;

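Now, John wants to enable Kate to run selects and inserts over the new table, so he grants the appropriate privileges to a role and assigns that role to her:

-- The role must exist and be granted to Kate for the privileges to apply to her.
CREATE ROLE marketing_role;
GRANT SELECT ON TABLE rankings TO marketing_role;
GRANT INSERT ON TABLE rankings TO marketing_role;
GRANT ROLE marketing_role TO USER kate;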

Summary

With organizations, Firebolt provides a structured framework for managing accounts, authentication, databases, engines, and more. Organizations address security, resource allocation, usage, and cost management challenges, and they serve diverse business needs through distinct use cases, including development/staging/production environments and departmental separation. With unified billing and support for chargeback policies, organizations and accounts simplify cost allocation and enhance financial transparency. In a world where efficient data governance is paramount, organizations offer a holistic approach to streamlining processes, bolstering security, and optimizing resource utilization.
