Home

Automating service database users with Terraform

Recently I have been making a big push to improve infrastructure automation in our main system. We are repositioning the system from maintenance-mode, to resume more active development and evolution. Existing legacy components will be retired in a push to increase the evolvability and maintainability of the system.

The trouble with starting on something like this is it requires significant effort to begin to make rapid progress, because legacy systems are typically inert, have inconsistent documentation and manual processes. It is easy to forget how new the DevOps movement is, and many systems that predate the movement are slow to evolve because so much groundwork must be laid.

To that end I have begun provisioning our infrastructure using Terraform. Existing infrastructure remains out of Terraform’s control for now, but as new services come online, they are being managed in code. I want as much of the code to serve as documentation as possible, because it is guaranteed to be up-to-date, and the maintenance overhead is reduced.

I find working with Terraform very simple when the AzureRM provider has the feature set required for the configuration we need. I have also found that the PaaS solutions integrate well, as they do not require post-provisioning setup, such as is needed for VMs. One exception to this is managing databases within Azure SQL; Terraform provides no out-of-the-box solution for running commands within a database instance, say to manage accounts and drive internal configuration. It can only deal with the platform itself.

The Goal

The gap I found in our setup was that while I could provision a new App Service very rapidly, and integrate it with our database via a virtual network with ease, creating an account within the target database and assigning roles was completely manual. Sure, I can write documentation, but that still provides some barrier to entry for new services. Something I greatly want to reduce the setup effort of.

The focus instead became on using Terraform to reliably, idempotently, drive the database account setup for provisioned app services.

The Solution

An idempotent SQL script

As Terraform is built to be idempotent, and because it cannot reliably store the internal database configuration in its state, I needed to create a user setup script that could be invoked more than once without failure for a given user. Further, subsequent iterations should not produce side effects. If the user already exists, then the script should not fail. The below script will create a role for application services and add a new user to that role if they are missing only.

IF (DATABASE_PRINCIPAL_ID('${target_role}') IS NULL)
BEGIN
    CREATE ROLE [${target_role}]
    GRANT EXECUTE TO [${target_role}]
    GRANT INSERT TO [${target_role}]
    GRANT SELECT TO [${target_role}]
    GRANT UPDATE TO [${target_role}]

    IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = N'Comment')
    BEGIN
        GRANT DELETE ON [Comment] TO [${target_role}]
    END
END

IF NOT EXISTS (SELECT [name]
                FROM [sys].[database_principals]
                WHERE [name] = '${managed_identity}')
BEGIN
    CREATE USER [${managed_identity}] FROM EXTERNAL PROVIDER
END

ALTER ROLE [${target_role}] ADD MEMBER [${managed_identity}]

As will become clearer later, the ${} notation within the above sample allows Terraform to perform substitution on the above template, providing real values from its state. This makes the script dynamic.

Note also that most tables use a soft-delete mechanism, so applications should not be granted the DELETE permission. However, some do use a hard-delete, so this permission is granted on an as-needed basis. The check for the presence of such tables allows the script to run against mock versions of the database that do not have the full schema, as that is driven currently by the application at deployment-time:

IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = N'Comment')
BEGIN
    GRANT DELETE ON [Comment] TO [${target_role}]
END

Connecting to the database

The only shell that has good support for Azure SQL connections using Entra authentication is PowerShell. It has the SqlServer package, backed by the Microsoft.Data.SqlClient library used in modern .NET applications. As our database server allows Entra authentication only, PowerShell was a must for running the actual SQL commands.

To drive this from Terraform, I added a null resource to invoke the shell command, conditional on the database name, server name, username and role changing:

resource "null_resource" "db_config" {
  triggers = {
    db_name     = var.database_name,
    server_name = var.database_server_name
    target_role = var.target_role
    username    = var.username
  }
}

The variables are inputs to a module created to encapsulate this entire process. Module consumers then need provide only the database_name, database_server_name, target_role and username, without needing to worry about the gritty detail given here.

The null resource then has a provisioner block that contains the PowerShell script necessary to invoke the SQL commands on the database:

resource "null_resource" "db_config" {
  ...
  provisioner "local-exec" {
    command     = <<EOT
    Install-Module -AcceptLicense -Force -Scope CurrentUser SqlServer
    Import-Module SqlServer

    Invoke-SqlCmd -ServerInstance "${var.database_server_name}.database.windows.net" `
      -Database ${var.database_name} `
      -AccessToken "${data.external.db_token.result.token}" `
      -Query "${templatefile("${path.module}/script/add_user.sql", { managed_identity = var.username, target_role = var.target_role })}"
  EOT
    interpreter = ["pwsh", "-Command"]
  }
}

Of note here is the block calling Invoke-SqlCmd, as that is the point where the template SQL file is loaded, transformed, and executed:

Invoke-SqlCmd -ServerInstance "${var.database_server_name}.database.windows.net" `
  -Database ${var.database_name} `
  -AccessToken "${data.external.db_token.result.token}" `
  -Query "${templatefile("${path.module}/script/add_user.sql", { managed_identity = var.username, target_role = var.target_role })}"

The same variables that serve as triggers for changes to the nullresource drive the connection details. Terraform’s templatefile function is leveraged to peform substitution on the SQL file. An object is passed as its second parameter with the values for the managed_identity and target_role.

The -AccessToken argument is used to perform Entra authentication with the database server, but a separate Terraform data block is required to actually fetch a token. The below block invokes a script and makes its output available to subsequent Terraform code blocks, as output values.

data "external" "db_token" {
  program = ["bash", "${path.module}/script/get_access_token.sh"]
}

The invoked script simply retrieves an access token using the Azure CLI and uses jq to morph the script output to a form Terraform expects. The access token generated is one specifically for the Azure SQL resource type, this is achieved by passing the --resource flag:

set -e

ACCESS_TOKEN=$(az account get-access-token -o tsv --query accessToken --resource "https://database.windows.net")

jq -n --arg token "$ACCESS_TOKEN" '{"token":$token}'

Putting it all together

The final Terraform configuration looks as below, with variable definitions omitted for brevity. Now new application services can be integrated into the database with their own account, given minimal permissions,and the whole process can be made auditable by running within a CI pipeline.

data "external" "db_token" {
  program = ["bash", "${path.module}/script/get_access_token.sh"]
}

resource "null_resource" "db_config" {
  triggers = {
    db_name     = var.database_name,
    server_name = var.database_server_name
    target_role = var.target_role
    username    = var.username
  }
  provisioner "local-exec" {
    command     = <<EOT
    Install-Module -AcceptLicense -Force -Scope CurrentUser SqlServer
    Import-Module SqlServer

    Invoke-SqlCmd -ServerInstance "${var.database_server_name}.database.windows.net" `
      -Database ${var.database_name} `
      -AccessToken "${data.external.db_token.result.token}" `
      -Query "${templatefile("${path.module}/script/add_user.sql", { managed_identity = var.username, target_role = var.target_role })}"
  EOT
    interpreter = ["pwsh", "-Command"]
  }
}