Automating service database users with Terraform
Recently I have been making a big push to improve infrastructure automation in our main system. We are repositioning the system from maintenance-mode, to resume more active development and evolution. Existing legacy components will be retired in a push to increase the evolvability and maintainability of the system.
The trouble with starting on something like this is it requires significant effort to begin to make rapid progress, because legacy systems are typically inert, have inconsistent documentation and manual processes. It is easy to forget how new the DevOps movement is, and many systems that predate the movement are slow to evolve because so much groundwork must be laid.
To that end I have begun provisioning our infrastructure using Terraform. Existing infrastructure remains out of Terraform’s control for now, but as new services come online, they are being managed in code. I want as much of the code to serve as documentation as possible, because it is guaranteed to be up-to-date, and the maintenance overhead is reduced.
I find working with Terraform very simple when the AzureRM provider has the feature set required for the configuration we need. I have also found that the PaaS solutions integrate well, as they do not require post-provisioning setup, such as is needed for VMs. One exception to this is managing databases within Azure SQL; Terraform provides no out-of-the-box solution for running commands within a database instance, say to manage accounts and drive internal configuration. It can only deal with the platform itself.
The Goal
The gap I found in our setup was that while I could provision a new App Service very rapidly, and integrate it with our database via a virtual network with ease, creating an account within the target database and assigning roles was completely manual. Sure, I can write documentation, but that still provides some barrier to entry for new services. Something I greatly want to reduce the setup effort of.
The focus instead became on using Terraform to reliably, idempotently, drive the database account setup for provisioned app services.
The Solution
An idempotent SQL script
As Terraform is built to be idempotent, and because it cannot reliably store the internal database configuration in its state, I needed to create a user setup script that could be invoked more than once without failure for a given user. Further, subsequent iterations should not produce side effects. If the user already exists, then the script should not fail. The below script will create a role for application services and add a new user to that role if they are missing only.
IF (DATABASE_PRINCIPAL_ID('${target_role}') IS NULL) BEGIN CREATE ROLE [${target_role}] GRANT EXECUTE TO [${target_role}] GRANT INSERT TO [${target_role}] GRANT SELECT TO [${target_role}] GRANT UPDATE TO [${target_role}] IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = N'Comment') BEGIN GRANT DELETE ON [Comment] TO [${target_role}] END END IF NOT EXISTS (SELECT [name] FROM [sys].[database_principals] WHERE [name] = '${managed_identity}') BEGIN CREATE USER [${managed_identity}] FROM EXTERNAL PROVIDER END ALTER ROLE [${target_role}] ADD MEMBER [${managed_identity}]
As will become clearer later, the ${}
notation within the above
sample allows Terraform to perform substitution on the above template,
providing real values from its state. This makes the script dynamic.
Note also that most tables use a soft-delete mechanism, so
applications should not be granted the DELETE
permission. However,
some do use a hard-delete, so this permission is granted on an
as-needed basis. The check for the presence of such tables allows the
script to run against mock versions of the database that do not have
the full schema, as that is driven currently by the application at
deployment-time:
IF EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = N'Comment') BEGIN GRANT DELETE ON [Comment] TO [${target_role}] END
Connecting to the database
The only shell that has good support for Azure SQL connections using
Entra authentication is PowerShell. It has the SqlServer
package,
backed by the Microsoft.Data.SqlClient
library used in modern .NET
applications. As our database server allows Entra authentication only,
PowerShell was a must for running the actual SQL commands.
To drive this from Terraform, I added a null resource to invoke the shell command, conditional on the database name, server name, username and role changing:
resource "null_resource" "db_config" { triggers = { db_name = var.database_name, server_name = var.database_server_name target_role = var.target_role username = var.username } }
The variables are inputs to a module created to encapsulate this
entire process. Module consumers then need provide only the
database_name
, database_server_name
, target_role
and username
,
without needing to worry about the gritty detail given here.
The null resource then has a provisioner
block that contains the
PowerShell script necessary to invoke the SQL commands on the database:
resource "null_resource" "db_config" { ... provisioner "local-exec" { command = <<EOT Install-Module -AcceptLicense -Force -Scope CurrentUser SqlServer Import-Module SqlServer Invoke-SqlCmd -ServerInstance "${var.database_server_name}.database.windows.net" ` -Database ${var.database_name} ` -AccessToken "${data.external.db_token.result.token}" ` -Query "${templatefile("${path.module}/script/add_user.sql", { managed_identity = var.username, target_role = var.target_role })}" EOT interpreter = ["pwsh", "-Command"] } }
Of note here is the block calling Invoke-SqlCmd
, as that is the
point where the template SQL file is loaded, transformed, and
executed:
Invoke-SqlCmd -ServerInstance "${var.database_server_name}.database.windows.net" ` -Database ${var.database_name} ` -AccessToken "${data.external.db_token.result.token}" ` -Query "${templatefile("${path.module}/script/add_user.sql", { managed_identity = var.username, target_role = var.target_role })}"
The same variables that serve as triggers for changes to the
nullresource drive the connection details. Terraform’s templatefile
function is leveraged to peform substitution on the SQL file. An
object is passed as its second parameter with the values for the
managed_identity
and target_role
.
The -AccessToken
argument is used to perform Entra authentication
with the database server, but a separate Terraform data block is
required to actually fetch a token. The below block invokes a script
and makes its output available to subsequent Terraform code blocks, as
output values.
data "external" "db_token" { program = ["bash", "${path.module}/script/get_access_token.sh"] }
The invoked script simply retrieves an access token using the Azure
CLI and uses jq
to morph the script output to a form Terraform
expects. The access token generated is one specifically for the Azure
SQL resource type, this is achieved by passing the --resource
flag:
set -e ACCESS_TOKEN=$(az account get-access-token -o tsv --query accessToken --resource "https://database.windows.net") jq -n --arg token "$ACCESS_TOKEN" '{"token":$token}'
Putting it all together
The final Terraform configuration looks as below, with variable definitions omitted for brevity. Now new application services can be integrated into the database with their own account, given minimal permissions,and the whole process can be made auditable by running within a CI pipeline.
data "external" "db_token" { program = ["bash", "${path.module}/script/get_access_token.sh"] } resource "null_resource" "db_config" { triggers = { db_name = var.database_name, server_name = var.database_server_name target_role = var.target_role username = var.username } provisioner "local-exec" { command = <<EOT Install-Module -AcceptLicense -Force -Scope CurrentUser SqlServer Import-Module SqlServer Invoke-SqlCmd -ServerInstance "${var.database_server_name}.database.windows.net" ` -Database ${var.database_name} ` -AccessToken "${data.external.db_token.result.token}" ` -Query "${templatefile("${path.module}/script/add_user.sql", { managed_identity = var.username, target_role = var.target_role })}" EOT interpreter = ["pwsh", "-Command"] } }