Deploying Azure Stream Analytics in Terraform using ARM templates to include SQL Reference Data

Stéphanie Visser · Published in Nerd For Tech · Jul 12, 2021


I didn’t want to use your standard ‘data’ or ‘computer’ pic. So here is something close to ‘streaming’. Feeling Zen already? — Pic by Baskin Creative Studios via Pexels.

🧭 Introduction

In this scenario, we want to create an Azure Stream Analytics job that takes input data from an Azure Event Hub, uses reference data from an Azure SQL Database and, after transformation, outputs the data to a Service Bus Topic. We want to deploy the resources using Terraform, simply because it is an awesome Infrastructure as Code (IaC) tool that allows you to build, change and version infrastructure.

The challenge with deploying this? As described in this issue, the azurerm Terraform provider currently does not support SQL Database reference data lookups for Stream Analytics; it only supports reference data from Azure Blob Storage. There is also no support for compatibility level 1.2, as described in this issue.

But not to worry! There is another way to deploy Azure Stream Analytics including SQL reference data in Terraform: using ARM Template deployments.

Check out the GitHub repo for the code samples that accompany this article.

🍿 Requirements

Before you get started, make sure the following is in place:

  • An Azure subscription ID and tenant ID for the subscription in which you want the resources to be deployed. You can find this information using the Azure CLI or the Azure Portal (see the snippet after this list).
  • An Azure Service Principal ID and secret corresponding to the above tenant. You can create this using the Azure CLI (also shown below).
  • Add these four secrets to GitHub Secrets, so they can be used as environment variables to authenticate in the GitHub Actions workflow. Make sure to name your secrets as expected by the workflow (the exact names are in the repo's workflow file).
  • Terraform needs a place to store its state file. Therefore, add five additional GitHub Secrets that describe the desired resource group name, storage account name and container name where the state file will be stored, the name of the state file, and the region where you want these state-file resources deployed. Please note: these resources don’t need to exist yet.
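If you haven’t gathered these values yet, the Azure CLI commands below are a minimal sketch of how to look up the subscription and tenant IDs and create the service principal. The service principal name and the Contributor scope are placeholders; adjust them to your own naming and least-privilege requirements.

```bash
# Show the subscription ID and tenant ID of the currently selected subscription
az account show --query "{subscriptionId: id, tenantId: tenantId}" --output table

# Create a service principal with Contributor rights on the subscription.
# The appId (client ID) and password (client secret) in the output are the
# values to store in GitHub Secrets, together with the two IDs above.
az ad sp create-for-rbac \
  --name "sp-asa-terraform-demo" \
  --role "Contributor" \
  --scopes "/subscriptions/<your-subscription-id>"
```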

🏭 How it works

Once you run the GitHub Actions workflow, the following steps kick off.

  • The state-resources.sh script creates the resources that store the Terraform state file (or does nothing if they already exist), based on the parameters you defined in your GitHub Secrets.
  • Terraform is set up, your configuration files are format-checked, the working directory is initialized, and an execution plan is generated. This execution plan (create, update or destroy resources) is based on the resources defined in main.tf. In this case, it creates a separate resource group in which the resources are deployed; an Azure Event Hub for potential data input into Azure Stream Analytics; an empty Azure SQL Database for potential reference data to be used by Azure Stream Analytics; a Service Bus Topic for potential data output; and an Azure Stream Analytics job, including the necessary blob storage.
  • Instead of using the Terraform Azure Stream Analytics job resource, main.tf contains a Terraform Azure Resource Group Template Deployment (a sketch follows after this list). This deployment references an ARM template, which specifies the Azure Stream Analytics instance to be created. The ARM template uses the parameters provided in the deployment; specifically, for our use case, it takes the referenceQuery parameter (the SQL reference data lookup query) and defines it as reference data for the Azure Stream Analytics job. Please remove the comments from the ARM template before deploying: they result in compilation errors, but are included in the repo for explanatory purposes.
  • The plan is then applied when there is a push to the main branch.
  • The repo also includes Azure Stream Analytics unit tests and a corresponding pipeline — because tests are, well, nice and good practice :). You can find more information on testing for Azure Stream Analytics here.
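The heart of this workaround in main.tf looks roughly like the sketch below. It is not a copy of the repo’s code: the resource names, the ARM template file name (streamanalytics.json) and the example referenceQuery value are assumptions for illustration only.

```hcl
# Sketch: deploy the Stream Analytics job through an ARM template, because the
# native azurerm Stream Analytics resources don't support SQL reference data.
resource "azurerm_resource_group_template_deployment" "stream_analytics" {
  name                = "stream-analytics-deployment"
  resource_group_name = azurerm_resource_group.main.name
  deployment_mode     = "Incremental"

  # ARM template defining the Stream Analytics job, its Event Hub input,
  # its SQL Database reference input and its Service Bus Topic output.
  template_content = file("${path.module}/arm/streamanalytics.json")

  # Parameters consumed by the ARM template. referenceQuery is the SQL
  # lookup query that Stream Analytics runs against the reference database.
  parameters_content = jsonencode({
    streamAnalyticsJobName = { value = "asa-reference-demo" }
    referenceQuery = {
      value = "SELECT DeviceId, DeviceName FROM dbo.DeviceReference"
    }
  })
}
```

Because the job itself is defined in the ARM template, properties the provider does not (yet) expose, such as the SQL reference data input and compatibility level 1.2, can simply be set there.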

Please note that the above set-up does not include any streaming data. The Azure Stream Analytics and SQL queries defined in the repo are therefore fictitious.
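For context, a Stream Analytics query that consumes SQL reference data typically joins the streaming input against the reference input without a time window. The input and output aliases below are invented for illustration and do not correspond to the (fictitious) queries in the repo:

```sql
-- Enrich events from the Event Hub input with the SQL reference data,
-- then write the result to the Service Bus Topic output.
SELECT
    e.DeviceId,
    r.DeviceName,
    e.Temperature
INTO [servicebus-output]
FROM [eventhub-input] e
JOIN [sql-reference-input] r
    ON e.DeviceId = r.DeviceId
```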

🚿 Clean-up

To avoid unnecessary costs, don’t forget to destroy the created resources.
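A minimal sketch of that clean-up, assuming you run Terraform locally against the same state backend (the state resource group name is a placeholder):

```bash
# Destroy the resources managed by Terraform (Event Hub, SQL Database,
# Service Bus Topic, Stream Analytics job, and their resource group).
terraform destroy

# The state-file resources were created outside of Terraform by
# state-resources.sh, so remove them separately once you no longer need them.
az group delete --name "<state-resource-group-name>" --yes
```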

💛 Thank you for reading!

If you found this useful, some 👏 would be much appreciated to help spread the word. Also, if you have additional questions or thoughts, please let me know in the comments or contribute on GitHub. Thank you!
