Using Azure Event Hub with Power BI for real-time analysis
October 4th, 2023
This article explains how to build a real-time data integration using Microsoft Azure resources. By combining Azure Event Hubs with Azure Stream Analytics, we will create a data pipeline that sends real-time data to Power BI for further analysis & visualization.
Introduction
No matter how you collect data - from physical equipment on the factory floor to other IT systems such as an ERP or accounting tool - turning that data into actionable insights is key to helping a wide range of stakeholders within your company make informed business decisions.
It is increasingly common for companies to use a data visualization tool such as Microsoft Power BI to create dashboards that track their core metrics and KPIs. Power BI is one of the most powerful data visualization & modelling tools on the market today, supporting both real-time & batch data processing. It offers a broad array of data connectors, deep integration with other Azure services, and extensive data modelling, transformation & querying capabilities on top of the core visualization features.
One of the core challenges of using Power BI is setting up and connecting the data sources that feed it input data. For real-time data, Power BI is often paired with two other Azure cloud products: Azure Event Hubs and Azure Stream Analytics.
In this walkthrough we provide a detailed guide on how to make the most of your real-time data through an end-to-end setup of an Azure Event Hub and Power BI, connected through an Azure Stream Analytics job.
Prerequisites
An Azure account with a Subscription (like a billing account)
A Resource Group for your Subscription. You can quickly create one by following the guide here.
A Power BI workspace, created with the same email address as your Azure account. You should have permission to create datasets in that workspace, and you will also need a Power BI Pro licence.
Step 1: Give your Subscription access to Stream Analytics
For this set-up we’ll be using Azure Stream Analytics. Stream Analytics is a stream processing engine that can process, analyze and transform vast volumes of data with minimal latency. In particular, Azure supports a new no-code editor which allows users to graphically define and create their data pipelines without needing to write any code. More info on Azure Stream Analytics can be found here.
To enable Stream Analytics, your Subscription must have the permissions to do so:
In the Azure dashboard, search for “Subscriptions” and select your Subscription
In the navigation column under Settings, select “Resource Providers”
In the search bar, search for “Microsoft.StreamAnalytics”, select the option in the table below (it will be grey when highlighted) and select “Register”
This can take a couple of minutes! When complete, the Microsoft.StreamAnalytics row in the table will have a Registered icon next to it
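If you prefer to script this step, the same registration can be done with the Azure management SDK for Python. Below is a minimal sketch, assuming you have the azure-identity and azure-mgmt-resource packages installed and are signed in; the subscription ID is a placeholder:

```python
# pip install azure-identity azure-mgmt-resource
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

# "<subscription-id>" is a placeholder -- substitute your own Subscription ID.
client = ResourceManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Kick off registration of the Stream Analytics resource provider.
client.providers.register("Microsoft.StreamAnalytics")

# Registration is asynchronous: poll until the state reads "Registered".
provider = client.providers.get("Microsoft.StreamAnalytics")
print(provider.registration_state)
```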
Step 2: Setting up the Event Hub
An Azure Event Hub is a big data streaming and event ingestion service that connects with other internal Azure cloud resources (e.g. Stream Analytics, Power BI) as well as external cloud services. Find out more here.
To set up the Event Hub, in your Azure dashboard, search for “Event Hubs”, navigate to the “Event Hubs” section, and select “+ Create”.
This will ask you to create a Namespace, which is a management container for one or more Event Hubs. Select your Subscription and Resource Group, and give your Event Hubs Namespace a name, location and pricing tier. At the bottom of the page select “Review + create”.
In the newly created Event Hubs Namespace, on the navigation column, select “Event Hubs” under the Entities section. Select the “+ Event Hub” option.
You will now be prompted to create an Event Hub. Give your Event Hub a name (you can keep the rest of the options as default). Select “Review + create” to create your Event Hub.
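As an aside, the portal steps above can also be scripted. Here is a minimal sketch using the azure-mgmt-eventhub package; the subscription ID, resource group, Namespace and Event Hub names are all hypothetical placeholders:

```python
# pip install azure-identity azure-mgmt-eventhub
from azure.identity import DefaultAzureCredential
from azure.mgmt.eventhub import EventHubManagementClient
from azure.mgmt.eventhub.models import EHNamespace, Eventhub, Sku

client = EventHubManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Create the Namespace (the management container), waiting for completion.
client.namespaces.begin_create_or_update(
    "my-resource-group",
    "my-namespace",
    EHNamespace(location="westeurope", sku=Sku(name="Standard", tier="Standard")),
).result()

# Create the Event Hub itself inside the Namespace.
client.event_hubs.create_or_update(
    "my-resource-group",
    "my-namespace",
    "my-event-hub",
    Eventhub(partition_count=2, message_retention_in_days=1),
)
```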
Once the Event Hub has been created, go to the dashboard of the Event Hub you just created. In the navigation column on the left, under the Features section, select “Process data”
Select “Build near real-time dashboard with Power BI” from the options presented
Step 3: Create the Stream Analytics Job
This will open a drawer for you to give a name for your new Stream Analytics Job. Give it a name and select “Create” at the bottom of the screen.
This will open the Stream Analytics no-code editor for you to quickly build and deploy your data pipeline to Power BI. Azure pre-populates an Event Hub input connector, a Manage Fields operations transformation, and a Power BI output connector.
Select the Event Hub input connector. Azure automatically pre-fills the connection details for you on the right-hand side; all you have to do is select “Connect”. Then you can add your field names and types to specify the schema of the events that the Event Hub will receive.
Pro-tip: ensure that you know the data types, format and labels of the input data that the Event Hub will receive. If there are any inconsistencies, mismatched input events will often be silently dropped!
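For illustration, a hypothetical factory-telemetry event might look like the JSON below; in the input connector you would then declare deviceId as a string, temperature as a float and ts as a datetime so the fields line up exactly:

```json
{
  "deviceId": "machine-01",
  "temperature": 72.4,
  "ts": "2023-10-04T12:00:00Z"
}
```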
Select the “Manage Fields” operations transformer. This transformation allows you to add, rename or remove fields coming from an input before sending them to an output. For ease, you can also add all fields in one go. Select “Done” to finalise this step.
Pro-tip: routinely save your Stream Analytics job as you build it! It’s very easy to configure everything, then navigate off screen and lose all your progress (this happened to me multiple times).
We’re now left with the final step: configuring the Power BI output connector.
Step 4: Configuring and connecting Power BI
Before finalising the Stream Analytics job, we need to handle some configuration in Azure and Power BI:
Ensure that your Power BI account has a Pro licence. This Pro licence might not be auto-allocated to the Power BI user. You can go into https://admin.microsoft.com/Adminportal/Home#/users (Microsoft 365 Admin Centre) and allocate the Power BI Pro licence to the Power BI user there
Now in your Azure dashboard, search for and select your Resource Group. Under resources you should see the Stream Analytics job that you have created previously. Select it to be taken to the Stream Analytics overview page.
On the left-hand side navigation panel, under Settings, you should see an option for “Managed Identity”. This might already have been auto-provisioned and linked to the Stream Analytics job, but if it hasn’t, click on “Select Identity” → “System assigned identity” → Save.
Important! You need to check that the Principal Name matches the Stream Analytics job’s name (in our example here: “stream-test-1”) for this to work.
In Power BI, go to the Admin page (via the settings button in the navigation bar)
Scroll down and find the section called “Developer Settings”
Under “Allow service principals to use Power BI APIs”, enable the setting for the whole organization. This can take around 15 minutes to take effect.
In your Power BI Workspace (created as part of the prerequisites), go to Workspace Access
Select “Add people or groups”
Enter the Stream Analytics job’s name, and assign it the Contributor role
Go back to the Stream Analytics job in Azure with the no-code editor. Click on the Power BI box. You might need to refresh the page (please Save beforehand!). You can now select the Power BI Workspace and enter names for the dataset and the table (the Stream Analytics job will auto-create these for you; you don’t need to create them in Power BI)
Save and then start the Stream Analytics job. If you’ve been using preview data generation, you can send the same preview data and it should propagate through the job. We’re nearly there!
In Power BI, go to your Workspace. Click “Create” → “Dashboard” → “Edit” → “Add a tile”. Finally, select “Custom Streaming Data” and choose the dataset name you gave in the Power BI connector of the Stream Analytics job.
And that’s it, everything is set up!
Once this is set up, you can revisit the Stream Analytics job and adjust the data pipeline as you need to. You can use the other supported transformations, such as Filter, to adjust data on the fly.
How can I get data into my Event Hub?
There are a couple of ways in which you can pass data into an Event Hub to trigger the streaming process we’ve built above, the most common being:
External custom software applications that write directly to the Event Hub. Event Hubs offers SDKs in a range of languages to streamline custom development (see the first sketch after this list).
Event Hubs supports native Apache Kafka endpoints, which allow existing Kafka applications to stream data to an Event Hub without any code changes (only configuration updates are required; see the second sketch after this list).
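For the first option, here is a minimal sketch using the official azure-eventhub Python package. The connection string, Event Hub name and event fields are hypothetical placeholders; copy the real connection string from “Shared access policies” on your Event Hubs Namespace:

```python
# pip install azure-eventhub
import json
from azure.eventhub import EventHubProducerClient, EventData

# Placeholders -- substitute your own connection string and Event Hub name.
CONNECTION_STR = "Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=<key-name>;SharedAccessKey=<key>"
EVENT_HUB_NAME = "my-event-hub"

producer = EventHubProducerClient.from_connection_string(
    conn_str=CONNECTION_STR, eventhub_name=EVENT_HUB_NAME
)

# A sample event matching the hypothetical schema used earlier in this guide.
event = {"deviceId": "machine-01", "temperature": 72.4, "ts": "2023-10-04T12:00:00Z"}

with producer:
    batch = producer.create_batch()
    batch.add(EventData(json.dumps(event)))
    producer.send_batch(batch)
```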
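For the second option, an existing Kafka producer only needs its connection configuration changed: point it at the Namespace’s Kafka endpoint on port 9093 (available on the Standard tier and above) and authenticate with the connection string via SASL PLAIN. A minimal sketch with the confluent-kafka Python package, placeholders included:

```python
# pip install confluent-kafka
import json
from confluent_kafka import Producer

# Event Hubs speaks the Kafka protocol on port 9093; the literal string
# "$ConnectionString" is the expected SASL username.
producer = Producer({
    "bootstrap.servers": "<namespace>.servicebus.windows.net:9093",
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "PLAIN",
    "sasl.username": "$ConnectionString",
    "sasl.password": "Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=<key-name>;SharedAccessKey=<key>",
})

# The Kafka topic name is simply the Event Hub name.
producer.produce("my-event-hub", value=json.dumps({"deviceId": "machine-01", "temperature": 72.4}))
producer.flush()
```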
What about IoT services?
Event Hubs is great as a scalable, low-latency data ingestion service for a wide variety of input sources, but it is primarily designed to work in a standalone capacity with other cloud or IT systems, and doesn’t support more IoT-specific requirements (such as cloud-to-device messaging and device provisioning).
If you have IoT or hardware devices that generate and transmit data at the edge, those data messages are best routed into Azure IoT Hub. When you provision an Azure IoT Hub, Azure additionally creates a built-in Event Hub with Event Hub-compatible endpoints within the IoT Hub (see here). This is how IoT Hub is able to retain messages (for 24 hours by default) that you can then use, transform and send to other cloud & IT systems. You can even use the IoT Hub as an input connector in a Stream Analytics job.
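To illustrate, you can read from an IoT Hub’s built-in endpoint with the same azure-eventhub package used earlier. A minimal sketch, assuming you copy the “Event Hub-compatible endpoint” connection string from the IoT Hub’s Built-in endpoints page (the value below is a placeholder):

```python
# pip install azure-eventhub
from azure.eventhub import EventHubConsumerClient

# Placeholder -- use the IoT Hub's "Event Hub-compatible endpoint" connection
# string, which already includes the EntityPath of the built-in Event Hub.
CONNECTION_STR = "Endpoint=sb://<iothub-ns>.servicebus.windows.net/;SharedAccessKeyName=<key-name>;SharedAccessKey=<key>;EntityPath=<iot-hub-name>"

client = EventHubConsumerClient.from_connection_string(
    CONNECTION_STR, consumer_group="$Default"
)

def on_event(partition_context, event):
    # Print each device message as it arrives.
    print(partition_context.partition_id, event.body_as_str())

with client:
    client.receive(on_event=on_event, starting_position="-1")  # "-1" = from earliest
```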
For example, at Ferry, we help industrial companies collect data from their factory equipment & frontline operations so that they can optimise production & reduce waste. Users of our platform can link their Azure cloud accounts to Ferry, and we auto-provision all the required Azure resources for them to be able to deploy their applications & data pipelines to their edge devices. As part of this, we provision an Azure IoT Hub, which comes with the in-built Event Hub as standard. You can use this tutorial, switching out the Event Hub for your IoT Hub, to stream data straight into Power BI using Stream Analytics.
If there’s any questions you have, or if you would like to find out more about Ferry, feel free to reach out to me at dominic@deployferry.io!