Azure and Amazon Data Stream Analytics and Processing: Amazon Kinesis, Azure Stream Analytics, and Azure Event Hub

Robert John
5 min read · Dec 27, 2020


Data stream analytics, also called event stream processing or real-time analytics, is the processing and analysis of data as it arrives. Streaming data can come from e-commerce transactions, financial trading floors, IoT device telemetry, and social media.

This article briefly describes the data stream analytics services in AWS and Azure and their use cases. It also links to resources on their pricing, components, hands-on tutorials, code samples, and comparisons.

Use Cases for Stream Analytics

  • Anomaly detection in credit card transactions
  • Anomaly detection (fraud/outliers)
  • Application logging
  • Analytics pipelines, such as clickstreams
  • Live dashboarding
  • Archiving data
  • Transaction processing
  • User telemetry processing
  • Device telemetry streaming
  • Alert system
  • Clickstream analytics
  • Analyze streaming social media data
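
To make the anomaly detection use case concrete, here is a minimal sketch in plain Python. It flags a value when it sits far from the mean of the last few values. The function name, window size, threshold, and sample amounts are all illustrative assumptions, and a production system would use one of the managed services described below rather than a loop like this.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Iterable, Iterator, Tuple


def detect_anomalies(
    amounts: Iterable[float], window: int = 5, threshold: float = 3.0
) -> Iterator[Tuple[int, float]]:
    """Yield (index, amount) for values far from the recent rolling mean."""
    recent = deque(maxlen=window)  # only the last `window` values are kept
    for i, amount in enumerate(amounts):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # Flag the value if it is more than `threshold` std devs away.
            if sigma > 0 and abs(amount - mu) > threshold * sigma:
                yield i, amount
        recent.append(amount)


# Hypothetical card transaction amounts; the 950.0 is the planted outlier.
transactions = [20.0, 22.5, 19.0, 21.0, 23.0, 20.5, 950.0, 22.0]
flagged = list(detect_anomalies(transactions))
print(flagged)
```

The point of the sketch is that the detector only ever holds a small window of recent values, which is what lets this kind of logic run on an unbounded stream.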

Amazon Kinesis

Amazon Kinesis is an Amazon Web Services offering for easy, scalable collection, processing, and analysis of video and data streams in real time. It can also be used for batch processing, and for data ingestion for machine learning, analytics, and other applications.

Amazon Kinesis Data Streams architecture (source)

Amazon Kinesis comprises Kinesis Video Streams, Kinesis Data Streams, Kinesis Data Firehose, and Kinesis Data Analytics. Other AWS services commonly used with it include Amazon EC2, Amazon S3, Amazon Redshift, Amazon DynamoDB, Amazon EMR, Amazon Elasticsearch Service, and AWS Lambda.

Types of Amazon Kinesis

Kinesis Video Streams: capture, process, and store video streams.

Kinesis Data Streams: capture, process, and store data streams.

Kinesis Data Firehose: capture, transform, and load data streams into AWS data stores.

Kinesis Data Analytics: process data streams in real time with SQL or Apache Flink.
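
As a rough illustration of how Kinesis Data Streams spreads records across shards, the sketch below hashes a partition key with MD5 into a 128-bit integer and picks the shard whose hash range contains it. The even split of the hash space, the stream name, and the `device_id` field are assumptions for the example. `send_reading` uses the boto3 `put_record` call and needs boto3 plus AWS credentials, so it is defined but not called here.

```python
import hashlib
import json


def hash_key(partition_key: str) -> int:
    """MD5 of the partition key, read as a 128-bit unsigned integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key: str, shard_count: int) -> int:
    """Index of the shard whose hash range contains the key.

    Assumes the 128-bit hash space is split evenly across shards,
    which is not guaranteed after shards are split or merged.
    """
    range_size = 2**128 // shard_count
    return min(hash_key(partition_key) // range_size, shard_count - 1)


def send_reading(stream_name: str, reading: dict) -> None:
    """Put one record on a stream (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("kinesis")
    client.put_record(
        StreamName=stream_name,
        Data=json.dumps(reading).encode("utf-8"),
        PartitionKey=reading["device_id"],  # hypothetical field name
    )


for device in ("sensor-1", "sensor-2", "sensor-3"):
    print(device, "-> shard", shard_for(device, shard_count=4))
```

Records with the same partition key always land on the same shard, so choosing a key with many distinct values (a device ID rather than a region, say) helps keep the load even.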

Pricing

Amazon Kinesis Data Streams Pricing is based on Shard Hour and PUT Payload Unit.

Amazon Kinesis Video Streams Pricing is based on the volume of data you ingest, store, and consume in your video streams.

Amazon Kinesis Data Firehose Pricing is based on the volume of data ingested into Kinesis Data Firehose.

Amazon Kinesis Data Analytics Pricing is based on the average number of Kinesis Processing Units (KPUs) used to run your stream processing application, charged per hour.
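
To see how the two Kinesis Data Streams pricing dimensions interact, here is a back-of-the-envelope sketch. It assumes provisioned shards, a PUT Payload Unit of 25 KB, and a per-shard write limit of 1 MB or 1,000 records per second. The prices are parameters on purpose: the figures in the usage line are placeholders, not quoted rates, so check the pricing page for your region.

```python
import math

PAYLOAD_UNIT_BYTES = 25 * 1024            # one PUT Payload Unit covers up to 25 KB
SHARD_INGEST_BYTES_PER_SEC = 1024 * 1024  # assumed per-shard write limit (1 MB/s)
SHARD_INGEST_RECORDS_PER_SEC = 1000       # assumed per-shard write limit


def payload_units(record_bytes: int) -> int:
    """PUT Payload Units consumed by a single record."""
    return max(1, math.ceil(record_bytes / PAYLOAD_UNIT_BYTES))


def shards_needed(records_per_sec: float, record_bytes: int) -> int:
    """Smallest shard count that satisfies both write limits."""
    by_bytes = math.ceil(records_per_sec * record_bytes / SHARD_INGEST_BYTES_PER_SEC)
    by_count = math.ceil(records_per_sec / SHARD_INGEST_RECORDS_PER_SEC)
    return max(1, by_bytes, by_count)


def monthly_cost(
    records_per_sec: float,
    record_bytes: int,
    shard_hour_price: float,
    price_per_million_units: float,
    hours: int = 730,
) -> float:
    """Shard-hour charge plus PUT Payload Unit charge for `hours` hours."""
    shards = shards_needed(records_per_sec, record_bytes)
    units = records_per_sec * 3600 * hours * payload_units(record_bytes)
    return shards * hours * shard_hour_price + units / 1e6 * price_per_million_units


# Placeholder prices for illustration only.
print(monthly_cost(100, 2048, shard_hour_price=0.015, price_per_million_units=0.014))
```

Because a record is rounded up to whole 25 KB units, many tiny records cost the same per record as one just under 25 KB, which is one reason producers often aggregate small records before sending.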

Hands-On Tutorials

Building a Log Analytics Solution

  • Set up a Kinesis Agent
  • Create an end-to-end data delivery stream using Kinesis Data Firehose
  • Process incoming data with SQL queries in Amazon Kinesis Data Analytics
  • Load processed data from Kinesis Data Analytics to Amazon Elasticsearch Service
  • Analyze and visualize the processed data using Kibana
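
The first step in the list above turns raw web server log lines into structured records before they reach the delivery stream. The sketch below shows that idea in plain Python for the Apache Common Log Format. The regular expression and field names are my own choices for illustration, not the agent's actual implementation.

```python
import json
import re
from typing import Optional

# Apache Common Log Format: host ident authuser [date] "request" status bytes
COMMON_LOG = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<datetime>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-)'
)


def log_line_to_json(line: str) -> Optional[str]:
    """Convert one log line to a JSON string, or None if it does not parse."""
    match = COMMON_LOG.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["response"] = int(record["response"])
    # A "-" in the size field means no body was sent.
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return json.dumps(record)


sample = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326'
)
print(log_line_to_json(sample))
```

Once the records are JSON, the later steps (SQL queries, indexing, dashboards) can refer to fields such as `response` by name instead of re-parsing text.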

Process Real-Time Stock Data Using KPL and KCL

Real-Time Hotspot Detection in Amazon Kinesis Analytics

Code Examples

Generating streams of weather data

Kinesis Data Generator

Official Resources

Amazon Kinesis

Amazon Kinesis Blog

Other Resources

High Performance Data Streaming with Amazon Kinesis: Best Practices and Common Pitfalls (video)

AWS Streaming Data Solution for Amazon Kinesis (article)

Streaming Data Solutions on AWS with Amazon Kinesis (pdf)

AWS Kinesis Tutorial for Beginners | Introduction to Amazon Kinesis AWS Training | Edureka (video)

Azure Event Hub

Event Hub is a real-time data ingestion service. The streamed data can be stored in Blob storage, Data Lake, and other stores, or passed to Azure Stream Analytics for analysis. A typical pipeline sends data to Event Hub, analyzes it in real time with Azure Stream Analytics, and visualizes the results in Power BI.

Azure Event Hub workflow (source)

Other Azure services commonly used with Azure Event Hub: Azure Stream Analytics, Power BI, Azure Blob Storage, Azure Event Grid, Azure Functions, and Azure Synapse Analytics.
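
For a feel of the ingestion side, here is a minimal producer sketch using the `azure-eventhub` Python package. The connection string, hub name, and the shape of the readings are placeholders you would supply yourself. `send_to_event_hub` needs the package installed and a real Event Hub, so only the serialization helper is exercised here.

```python
import json
from typing import Iterable, List


def encode_events(readings: Iterable[dict]) -> List[bytes]:
    """Serialize each reading as UTF-8 JSON, one event body per reading."""
    return [json.dumps(r, sort_keys=True).encode("utf-8") for r in readings]


def send_to_event_hub(conn_str: str, hub_name: str, readings: Iterable[dict]) -> None:
    """Send readings as one batch (requires the azure-eventhub package)."""
    from azure.eventhub import EventData, EventHubProducerClient  # lazy import

    producer = EventHubProducerClient.from_connection_string(
        conn_str, eventhub_name=hub_name
    )
    with producer:
        batch = producer.create_batch()
        for body in encode_events(readings):
            # A real producer would send the batch and start a new one
            # when this one fills up; the sketch assumes everything fits.
            batch.add(EventData(body))
        producer.send_batch(batch)


# Hypothetical device readings.
print(encode_events([{"device_id": "sensor-1", "temperature": 21.5}]))
```

Batching is the main design choice: sending events one at a time works, but grouping them cuts the number of round trips to the service.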

Pricing

Azure Event Hub Pricing is based on throughput units, ingress events, and the Capture feature.

Hands-On Tutorials

Migrate captured Event Hubs data to Azure Synapse Analytics using Event Grid and Azure Functions

  • Data sent to an Azure event hub is captured in Azure blob storage.
  • When the data capture is complete, an event is generated and sent to an Azure event grid.
  • The event grid forwards this event data to Azure function app.
  • The function app uses the blob URL in the event data to retrieve the blob from the storage.
  • The function app migrates the blob data to Azure Synapse Analytics.
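
The fourth step in the list above, where the function app pulls the blob URL out of the event, can be sketched without any Azure dependencies. The event type string and the `fileUrl` field follow the Event Grid schema for Event Hubs capture events as I understand it, so treat both as assumptions to verify against the official documentation. The sample URL is made up.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse

# Assumed event type emitted when a capture file is written.
CAPTURE_EVENT_TYPE = "Microsoft.EventHub.CaptureFileCreated"


def blob_location(event: dict) -> Optional[Tuple[str, str]]:
    """Return (container, blob_path) for a capture event, else None."""
    if event.get("eventType") != CAPTURE_EVENT_TYPE:
        return None
    file_url = event["data"]["fileUrl"]  # assumed field name
    path = urlparse(file_url).path.lstrip("/")
    container, _, blob_path = path.partition("/")
    return container, blob_path


# Illustrative event; only the fields used above are included.
sample_event = {
    "eventType": CAPTURE_EVENT_TYPE,
    "data": {
        "fileUrl": "https://example.blob.core.windows.net/captures/ns/hub/0/00.avro"
    },
}
print(blob_location(sample_event))
```

With the container and blob path in hand, the function can download the captured file and load its rows into Azure Synapse Analytics, which is the final step of the tutorial.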

Visualize data anomalies in real-time events sent to Azure Event Hubs

Stream Twitter data into Azure Databricks using Event Hubs

Code Samples

Azure Event Hubs samples

Official Resources

Azure Event Hubs documentation

Other Resources

Azure Event Hub Tutorial | Big data message streaming service (video)

Azure Event Hubs for Apache Kafka | Azure Friday (video)

How to perform data ingestion with Azure Event Hubs | Azure Makers Series (video)

Messaging with Azure Event Hubs (video)

Azure Stream Analytics

Azure Stream Analytics is a serverless real-time analytics service. It uses a SQL-based query language that can be extended with custom code and has built-in machine learning capabilities.

Azure Stream Analytics workflow (source)

Other Azure services commonly used with Azure Stream Analytics: Azure Event Hub, Power BI, Azure Blob Storage, Azure Event Grid, Azure Functions, and Azure Synapse Analytics.
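
Much of what a Stream Analytics query does is group events into time windows and aggregate each window. The sketch below mimics a tumbling (fixed, non-overlapping) window count in plain Python to show the idea. The event shape and names are assumptions, and the service's own handling of window boundaries may differ from the simple floor division used here.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tumbling_window_counts(
    events: Iterable[Tuple[float, str]], window_seconds: int = 10
) -> Dict[Tuple[int, str], int]:
    """Count events per (window_start, key) in fixed, non-overlapping windows."""
    counts: Counter = Counter()
    for timestamp, key in events:
        # Every timestamp belongs to exactly one window.
        window_start = int(timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Roughly analogous Stream Analytics query (names are hypothetical):
#   SELECT DeviceId, COUNT(*) FROM input TIMESTAMP BY EventTime
#   GROUP BY DeviceId, TumblingWindow(second, 10)
events = [
    (0.5, "sensor-a"),
    (3.0, "sensor-b"),
    (9.9, "sensor-a"),
    (10.0, "sensor-a"),
    (17.2, "sensor-b"),
]
window_counts = tumbling_window_counts(events)
print(window_counts)
```

In the managed service you write only the query, and the grouping, state, and scaling are handled for you.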

Pricing

Azure Stream Analytics Pricing is based on the number of Streaming Units provisioned. A Streaming Unit represents the amount of memory and compute allocated to your resources.

Hands-On Tutorials

Analyze fraudulent call data with Stream Analytics and visualize results in Power BI dashboard

  • Create an event hub.
  • Start the event generator application.
  • Create a Stream Analytics job and configure its input from the event hub.
  • Visualize the transformed data in Power BI.
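
One plausible rule for flagging fraudulent calls is that the same subscriber identity appears on two different switches within a few seconds. The sketch below applies that rule in plain Python. The `Call` fields, the five-second gap, and the sample data are assumptions for illustration and are not taken from the tutorial's job definition.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Call(NamedTuple):
    time: float   # seconds since some reference point
    imsi: str     # subscriber identity
    switch: str   # switch that handled the call


def suspicious_pairs(
    calls: Iterable[Call], max_gap: float = 5.0
) -> List[Tuple[Call, Call]]:
    """Pairs of calls with the same IMSI on different switches within max_gap."""
    ordered = sorted(calls, key=lambda c: c.time)
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second.time - first.time > max_gap:
                break  # later calls are even further away in time
            if first.imsi == second.imsi and first.switch != second.switch:
                pairs.append((first, second))
    return pairs


calls = [
    Call(0.0, "A", "US"),
    Call(2.0, "A", "UK"),   # same subscriber, different switch, 2 s later
    Call(3.0, "B", "US"),
    Call(20.0, "A", "DE"),  # too late to pair with the earlier calls
]
pairs = suspicious_pairs(calls)
print(pairs)
```

In a Stream Analytics job the same idea would be written as a self-join on the call stream with a time bound, and the count of matches per window is what ends up on the Power BI dashboard.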

Monitor GeoFences in real-time using Azure SQL and Stream Analytics

Code Examples

Real-Time Serverless GeoSpatial Public Transportation GeoFencing Solution

Official Resources

Azure Stream Analytics documentation

Azure Stream Analytics

Other Resources

Azure Stream Analytics Tutorial | Processing stream data with SQL (video)

Anomaly detection using machine learning in Azure Stream Analytics | Azure Friday (video)

Azure Stream Analytics (video)

Comparing Amazon Kinesis, Azure Event Hub and Azure Stream Analytics

Comparing Apache Kafka, Amazon Kinesis, Microsoft Event Hubs, and Google Pub/Sub

A closer look at data streaming capabilities of Amazon Kinesis and Azure Event Hubs

An Introduction to stream processing systems: Kafka, AWS Kinesis and Azure Event Hubs

Apache Kafka VS AWS Kinesis

I hope you have been able to gain insight into stream analytics and services provided by AWS and Azure for processing stream data.

Originally published at https://trojrobert.github.io on December 26, 2020.


Robert John

I develop machine learning models and deploy them to production using cloud services.