Azure and Amazon Data Stream Analytics and Processing: Amazon Kinesis, Azure Stream Analytics, and Azure Event Hub
Data stream analytics, also called event stream processing or real-time analytics, is the processing and analysis of data in real time, as it arrives. Streaming data can include transaction data from an e-commerce website, data from financial trading floors, telemetry from IoT devices, and social media data.
This article gives a brief description of the data stream analytics services in AWS and Azure and their use cases. It also points to resources on their pricing, components, hands-on examples, code samples, and a comparison.
Use Cases for Stream Analytics
- Anomaly detection in credit card transactions
- Anomaly detection (fraud/outliers)
- Application logging
- Analytics pipelines, such as clickstreams
- Live dashboarding
- Archiving data
- Transaction processing
- User telemetry processing
- Device telemetry streaming
- Alert system
- Clickstream analytics
- Analyze streaming social media data
Amazon Kinesis
Amazon Kinesis is an Amazon Web Services offering for the easy, scalable collection, processing, and analysis of video and data streams in real time. It can also be used for batch processing and for data ingestion for machine learning, analytics, and other applications.
Other Amazon Web Services used with Amazon Kinesis: Kinesis Video Streams, Kinesis Data Streams, Kinesis Data Firehose, Kinesis Data Analytics, Amazon EC2, Amazon S3, Amazon Redshift, DynamoDB, Amazon EMR, Amazon Elasticsearch Service, and AWS Lambda.
Types of Amazon Kinesis
Kinesis Video Streams: capture, process, and store video streams.
Kinesis Data Streams: capture, process, and store data streams.
Kinesis Data Firehose: capture, transform, and load data streams into AWS data stores.
Kinesis Data Analytics: process data streams in real time with SQL or Apache Flink.
Amazon Kinesis Data Streams Pricing is based on Shard Hours and PUT Payload Units.
Amazon Kinesis Video Streams Pricing is based on the volume of data you ingest, store, and consume in your video streams.
Amazon Kinesis Data Firehose Pricing is based on the volume of data ingested into Kinesis Data Firehose.
Amazon Kinesis Data Analytics Pricing is based on the average number of Kinesis Processing Units (KPUs) used to run your stream processing application, charged per hour.
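To make the Kinesis Data Streams pricing model more concrete, here is a minimal sketch of how a monthly estimate could be put together from shard hours and PUT payload units. It assumes one PUT payload unit covers up to 25 KB of a record, taken here as 25 * 1024 bytes. The function names, the 730-hour month, and the rates passed in are illustrative assumptions, not quoted prices, so check the current AWS pricing page before relying on any number.

```python
import math

# Assumption: one PUT payload unit covers up to 25 KB of a single record.
PUT_PAYLOAD_UNIT_BYTES = 25 * 1024


def put_payload_units(record_size_bytes):
    """PUT payload units billed for one record (size rounded up to 25 KB chunks)."""
    return max(1, math.ceil(record_size_bytes / PUT_PAYLOAD_UNIT_BYTES))


def estimate_monthly_cost(shards, records_per_second, record_size_bytes,
                          shard_hour_rate, rate_per_million_units, hours=730):
    """Rough monthly estimate: shard-hour charge plus PUT payload unit charge.

    Both rates are caller-supplied placeholders, not real prices.
    """
    shard_cost = shards * hours * shard_hour_rate
    total_units = (records_per_second * 3600 * hours
                   * put_payload_units(record_size_bytes))
    put_cost = total_units / 1_000_000 * rate_per_million_units
    return shard_cost + put_cost


if __name__ == "__main__":
    # Hypothetical workload and placeholder rates, for illustration only.
    cost = estimate_monthly_cost(
        shards=2, records_per_second=100, record_size_bytes=1024,
        shard_hour_rate=0.015, rate_per_million_units=0.014,
    )
    print(f"Estimated monthly cost: {cost:.2f}")
```

The estimate ignores optional charges such as extended retention or enhanced fan-out, so treat it as a lower bound for a simple workload.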
- Set up a Kinesis Agent.
- Create an end-to-end data delivery stream using Kinesis Data Firehose.
- Process incoming data with SQL queries in Amazon Kinesis Data Analytics.
- Load processed data from Kinesis Data Analytics into Amazon Elasticsearch Service.
- Analyze and visualize the processed data using Kibana.
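The steps above use the Kinesis Agent to ship data into the delivery stream. As an alternative illustration, the sketch below shows how a single event could be sent to Kinesis Data Firehose from Python through a boto3-style client. The `put_record` call with `DeliveryStreamName` and `Record={"Data": ...}` follows the boto3 Firehose client. The helper names and the newline-terminated JSON encoding are assumptions made for this example.

```python
import json


def build_firehose_record(event):
    """Serialize one event as newline-terminated JSON bytes.

    The trailing newline is an assumption that makes it easier for
    downstream consumers to split concatenated records.
    """
    return {"Data": (json.dumps(event) + "\n").encode("utf-8")}


def send_event(client, delivery_stream_name, event):
    """Send one event through a boto3-style Firehose client."""
    return client.put_record(
        DeliveryStreamName=delivery_stream_name,
        Record=build_firehose_record(event),
    )
```

With boto3 installed and AWS credentials configured, the client would typically be created with `boto3.client("firehose")` and used as `send_event(client, "web-logs", {"path": "/home", "status": 200})`, where `web-logs` is a hypothetical delivery stream name.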
Azure Event Hub
Event Hub is a real-time data ingestion service. The streamed data can be stored in Blob storage, Data Lake, and other storage services. It can be connected to Azure Stream Analytics to analyze the data. For example, data can be sent to Event Hub, analyzed with Azure Stream Analytics in real time, and then visualized with Power BI.
Other Azure services commonly used with Azure Event Hub: Azure Stream Analytics, Power BI, Azure Blob storage, Azure Event Grid, Azure Functions, and Azure Synapse Analytics.
Azure Event Hub Pricing is based on throughput units, ingress events, and Capture.
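Because throughput units drive the cost, a rough sizing calculation can help. The sketch below assumes the commonly documented standard-tier ingress limit of up to 1 MB per second or 1,000 events per second per throughput unit, whichever is reached first. The function name and the example workloads are illustrative, and the limits should be verified against the current Azure documentation.

```python
import math

# Assumed standard-tier ingress limits per throughput unit (verify before use).
INGRESS_MB_PER_TU = 1.0
INGRESS_EVENTS_PER_TU = 1000


def throughput_units_needed(events_per_second, avg_event_size_bytes):
    """Estimate throughput units needed to absorb a steady ingress rate."""
    mb_per_second = events_per_second * avg_event_size_bytes / 1_000_000
    by_volume = math.ceil(mb_per_second / INGRESS_MB_PER_TU)
    by_count = math.ceil(events_per_second / INGRESS_EVENTS_PER_TU)
    # The stricter of the two limits decides; at least one unit is required.
    return max(1, by_volume, by_count)


if __name__ == "__main__":
    # Many small events: the event-count limit dominates.
    print(throughput_units_needed(2500, 200))
    # Fewer, larger events: the data-volume limit dominates.
    print(throughput_units_needed(800, 4000))
```

This only covers ingress. Egress limits and bursts above the average rate would need their own headroom.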
- Data sent to an Azure event hub is captured in Azure Blob storage.
- When the data capture is complete, an event is generated and sent to Azure Event Grid.
- Event Grid forwards this event data to an Azure function app.
- The function app uses the blob URL in the event data to retrieve the blob from storage.
- The function app migrates the blob data to Azure Synapse Analytics.
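The step where the function app reads the blob URL from the event can be sketched as follows. This is a minimal illustration, assuming the event arrives as a dictionary in the Event Grid schema with an `eventType` of `Microsoft.EventHub.CaptureFileCreated` and the blob URL under `data.fileUrl`. The helper name and the sample URL are hypothetical, and downloading the blob and loading it into Azure Synapse Analytics are left out.

```python
from urllib.parse import urlparse

# Event type assumed for Event Hub capture notifications.
CAPTURE_EVENT_TYPE = "Microsoft.EventHub.CaptureFileCreated"


def blob_location_from_event(event):
    """Return (container, blob_path) for a capture event, or None otherwise."""
    if event.get("eventType") != CAPTURE_EVENT_TYPE:
        return None
    file_url = event["data"]["fileUrl"]
    # URL path looks like /<container>/<blob path>; split off the container.
    path = urlparse(file_url).path.lstrip("/")
    container, _, blob_path = path.partition("/")
    return container, blob_path


if __name__ == "__main__":
    # Hypothetical event; the storage account and path are made up.
    sample_event = {
        "eventType": CAPTURE_EVENT_TYPE,
        "data": {
            "fileUrl": "https://examplestorage.blob.core.windows.net/"
                       "capture/ns/hub/0/2020/12/26/10/00/00.avro"
        },
    }
    print(blob_location_from_event(sample_event))
```

Returning `None` for other event types lets the function app ignore unrelated notifications instead of failing on them.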
Azure Stream Analytics
Azure Stream Analytics is a serverless real-time analytics service. It uses a SQL-based query language that can be extended with custom code, and it has built-in machine learning capabilities.
Other Azure services commonly used with Azure Stream Analytics: Azure Event Hub, Power BI, Azure Blob storage, Azure Event Grid, Azure Functions, and Azure Synapse Analytics.
Azure Stream Analytics Pricing is based on the number of Streaming Units provisioned. A Streaming Unit represents the amount of memory and compute allocated to your job.
- Create event hubs.
- Start the event generator application.
- Create a Stream Analytics job and configure the job input using Event Hub.
- Visualize the transformed data in Power BI.
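The transformation inside a Stream Analytics job is often an aggregation over fixed time windows, such as a count per key per minute. The Python sketch below emulates a tumbling window (fixed size, non-overlapping) to show the idea. It is not Stream Analytics query syntax, and the event fields `timestamp` and `key` are assumptions made for this example.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds=60):
    """Count events per (window start, key) in fixed, non-overlapping windows.

    Each event is a dict with a 'timestamp' (seconds since the Unix epoch)
    and a 'key' to group by. Both field names are illustrative.
    """
    counts = Counter()
    for event in events:
        # Align the timestamp down to the start of its window.
        window_start = int(event["timestamp"] // window_seconds) * window_seconds
        counts[(window_start, event["key"])] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical device telemetry events.
    sample = [
        {"timestamp": 0, "key": "device-a"},
        {"timestamp": 30, "key": "device-a"},
        {"timestamp": 61, "key": "device-b"},
    ]
    for (start, key), n in sorted(tumbling_window_counts(sample).items()):
        print(start, key, n)
```

A real job would emit each window's result as soon as the window closes. This batch version only illustrates how events are assigned to windows.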
Comparing Amazon Kinesis, Azure Event Hub and Azure Stream Analytics
I hope you have gained some insight into stream analytics and the services AWS and Azure provide for processing streaming data.
Originally published at https://trojrobert.github.io on December 26, 2020.