Amazon Kinesis: Difference between revisions

From NovaOrdis Knowledge Base

Revision as of 04:57, 19 December 2018

External

Internal

Overview

Kinesis acts as a highly available conduit to stream messages between data producers and data consumers.

Concepts

Stream

A Kinesis data stream is a named set of shards. Streams can be created from the AWS Management Console, with the AWS CLI, or via the Kinesis Data Streams API.
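As a minimal sketch of the API route, the function below creates a stream through a boto3-style Kinesis client passed in by the caller. The helper name `create_stream` and its default shard count are illustrative assumptions; the client is assumed to expose boto3's `create_stream` call and `stream_exists` waiter.

```python
def create_stream(client, stream_name, shard_count=1):
    """Create a Kinesis data stream and wait until it can be used.

    `client` is assumed to be a boto3 Kinesis client,
    e.g. boto3.client("kinesis", region_name="us-west-2").
    """
    if shard_count < 1:
        raise ValueError("a stream needs at least one shard")
    # The call returns immediately; the stream is created asynchronously.
    client.create_stream(StreamName=stream_name, ShardCount=shard_count)
    # Block until the stream exists and is ready for reads and writes.
    client.get_waiter("stream_exists").wait(StreamName=stream_name)
```

Because the client is a parameter, the same function can be exercised against a stub in tests and against a real client in a deployment.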

Stream Name

The namespace is defined by the AWS account and AWS region: for the same account, streams with the same name can exist in different regions.

Shard

A shard is a sequence of data records in a stream. Each shard is identified by a Shard ID, which is assigned automatically and can be obtained with the AWS CLI describe-stream command. The partition key of a record determines which shard the record is written to.
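The same Shard IDs can also be read programmatically. The sketch below assumes a boto3-style Kinesis client and uses its `describe_stream` call; the helper name `list_shard_ids` is hypothetical.

```python
def list_shard_ids(client, stream_name):
    """Return the shard IDs reported by describe_stream for one stream.

    `client` is assumed to be a boto3 Kinesis client. Only the first page
    of shards is read; a stream with many shards would need pagination.
    """
    description = client.describe_stream(StreamName=stream_name)["StreamDescription"]
    return [shard["ShardId"] for shard in description["Shards"]]
```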

Shard Iterator

A shard iterator represents the position in a shard of a stream from which the consumer will read.
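A minimal sketch of how a consumer might use a shard iterator, assuming a boto3-style Kinesis client: it asks for an iterator at the oldest available record (`TRIM_HORIZON`) and then follows `NextShardIterator` from batch to batch. The helper name `read_shard`, the batch limit of 100 and the `max_batches` cap are illustrative choices.

```python
def read_shard(client, stream_name, shard_id, max_batches=10):
    """Read up to `max_batches` batches of records from a single shard."""
    # The iterator marks the starting position inside the shard.
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]

    records = []
    for _ in range(max_batches):
        if iterator is None:
            # No further iterator: the shard is closed and fully read.
            break
        response = client.get_records(ShardIterator=iterator, Limit=100)
        records.extend(response["Records"])
        # Each response carries the position to continue from.
        iterator = response.get("NextShardIterator")
    return records
```

A long-running consumer would normally sleep between empty batches instead of stopping after a fixed number of calls.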

Record

A record is the unit of data stored in a stream. A record has a sequence number, a partition key and a data blob. After the data blob is stored in a record, Kinesis does not inspect, interpret or change it in any way.

Data Blob

The data blob is the immutable sequence of bytes constituting the payload of a record. Kinesis Data Streams does not inspect, interpret, or change the data in the blob in any way. A data blob can be up to 1 MB.
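Since the blob is an opaque byte sequence with a size cap, a producer may want to serialize and check the payload before sending it. The sketch below is one way to do that; the helper name `encode_blob` is hypothetical, and reading "1 MB" as 1,048,576 bytes is an assumption.

```python
MAX_BLOB_BYTES = 1024 * 1024  # assumption: "1 MB" interpreted as 1 MiB


def encode_blob(payload: str) -> bytes:
    """Encode text as UTF-8 bytes and reject blobs over the size limit."""
    data = payload.encode("utf-8")
    # The limit applies to the encoded bytes, not to the character count.
    if len(data) > MAX_BLOB_BYTES:
        raise ValueError(
            f"data blob is {len(data)} bytes, limit is {MAX_BLOB_BYTES}"
        )
    return data
```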

Partition Key

The partition key determines which shard of a stream a record is written to, and allows a data producer to distribute data across shards. Records with the same partition key are written to the same shard.
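The routing can be sketched locally, without calling AWS. Kinesis hashes the partition key with MD5 into a 128-bit integer and writes the record to the shard whose hash key range contains that value. The function names below are hypothetical, and the evenly split ranges are an assumption made for illustration; a real stream reports its actual ranges in the describe-stream output.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def hash_key(partition_key: str) -> int:
    """Map a partition key to a 128-bit integer via MD5."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def even_ranges(shard_count: int):
    """Split the hash space into `shard_count` contiguous (start, end) ranges."""
    size = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last range absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else (i + 1) * size - 1
        ranges.append((start, end))
    return ranges


def shard_for(partition_key: str, ranges) -> int:
    """Return the index of the range (shard) that owns the partition key."""
    value = hash_key(partition_key)
    for index, (start, end) in enumerate(ranges):
        if start <= value <= end:
            return index
    raise ValueError("hash key ranges do not cover the hash space")
```

Because the mapping depends only on the key, a producer that always uses the same partition key sends all of its records to one shard, while a high-cardinality key such as a user ID spreads load across shards.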

Sequence Number

A sequence number is a unique identifier for records inserted into a shard, assigned by Kinesis. Sequence numbers increase monotonically, and are specific to individual shards. The sequence number for a record is accessible as a header ("aws_receivedSequenceNumber").

Producer

Producers continually push data records to Kinesis Data Streams.
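A minimal producer sketch, assuming a boto3-style Kinesis client and JSON-serializable events. The helper name `put_event` is hypothetical; `put_record` is the client call that writes a single record and returns the shard and sequence number assigned to it.

```python
import json


def put_event(client, stream_name, event, partition_key):
    """Write one event to the stream; return (shard_id, sequence_number)."""
    response = client.put_record(
        StreamName=stream_name,
        # The data blob is opaque to Kinesis, so the encoding is our choice.
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=partition_key,
    )
    return response["ShardId"], response["SequenceNumber"]
```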

Consumer

The consumers may process data in real time. A Kinesis Data Stream consumer can be a custom application running on Amazon EC2 or an Amazon Kinesis Data Firehose delivery stream.

Retention Period

The retention period is the length of time that data records are accessible after they are added to the stream. The default retention period is 24 hours, and it can be increased up to 168 hours (7 days).
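As a sketch of how the retention period might be raised programmatically, the function below validates the requested value against the limits stated above and then calls the client. It assumes a boto3-style Kinesis client with the `increase_stream_retention_period` call; the helper name and the constants are illustrative.

```python
DEFAULT_RETENTION_HOURS = 24   # default stated in the text
MAX_RETENTION_HOURS = 168      # 7 days, maximum stated in the text


def increase_retention(client, stream_name, hours):
    """Raise the retention period of a stream to `hours`."""
    # Only values above the default and within the maximum are accepted here.
    if not DEFAULT_RETENTION_HOURS < hours <= MAX_RETENTION_HOURS:
        raise ValueError(
            f"hours must be in ({DEFAULT_RETENTION_HOURS}, {MAX_RETENTION_HOURS}]"
        )
    client.increase_stream_retention_period(
        StreamName=stream_name,
        RetentionPeriodHours=hours,
    )
```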

Services

Amazon Kinesis Streams

Amazon Kinesis Firehose

Amazon Kinesis Analytics

Kinesis Data Streams API