The Streams API - Writing the Processor and the KStreams API

David Landup
Arpendu Kumar Garai

After exploring the basic Producer and Consumer APIs, let's delve into the Kafka Streams API. With the explosion of big data frameworks and data streaming technologies, systems needed significant power to process massive volumes of data every day. This led to batch processing: the ability to process large quantities of data in bulk. However, batch processing alone was not enough for the industry, which increasingly needed real-time event processing. Thus, a new strategy known as micro-batching was introduced. Micro-batching works like batch processing but handles smaller quantities of data at a time. Because each batch is small, results arrive sooner and at shorter intervals, but it still doesn't provide true real-time, per-event processing.

This prompted the industry to look for stream processing, where data is processed as soon as it arrives in the system. Many streaming frameworks appeared, such as Spark Streaming, NiFi, Flink, Storm, Samza, and others. Most of them are either micro-batching or per-event processing frameworks. Kafka then introduced the Kafka Streams library, which supports exactly-once processing semantics. It is a true streaming framework, as it processes one record at a time rather than in micro-batches.
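
To make the difference between micro-batching and per-record processing concrete, here is a minimal sketch in plain Java. It does not use Kafka or any streaming library. The class `ProcessingModels` and its methods `microBatch` and `perRecord` are hypothetical names chosen for illustration, and the "processing" step is simply appending a label to a log.

```java
import java.util.ArrayList;
import java.util.List;

public class ProcessingModels {

    // Micro-batching: records are buffered and only handed downstream
    // once the buffer reaches batchSize (or the input ends).
    static List<String> microBatch(List<String> records, int batchSize) {
        List<String> log = new ArrayList<>();
        List<String> buffer = new ArrayList<>();
        for (String record : records) {
            buffer.add(record);
            if (buffer.size() == batchSize) {
                log.add("batch" + buffer); // process the whole batch at once
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {
            log.add("batch" + buffer); // flush the incomplete final batch
        }
        return log;
    }

    // Per-record processing: each record is handled as soon as it arrives,
    // with no buffering step in between.
    static List<String> perRecord(List<String> records) {
        List<String> log = new ArrayList<>();
        for (String record : records) {
            log.add("record[" + record + "]");
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> events = List.of("a", "b", "c", "d", "e");
        System.out.println(microBatch(events, 2));
        // prints [batch[a, b], batch[c, d], batch[e]]
        System.out.println(perRecord(events));
        // prints [record[a], record[b], record[c], record[d], record[e]]
    }
}
```

In the micro-batch version, record `a` is not processed until `b` has also arrived, so latency depends on the batch size. In the per-record version, every record produces a result immediately, which is the model the text attributes to Kafka Streams.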


© 2013-2024 Stack Abuse. All rights reserved.