Kafka – Spark Streaming Integration

Spark Streaming is a distributed stream processing engine that can ingest data from a variety of sources. One of the most popular sources is Apache Kafka, a distributed streaming platform that provides the publish and subscribe features of an enterprise messaging system while also supporting data stream processing. In this blog we will create a real-time streaming pipeline for ingesting credit card data and finding Merchants […]


Setup Standalone Apache Kafka Instance

Apache Kafka is a distributed streaming platform that provides the publish and subscribe features of an enterprise messaging system while also supporting data stream processing. In this blog we will set up a standalone Kafka topic on a local machine running the Windows operating system. Please note: treat this setup as a Hello World application; it is not meant for production use. Software versions used in […]
