Apache Kafka is the indispensable backbone for modern data architectures, underpinning everything from microservices communication to real-time analytics. While its capabilities are vast, getting a local setup running and understanding the basic command-line interface (CLI) tools is the crucial first step. This guide will walk you through a quick installation and introduce the essential scripts that bring Kafka to life.
Setting Up Your Local Apache Kafka Environment
Before diving into the scripts, you need a running Kafka instance. Kafka has traditionally relied on ZooKeeper for cluster coordination, so in the classic setup both need to be available. (Newer Kafka releases can instead run in KRaft mode, which replaces ZooKeeper entirely; this guide focuses on the ZooKeeper-based setup.)
Option 1: Manual Installation of Kafka
For a traditional setup, download the latest stable release of Apache Kafka from the official website.
https://kafka.apache.org/downloads
Once downloaded, extract the archive:
# Example for a Linux/macOS system
tar -xzf kafka_2.13-3.7.0.tgz
cd kafka_2.13-3.7.0/
Before proceeding, ensure you have Java (JRE/JDK 11 or higher) installed, as Kafka is a Java application. You can verify your version with java -version. Then, from the installation directory, start the two services in order.
1. Start ZooKeeper:
# Start the ZooKeeper server
./bin/zookeeper-server-start.sh ./config/zookeeper.properties
2. Start the Kafka Broker:
Once ZooKeeper is up, start the Kafka server (the “broker”) in a separate terminal window:
# Start the Kafka broker
./bin/kafka-server-start.sh ./config/server.properties
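Alternatively, if you are on Kafka 3.3 or newer and want to skip ZooKeeper entirely, you can start the broker in KRaft mode. A minimal single-node sketch using the bundled config/kraft/server.properties:

# Generate a cluster ID and format the storage directory (KRaft mode)
KAFKA_CLUSTER_ID="$(./bin/kafka-storage.sh random-uuid)"
./bin/kafka-storage.sh format -t "$KAFKA_CLUSTER_ID" -c ./config/kraft/server.properties

# Start the broker (it acts as both broker and controller in this single-node setup)
./bin/kafka-server-start.sh ./config/kraft/server.properties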
Option 2: Running with Docker (Recommended for Local Dev)
For local development, running Kafka via Docker is significantly faster and cleaner, avoiding direct system dependencies like Java versioning. You’ll need Docker and Docker Compose installed.
Here is a minimal docker-compose.yml file to get a single-broker Kafka and ZooKeeper setup running instantly:
version: '3.7'

services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  kafka:
    image: confluentinc/cp-kafka:7.5.0
    container_name: kafka-broker
    ports:
      - "9092:9092"
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka-broker:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: PLAINTEXT
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

networks:
  default:
    name: kafka-net
To launch your Kafka instance, simply run this command in the directory where you saved the file:
docker-compose up -d
With either the manual or Docker setup complete, your single-broker Kafka environment is ready for action. Note: when using Docker, you run the CLI tools inside the Kafka container (for example via docker exec -it kafka-broker bash); in the Confluent images the tools are already on the PATH and are invoked without the .sh suffix (kafka-topics instead of kafka-topics.sh).
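For example, to list topics through the Dockerized broker (assuming the container name kafka-broker from the compose file above):

# Open an interactive shell inside the broker container
docker exec -it kafka-broker bash

# Or run a tool directly in one shot
docker exec kafka-broker kafka-topics --bootstrap-server localhost:9092 --list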
The Essential Kafka CLI Toolkit
The true power of basic Kafka interaction lies in its bundled shell scripts. These tools allow you to manage topics, test data flow, and inspect the cluster configuration directly.
1. Topic Management: kafka-topics.sh
A Topic is a category name to which records are published. You must create one before you can send or receive any data. The most common commands are for creating and listing topics.
To create a topic named my-first-topic with a single partition and a single replica (suitable for local testing):
./bin/kafka-topics.sh --create --topic my-first-topic --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
To verify the creation:
./bin/kafka-topics.sh --list --bootstrap-server localhost:9092
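You can also inspect the partition count, leader, and replica assignment of a topic with --describe:

./bin/kafka-topics.sh --describe --topic my-first-topic --bootstrap-server localhost:9092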
2. Producing Data: kafka-console-producer.sh
Producers write data to topics. The console producer is the simplest way to test your setup by typing messages directly into the terminal.
Run the following command and start typing messages. Each line you type and press Enter on is published as a message to my-first-topic:
./bin/kafka-console-producer.sh --topic my-first-topic --bootstrap-server localhost:9092
> Hello, Kafka!
> This is a second message.
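The console producer can also send keyed messages, which is how Kafka decides which partition a record lands on. A small sketch using the parse.key and key.separator properties (the keys and values below are just examples):

# Everything before the ':' becomes the record key, everything after it the value
./bin/kafka-console-producer.sh --topic my-first-topic --bootstrap-server localhost:9092 \
  --property parse.key=true --property key.separator=:
> user-42:logged in
> user-42:logged out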
3. Consuming Data: kafka-console-consumer.sh
Consumers read data from topics. To see the messages you just produced, run the console consumer in another terminal. The --from-beginning flag ensures you see all messages currently stored in the topic:
./bin/kafka-console-consumer.sh --topic my-first-topic --from-beginning --bootstrap-server localhost:9092
You should immediately see the messages you sent: “Hello, Kafka!” and “This is a second message.”
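If you produced keyed messages, you can ask the consumer to print the key in front of each value via the print.key property (keys and values are tab-separated by default):

./bin/kafka-console-consumer.sh --topic my-first-topic --from-beginning --bootstrap-server localhost:9092 \
  --property print.key=true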
4. Group Management: kafka-consumer-groups.sh
In a real application, consumers belong to a Consumer Group to collaboratively read from a topic. This script is vital for monitoring their progress.
To check the state of consumer groups subscribed to your topic (useful for debugging lag):
./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list
# To check the status of a specific group (e.g., console-consumer-...)
./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group console-consumer-42001
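A common follow-up is resetting a group's offsets, for example to replay a topic from the beginning. A sketch using the standard reset flags (the group name my-app-group is just an example, and the group's consumers must be stopped first):

# Dry run: show the offsets that would be set without applying them
./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group my-app-group \
  --topic my-first-topic --reset-offsets --to-earliest --dry-run

# Apply the reset
./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group my-app-group \
  --topic my-first-topic --reset-offsets --to-earliest --execute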
Next Steps
Mastering these four basic command-line tools—along with occasionally using kafka-configs.sh for broker-level changes or kafka-delete-records.sh for maintenance—provides a solid foundation for any developer working with Apache Kafka.
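As a quick taste of kafka-configs.sh, here is a sketch that overrides the retention of my-first-topic to one hour (the value is only an example; the same script also manages broker-level configs):

# Set a topic-level retention override of 1 hour (3,600,000 ms)
./bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics \
  --entity-name my-first-topic --alter --add-config retention.ms=3600000

# Verify the override
./bin/kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics \
  --entity-name my-first-topic --describe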
Now that you have your first stream flowing, the next logical step is integrating Kafka with your Node.js or Java Spring applications, moving from the console to code.
Try it at home!
