Creating a Kafka Client

A Kafka client is an application that interacts with a Kafka cluster. You create producer applications to send data to Kafka topics and consumer applications to read messages from Kafka topics.

Installing Kafka

On each compute instance you create, you install the Apache Kafka client libraries and tools.

  1. Connect to the compute instance you created.
    ssh -i <private-key-file> <username>@<public-ip-address>
  2. Apache Kafka requires Java. Install Java on the compute instance if it's not already installed. You can verify whether Java is installed by running the version command.
    java -version
    sudo yum install java-11-openjdk -y
  3. Download the Apache Kafka version you want to install from the official Apache Kafka server.
    wget https://downloads.apache.org/kafka/<version>/kafka_<version>.tgz
  4. Extract the downloaded package.
    tar -xzf kafka_<version>.tgz

Configuring the Client

On each compute instance you create, you configure the client properties file.

  1. Connect to the compute instance you created.
    ssh -i <private-key-file> <username>@<public-ip-address>
  2. Change directory to the config directory in the installed Apache Kafka client library location.
    cd kafka_<version>/config
  3. Create a file named client.properties.
    nano client.properties
  4. Depending on the authentication you configured for the Kafka cluster, create the security property settings in the client.properties file.
    • For SASL/SCRAM authentication
      security.protocol=SASL_SSL
      sasl.mechanism=SCRAM-SHA-512
      sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="<vault-username>" password="<vault-secret>";
    • For mTLS authentication
      security.protocol=SSL
      ssl.certificate.location=/leaf.cert
      ssl.key.location=/leaf.key
      ssl.keystore.password=<password>
      ssl.keystore.location=/kafka-keystore.p12
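With the properties in place, connectivity can be smoke-tested using the console producer shipped in the Kafka bin directory. A minimal sketch for the SASL/SCRAM case (the /tmp path is an arbitrary choice, and the credential placeholders must be replaced with your vault values):

```shell
# Write a SASL/SCRAM client.properties. Note the trailing semicolon,
# which the sasl.jaas.config value requires.
cat > /tmp/client.properties <<'EOF'
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="<vault-username>" password="<vault-secret>";
EOF

# With a reachable cluster and real credentials, send a test record:
#   echo hello | bin/kafka-console-producer.sh --bootstrap-server <bootstrap-url> \
#     --producer.config /tmp/client.properties --topic <topic-name>
```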

Managing Client Configuration

Kafka clients require configuration that determines how the client sends and receives messages, handles errors, and manages its connection to the Kafka cluster.

Apache Kafka provides command-line interface (CLI) tools in the bin directory of the installed Kafka client libraries. For example, the kafka-configs.sh tool can be used to manage client and topic configurations.

Following are some examples of common CLI commands. Get the bootstrap URL from the cluster details and use it for $URL in the commands. For the --command-config option, specify the path to the client.properties file you created when you configured the client.
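As a sketch, the $URL and $TOPIC_NAME placeholders used in the commands below can be set as shell variables. The values shown here are made up; substitute the bootstrap URL from your cluster details and your own topic name:

```shell
# Placeholder values for the examples that follow; both are assumptions.
URL="broker.example.com:9092"
TOPIC_NAME="example-topic"
```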

Viewing topic configuration

To view a topic configuration, run the kafka-configs.sh tool by specifying the name of the topic for which you want to view the configuration.

bin/kafka-configs.sh \
--bootstrap-server $URL \
--command-config /path/to/client.properties \
--entity-type topics \
--entity-name $TOPIC_NAME \
--describe
Viewing partition and replication information for all topics

To view partition and replication information for all topics, run the kafka-topics.sh tool.

bin/kafka-topics.sh \
--bootstrap-server $URL \
--command-config /path/to/client.properties \
--describe | grep Factor | sort
Changing topic configuration

To change a topic configuration, such as retention and segment, run the kafka-configs.sh tool by specifying the name of the topic for which you want to change the configuration.

bin/kafka-configs.sh \
--bootstrap-server $URL \
--command-config /path/to/client.properties \
--entity-type topics \
--entity-name $TOPIC_NAME \
--alter \
--add-config retention.ms=1,segment.ms=60000
Removing topic configuration

To remove a topic configuration, such as retention and segment, run the kafka-configs.sh tool by specifying the name of the topic for which you want to remove the configuration.

bin/kafka-configs.sh \
--bootstrap-server $URL \
--command-config /path/to/client.properties \
--entity-type topics \
--entity-name $TOPIC_NAME \
--alter \
--delete-config retention.ms,segment.ms
Viewing disk usage for each partition

To view disk usage for each partition, run the kafka-log-dirs.sh tool and specify the path to the output log file.

bin/kafka-log-dirs.sh \
--bootstrap-server $URL \
--command-config /path/to/client.properties \
--describe | tail -1 > /tmp/logdirs.output.txt

Then, run the following command to parse the log file, print the size of each partition, and sum the sizes. Adjust brokers[0] to target a specific broker.

cat /tmp/logdirs.output.txt \
  | jq -r '.brokers[0] | .logDirs[0].partitions[] | .partition + " " + (.size|tostring)' \
  | sort \
  | awk 'BEGIN {sum=0} {sum+=$2} END {print sum}'
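To check the jq filter without a live cluster, it can be run against a small made-up sample of the kafka-log-dirs.sh JSON. The broker, directory, partition, and size values below are invented; the field names follow the tool's output schema:

```shell
# Made-up sample of the JSON line emitted by kafka-log-dirs.sh --describe.
cat > /tmp/logdirs.sample.json <<'EOF'
{"version":1,"brokers":[{"broker":0,"logDirs":[{"logDir":"/tmp/kafka-logs","partitions":[{"partition":"orders-0","size":1024},{"partition":"orders-1","size":2048}]}]}]}
EOF

# Same pipeline as above: print "partition size" pairs, then sum the sizes.
cat /tmp/logdirs.sample.json \
  | jq -r '.brokers[0] | .logDirs[0].partitions[] | .partition + " " + (.size|tostring)' \
  | sort \
  | awk 'BEGIN {sum=0} {sum+=$2} END {print sum}'
```

For this sample the pipeline prints 3072, the total size in bytes of the two partitions on broker 0.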
Enabling trace log

To enable trace logging for the Kafka client tools, create a file named log4j.properties with the following properties.

log4j.rootLogger=TRACE, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n 

Then, run the following command.

export KAFKA_OPTS="-Dlog4j.configuration=file:/path/to/log4j.properties"
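Putting the two steps together, a minimal sketch (the /tmp path is an arbitrary choice):

```shell
# Write a trace-level Log4j configuration to a scratch location.
cat > /tmp/log4j.properties <<'EOF'
log4j.rootLogger=TRACE, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
EOF

# Point the Kafka CLI tools at it; any tool started from this shell
# (for example bin/kafka-configs.sh) now logs at TRACE level.
export KAFKA_OPTS="-Dlog4j.configuration=file:/tmp/log4j.properties"
```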

Monitoring Client Metrics

You must monitor both the client applications and the Kafka cluster.

Use OCI Monitoring service to monitor the metrics emitted by the Kafka brokers.

For the client side metrics, you must create your own custom dashboard to monitor client applications. At a minimum, monitor the following client metrics:

  • record-send-rate
  • request-latency-avg
  • error-rate
  • records-consumed-rate
  • lag
  • fetch-latency-avg
  • retries
  • disconnects
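Of these, consumer lag can also be spot-checked from the CLI with the kafka-consumer-groups.sh tool, whose --describe output includes per-partition CURRENT-OFFSET, LOG-END-OFFSET, and LAG columns. As a sketch, total group lag is the sum of the LAG column; the sample output below is made up, but the column layout matches the real tool:

```shell
# Made-up sample of `kafka-consumer-groups.sh --bootstrap-server $URL
#   --command-config /path/to/client.properties --describe --group <group-name>`
# output; the column layout is the tool's, the group, topic, and numbers are not.
cat > /tmp/group.describe.txt <<'EOF'
GROUP TOPIC  PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG
app1  orders 0         100            150            50
app1  orders 1         200            210            10
EOF

# LAG is column 6; skip the header row and sum it for the total group lag.
awk 'NR>1 {sum+=$6} END {print sum}' /tmp/group.describe.txt
```

For this sample the total lag is 60 records across the two partitions.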