1. Create the main and sub transformations as described below.
2. Call the sub transformation from the main transformation.
Note: The Kafka consumer step requires a sub transformation.
Download a working sample from here:
https://drive.google.com/open?id=1Z4C2miczU0BnB4n3r1LcpN78v2UjefWQ
In the Kafka transformation:
1. We use a direct bootstrap server for the connection.
2. We added the consumer group "test-consumer-group1". Change the consumer group after every run to retrieve the Kafka messages from the start, for example test-consumer-group1, test-consumer-group2, test-consumer-group3, and so on.
Important: If you do not change the consumer group, Kafka will not return any messages until a new message arrives on the topic, because the existing messages are already marked as read for that group.
3. We changed auto.offset.reset to "earliest" on the Options tab.
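Instead of renaming the consumer group by hand after each run, the name can be generated. The following is a minimal sketch, not part of the sample download: the function names are hypothetical, the broker address localhost:9092 is borrowed from the console producer command used later in this tutorial, and how the generated value reaches the Kafka consumer step is left open.

```python
import time
import uuid


def unique_consumer_group(prefix="test-consumer-group"):
    """Return a consumer group name that is different on every call."""
    # The timestamp keeps names roughly ordered by run; the random suffix
    # avoids collisions when two runs start within the same second.
    return f"{prefix}-{int(time.time())}-{uuid.uuid4().hex[:8]}"


def consumer_settings(bootstrap_server="localhost:9092"):
    """Collect the three settings discussed above under their Kafka config names."""
    return {
        "bootstrap.servers": bootstrap_server,   # direct bootstrap server
        "group.id": unique_consumer_group(),     # fresh group on every run
        "auto.offset.reset": "earliest",         # read the topic from the start
    }


print(consumer_settings())
```

Because a new group has no stored offsets, the "earliest" setting takes effect and the topic is read from the beginning each time.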
In the sub transformation:
In the "Get records from stream" step, we defined the following fields:
Fieldname    Type
key          None
message      None
topic        None
partition    None
offset       None
timestamp    Timestamp
In this tutorial, I will show how to set up and run Apache Kafka on Windows/Linux and how to load/read data from Pentaho.
Kafka comes with two sets of scripts to run Kafka. In the bin folder, the sh files are used to set up Kafka in a Linux environment. In the bin\windows folder, there are bat files that correspond to those sh files and are meant to work in a Windows environment. Some say you can use Cygwin to execute the sh scripts in order to run Kafka. However, there are many additional steps involved, and in the end you may not get the desired outcome. With the correct bat files, there is no need to use Cygwin, and only the Server JRE is required to run Kafka on Windows.
Step 0: Preparation
Install Java SE 8 Server JRE/JDK
You need the Java SE Server JRE in order to run Kafka. If you have the JDK installed, you already have the Server JRE; just check that the folder {JRE_PATH}\bin\server exists. If it does not, install the Java SE Server JRE.
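Before installing anything, the folder check described above can be scripted. This is only a sketch: the function name is hypothetical, and the path you pass in stands for your own {JRE_PATH}.

```python
from pathlib import Path


def has_server_jvm(jre_path):
    """Return True if <jre_path>/bin/server exists as a directory."""
    return (Path(jre_path) / "bin" / "server").is_dir()


# Example call with a placeholder path; replace it with your own {JRE_PATH}:
# has_server_jvm(r"C:\path\to\jre")
```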
The config files need to be updated to follow the Windows path naming convention.
Change the paths below if you are using Windows.
*** Important: Create these paths inside the Kafka root directory, otherwise the Kafka server may not start.
1. Open config\server.properties, change
server.properties
log.dirs=/tmp/kafka-logs
to
server.properties
log.dirs=c:/kafka/kafka-logs
2. Open config\zookeeper.properties, change
zookeeper.properties
dataDir=/tmp/zookeeper
to
zookeeper.properties
dataDir=c:/kafka/zookeeper-data
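If you prefer not to edit the two files by hand, the same change can be applied with a small script. This is a sketch under assumptions: set_property is a hypothetical helper, it works on the file content as a string, and the broker.id line in the demo is only sample data. Reading and writing the actual files is left to you.

```python
import re


def set_property(text, key, value):
    """Return text with the 'key=...' line replaced by 'key=value'.

    If the key is not present, the new line is appended at the end.
    Commented-out lines such as '#key=...' are left untouched.
    """
    pattern = re.compile(rf"^{re.escape(key)}\s*=.*$", flags=re.MULTILINE)
    new_line = f"{key}={value}"
    if pattern.search(text):
        # A function replacement avoids backslash escaping issues in paths.
        return pattern.sub(lambda _match: new_line, text)
    separator = "" if text.endswith("\n") or not text else "\n"
    return f"{text}{separator}{new_line}\n"


# Demo on in-memory sample content (no files are touched).
sample = "broker.id=0\nlog.dirs=/tmp/kafka-logs\n"
print(set_property(sample, "log.dirs", "c:/kafka/kafka-logs"))
```

The same call with the key dataDir and the value c:/kafka/zookeeper-data covers zookeeper.properties.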
Step 2: Start the Server
In Windows Command Prompt, switch the current working directory to C:\kafka:
cd C:\kafka
Start ZooKeeper
Kafka uses ZooKeeper, so you need to start a ZooKeeper server first if you don’t already have one. You can use the convenience script packaged with Kafka to get a quick-and-dirty single-node ZooKeeper instance.
You can create a bat file to start ZooKeeper (optional) and put the content below in it:
zookeeper-server-start.bat {base folder}\kafka_2.11-1.1.0\config\zookeeper.properties
You can create a bat file to start Kafka (optional) and put the content below in it:
kafka-server-start.bat {base folder}\kafkaBinary\kafka_2.11-1.1.0\config\server.properties
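Since Kafka needs ZooKeeper to be running first, it can help to wait until ZooKeeper accepts connections before launching the Kafka server. The helper below is a sketch with a hypothetical name. It assumes ZooKeeper listens on port 2181, the clientPort value in the stock zookeeper.properties; adjust it if your configuration differs.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


# Example: start ZooKeeper, then wait before starting Kafka.
# if wait_for_port("localhost", 2181):   # 2181 is an assumed default
#     ...start the Kafka server here...
```

An open port only shows that a process is listening, not that ZooKeeper is fully healthy, so treat this as a convenience check.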
Kafka comes with a command line client that will take input from a file or from standard input and send it out as messages to the Kafka cluster. By default each line will be sent as a separate message.
Start console producer
> .\bin\windows\kafka-console-producer.bat --broker-list localhost:9092 --topic test
If you have each of the above commands running in a different terminal then you should now be able to type messages into the producer terminal and see them appear in the consumer terminal.
Yay cheers!
Common Errors and Solutions
classpath is empty. please build the project first e.g. by running 'gradlew jarall'
Note: Do not download the source files from Apache Kafka; download a binary release instead.
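A quick way to tell which kind of download you have is to look for jar files. The sketch below assumes that the binary release keeps its jars in a libs folder under the Kafka root, as the kafka_2.11-1.1.0 archive does; the function name is hypothetical.

```python
from pathlib import Path


def has_kafka_jars(kafka_root):
    """Return True if <kafka_root>/libs contains at least one .jar file."""
    libs = Path(kafka_root) / "libs"
    return libs.is_dir() and any(libs.glob("*.jar"))


# Example call with a placeholder path; replace it with your Kafka folder:
# has_kafka_jars(r"C:\kafka")
```

If this returns False, the folder is most likely a source download or an incomplete extraction, which is when the start scripts report an empty classpath.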