Process CSVs from Amazon S3 using Apache Flink, JHipster, and Kubernetes
Apache Flink is one of the latest distributed Big Data frameworks with a goal of…
Confluent & Twitter4j Tutorial
Reading a Real-Time Stream of Tweets into Kafka: Kafka is an amazing tool for processing…
This post demonstrates a cost-effective, automated solution for running Spark jobs on an EMR cluster on a daily basis using CloudWatch, Lambda, EMR, S3, and SNS.…
In our previous video on the basics of Nifi, we covered a brief definition of…
Performance Tweaking Apache Spark
Apache Spark Streaming applications need to be monitored frequently to be certain that they are…
In our previous article on Nifi, we discussed the history, architecture, and features of Apache…