Total 21 Posts

Big Data

ABC's of DQM: Control

This is the finale of a 3-part series introducing a Data Quality Management (DQM) framework…

Aug 31, 2020 8 min read

Dan Ferguson

Big Data

The ABCs of DQM: Balance

This post is part of a series on Data Quality Management. The…

Aug 18, 2020 5 min read

Pooja Krishnan

Data

Apache Spark 3.0

Databricks recently announced the release of Apache Spark 3.0 with their Databricks Runtime 7.…

Jun 23, 2020 3 min read

Theo LEBRUN

Big Data

Basics of Apache Nifi: 2

In our previous video on the basics of Nifi, we covered a brief definition of…

Nov 15, 2017 1 min read

Malcolm Thirus

Big Data

Data Extrapolation: Learning From Your Big Data

The first step in answering any Big Data-oriented question is to simply obtain the data.…

Jun 08, 2017 6 min read

Justin Risch

Big Data

MongoDB and Apache Spark - Getting started tutorial

MongoDB and Apache Spark are two popular Big Data technologies. In my previous post, I…

May 03, 2017 6 min read

Raphael Brugier

Big Data

Basics of Apache Nifi: 1

In our previous article on Nifi, we discussed the history, architecture, and features of Apache…

Apr 25, 2017 1 min read

Malcolm Thirus

Big Data

Streaming With Scala: The Nuance of Real-Time Twitter Data

At Ippon Technologies USA, we're lucky enough to have "Coding Dojos" every 2-4…

Mar 08, 2017 3 min read

Justin Risch

LifeAtIppon

Pokemon GO: A Big Data Learning Opportunity

Nick Peterson and Justin Risch have begun to study Big Data, Spark, Hadoop, and the…

Feb 16, 2017 6 min read

Justin Risch

Big Data

Why NiFi?

In this day and age, it is not a luxury to…

Jan 26, 2017 4 min read

Doug Mengistu

Big Data

Kafka Streams - Scaling up or down

Kafka Streams is a new component of the Kafka platform. It is a lightweight library…

Oct 06, 2016 6 min read

Alexis Seigneurin

Big Data

Spark - Calling Scala code from PySpark

In a previous post, I demonstrated how to consume a Kafka topic using Spark in…

Sep 12, 2016 4 min read

Alexis Seigneurin

Big Data

Apache Spark Datasets

With a Spark 2.0 release imminent, the previously experimental Datasets API will be a…

Jun 15, 2016 4 min read

Malcolm Thirus

Big Data

Spark & Kafka - Achieving zero data-loss

Kafka and Spark Streaming are two technologies that fit well together. Both are distributed systems…

May 12, 2016 9 min read

Alexis Seigneurin

Big Data

A tour of Databricks Community Edition: a hosted Spark service

With the recent announcement of the Community Edition, it’s time to have a look…

Apr 13, 2016 6 min read

Raphael Brugier

Apache Spark