Apache Spark

Total 8 Posts

Performance Tweaking Apache Spark

Apache Spark Streaming applications need to be monitored frequently to be certain that they are…

Jun 26, 2017

Incrementally loaded Parquet files

In this post, I explore how you can leverage Parquet when you need to load…

May 17, 2017

MongoDB and Apache Spark - Getting started tutorial

MongoDB and Apache Spark are two popular Big Data technologies. In my previous post, I…

May 03, 2017

Introduction to the MongoDB connector for Apache Spark

MongoDB is one of the most popular NoSQL databases. Its unique capabilities to store document-oriented…

Mar 31, 2017

Spark Summit East 2017 - A summary

I attended Spark Summit East 2017 last week. This 2-day conference - February 8th…

Feb 21, 2017

A tour of Databricks Community Edition: a hosted Spark service

With the recent announcement of the Community Edition, it’s time to have a look…

Apr 13, 2016

Testing strategy for Apache Spark jobs - Part 1 of 2

Like any other application, Apache Spark jobs deserve good testing practices and coverage. Indeed, the…

Mar 11, 2016

Applying Data Science with Apache Spark Coding Dojo

This week, at the power plant (Ippon Technologies USA headquarters), we had the pleasure of…

Aug 28, 2015