
35 posts

Record Linkage, a real use case with Spark ML

I participated in a project for a leading insurance company where I implemented a Record…



Feb 22, 2016 10 min read

Alexis Seigneurin

Apache Spark: MapReduce and RDD manipulations with keys

In a previous article, we saw that Apache Spark allows us to perform aggregations on…



Dec 30, 2014 5 min read

Alexis Seigneurin

Intro to MapReduce operations with Spark

In the previous post, we used the Map operation, which allows us to transform values…



Nov 22, 2014 3 min read

Alexis Seigneurin

Introduction to Apache Spark

Spark is a tool intended to process large volumes of data in a distributed fashion…



Nov 11, 2014 5 min read

Alexis Seigneurin

From development to production with Vagrant and Packer

Have you heard of Vagrant? Vagrant is…



Apr 14, 2014 12 min read

Alexis Seigneurin