
Alexis Seigneurin

Washington, DC, USA
Big Data Engineer & Managing Consultant - I work with Spark, Kafka and Cassandra. My preferred language is Scala!

Intro to MapReduce operations with Spark

In the previous post, we used the Map operation, which transforms values with a transformation function. We will now explore the Reduce operation, which produces aggregates. Thus, we will work in MapReduce, just as we do with Hadoop…
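The Map and Reduce operations described above can be sketched with Spark's RDD API. This is a minimal illustration, assuming a local Spark setup; the object name and input values are hypothetical, chosen only for the example:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MapReduceSketch {
  def main(args: Array[String]): Unit = {
    // Run Spark locally, using all available cores
    val conf = new SparkConf().setAppName("map-reduce-intro").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val numbers = sc.parallelize(List(1, 2, 3, 4))

    // Map: apply a transformation function to each value
    val doubled = numbers.map(_ * 2)

    // Reduce: aggregate the values down to a single result
    val sum = doubled.reduce(_ + _)

    println(sum) // 20

    sc.stop()
  }
}
```

The same map/reduce pattern applies to Scala collections, which is a convenient way to reason about a Spark job before distributing it.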

Introduction to Apache Spark

Spark is a tool intended to process large volumes of data in a distributed fashion (cluster computing). Programming in Spark is simpler than in Hadoop, and Spark speeds up execution time by a factor of up to 100. Spark has a lot of momentum (almost as much as Docker)…