Total 51 Posts

Data

Boost the Performance of Your Databricks Jobs and Queries

Databricks is doing a lot of optimization and caching by default to have jobs and…

Mar 10, 2023 4 min read

Theo LEBRUN

Data

What if your data is too clean? Let's dirty some data!

I know what you are thinking: when is data too clean? And who wants dirty…

Jan 19, 2023 4 min read

Sarah Narum

Data

Coalesce 2022 - The Analytics Engineering Conference hosted by dbt Labs (Recap)

Coalesce 2022 is dbt Labs' analytics engineering conference. For its third year, from October 17th…

Jan 18, 2023 7 min read

Pooja Krishnan

Data

Data+AI Summit 2022 - Top Announcements and Recap

Data+AI Summit 2022 [https://databricks.com/dataaisummit/] is the world’s largest gathering among…

Jul 07, 2022 3 min read

Theo LEBRUN

Data

A Primer on Snowflake Stored Procedures

Snowflake is a data warehouse-as-a-service hosted completely in the cloud. For a Snowflake Primer, take…

Jul 05, 2022 14 min read

Pooja Krishnan

Cloud

Data Basics for Life-Long Software Engineers

Having recently made the switch from software to data engineering, I learned there are many…

Jan 18, 2022 3 min read

Hector Sanchez

Azure

Event-Driven Architecture: Getting Started with Kafka (Part 2)

An event-driven architecture is a paradigm that has become increasingly used in modern microservices-based architectures. It promises a more flexible and responsive architecture to business events, while offering better technical decoupling. Let's see how we can build it with Kafka.…

Nov 02, 2021 8 min read

Jean-François SIMON

Event-Driven

Event-Driven Architecture: Getting Started with Kafka (Part 1)

An event-driven architecture is a paradigm that has become increasingly used in modern microservices-based architectures. It promises a more flexible and responsive architecture to business events, while offering better technical decoupling. Let's see how we can build it with Kafka.…

Oct 26, 2021 7 min read

Jean-François SIMON

Event-Driven

A Beginner’s Guide to InfluxDB: A Time-Series Database

A time series database (TSDB) is specifically made for data that can be evaluated as…

Jun 29, 2021 4 min read

Ketki V Deshpande

Data

Data Hackathon Recap

Is the Holiday Spirit Contagious? During Ippon's first Data Hackathon in December 2020, the Data…

Feb 12, 2021 3 min read

Ramya Shetty

Data

Process CSVs from Amazon S3 using Apache Flink, JHipster, and Kubernetes

Apache Flink [https://flink.apache.org/] is one of the latest distributed Big Data frameworks…

Feb 04, 2021 6 min read

Theo LEBRUN

Data Streaming

Use Stargate by DataStax to effortlessly store and query your data

Stargate [https://stargate.io/] is one of the latest shiny tools from DataStax [https://www.…

Jan 15, 2021 5 min read

Theo LEBRUN

Cassandra

Tips and Tricks for Manually Scaling a Global DynamoDB Table from an AWS Lambda

Objective: Write an AWS Lambda that manually scales a global DynamoDB table. Why? DynamoDB tables…

Dec 01, 2020 3 min read

Dennis Sharpe

AWS

ABC's of DQM: Control

This is the finale of a 3-part series introducing a Data Quality Management (DQM) framework…

Aug 31, 2020 8 min read

Dan Ferguson

Data

The ABCs of DQM: Balance

This blog is a part of a series of posts on Data Quality Management. The…

Aug 18, 2020 5 min read

Pooja Krishnan

Data