In this article I want to show you a few lines of Python code that can save you a lot of time when comparing CSV and other files with each other.

Almost every data migration project requires comparing the actual outcome of the migration with the expected outcome. While there is a plethora of dedicated software tools (e.g. Redgate’s SQL Compare), a DIY approach can take you quite far.

Let us assume that we have two datasets in CSV format:

Dataset 1 (“expected”):

A;B;C;D;Key
1;A;1;J;1
1;A;2;J;2
1;A;3;J;3
1;A;4;J;4
1;A;5;J;5
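
As a rough illustration of the DIY idea, the sketch below reads Dataset 1 with Python's standard `csv` module, indexes the rows by the `Key` column, and compares them with a second dataset. The helper names `load_by_key` and `compare` are my own, and the second dataset in the snippet is invented purely for demonstration; it is not the "actual" dataset of this article.

```python
import csv
import io

# Dataset 1 ("expected"), copied from the article.
EXPECTED_CSV = """A;B;C;D;Key
1;A;1;J;1
1;A;2;J;2
1;A;3;J;3
1;A;4;J;4
1;A;5;J;5
"""

# Hypothetical second dataset, made up for this sketch only:
# row 2 has a changed value in C, row 4 is missing, row 6 is extra.
HYPOTHETICAL_ACTUAL_CSV = """A;B;C;D;Key
1;A;1;J;1
1;A;9;J;2
1;A;3;J;3
1;A;5;J;5
1;A;6;J;6
"""


def load_by_key(text, key="Key", delimiter=";"):
    """Parse delimited text into a dict mapping the key column to the row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return {row[key]: row for row in reader}


def compare(expected, actual):
    """Return missing keys, unexpected keys and cell-level differences."""
    missing = sorted(expected.keys() - actual.keys())
    unexpected = sorted(actual.keys() - expected.keys())
    changed = {}
    for k in sorted(expected.keys() & actual.keys()):
        # Collect (expected, actual) pairs for every column that differs.
        diffs = {
            col: (expected[k][col], actual[k].get(col))
            for col in expected[k]
            if expected[k][col] != actual[k].get(col)
        }
        if diffs:
            changed[k] = diffs
    return missing, unexpected, changed


expected = load_by_key(EXPECTED_CSV)
actual = load_by_key(HYPOTHETICAL_ACTUAL_CSV)
missing, unexpected, changed = compare(expected, actual)

print("Missing in actual:", missing)
print("Only in actual:", unexpected)
print("Changed cells:", changed)
```

All values are compared as strings here, so `1` and `1.0` would count as different. For real files, `io.StringIO(text)` would be replaced by an opened file object. The sketch also assumes the key column is unique within each file.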

Dataset 2 (“actual”), that has been…

Uwe Ziegenhagen

From Cologne, Germany, working in the financial industry as a specialist for data-related topics. Hobbies include LaTeX typesetting, electronics and programming.
