Using Power Query for data reconciliation across two sources with a common ID

There are many ways to reconcile data, especiially for those with a sql db at their disposal. We had data in two systems that was supposed to be the same but had some discrepancies so I took a stab at sorting through it using power query and landed on a remarkably straightforward approach.

In my case one source was XML and the other salesforce. Power query has some odd behaviors with how it handles some data types in xml files, but i will leave those out and hope XML will disappear and make the world a better and safer place.

This exercise assumes you have two queries in a single power query file - one for each data set, and the column headers are the same in both files. If needed, there are ways to use a mapping file to transform the column names, which others have written about.

There were two key features that make this work

1: Unpivot other columsn

in each query, right click on the unique id column, and then select “unpivot other columns” which will turn your file from

id firstname lastname claim to fame
111 gorav seth salesforce mvp
222 peter churchill salesforce mvp

to

Id Attribute Value
111 firstname gorav
111 lastname seth
111 claim to fame salesforce mvp
222 firstname peter
222 lastname churchill
222 claim to fame salesforce mvp

2: Merge with multiple columns as the key

Now merge the two queries as new, matching on both Id AND Attribute. Prior to this I had not realized you can select more than one column as the key when merging tables in power query. To match on multiple columns, simply ctrl click the columns in order, ie source 1, id, value, source 2, id, value

merge_multi.png

Select an inner join if you only want to compare fields that exist in both sources, or use an outer join will also show fields that only exist in one of the sources.

After merging, expand the source 2 table, and include the value column only.

The output will now be

id Attribute Value Source2.Value
111 firstname gorav gorav
111 lastname seth seth
111 claim to fame salesforce mvp 2026 tdx mvp ping pong champion
222 firstname peter
222 lastname churchill
222 claim to fame salesforce mvp

Add a custom column to compare the value from the two sources in order to just see the rows that are different

if [Value] = [Source2.Value] then 0 else 1

And there you have it. in my example i did a full outer join so i see the delta for rows that had a value in source 1 and no value in source 2, as well as rows that exist in both source 1 and source 2 with different values.

id Attribute Value Source2.Value Has Delta
111 firstname gorav gorav 0
111 lastname seth seth 0
111 claim to fame salesforce mvp 2026 tdx mvp ping pong champion 1
222 firstname peter 1
222 lastname churchill 1
222 claim to fame salesforce mvp 1
 
0
Kudos
 
0
Kudos

Now read this

Creating a unique collection in flow

this is a very short post, that shows one small thing that can be very useful - creating a collection of unique elements in visual workflow there are now quite a few operators on collections in flow, but none of them are ‘remove... Continue →