Dynamic Lookups With Data Explorer

In my previous post, I used Data Explorer to grab the list of upcoming events from the SQL Saturday website. Excel has had the ability to grab data from HTML pages for awhile, but using Data Explorer provided a richer set of options, and allowed me to apply a series of transforms to get the data exactly how I wanted it.

Using this events data, I want to do two things:

  • Enrich the data with some more details about the event venue’s location
  • Plot the events on a map/chart so I can see which ones are close to me

This post will cover the first step, and let’s me show off some of the more advanced Data Explorer functionality.

Address Lookups

There are a number of web services that allow you to validate an address, or fetch additional details (such as latitude and longitude). I just happen to have a subscription to Melissa Data’s Address Check service, provided through the Windows Azure Marketplace. While the service was designed to work with Data Quality Services, it can also be invoked programmatically, and returns a whole bunch of additional information for the address you provide. It will even return suggestions when an exact match for the address isn’t found.

image

I previously covered how to retrieve data from the Windows Azure Marketplace, but unfortunately that method won’t help us here because we’re not dealing with a pre-defined set of data (I’m not sure how long it would take to pull in all possible US and Canada address combinations to Excel, but it’s probably not something I’d want to try on my laptop). The Address Check service I want to use is an OData Service (in fact, all data market offerings have OData interfaces), and it exposes a single function with three parameters

  • Address
  • MaximumSuggestions
  • MinimumConfidence

I found all of the details (including the service URL) on the service description page in the marketplace.

Dynamic Lookups

I’ll use the list of events I retrieved in my previous post as a starting point. To keep the original query intact, I’ll Duplicate my query using the option on the QUERY tab.

 image

Next, I click on the Filter & Shape button to bring up my query.

image

I want to invoke the address lookup function on my OData Service for each row of my data set, and pass in the value of the Location field, but couldn’t find an option that would allow this on the query editor UI. Instead, we can turn to the Data Explorer formula language, and start adding some custom transformations.

First, we need to add a new transformation step by clicking the formula button.

image

By default, the step we just created displays the same data as the previous step. We can change this by modifying the value in the formula bar. Note, at the moment there is no intellisense or syntax highlighting for the formulas, but I’m hoping we’ll see this in a future update. The formula to invoke the external OData Service took some trial and error, but I eventually got it working.

First we define the lookup as an auxiliary query. We’ll name the step Lookup to make it easy to reference.

= OData.Feed(“https://api.datamarket.azure.com/MelissaData/AddressCheck/v1/”){0}[Data]

This formula is accessing the OData Service at the given URL, and returning the data ([Data]) at the first member ({0}), which in this case is the Melissa Data SuggestAddress function that we’re looking for.

image

When the function is returned, Data Explorer provides a UI that shows us the arguments, and indicates that the result will come back in a table format. Pressing the Invoke button lets us enter static values (shown below for demonstration purposes), but as we’re interested in dynamic lookups using the data from our existing query, we’ll skip this for now.

image

We then add a second step which invokes the Lookup function we’ve defined for each row in our data set. We can click the formula button again to create the new step, and use this value:

= Table.AddColumn(RenamedColumns, “Address”, each Lookup([Location], 1, 0))

Let’s break down what this formula is doing:

  • Add a new column to the result of RenamedColumns step
  • Name the column “Address”
  • For each row in the data set, call the function defined by the Lookup step
  • When calling the function, pass in the [Location] field (with two hard coded values – 1, 0 – for the remaining arguments)

After entering the formula, we get a notification that credentials are required to access this data source.

image

Clicking Edit credentials brings up the Data Explorer credential manager. The Windows Azure Marketplace uses the concept of a Feed Key (similar to an “API key” used by many web/REST based APIs). Clicking the link on the page brings you directly to the Marketplace page which displays your key – you can copy & paste it from there. (Be careful with the copy & paste! On my first couple of tries, I included extra whitespace before and after the key, which ended up giving me a permission error when I tried to connect)

image

After entering the key, we can Save and go back to our query. We immediate see another error:

image

Formula.Firewall: Quer
y ‘SqlSaturdayEvents’ is accessing data sources which can not be used together.

 

By default, it appears that Data Explorer doesn’t allow you to pass data between services, as it could result in a “leak”. For example, in this case I’m sending my original Location data (coming from the SQL Saturday website) to the OData Service (run by Melissa Data). Thankfully, there is a way to allow this behavior using functionality called “Fast Combine”.

We can click Done to save the query we have at this point (even though it’s in an error state), and go back to Excel. On the DATA EXPLORER tab, we can see the Fast Combine option.

image

Clicking it brings up a warning dialog that enabling Fast Combine may improve query performance, but may introduce privacy issues.

image

Click Enable, go back to Filter & Shape, and now the data is displayed.

image

The results of the OData Service come back as a table, which we can expand inline by clicking the small icon next to the column title.

image

After selecting the columns we want, we can see the new values in our data set.

image

The Melissa Data service we’re using is for US and Canadian addresses, so we can see that the non-US locations aren’t getting Latitude and Longitude results. Addresses that aren’t matched should come back with a confidence level of 0, so we can use this to filter out these unmatched locations from our list. Our filter will only include rows that have a Confidence value greater than 0.

image

image

The remaining results look pretty good. We can click Done to save the query and get the results into Excel.

image

Summary

Wow – I’m impressed with Data Explorer so far. I really like the idea of mashing up data sources like this, and browsing through the formula language gave me an idea of what’s possible.

In my next post, I’ll take a look at importing the data into a PowerPivot, and then doing some visualizing with Power View.

For reference, here are the queries I used at each step:

 

Source = Table.TransformColumns(Data0,{},Value.FromText)
HiddenColumns = Table.RemoveColumns(ChangedType,{“Column1”, “Column3”})
SplitColumnDelimiter = Table.SplitColumn(HiddenColumns,”Column4″,Splitter.SplitTextByEachDelimiter({“-“}, null, false),{“Column4.1”, “Column4.2”})
SplitColumnPos = Table.SplitColumn(SplitColumnDelimiter,”Column4.1″,Splitter.SplitTextByPositions({0, 13}, false),{“Column4.1.1”, “Column4.1.2”})
HiddenColumns1 = Table.RemoveColumns(SplitColumnPos,{“Column4.1.1”})
RenamedColumns

= Table.RenameColumns(HiddenColumns1,{{“Column2”, “Date”}, {“Column4.1.2”, “Number”}, {“Column4.2”, “Name”}, {“Column5”, “Location”}})

Lookup = OData.Feed(“https://api.datamarket.azure.com/MelissaData/AddressCheck/v1/”){0}[Data]
AddressLookup

= Table.AddColumn(RenamedColumns, “Address”, each Lookup([Location], 1, 0))

Expand Address

= Table.ExpandTableColumn(AddressLookup, “Address”, {“Confidence”, “Latitude”, “Longitude”, “City”, “State”, “Country”}, {“Address.Confidence”, “Address.Latitude”, “Address.Longitude”, “Address.City”, “Address.State”, “Address.Country”})

FilteredRows

= Table.SelectRows(#”Expand Address”, each [Address.Confidence] > 0)

5 thoughts on “Dynamic Lookups With Data Explorer

  1. Hello,
    when i used the lookup function, i have the following error message :
    Expression.Error: The name 'Lookup' was not recognized. Is it spelled correctly?

    this is the expression i wrote : = Table.AddColumn(FirstRowAsHeader, "Ind_ID", each Lookup([Dimension Indicateur], 1, 0))

    any idea ?

    Olivier

  2. Matt – trying to set up some 'binning script' in power query – could you point me to someone that might be able to help
    — my options are:
    i) 'bin' it in excel using the lookup array feature (ie =LOOKUP(A3,{0,3,5,7},{"0-3","3-5","5-7","Spares"}) and simply uplift it into PQ,
    ii) use a third party 'binning tool' (ie predixion workbench – this works really well – but I'd prefer to highlight this functionality in PQ.
    — have searched the web (ie Chris Webb's BI blog& yours ) and looked the PQ formula spec – still unsure how to proceed.

Comments are closed.