Parsing Binary data using Power Query formulas

The Power Query formula language (M) contains a number of library functions that allow you to parse binary data. You can use these functions to build queries which parse custom binary file formats. This post contains two samples – a simple query which parses X & Y coordinate values from a binary field, and a more advanced query which reads PNG file headers.

Parsing binary data

The following query defines some binary data (Source), a record format (PointFormat), and a parsing format definition (FileFormat).

Let’s break this down.

The #binary function lets you pass in a list of byte values. Here we have specified a total of 12 bytes.

PointFormat defines a record format with two BinaryFormat.SignedInteger16 fields (2 bytes each), x and y, for a total of 4 bytes.

The first argument to the BinaryFormat.Choice function indicates that we should read the first 4 bytes (the size of an BinaryFormat.UnsignedInteger32). This value is read and passed in as the count parameter to the BinaryFormat.List function. BinaryFormat.List ends up reading the remaining 8 bytes of the binary (count * sizeof(PointFormat)), and outputs a list of records.

let
Source = #binary(
{0x00, 0x00, 0x00, 0x02,
0x00, 0x03, 0x00, 0x04,
0x00, 0x05, 0x00, 0x06}),

PointFormat = BinaryFormat.Record([
x = BinaryFormat.SignedInteger16,
y = BinaryFormat.SignedInteger16]),

FileFormat = BinaryFormat.Choice(BinaryFormat.UnsignedInteger32,
(count) => BinaryFormat.List(PointFormat, count))
in
FileFormat(Source)

If we put this M code into Power Query, convert the list of records to a table, and then expand, we get something like this:

image

Query

Parsing PNG Headers

This sample is provided as an exercise to the reader to understand. To test it out, replace the path value in the call to File.Contents.

Leave a Reply

Your email address will not be published. Required fields are marked *