Bulk Loading into MDS using SSIS

Each entity in SQL Server 2012 Master Data Services (MDS) has its own staging table (stg.<name>_Leaf). Using this staging table, you can create, update, deactivate and delete leaf members in bulk. This post describes how to bulk load into an entity staging table and trigger the stored procedure that starts the batch import process.

Staging Tables and Stored Procedures

The new entity-based staging tables are an excellent feature in MDS 2012, and make it very easy to bulk load into MDS from SSIS. If you take a look at the SQL database used by your MDS instance, you’ll see at least one table in the stg schema for each entity. For this example I’ve created a Suppliers entity, and I see a matching table called [stg].[Suppliers_Leaf]. If your entity is using hierarchies, you will have three staging tables (see BOL for details). If we expand the columns, we’ll see that each attribute has its own column, alongside some system columns that every staging table has.


Each staging table will also have a stored procedure that is used to tell MDS that new data is ready to load. Details of the arguments can be found in BOL.


Import Columns

To load into this table from SSIS, our data flow will need to do the following:

  • Set a value for ImportType (see below)
  • Set a value for BatchTag
  • Map the column values in the data flow to the appropriate attribute columns

See the Leaf Member Staging Table BOL entry for details on the remaining system columns. If your Code value isn’t set to be generated automatically, then you’d also need to specify it in your data flow. Otherwise, the default fields can be safely ignored when we’re bulk importing.

The BatchTag column is used as an identifier in the UI – it can be any string value, as long as it’s unique (and under 50 characters).
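One way to get a tag that is both unique and short enough is to combine a readable prefix with a timestamp and a random suffix. The following is a minimal Python sketch of that idea; the function name, the prefix, and the tag layout are illustrative assumptions, not anything MDS requires. In SSIS itself you would typically build the same kind of value with an expression.

```python
import uuid
from datetime import datetime, timezone

# The post asks for a tag "under 50 characters", so cap the length at 49.
MAX_BATCH_TAG_LENGTH = 49


def make_batch_tag(prefix="Suppliers"):
    """Return a tag shaped like '<prefix>_<UTC timestamp>_<8 hex chars>'.

    The layout is a hypothetical convention chosen for this sketch.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S")  # 15 chars
    suffix = uuid.uuid4().hex[:8]  # random part keeps tags from colliding
    # Reserve room for the stamp, the suffix and the two underscores.
    room_for_prefix = MAX_BATCH_TAG_LENGTH - (len(stamp) + len(suffix) + 2)
    return f"{prefix[:room_for_prefix]}_{stamp}_{suffix}"
```

A long prefix is truncated rather than rejected, so the result always fits the length limit.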

MDS uses the same staging table for creating, updating and deleting members. The ImportType column indicates which action you want to perform. The possible values are listed in the table below.

 


Value | Description
0 | Create new members. Replace existing MDS data with staged data, but only if the staged data is not NULL. NULL values are ignored. To change a string attribute value to NULL, set it to ~NULL~. To change a number attribute value to NULL, set it to -98765432101234567890. To change a datetime attribute value to NULL, set it to 5555-11-22T12:34:56.
1 | Create new members only. Any updates to existing MDS data fail.
2 | Create new members. Replace existing MDS data with staged data. If you import NULL values, they will overwrite existing MDS values.
3 | Deactivate the member, based on the Code value. All attributes, hierarchy and collection memberships, and transactions are maintained but no longer available in the UI. If the member is used as a domain-based attribute value of another member, the deactivation will fail. See ImportType 5 for an alternative.
4 | Permanently delete the member, based on the Code value. All attributes, hierarchy and collection memberships, and transactions are permanently deleted. If the member is used as a domain-based attribute value of another member, the deletion will fail. See ImportType 6 for an alternative.
5 | Deactivate the member, based on the Code value. All attributes, hierarchy and collection memberships, and transactions are maintained but no longer available in the UI. If the member is used as a domain-based attribute value of other members, the related values will be set to NULL. ImportType 5 is for leaf members only.
6 | Permanently delete the member, based on the Code value. All attributes, hierarchy and collection memberships, and transactions are permanently deleted. If the member is used as a domain-based attribute value of other members, the related values will be set to NULL. ImportType 6 is for leaf members only.

When you are bulk loading data into MDS, you’ll use 0, 1 or 2 as the ImportType. To summarize the different modes:

  • Use 0 or 2 when you are adding new members and/or updating existing ones (i.e. doing a merge)
    • The difference between 0 and 2 is the way they handle NULLs when updating an existing member. With 0, NULL values are ignored (and require special handling if you actually want to set a NULL value). With 2, all values are replaced, even when the values are NULL.
  • Use 1 when you are only inserting new members. If you are specifying a code, then a duplicate value will cause the import to fail.
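To make the difference between the three modes concrete, here is a small Python model of the behaviour described above. It is only a sketch of the documented rules, not how MDS implements them: members are plain dictionaries, only the string NULL marker is handled, and the function name and the sample attributes (Name, City) are hypothetical.

```python
STRING_NULL_MARKER = "~NULL~"  # explicit "set to NULL" marker for strings (ImportType 0)


def apply_staged_row(existing, staged, import_type):
    """Return the member as it would look after staging with ImportType 0, 1 or 2.

    existing: dict of current attribute values, or None if the member is new.
    staged:   dict of staged attribute values (None stands for NULL).
    """
    if import_type not in (0, 1, 2):
        raise ValueError("this sketch only models ImportType 0, 1 and 2")

    if existing is None:
        # All three modes create members that do not exist yet.
        # (Marker handling on create is not modelled here.)
        return dict(staged)

    if import_type == 1:
        # Insert-only mode: touching an existing member is an error.
        raise ValueError("ImportType 1 cannot update an existing member")

    merged = dict(existing)
    if import_type == 2:
        # Every staged value wins, including NULLs.
        merged.update(staged)
        return merged

    # ImportType 0: NULLs are ignored unless the explicit marker is used.
    for name, value in staged.items():
        if value is None:
            continue
        merged[name] = None if value == STRING_NULL_MARKER else value
    return merged
```

Staging a NULL City for an existing supplier leaves the current City untouched under ImportType 0 and clears it under ImportType 2. Staging ~NULL~ under ImportType 0 clears it as well.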

Package Design

Your control flow will have at least two tasks:

  1. A Data Flow Task that loads your incoming data into the MDS staging table for your entity
  2. An Execute SQL Task that runs the staging table’s stored procedure, which tells MDS to start processing the batch


Your data flow will have (at least) three steps:

  1. Read the values you want to load into MDS
  2. Add the BatchTag and ImportType column values (using a derived column transform)
  3. Load into the MDS staging table


As noted above, in your OLE DB Destination you’ll need to map your data flow columns to your member attributes (including Code if it’s not auto-generated), the BatchTag value (which can be automatically generated via expression), and the ImportType.
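Outside the SSIS designer, the same mapping can be written down as a parameterized INSERT plus one parameter tuple per source row. The Python sketch below only builds the statement and the parameters and does not connect to a database. The helper names are hypothetical, the `?` placeholders assume an ODBC-style driver, and the entity and column names are interpolated into the SQL text, so they should come from trusted configuration rather than user input.

```python
def build_staging_insert(entity, member_columns):
    """Return a parameterized INSERT for [stg].[<entity>_Leaf].

    member_columns lists the attribute columns being loaded, for example
    ["Code", "Name"]; ImportType and BatchTag are always added in front.
    """
    columns = ["ImportType", "BatchTag", *member_columns]
    column_list = ", ".join(f"[{c}]" for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    return (
        f"INSERT INTO [stg].[{entity}_Leaf] ({column_list}) "
        f"VALUES ({placeholders})"
    )


def to_staging_params(rows, member_columns, import_type, batch_tag):
    """Prepend ImportType and BatchTag to each source row.

    This plays the role of the derived column step: every row in the
    batch gets the same two values. Missing attributes become None (NULL).
    """
    return [
        (import_type, batch_tag, *(row.get(col) for col in member_columns))
        for row in rows
    ]
```

Every row in a batch carries the same BatchTag, and that is the value the stored procedure is later given.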


After the Data Flow, you’ll run the staging table stored procedure.

The first three parameters are required:

  1. The version name (e.g. VERSION_1)
  2. Whether this operation should be logged as an MDS transaction (i.e. do you want to record the change history, and make the change reversible?)
  3. The BatchTag value that you specified in your data flow
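As a rough illustration, the call for the Suppliers entity can be assembled as shown below. This sketch assumes the MDS 2012 naming convention stg.udp_<name>_Leaf for the procedure and the parameter names @VersionName, @LogFlag and @BatchTag; check both against your own MDS database before relying on them. The Python helper itself is hypothetical and only builds the statement text and its parameter values.

```python
def build_start_batch_call(entity, version_name, log_flag, batch_tag):
    """Return (sql, params) for the staging procedure of one entity.

    Assumes the procedure is named [stg].[udp_<entity>_Leaf]. The entity
    name is interpolated into the SQL text, so it must be a trusted value.
    """
    sql = (
        f"EXEC [stg].[udp_{entity}_Leaf] "
        "@VersionName = ?, @LogFlag = ?, @BatchTag = ?"
    )
    # LogFlag is passed as 1 (log transactions) or 0 (do not log).
    params = (version_name, 1 if log_flag else 0, batch_tag)
    return sql, params
```

In an Execute SQL Task you would put a statement of this shape in the SQLStatement property and map the three `?` markers to package variables, using the same BatchTag variable that feeds the data flow.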

 

Additional resources:

EIM presentation material from DevTeach Montreal

I presented an Enterprise Information Management talk earlier this week at the DevTeach Montreal conference. Unlike my previous talk from TechEd North America, which tackled the problem from the Data Curation (DQS/MDS) side, this talk focuses on using SSIS to integrate and automate your solution. The demo files are now available from my SkyDrive share, and the slides are embedded below.

SQL Server 2012 Case Studies for DQS

I’ve had a lot of people ask me recently for real-life examples of how customers are using Data Quality Services (DQS). Even though SQL Server 2012 has been out less than a month, we already have a number of case studies published which describe how DQS plays a key role within a customer’s infrastructure. Most of the studies involve end-to-end Enterprise Information Management (EIM) solutions which include SSIS and Master Data Services (MDS) as well.

Here are the five DQS case studies that are currently available on Microsoft.com:

  • Areva – Energy Firm Speeds the Delivery of Reliable, Centralized Master Data to Customers
  • China Guangdong Nuclear Power Holding Corporation – Chinese Energy Utility Builds BI Solution to Improve Information Sharing and Efficiency
  • Super 8 Hotels Co., Ltd. – Hotel Chain Uses Business Intelligence Tools to Guide Rapid Growth Across China
  • Great Western Bank – Fast-Growing Bank Gains Customers and Maximizes Profits with Microsoft BI Tools
  • RealtyTrac – Real Estate Website Helps Customers Make Better Decisions with Higher Quality Data

Getting Started with DQS and MDS

If you’re looking to get started with Data Quality Services (DQS) and Master Data Services (MDS), there are some fantastic resources available on TechNet. The site includes videos and slides for full-day training sessions on both products.

Data Quality Services for SQL Server 2012

  • Data Quality Basics and Introducing DQS: Video | Slides
  • Knowledge Management and Data Cleansing in DQS: Video | Slides
  • Data Matching in DQS: Video | Slides
  • DQS Integration with SSIS: Data Cleansing using SSIS: Video | Slides
  • DQS Integration with MDS: Data Matching using MDS: Video | Slides

Master Data Services for SQL Server 2012

  • Master Data Services Overview: Video | Slides
  • Managing Data Warehousing Dimensions with MDS, Part 1: Video | Slides
  • Managing Data Warehousing Dimensions with MDS, Part 2: Video
  • Data Loading via Entity Based Staging (EBS): Video | Slides
  • MDS Hierarchies and Collections: Video | Slides
  • Business Rules and Workflow in MDS: Video | Slides
  • MDS Model Migration and Upgrade: Video | Slides
  • Security Features and Guidelines in MDS: Video | Slides
  • Eliminate Duplicate Data with the MDS Add-In for Excel: Video | Slides