Sunday, January 10, 2016

Enabling Foundational Data Needs

In my previous post (Data Needs & Making it Useful), I talked about 4 levels of data needs and how each level serves a particular purpose.  Here, I want to dive deeper into the first level, Data Pulls.

(Assumption) In any area that relies on data to make decisions, you first need data to capture and track.  I am making the assumption that this is already completed.  In today's world, there are petabytes upon petabytes of data.  From transactional data to messaging / conversation history, there is a large amount of data available.  The key is ensuring you are capturing this data and making it available to your analysts.

When making this data available to your analysts, the foundational need is getting to the data.

This need is what I classify as "Data Pulls".  As mentioned before, this is a dump of the raw data.  This could be an Excel file with every single field and row of data.  For example, I could ask for a data pull of sales.  Below is an example of the type of data requested:

| Field | Description |
| --- | --- |
| Sales Date | The date/time of the transaction |
| Customer ID | The id of the customer |
| Customer Name | The name of the customer |
| Transaction Number | The id of the activity |
| Payment Type | The type of payment of the transaction |
| Transaction Type | The type of transaction made |
| Item Order | The position of item in the transaction |
| Item Name | The name of the item |
| Item Number | The id of the item |
| Item Description | The description of the item |
| Item Category | The grouping of similar items |
| Location | The location of the transaction |
| Location Category | The grouping of the location |
| Item Quantity | The number of items in the transaction |
| Base Price | The standard price of an item |
| Discount | Discounts taken off of the item |
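
As a rough illustration of what such a pull might look like in practice, here is a minimal Python sketch that dumps every field and row of a sales table to CSV text that Excel can open. The table name `sales`, the snake_case column names, and the single demo row are all assumptions made up for this example; they are not taken from any real system.

```python
import csv
import io
import sqlite3

# Hypothetical column names, derived from the field list above.
COLUMNS = [
    "sales_date", "customer_id", "customer_name", "transaction_number",
    "payment_type", "transaction_type", "item_order", "item_name",
    "item_number", "item_description", "item_category", "location",
    "location_category", "item_quantity", "base_price", "discount",
]


def build_demo_db():
    """Create an in-memory database holding one made-up sales row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE sales ({', '.join(COLUMNS)})")
    row = (
        "2016-01-10 09:30:00", 1001, "Example Customer", "T-0001",
        "Credit", "Purchase", 1, "Widget", "I-42", "A sample widget",
        "Hardware", "Store 12", "Retail", 2, 9.99, 1.00,
    )
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.execute(f"INSERT INTO sales VALUES ({placeholders})", row)
    return conn


def data_pull(conn):
    """Dump every field and every row of the sales table as CSV text."""
    cursor = conn.execute("SELECT * FROM sales")
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cursor.description])  # header row
    writer.writerows(cursor)  # one CSV line per raw record
    return out.getvalue()


if __name__ == "__main__":
    print(data_pull(build_demo_db()))
```

The point of the sketch is that a data pull does no aggregation or filtering: the requester gets the raw records and does the analysis on their side.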

This data could be pulled in a variety of ways, but you need to know who your audience is and what their capabilities are.  Can they write SQL to pull the data themselves?  What types of tools are they used to working with (Excel, reporting tools, etc.)?  Do you want to keep supplying this data to them manually?  If you can automate it, what happens when they want a new field added?
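
One way to soften the "new field" problem is to let the requester name the fields they want, checked against a list of fields you are willing to expose. The sketch below is only an illustration under assumed names: the `sales` table, the `AVAILABLE_FIELDS` set, and the `pull` function are hypothetical.

```python
import sqlite3

# Hypothetical set of fields exposed to requesters. Supporting a new field
# means adding it here, not writing a new extract.
AVAILABLE_FIELDS = {
    "sales_date", "customer_id", "item_name", "item_quantity",
    "base_price", "discount",
}


def pull(conn, fields):
    """Return the requested fields for every sales row as a list of dicts."""
    if not fields:
        raise ValueError("At least one field must be requested")
    unknown = [f for f in fields if f not in AVAILABLE_FIELDS]
    if unknown:
        raise ValueError(f"Unknown fields requested: {unknown}")
    # Field names are validated against the allow-list above, so they are
    # safe to place directly into the query text.
    query = f"SELECT {', '.join(fields)} FROM sales"
    return [dict(zip(fields, row)) for row in conn.execute(query)]


def build_demo_db():
    """Create an in-memory sales table with two made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (sales_date, customer_id, item_name, "
        "item_quantity, base_price, discount)"
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("2016-01-09", 1001, "Widget", 2, 9.99, 1.00),
            ("2016-01-10", 1002, "Gadget", 1, 25.00, 0.00),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    print(pull(conn, ["sales_date", "item_name", "item_quantity"]))
```

With this shape, a request for an extra field becomes a change to the request rather than a change to the extract, as long as the field is already on the allow-list.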

If you cannot simply provide a table for someone to write a query against, my suggested approach is to provide a "Self Service" tool.  Self service has long been a hot-button topic for many companies.  I look at self service in two ways.  You need to implement both to be successful, as each type of self service tool plays a distinct role.

  1. Wide Coverage & Fixed Granularity
    • This type of self service tool allows you to span multiple subject areas at a fixed level of data.  For example, you might look at inventory, sales, transactions, customers, and more, all at a single-day level.  This allows you to get a wide view and compare metrics across subject areas.
  2. Deep Dive Subject Area
    • This type of self service tool allows you to "go deep" into a particular subject area.  If you are interested in sales, you can drill into very specific details such as transaction types, items, and more.
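
To make the difference between the two types concrete, here is a small Python sketch. The records, field names, and the two functions (`wide_daily_view`, `deep_dive_sales`) are invented for illustration, and amounts are kept in whole cents to avoid rounding noise.

```python
from collections import defaultdict

# Made-up records for two subject areas (sales and inventory).
SALES = [
    {"date": "2016-01-09", "transaction_type": "Purchase",
     "item_name": "Widget", "quantity": 2, "amount_cents": 1998},
    {"date": "2016-01-09", "transaction_type": "Return",
     "item_name": "Widget", "quantity": 1, "amount_cents": -999},
    {"date": "2016-01-10", "transaction_type": "Purchase",
     "item_name": "Gadget", "quantity": 1, "amount_cents": 2500},
]
INVENTORY = [
    {"date": "2016-01-09", "item_name": "Widget", "on_hand": 40},
    {"date": "2016-01-09", "item_name": "Gadget", "on_hand": 15},
    {"date": "2016-01-10", "item_name": "Widget", "on_hand": 39},
    {"date": "2016-01-10", "item_name": "Gadget", "on_hand": 14},
]


def wide_daily_view(sales, inventory):
    """Wide coverage, fixed granularity: one row per day across subject areas."""
    daily = defaultdict(
        lambda: {"sales_cents": 0, "transactions": 0, "units_on_hand": 0}
    )
    for s in sales:
        daily[s["date"]]["sales_cents"] += s["amount_cents"]
        daily[s["date"]]["transactions"] += 1
    for i in inventory:
        daily[i["date"]]["units_on_hand"] += i["on_hand"]
    return dict(sorted(daily.items()))


def deep_dive_sales(sales, **filters):
    """Deep dive: keep full sales detail, narrowed by any field values given."""
    return [s for s in sales if all(s.get(k) == v for k, v in filters.items())]


if __name__ == "__main__":
    for day, metrics in wide_daily_view(SALES, INVENTORY).items():
        print(day, metrics)
    print(deep_dive_sales(SALES, transaction_type="Return"))
```

The wide view trades detail for comparability: every subject area is rolled up to the same day level. The deep dive keeps every field of one subject area and lets the user narrow it down.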

With both of these types of "Data Pull" tools implemented, your customers can fill the base need of "getting to the data" while keeping the flexibility to adjust their data requests.  Having self service tools like these in place also frees up the teams responsible for the data to move on to the next level of data needs, "Products".


  1. Nice post, Rob.

    I agree that self-service tools are critical for organizations looking to provide regular data consumers, not just analysts, with the flexibility and autonomy to get the right data for themselves in a timely manner--without having to rely on others (often a slow process). Company culture has a lot to do with this decision, in my opinion.

    Looking at patterns and volumes amongst common data requests helps to drive development--if the information is tracked. If you know what fields, subjects, etc. are being queried more than others, you can better understand customer demand. Data to drive data solutions.

    --Dale Kube

    1. Completely agree. This is about enablement. Self service is one way to do this. A lot has to do with culture and skill sets in the org, but the bottom line is that getting people access to the data is step one.