The Importance of Envisioning the End State in Data Exploration

By: Rigo Santillano

Tell me if this sounds familiar: You’re presented with a project scope that suggests the turnaround time should be relatively quick. Questions arise as you roll up your sleeves and get into the trenches of the work. Hurdles pop up. Soon, the project that should have taken a few days has stretched into several weeks, with many twists and turns along the way.

This scenario often happens in the world of data exploration and management. That’s because data is complex. Wrangling big data sets consistently and accurately requires teams to collaborate to clarify the scope of work upfront. Getting that clarity can often be difficult as you pull various departments together to form a project team. More depth to the scope can save hours of back-and-forth and keep the project moving in a straight line rather than a loopy track.

Keeping teams consistently moving in the same direction toward an end goal isn’t always as easy as it looks on paper. If you’ve worked on a project, perhaps you’ve experienced this hurdle too. In my role, I’ve found ways to help get teams back on track by envisioning the end state when tackling any data exploration project. These steps may help save you time too.

Clarify What You Need Up Front

When a project begins, teams must clarify a few things up front:

  • Who is the target audience?
  • Who needs access to the reporting at the end of the project?
  • What data objects exist in the current environment?
  • Do the data objects have different names or importance in downstream systems? 
  • How will the data be maintained after the project?

These details allow the data team to identify which objects already meet the need and which data needs to be enriched. They also let the data team position the insights in a way that answers and honors the needs of the person on the other side of the screen. For example, many teams want to see how a particular data set changes or is transformed and then decide how it correlates with other sources, documents, or information. In these cases, the data team needs to be able to shape the reporting so that it visualizes those correlations.

To understand what that reporting should look like, the data team must start with the outcome in mind. Having that outcome in mind allows teams to justify what gets loaded into the production environment and map the design to show specific elements.

It’s All About the Methodology

Starting with the end state in mind allows teams to take big data sets and backward map the methodology for bringing those data points together. In doing so, teams can ensure the data collected and modeled conveys what it needs to convey. 

The first step is to identify the sources of truth used in the project. Once those have been identified, teams can establish a set of rules and requirements to qualify the data points before moving them to the next round of analysis. In the final round, the data goes through the same rules again. Ultimately, this approach can be standardized and scaled across multiple projects or teams.

This methodology makes the if/then conditions explicit to the whole team. If something falls outside those rules, the team can quickly spot the anomaly and pinpoint what went wrong.
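As a rough illustration of what such qualification rules might look like in practice, here is a minimal Python sketch. The field names (`customer_id`, `amount`), the two rules, and the helpers `check_record` and `qualify` are all hypothetical, invented for this example; a real project would derive its rules from its own sources of truth.

```python
from typing import Callable

Record = dict
Rule = Callable[[Record], bool]

# Hypothetical rules: each name maps to an if/then check on one record.
RULES: dict[str, Rule] = {
    "has_customer_id": lambda r: bool(r.get("customer_id")),
    "amount_is_non_negative": lambda r: (
        isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0
    ),
}


def check_record(record: Record, rules: dict[str, Rule] = RULES) -> list[str]:
    """Return the names of the rules the record fails (empty list = qualifies)."""
    return [name for name, rule in rules.items() if not rule(record)]


def qualify(records: list[Record]):
    """Split records into those that pass every rule and those that are flagged."""
    passed, flagged = [], []
    for record in records:
        failures = check_record(record)
        if failures:
            flagged.append((record, failures))
        else:
            passed.append(record)
    return passed, flagged


# Made-up sample data for the illustration.
records = [
    {"customer_id": "C-001", "amount": 120.0},
    {"customer_id": "", "amount": 35.5},
    {"customer_id": "C-003", "amount": -10},
]

passed, flagged = qualify(records)
for record, failures in flagged:
    print(record, "failed:", failures)
```

Because every rule has a name, a flagged record points directly to the condition it broke, and the same `qualify` call can be reused in each round of analysis.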

Managing Deviations in Data Is Easier When Everyone’s on the Same Page

By starting with the methodology, teams get clearer on how to spot deviations when they arise. Rather than combing through the whole data set to find what went wrong, teams can quickly discover why an anomaly exists by asking questions such as:

  • Is it wrong at the source? 
  • Is there a condition that hasn’t been identified? 

If something goes wrong, the team knows where to look. If something deviates, the team can document that deviation and uncover how or where things went awry. The data itself may be wrong, or the business unit may have had something extra that wasn’t accounted for.
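One way a team might document deviations is with a small, shared log. The sketch below is only an assumption about how that could look: the `Deviation` fields, the two `Cause` categories (taken from the two questions above), and the sample entries are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Cause(Enum):
    # The two categories mirror the two diagnostic questions.
    WRONG_AT_SOURCE = "wrong at the source"
    UNIDENTIFIED_CONDITION = "condition not yet identified"


@dataclass
class Deviation:
    record_id: str  # hypothetical identifier of the affected record
    rule: str       # name of the rule the record fell outside of
    cause: Cause
    note: str       # short free-text explanation for the team


def summarize(deviations: list[Deviation]) -> Counter:
    """Count logged deviations per cause."""
    return Counter(d.cause for d in deviations)


# Made-up log entries for the illustration.
log = [
    Deviation("R-102", "has_customer_id", Cause.WRONG_AT_SOURCE,
              "ID is blank in the source extract"),
    Deviation("R-245", "amount_is_non_negative", Cause.UNIDENTIFIED_CONDITION,
              "Business unit records refunds as negative amounts"),
]

summary = summarize(log)
for cause, count in summary.items():
    print(f"{cause.value}: {count}")
```

A cluster of entries under the second category is a hint that the rules, rather than the data, need updating.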

By getting teams out of silo-based operations, everyone can access the big picture and verify that the data needed will be reported, allowing for a more comprehensive and complete plan. 

Standardization of the Process

Does this sound like a lot of work? That’s because it is, but don’t let that scare you. Once there’s a process in place, it can often be standardized across projects and teams. The way we approach and collect data and the way we analyze the data can remain consistent across various business units. That consistency leads to scalability, which allows teams to move faster, cuts out excess documentation, and gives data teams control of the reporting.

Don’t let the day-to-day minutiae or the patterns you’re used to running blind you to the other steps you should have in place. Standardizing this process makes your team’s decision-making and project implementation easier and more efficient in the long term.