Foundations of Data Strategy

Learn the steps to roll out an effective Data Management initiative.

Author: Nicole Janeway | Post Date: Dec 8, 2020 | Last Update: Aug 1, 2024

This writeup presents the basics of Data Strategy, which outlines how data contributes to the vision, focus areas, guiding principles, essential capabilities, and goals of an organization. A strong Data Strategy positions an organization to capitalize on its strategic assets without jeopardizing its most important relationships with customers, employees, and partners.

Introduction

Data Strategy helps an organization mitigate risk and protect its stakeholders, including customers, suppliers, and employees. A strong Data Strategy allows an organization to use data to drive decisions. Data provides a common language across business functions, allowing the organization to support existing capabilities and try new things.

On the other hand,

"The very existence of an organization can be threatened by poor quality data."
Joe Peppard, Principal Research Scientist at MIT Sloan School of Management

Aiken Pyramid courtesy of DMBOK

As discussed in the Data Management Body of Knowledge (DMBOK), the Aiken Pyramid outlines core Data Strategy concepts and illustrates how these functions build upon each other. The foundational layers of the pyramid are Data Governance, Data Quality, and Data Architecture. These areas are essential to the success of any data-related initiative. They provide the necessary infrastructure to support the other Data Management capabilities, including Data Modeling, Master & Reference Data Management, and Data Warehousing & Business Intelligence.

Data Governance

Percentage of the CDMP Fundamentals Exam: 11%

It's important to begin a Data Management initiative with a conversation about the mission statement and strategic aims of the organization. Connecting data to business outcomes will drive the motivation to sustain Data Governance for the long haul.

The team can then document this motivation in the Data Governance Charter, which should include a clear vision of the organization's future state when it comes to data. You can review our Document Checklist for more information about what should be included in a Data Governance Charter and a list of other documents to support the Data Management business function.

Beyond drafting a Data Governance Charter, here are some additional processes to consider implementing as part of a Data Governance initiative:

  • Develop Operating Framework
  • Develop RACI matrix
  • Conduct Maturity Assessment
  • Conduct Gap Assessment
  • Develop Roadmap with transition steps
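
To make the RACI matrix step concrete, here is a minimal sketch in Python. The role names come from the organizational structures listed in this article, but the activities and the R/A/C/I assignments are hypothetical. The validation rule (exactly one Accountable and at least one Responsible per activity) is a common RACI convention, not a DMBOK requirement.

```python
# Hypothetical activities and assignments, for illustration only.
# R = Responsible, A = Accountable, C = Consulted, I = Informed.
RACI = {
    "Draft Data Governance Charter": {
        "Chief Data Officer": "A",
        "Data Governance Council": "R",
        "Steering Committee": "C",
        "Data Stewards Working Groups": "I",
    },
    "Conduct Maturity Assessment": {
        "Program Office": "A",
        "Data Stewards Working Groups": "R",
        "Chief Data Officer": "I",
    },
    "Develop Roadmap": {
        "Steering Committee": "A",
        "Program Office": "R",
        "Data Governance Council": "C",
    },
}


def validate_raci(matrix):
    """Return a list of problems found in the matrix (empty if none)."""
    problems = []
    for activity, assignments in matrix.items():
        codes = list(assignments.values())
        accountable = codes.count("A")
        if accountable != 1:
            problems.append(
                f"{activity}: expected exactly one Accountable, found {accountable}"
            )
        if "R" not in codes:
            problems.append(f"{activity}: no Responsible role assigned")
    return problems


print(validate_raci(RACI))
```

Because the matrix names roles rather than people, it stays valid when individual contributors change.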

As we turn to talking about the organizational structures required to support Data Governance, it's worth remembering that Data Governance is a process, not a project. There isn't a set end date. Therefore, it's important to maintain continuity by setting up Roles & Responsibilities that will outlast any specific individual contributor.

Organizational structures to support Data Governance:

  • Chief Data Officer
  • Steering Committee
  • Data Governance Council
  • Program Office
  • Data Stewards Working Groups

How to get started — here are three practical initial steps for a Data Governance initiative:

  • First, align data-related benefits and risks to the organization's strategic objectives.
  • Second, set up an organizational structure for the long term.
  • Finally, conduct a Data Maturity Assessment that includes conversations across the organization about data culture. This will help the Data Governance team get a better understanding of attitudes and behaviors around data. You can ask questions like: Is data freely shared across business units? Do data consumers have the skills and endpoints required to access the data they need in their work? Is data quality trusted?
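
As a rough sketch of how the answers to those questions could be summarized, the Python below averages survey responses and ranks each question by its distance from a target score. The 1-5 agreement scale, the sample responses, and the target of 4 are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical survey responses on a 1-5 agreement scale (5 = strongly agree).
responses = {
    "Data is freely shared across business units": [2, 3, 2, 4],
    "Data consumers have the skills and endpoints they need": [3, 4, 3, 3],
    "Data quality is trusted": [1, 2, 2, 3],
}
TARGET_SCORE = 4  # assumed target maturity level


def maturity_gaps(responses, target):
    """Average each question's responses and rank questions by gap to target."""
    gaps = {
        question: round(target - mean(scores), 2)
        for question, scores in responses.items()
    }
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


for question, gap in maturity_gaps(responses, TARGET_SCORE):
    print(f"{gap:>5}  {question}")
```

The largest gaps suggest where conversations about data culture might start.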

Setting up Data Governance requires time and energy, but it's foundational to improving the organization's data infrastructure. It's truly the key to unlocking the potential of all other data-related activities. Keep in mind that Data Governance is an ongoing process, not a project.

Data Quality

Percentage of the CDMP Fundamentals Exam: 11%

The DMBOK identifies nine dimensions related to Data Quality: Accuracy, Completeness, Consistency, Integrity, Reasonability, Timeliness, Uniqueness/Deduplication, Validity, and Accessibility.

Data should be fit for a purpose. It should meet the requirements of its authors, users, and administrators.
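
Some of these dimensions can be measured directly. The sketch below scores three of them (Completeness, Uniqueness, and Validity) on a handful of hypothetical customer records. The field names, the sample data, and the deliberately simple email pattern are assumptions, and real thresholds would depend on the purpose the data serves.

```python
import re

# Hypothetical customer records for illustration.
records = [
    {"id": 1, "email": "ana@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "bo@example"},
    {"id": 4, "email": "cy@example.com"},
]

# Deliberately simple format check; not a full email validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, field):
    """Share of rows where the field is populated."""
    return sum(r[field] is not None for r in rows) / len(rows)


def uniqueness(rows, field):
    """Number of distinct values divided by number of rows."""
    return len({r[field] for r in rows}) / len(rows)


def validity(rows, field, pattern):
    """Share of populated values that match the expected format."""
    values = [r[field] for r in rows if r[field] is not None]
    return sum(bool(pattern.match(v)) for v in values) / len(values)


print("email completeness:", completeness(records, "email"))
print("id uniqueness:", uniqueness(records, "id"))
print("email validity:", round(validity(records, "email", EMAIL_PATTERN), 2))
```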

How to get started:

  • First, to move forward with a Data Quality initiative, set up a formal Data Quality Reporting Process. In The Lean Startup, Eric Ries talks about the importance of asking “why” 5 times in order to reach the root cause of a problem. That exercise is relevant to investigating Data Quality issues and conducting root cause remediation.
  • Second, you might consider data literacy training for data owners as well as data consumers across the organization.
  • Third, a significant number of Data Quality issues stem from lack of quality metadata, especially in the era of Data Lakes. It's possible that data is loaded into the storage system and then lost due to poor quality metadata tags. Efforts to create a data catalog could be useful in resolving this challenge.
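
To illustrate the data catalog idea in the last step, here is a minimal sketch of a catalog that flags datasets at risk of being "lost" in a Data Lake because their metadata is incomplete. The `CatalogEntry` fields, the file paths, and the rule that every dataset needs an owner, a description, and at least one tag are assumptions, not a standard.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Minimal metadata for one dataset (fields are illustrative assumptions)."""
    path: str
    owner: str = ""
    description: str = ""
    tags: list = field(default_factory=list)


# Hypothetical catalog contents.
entries = [
    CatalogEntry(
        "lake/sales/2024.parquet",
        owner="Finance",
        description="Daily sales totals",
        tags=["sales", "finance"],
    ),
    CatalogEntry("lake/tmp/export_17.csv"),  # loaded with no metadata
]


def missing_metadata(catalog):
    """Return paths of datasets lacking an owner, a description, or tags."""
    return [e.path for e in catalog if not (e.owner and e.description and e.tags)]


def search(catalog, tag):
    """Return paths of datasets carrying the given tag."""
    return [e.path for e in catalog if tag in e.tags]


print(missing_metadata(entries))
print(search(entries, "sales"))
```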

Data Architecture

Percentage of the CDMP Fundamentals Exam: 6%

This area represents the transformation of business needs into technical specifications. Architecture describes the current and future state of data infrastructure. The Enterprise Data Model and Data Flow Diagram are key tools that form the backbone of Data Architecture.

An Enterprise Data Model is a holistic, enterprise-level, implementation-independent Conceptual or Logical data model that provides a common, consistent view of data across the enterprise.

A Data Flow Diagram defines the requirements and master blueprint for storage and processing across databases, platforms, and networks. It can be presented as a matrix, which provides a clear overview of data interchange. You could add more detail to this layout, such as comments indicating the storage system.
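
As a rough illustration of the matrix layout, the sketch below builds a source-by-target grid from a list of flows. The system names and the data exchanged are hypothetical, and the sketch assumes at most one flow per source and target pair.

```python
# Hypothetical data flows: (source system, target system, data exchanged).
flows = [
    ("CRM", "Data Warehouse", "customers"),
    ("ERP", "Data Warehouse", "orders"),
    ("Data Warehouse", "BI Dashboard", "sales summary"),
]


def build_flow_matrix(flows):
    """Return the sorted system names and a source -> target -> data grid."""
    systems = sorted({system for flow in flows for system in flow[:2]})
    matrix = {src: {dst: "" for dst in systems} for src in systems}
    for src, dst, data in flows:
        matrix[src][dst] = data  # a second flow for the same pair would overwrite
    return systems, matrix


def render(systems, matrix):
    """Format the grid as text: rows are sources, columns are targets."""
    cells = [cell for row in matrix.values() for cell in row.values()]
    width = max(len(text) for text in systems + cells) + 2
    lines = ["".ljust(width) + "".join(s.ljust(width) for s in systems)]
    for src in systems:
        row = "".join(matrix[src][dst].ljust(width) for dst in systems)
        lines.append(src.ljust(width) + row)
    return "\n".join(lines)


systems, matrix = build_flow_matrix(flows)
print(render(systems, matrix))
```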

How to get started with Data Architecture:

  • Develop an Enterprise Data Model (EDM) that provides a consistent view of data across the organization
  • Develop a Data Flow Diagram that defines the requirements and a blueprint for storage and processing across databases, applications, platforms, and networks

Data Modeling

Percentage of the CDMP Fundamentals Exam: 11%

Modeling provides a blueprint for how data is connected.

  • Starting with the Conceptual Data Model, business concepts and activities are documented as entities and relationships
  • The Logical Data Model captures detailed requirements. This phase builds on requirements and existing documentation to add associative entities
  • Finally, the development of the Physical Data Model outlines how data will be stored in the enterprise system
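
The three levels can be sketched with a small, hypothetical example of students enrolling in courses. The entity names, attributes, and the choice of SQLite for the physical layer are assumptions made for illustration. Note how the Logical Model adds an associative entity (Enrollment) to resolve the many-to-many relationship.

```python
import sqlite3

# Conceptual: business entities and how they relate (hypothetical example).
conceptual = {
    "entities": ["Student", "Course"],
    "relationships": [("Student", "enrolls in", "Course")],  # many-to-many
}

# Logical: attributes and keys, plus an associative entity that resolves
# the many-to-many relationship. Still independent of any database product.
logical = {
    "Student": {"student_id": "identifier", "name": "text"},
    "Course": {"course_id": "identifier", "title": "text"},
    "Enrollment": {
        "student_id": "references Student",
        "course_id": "references Course",
    },
}

# Physical: how the data is stored in one specific system (SQLite here).
PHYSICAL_DDL = """
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
"""


def create_schema():
    """Create the physical schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(PHYSICAL_DDL)
    return conn


def table_names(conn):
    """List the tables that exist in the database, in alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


print(table_names(create_schema()))
```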

Once the modeling process is complete, the team may choose to undertake a process of reverse engineering from Physical Model, to Logical Model, to Conceptual Model, in order to ensure requirements are met.

At large enterprises, there's often a tendency to jump right into the Logical Model and skip the Conceptual Model entirely. This is a mistake: starting with the Conceptual Model helps the team build a common understanding of data entities and relationships.

How to get started with Data Modeling:

  • Select scheme and notation
  • Gather entities and relationships
  • Utilize organization-specific terminology

Master & Reference Data Management

Percentage of the CDMP Fundamentals Exam: 10%

This was one of the concepts that I was most unfamiliar with when I started reading the DMBOK. So I hope to provide you with a more intuitive understanding of what this area entails.

Master Data is information about business entities. It is collected and preserved as a “source of truth” and a resource across the entire organization. This reduces variation in how critical entities are defined and identified, and allows for data to be shared across business functions and applications. It promotes standards of shared data models and integration patterns.

Similarly, Reference Data refers to data that should be shared across the organization. It's often gathered from external sources, and it's used to provide context to the organization's functions and activities.
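
A small sketch may help make the distinction concrete. Below, a lookup of ISO 3166-1 country codes plays the role of Reference Data, and two hypothetical source systems hold conflicting versions of the same customer. The record layout and the merge rule (prefer one system's value unless it is missing) are assumptions. Real Master Data Management tools support much richer matching and survivorship rules.

```python
# Reference Data: a shared lookup sourced externally (ISO 3166-1 alpha-2 codes).
COUNTRY_CODES = {"US": "United States", "DE": "Germany", "JP": "Japan"}

# The same customer as recorded by two hypothetical source systems.
crm_record = {
    "customer_id": "C-100",
    "name": "Acme Corp.",
    "country": "US",
    "phone": None,
}
billing_record = {
    "customer_id": "C-100",
    "name": "ACME Corporation",
    "country": "us",
    "phone": "555-0100",
}


def build_master_record(preferred, fallback):
    """Merge two records into a single "source of truth" record.

    Assumed rule: take the preferred system's value unless it is missing.
    """
    master = {}
    for key in preferred.keys() | fallback.keys():
        value = preferred.get(key)
        master[key] = value if value is not None else fallback.get(key)
    # Standardize against Reference Data so every system uses the same code.
    master["country"] = master["country"].upper()
    if master["country"] not in COUNTRY_CODES:
        raise ValueError(f"Unknown country code: {master['country']}")
    return master


print(build_master_record(crm_record, billing_record))
```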

How to get started:

  • Plan for storage in a system of record
  • Collect business entities and document data
  • Set up a process for ongoing monitoring

Master and Reference Data provide a common language and emphasize that data is a shared asset across the organization.

Data Warehousing & Business Intelligence

Percentage of the CDMP Fundamentals Exam: 10%

The data warehouse is a specific infrastructure element that provides data consumers, such as analysts and data scientists, with access to data that has been shaped to conform to business rules and is stored in an easy-to-query format.

The data warehouse typically connects information from multiple “source-of-truth” transactional databases. The contents of a data warehouse have been restructured for speed and ease of querying: partitioning, indexing, and reducing the complexity of table joins all improve query performance.
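
Here is a minimal sketch of that restructuring, using an in-memory SQLite database as a stand-in for both the transactional sources and the warehouse. The table and column names are hypothetical. SQLite has no table partitioning, so only the pre-joined table and the index are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Transactional source tables (hypothetical schema).
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    order_date TEXT,
    amount REAL
);
INSERT INTO customers VALUES (1, 'East'), (2, 'West');
INSERT INTO orders VALUES
    (10, 1, '2024-01-15', 100.0),
    (11, 2, '2024-01-20', 250.0),
    (12, 1, '2024-02-03', 50.0);

-- Warehouse table: joined and aggregated once at load time,
-- so data consumers do not have to repeat the join.
CREATE TABLE sales_by_region AS
SELECT c.region,
       substr(o.order_date, 1, 7) AS month,
       SUM(o.amount) AS total
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
GROUP BY c.region, substr(o.order_date, 1, 7);

-- Index on the column that queries filter by.
CREATE INDEX idx_sales_region ON sales_by_region (region);
""")


def monthly_sales(region):
    """Query the warehouse table directly; no joins needed."""
    rows = conn.execute(
        "SELECT month, total FROM sales_by_region WHERE region = ? ORDER BY month",
        (region,),
    )
    return rows.fetchall()


print(monthly_sales("East"))
```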

Business intelligence is typically represented by reports and dashboards that provide insight to business stakeholders.

"The only sins that ever get addressed interrupt the flow of money."

I really like this quote from business intelligence consultant Rob Collie because it illustrates the fact that business intelligence will often reveal all the data quality issues that persist in a business. Just because a data quality problem isn't interrupting the cash flow, that doesn't mean it isn't insidiously impeding the organization's strategic objectives.

This is why it is so important to start Data Strategy efforts by establishing governance and promoting data quality, which then flow into effective analytics such as business intelligence and data science projects.

How to get started with Data Warehousing and Business Intelligence (the same three steps apply to both):

  • First, design with the end in mind, then build and deliver in agile sprints
  • Second, aggregate and optimize at the end of the implementation process
  • Third, promote self-service access to data through transparent communication of metadata and education for data consumers

These elements of Data Strategy sit toward the top of the Aiken Pyramid, where we are starting to get into activities such as advanced analytics and machine learning that might be one step more complicated than BI.

Conclusion

As in Maslow's hierarchy of needs, Data Science actualization cannot be attained without first meeting the physiological and safety needs: Data Governance, Data Quality, and Data Architecture, which form the foundational levels of the Aiken Pyramid. Success in these areas will set you up to pursue capabilities such as Data Modeling, Master & Reference Data Management, and Data Warehousing & Business Intelligence.

Understanding Data Strategy transforms you from a data consumer into an empowered advocate for better Data Management practices. You may continue to deepen your authority in this field by becoming recognized as a Certified Data Management Professional (CDMP). The knowledge gained from reading the DMBOK can help you set up a robust Data Strategy for your team.

To that end, you may also be interested in the Data Strategy Workbook, which contains 20 exercises to help your organization accelerate the development of its Data Strategy capabilities. The six Data Management capabilities discussed in this article are also featured in our Foundations of Data Strategy Poster, which can be a helpful tool for keeping this information top of mind.

Nicole Janeway Bills

Data Strategy Professionals Founder & CEO

Nicole offers a proven track record of applying Data Strategy and related disciplines to solve clients' most pressing challenges. She has worked as a Data Scientist and Project Manager for federal and commercial consulting teams. Her business experience includes natural language processing, cloud computing, statistical testing, pricing analysis, ETL processes, and web and application development.