Governed Data Discovery vs BI:
Is there a best strategy?
Governed Data Discovery (GDD) has been one of the trending buzzwords since 2014.
The term made its debut in Gartner’s 2014 Magic Quadrant report. Gartner first popularized the term "data discovery" and has now added "governed" to it.
In many cases, even successfully delivered Business Intelligence projects contribute to a company’s IT capability but do not increase its business capability. This happens mainly because the resulting analytics do not match the business context or, put simply, the reports and dashboards do not answer business questions. Knowledge is presented in a way that does not allow wisdom to be extracted from it.
The GDD Solution
“Governed Data Discovery – the ability to meet the dual demands of enterprise IT and business users” – asserts Gartner.
GDD differs considerably from BI. The key points below describe the major characteristics of each, starting with the traditional BI process:
- Defining business goal metrics
- Analyzing business processes
- Identifying operational data
- Defining business logic and rules
- Defining mapping and quality rules
Analysis -> Design -> Develop
- Reports: Define what happened
- OLAP: Defined metrics/dimensions
- Dashboards: Defined metrics/goals
- Scorecards: Defined KPIs/variance
Governed Discovery Process
- Clear business goal (mission)
- Inherent business knowledge
- Access data and work with context
- Ability to iterate analytic models
- Ability to share and verify findings
Discover -> Verify -> Govern
- Discovered context becomes governed
- Discovered analytic models
- Data shows how the business works
- Data shows how the market works
To be a true Governed Data Discovery solution, the BI platform needs to offer four main components:
1) Self-Service Central Control
A built-in, self-service, and centralized administrative toolset is needed to govern an organization’s BI. This robust and extensive administrative backend must make it possible to manage every user’s experience, security, content, and data access from a single, intuitive interface. Power BI is a good approximation of a platform that can support a GDD solution.
2) Data Governance
The data needs to remain centralised so that there is a single version of the truth.
A Data Lineage capability that tracks the lifecycle of the model is also highly recommended, especially in large-scale projects.
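To make the idea of data lineage more concrete, here is a minimal sketch of an append-only log that records where each model came from. It is an illustration only and does not reflect how any particular BI platform implements lineage; the class names, field names, and sample dataset names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class LineageEvent:
    """One recorded step in the life of a model (field names are hypothetical)."""
    model: str    # the model or dataset this event describes
    action: str   # e.g. "created", "transformed", "published"
    source: str   # the upstream dataset or model this step read from
    user: str     # who performed the step
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class LineageLog:
    """Append-only log that can answer: where did this model come from?"""

    def __init__(self) -> None:
        self._events: List[LineageEvent] = []

    def record(self, model: str, action: str, source: str, user: str) -> None:
        self._events.append(LineageEvent(model, action, source, user))

    def upstream(self, model: str) -> List[str]:
        """Walk recorded sources back to the original inputs, nearest first."""
        seen: List[str] = []
        frontier = [model]
        while frontier:
            current = frontier.pop(0)
            for event in self._events:
                if event.model == current and event.source not in seen:
                    seen.append(event.source)
                    frontier.append(event.source)
        return seen


# Hypothetical sample history: raw export -> cleaned data -> model -> dashboard.
log = LineageLog()
log.record("sales_clean", "transformed", "crm_export", "etl_service")
log.record("sales_model", "created", "sales_clean", "analyst_a")
log.record("sales_dashboard", "published", "sales_model", "analyst_a")

print(log.upstream("sales_dashboard"))
# prints ['sales_model', 'sales_clean', 'crm_export']
```

In a large-scale project, a log like this would let an administrator see which dashboards are affected when a source dataset changes.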
3) The Content Lifecycle
A centralized, shared content repository that also tracks the content lifecycle is needed. This ensures content integrity and makes it easy to find and implement any changes or upgrades.
4) Secure & Protected
A strong security model is vital so that the data is not only secure from external threats, but also kept confidential internally.
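One common way to keep data confidential internally is to filter rows according to who is asking. The sketch below assumes a simple, hypothetical policy in which administrators see everything and other users see only rows for their own region. The `User` type, the role names, and the sample data are invented for illustration and do not describe any specific product’s security model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class User:
    name: str
    role: str
    region: str


# Hypothetical policy: roles that may see rows from every region.
UNRESTRICTED_ROLES = {"admin"}


def visible_rows(user: User, rows: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Return only the rows this user is allowed to see."""
    if user.role in UNRESTRICTED_ROLES:
        return list(rows)
    return [row for row in rows if row["region"] == user.region]


# Hypothetical sample data.
sales = [
    {"region": "EMEA", "revenue": 120},
    {"region": "APAC", "revenue": 95},
]

analyst = User("analyst_a", "analyst", "EMEA")
admin = User("admin_b", "admin", "EMEA")

print(visible_rows(analyst, sales))  # only the EMEA row
print(visible_rows(admin, sales))    # both rows
```

Applying the filter centrally, rather than in each report, means every piece of content built on the data inherits the same rule.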
All approaches in data management can be split into four cases:
Although good results are theoretically achievable in all four cases, some of them are much harder to realise than others. It is hard to imagine that Big Data (the bottom-right variant) can be provided to end users without cleaning and basic transformation, which obviously involves Semantic Integration and so, in effect, adds context to the data. In this way, the implementation moves from the bottom-right variant to the top-right one.
For GDD, the most feasible variant is a combination of well-prepared data (at the level of “information”) and full flexibility for end users to extract “wisdom” from it:
In conclusion, there is no way in the real world to avoid structuring data before delivering it to the business for decision support. The only choice is whether to do it the rigid way (BI) or the flexible way (GDD).