Most businesses today are operating in a distributed computing environment with data spread across
multiple types of data stores on-premises, in multiple clouds, in SaaS applications and at the edge.
This makes data much harder to find and to govern. Yet data governance is now a very high priority in
most organisations, not just to remain compliant with legal and regulatory obligations but also to
create a high quality, secure data foundation of reusable data products to underpin data and AI
initiatives now happening across every department in the enterprise. With data and AI now strategic in
the boardroom, data governance has become so important that companies classified as ‘leaders’
regard it as a strength that gives them commercial advantage and not just an initiative to remain
compliant with legislation like GDPR.
To date, however, data governance in many organisations has been fractured, with different tools being
used to support different data governance disciplines. This includes the use of different tools for data
quality, data privacy, data access security, data sharing and data retention. Also, data catalogs that
can automatically discover and classify data in many different data stores are often purchased
independently of other data governance tools. Therefore, while automated data discovery is possible,
to some extent, almost all data governance disciplines are still highly dependent on people manually
keying and re-keying policies and other metadata across many different tools to try to ensure data
remains governed and that governance policies are consistently enforced across the enterprise.
However, as more data is created and new data sources continue to appear, the challenge of manually
understanding all data relationships and manually governing data is becoming almost impossible. In
addition, doing this with multiple tools is also challenging, as there are no industry standards to
exchange metadata across those tools, which means data governance tasks often have to be repeated
again and again to ensure data is being correctly governed across a distributed data estate.
Given this increasingly difficult challenge, companies are looking for a more automated way to deal
with data governance. Doing this requires taking data governance to a new level by introducing AI-driven,
active data governance. Active data governance is more than just keeping metadata up to date.
It uses AI-driven automatic data discovery and data classification, tag-based
policy management, and an AI-driven data governance action framework to continuously govern data
more efficiently and effectively. An AI-driven data governance action framework needs to include data
governance health metrics to monitor progress, data governance events, automated data governance
issue detection, automated verification that actions have occurred, different types of governance
action services, data governance action processes and automated triggering of data governance
action processes to ensure the health of your data continues to improve.
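As an illustration only, the core loop of such an action framework (measure health metrics, detect a breach, trigger an action process, verify the outcome) might be sketched as follows. All names here, including `HealthControl`, `GovernanceEvent`, the metric names and the action registry, are hypothetical and are not taken from any particular governance platform.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class HealthControl:
    """Hypothetical health control: a metric, its minimum score, and the action to trigger."""
    metric: str
    threshold: float
    action: str


@dataclass
class GovernanceEvent:
    """Raised when a measured health metric falls below its control threshold."""
    asset: str
    metric: str
    score: float
    threshold: float
    action: str


def detect_issues(asset: str, scores: Dict[str, float],
                  controls: List[HealthControl]) -> List[GovernanceEvent]:
    """Compare measured metrics against controls and emit one event per breach."""
    events = []
    for control in controls:
        score = scores.get(control.metric)
        if score is not None and score < control.threshold:
            events.append(GovernanceEvent(asset, control.metric, score,
                                          control.threshold, control.action))
    return events


def trigger_actions(events: List[GovernanceEvent],
                    registry: Dict[str, Callable[[GovernanceEvent], str]]) -> List[str]:
    """Invoke the registered action process for each event that has one."""
    return [registry[e.action](e) for e in events if e.action in registry]


def verify(event: GovernanceEvent, new_scores: Dict[str, float]) -> bool:
    """Check, after the action has run, that the metric now meets its threshold."""
    return new_scores.get(event.metric, 0.0) >= event.threshold


# Example with made-up metrics for one asset
controls = [
    HealthControl("classified_columns_pct", 0.95, "request_classification"),
    HealthControl("owner_assigned", 1.0, "assign_owner"),
]
registry = {
    "request_classification": lambda e: f"classification task opened for {e.asset}",
    "assign_owner": lambda e: f"ownership task opened for {e.asset}",
}
events = detect_issues(
    "crm.customers",
    {"classified_columns_pct": 0.80, "owner_assigned": 1.0},
    controls,
)
results = trigger_actions(events, registry)
```

In this sketch only the classification metric breaches its threshold, so a single action is triggered; `verify` would then be called with freshly measured scores to confirm that the action had the intended effect.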
This two-day in-depth class looks at this problem and shows how to successfully implement AI-driven
active data governance across a distributed data estate. This includes AI-driven active
governance of data access security, data privacy, data sharing, data retention, data quality and data
usage. The class looks at the business problems caused by poorly governed data and how it can
seriously impact business operations, cause unplanned operational costs, and destroy confidence in
the accuracy of BI, machine learning model predictions and recommendations, and Generative AI.
It also looks at requirements for AI-driven active data governance. Having understood the
requirements, you will learn what should be in a governance programme. This includes data
governance roles and responsibilities, processes, policies, technologies, and data governance
capabilities to govern data across a distributed data estate. It looks at how to implement AI-driven
active data governance by breaking the data governance problem down into a series of steps that
need to be implemented and looks at how to take advantage of emerging AI-driven data governance
platforms to implement this. The class will cover:
This module looks at what active data governance is, how ungoverned data can impact business operations and decision making and increase risk, and where most companies are in terms of implementing data governance. It also looks at how this challenge has grown to encompass multiple governance disciplines, including data quality, data privacy, data access security, data sharing, data retention, and data usage across a hybrid, multi-cloud distributed data estate. Finally, it looks at the fractured approach many companies have been taking to tackle this challenge, the problems caused by using best-of-breed tools, and why it is no longer practical to expect people to do everything manually when AI can help automate tasks.
• The ever-increasing distributed data landscape – on-premises, multiple clouds, SaaS applications and the edge
• The impact of ungoverned data on compliance with legislation and regulations, business profitability and ability to respond to competitive pressure
• Data Governance – Where are we?
• Beyond data quality to multi-disciplinary data governance
• Problems caused by a fractured approach to data governance using multiple best-of-breed tools
• Why AI is needed, and a data catalog is not enough
This module looks at what the requirements are to govern data in terms of people, process, technologies, policies and capabilities and what we need to change to move from manual data governance to AI-driven active data governance automation.
• Key requirements for governing data & content across a distributed data estate
• People
Data governance roles, responsibilities and groups
Core processes & tasks to standardise data governance activities, approvals and actions
• Technology requirements
Universal data governance platform
Data catalog
Trainable ML classifiers
Data Governance co-pilots & AI agents
Dynamic data masking
Universal data access control
Data marketplace
Data loss prevention software
Data governance observability
Data governance enforcement agents
The Data Catalog Marketplace
Alation, Ataccama, Atlan, AWS Glue Data Catalog, BigID, Cambridge Semantics Anzo Data Catalog, Collibra Data Catalog, data.world, Databricks Unity Catalog, Google Data Catalog, Hitachi Vantara Pentaho Data Catalog, IBM Knowledge Catalog, Informatica IDMC Data Governance and Catalog, Microsoft Purview, Oracle, SAP Datasphere, Qlik (Talend) Data Catalog, TopQuadrant TopBraid
Data governance platforms
Data access security tools
Foundational AI-assisted active data governance services
Data catalog business glossary
Data source discovery scanners
Data catalog data map and data relationship knowledge graph
Data governance classifiers
Generative AI for automated metadata enrichment and conversational data search
• Active data governance applications, policies & policy types needed to govern:
Data quality
Data access security
Data privacy
Data retention
Data loss prevention
Data sharing
Data use and maintenance
• Data governance action framework
Data governance health metrics to measure data health, how it’s processed, protected and used
Critical data elements
Example data governance events & incidents to be monitored
Likely triggers of governance actions
AI models for data governance alerting, recommendation, action and task automation
Data governance observability agents to monitor & detect incidents
Using metadata lineage to understand the impact of events and incidents
Data Governance action services
Data governance verification service to check that governance & data curation activities have occurred
Data governance rectification services to rectify metadata
Data governance action invocation service
Example actions to be performed on items (e.g. on a data source, a business term, a data asset, a policy, a data product)
Data governance action processes to ensure actions are standardised and consistently executed
Organising human actions using an in-box and human task monitoring
• Data standardisation using a common business vocabulary
• The purpose of a business glossary in data governance
• Business glossary capability in data catalogs
Alation, AWS Glue, Collibra, Informatica IDMC Business Glossary, IBM Knowledge Catalog, Microsoft Purview, Qlik (Talend), TopQuadrant TopBraid
• Planning for a business glossary
• Glossary roles and responsibilities
• Glossary term submission, voting approval and dispute resolution processes
• Approaches to creating a common vocabulary
• Organising data definitions in a business glossary
• The role of a data concept model
• Utilising a common vocabulary in BI tools, semantic layers, data modelling, data fabric, MDM and APIs
Having defined your data, this module looks at automatically discovering what data you have, where it is and how it maps to your business glossary to provide a business understanding of your data estate.
• Understanding your data estate — the critical role of AI-driven data catalog software
• Registering data sources for discovery
• Automated data discovery and data quality profiling using a data catalog
• Automating metadata enrichment using Generative AI
• AI automated suggestion of business glossary terms during data discovery
• AI-assisted mapping of physical data assets to business glossary terms
• Setting policies and health controls to monitor business glossary creation and data curation
• Using AI agents to monitor data curation activity and automate curation actions
This module looks at manual and AI-driven automatic labelling of data and content, so that you know how to govern it, using predefined AI classifiers, user-defined classification schemes and trainable AI classifiers. It also looks at how automatically classified data shows up in a data catalog and how policies can be assigned to labelled data to govern it across your data estate.
• What is AI-driven data classification?
• Automatically detecting and classifying sensitive structured data using predefined AI classifiers in a data catalog
• Creating your own data confidentiality and retention classification schemes
• Manually classifying content using your own classification scheme (e.g. Office documents, SharePoint, Email, Chat, Microsoft Teams or Zoom Meetings)
• Training and using AI classifiers to auto label content across your data estate
• Using classification insights to understand sensitive data proliferation and data redundancy across your estate
• Understanding policies, policy groups and tag-based policy management
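To make the idea of tag-based policy management concrete, the following is a minimal sketch in which policies are attached to classification tags rather than to individual data assets, so that any asset carrying a tag inherits its policies. The tag names, policy names and the `TAG_POLICIES` mapping are assumptions made for illustration and do not reflect any specific catalog product.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: a name and a human-readable rule."""
    name: str
    rule: str


# Assumed mapping from classification tags to the policies they carry
TAG_POLICIES: Dict[str, List[Policy]] = {
    "PII": [Policy("mask-pii", "mask unless the user is a privacy officer")],
    "Confidential": [Policy("restrict-sharing", "no external sharing")],
    "Retain-7y": [Policy("retain-7-years", "archive 7 years after creation")],
}


def policies_for(tags: Set[str]) -> List[Policy]:
    """Resolve the policies that apply to an asset from its tags alone."""
    resolved: List[Policy] = []
    for tag in sorted(tags):  # sorted for a deterministic order
        resolved.extend(TAG_POLICIES.get(tag, []))
    return resolved


# An asset labelled by a classifier picks up its policies automatically
customer_table_tags = {"PII", "Retain-7y"}
applied = [p.name for p in policies_for(customer_table_tags)]
```

The design point is that a policy is defined once per tag; when a classifier labels a new column or document, no policy needs to be re-keyed for that asset.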
Having classified the data and content in your data estate, this module looks at protecting data and content, focusing on sensitive or confidential data. It looks at setting and enforcing policies to govern data access and usage security, as well as governing data loss prevention.
• Data security objectives
• Key technologies in governing data security
Policy establishment
Policy enforcement
• Steps to implement data security
• Setting health controls and enterprise-wide and domain-specific policies in your data catalog to govern data access across your data estate
Attribute based access control
• Unifying data access control across multiple data stores
• Universal authorisation fabric software (e.g. IBM, Immuta, Databricks (Okera)) and how they integrate with data catalogs
• Using cloud application security brokers
Auto discovery of cloud app data usage
Setting policies to govern access to and use of sensitive data and content from applications
Monitoring cloud application activity
• Dealing with insider risk management and internal information barriers
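As a rough illustration of attribute-based access control combined with dynamic data masking, the sketch below makes an access decision purely from user and asset attributes. The attribute names (`domains`, `clearance`, `sensitivity`) and the three outcomes are assumptions chosen for the example, not a description of any vendor's authorisation engine.

```python
from typing import Dict


def decide(user: Dict, asset: Dict) -> str:
    """Return 'deny', 'mask' or 'allow' from attributes of the user and the asset."""
    if asset["domain"] not in user["domains"]:
        return "deny"   # user has no business need for this data domain
    if user["clearance"] < asset["sensitivity"]:
        return "mask"   # user may see the column, but not the raw values
    return "allow"


def read_value(user: Dict, asset: Dict, value: str) -> str:
    """Apply the decision to a single value at query time."""
    decision = decide(user, asset)
    if decision == "deny":
        raise PermissionError(f"access to {asset['name']} denied")
    if decision == "mask":
        return "*" * len(value)
    return value


# Hypothetical asset and users
email_column = {"name": "crm.customers.email", "domain": "sales", "sensitivity": 2}
analyst = {"domains": {"sales"}, "clearance": 1}
privacy_officer = {"domains": {"sales", "hr"}, "clearance": 3}
engineer = {"domains": {"platform"}, "clearance": 3}

masked = read_value(analyst, email_column, "jo@example.com")
clear = read_value(privacy_officer, email_column, "jo@example.com")
```

Because the decision depends only on attributes, the same rule can in principle be enforced across multiple data stores without per-store grants.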
This module looks at AI-assisted governance of personal and financially sensitive data across your data estate to remain compliant with legislation in multiple jurisdictions.
• Data privacy objectives
• Data privacy legislation — GDPR, CCPA, HIPAA and more
• Steps involved in enterprise-wide data privacy risk management
• Using AI to automatically identify where unprotected personal data is located
• Setting policies and health controls in your data catalog to govern data privacy across your data estate
• Data privacy insights on sensitive data location, how it moves and where you are at risk
• Data privacy policy enforcement across a distributed data landscape
Linking your data catalog to other technologies
Encrypting and de-identifying personal data
Using data loss prevention (DLP) to avoid loss of personal data
Protecting personal data in email, chat, documents, file shares, cloud storage and endpoints
• Using AI agents to monitor data privacy violations and automate actions
• Monitoring AI agent effectiveness
• Managing subject action requests
This module looks at governing the lifecycle of data across your data estate and how AI can automatically classify data to label it for retention, set policies to control retention duration, and manage expiry and legal holds.
• Creating a data retention classification scheme
• Complying with country and region-specific legislation
• Training AI classifiers to label your data
• Automatically classifying data and content using AI to create retention labels
• Setting policies and health controls in your data catalog to govern data retention
• Using AI agents to monitor data retention expiration and automate actions to destroy, archive and hold data
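The retention steps above can be sketched, under simplifying assumptions, as a function that takes a retention label, a creation date and a legal-hold flag and returns the action an agent would take. The labels, durations and expiry actions in `RETENTION_RULES` are invented for illustration; real schedules depend on the applicable legislation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Tuple

# Assumed retention scheme: label -> (years to retain, action at expiry)
RETENTION_RULES: Dict[str, Tuple[int, str]] = {
    "financial-record": (7, "archive"),
    "marketing-contact": (2, "destroy"),
}


@dataclass
class Item:
    name: str
    label: str
    created: date
    legal_hold: bool = False


def add_years(d: date, years: int) -> date:
    """Add whole years, mapping 29 Feb to 28 Feb in non-leap target years."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def retention_action(item: Item, today: date) -> str:
    """Return 'hold', 'review', 'retain', or the expiry action for the item's label."""
    if item.legal_hold:
        return "hold"      # a legal hold overrides expiry
    rule = RETENTION_RULES.get(item.label)
    if rule is None:
        return "review"    # unlabelled or unknown label: route to a person
    years, expiry_action = rule
    if today >= add_years(item.created, years):
        return expiry_action
    return "retain"


today = date(2024, 1, 1)
invoice = Item("invoice-001", "financial-record", date(2015, 1, 1))
lead = Item("lead-042", "marketing-contact", date(2023, 6, 1))
held = Item("invoice-002", "financial-record", date(2010, 1, 1), legal_hold=True)
```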
This module looks at producing trusted, compliant data products to be shared across the enterprise and beyond, and how data sharing can be governed.
• Data sharing objectives
• Key technologies to help produce high-quality, compliant data products for sharing
• Steps to creating data products using a data catalog business glossary, automated discovery/mapping and AI-assisted data engineering
• A unified approach to producing high-quality data products using Data Fabric and DataOps pipelines
• AI-assisted publishing of certified, high-quality, compliant data products in a data marketplace
• Potential metadata standards for data products e.g. DPROD
• AI-assisted governance of data sharing and consumption using data contracts in a data marketplace
• Consumer use of AI-assisted conversational data search in a data marketplace
• Creating a standard data sharing approval process for consumers
• Using AI agents to monitor and track shared data consumption and usage
This module looks at AI-assisted governance of data quality across your data estate.
• The business impact of bad quality data
• Common data quality health metrics
• AI-assisted creation of data quality validation, matching and survivorship rules in your data catalog using Generative AI
• Using your data catalog to automatically profile and validate your data quality
• Setting data quality health controls and thresholds in your data catalog to govern data quality
• Leveraging data catalog AI-suggested data quality rules
• Integrating data observability with data catalogs to monitor and report data quality issues
• Using AI to monitor data quality validation rules against health control thresholds
• Auto-generating validation rules when data quality thresholds are breached
• Using the data catalog for AI-assisted data cleansing and generation of data integration pipelines
• AI-assisted MDM and the data catalog
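As a simple illustration of monitoring validation rules against health control thresholds, the sketch below computes the pass rate of each rule over a set of rows and flags any rule whose pass rate falls below its threshold. The rule names, sample rows and threshold values are assumptions for the example only.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[str]]
Rule = Callable[[Row], bool]


def pass_rate(rows: List[Row], rule: Rule) -> float:
    """Fraction of rows that satisfy the rule (1.0 for an empty data set)."""
    if not rows:
        return 1.0
    return sum(1 for row in rows if rule(row)) / len(rows)


def check_health(rows: List[Row], rules: Dict[str, Rule],
                 thresholds: Dict[str, float]) -> Dict[str, Dict[str, object]]:
    """Evaluate each rule and report whether its threshold has been breached."""
    report: Dict[str, Dict[str, object]] = {}
    for name, rule in rules.items():
        rate = pass_rate(rows, rule)
        report[name] = {"pass_rate": rate, "breached": rate < thresholds[name]}
    return report


# Hypothetical validation rules and thresholds
rules: Dict[str, Rule] = {
    "email_present": lambda r: bool(r.get("email")),
    "country_valid": lambda r: r.get("country") in {"GB", "US"},
}
thresholds = {"email_present": 0.9, "country_valid": 0.7}

rows: List[Row] = [
    {"email": "a@example.com", "country": "GB"},
    {"email": None, "country": "US"},
    {"email": "c@example.com", "country": "XX"},
    {"email": "d@example.com", "country": "GB"},
]
report = check_health(rows, rules, thresholds)
```

Here both rules pass on three of four rows, but only `email_present` is reported as breached because its threshold is stricter; a breach like this is the kind of event that could trigger a data quality action.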
This module looks at AI governance to manage AI models and AI risk.
• AI Governance best practices including:
Creating an AI inventory and risk registry
Setting up accountability for AI
Evaluating and mitigating AI risk
Governing AI development
AI observability
• Formalising processes and enabling auditing
• Explainable AI
• AI observability
• Avoiding PII leakage when building vectors for Generative AI LLMs
• Establishing AI guardrails for GenAI
• AI governance tools
This seminar is intended for CDOs, CIOs, heads of data governance, CISOs, business analysts, data
scientists, BI managers, data warehousing professionals, data architects, solution architects, data
strategists, database administrators and IT consultants.
Course Summary
Price: £995 + VAT (VAT only charged if UK resident)
Instructor: Mike Ferguson
Duration: 2 Days
Time: 9 am – 5 pm BST
Language: English
Certification: Yes – Certificate of completion
Join one of our upcoming online course dates, giving you expert-led insights, practical skills, and the flexibility to learn at your own pace. Stay ahead in your industry, boost your career prospects, and gain valuable knowledge from top professionals. Invest in your future with high-quality training designed for real-world success.