Overview

This course explores advanced techniques for Data Modelling. It also teaches how (and when) to create Data Models for non-relational solutions, including Big Data, and covers the uses for data models beyond Relational DBMS development.

In the modern era, the volume of data we deal with has grown significantly. As the volume, variety, velocity and veracity of data keep growing, the types of data generated by applications become richer than before. As a result, traditional relational databases are challenged to capture, store, search, share, analyse, and visualise data. Many companies attempt to manage big data challenges using a NoSQL (“Not only SQL”) database and may employ a distributed computing system such as Hadoop. NoSQL databases are typically non-relational, distributed, horizontally scalable, and schema-free; key-value stores are one common type.

Many organisations ask, “do we still need data modelling today?” Traditional data modelling focuses on resolving the complexity of relationships among schema-enabled data. However, these considerations do not apply to non-relational, schema-less databases. As a result, old ways of data modelling no longer apply.

This course will show Data Modelling approaches that apply not only to Relational databases, but also to Big Data, NoSQL, XML, and other formats. In addition, the uses of data models beyond the development of databases will be explored.

Prerequisite: Attendance at the Data Modelling Essentials class OR 3+ years of practical Data Modelling experience

Training Outline

Data Modelling Recap

Data modelling basics
Major constructs
Identifying entities
Data model types, and the linkage between them.

Levels of Models

Enterprise, Conceptual, Logical & Physical.
What is the purpose of each, and do we need all of these in a Big Data world?
Where does Dimensional modelling fit in?

Data Modelling – Back to the Future?

Data Modelling didn’t start with relational! This may be a surprise to many people, but the first uses of data models came well before Relational databases became the norm. The techniques are applicable to many of the modern non-relational formats we see today.

Modelling in the pre-relational days: we didn’t have RDBMSs. We had flat files, sequential files, VSAM, hierarchical DBMSs, network DBMSs and inverted architecture DBMSs.

The techniques that were developed for these are directly applicable to the NoSQL and Big Data world of today.

Data Modelling for Big Data & NoSQL

What has to change when we are developing data models for a Hadoop or other Big Data environment?

Do modelling tools support Big Data technologies? What are the restrictions and considerations?

What data modelling techniques are applicable when targeting a Big Data platform?

Does normalisation still have a place in the Big Data world?

Where’s our metadata in the model now?

In the age of big data, popular data modelling tools (e.g. ER/Studio, ERwin, PowerDesigner) continue to help us analyse and understand our data architectures by applying hybrid data modelling concepts. Instead of creating a pure relational data model, we can now embed NoSQL submodels within a relational data model. In general, data size and performance bottlenecks are the factors that help us decide which data goes to the NoSQL system.

Key-Value Pairs: A common misconception is that using data structures like JavaScript Object Notation (JSON) removes the need for a data model; THIS IS WRONG. We’ll show several examples and conclude that a set of JSON files can be just as complicated as a 100-table Third Normal Form data model.
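As a small illustration of the point above, the following Python sketch walks a couple of JSON documents and lists the attribute paths and types they imply. The documents, field names and the `implied_schema` helper are all invented for this example and are not course material; the sketch only suggests that even "schema-free" JSON carries entities (order, customer, order line) and a repeating group that a data model would make explicit.

```python
import json

# Hypothetical order documents, invented purely for illustration.
DOCS = [
    '{"order_id": 1, "customer": {"id": 7, "name": "Ann"},'
    ' "lines": [{"sku": "A1", "qty": 2}]}',
    '{"order_id": 2, "customer": {"id": 8, "name": "Bob"},'
    ' "lines": [{"sku": "B4", "qty": 1}], "coupon": "X"}',
]


def walk(value, path, found):
    """Record the type seen at each leaf path; '[]' marks a repeating group."""
    if isinstance(value, dict):
        for key, child in value.items():
            walk(child, f"{path}.{key}" if path else key, found)
    elif isinstance(value, list):
        for item in value:
            walk(item, path + "[]", found)
    else:
        found.setdefault(path, set()).add(type(value).__name__)


def implied_schema(docs):
    """Collect every leaf path and the set of type names observed there."""
    found = {}
    for text in docs:
        walk(json.loads(text), "", found)
    return found


schema = implied_schema(DOCS)
for path in sorted(schema):
    print(path, sorted(schema[path]))
```

Paths such as `customer.id` and `lines[].sku` hint at separate entities and a one-to-many relationship, and `coupon` appears in only one document, which is the kind of optionality a model would normally document.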

NoSQL & Hadoop: How the 4 types of NoSQL databases still need data models, and how the ACID vs BASE paradigm affects this.

Modelling for Hierarchic Systems & XML

What must change when developing data models for XML & Hierarchic systems?

Service-Oriented Architecture (SOA)

Why data models are essential for success.

Massively Denormalised Files

Is modelling needed?
How do we create data models for data lakes?

Dimensional Data Models

How do we create a dimensional model?
Converting an ER model to Dimensional.
Slowly changing dimensions: what types exist and when is each applicable?
Beyond the basics with conformed dimensions, bridges, junk dimensions & factless fact tables.
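To make the slowly changing dimension topic concrete, here is a minimal Python sketch of a Type 2 change, where the current row is closed and a new version is added so that history is preserved. The row layout (`sk`, `key`, `attrs`, `valid_from`, `valid_to`) and the `apply_scd2` function are assumptions made for this example, not a prescribed design; real implementations usually live in the ETL tool or in SQL.

```python
from datetime import date


def apply_scd2(rows, business_key, new_attrs, effective):
    """Return a new list of dimension rows with a Type 2 change applied.

    The current row for the business key (valid_to is None) is closed and
    a new version is appended. Nothing changes if the attributes match.
    """
    result = []
    changed = False
    next_sk = max((r["sk"] for r in rows), default=0) + 1
    for r in rows:
        is_current = r["key"] == business_key and r["valid_to"] is None
        if is_current and r["attrs"] != new_attrs:
            # Close the old version instead of overwriting it.
            result.append({**r, "valid_to": effective})
            changed = True
        else:
            result.append(r)
    key_exists = any(r["key"] == business_key for r in rows)
    if changed or not key_exists:
        result.append({
            "sk": next_sk,            # new surrogate key
            "key": business_key,      # business (natural) key
            "attrs": new_attrs,
            "valid_from": effective,
            "valid_to": None,         # None marks the current version
        })
    return result


# Hypothetical customer dimension with a single current row.
dim = [{"sk": 1, "key": "C7", "attrs": {"city": "Leeds"},
        "valid_from": date(2020, 1, 1), "valid_to": None}]
updated = apply_scd2(dim, "C7", {"city": "York"}, date(2024, 6, 1))
for row in updated:
    print(row)
```

A Type 1 change would simply overwrite `attrs` on the existing row; the choice between the two depends on whether the business needs to report against historical values.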

Application Packages & Data Models

Do we need to develop data models when implementing a COTS package?
Uses and benefits.

Using Data Models for Data Integration & Lineage

How to exploit data models for design of data integration approaches and in data lineage.

Top Down Requirements Capture

When is it appropriate, and what are the limitations?

Bottom Up Requirements Synthesis

When does this work, and where is it appropriate?
How do we cope with existing DBMSs and systems?

How to Capture Requirements for Both Data and Process Needs

What comes first, Data or Process? We’ll show the answer. The critical importance of understanding processes to get your data models right (and vice versa). Interaction between process and data models. Approaches for capturing Process AND Data Requirements.

Checking the Data vs the Metadata: Why Does It Matter?

Use of Standard Model Constructs and Pattern Models

Understanding the Bill of Materials (BOM) construct: where it can be applied, and why it’s one of the most powerful modelling constructs.
Party; Role; Relationship: Why mastering this construct can provide phenomenal flexibility.
Mastering Hierarchies: Different approaches for modelling hierarchies.
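As a rough sketch of the Bill of Materials construct mentioned above, the Python example below stores a product structure as (parent, component, quantity) rows, which is a common way to represent a recursive relationship, and "explodes" it into total leaf-component quantities. The sample data and the `explode` function are hypothetical, and the sketch assumes the structure contains no cycles.

```python
from collections import defaultdict

# (parent, component, quantity per parent): hypothetical sample data.
BOM = [
    ("bike", "wheel", 2),
    ("bike", "frame", 1),
    ("wheel", "spoke", 32),
    ("wheel", "rim", 1),
]


def explode(item, bom, qty=1, totals=None):
    """Accumulate total quantities of leaf components needed for `item`.

    Assumes the structure is acyclic; a cycle would recurse forever.
    """
    if totals is None:
        totals = defaultdict(int)
    children = [(c, q) for p, c, q in bom if p == item]
    if not children:
        totals[item] += qty  # a leaf: no further breakdown
    for component, per_parent in children:
        explode(component, bom, qty * per_parent, totals)
    return totals


parts = dict(explode("bike", BOM))
print(parts)
```

The same two-column self-referencing shape can represent organisation charts, account roll-ups and other hierarchies, which is one reason the construct is so widely reusable.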

Different Data Modelling Notations & a Comparison Between Them

Normalisation: Progressing beyond 3NF to Boyce-Codd Normal Form (BCNF), 4NF and 5NF, and why and when to use them.

Objectives

At the end of the course, delegates will have gained the following:

Practical Application:

Build conceptual and logical data models, and know about compromises for physical design;

Discover requirements for robust data models;
Understand where abstraction is valuable (and where it is risky);
Identify where industry data models can provide a kick start;
Know how (and where) to apply standard solutions to well-known data modelling business scenarios.

Level Set Understanding & Terminology:

Learn about the need for and application of Data Models in Big Data and NoSQL environments

See the areas where Data modelling adds value to Data Management activities beyond Relational Database design
Understand the critical role of Data models in other Data Management disciplines, particularly Master Data Management and Data Governance.

Pragmatic Learning

Learn the best practices for developing Data models for Big Data and NoSQL environments

Understand how to create data models that can be easily read by humans
Recognise the difference between Enterprise, Conceptual, Logical, Physical and Dimensional Data models
Through practical examples, learn how to apply different Data modelling techniques

Who Is It For?

Practitioners who will need to read, consume or create data models, particularly for Big Data and non-RDBMS environments. Users who wish to gain a better understanding of data during Information Management initiatives including:

Business Intelligence & Data Warehouse developers & architects
Data Modellers
Developers
Data Architects
Data Analysts
Enterprise Architects
Solution Architects
Application Architects
Information Architects
Business Analysts
Database Administrators
Project / Programme Managers
IT Consultants
Data Governance Managers
Data Quality Managers
Information Quality Practitioners

Speaker

Chris Bradley

Information Management Strategist, Evangelist & Speaker

Data Management Advisors Ltd

Christopher Bradley has spent 39 years at the forefront of the Information Management field, working for international organisations in Information Management Strategy, Data Governance, Data Quality, Information Assurance, Master Data Management, Metadata Management, Data Warehouse and Business Intelligence. Chris is an Information Strategist and a recognised thought leader. He advises clients including Alinma Bank, American Express, ANZ, British Gas, Bank of England, BP, Celgene, Cigna Insurance, EDP, Emirates NBD, Enterprise Oil, ExxonMobil, GSK, HSBC, NAB, National Grid, Riyad Bank, SABB, SAMA, Saudi NIC, Saudi Aramco, Shell, Statoil, and TOTAL.

He is VP of Professional Development for DAMA-International, the inaugural Fellow of DAMA CDMP, and a past president of DAMA UK. He is an author of the DMBoK 2 and an author and examiner for professional certifications. In 2016 Chris received the lifetime achievement award from DAMA International for exceptional services to furthering Data Management education and to the international Data Management community.

Chris guides global organisations on Information Strategy, Data Governance, Information Management best practice and how organisations can genuinely manage Information as a critical corporate asset. He is frequently engaged to evangelise the Information Management and Data Governance message to Executive management, to introduce data governance and new business processes for Information Management, and to deliver training and mentoring. Chris is Director of the E&P standards committee “DMBoard”, sits on several international Data Standards committees, and teaches several Master’s degree university classes internationally. He authored “Data Modelling for the Business”, is a primary author of DMBoK 2.0, a member of the Meta Data Professionals Organisation (MPO), a holder of the CDMP at “Fellow” level and an examiner for several professional certifications.

Chris is an acknowledged thought leader in Data Governance, author of several papers and books, and an expert judge on the annual Data Governance best practice awards.