Data Abstraction for Data Professionals
Executive Summary
Abstraction involves breaking down a model into smaller parts for presentation to different audiences, with an emphasis on reducing noise and clutter to focus on essential elements. There are different types of abstraction, including horizontal and vertical abstraction.
Data modelling starts with a high-level diagram of the whole "house" (the data estate or data landscape) and then breaks it down into conceptual, logical, and physical models. It also involves abstracting reality for presentation to users, hiding detail so that the model is easier to communicate. When building data models, it is crucial to understand the difference between horizontal and vertical abstraction. Horizontal abstraction partitions the diagram by subject area, while vertical abstraction involves drilling down to detail.
Webinar Details
Title: DATA ABSTRACTION FOR DATA PROFESSIONALS
Date: 21 July 2023
Presenter: Howard Diesel
Meetup Group: Data Professionals
Write-up Author: Howard Diesel
Contents
Abstraction for Vertical and Horizontal Data Modelling
Clarification on Horizontal and Vertical Abstraction for Presentation
Abstraction in Data Management
The Importance of Precise Definition and Simple Presentation in Writing
Abstraction in Data Modelling
Vertical and Horizontal Abstraction in Data Modelling
The Importance of Vertical and Horizontal Data Modelling in Enterprise Design
The Role and Process of Data Modelling
Importance of Clear Communication for Business Analysis
Levels of Detail in Architectural Modelling
The Importance and Use of Abstraction in Data Modelling
The Importance of Operational and Specification Models in Enterprise Architecture
Importance of Data Architecture in Data Management
Data Modelling Levels
Understanding the Process of Data Modelling
The Process of Data Modelling and Abstraction in Database Design
Importance of Spatial Presentation in Data Modelling
Defining Business Practices and Knowledge Management in an International Organization
Concept Mapping and Graph Databases for Data Modelling
Creating Subject Areas in the Data Landscape
Communicating Concepts to Businesspeople
Communication relies heavily on a shared understanding of essential concepts. The Zachman ontology helps here by specifying and breaking down these concepts according to the audience. The process begins with the subject area model at the executive level, then moves to conceptual models at the management level, logical models at the architecture level, and finally physical models at the engineering level. Providing the appropriate level of specification and abstraction gradually is crucial, because it avoids the Waterfall approach of defining the entire system upfront.
Understanding also improves when different presentation formats are considered and when presentation and abstraction are clearly defined. Horizontal abstraction can be used to partition models into smaller subparts, so that a data model does not overwhelm its audience with excessive detail. Breaking models down into manageable sections significantly improves comprehension.
Abstraction for Vertical and Horizontal Data Modelling
Abstraction is key to understanding complex ideas. It is a general concept, used in many contexts, for simplifying information so that it is easier to grasp. There are different types of abstraction. Vertical abstraction moves between higher-level entities and relationships and more detailed information. Reverse engineering is one example: it starts from the database and builds the logical and conceptual models. The middle-out approach is another: it starts from the conceptual model and subject areas and works from high-level to detailed information. Abstraction for presentation simplifies concepts, lines, and information for effective communication with the business. There are also abstractions for specification, common attributes, and subtyping. Ultimately, the role of abstraction is to help people understand complex ideas by focusing on essential elements and removing irrelevant details.
Clarification on Horizontal and Vertical Abstraction for Presentation
The levels of abstraction discussed span the whole specification: the executive, management, architect, and engineer views of the data models. The discussion focuses on these levels of a data model and the role each plays in communication.
JG raises a question about the connection between a data estate and a subject area model, and about the point at which horizontal abstraction transitions into vertical abstraction.
Abstraction in Data Management
An organisation's data estate encompasses a collection of crucial business entities. Within this estate, data areas assist in constructing a data strategy and identifying touch points and data ownership. Metadata is arranged according to subject areas, which are also utilised in master data domains. These subject areas are employed to assign ownership and guarantee security in data management. Abstraction in this context does not pertain to shared characteristics or connections but rather to presenting information to the appropriate audience. Three types of abstractions are utilised: specification, presentation, and commonality.
The Importance of Precise Definition and Simple Presentation in Writing
When discussing a conceptual model, we must be clear in our definitions and presentations to ensure readers understand and agree. Understanding the meaning of a concept is crucial in this process.
While some may consider concepts to be critical business entities, they differ from critical business elements and critical data elements. Concepts are refined notions developed through extensive analysis and discussion, while ideas are rough mental constructs.
Building accurate business definitions is essential before beginning to develop conceptual models. Ideas are typically individual efforts, while concepts require agreement from a group, particularly business departments.
Abstraction in Data Modelling
It's important to avoid relying solely on personal ideas or external models when developing a data model, as this can hinder agreement within a group. In business communication, everyone must recognise and agree on basic concepts.
Abstracting too far when creating a master data model should also be avoided, as an overly generic model can cause problems later. Some abstract patterns are still necessary, however, to handle multiple entities and their information effectively.
Conceptual models in data architecture describe and evaluate objects or actions regarding business transactions or events. They can use abstract concepts to explain theories and relationships.
Conceptual models can be broken down into subject areas, which logical and physical models for each application then follow. There are two approaches to abstraction in data modelling: horizontal abstraction, which involves defining each subject area one by one, and vertical abstraction, which involves drilling down from conceptual to logical and physical models.
The horizontal abstraction approach, or middle-out abstraction, involves defining and evaluating subject areas horizontally for dependencies.
Vertical and Horizontal Abstraction in Data Modelling
When analysing a database's data model, there are two types of abstractions: vertical and horizontal. Vertical abstraction entails delving into the details, including reverse engineering. On the other hand, horizontal abstraction requires grouping data into distinct subject areas, such as product design, commercial offerings, or sales. These subjects can then be examined more closely for more detailed modelling. To create a commercial offering, defining both product design and sales concepts is necessary. The conceptual model is established by integrating the concepts from each subject area. Additionally, horizontal partitioning can be leveraged to focus on specific parts of the diagram for deeper analysis and comprehension.
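Horizontal partitioning as described above can be illustrated with a small sketch. The concept names, relationships, and function names below are hypothetical and exist only for illustration; the subject areas are the ones mentioned in the text. The sketch groups concepts by subject area, extracts a focused view for one area, and lists which areas depend on which.

```python
from collections import defaultdict

# Hypothetical concepts, each tagged with the subject area it belongs to.
CONCEPTS = {
    "Product": "Product Design",
    "Component": "Product Design",
    "Offering": "Commercial Offerings",
    "Price List": "Commercial Offerings",
    "Order": "Sales",
    "Customer": "Sales",
}

# Hypothetical relationships between concepts: (source, verb, target).
RELATIONSHIPS = [
    ("Product", "is made of", "Component"),
    ("Offering", "packages", "Product"),
    ("Offering", "is priced by", "Price List"),
    ("Offering", "is sold to", "Customer"),
    ("Order", "is placed by", "Customer"),
]


def partition_by_subject_area(concepts):
    """Group concept names under their subject area."""
    areas = defaultdict(list)
    for concept, area in concepts.items():
        areas[area].append(concept)
    return {area: sorted(names) for area, names in areas.items()}


def view_for(area, concepts, relationships):
    """Return the concepts of one area and the relationships internal to it."""
    members = {c for c, a in concepts.items() if a == area}
    internal = [r for r in relationships if r[0] in members and r[2] in members]
    return sorted(members), internal


def cross_area_dependencies(concepts, relationships):
    """List (from_area, to_area) pairs for relationships that cross areas."""
    deps = set()
    for source, _, target in relationships:
        from_area, to_area = concepts[source], concepts[target]
        if from_area != to_area:
            deps.add((from_area, to_area))
    return sorted(deps)
```

Calling cross_area_dependencies(CONCEPTS, RELATIONSHIPS) on this sample data shows Commercial Offerings depending on both Product Design and Sales, which mirrors the point that both must be defined before a commercial offering can be modelled.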
The Importance of Vertical and Horizontal Data Modelling in Enterprise Design
Regarding data modelling, there are two main approaches: vertical and horizontal. Vertical data modelling involves focusing on a specific subject area or data model, while horizontal data modelling looks at the bigger picture of an Enterprise data warehouse.
Global design is a key aspect of data modelling, as it involves creating horizontal abstractions to understand all the elements of the data estate. Meanwhile, local implementation occurs when a specific area, such as product design, is chosen for vertical implementation.
During vertical implementation, conceptual, logical, and physical models are built for one subject area at a time, working from the broadest conceptual level to the most specific physical level. The ultimate goal of this process is to carry the data model through to the transactional systems, so that they are built on solid data models.
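As a rough sketch of vertical drill-down, the same entity can be rendered at three levels of detail. The Entity and Attribute classes, the describe function, and the sample Product entity are all assumptions made for illustration and are not taken from the webinar.

```python
from dataclasses import dataclass

LEVELS = ("conceptual", "logical", "physical")


@dataclass(frozen=True)
class Attribute:
    name: str           # business-facing attribute name
    logical_type: str   # technology-neutral type
    physical_type: str  # column type for one assumed target database


@dataclass(frozen=True)
class Entity:
    name: str
    definition: str
    attributes: tuple


def describe(entity, level):
    """Render an entity, revealing more detail at each lower level."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    lines = [f"{entity.name}: {entity.definition}"]
    if level == "conceptual":
        return lines  # name and definition only, attributes hidden
    for attr in entity.attributes:
        if level == "logical":
            lines.append(f"  {attr.name} ({attr.logical_type})")
        else:
            column = attr.name.lower().replace(" ", "_")
            lines.append(f"  {column} {attr.physical_type}")
    return lines


# Hypothetical entity from the product design subject area.
PRODUCT = Entity(
    name="Product",
    definition="A designed item that can be offered for sale.",
    attributes=(
        Attribute("Product Number", "identifier", "INTEGER"),
        Attribute("Product Name", "text", "VARCHAR(100)"),
    ),
)
```

The conceptual rendering hides attributes entirely, the logical rendering adds them with neutral types, and the physical rendering switches to column names and database types.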
It's worth noting that while conceptual, logical, and physical models are typically at an enterprise level, there are also lower-level models specific to individual applications or projects.
The Role and Process of Data Modelling
When it comes to data management, there are two important roles to consider: the data architect and the data modeller. While the data architect builds the Enterprise model, the data modeller focuses on creating a model for a specific application or data product.
Although the data modeller can also build a conceptual model, it's not always necessary if a good Enterprise data model is already in place. Regardless, the data modeller is responsible for constructing the logical and physical models for the application or data product.
At every level of the models, it's vital to apply Steve Hoberman's Data Model Scorecard to check accuracy and effectiveness. Communication is also critical, especially when discussing scope and requirements at different levels.
Building a conceptual model involves highlighting concepts from business requirements in a Word document. However, it's crucial to ensure that any added concepts in the conceptual model fall within the scope of the requirements.
Importance of Clear Communication for Business Analysis
Clear communication is essential in business. It's important to provide a precise definition and layout of business models. Additionally, checking if the proposed model is understood and comfortable for businesspeople is crucial. Horizontal and vertical drill-down methods are helpful for deeper analysis. It's also useful to visually represent client requirements during the process. Examples can be used to illustrate the steps of the analysis process. Lastly, it's important to ensure agreement at a high level before diving into detailed analysis.
Levels of Detail in Architectural Modelling
The architect develops a conceptual model to showcase the design's overall look and feel. This model is then transformed into a logical model that includes details like room layouts and door placements. The logical model is then turned into a physical model that guides the electrician in implementing the lighting aspects. This progression ensures a thorough understanding and execution of the design.
If a technical implementation misses certain concepts, it may need to be reverse engineered to ensure alignment with the initial requirements. Depending on the requirements, various databases can support different physical implementations, like JSON, relational, or graph.
In this context, abstraction refers to presenting data at different levels of resolution rather than achieving generality or commonality.
The Importance and Use of Abstraction in Data Modelling
It's essential to understand the context in which abstraction is used to avoid confusion. Different models are used in data modelling to present to and communicate with the audience effectively. In the data modelling exam, questions about supertypes and subtypes, generalisation, and horizontal and vertical abstraction were especially important. Abstraction can be applied in areas beyond modelling. In specification, it involves creating a data sheet with specific information for a particular purpose. Vertical abstraction spans the high-level subject area model and the conceptual, logical, and physical models. In data modelling, it's necessary to link the conceptual model to the business process model, logistics, and workflow.
The Importance of Operational and Specification Models in Enterprise Architecture
Understanding the annual operations model is crucial for your timing model and business plan to succeed. TOGAF defines various processes and deliverables for enterprise architecture. The Zachman framework emphasises the core elements of enterprise architecture through its "periodic table" of models, covering different areas and processes. The deliverables should align with the vertical organisation's needs and specification model. The system logic involves inventory, process, and organisational representations, which work together. Different levels of detail are presented to ensure understanding, and irrelevant information is removed. The specification element confirms the client's needs. Slicing the model vertically or horizontally provides the necessary detail for implementation.
Importance of Data Architecture in Data Management
Effective data management in an organisation requires a well-designed data architecture. This means representing the data at different levels of abstraction to handle large amounts of data.
One way to ensure responsibility allocation, such as data ownership and stewardship, is through using the subject area model. This model provides a comprehensive view of the core business concepts without gaps or overlaps.
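The "no gaps or overlaps" property can be checked mechanically. The sketch below assumes that subject areas are recorded as simple lists of concept names; the check_coverage function and its inputs are hypothetical.

```python
def check_coverage(concepts, subject_areas):
    """Find concepts in no subject area (gaps) or in several (overlaps)."""
    assigned = {}
    for area, members in subject_areas.items():
        for concept in members:
            assigned.setdefault(concept, []).append(area)
    gaps = sorted(c for c in concepts if c not in assigned)
    overlaps = {c: sorted(areas) for c, areas in assigned.items() if len(areas) > 1}
    return gaps, overlaps


# Hypothetical sample: "Supplier" is unassigned, "Order" is claimed twice.
concepts = ["Customer", "Order", "Product", "Supplier"]
subject_areas = {
    "Sales": ["Customer", "Order"],
    "Product Design": ["Product", "Order"],
}
gaps, overlaps = check_coverage(concepts, subject_areas)
```

An empty gaps list and an empty overlaps dictionary would indicate that every concept has exactly one owning subject area, which is what makes ownership and stewardship assignable.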
When building an Enterprise data warehouse, balancing a global perspective with local implementation is essential.
Executives may have difficulty with the term "subject area model", so alternative terminology can be used: in this context "subject area model", "data estate", and "data landscape" are interchangeable. A high-level data diagram can also help visualise the data hierarchy, but confusion may arise between the terms "conceptual" and "logical".
Data Modelling Levels
In this presentation, we are given a high-level view of a house through various data models. The first is a subject area model or data estate, which provides an overview of the entire house. A simplified high-level data model and basic definitions are presented, showing the house, porch, and bedroom. As the discussion progresses, a more detailed conceptual data model is introduced, which includes specific business rules and definitions.
A logical data model is presented as a floor plan, with attributes like doors and wash basins included. Finally, a physical data model represented by a wiring diagram is introduced, showing information reduction at the appropriate level.
Understanding the Process of Data Modelling
As we move towards higher levels of abstraction, the discussion centres on presentation and the important elements. As we move down the levels, we gather more detail while staying on the same vertical line of abstraction.
A concept is essentially an idea or notion that has been developed collectively. Conceptualisation involves defining relationships and abstraction.
Logical data models are designed to adhere to logic and sound reasoning principles. Models should have a clear rationale behind their design.
Specific details may be concealed during the presentation phase to communicate with users efficiently.
The data modelling process begins with comprehending the business scope and gathering user narratives. These narratives are converted into sentences or facts, emphasising nouns and verbs.
Business fact modelling entails modelling the facts about the business, while ER modelling focuses on primary keys, column names, attributes, normalisation, constraints, and foreign keys.
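The step from narratives to facts can be sketched as follows. This assumes the facts have already been written down by hand as (noun, verb, noun) triples; no natural-language parsing is attempted, and the sample facts and function names are hypothetical.

```python
# Hypothetical business facts captured as (noun, verb, noun) triples.
FACTS = [
    ("Customer", "places", "Order"),
    ("Order", "contains", "Product"),
    ("Supplier", "delivers", "Product"),
]


def as_sentence(fact):
    """Turn a triple back into a sentence the business can confirm."""
    subject, verb, obj = fact
    return f"{subject} {verb} {obj}."


def candidates(facts):
    """Nouns become entity candidates, verbs become relationship candidates."""
    nouns = sorted({noun for subject, _, obj in facts for noun in (subject, obj)})
    verbs = sorted({verb for _, verb, _ in facts})
    return nouns, verbs


entities, relationships = candidates(FACTS)
```

Keeping each fact readable as a sentence lets the business validate it before any keys, attributes, or constraints are added in the ER model.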
Relational modelling is just one aspect of the overall data modelling process.
The Process of Data Modelling and Abstraction in Database Design
Our process starts with gathering information from the business, including facts, nouns, constraints, and types. We then create a relational model by normalising, resolving keys, and defining attributes. The following steps involve creating a business data model and multiple conceptual views for approval by the business. We also provide developers and DBAs with SQL code, including data definition language (DDL), for internal use. We use conceptual, logical, and physical modelling to ensure clear communication. Finally, we select appropriate views for the business based on their specific needs.
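A minimal sketch of handing DDL to developers and DBAs is shown below. It assumes a logical model held as a dictionary of attribute names and neutral types. The TYPE_MAP, the to_ddl function, and the product table are hypothetical, and SQLite (via Python's standard sqlite3 module) is used only as a convenient stand-in for the target database.

```python
import sqlite3

# Assumed mapping from neutral logical types to SQLite column types.
TYPE_MAP = {"identifier": "INTEGER", "text": "TEXT", "date": "TEXT", "amount": "REAL"}


def to_ddl(table, columns, primary_key):
    """Generate a CREATE TABLE statement from a simple logical description."""
    lines = [f"    {name} {TYPE_MAP[logical_type]}" for name, logical_type in columns.items()]
    lines.append(f"    PRIMARY KEY ({primary_key})")
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);"


# Hypothetical logical entity.
ddl = to_ddl(
    "product",
    {"product_id": "identifier", "name": "text", "launch_date": "date"},
    primary_key="product_id",
)

# Run the generated DDL against an in-memory database to confirm it is valid.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
column_names = [row[1] for row in conn.execute("PRAGMA table_info(product)")]
conn.close()
```

Generating the DDL from the logical description, rather than writing it separately, keeps the physical model traceable to the model the business approved.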
Our ultimate goal is to represent the business requirements effectively and add value to business operations. The levels of modelling in database design are similar to the levels of spatial representation in architecture.
Importance of Spatial Presentation in Data Modelling
When creating data models, having a well-defined spatial layout is important, similar to how rooms are laid out in a house. However, it is even more crucial to establish clear relationships between the data models. The lower-level diagram models should be consistent and free of conflicts. Data moves horizontally and vertically in data modelling, so it is essential to have traceability from the data dictionary to the physical and logical models. To ensure easy readability, it is crucial to have clear diagramming when structuring logical models. Following specific rules, such as starting with essential points on the left-hand side and reading clockwise, can improve the layout and understanding of data models. To achieve diagramming clarity, Data Architects must define the practice of data modelling.
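Traceability from the data dictionary through the logical model to the physical model can be sketched with plain lookups. Every name below (the dictionary terms, the logical attributes, the physical columns, and the two functions) is hypothetical and chosen only to show one broken link of each kind.

```python
# Business terms defined in the data dictionary.
DATA_DICTIONARY = {"Customer Name", "Order Date", "Order Status"}

# Logical attribute -> the dictionary term it implements.
LOGICAL = {
    "Customer.name": "Customer Name",
    "Order.order_date": "Order Date",
    "Order.priority": "Order Priority",  # term missing from the dictionary
}

# Physical column -> the logical attribute it implements.
PHYSICAL = {
    "customer.cust_name": "Customer.name",
    "orders.order_dt": "Order.order_date",
}


def trace(term, logical, physical):
    """Follow one dictionary term down to its logical and physical forms."""
    chains = []
    for attribute, source_term in logical.items():
        if source_term == term:
            columns = sorted(c for c, a in physical.items() if a == attribute)
            chains.append((attribute, columns))
    return sorted(chains)


def broken_links(dictionary, logical, physical):
    """Report terms never modelled, attributes never defined or implemented."""
    unmodelled_terms = sorted(dictionary - set(logical.values()))
    undefined_attributes = sorted(a for a, t in logical.items() if t not in dictionary)
    unimplemented = sorted(set(logical) - set(physical.values()))
    return unmodelled_terms, undefined_attributes, unimplemented
```

On this sample data, broken_links reports "Order Status" as a term with no logical attribute and "Order.priority" as an attribute that has neither a dictionary definition nor a physical column.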
Defining Business Practices and Knowledge Management in an International Organization
Every business has to establish its approach to operations. Steve Hoberman proposes a method of outlining and communicating business practices. The emphasis is on clearly defining an organisation's models to ensure accurate interpretation. Using camera settings as an analogy, standards and focus can be established. Extending the analogy to layout, artists direct viewers' attention to the subject of interest in the same way. Categorising data states helps to systematise and present data efficiently. Visual tools can assist in simplifying complicated concepts, much like constructing a house.
Concept Mapping and Graph Databases for Data Modelling
Using Power BI in data flow development makes it possible to show visually how data sources affect reports. Tools such as concept mapping and graph databases simplify the process of comprehending and mapping data relationships. Concept mapping involves working dynamically with a concept map to visualise and understand business relationships. Graph databases highlight the nodes related to a particular concept and can be used to apply normalisation. Modelling with concept maps is an effective way to build data models and interact with data visually. It is crucial to keep evaluating both top-down and bottom-up approaches to data modelling to gain new insights and understand the relationships between elements.
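The idea of highlighting the nodes related to a concept can be sketched without a graph database. The example below uses a plain in-memory adjacency structure as a stand-in. It does not use the API of any graph database product, and the edges and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical concept map edges: (concept, relationship, concept).
EDGES = [
    ("Customer", "places", "Order"),
    ("Order", "contains", "Product"),
    ("Product", "is made of", "Component"),
]


def build_graph(edges):
    """Build an undirected adjacency map from concept map edges."""
    graph = defaultdict(set)
    for source, _, target in edges:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def related(graph, concept, depth=1):
    """Return concepts reachable from `concept` within `depth` hops."""
    seen = {concept}
    frontier = {concept}
    for _ in range(depth):
        frontier = {n for node in frontier for n in graph.get(node, ())} - seen
        seen |= frontier
    return sorted(seen - {concept})


graph = build_graph(EDGES)
neighbours = related(graph, "Order")
wider = related(graph, "Order", depth=2)
```

Increasing the depth widens the highlighted neighbourhood, which is one way to move between a focused view of a concept and its broader context.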
Creating Subject Areas in the Data Landscape
Mark Atkins has developed a tool that uses a graph database to link business definitions with terms, resulting in the creation of subject areas. Unlike Informatica, Collibra, and similar tools, Atkins' tool offers visualisation capabilities, which is a significant advantage. When generating new ideas for modelling data and concept maps, the approach taken by standardised data modelling tools is vastly different. Marco Wobben's fact modelling technique is centred around linking facts with entities and relationships to represent the data landscape visually. Grafana is another tool that is frequently mentioned in the context of visualising subject areas. Additionally, business capability models can be used to identify subject areas in the data landscape.