
A data dictionary is a collection of metadata such as object name, data type, size, classification, and relationships with other data assets. It acts as a reference guide to a dataset.

According to HBR, 80% of a data scientist’s valuable time is spent simply finding, cleaning, and organizing data, leaving only 20% to perform analysis. That’s where a repository describing all data assets - column descriptions, metrics, measurement units, and more - can help. That is the purpose of the data dictionary.

Think of it as a list of tables, fields, and columns, along with a description of each. The primary goal of a data dictionary is to help data teams understand data assets.

According to IBM’s Computer Terminology Dictionary, a data dictionary is a “centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format. It assists management, database administrators, system analysts, and application programmers in planning, controlling, and evaluating the collection, storage, and use of data.”

Meanwhile, DAMA UK (Data Management Association, UK chapter) defines a data dictionary as “software in which metadata is stored, manipulated, and defined.”

What is a data dictionary used for?

A data dictionary is used by data administrators, analysts, and engineers to understand and trust data assets. It helps in the creation of authentic, transparent, and consistent data throughout the organization.

Where do data dictionaries fit in your stack?

According to data governance coach Nicola Askham, you can have multiple data dictionaries, because each one holds details of a system hosting or holding data assets. So, each data source - a warehouse, lake, or lakehouse - will have its own data dictionary.

An enterprise data dictionary is a compilation of metadata such as object name, data type, size, classification, and relationships with other data assets. It can also include business metadata such as the definition, associated business terminology, and metrics.
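
To make the list of metadata fields above more concrete, here is a minimal Python sketch of what a single data dictionary entry might look like. It is only an illustration, not the format of any particular tool: the `ColumnEntry` class, the helper functions, and the table and column names (`orders`, `customers`) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ColumnEntry:
    """One data dictionary entry (hypothetical structure for illustration)."""
    object_name: str                  # fully qualified name, e.g. "orders.order_id"
    data_type: str                    # e.g. "INTEGER", "VARCHAR"
    size: Optional[int]               # length, where the type has one
    classification: str               # e.g. "internal", "PII"
    description: str = ""             # business definition, if documented
    related_to: List[str] = field(default_factory=list)  # linked data assets


def build_dictionary(entries: List[ColumnEntry]) -> Dict[str, ColumnEntry]:
    """Index entries by object name so they can be looked up directly."""
    return {entry.object_name: entry for entry in entries}


def undocumented(dictionary: Dict[str, ColumnEntry]) -> List[str]:
    """Return the names of entries that still lack a business definition."""
    return sorted(
        name for name, entry in dictionary.items()
        if not entry.description.strip()
    )


# Hypothetical sample data.
entries = [
    ColumnEntry("orders.order_id", "INTEGER", None, "internal",
                "Unique identifier of an order."),
    ColumnEntry("orders.customer_id", "INTEGER", None, "internal",
                related_to=["customers.customer_id"]),
    ColumnEntry("customers.email", "VARCHAR", 255, "PII",
                "Customer contact email address."),
]

catalog = build_dictionary(entries)
print(undocumented(catalog))  # prints ['orders.customer_id']
```

Keeping technical metadata (type, size) and business metadata (description, classification) in the same record is one simple way to let the same dictionary serve both engineers and analysts; a real deployment would more likely store these entries in a catalog tool or database table.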
