Robert S Seiner
The answer depends on what you are cataloging. Here is a set of bullets I used in my webinar today.
First, identify the information that you will catalog …
- Data Inventory – Metadata specific to data resources
- Data Ownership – Metadata specific to owners and SMEs
- Data Classification – Metadata associated with protection
- Reports – Metadata about available reports and logistics
- Critical Data Resources – Specific to each organization
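As a rough illustration of the categories above, here is a minimal Python sketch of what a single catalog record might hold. The class name `CatalogEntry`, its field names, the classification labels, and the sample values are all hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CatalogEntry:
    # Data Inventory: what the resource is and where it lives
    name: str
    location: str
    # Data Ownership: accountable owner and subject-matter experts
    owner: str
    smes: List[str] = field(default_factory=list)
    # Data Classification: protection level (the labels are made up)
    classification: str = "internal"
    # Reports: available reports that draw on this resource
    reports: List[str] = field(default_factory=list)
    # Critical Data Resources: a flag each organization defines for itself
    critical: bool = False


# Hypothetical example record
entry = CatalogEntry(
    name="customer_master",
    location="warehouse.sales.customer",
    owner="Sales Operations",
    smes=["J. Doe"],
    classification="confidential",
    reports=["Monthly Churn Report"],
    critical=True,
)
print(entry)
```

A real tool would carry far more attributes per category, but even a record this small shows why deciding what to catalog comes before choosing the tool.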
Also think about the functionality that you want from the tool …
- Metamodels and Software Releases
- Self-Defined Loads
- Role Representation
- Process Integration
- Change Control and Versioning
- End User Requirements
- Training and Education
- Resource Requirements
This is just a starter list. I hope it helps. Anybody else have anything to add?
Robert S. Seiner
@RSeiner @TDAN_com
First and foremost, you will need flexibility, because you will be cataloging more than just data. You will need to capture where it came from, how it is produced, who uses it, what requires it, and so on. There are a lot of "nouns" or dimensions involved.
Second, you will want this approach to be collaborative. You will need input from the business, so the vendor will need to provide a simple-to-use interface that allows non-technical users to be part of the conversation.
Data modeling is included in this. You do not want to have to train the business on how to model information, so this component must be drop-dead simple.
Finally, I have found a number of organizations that have tried to catalog information using spreadsheets. The challenge they run into is reporting and analytics. Once you collect all of this data, you will want to analyze it. The multi-dimensional data involved is not easily maintained in rows and columns. You will need a more robust solution, one that provides easy reporting and built-in analytics.
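To make the rows-and-columns point concrete, here is a small Python sketch that stores catalog relationships as (subject, relationship, object) triples and answers a question a flat sheet handles poorly: what is affected if one asset changes? The asset names, relationship labels, and function names are all hypothetical, and this is only one simple way to model it.

```python
from collections import defaultdict

# (subject, relationship, object) triples; every name here is hypothetical
relationships = [
    ("customer_master", "sourced_from", "crm_system"),
    ("customer_master", "produced_by", "nightly_etl_job"),
    ("churn_report", "requires", "customer_master"),
    ("marketing_team", "uses", "churn_report"),
    ("finance_team", "uses", "customer_master"),
]


def build_index(triples):
    """Index triples by object so 'what points at X?' is a single lookup."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[obj].append((relation, subject))
    return index


def impacted_by(asset, triples):
    """Return everything that directly or indirectly requires or uses the asset."""
    index = build_index(triples)
    seen, stack = set(), [asset]
    while stack:
        current = stack.pop()
        for relation, subject in index[current]:
            if relation in ("requires", "uses") and subject not in seen:
                seen.add(subject)
                stack.append(subject)
    return sorted(seen)


print(impacted_by("customer_master", relationships))
```

The traversal follows "requires" and "uses" links across several hops, which is the kind of question that becomes awkward once the same information is spread over spreadsheet tabs.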
Send me a note if you are interested in understanding more.
Ray Diaz, CBIP, CDP, CSM, ICP-ATF
The critical path is for the data catalog to be able to connect to the majority of your data sources and tools so that metadata ingestion is automated. There is no way to manage the catalog if it is a mostly manual process.
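As a small illustration of automated ingestion, the Python sketch below reads table and column metadata straight from a database instead of having someone type it in. It uses SQLite from the standard library as a stand-in source; the function name `harvest_metadata`, the sample tables, and the output layout are assumptions made for the example, and a commercial catalog would use its own connectors.

```python
import sqlite3


def harvest_metadata(conn):
    """Read table and column metadata from a SQLite connection."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info returns: cid, name, type, notnull, dflt_value, pk
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [
            {"column": col[1], "type": col[2], "primary_key": bool(col[5])}
            for col in columns
        ]
    return catalog


# Hypothetical source database, created in memory for the example
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

catalog = harvest_metadata(conn)
for table, columns in catalog.items():
    print(table, columns)
```

Run on a schedule against each source, a harvest like this keeps the technical metadata current, leaving people to add the business context that cannot be read from the system.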
Also, if the catalog includes AI/machine learning and crowdsourcing to help suggest links between metadata, that is an amazing help. In my view, the data catalog is the most important module for managing data assets, and their interrelationships are extremely important.