Data Lineage - How detailed are your data lineage views. Do you include staging tables and views? We are trying to determine how detailed we should get with our lineage. Should we skip staging and show only source application? Should we show aggregated views as a part of the lineage?
I'm the only one in my organization working on reporting data ETL and modeling, so take this with a grain of salt.
If you have the overall structure / flow of data documented, a good business glossary, and good naming practices I think you can avoid documenting the field level lineage. So I document that data is copied into the "staging" schema, then prepared in the "prep" schema and fact tables are combined in the "report" schema. Anyone with knowledge of the business could look at the table & view names and figure out what is going on really quickly.
Your other option is a linage tool that does everything automatically.
I take data lineage to the data element level and include all hops the data takes. However you decide to do it, you have to think about the reason you're doing it. This is not an academic exercise. You're doing it so that if you have a problem, you can easily trace back to find out where the problem occurred.
Thanks for your input. In this case we are not looking to solve an issue with the report but to help the report user verify the source and possible enhancements.
Thanks. We are trying to show the report user the source of the data for validation and also possible enhancement possibilities.