We're using Informatica's DQ tool but are researching an additional UI/dashboarding interface to make it easier for data users to understand what's going on.
Do you tend to focus on key metrics, on the distribution of values for a data element, or both? For example, if widget counts decline in a statistically significant manner, a steward is alerted; or, if value "X" for data element A begins to decline in volume (statistically significantly) relative to the other values for data element A, an alert is raised.
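The second kind of alert described above can be sketched with a two-proportion z-test: compare value X's share of data element A between two periods and alert when the decline is statistically significant. This is a minimal illustration, not any particular tool's implementation; the function name, counts, and 1.96 threshold (95% confidence, one-tailed) are assumptions.

```python
from math import sqrt

def share_decline_alert(x_prev, n_prev, x_curr, n_curr, z_crit=1.96):
    """Alert if value X's share of data element A dropped significantly.

    x_prev/x_curr: counts of value X in the previous/current period.
    n_prev/n_curr: total counts for data element A in each period.
    """
    p_prev, p_curr = x_prev / n_prev, x_curr / n_curr
    # pooled proportion and standard error for a two-proportion z-test
    p_pool = (x_prev + x_curr) / (n_prev + n_curr)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_prev + 1 / n_curr))
    if se == 0:
        return False
    z = (p_prev - p_curr) / se  # positive z means the share declined
    return z > z_crit

# e.g. value X was 30% of 10,000 records last period, 25% of 10,000 this period
print(share_decline_alert(3000, 10000, 2500, 10000))  # True: a drop that size is significant
```

The same test applied to the overall widget count (with totals instead of shares) covers the first kind of alert.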
We are using IBM InfoSphere Information Server products--Information Analyzer and Information Governance Catalog. There are around a couple hundred built in data rules you can use or modify. You can also build your own. We are really just getting started with Data Quality and I have experimented with creating custom data rules. We plan to start by taking a baseline of several systems using these tools. The newest release has some dashboards built in as well, such as for monitoring governance and curation. I am really excited about using these tools.
I tried IDQ in the past as well. What I found (in the old version, anyway) was that I couldn't do complex comparisons -- things like flagging loans with status 'Active' where account balance = 0.
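The cross-field rule mentioned above is simple to express in code, which is why its absence from the tool was frustrating. A minimal sketch, with an assumed record layout:

```python
# Flag records where loan status is 'Active' but the account balance is zero.
# The field names and sample data are illustrative only.
loans = [
    {"loan_id": 1, "status": "Active", "balance": 2500.00},
    {"loan_id": 2, "status": "Active", "balance": 0.00},   # should be flagged
    {"loan_id": 3, "status": "Closed", "balance": 0.00},   # fine: closed loans can be zero
]

flagged = [r for r in loans if r["status"] == "Active" and r["balance"] == 0]
print([r["loan_id"] for r in flagged])  # [2]
```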
I ended up working with the business to develop the list of data quality rules and then coded those rules in Informatica mapping(s). If a DQ rule failed, I inserted the result into a fact table that had a dashboard sitting on top of it. Now, for better or worse, everyone could see the data quality issues in the data. What was amazing was how quickly those issues were cleaned up once they became publicly visible!
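The pattern above (business-defined rules evaluated per record, with every failure landing as a row in a fact table that a dashboard reads) can be sketched as follows. This is a hedged illustration of the approach, not the actual Informatica mappings; the rule names, fields, and fact-row shape are all assumptions.

```python
from datetime import date

# Each rule is a predicate that returns True when the record passes.
rules = {
    "active_loan_zero_balance": lambda r: not (r["status"] == "Active" and r["balance"] == 0),
    "missing_customer_id": lambda r: r.get("customer_id") is not None,
}

def run_rules(records):
    """Apply every rule to every record; emit one fact row per failure."""
    fact_rows = []
    for r in records:
        for rule_name, passes in rules.items():
            if not passes(r):
                fact_rows.append({
                    "run_date": date.today().isoformat(),
                    "rule": rule_name,
                    "record_key": r["loan_id"],
                })
    return fact_rows

# Hypothetical sample records
sample = [
    {"loan_id": 1, "status": "Active", "balance": 0, "customer_id": None},  # fails both rules
    {"loan_id": 2, "status": "Closed", "balance": 0, "customer_id": "C42"},  # passes both
]
failures = run_rules(sample)
```

In the real pipeline the fact rows would be inserted into the warehouse table the dashboard sits on, rather than returned as a list.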
We used IDQ primarily to identify records with values out of range or with entries not found in reference tables. Ditto on the limitations around complex rule-building. It's also very IT-centric, and we want to engage the business in the DQ process -- hence the desire for a solid dashboarding/reporting tool.
Shailesh B Nimbalkar, Data Architect
We use the tool "Truesight" to monitor DQ-related issues. The tool has the capacity to run desired queries against source and target databases and raise an alarm if values differ from the expected output. All critical applications are monitored using this tool, so DQ issues in those applications are known to IT beforehand.
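The monitoring pattern described above boils down to running the same aggregate query against source and target and alarming on any difference. Here is an illustrative sketch of that idea, with in-memory SQLite standing in for the real databases; it is not Truesight configuration, and the table and query are invented for the demo.

```python
import sqlite3

def reconcile(source_conn, target_conn, query):
    """Run the same single-value query on both sides; alarm if results differ."""
    src = source_conn.execute(query).fetchone()[0]
    tgt = target_conn.execute(query).fetchone()[0]
    if src != tgt:
        print(f"ALARM: source={src}, target={tgt} for: {query}")
    return src == tgt

# Demo: a target that lost one row during loading
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
for conn in (source, target):
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
target.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0)])

ok = reconcile(source, target, "SELECT COUNT(*) FROM orders")  # prints an ALARM line
```

A scheduler would run checks like this per critical application so the team sees the alarm before users do.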
We used ETL to populate a data mart from different data sources to showcase data quality issues, from summary level all the way down to detail level, so stakeholders and the business can understand the issues and leverage this data for cleansing.
Coming up with the data mart and building the rules was a bit time-consuming, but at the end of the day it is helping the business resolve data issues and establish governance across the organisation.