Cask Tracker is a CDAP extension that tracks data ingested through either Cask Hydrator or a custom CDAP application, and provides input to the data governance process on a cluster. It includes the following data about "the data":
- Metadata: tags, properties, and schema for CDAP datasets and programs (both system and user)
- Data Quality: metadata that includes feed-level and field-level quality metrics of datasets
- Data Usage Statistics: usage statistics of datasets and programs
The implementation has been broken down into sub-tasks, as shown below: