We have a pipeline with 2 sources and 2 transforms (one per source) feeding into a shared sink. Each source and transform stage also has an error collector, and these all flow into the same sink. We notice that the metrics show inconsistent counts for the number of records flowing out of the error collectors. For this particular test I am expecting 40 error records. Some runs show the correct count, while the next run on the same dataset shows an incorrect one (24 in this case).
The ProgramClient.getAllProgramRuns API returns the same records.out values as the counts reported above, so I am not sure whether the issue lies in the API or in the backend that collects the metrics.
If I check the underlying error dataset, I see 40 records. So, it's just the metric showing the incorrect count.
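To make the mismatch easy to spot across repeated runs, the comparison described above could be scripted roughly as follows. This is only a sketch: the function name and the run IDs are hypothetical, and it assumes the per-run records.out values and the dataset row count have already been fetched by some other means. The counts are the ones seen in this report.

```python
def find_metric_mismatches(metric_by_run, dataset_count):
    """Return {run_id: reported_count} for every run whose reported
    records.out value differs from the row count in the error dataset."""
    return {
        run_id: reported
        for run_id, reported in metric_by_run.items()
        if reported != dataset_count
    }


# Hypothetical run IDs; 40 and 24 are the counts observed in this report.
observed = {"run-1": 40, "run-2": 24}

# The underlying error dataset held 40 records in both cases.
mismatches = find_metric_mismatches(observed, dataset_count=40)
print(mismatches)
```

Any run listed in the result has a metric that disagrees with the dataset, which points at metric collection or reporting rather than at the pipeline's actual output.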