  1. CDAP
  2. CDAP-12569

When metrics publishing fails, the error message should contain more information

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.3.0
    • Fix Version/s: 4.3.1
    • Component/s: Log, Metrics
    • Release Notes:
      Improved the error message logged when publishing metrics fails in the MetricsCollection service.

      Description

      Currently, we just log this:

      2017-08-24 04:06:12,177 - ERROR [MessagingMetricsCollectionService:s.n.w.p.h.HttpURLConnection$StreamingOutputStream@3501] - Failed in publishing metrics for timestamp 1503565572. 
      java.io.IOException: insufficient data written 
      at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.close(HttpURLConnection.java:3501) 
      at co.cask.common.http.HttpRequests.execute(HttpRequests.java:111) 
      at co.cask.cdap.common.internal.remote.RemoteClient.execute(RemoteClient.java:92) 
      at co.cask.cdap.messaging.client.ClientMessagingService.performWriteRequest(ClientMessagingService.java:251) 
      at co.cask.cdap.messaging.client.ClientMessagingService.publish(ClientMessagingService.java:182) 
      at co.cask.cdap.metrics.collect.MessagingMetricsCollectionService$TopicPayload.publish(MessagingMetricsCollectionService.java:134) 
      at co.cask.cdap.metrics.collect.MessagingMetricsCollectionService.publishMetric(MessagingMetricsCollectionService.java:102) 
      at co.cask.cdap.metrics.collect.MessagingMetricsCollectionService.publish(MessagingMetricsCollectionService.java:97) 
      at co.cask.cdap.metrics.collect.AggregatedMetricsCollectionService.publishMetrics(AggregatedMetricsCollectionService.java:133) 
      at co.cask.cdap.metrics.collect.AggregatedMetricsCollectionService.run(AggregatedMetricsCollectionService.java:117) 
      at com.google.common.util.concurrent.AbstractExecutionThreadService$1$1.run(AbstractExecutionThreadService.java:52) 
      at java.lang.Thread.run(Thread.java:745) 
      

      This happens when the data being sent exceeds the request size limit. However, the message does not reveal how many bytes the service tried to send, how many metrics were included, or which metric context (program id, run id) they belonged to, among other details. Without that information it is very hard to find out why the payload was so large and by how much the limit needs to be increased.
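
      A minimal sketch of what a more informative message could look like, assuming the publisher knows the payload size and the per-metric contexts at the point of failure. The class `PublishFailureMessage`, its `describe` method, the topic name `metrics0`, and the context strings are all hypothetical and are not taken from the CDAP code base:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper (not part of CDAP): builds a detailed message for a failed metrics publish. */
public class PublishFailureMessage {

  /**
   * @param timestamp    timestamp of the metrics batch, in seconds
   * @param topic        name of the messaging topic being written to (assumed)
   * @param payloadBytes total encoded size of the request body
   * @param contexts     one context string per metric, for example program id and run id
   */
  public static String describe(long timestamp, String topic, long payloadBytes, List<String> contexts) {
    // De-duplicate the contexts while keeping first-seen order, so the log stays short.
    Set<String> distinct = new LinkedHashSet<>(contexts);
    return String.format(
        Locale.ROOT,
        "Failed to publish metrics for timestamp %d to topic %s: %d metrics, %d bytes, contexts %s",
        timestamp, topic, contexts.size(), payloadBytes, distinct);
  }

  public static void main(String[] args) {
    // Illustrative values only.
    List<String> contexts = Arrays.asList(
        "program=WordCount,run=run-1",
        "program=WordCount,run=run-1",
        "program=Purchase,run=run-2");
    System.out.println(describe(1503565572L, "metrics0", 1048576L, contexts));
  }
}
```

      With the byte count and metric count in the log line, the payload size can be compared directly against the configured request limit, and the context list points to the programs that emitted the metrics.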

            People

            • Assignee:
              shankar Shankar Selvam
              Reporter:
              andreas Andreas Neumann