DOCC Lab Reading Group

Sieve: Attention-based Sampling of End-to-End Trace Data in Distributed Microservice Systems

Research Question: How can we effectively reduce the volume of trace data in distributed microservice systems while preserving the quality of information necessary for monitoring, diagnosing, and analyzing system behavior?

Key Contributions: Sieve introduces an attention-based sampling mechanism that prioritizes and samples trace data by focusing on significant traces likely to provide valuable insights. It implements a dynamic sampling probability for each trace, determined by analyzing various features extracted from the trace data to assess its informativeness. This approach allows Sieve to significantly reduce the volume of trace data while maintaining the essential information necessary for effective system monitoring and analysis.

We identified several areas where further clarifications would be helpful, highlighting potential weaknesses in their approach. Firstly, we questioned the timing of Sieve’s sampling rate changes. Given that traces cannot be processed until their collection is complete, the method appears tail-based rather than real-time, raising concerns about its classification as an “online sampler.” Secondly, we noted that Sieve employs a sliding window function for sampling rate adjustments, which means it may take time to reduce the sampling rate after encountering an uncommon trace. This delay could result in an excessive amount of data being sampled immediately afterward. Additionally, the paper frequently mentions API evaluations but lacks sufficient description, leading us to infer that sampling rate increases might be limited to specific APIs. Lastly, the evaluation section refers to API groups without detailed specifications, leaving us unclear about their definitions and implications. Addressing these points would enhance understanding of Sieve’s practical applications and limitations.

Opportunities for future work: Given uncleared questions above and quality of paper, I’ll leave direction for future work empty.

Presenter: Zhaoqi Zhang