Research Question(s): How do we define "end-to-end"? Does it conflict with reliability measures at lower levels? How much instrumentation is necessary at the edges and in between?
Key Contributions: 1) A high-level definition of end-to-end: a check at the endpoints confirms whether the program actually worked, and so guarantees that success has been achieved. 2) With better failure-recovery mechanisms, a system can tolerate more errors and reduce costs, because it does not have to be built perfect. 3) Adding checking and recovery measures at lower levels of the system is an engineering trade-off made for performance rather than correctness. The end-to-end argument is more a way of thinking abstractly: by paying attention to it, we can stop trying to build a perfect system or solution and focus instead on relatively inexpensive ways to achieve the same goal.
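A minimal sketch of contributions 1) and 2), assuming a transfer over an unreliable lower layer: the source endpoint computes a checksum, the destination endpoint verifies it, and recovery is a plain retry. The names unreliable_send and end_to_end_transfer, the corruption model, and the corrupt_prob and max_attempts parameters are all hypothetical and chosen only for illustration; they do not come from the paper.

```python
import hashlib
import random


def unreliable_send(data: bytes, rng: random.Random, corrupt_prob: float) -> bytes:
    """Hypothetical lower layer: sometimes flips one byte in transit."""
    if data and rng.random() < corrupt_prob:
        i = rng.randrange(len(data))
        corrupted = bytearray(data)
        corrupted[i] ^= 0xFF  # invert every bit of one byte
        return bytes(corrupted)
    return data


def end_to_end_transfer(data: bytes, rng: random.Random,
                        corrupt_prob: float = 0.3, max_attempts: int = 10):
    """Send data, verify at the destination, and retry on a mismatch.

    Returns (received_bytes, attempts_used). Correctness rests only on the
    endpoint check; the lower layer is allowed to be imperfect.
    """
    expected = hashlib.sha256(data).hexdigest()  # computed at the source endpoint
    for attempt in range(1, max_attempts + 1):
        received = unreliable_send(data, rng, corrupt_prob)
        # The end-to-end check: compare digests at the destination endpoint.
        if hashlib.sha256(received).hexdigest() == expected:
            return received, attempt
    raise RuntimeError("transfer failed after max_attempts")


if __name__ == "__main__":
    payload = b"end-to-end arguments in system design"
    received, attempts = end_to_end_transfer(payload, random.Random(0))
    print(received == payload, attempts)
```

In this sketch, lowering corrupt_prob (a more reliable lower layer) only reduces the number of retries. The result is correct either way because of the endpoint check, which is the performance-versus-correctness trade-off described in 3).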
Opportunities for future work: How do these long-standing ideas apply to the cloud today? How do cloud providers apply the end-to-end argument in their system designs? Does end-to-end conflict in some ways with observability?
Presenter: Mona Ma