Large Scale Distributed Systems

Objectives

  • Understand the importance of logical time and know how to choose the appropriate logical clock for each context.
  • Know the taxonomy of the main consistency models for distributed systems, the role that different orderings play in their definitions, and the limits imposed by the CAP theorem.
  • Describe different strategies for guaranteeing delivery order, in particular causal dissemination, and implement algorithms based on version vectors.
  • Know and be able to implement the most common conflict-free replicated data types (CRDTs), in both their operation-based and state-based forms.
  • Describe classical network types and the most important graph metrics for large-scale systems; explain distributed dissemination and aggregation algorithms and the creation of overlay networks.
  • Implement distributed lookup in large-scale decentralized systems using distributed hash tables (DHTs).
  • Understand the trade-offs inherent in geo-replicated systems, the notion of highly available transactions (HAT), the limit set by transactional causal consistency, and the main techniques and algorithms used to implement these systems.

Program

  • Time and logical clocks.
  • Consistency models.
  • Message ordering algorithms.
  • Operation-based and state-based conflict-free replicated data types (CRDTs).
  • Distributed dissemination and aggregation.
  • Algorithms for creating and maintaining overlay networks.
  • Distributed hash tables.
  • Geo-replicated databases.

Bibliography

  • Distributed Systems. George Coulouris, Jean Dollimore, Tim Kindberg and Gordon Blair. Fifth Edition, Addison-Wesley, 2011.
  • Distributed Algorithms. Nancy Lynch. Morgan Kaufmann Publishers, 1996.
  • Optimistic Replication. Yasushi Saito, Marc Shapiro. ACM Computing Surveys, 2005.
  • A selection of scientific papers.
