Wait Free Coordination Mechanism in Zookkeeper (PART-1)

Introduction

Learner101
11 min readJun 13, 2024

Large-scale distributed applications, which involve many computers working together, need various ways to coordinate their actions. Here are some basic forms of coordination:

  1. Configuration: This is like setting up the rules or settings that all the computers in the system will follow. It can be as simple as a list of settings or as complex as settings that change automatically based on different conditions.
  2. Group Membership and Leader Election: In these systems, it’s important for each computer to know which other computers are currently working and what tasks they are handling. Sometimes, one computer needs to be chosen as the leader to make decisions for the group.
  3. Locks: Locks are used to make sure that only one computer at a time can access certain important resources, preventing conflicts and ensuring smooth operation. This is similar to how a lock on a door ensures that only one person can enter a room at a time.

These coordination methods help ensure that all parts of a distributed system work together efficiently and without conflicts.

To coordinate the different tasks in a large system, you can create specific services for each type of coordination. Here’s a breakdown in simple terms:

  1. Different Services for Different Needs: Just like in a kitchen, you have different tools for different tasks (like a knife for chopping and a whisk for mixing), in a computer system, you have different services for different coordination tasks.
  • Queuing Service: For example, Amazon Simple Queue Service (SQS) is like a waiting line at a store. It helps manage tasks by putting them in order and processing them one by one.
  • Leader Election Service: This service helps decide which computer should be the leader, similar to how a group of friends might vote to choose a team captain.
  • Configuration Service: This service handles the setup rules and settings for the system, like how a recipe provides instructions for cooking.

2. Using Powerful Services for Multiple Tasks: Sometimes, a single, powerful tool can handle multiple tasks.

  • Locking Service: Chubby is an example of a locking service, which is like a key that ensures only one person can enter a room at a time. This strong control can be used for various coordination needs, such as:
  • Leader Election: Deciding who should be the leader (like choosing a team captain).
  • Group Membership: Keeping track of which computers are part of the system (like a class roster in school).
  • Other Coordination Tasks: Handling different needs by ensuring tasks are done in a controlled, organized way.

In summary, by using specialized services or a powerful one that can handle multiple coordination tasks, a system can ensure all its parts work together smoothly and efficiently.

How Zookeeper is Different

When designing Zookeeper coordination service, they chose to move away from implementing specific coordination tools (primitives) directly on the server. Instead, they decided to provide an API that allows application developers to create their own tools as needed. Here’s a breakdown of this approach:

Traditional Approach vs. New Approach

Traditional Approach:

  • Server-Side Primitives: In this approach, the server implements specific coordination tools like leader election, group membership, or locking.
  • Limitations: Developers are constrained to using only the tools provided by the server, which might not fit every application’s needs.

New Approach:

  • API for Developers: Instead of providing fixed tools, they offer an API. This API allows developers to build the specific coordination tools (primitives) their applications need.
  • Coordination Kernel: they designed a coordination kernel that supports the creation of new tools without needing to change the core service.
  • Flexibility: This allows for multiple forms of coordination tailored to the specific requirements of each application, providing greater flexibility.

Benefits of the New Approach

  1. Customization:
  • Developers can implement exactly the tools they need for their specific use cases, rather than adapting their applications to fit pre-defined tools.

2. Scalability:

  • The coordination kernel can support new tools and methods as needed without overhauling the core service, making it easier to scale and evolve.

3. Innovation:

  • By providing a flexible API, developers can innovate and create new coordination strategies that were not initially envisioned, fostering a more creative and adaptable development environment.

Let’s use a simple example to illustrate the benefits of our new approach:

Example Scenario: Organizing a Group Project

Imagine a group of students working on a group project. They need to coordinate their work, but each project might have different coordination needs. Here are two approaches to organizing their coordination:

Traditional Approach

Teacher-Defined Rules:

  • The teacher gives the group a set of predefined rules to follow. For example:
  • Rule 1: Only one student can work on the project document at a time (like a locking mechanism).
  • Rule 2: One student is chosen as the leader and makes all decisions (leader election).
  • Rule 3: Each student has to check in with the leader before starting any task (group membership).

Limitations:

  • These rules might not fit all projects. For instance, some projects might benefit from having multiple students work on different sections of the document simultaneously or rotating leadership roles.

New Approach

Flexible Guidelines:

  • Instead of fixed rules, the teacher provides a set of tools and guidelines:
  • Toolbox: Access to tools like document editing permissions, communication channels, and task assignment boards.
  • API (Guidelines): Instructions on how to use these tools to create their own coordination rules.

Coordination Kernel:

  • The group can use the provided tools to create their own coordination system that best fits their project needs. For example:
  • Custom Rule 1: Two students can work on different sections of the document at the same time, using a shared editing tool (customized locking mechanism).
  • Custom Rule 2: Leadership roles rotate every week to ensure everyone gets a chance to lead (custom leader election).
  • Custom Rule 3: Students update a shared task board to show what they’re working on, without needing to check in with a leader (custom group membership).

Avoid Traditional Blocking Primitives for Coordination

When designing the API for ZooKeeper, they decided to avoid using blocking tools like locks. Here’s why and how we approached it differently

Why Avoid Blocking Primitives (Locks)

  1. Performance Issues:
  • Slow or Faulty Clients: If one client (computer in the system) is slow or faulty, it can hold a lock and delay others, impacting overall performance.
  • Complexity: The system becomes more complicated if it has to wait for responses from other clients and handle failures.

Zookeeper Approach: Wait-Free Data Objects

Instead of using locks, ZooKeeper uses simple data objects that don’t require waiting (wait-free). These objects are organized like files and folders in a file system.

Example:

  • File System Organization: Imagine a shared drive where everyone can read and write files. ZooKeeper’s API works similarly, but without locks.

Key Features:

  1. Wait-Free:
  • Clients don’t wait for others to release locks. They read or write data directly, improving performance and fault tolerance.

2. Order Guarantees:

  • FIFO (First-In-First-Out) Ordering: Operations are processed in the order they are received from each client.
  • Linearizable Writes: When a client writes data, all clients see the write as happening at the same moment in time.

Why These Features Matter:

  • Performance: Wait-free operations avoid delays caused by locks.
  • Fault Tolerance: The system continues to work smoothly even if some clients fail.
  • Coordination: Despite being wait-free, the order guarantees ensure reliable coordination among clients.

Implementing Coordination Primitives:

ZooKeeper can still implement important coordination tools (primitives) using its API:

  • Leader Election: Clients can use ZooKeeper to create a node (file) that indicates the leader. The first client to create the node becomes the leader.
  • Group Membership: Clients can register themselves by creating nodes in a specific folder (like a class roster).
  • Consensus: Clients agree on decisions by reading and writing data in an ordered, wait-free manner.

Universal Object:

According to Herlihy’s hierarchy, ZooKeeper’s API can be used to implement any coordination primitive, making it a universal object.

Example Use Case: Online Auction System

Imagine we are designing an online auction system where users can place bids on items. We need to ensure that the system handles bids efficiently and fairly, without delays and with a consistent order.

Wait-Free Operations

In a traditional system using locks:

  • When a user places a bid, the system might lock the item to prevent other users from bidding at the same time.
  • If a user’s connection is slow, the lock might be held longer, causing delays for other bidders.

With ZooKeeper’s wait-free approach:

  • Users place bids by directly writing their bid data to a shared data structure (like a node in a file system).
  • No locks are used, so each user’s bid is processed immediately, without waiting for others.

Example:

  • User A places a bid at 10:00 AM.
  • User B places a bid at 10:00:01 AM, right after User A.
  • Both bids are processed immediately as they arrive, without waiting for locks to be released.

Order Guarantees

Even without locks, we need to ensure that bids are processed in the correct order and consistently seen by all users.

FIFO (First-In-First-Out) Ordering

  • FIFO Ordering ensures that if User A places a bid before User B, User A’s bid is processed first.
  • Example:
  • User A’s bid (placed at 10:00 AM) is recorded first.
  • User B’s bid (placed at 10:00:01 AM) is recorded next.
  • This order is maintained, ensuring fairness.

Linearizable Writes

  • Linearizable Writes ensure that when a bid is placed, all users see the bid as occurring at the exact moment it is written, with no delays or inconsistencies.
  • Example:
  • User A places a bid of $100.
  • At the same moment, User B and User C check the current highest bid.
  • Both User B and User C see User A’s $100 bid immediately, as if it happened instantaneously.

How ZooKeeper Ensures This

Using ZooKeeper:

  • Each bid is written to a node in a hierarchical structure.
  • ZooKeeper ensures FIFO ordering, so bids are recorded in the order they are received.
  • ZooKeeper ensures linearizable writes, so all users see the bids consistently and at the same moment in time.

Summary

In this online auction system:

  • Wait-Free Operations allow users to place bids without waiting for locks, improving performance and avoiding delays.
  • Order Guarantees (FIFO and Linearizable Writes) ensure that bids are processed in the correct order and consistently seen by all users, maintaining fairness and consistency.

By using ZooKeeper’s wait-free and order-guaranteed operations, our auction system can handle high volumes of bids efficiently and fairly.

ZooKeeper is designed to be highly available and perform efficiently, even for applications with many processes. Here’s a breakdown of how it works and its benefits, explained with simple examples

Key Concepts

  1. Ensemble of Servers:
  • ZooKeeper consists of multiple servers (an ensemble) that replicate data to ensure high availability and performance.

2. Pipelined Architecture:

  • ZooKeeper processes many requests simultaneously (in a pipeline), allowing for high throughput and low latency.

3. FIFO Order and Asynchronous Operations:

  • Operations from a single client are executed in the order they are sent (FIFO).
  • Clients can send multiple operations at once without waiting for each one to complete (asynchronous operations).

Example Scenarios

Scenario 1: High Availability and Performance

Situation: Imagine a social media application where many users are uploading photos simultaneously.

Traditional Approach:

  • A single server might handle all uploads, causing delays and potential downtime if the server fails.

ZooKeeper Approach:

  • Ensemble of Servers: Multiple servers handle uploads, ensuring that if one server fails, others continue to provide the service without interruption.
  • Replication: Each upload is copied across several servers, so the data is always available.

Result:

  • High availability: Users can continue uploading photos even if some servers fail.
  • High performance: Multiple servers handle requests, reducing delays.

Scenario 2: Pipelined Requests for Low Latency

Situation: An online shopping website needs to process thousands of orders quickly during a sale.

Traditional Approach:

  • Each order is processed one by one, leading to high latency (delays).

ZooKeeper Approach:

  • Pipelined Architecture: Many orders are processed simultaneously in a pipeline.
  • Low Latency: Orders are handled quickly, even if there are many at once.

Result:

  • The website can process thousands of orders rapidly, providing a smooth shopping experience for customers.

Scenario 3: Asynchronous Operations and FIFO Order

Situation: A distributed application needs to update its configuration settings across all servers when a new leader is elected.

Traditional Approach:

  • The new leader sends updates one by one, waiting for each to complete before sending the next. This can take several seconds.

ZooKeeper Approach:

  • Asynchronous Operations: The new leader sends multiple update requests at once, without waiting for each to complete.
  • FIFO Order: ZooKeeper ensures the updates are applied in the order they were sent.

Example:

  • The new leader needs to update metadata:
  • Update 1: Change server role.
  • Update 2: Adjust load balancing settings.
  • Update 3: Update version number.

Result:

  • The new leader sends all updates simultaneously.
  • ZooKeeper processes them in the correct order (FIFO).
  • Initialization is completed in sub-seconds instead of several seconds.

Summary

By using an ensemble of servers, a pipelined architecture, and supporting asynchronous operations with FIFO order, ZooKeeper ensures:

  • High Availability: The system continues to operate even if some servers fail.
  • High Performance: Many requests are handled simultaneously, reducing delays.
  • Efficient Coordination: Clients can send multiple requests at once, and ZooKeeper processes them in the correct order quickly.

These features make ZooKeeper an ideal coordination service for large, distributed applications, enabling them to manage coordination tasks efficiently and reliably.

Linearizability and Zab Protocol

  • Linearizability: This ensures that update operations in ZooKeeper are seen by all clients at the same moment in time, as if they happened in a sequential order. This consistency is crucial for maintaining accurate data across distributed systems.
  • Zab Protocol: ZooKeeper uses a protocol called Zab (ZooKeeper Atomic Broadcast) to achieve linearizability for update operations. This protocol ensures that updates are broadcasted to all servers in a way that guarantees they are processed in the same order everywhere.
    will explain details in part 2

Scaling Read Operations

  • Read Dominance: In ZooKeeper applications, read operations are more frequent than write operations. To handle this efficiently, ZooKeeper allows servers to process read operations locally without needing to enforce total ordering (like Zab does for writes).
  • Client-Side Caching: To improve read performance, clients can cache data locally. For example, instead of checking with ZooKeeper every time, a client can cache the identifier of the current leader. This reduces the latency of read operations.
  • Watch Mechanism: ZooKeeper employs a watch mechanism where clients can register to receive notifications when specific data changes. This allows clients to cache data effectively without having to manage the cache themselves. If the data they are interested in changes, they receive an immediate update notification.

Comparison with Chubby

  • Chubby: Another coordination service that manages client-side caching differently. Chubby directly manages client caches and blocks updates to invalidate caches across all clients when data changes. This approach can delay updates if any client is slow or faulty.
  • Leases: Chubby uses leases to prevent faulty clients from indefinitely blocking updates. Leases set a time limit on how long a client can hold onto cached data before it must re-validate with the server.
  • ZooKeeper’s Advantage: ZooKeeper avoids the delay problem altogether with its watch mechanism. Updates are notified immediately to clients, ensuring that slow or faulty clients do not disrupt the system’s performance.

Coordination Primitives and Contributions

  • Coordination Kernel: ZooKeeper provides a wait-free coordination service with relaxed consistency guarantees. This means it efficiently handles coordination tasks across distributed systems without requiring clients to wait for each other.
  • Coordination Recipes: ZooKeeper allows developers to build higher-level coordination tools using its primitives. These tools can include both blocking and strongly consistent primitives needed for complex distributed applications.

Summary

In summary, ZooKeeper ensures linearizability for writes using the Zab protocol, scales read operations by allowing local processing and client-side caching with a watch mechanism, and provides a robust coordination service with various primitives for distributed applications. Its design focuses on high performance, reliability, and efficient coordination across large-scale systems.

--

--

Learner101
Learner101

No responses yet