Wait-Free Coordination - ZooKeeper (Part 3)
ZooKeeper Client API Explained with Examples
Here’s a detailed explanation of the ZooKeeper API with simple examples to help you understand how each request works and the semantics of the operations:
1. create(path, data, flags)
- Description: Creates a znode at the specified path, stores the provided data, and returns the name of the new znode. The flags parameter allows you to specify the type of znode (regular, ephemeral) and whether it should be sequential.
- Example (the Java client's create also takes an ACL list, so an open ACL is passed here):
String path = "/app1/config";
byte[] data = "config data".getBytes();
CreateMode flags = CreateMode.PERSISTENT;
zooKeeper.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, flags);
- This creates a persistent znode at /app1/config with the data "config data".
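The sequential flag is easiest to see with a toy model. The sketch below is not the ZooKeeper client: ToyParent and createSequential are hypothetical names made up for illustration, and the /app1/locks path is an assumption. It only mimics how a parent znode appends a monotonically increasing, zero-padded 10-digit counter to the requested name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a parent znode; NOT the ZooKeeper client.
class ToyParent {
    private final String path;
    private final List<String> children = new ArrayList<>();
    private int nextSequence = 0; // per-parent counter, never reused

    ToyParent(String path) {
        this.path = path;
    }

    // Mimics a sequential create: the returned name is the requested prefix
    // followed by a zero-padded 10-digit counter.
    String createSequential(String prefix) {
        String name = String.format("%s%010d", prefix, nextSequence++);
        children.add(name);
        return path + "/" + name;
    }

    // Returns a copy of the child names, in creation order.
    List<String> getChildren() {
        return new ArrayList<>(children);
    }

    public static void main(String[] args) {
        ToyParent locks = new ToyParent("/app1/locks");
        System.out.println(locks.createSequential("lock-")); // /app1/locks/lock-0000000000
        System.out.println(locks.createSequential("lock-")); // /app1/locks/lock-0000000001
    }
}
```

Because the counter only grows, the names give clients a total order over the children of one parent, which is why create returns the actual name rather than the requested one.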
2. delete(path, version)
- Description: Deletes the znode at the specified path if its version matches the provided version.
- Example:
String path = "/app1/config";
int version = 1; // Expected version
zooKeeper.delete(path, version);
- This deletes the znode at /app1/config if its version is 1.
3. exists(path, watch)
- Description: Checks if a znode exists at the specified path. If watch is true, a watch is set on the znode.
- Example:
String path = "/app1/config";
boolean watch = true;
Stat stat = zooKeeper.exists(path, watch);
- This checks if the znode /app1/config exists and sets a watch on it.
4. getData(path, watch)
- Description: Retrieves the data and metadata (e.g., version information) of the znode at the specified path. If watch is true, a watch is set on the znode.
- Example:
String path = "/app1/config";
boolean watch = true;
byte[] data = zooKeeper.getData(path, watch, null);
- This gets the data of the znode /app1/config and sets a watch on it.
5. setData(path, data, version)
- Description: Updates the data of the znode at the specified path if its version matches the provided version.
- Example:
String path = "/app1/config";
byte[] newData = "new config data".getBytes();
int version = 1; // Expected version
zooKeeper.setData(path, newData, version);
- This sets new data for the znode /app1/config if its version is 1.
6. getChildren(path, watch)
- Description: Retrieves the names of the children of the znode at the specified path. If watch is true, a watch is set on the znode.
- Example:
String path = "/app1";
boolean watch = true;
List<String> children = zooKeeper.getChildren(path, watch);
- This gets the children of the znode /app1 and sets a watch on it.
7. sync(path)
- Description: Waits for all updates pending at the start of the operation to propagate to the server that the client is connected to.
- Example:
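String path = "/app1/config";
// The Java client exposes sync only in asynchronous form: it takes a callback and a context object.
zooKeeper.sync(path, (rc, p, ctx) -> { /* invoked once the sync has completed */ }, null);
- This makes the server the client is connected to catch up with the leader, so a read of /app1/config issued after the sync sees every update that was pending when the sync started.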
Synchronous vs. Asynchronous API
- Synchronous API:
- Used when an application needs to execute a single ZooKeeper operation and has no concurrent tasks. The method call blocks until the operation completes.
- Example:
zooKeeper.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, flags); // Blocks until the znode is created
- Asynchronous API:
- Allows the application to perform multiple ZooKeeper operations and other tasks concurrently. The method call returns immediately, and a callback is invoked when the operation completes.
- Example:
zooKeeper.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, flags, new AsyncCallback.StringCallback() {
    public void processResult(int rc, String path, Object ctx, String name) {
        // Callback code here
    }
}, null);
API Design
- No Handles for Znodes:
- Each request includes the full path of the znode, simplifying the API and reducing server state.
- Example:
- Instead of using open() and close(), you directly use the path: zooKeeper.getData("/app1/config", true, null);
- Conditional Updates:
- Each update method takes an expected version number to support conditional updates.
- Example:
zooKeeper.setData("/app1/config", data, 1); will only succeed if the version of /app1/config is 1.
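To make the conditional-update semantics concrete, here is a minimal sketch of a single versioned znode. It is a hypothetical in-memory model (ToyZnode is an invented name), not the ZooKeeper client or server. It assumes the documented behavior that a new znode starts at version 0, that each successful setData increments the version, and that an expected version of -1 skips the check.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical single-znode model; NOT the ZooKeeper client or server.
class ToyZnode {
    private byte[] data;
    private int version = 0; // a freshly created znode starts at version 0

    ToyZnode(byte[] initial) {
        this.data = initial.clone();
    }

    // Applies the update and bumps the version only if expectedVersion matches
    // the current version. An expectedVersion of -1 skips the check.
    synchronized boolean setData(byte[] newData, int expectedVersion) {
        if (expectedVersion != -1 && expectedVersion != version) {
            return false; // the real client throws a BadVersionException here
        }
        data = newData.clone();
        version++;
        return true;
    }

    synchronized int getVersion() {
        return version;
    }

    synchronized String getDataAsString() {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ToyZnode config = new ToyZnode("config data".getBytes(StandardCharsets.UTF_8));
        int seen = config.getVersion(); // two writers both read version 0
        System.out.println(config.setData("from A".getBytes(StandardCharsets.UTF_8), seen)); // true
        System.out.println(config.setData("from B".getBytes(StandardCharsets.UTF_8), seen)); // false
    }
}
```

The second writer loses because the version moved from 0 to 1 underneath it. The usual pattern is to re-read the data and version, then retry.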
Watches
One-Time Triggers:
- Watches notify clients of changes but are unregistered after being triggered.
- Example:
Stat stat = zooKeeper.exists("/app1/config", true);
// Watch is set. The client will be notified if /app1/config changes.
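The one-time nature of watches can be sketched with a small registry. This is a hypothetical in-memory model (ToyWatchRegistry and its method names are invented here), not how the ZooKeeper client is implemented. It only shows that a watch fires once and must be registered again to observe the next change.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory model of one-time watches; NOT the ZooKeeper client.
class ToyWatchRegistry {
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    // Registers a watch on a path, as a read with the watch flag set would.
    void watch(String path, Consumer<String> watcher) {
        watches.computeIfAbsent(path, k -> new ArrayList<>()).add(watcher);
    }

    // Called when the path changes: each watcher fires once and is forgotten.
    void changed(String path) {
        List<Consumer<String>> fired = watches.remove(path);
        if (fired != null) {
            for (Consumer<String> w : fired) {
                w.accept(path);
            }
        }
    }

    public static void main(String[] args) {
        ToyWatchRegistry registry = new ToyWatchRegistry();
        registry.watch("/app1/config", p -> System.out.println("changed: " + p));
        registry.changed("/app1/config"); // prints "changed: /app1/config"
        registry.changed("/app1/config"); // prints nothing: the watch is gone
    }
}
```

A client that wants continuous monitoring typically re-registers the watch when it re-reads the znode in response to the notification.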
This overview provides a detailed yet simplified explanation of the ZooKeeper API, including its synchronous and asynchronous versions, the semantics of each request, and practical examples.
Why Use sync(path) Even When You Can Set a Watch
- Immediate Read After Write: If you need to read the most current data right after an update, sync(path) ensures you are seeing the latest state. Watches notify you of future changes, but they do not guarantee that previous updates have been fully propagated.
- Consistency Across Servers: When your application involves multiple ZooKeeper servers, sync(path) ensures that the server your client is connected to has the most recent updates. Watches only notify you when changes occur and do not address propagation delays between servers.
- Initialization and Bootstrapping: When a client first starts and needs to ensure it is working with the latest state before proceeding, using sync(path) can be critical.
Note: sync and watches are not mutually exclusive and can be used together. A typical use case:
- Initialize with Latest Data: Use sync(path) to ensure the client starts with the latest data.
- Monitor for Changes: Set a watch to get notified of any future changes.
Let’s break down the key points about ZooKeeper’s ordering guarantees:
ZooKeeper’s Ordering Guarantees
1. Linearizable Writes:
- ZooKeeper ensures that all requests that update the state (write operations) are serializable and respect precedence. This means write operations appear to be instantaneous and are globally ordered across the distributed system.
2. FIFO Client Order:
- ZooKeeper guarantees that all requests from a given client are executed in the exact order they were sent by the client. This ensures that operations issued by a client are processed sequentially, maintaining the order of operations as perceived by that client.
A-linearizability (Asynchronous Linearizability)
1. Definition:
- ZooKeeper introduces a concept similar to linearizability but with a slight relaxation regarding the ordering of operations from the same client.
- In A-linearizability (asynchronous linearizability), a client can have multiple outstanding operations concurrently. This contrasts with the original strict linearizability definition by Herlihy, where each client (thread) can only have one outstanding operation at a time.
2. Ordering Choices:
- With A-linearizability, ZooKeeper provides flexibility regarding the ordering of operations from the same client:
- It can either guarantee no specific order for outstanding operations from the same client.
- Or, as chosen in ZooKeeper’s design, it can guarantee FIFO order for operations from the same client.
3. Compatibility with Linearizability:
- A-linearizability ensures that while allowing multiple outstanding operations per client, the system still satisfies the fundamental properties of linearizability. Therefore, any guarantees and properties that hold true for linearizable objects also hold true for A-linearizable objects.
4. Handling Read Requests:
- In ZooKeeper, only updates are A-linearizable. Read requests are processed locally at each replica, so a read may return a value that is slightly stale. In exchange, this local processing lets read throughput scale linearly as additional servers (replicas) are added, distributing the read load across the system.
Conclusion
ZooKeeper’s design of A-linearizability allows it to maintain strong consistency guarantees while providing flexibility in handling concurrent operations from the same client. By choosing to guarantee FIFO client order, ZooKeeper ensures that clients observe operations in the order they were initiated, thus simplifying the programming model for applications built on top of ZooKeeper. This approach contributes to ZooKeeper’s ability to scale and maintain consistency across distributed environments effectively.
Let’s walk through an Example
Consider a scenario where a new leader in a distributed system needs to change a large number of configuration parameters and ensure both consistency during updates and reliability in the face of failures. Here the properties of linearizability and FIFO client order play crucial roles. Let’s break down how these guarantees interact and how ZooKeeper facilitates achieving these requirements:
1. Consistency During Updates:
- As the new leader makes changes to configuration parameters, other processes should not use the partially updated configuration.
2. Reliability in Face of Leader Failure:
- If the new leader fails before completing the configuration update, other processes should not use the incomplete or partial configuration.
How ZooKeeper Addresses These Requirements:
1. Linearizable Writes
- Definition: All updates to ZooKeeper are linearizable, ensuring that each update operation appears to occur instantaneously and in a globally consistent order.
- Application: When the new leader updates configuration parameters, ZooKeeper ensures that all processes see these updates in the same order. Ordering alone does not hide a partially updated configuration. Combined with the ready znode convention described in the implementation steps, it ensures that a process that sees ready also sees every update that preceded it.
2. FIFO Client Order
- Definition: ZooKeeper guarantees that all requests from a given client are executed in the order they were sent by that client.
- Application: The new leader can issue a sequence of operations asynchronously to update multiple configuration znodes. Even though these updates are pipelined and issued asynchronously for efficiency (to reduce latency), ZooKeeper ensures that the order of these operations as perceived by clients (and other processes) follows the sequence in which they were initiated by the leader.
How ZooKeeper Implements the Configuration Update:
1. Designating a Ready Znode:
- The leader designates a specific znode (let’s call it ready) whose existence signals that the configuration is complete and consistent. Before updating any configuration znodes, the new leader deletes ready.
2. Updating Configuration Znodes:
- The new leader proceeds to update various configuration znodes by deleting and recreating them with updated parameters.
3. Creating the Ready Znode:
- Once all configuration updates are complete, the new leader creates the ready znode. Other processes can safely begin using the updated configuration once they observe the existence of the ready znode.
Guarantees Provided by ZooKeeper:
- Ordering Guarantees: ZooKeeper ensures that if a process sees the ready znode, it must also see all configuration changes made by the new leader. This consistency is crucial for ensuring that processes do not use partial or inconsistent configurations.
- Failure Handling: If the new leader fails before creating the ready znode, ZooKeeper ensures that other processes do not use the incomplete configuration. This reliability is maintained because processes rely on the existence of the ready znode as a signal that the configuration update is complete and consistent.
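The ordering argument can be checked with a toy replica that applies one client's writes strictly in the order they were issued. Everything here is a hypothetical model, not ZooKeeper's implementation: ToyReplica, Op, and the /config paths and values are invented for illustration. The point is that no prefix of the ordered history contains ready together with a half-updated configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of a replica applying one client's writes in FIFO order.
class ToyReplica {
    // One logged write; a null value means "delete this path".
    static final class Op {
        final String path;
        final String value;
        Op(String path, String value) { this.path = path; this.value = value; }
    }

    private final Map<String, String> znodes = new TreeMap<>();
    private int applied = 0; // number of log entries applied so far

    // Applies up to `count` further entries, strictly in log order.
    void applyNext(List<Op> log, int count) {
        int end = Math.min(log.size(), applied + count);
        for (; applied < end; applied++) {
            Op op = log.get(applied);
            if (op.value == null) znodes.remove(op.path);
            else znodes.put(op.path, op.value);
        }
    }

    boolean exists(String path) { return znodes.containsKey(path); }
    String get(String path) { return znodes.get(path); }

    // Initial configuration, followed by the new leader's pipelined update:
    // delete ready, rewrite the config znodes, create ready.
    static List<Op> history() {
        List<Op> log = new ArrayList<>();
        log.add(new Op("/config/host", "db1"));
        log.add(new Op("/config/port", "5432"));
        log.add(new Op("/config/ready", ""));
        log.add(new Op("/config/ready", null));
        log.add(new Op("/config/host", "db2"));
        log.add(new Op("/config/port", "5433"));
        log.add(new Op("/config/ready", ""));
        return log;
    }

    public static void main(String[] args) {
        List<Op> log = history();
        ToyReplica replica = new ToyReplica();
        for (int step = 1; step <= log.size(); step++) {
            replica.applyNext(log, 1);
            System.out.println("step " + step + ": ready=" + replica.exists("/config/ready")
                    + " host=" + replica.get("/config/host")
                    + " port=" + replica.get("/config/port"));
        }
    }
}
```

If the leader crashed after step 5, ready would stay absent and readers following the convention would keep waiting instead of using the mixed db2/5432 settings.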
Let’s illustrate the problem and how ZooKeeper’s mechanisms address it using a simple example involving a distributed system and configuration updates.
Problem:
Scenario: Imagine a distributed system where multiple servers (nodes) rely on a shared configuration stored in ZooKeeper. Node A is responsible for updating this configuration periodically.
1. Configuration Update Process:
- Node A begins updating the configuration stored in ZooKeeper.
- As part of the update:
- Node A deletes the special ready znode to signal that an update is in progress.
- Node A deletes and recreates several configuration znodes to reflect new settings.
- Node A recreates the ready znode to signal that the update is complete.
2. Node B’s Behavior:
- Node B, another server in the system, checks the ready znode to determine whether the configuration is usable.
- Upon detecting the existence of the ready znode, Node B assumes the configuration is complete and proceeds to read the configuration settings from ZooKeeper.
3. Premature Reading Issue:
- Due to the asynchronous nature of distributed systems:
- Node B may see that ready exists just before Node A deletes it and starts a new update.
- Node B might then read the configuration znodes while Node A is still changing them.
- This can lead Node B to operate on a mix of old and new settings, potentially causing errors or inconsistencies in the system.
Solution with ZooKeeper:
Notification Before State Change:
- ZooKeeper ensures that Node B receives notifications about changes (like updates to the ready znode) before it sees the new state of the system.
- When Node A deletes or recreates the ready znode:
- ZooKeeper notifies Node B, which set a watch on ready, that the znode has changed.
- Node B receives this notification before it can read any configuration data written after that change.
- This mechanism prevents Node B from unknowingly reading a configuration that is in the middle of an update.
Watch Mechanism:
- Node B can register a watch on the ready znode.
- By doing so, Node B receives a notification the next time the ready znode changes state (created, deleted, or modified). Because watches are one-time triggers, Node B re-registers the watch to keep monitoring.
- When Node B receives a watch notification:
- It knows that the configuration update process has reached a certain point.
- Node B can then check ready again and, once it exists, safely read the configuration, knowing that it reflects the complete updates made by Node A.
Consistency Guarantees:
- ZooKeeper orders all updates and delivers watch notifications in that same order.
- A node such as Node B never observes the result of a change before the notification for that change, although a lagging replica may briefly serve an older state.
- This ordering prevents nodes from acting on a half-updated configuration during configuration updates.
Example Illustration:
- Node A starts updating the configuration in ZooKeeper, which involves modifying multiple znodes.
- Node A updates the ready znode once all configurations are updated.
- Node B, monitoring the ready znode, receives a notification that the ready znode has been updated.
- Node B waits for this notification before it reads the configuration data.
- Once notified, Node B fetches the configuration from ZooKeeper, ensuring it has the most up-to-date and consistent settings.
Another problem scenario
1. Clients A and B:
- Both clients have a shared configuration stored in ZooKeeper.
- They also have a separate communication channel outside of ZooKeeper to inform each other about changes.
2. Change Notification:
- Client A updates the shared configuration in ZooKeeper.
- Client A notifies Client B about this change through their shared communication channel.
3. Potential Issue:
- Client B expects to see the updated configuration when it re-reads the configuration from ZooKeeper.
- However, if Client B’s ZooKeeper replica is slightly behind Client A’s (due to network latency or other factors), Client B might not immediately see the new configuration after being notified by Client A.
Solution:
To ensure that Client B sees the most up-to-date information after being notified of a change by Client A, ZooKeeper provides a mechanism called sync:
- Sync Request:
- When Client B issues a sync request followed by a read operation, it ensures that the ZooKeeper server applies all pending write requests (including updates to the shared configuration) before processing the read operation.
- This sync request ensures that Client B’s read operation reflects the most recent state of the shared configuration, even if Client B’s replica was lagging behind when Client A sent its notification.
- Efficiency:
- Unlike a full write operation, which involves transferring data and metadata, a sync request is a lightweight operation. It directs the ZooKeeper server to ensure consistency without the overhead associated with full data transfers.
- Similarity to Flush Operation:
- The sync request in ZooKeeper is conceptually similar to a flush operation in other systems (such as the flush primitive in ISIS). It ensures that all updates pending when the sync was issued are applied at the client’s server before subsequent operations are processed, thereby maintaining consistency.
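The stale-read problem and the effect of sync can be sketched with a leader log and one lagging follower. This is a hypothetical model only: ToyEnsemble, Follower, and newFollower are invented names, and real ZooKeeper replication is far more involved. It shows a local read returning an old value until sync applies every write committed before the sync was issued.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a leader log and a lagging follower; NOT ZooKeeper's implementation.
class ToyEnsemble {
    // Committed writes as {path, value} pairs, in commit order.
    private final List<String[]> committed = new ArrayList<>();

    // A write (for example, Client A's update) is appended to the committed log.
    void write(String path, String value) {
        committed.add(new String[] {path, value});
    }

    Follower newFollower() {
        return new Follower();
    }

    class Follower {
        private final Map<String, String> state = new HashMap<>();
        private int applied = 0; // how much of the log this follower has applied

        // Local read: answered from this follower's own state, which may be stale.
        String read(String path) {
            return state.get(path);
        }

        // sync: apply every write that was committed before the sync was issued.
        void sync() {
            int target = committed.size();
            while (applied < target) {
                String[] w = committed.get(applied++);
                state.put(w[0], w[1]);
            }
        }
    }

    public static void main(String[] args) {
        ToyEnsemble ensemble = new ToyEnsemble();
        Follower serverOfB = ensemble.newFollower();
        ensemble.write("/app1/config", "v1");
        serverOfB.sync();
        ensemble.write("/app1/config", "v2");              // Client A's update
        System.out.println(serverOfB.read("/app1/config")); // v1 (stale)
        serverOfB.sync();                                   // Client B syncs first
        System.out.println(serverOfB.read("/app1/config")); // v2
    }
}
```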
The next part will cover primitives implemented using ZooKeeper, and more.