Many people know ZooKeeper, a widely used open source project. But few people know Chubby, a service used only internally at Google; the public can learn its details only through the paper Google published. ZooKeeper provides almost the same functionality as Chubby, but there are a few subtle differences between them, and I will discuss those differences in this post. So the next time you read a Google paper saying they used Chubby as an underlying piece of infrastructure, do not treat it as a ZooKeeper equivalent.
ZooKeeper is a distributed process coordinator, as O’Reilly puts it. Chubby is a distributed lock service that provides strong consistency. Although both expose a file-system-like API from the user’s perspective, they provide different levels of consistency, and you can get a clue from their descriptions: coordinator is a much weaker word than lock service.
Like most other systems, Chubby sees few write operations, so its network traffic is dominated by reads and by heartbeats between client and server. To optimize this, a cache is necessary; at the same time, since Chubby is intended to provide strong consistency, it must ensure that a client never sees stale data. It achieves both by caching data in the client library, and whenever anyone wants to change a piece of data, Chubby ensures that all cached copies are invalidated before the change takes effect. From the paper, Chubby has a rather heavy client library, and that library is an essential part of the system: it provides not only connection and session management but also cache management, as discussed above. And it is the cache management that makes the big difference between ZooKeeper and Chubby. Many users assume ZooKeeper provides the same strong consistency, only to find out later that they are wrong.
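To make the invalidate-before-write idea concrete, here is a minimal sketch in Java. Everything in it (the MasterCell class, the cachers map, the ack handling) is hypothetical illustration, not Chubby’s actual API; per the paper, the real protocol piggybacks invalidations on KeepAlive replies and also lets the master simply wait for a client’s cache lease to expire instead of waiting for an ack.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

/** Hypothetical sketch: a master that invalidates client caches before a write. */
class MasterCell {
    private final Map<String, byte[]> data = new ConcurrentHashMap<>();
    // Which client sessions hold a cached copy of each path (hypothetical bookkeeping).
    private final Map<String, Set<ClientSession>> cachers = new ConcurrentHashMap<>();

    /** The write does not take effect until every cached copy is gone. */
    void write(String path, byte[] value) throws InterruptedException {
        Set<ClientSession> holders = cachers.getOrDefault(path, Set.of());
        CountDownLatch acked = new CountDownLatch(holders.size());
        for (ClientSession s : holders) {
            // In real Chubby this rides on the KeepAlive reply; the master may
            // also just wait out the client's cache lease instead of an ack.
            s.sendInvalidation(path, acked::countDown);
        }
        acked.await();          // block until all caches are invalidated
        cachers.remove(path);
        data.put(path, value);  // only now does the change become visible
    }

    interface ClientSession {
        void sendInvalidation(String path, Runnable onAck);
    }
}
```

Notice where the cost lands: the writer stalls while caches are flushed, which is exactly the trade-off the next paragraph says ZooKeeper chose to avoid.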
Unlike Chubby, ZooKeeper does not provide such strong consistency; in fact, the native ZooKeeper client library has no cache at all. What difference can a cache make, you ask? Well, for example: if two ZooKeeper clients want to agree on a single volatile value, they write to the same path, and the content of that file signifies the value. If client A wants to know the latest value and client B wants to change it, A can either poll the file regularly or use the watch mechanism provided by the native library; let’s assume A uses a watch, since it is more efficient. When B changes the value, ZooKeeper responds to B’s call before notifying A. Once B gets the response, it may assume every other client watching that value has been notified, but that is not the case: A’s notification may be delayed by a slow network, so it is possible for the two clients to see different values at the same time. This is usually not a problem if users do not require strong consistency, but if they do, they have to invent some buggy ad-hoc workaround. The author of the ZooKeeper book argued that having the library manage a cache would have made ZooKeeper’s design more complex, and could cause ZooKeeper operations to stall while waiting for a client to acknowledge a cache invalidation request. It is true that strong consistency comes at a cost, but people would find the system easier to use if it were guaranteed.
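Here is a small sketch of that scenario using the native ZooKeeper Java API. The znode path /agreed-value and the connection string are made up for illustration; the calls themselves (getData with a watcher, setData) are the real API.

```java
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.ZooKeeper;

public class WatchRace {
    public static void main(String[] args) throws Exception {
        // Both clients talk to the same ensemble ("localhost:2181" is an assumption).
        ZooKeeper a = new ZooKeeper("localhost:2181", 15000, e -> {});
        ZooKeeper b = new ZooKeeper("localhost:2181", 15000, e -> {});
        String path = "/agreed-value"; // hypothetical znode, assumed to exist

        // Client A reads the value and leaves a one-shot watch on the znode.
        byte[] seenByA = a.getData(path, (WatchedEvent event) -> {
            // Fired asynchronously, possibly long after B's write has returned.
            System.out.println("A notified: " + event.getType());
        }, null);
        System.out.println("A initially sees: " + new String(seenByA));

        // Client B changes the value; version -1 means "any version".
        b.setData(path, "new".getBytes(), -1);
        // setData has already returned to B here, yet A's watch callback may
        // still be in flight: for a moment, A and B can see different values.
    }
}
```

The watch also only tells A that the value changed; A must issue another getData to learn the new value, which widens the window further.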
To remedy this, Netflix created the Curator library, which later moved to the Apache foundation. It provides commonly used recipes and cache management on top of ZooKeeper, and this additional layer lets applications get the stronger consistency some users need. So whenever you use ZooKeeper, reach for the Curator library instead of the native one, unless you know what you are doing.
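As a closing sketch, this is roughly what the cached read looks like with Curator. The path and connection string are again assumptions; NodeCache is a real Curator recipe (newer releases offer CuratorCache as its successor).

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.NodeCache;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class CuratorCachedRead {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        // NodeCache keeps a local copy of the znode and re-registers the watch
        // for us, so readers need not hand-roll the watch/re-read loop.
        NodeCache cache = new NodeCache(client, "/agreed-value");
        cache.start(true); // true = block until the initial data is loaded
        cache.getListenable().addListener(() ->
                System.out.println("value changed: "
                        + new String(cache.getCurrentData().getData())));
    }
}
```

Curator does the cache bookkeeping that Chubby buries in its client library, which is exactly the missing piece this post has been pointing at.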