Chapter 13: Scalability and ZEO

When a web site gets more requests than it can handle, it can become slow and unresponsive. In the worst case, too many requests can overload the server completely, causing it to stop handling requests and possibly even crash. This can be a problem for any kind of server application, not just Zope. The obvious solution to this problem is to use more than one computer, so that if one computer fails, another can continue to serve your web site.

Using multiple computers has obvious benefits, but it also has some drawbacks. For example, if you have five computers running Zope, you must ensure that all five Zope installations hold the same information. This is not very hard if you're the only user and you have only a few static objects, but for large organizations with thousands of rapidly changing objects, keeping five separate Zope installations synchronized manually would be a nightmare. To solve this problem, Zope Corporation created Zope Enterprise Objects, or ZEO. This chapter gives you a brief overview of installing ZEO, but there are many other options we don't cover. For more in-depth information, see the documentation that comes with the ZEO package, and also take a look at the ZEO discussion area.

What is ZEO?

ZEO is a system that allows you to run your site on more than one computer. This is often called clustering and load balancing. By running Zope on multiple computers, you can spread the requests evenly around and add more computers as the number of requests grows. Further, if one computer fails or crashes, other computers can still service requests while you fix the broken one.

ZEO runs Zope on multiple computers and takes care of making sure all the Zope installations share the exact same database at all times. ZEO uses a client/server architecture. The Zope installations on multiple computers are the ZEO Clients. All of the clients connect to one, central ZEO Storage Server, as shown in Figure 11-1.

Figure 11-1 Simple ZEO illustration

The terminology can be a bit confusing, because normally you think of Zope as a server, not a client. When using ZEO, your Zope processes act as both servers (for web requests) and clients (for data from the ZEO server).

ZEO clients and servers communicate using standard Internet protocols, so they can be in the same room or in different countries. ZEO, in fact, can distribute a Zope site all over the world. In this chapter we'll explore some interesting ways you can distribute your ZEO clients.

When you should use ZEO

ZEO serves many hits in a fail-safe way. If your site does not get millions of hits, then you probably don't need ZEO. There is no hard-and-fast rule about when you should and should not use ZEO, but for the most part you should not need to run ZEO unless:

All of these cases are fairly advanced, high-end uses of Zope. Installing, configuring, and maintaining systems like these requires advanced system administration knowledge and resources. Most Zope users will not need ZEO, or may not have the expertise necessary to maintain a distributed server system like ZEO. ZEO is fun, and can be very useful, but before jumping head-first and installing ZEO in your system you should weigh the extra administrative burden ZEO creates against the simplicity of running just a simple, stand-alone Zope.

Installing and Running ZEO

The most common ZEO setup is one ZEO server and multiple ZEO clients. Before installing and configuring ZEO though, consider the following issues:

ZEO is not distributed with Zope, you must download it from the Products Section of

Installing ZEO requires a little bit of manual preparation. To install ZEO, download ZEO-1.0.tgz from the web site and place it in your Zope installation directory. Now, unpack the tarball. On Unix, this can be done with the following command:

      $ tar -zxf ZEO-1.0.tgz

On Windows, you can unpack the archive with WinZip. Make sure you back up your Zope system before installing ZEO.

Now you should have a ZEO-1.0 directory. Next, you have to copy some files into your Zope top level lib/python directory. This can be done on Unix with:

      $ cp -R ZEO-1.0/ZEO lib/python

If you're running Windows, you can use the following DOS commands to copy your ZEO files:

      C:\...Zope\>xcopy ZEO-1.0\* lib\python /S

Now, you have to create a special file in your Zope root directory called In that file, put the following Python code:

      import ZEO.ClientStorage
      Storage = ZEO.ClientStorage.ClientStorage(('localhost', 7700))

This will configure your Zope to run as a ZEO client. If you pass ClientStorage a tuple, as this code does, the tuple must have two elements, a string which contains the address to the server, and the port that the server is listening on. In this example, we're going to show you how to run both the clients and the servers on the same machine, so the machine name is set to localhost.

Now, you have ZEO properly configured to run on one computer. Try it out by first starting the server. Go to your Zope top level directory in a terminal window or DOS box and type:

      python lib/python/ZEO/ -p 7700

This will start the ZEO server listening on port 7700 on your computer. Now, in another window, start up Zope like you normally would, with the script:

      $ python -D

      2000-10-04T20:43:11 INFO(0) client Trying to connect to server
      2000-10-04T20:43:11 INFO(0) ClientStorage Connected to storage
      2000-10-04T20:43:12 PROBLEM(100) ZServer Computing default pinky
      2000-10-04T20:43:12 INFO(0) ZServer Medusa (V1.19) started at Wed Oct  4 15:43:12 2000

Notice how in the above example, Zope tells you client Trying to connect to server and then ClientStorage Connected to storage. This means your ZEO client has successfully connected to your ZEO server. Now, you can visit http://localhost:8080/manage (or whatever URL your ZEO client is listening on) and log into Zope as usual.
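If the connection messages do not appear, a quick first check is whether anything is listening on the ZEO server's port at all. The following is a minimal sketch using only Python's standard library. The function name `zeo_port_is_open` is made up for this example, and it only tests that a TCP connection can be opened, not that the program answering is really a ZEO server.

```python
import socket


def zeo_port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the single-machine setup described above, you would call `zeo_port_is_open("localhost", 7700)` while the server is running.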

As you can see, everything looks the same. Go to the Control Panel and click on Database Management. Here, you see that Zope is connected to a ZEO Storage and that its state is connected.

Running ZEO on one computer is a great way to familiarize yourself with ZEO and how it works. Running ZEO on one computer does not, however, improve the speed of your site, and in fact, it may slow it down just a little. To really get the speed benefits that ZEO provides, you need to run ZEO on several computers, which is explained in the next section.

How to Run ZEO on Many Computers

Setting up ZEO to run on multiple computers is very similar to running ZEO on one computer. There are generally two steps: first start the ZEO server, then start one or more ZEO clients.

For example, let's say you have four computers. One computer named zooserver will be your ZEO server, and the other three computers, named zeoclient1, zeoclient2 and zeoclient3, will be your ZEO clients.

The first step is to run the server on zooserver. To tell your ZEO server to listen on the TCP socket at port 9999 on the zooserver interface, run the server with the script like this:

      $ python lib/python/ZEO/ -p 9999 -h

This will start the ZEO server. Now, you can start up your clients by going to each client and configuring each of them with the following

      import ZEO.ClientStorage

Now, you can start each client's script as shown in the previous section, Installing and Running ZEO. Notice that the host and port are the same for every client, so that they all connect to the same server. By following this procedure for each of your three clients, you will have three different Zopes all serving the same Zope site. You can verify this by visiting port 8080 on all three of your ZEO client machines.
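Rather than visiting each client by hand, you could script the check. The sketch below is one possible way to do it with Python's standard library. The helper names `client_status` and `check_clients` are invented for this example, and it assumes the client host names resolve on your network and that each client answers HTTP requests on port 8080.

```python
import urllib.error
import urllib.request


def client_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The client answered, but with an error status (for example 401).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, host unknown, or timed out.
        return None


def check_clients(hosts, port=8080):
    """Map each host name to the status of its front page."""
    return {host: client_status("http://%s:%d/" % (host, port)) for host in hosts}
```

Calling `check_clients(["zeoclient1", "zeoclient2", "zeoclient3"])` returns a dictionary in which any value of `None` marks a client that did not respond.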

You probably want to run ZEO on more than one computer so that you can take advantage of the speed increase this gives you. Running more computers means that you can serve more hits per second than with just one computer. Distributing the load of your web site's visitors, however, does require a bit more elaboration in your system. The next section describes why, and how, you distribute the load of your visitors among many computers.

How to Distribute Load

In the previous example you have a ZEO server named zooserver and three ZEO clients named zeoclient1, zeoclient2, and zeoclient3. The three ZEO clients are connected to the ZEO server and each client is verified to work properly.

Now you have three computers that serve content to your users. The next problem is how to actually spread the incoming web requests evenly among the three ZEO clients. Your users only know about your site's main address, not zeoclient1, zeoclient2 or zeoclient3. It would be a hassle to tell only some users to use zeoclient1, and others to use zeoclient3, and it wouldn't be a very good use of your computing resources. You want to automate, or at least make very easy, the process of evenly distributing requests to your various ZEO clients.

There are a number of solutions to this problem, some easy, some advanced, and some expensive. The next section goes over the more common ways of spreading web requests around various computers using different kinds of technology, some of them based on freely-available or commercial software, and some of them based on special hardware.

User Chooses a Mirror

The easiest way to distribute requests across many web servers is to let your users pick from a list of mirrored sites, each of which is a ZEO client. This method requires no extra software or hardware, only the maintenance of a list of mirror servers. You present your users with a menu of mirrors, and they choose which server to use.

Note that this method of distributing requests is passive (you have no active control over which clients are used) and voluntary (your users need to make a voluntary choice to use another ZEO client). If your users do not use a mirror, then the requests will go to the ZEO client that serves your main site.

If you do not have any administrative control over your mirrors, then this can be a pretty easy solution. If your mirrors go off-line, your users can always choose to come back to the master site which you do have administrative control over and choose a different mirror.

On a global level, this method improves performance. Your users can choose to use a server that is geographically closer to them, which probably results in faster access. For example, if your main server was in Portland, Oregon on the west coast of the USA and you had users in London, England, they could choose your London mirror and their request would not have to go half-way across the world and back.

To use this method, create a property in your root folder of type lines named "mirror_servers". On each line of this property, put the URL to your various ZEO clients, as shown in Figure 11-2.

Figure 11-2 Figure of property with URLs to mirrors

Now, add some simple DTML to your site to display a list of your mirrors:

        <h2>Please choose from the following mirrors:</h2>
        <ul>
          <dtml-in mirror_servers>
          <li><a href="&dtml-sequence-item;"><dtml-var sequence-item></a></li>
          </dtml-in>
        </ul>

This DTML displays a list of all mirrors your users can choose from. When using this model, it is good to name your computers in ways that assist your users in their choice of mirror. For example, if you spread the load geographically, then choose names of countries for your computer names.

Alternatively, if you do not want users voluntarily choosing a mirror, you can have the index_html method of your site issue HTTP redirects. For example, use the following code in your site's index_html method:

        <dtml-call expr="RESPONSE.redirect(_.whrandom.choice(mirror_servers))">

This code will redirect any visitors to a random mirror server.
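If it helps to see the selection step outside of DTML, here is a rough Python equivalent. It is only a sketch: the mirror URLs are hypothetical placeholders, `pick_mirror` is a name invented for this example, and the standard `random` module stands in for the older `whrandom` module used in the DTML above.

```python
import random

# Hypothetical mirror URLs; in Zope these would be the lines of the
# mirror_servers property.
mirror_servers = [
    "http://zeoclient1.example.com:8080/",
    "http://zeoclient2.example.com:8080/",
    "http://zeoclient3.example.com:8080/",
]


def pick_mirror(mirrors, rng=random):
    """Return one mirror URL chosen uniformly at random."""
    if not mirrors:
        raise ValueError("no mirrors configured")
    return rng.choice(mirrors)
```

Because every request picks independently, the load evens out only on average; a short burst of requests can still land mostly on one mirror.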

Using Round-robin DNS to Distribute Load

The Domain Name System, or DNS, is the Internet mechanism that translates computer names into numeric addresses. This mechanism can map one name to many addresses.

The simplest method for load-balancing is to use round-robin DNS, as illustrated in Figure 11-3.

Figure 11-3 Load balancing with round-robin DNS.

When your site's name gets resolved, BIND (a widely used DNS server) answers with the address of either zeoclient1, zeoclient2, or zeoclient3, but in a rotated order every time. For example, one user may resolve the name and get the address for zeoclient1, and another user may resolve it and get the address for zeoclient2. This way your users are spread over the various ZEO clients.
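The rotation itself is easy to picture in code. The following toy model is not how BIND is implemented; it is only a sketch of the idea, using a made-up class name and addresses from a range reserved for documentation.

```python
from collections import deque


class RoundRobinResolver:
    """Toy model of a name server that rotates its answers."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        """Return all addresses, then rotate so the next caller sees a new first one."""
        answer = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return answer


# Hypothetical addresses for zeoclient1, zeoclient2 and zeoclient3.
resolver = RoundRobinResolver(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
```

Most browsers simply connect to the first address in the answer, which is why rotating the order spreads users across the clients.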

This is not a perfect load balancing scheme, because DNS resolve information gets cached by the other nameservers on the net. Once a user has resolved to a particular ZEO client, all subsequent requests for that user also go to the same ZEO client. The final result is generally all right, because the requests as a whole are still spread over your various ZEO clients.

One down-side to this solution is that it can take from hours to days for name servers to refresh their cached copy of what they think the address of your site is. If you are not responsible for the maintenance of your ZEO clients and one fails, then 1/Nth of your users (where N is the number of ZEO clients) will not be able to reach your site until their name server cache refreshes.
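To make the 1/N figure concrete, the sketch below pins 300 imaginary users to the three clients in rotation, as cached DNS answers would, and then counts who is stranded when one client fails. The user names and the helper `stranded_users` are invented for this illustration.

```python
def stranded_users(cached_answers, failed_client):
    """Return the users whose cached address points at the failed client."""
    return [user for user, client in cached_answers.items()
            if client == failed_client]


clients = ["zeoclient1", "zeoclient2", "zeoclient3"]

# Each hypothetical user keeps using whichever client was resolved first.
cached = {"user%d" % i: clients[i % len(clients)] for i in range(300)}

stranded = stranded_users(cached, "zeoclient2")
fraction = len(stranded) / len(cached)  # one third of the users
```

Real traffic is rarely this evenly divided, so treat 1/N as an estimate rather than a guarantee.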

Configuring your DNS server to do round-robin name resolution is a pretty advanced technique that is not covered in this book. A good reference on how to do this can be found in the Apache Documentation.

Distributing the load with round-robin DNS is useful, and cheap, but not 100% effective. DNS servers can have strange caching policies, and you are relying on a particular quirk in the way DNS works to distribute the load. The next section describes a more complex, but much more powerful way of distributing load called Layer 4 Switching.

Using Layer 4 Switching to Distribute Load

Layer 4 switching lets one computer transparently hand requests to a farm of computers. This is a pretty advanced technique that is beyond the scope of this book, but it is worth pointing out several products that do Layer 4 switching for you.

Layer 4 switching involves a switch that, according to your preferences, chooses from a group of ZEO clients whenever a request comes in, as shown in Figure 11-4.

Figure 11-4 Illustration of Layer 4 switching
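The decision a switch makes for each request can be sketched in a few lines. This is only an illustration of one possible policy (plain rotation that skips clients marked as down), not a description of how LVS or any hardware switch works; the class and method names are made up.

```python
import itertools


class Layer4SwitchSketch:
    """Pick a ZEO client for each request, skipping ones marked as down."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._rotation = itertools.cycle(self._backends)

    def mark_down(self, backend):
        self._healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self._backends:
            self._healthy.add(backend)

    def choose(self):
        """Return the next healthy backend in rotation."""
        # Look at each backend at most once per request.
        for _ in range(len(self._backends)):
            candidate = next(self._rotation)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy ZEO clients available")
```

Unlike round-robin DNS, a failed client can be taken out of rotation immediately, because the switch decides anew for every request.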

There are hardware and software Layer 4 switches. There are a number of software solutions, but one in particular that stands out is the Linux Virtual Server (LVS). This is an extension to the free Linux operating system that lets you turn a Linux computer into a Layer 4 switch. More information on the LVS can be found on its web site.

There are also a number of hardware solutions that claim higher performance than software-based solutions like LVS. Cisco Systems has a hardware router called LocalDirector that works as a Layer 4 switch, and Alteon also makes a popular Layer 4 switch.

Dealing with a Single Point of Failure

Without ZEO, your entire Zope system is a single point of failure. ZEO allows you to spread that point of failure around to many different computers. If one of your ZEO clients fails, other clients can answer requests on the failed client's behalf.

Note that as of this writing, the single point of failure can't be entirely eliminated, because there is still one central storage server. The methods described in this section, however, do minimize the risks of failure by spreading most of Zope across many computers.

In other words, ZEO removes much of the risk of your web servers being a single point of failure, but it does not eliminate all risk, because the ZEO server itself is now a single point of failure. There are several ways of dealing with this issue.

One popular method is to accept the single point of failure risk and mitigate that risk as much as possible by using very high-end, reliable equipment for your ZEO server, frequently backing up your data, and using inexpensive, off-the-shelf hardware for your ZEO clients. By investing the bulk of your infrastructure budget on making your ZEO server rock solid (redundant power supplies, RAID, and other fail-safe methods) you can be pretty well assured that your ZEO server will remain up, even if a handful of your inexpensive ZEO clients fail.

Some applications, however, require absolute 100% up-time. There is still a chance, with the solution described above, that your ZEO server will fail. If this happens, you want a backup ZEO server to jump in and take over for the failed server right away.

Like Layer 4 switching, there are a number of products, software and hardware, that help you mitigate this kind of risk. One popular software solution for Linux is called fake. Fake is a Linux-based utility that can make a backup computer take over for a failed primary computer by "faking out" network addresses. When used in conjunction with monitoring utilities like mon or heartbeat, fake can guarantee almost 100% up-time of your ZEO server and Layer 4 switches. Using fake in this way is beyond the scope of this book.
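The general pattern behind this kind of monitoring can be sketched as follows. This is not how fake, mon, or heartbeat are implemented; it is a simplified illustration, with invented names, of a monitor that triggers a takeover after several consecutive failed health checks.

```python
class FailoverMonitor:
    """Trigger a takeover after several consecutive failed health checks."""

    def __init__(self, is_primary_alive, take_over, max_failures=3):
        self._is_primary_alive = is_primary_alive  # callable returning bool
        self._take_over = take_over                # callable run once on failover
        self._max_failures = max_failures
        self._failures = 0
        self.taken_over = False

    def poll(self):
        """Run one health check; return True once the backup has taken over."""
        if self.taken_over:
            return True
        if self._is_primary_alive():
            self._failures = 0  # a single success resets the count
        else:
            self._failures += 1
            if self._failures >= self._max_failures:
                self._take_over()
                self.taken_over = True
        return self.taken_over
```

Requiring several failures in a row is a common way to avoid failing over because of one dropped packet; the right threshold depends on how often you poll.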

So far, we've explained these techniques for mitigating a single point of failure:

The final piece of the puzzle is the ZEO server itself, and where it stores its information. If your primary ZEO server fails, how can your backup ZEO server ensure it has the most recent information that was contained in the primary server? As usual, there are several ways to solve this problem, and they are covered in the next section.

ZEO Server Details

Before explaining the details of how the ZEO server works, it is worth understanding some details about how Zope storages work in general.

Zope does not save any of its objects or information directly to disk. Instead, Zope uses a storage component that takes care of all the details of where objects should be saved.

This is a very flexible model, because Zope no longer needs to be concerned about opening files, or reading and writing from databases, or sending data across a network (in the case of ZEO). Each particular storage takes care of that task on Zope's behalf.

For example, a plain, stand-alone Zope system is illustrated in Figure 11-5.

Figure 11-5 Zope connected to a filestorage

You can see there is one Zope application which plugs into a FileStorage. This storage, as its name implies, saves all of its information to a file on the computer's filesystem.

When using ZEO, you simply replace the FileStorage with a ClientStorage, as illustrated in Figure 11-6.

Figure 11-6 Zope with a Client Storage and Storage server

Instead of saving objects to a file, a ClientStorage sends objects over a network connection to a Storage Server. As you can see in the illustration, the Storage Server uses a FileStorage to save that information to a file on the ZEO server's filesystem.

Storages are interchangeable and easy to implement. Because of their interchangeable nature, ZEO Storage Servers can use ZEO ClientStorages to pass on object data to yet another ZEO Storage Server. This is illustrated in Figure 11-7.

Figure 11-7 Multi-tiered ZEO system
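The reason storages can be stacked this way is that they all offer the same interface. The toy classes below illustrate the idea only. They are not the real FileStorage or ClientStorage, their method names and signatures are greatly simplified, and nothing here touches a disk or a network.

```python
class DictStorage:
    """Stands in for a FileStorage: keeps object data in a local dict."""

    def __init__(self):
        self._data = {}

    def store(self, oid, data):
        self._data[oid] = data

    def load(self, oid):
        return self._data[oid]


class ForwardingStorage:
    """Stands in for a ClientStorage: hands every call to another storage."""

    def __init__(self, downstream):
        self._downstream = downstream

    def store(self, oid, data):
        self._downstream.store(oid, data)

    def load(self, oid):
        return self._downstream.load(oid)


# A three-tier chain: client -> intermediate server -> central server.
central = DictStorage()
intermediate = ForwardingStorage(central)
client = ForwardingStorage(intermediate)
client.store("oid-1", b"pickled object")
```

The caller at the top of the chain cannot tell how many tiers sit beneath it, which is what lets a storage server act as a client of another storage server.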

Here, you can see a number of ZEO clients funnel down through three ZEO servers, which in turn act as ZEO clients themselves and funnel down into the final, central ZEO server that saves its information in a FileStorage. Now, that central ZEO server is the single point of failure in the system. If any of your other clients or intermediate servers fail, the system will still continue to work, but if the central server fails, then you need an alternative.

Using fake you can have a back-up storage server strategy, but this method is not very well proven and hasn't been explored by the authors. In the future, ZEO will have a "multiple-server" feature, that allows a group of storage servers to act as a quorum, so if one or more storage servers fail, the remaining servers in the quorum can continue to serve objects.

There are a number of advantages to approaches like these, especially if you are interested in creating a massively distributed network object database. Of course, with any system of advantages, there are some drawbacks as well, which are discussed in the next section.

ZEO Caveats

For the most part, running ZEO is exactly like running Zope by itself, but there are a few issues to keep in mind.

First, it takes longer for information to be written to the Zope object database. This does not slow down your ability to use Zope (because Zope does not block you during this write operation) but it does increase your chances of getting a ConflictError. Conflict errors happen when two ZEO clients try to write to the same object at the same time. One of the ZEO clients wins the conflict and continues on normally. The other ZEO client loses the conflict and has to try again.

Conflict errors should be as infrequent as possible because they could slow down your system. While it's normal to have a few conflict errors (due to the concurrent nature of Zope) it is abnormal to have a lot of conflict errors. The pathological case is when more than one ZEO client tries to write to the same object over and over again very quickly. In this case, there will be lots of conflict errors, and therefore lots of retries. If a ZEO client tries to write to the database three times and gets three conflict errors in a row, then the request is aborted and the data is not written.
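The retry behavior described above can be modeled with a small sketch. This is not Zope's actual implementation: the `ConflictError` class here is a local stand-in, and `VersionedValue` and `run_with_retries` are names invented to show the idea of detecting a stale write and retrying up to three times.

```python
class ConflictError(Exception):
    """Local stand-in for Zope's conflict error."""


class VersionedValue:
    """A value that rejects writes based on an out-of-date read."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def write(self, new_value, seen_version):
        # Someone else wrote since we read: the loser must retry.
        if seen_version != self.version:
            raise ConflictError("object changed since it was read")
        self.value = new_value
        self.version += 1


def run_with_retries(transaction, max_attempts=3):
    """Run transaction(), retrying on conflict; give up after max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except ConflictError:
            if attempt == max_attempts:
                raise  # three conflicts in a row: the request is aborted
```

Each retry re-reads the object before writing, so a retried request works from the winner's data rather than overwriting it.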

Because ZEO takes longer to write this information, the chances of getting a ConflictError are higher than if you are not running ZEO. Because of this, ZEO is more write sensitive than running Zope without ZEO. You may have to keep this in mind when you are designing your network or application. As a rule of thumb, more and more frequent writes to the database increase your chances of getting a ConflictError. On the flip side, faster and more reliable network connections and computers lower your chances of getting a ConflictError. By taking these two factors into account, conflict errors can be mostly avoided.

Finally, as of this writing, there is no built in encryption or authentication between ZEO servers and clients. This means that you must be very careful about who you expose your ZEO servers to. If you leave your ZEO servers open to the whole Internet, then anyone can connect to your ZEO server and write data into your database, and that can be bad news.

This is not an unsolvable problem, however, because you can use other tools, like firewalls, to protect your ZEO servers. If you are running a ZEO client/server connection over an insecure network and you want to guarantee that your information is kept private, you can use tools like OpenSSH and stunnel to set up secure, encrypted communication channels between your ZEO clients and servers. How these tools work and how to set them up is beyond the scope of this book, but both packages are adequately documented on their web sites. For more information on firewalls, with Linux in particular, we recommend the book "Linux Firewalls" by Robert Ziegler, which is published by New Riders.


In this chapter we looked at ZEO, and how ZEO can substantially increase the capacity of your website. In addition to running ZEO on one computer to get familiar with it, we looked at running ZEO on many computers, and various techniques for spreading the load of your visitors among those many computers.

ZEO is not a magic bullet solution, and like other systems designed to work with many computers, it adds another level of complexity to your web site. This complexity pays off, however, when you need to serve up lots of dynamic content to your audience.