tweaks to asyn{core,chat}

When juggling lots and lots of connections, the overhead around the select() call becomes important. Imagine you have 1000 clients connected to your server, but they're slow. A single client sends you a request, which you handle, and now you're at the top of poll() again. This causes all 1000 objects to have their readable() and writable() methods called. The end result is that after a few hundred connections, your CPU usage will be pegged because you're wastefully churning the polling loop.
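To put a rough number on that churn, here is a minimal sketch that counts the method calls only. CountingChannel and build_sets are hypothetical stand-ins, not asyncore code, and no sockets are involved; the point is just how the call count scales with the number of channels and the number of passes through the loop.

```python
class CountingChannel:
    """Hypothetical stand-in for an asyncore-style channel; it only counts calls."""
    calls = 0  # counter shared by every channel

    def readable(self):
        CountingChannel.calls += 1
        return True

    def writable(self):
        CountingChannel.calls += 1
        return False


def build_sets(channels):
    """One pass of an asyncore-style loop: ask every channel what it wants."""
    r = [c for c in channels if c.readable()]
    w = [c for c in channels if c.writable()]
    return r, w


channels = [CountingChannel() for _ in range(1000)]
passes = 20  # e.g. one slow client delivering 20 small packets, one per pass

for _ in range(passes):
    build_sets(channels)

# Two method calls per channel per pass, even though only one client was active.
print(CountingChannel.calls)
```

Every pass costs two calls per connected object regardless of how many of them actually have anything to do.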

So I've tweaked a few things to cut down this overhead...

The test is medusa/test/test_lb.py, with a 64-byte 'packet', 20 packets per connection. Server and client ran on the same machine, using the loopback address (127.0.0.1). The number on the graph is 'transactions/second'. The tests were run on RH 6.1, Linux 2.2.12, K6-2/350.

dist - asyncore/asynchat distributed with Python 1.5.2
egp - used at eGroups, uses filter with lambda to build read and write sets rather than for/append.
fdcache - rearrangement of asyncore.socket_map.
Old Map : { <object>: 1, <object>:1, ... }
New Map : { <file-descriptor>: <object>, ... }
This avoids the overhead of having select.select() translate between descriptor<->object every time.
refill - fdcache tweak, along with a change to async_chat.refill_buffer(). refill_buffer() now continues to add to the outgoing buffer until it reaches ac_out_buffer_size. The old code would churn if hit with lots of small pieces of data.
big-bsd3 - This test was run on FreeBSD 3.2-STABLE on a P3 @ 600MHz (I think 600, not sure). The machine and the Python executable are configured for up to 16K descriptors, so I may run with more to see if the line stays that flat. 8^)
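The fdcache rearrangement can be sketched roughly as follows. This is an illustration under assumptions, not the actual asyncore patch: Channel, poll_old and poll_new are hypothetical names, list comprehensions stand in for the for/append and filter/lambda variants, and socket.socketpair() replaces a real server so the example is self-contained.

```python
import select
import socket


class Channel:
    """Hypothetical minimal channel: always wants reads, never writes."""

    def __init__(self, sock):
        self.sock = sock

    def fileno(self):
        return self.sock.fileno()

    def readable(self):
        return True

    def writable(self):
        return False


def poll_old(old_map, timeout=1.0):
    # Old layout {object: 1}: select() receives objects and has to call
    # fileno() on each of them on every pass.
    r = [obj for obj in old_map if obj.readable()]
    w = [obj for obj in old_map if obj.writable()]
    r, w, _ = select.select(r, w, [], timeout)
    return r


def poll_new(fd_map, timeout=1.0):
    # New layout {fd: object}: select() receives plain integers, and the
    # results are mapped back to objects with a dictionary lookup.
    r = [fd for fd, obj in fd_map.items() if obj.readable()]
    w = [fd for fd, obj in fd_map.items() if obj.writable()]
    r, w, _ = select.select(r, w, [], timeout)
    return [fd_map[fd] for fd in r]


a, b = socket.socketpair()
chan = Channel(b)
a.send(b"x" * 64)  # a 64-byte 'packet', as in the test

old_ready = poll_old({chan: 1})
new_ready = poll_new({chan.fileno(): chan})

a.close()
b.close()
```

Both functions report the same ready channel; the difference is only in who does the descriptor<->object translation and how often.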

The tweaks appear to be a win... maybe 15%-20% or so? The BSD performance is pretty bad (especially considering the difference in CPUs) but is probably an artifact of the loopback implementation.

There still remains the overhead of the readable() and writable() methods, and I'm not sure what can be done about it. Ideally, we would like each object to carefully manage the state of an attribute rather than requiring a method call. But I don't think this can be retrofitted without breaking everything.
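For illustration only, one shape the attribute-managed idea could take is for each object to update shared interest sets whenever its own state changes, so the loop never has to ask. Loop, Channel, push and sent are hypothetical names invented for this sketch; it is not a proposal for retrofitting asyncore.

```python
class Loop:
    """Hypothetical loop holding ready-to-use interest sets of descriptors."""

    def __init__(self):
        self.read_fds = set()
        self.write_fds = set()


class Channel:
    """Hypothetical channel that maintains its own entries in the loop's sets."""

    def __init__(self, loop, fd):
        self.loop = loop
        self.fd = fd
        self.out = b""
        loop.read_fds.add(fd)  # always interested in reads

    def push(self, data):
        # Queueing output is the moment the channel becomes 'writable'.
        self.out += data
        self.loop.write_fds.add(self.fd)

    def sent(self, n):
        # Called after n bytes were written; drop write interest when drained.
        self.out = self.out[n:]
        if not self.out:
            self.loop.write_fds.discard(self.fd)


loop = Loop()
chan = Channel(loop, 7)  # 7 is an arbitrary descriptor number for the example
chan.push(b"abc")
wants_write_after_push = 7 in loop.write_fds
chan.sent(3)
wants_write_after_drain = 7 in loop.write_fds
```

The cost moves from every pass of the loop to the (much rarer) moments when a channel's state actually changes, which is also why it is hard to bolt onto code that already overrides readable() and writable().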

At eGroups we have a coroutine-based system (that I will release any day now, I promise!) that doesn't have this overhead, because each coroutine 'blocks' on a specific condition rather than being polled by the loop. We also have the asyncore-like bits implemented in C and talking to poll(2) rather than select(2).
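The eGroups C code is not shown here, but the general difference between the two system calls can be sketched with the select.poll object found in later versions of Python's select module (on platforms that provide poll(2)). Interest is registered once per descriptor instead of being rebuilt as lists on every pass. The fd_map name and the socketpair setup are assumptions made for the example.

```python
import select
import socket

a, b = socket.socketpair()
fd_map = {b.fileno(): b}  # {fd: object} layout, as in the fdcache tweak

poller = select.poll()  # not available on every platform
poller.register(b.fileno(), select.POLLIN)  # register read interest once

a.send(b"x" * 64)
events = poller.poll(1000)  # timeout is in milliseconds

# Each event is an (fd, flags) pair; map descriptors back to objects.
ready = [fd_map[fd] for fd, flags in events if flags & select.POLLIN]

poller.unregister(b.fileno())
a.close()
b.close()
```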

BTW, the graph was generated using StarOffice. Way Cool.

Samual M. Rushing

Last modified: Fri Nov 19 03:32:34 PST 1999