Friday, July 23, 2010

The Headaches of Blocking Sockets

In Brunet, ACIS Lab's P2P software, we make heavy use of sockets. Because UDP traverses NATs more easily than TCP, much of our focus has been on optimizing the UDP stack in the code. The concerns and issues we have had in using UDP include:
  • Safety of sending and receiving on a UDP socket at the same time
  • Do UDP sockets block?
  • Can there be concurrent senders / receivers on a UDP socket at the same time?
On to the findings!
  • Safety of sending and receiving on a UDP socket at the same time
The fear of sending and receiving at the same time comes mostly from lack of knowledge, the limited information available on the Internet, and the fact that a socket is presented as a single piece of state. But let's think about it for a second. A socket is really like a pipe, or rather two pipes, one for sending and one for receiving. Better yet, it is two queues: one holding received packets until userspace reads them and another holding packets waiting to go out across the network. Sure, we multiplex the actual Ethernet device; well, actually, we don't, because that's done by the OS. So why do we really need to be concerned about ordering when writing to a UDP socket? Maybe with a TCP socket, since that actually involves connection state, but UDP is stateless.

As it turns out, one need not worry about reading from and writing to a socket at the same time. We reached this conclusion based on two pieces of evidence: our own experiments and a reading of the grand TCP/IP Illustrated. With this change alone, we were able to boost the speed of our system 10x, reduce code complexity, and improve idle-time execution.
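To make the claim concrete, here is a minimal sketch in Python (not Brunet's actual code, which I am not reproducing here) of one thread receiving on a UDP socket while another thread sends on the very same socket, with no lock between them. The function name `echo_roundtrip` and the choice to send packets to ourselves over loopback are assumptions made purely for illustration.

```python
import socket
import threading

def echo_roundtrip(count=5):
    """Send `count` datagrams to ourselves while a second thread
    receives on the same socket. Returns the received payloads."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    sock.settimeout(2.0)          # so the receiver cannot hang forever
    addr = sock.getsockname()
    received = []

    def receiver():
        try:
            while len(received) < count:
                data, _ = sock.recvfrom(2048)
                received.append(data)
        except socket.timeout:
            pass  # UDP gives no delivery guarantee; just stop

    t = threading.Thread(target=receiver)
    t.start()
    for i in range(count):
        # No lock is shared with the receiver thread.
        sock.sendto(b"msg %d" % i, addr)
    t.join()
    sock.close()
    return received
```

Calling `echo_roundtrip(5)` should hand back the five payloads that were sent, which is roughly the kind of experiment that convinced us.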
  • Do UDP sockets block?
Once upon a time, I was a young, naive boy working on a P2P VPN called IPOP. IPOP was working so very well, but one day, out of the blue, it started freezing up. What was to blame? Was it latent bugs showing themselves at an inopportune time, or did we make a mistake somewhere in our logic? So I searched high and low; I even inspected packets to find which one was causing the code to freeze. It turned out to be the sending of a packet across a transport. Strange, I thought to myself. We have been using this code for years and I have never seen this issue!

After some reading around, I discovered that I didn't know everything about UDP sockets. The worst of it was that UDP sockets actually have a send buffer!!! When the send buffer fills, the send call blocks! This is absolute nonsense, but what can you do? The only solution was to make UDP sends non-blocking somehow.
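For reference, the most direct way to get a send that never stalls is to put the socket itself into non-blocking mode, so that a full send buffer produces an error instead of a wait. This is not the route we ended up taking; the sketch below is only an illustration in Python, and the helper names `make_nonblocking_udp_socket` and `try_send` are made up for this post.

```python
import socket

def make_nonblocking_udp_socket():
    """Create a UDP socket whose send and receive calls never wait."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)
    return sock

def try_send(sock, data, addr):
    """Return True if the datagram was handed to the kernel,
    False if the send buffer was full and the packet was dropped."""
    try:
        sock.sendto(data, addr)
        return True
    except BlockingIOError:
        # The send buffer is full; the caller decides whether to
        # retry later, queue the packet, or drop it.
        return False
```

The catch is that somebody still has to decide what to do with the packet that did not fit, which is exactly the question the next paragraph deals with.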

The two choices were to handle sending in the main loop of the system or to move it into another thread. Well, we were already paying a lock penalty to prevent multiple writers at the same time, so maybe it's safest to just put everything into another thread. We made that choice because that way we could provide a slightly larger userspace buffer. Guess what... the system doesn't stall any more! A quick caution, though: sometimes that queue can become very large, so it's important to limit its size. We chose the modest limit of 256 queued messages, and it seems to be working well for a node that receives messages from over 50 remote nodes somewhat aggressively.
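The shape of that design can be sketched in a few lines of Python. This is a simplified illustration, not Brunet's implementation: the class name `QueuedUdpSender` is hypothetical, and dropping a packet when the queue is full is my assumption about a reasonable policy for a best-effort transport. Only the 256-message limit comes from our system.

```python
import queue
import socket
import threading

MAX_QUEUED = 256  # the queue limit mentioned above

class QueuedUdpSender:
    """Hypothetical helper: callers enqueue packets and a single
    dedicated thread performs the (possibly blocking) sendto."""

    def __init__(self, sock, max_queued=MAX_QUEUED):
        self._sock = sock
        self._queue = queue.Queue(maxsize=max_queued)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, data, addr):
        """Never blocks. Returns False if the queue was full and the
        packet was dropped (an assumed policy, see the text above)."""
        try:
            self._queue.put_nowait((data, addr))
            return True
        except queue.Full:
            return False

    def close(self):
        self._queue.put(None)  # sentinel: stop after draining the queue
        self._thread.join()

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:
                return
            data, addr = item
            try:
                # If the kernel send buffer is full, only this thread waits.
                self._sock.sendto(data, addr)
            except OSError:
                pass  # UDP is best effort; drop the packet on error
```

Since only one thread ever touches the socket for sending, the lock that used to guard against multiple writers is replaced by the queue.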
  • Can there be concurrent senders / receivers on a UDP socket at the same time?
This is actually a very good question, and one I can only answer from reading, not from experience. According to Unix Network Programming, another of Stevens' books, you can at least have multiple readers. We haven't tested that, though; as of now, our receiver runs in a single, dedicated thread.
Sending with multiple threads is kind of confusing and unaddressed in what I've read so far. In the next paragraph, I describe one issue, Connect, which seems to me to explain why it wouldn't work. It would be interesting to see what, if anything, happens if we allowed multiples of each, but unfortunately, we haven't spent much time focusing on the performance scalability of our P2P system.
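If someone wanted to try the multiple-reader experiment, a small test along the following lines might be a starting point. It is a Python sketch with made-up names (`multi_reader_demo`), and it says nothing about performance; it only counts how many datagrams each of several reader threads pulls off one shared socket.

```python
import socket
import threading

def multi_reader_demo(readers=3, packets=12):
    """Start several threads that all call recvfrom on one UDP socket
    and return how many datagrams each thread received."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.settimeout(1.0)  # readers give up after 1 s without traffic
    addr = sock.getsockname()
    counts = [0] * readers

    def reader(idx):
        while True:
            try:
                sock.recvfrom(2048)
            except socket.timeout:
                return
            counts[idx] += 1  # each thread writes only its own slot

    threads = [threading.Thread(target=reader, args=(i,))
               for i in range(readers)]
    for t in threads:
        t.start()

    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for _ in range(packets):
        tx.sendto(b"x", addr)
    for t in threads:
        t.join()
    tx.close()
    sock.close()
    return counts
```

The interesting number is the sum of the counts: if it matches the number of packets sent, every datagram went to exactly one reader.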

At this point, I've addressed most of our concerns, but I am still very interested in understanding better practices with sockets and how we could potentially improve our system. For example, one tidbit from Unix Network Programming was that sending a packet on an unconnected UDP socket actually makes the kernel perform a temporary Connect for each send. Apparently, this overhead is about 1/3 of the time spent in the socket during a send. So if a user were to have a connected socket for each remote peer, performance could potentially be significantly better.
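In code, the per-peer idea would look something like the Python sketch below. We have not tried this in Brunet, and the function name `open_peer_socket` is invented for this post. Calling connect on a UDP socket exchanges no packets; it only records the peer address in the kernel, so later sends can use `send` instead of `sendto`.

```python
import socket

def open_peer_socket(peer_addr):
    """Create a UDP socket that is permanently 'connected' to one peer."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # No handshake happens here; the kernel just remembers the peer.
    sock.connect(peer_addr)
    return sock

def send_to_peer(sock, data):
    """Send on a connected socket; no address is passed per call."""
    return sock.send(data)
```

One trade-off to keep in mind: a connected UDP socket only receives datagrams from its peer, and a socket per peer normally means a different local port per peer, which may matter for NAT traversal.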