41. pingcat

Most data on the Internet is transferred over TCP or UDP. The former works best when the transport needs to be reliable, and the latter when low latency matters more than occasional packet loss or reordering. However, for a laugh, you can also transfer it over ICMP, aka ping.

As a quick refresher, the ping(8) command works by sending ICMP echo-request packets with some random data, then waiting for ICMP echo-reply packets with the same data. The key point is that the data doesn’t really have to be random.

A                       B
|  echo-request(data)   |
| --------------------> |
|                       |
|   echo-reply(data)    |
| <-------------------- |
|                       |

Here, we’ll only be focusing on the interesting bits; the full code listing for pingcat is available here. Be aware that the program only works when run as root (or when the binary is given the setuid bit or the CAP_NET_RAW capability). For the complete ping(8) implementation, see the iputils package.

The first problem is that we need to send ICMP packets. Looking at the docs for socket(2), we see it’s easy to create TCP (SOCK_STREAM) or UDP (SOCK_DGRAM) sockets. With ICMP, it’s a bit trickier: we have to use a raw socket (SOCK_RAW), and compose the packets manually.

int icmp_sock = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP);

Composing ICMP packets sounds a bit daunting, but it’s particularly easy in C. All we need to do is allocate a chunk of memory, cast it to the right type, and fill in the fields.

u_char buf[0x10000];
struct icmphdr *icp = (struct icmphdr *)buf;
icp->type = ICMP_ECHO;
icp->code = 0;
icp->checksum = 0;
icp->un.echo.sequence = count;  // an incrementing number
icp->un.echo.id = id;           // something to identify this process
// the first 8 bytes are the ICMP header, the rest is the data
memcpy(buf + 8, data, datalen);
icp->checksum = in_cksum((u_short*)icp, datalen + 8, 0);
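The in_cksum helper is not shown above. As a rough, portable sketch of what such a helper computes, here is the standard Internet checksum from RFC 1071, plus a byte-level echo-request builder. The names inet_checksum and build_echo_request are made up for this example and are not pingcat's. One assumption differs from the snippet above: this version writes the id and sequence fields in network byte order, whereas the snippet stores them as-is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* RFC 1071 Internet checksum over len bytes, summing big-endian 16-bit words.
 * Returns the checksum as a plain number; store it high byte first.
 * The 32-bit accumulator is fine for buffers up to 64 KiB. */
uint16_t inet_checksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)p[i] << 8 | p[i + 1];
    if (i < len)                        /* odd trailing byte, padded with zero */
        sum += (uint32_t)p[i] << 8;
    while (sum >> 16)                   /* fold carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Write an echo request (type 8, code 0) into out.
 * Returns the packet length, or 0 if it does not fit into cap bytes. */
size_t build_echo_request(uint8_t *out, size_t cap, uint16_t id, uint16_t seq,
                          const void *data, size_t datalen)
{
    assert(out != NULL);
    if (cap < 8 || datalen > cap - 8)
        return 0;
    out[0] = 8;                         /* ICMP_ECHO */
    out[1] = 0;                         /* code */
    out[2] = out[3] = 0;                /* checksum is zero while summing */
    out[4] = (uint8_t)(id >> 8);
    out[5] = (uint8_t)(id & 0xff);
    out[6] = (uint8_t)(seq >> 8);
    out[7] = (uint8_t)(seq & 0xff);
    if (datalen > 0)
        memcpy(out + 8, data, datalen);
    uint16_t ck = inet_checksum(out, 8 + datalen);
    out[2] = (uint8_t)(ck >> 8);
    out[3] = (uint8_t)(ck & 0xff);
    return 8 + datalen;
}
```

A handy property for debugging: running inet_checksum over a packet that already carries a correct checksum yields 0.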

We still need to send out the packet. Strictly speaking, all we need to do is write it to the socket, but because we dislike copying data around, we’ll do it the hard way and use scatter-gather network IO.

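struct sockaddr_in addr;  // destination; must be filled in before sending
struct iovec iov = { .iov_base = buf, .iov_len = datalen + 8 };
// designated initializers: the field order of msghdr varies between libcs
struct msghdr hdr = { .msg_name = &addr, .msg_namelen = sizeof(addr),
                      .msg_iov = &iov, .msg_iovlen = 1 };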
sendmsg(icmp_sock, &hdr, 0);
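The snippet above leaves addr unset. One way to fill it in for a dotted-quad IPv4 destination might look like the sketch below; make_dest is a hypothetical helper for this example, not something taken from pingcat, and it deliberately skips hostname resolution.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>

/* Fill *addr for a dotted-quad IPv4 destination.
 * Returns 0 on success, -1 if ip is not a valid IPv4 address. */
int make_dest(const char *ip, struct sockaddr_in *addr)
{
    assert(ip != NULL && addr != NULL);
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;         /* sin_port stays 0: ICMP has no ports */
    return inet_pton(AF_INET, ip, &addr->sin_addr) == 1 ? 0 : -1;
}
```

The resulting structure is what msg_name would point at in the sendmsg call.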

That’s it for the sending side. As for receiving, it’s almost the inverse process: we open a raw socket, read packets from it, and parse them.

int icmp_sock = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP);
u_char buf[0x10000];
struct iovec iov = { buf, sizeof(buf) };
struct iphdr *ip = (struct iphdr *)buf;
struct msghdr msg;
memset(&msg, 0, sizeof(msg));
msg.msg_name = 0;
msg.msg_namelen = 0;
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
// returns -1 with errno set to EAGAIN/EWOULDBLOCK if no packet is waiting
int datalen = recvmsg(icmp_sock, &msg, MSG_DONTWAIT);

Parsing is a bit more complex because we get IP packets, so we first need to extract the ICMP part.

int hlen = ip->ihl*4;
datalen -= hlen;
struct icmphdr *icp = (struct icmphdr *)(buf + hlen);

Next, we need to filter out the noise. Our ICMP socket will get a copy of every ICMP packet received by the host. Since we only care about echo-reply packets, we need to filter by icp->type == ICMP_ECHOREPLY. We also don’t care about packets from other processes, so we need to filter by icp->un.echo.id == id. Since ICMP packets may be reordered, dropped, or duplicated, we also need to inspect icp->un.echo.sequence to make sure it’s the reply to the right packet. Finally, to get to the data, we just skip the 8 bytes of ICMP header: (char*)icp + 8.
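Put together, the parsing and filtering steps could look roughly like this. It is a sketch working on raw bytes rather than struct iphdr and struct icmphdr, so it also shows the bounds checks that the snippets above skip. The name match_echo_reply is invented for the example, the id and sequence fields are assumed to be in network byte order (the sender has to agree), and the ICMP checksum is not verified here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns a pointer to the echo-reply payload inside a received IPv4 packet,
 * or NULL if the packet is not the reply we are waiting for.
 * On success, *paylen is set to the payload length. */
const uint8_t *match_echo_reply(const uint8_t *pkt, size_t len,
                                uint16_t id, uint16_t seq, size_t *paylen)
{
    assert(pkt != NULL && paylen != NULL);
    if (len < 20 || (pkt[0] >> 4) != 4)
        return NULL;                            /* too short, or not IPv4 */
    size_t hlen = (size_t)(pkt[0] & 0x0f) * 4;  /* same as ip->ihl * 4 */
    if (hlen < 20 || len < hlen + 8)
        return NULL;                            /* no room for an ICMP header */

    const uint8_t *icmp = pkt + hlen;
    if (icmp[0] != 0 || icmp[1] != 0)
        return NULL;                            /* not ICMP_ECHOREPLY, code 0 */
    if (((uint16_t)icmp[4] << 8 | icmp[5]) != id)
        return NULL;                            /* some other process's ping */
    if (((uint16_t)icmp[6] << 8 | icmp[7]) != seq)
        return NULL;                            /* stale or duplicate reply */

    *paylen = len - hlen - 8;
    return icmp + 8;                            /* skip the 8-byte ICMP header */
}
```

A caller would loop on recvmsg and keep going until this returns non-NULL or a timeout expires.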

And that’s all there is to it. Of course, the devil’s in the details, as iputils’ ping demonstrates. For pingcat, the difficulty is that we have to implement reliable transmission on top of an unreliable protocol, so we have to deal with the usual issues of missing, reordered, and duplicate packets. To keep the code clear, pingcat does the simple thing and sends data synchronously: it sends a chunk, waits for the reply, and resends the chunk if the reply was corrupted or lost, but it never has more than one chunk in flight.
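The synchronous scheme described above is usually called stop-and-wait. Here is a minimal sketch of that loop with the network hidden behind two callbacks, so it can be run without a raw socket. Everything in it is an assumption made for illustration: the callback types, the max_tries limit, and the demo channel that pretends the first attempt of every chunk gets lost. pingcat's actual loop may well be structured differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical transport hooks; real ones would wrap sendmsg and recvmsg. */
typedef int (*send_fn)(void *ctx, unsigned seq, const void *chunk, size_t len);
/* Returns 1 if a valid reply for seq arrived before a timeout, else 0. */
typedef int (*wait_fn)(void *ctx, unsigned seq);

/* Send data in chunks of at most chunk_size bytes, one chunk in flight at a
 * time. Returns 0 on success, -1 if a send fails or a chunk stays unanswered
 * after max_tries attempts. */
int send_stop_and_wait(const char *data, size_t len, size_t chunk_size,
                       int max_tries, send_fn send_chunk, wait_fn wait_reply,
                       void *ctx)
{
    unsigned seq = 0;

    assert(chunk_size > 0 && max_tries > 0);
    for (size_t off = 0; off < len; seq++) {
        size_t n = len - off < chunk_size ? len - off : chunk_size;
        int acked = 0;
        for (int t = 0; t < max_tries && !acked; t++) {
            if (send_chunk(ctx, seq, data + off, n) != 0)
                return -1;
            acked = wait_reply(ctx, seq);   /* 0 means lost: loop resends */
        }
        if (!acked)
            return -1;
        off += n;
    }
    return 0;
}

/* Demo channel: loses the first attempt of every chunk, delivers retries. */
struct demo_chan { unsigned next_new; int sends; int last_ok; size_t bytes; };

int demo_send(void *ctx, unsigned seq, const void *chunk, size_t len)
{
    struct demo_chan *c = ctx;
    (void)chunk;
    c->sends++;
    if (seq == c->next_new) {           /* first attempt: pretend it was lost */
        c->next_new++;
        c->last_ok = 0;
    } else {                            /* retransmission: delivered */
        c->last_ok = 1;
        c->bytes += len;
    }
    return 0;
}

int demo_wait(void *ctx, unsigned seq)
{
    (void)seq;
    return ((struct demo_chan *)ctx)->last_ok;
}
```

The price of this simplicity is throughput: with one chunk in flight, the sender moves at most one chunk per round trip.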

One last interesting point about echo-reply packets is that they contain the full data sent. This essentially means we can use the network as a storage medium. My back-of-the-envelope calculation goes like this: the ICMP round-trip time between me and www.tv-tokyo.co.jp is about 0.3 s; my upload speed is about 6 Mb/s; this means I can have about 1.8 Mb, or one copy of Hamlet by William Shakespeare, stored in the very links of the Internet at any given time.
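That estimate is the bandwidth-delay product of the path. A tiny sketch of the arithmetic, using integer milliseconds to avoid floating point (bits_in_flight is a name made up for this example):

```c
#include <assert.h>
#include <stdint.h>

/* Bandwidth-delay product: how many bits are in transit at any moment,
 * given a link speed in bits per second and a round-trip time in ms. */
uint64_t bits_in_flight(uint64_t bits_per_sec, uint64_t rtt_ms)
{
    /* guard against overflow in the multiplication below */
    assert(rtt_ms == 0 || bits_per_sec <= UINT64_MAX / rtt_ms);
    return bits_per_sec * rtt_ms / 1000;
}
```

With the figures above, bits_in_flight(6000000, 300) gives 1,800,000 bits, which is 225,000 bytes.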