Superpatterns Pat Patterson on the Cloud, Identity and Single Malt Scotch

13Nov/101

WordPress Tip – Redirect to Main Page on 404

If you're using the Postalicious WordPress plugin to post your del.icio.us links to your blog, you might have noticed that it doesn't always handle the occasional 500 errors from del.icio.us very well, and you end up with a bogus 'links' entry on your blog with a link to '500 Server Error'.

In itself, it's not that big a deal; I usually notice the bogus post pretty quickly and just delete it, but, by then, it's been tweeted by Twitterfeed, pushed to Facebook, and folks have it in their RSS stream, so they hit the 'links for the day' link and get the default '404 page not found' message. In fact, if you ever delete a post for any reason, you're in the same situation - the link is out there, you can't call it back (even if you go delete it from Twitter and Facebook, it's still out there somewhere!), and people are going to land on that ugly page.

So, I got thinking... That default 404 page isn't really good for much... What if I could just send people to the main page of my blog? Well, with a couple of minutes googling I found a useful blog post on the subject and the WordPress docs for get_bloginfo(), and came up with the following replacement for the default 404 page:

<?php
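   // Pass 301 as the response code: a bare "Status:" header is not reliably
   // honoured outside CGI/FastCGI, and Location on its own sends a 302
   header("Location: ".get_bloginfo('url'), true, 301);
   exit;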
?>

You could do fancier things with a JavaScript redirect that shows a 'page not found' message then redirects after a few seconds, but I prefer the more direct approach :-)

1Nov/100

Salesforce.com – Two Weeks In

I'm currently reading 'Behind the Cloud'

It's the end of my second week at Salesforce.com, and I seem to have hit the ground running... A day of orientation, a couple of days working through the Force.com and Chatter developer tutorials, then head down on a guide to Getting Started with the Force.com REST API, published alongside the REST API Developer Preview Webinar last Tuesday (the webinar replay is online now).

The getting started guide featured a sample Java web app that acted as an OAuth 2.0 client, redirecting the user to login at Salesforce.com and obtaining an access token with which to interact with the Force.com REST API. Cool stuff, but there were a couple of questions on the webinar asking how to do the same thing from other languages. It took just a few hours to rework the sample web app, first in Ruby, then in PHP. I've also noticed a .NET implementation, by Dan Boris - cool stuff!

I'm commuting up the peninsula about three days a week on Caltrain, which is working out pretty well - there's a station less than three miles from my house, and I can change to the Baby Bullet in San Jose, with the ride to San Francisco taking about an hour. I actually enjoy the time on the train - I just get my laptop and 3G card out and tap away - in fact, I'm on the train right now, somewhere near Palo Alto. :-)

So - two weeks in, I've published three pieces on *force.com, seen some very cool ISV demos at the second AppQuest judging round, and I'm off to Internet Identity Workshop XI tomorrow. If this sounds like your idea of fun, take a look at the Salesforce.com careers page. Lots of opportunities there, and, if you see something you like, don't forget to tell them that I sent you!

1Nov/102

Bookmarks for October 31st 2010

These are my links for October 31st 2010:

29Oct/100

Bookmarks for October 29th 2010

These are my links for October 29th 2010:

27Oct/100

Bookmarks for October 26th 2010

These are my links for October 26th 2010:

15Oct/1012

Moving On From Huawei

After just over a year at Huawei, it's time to move on... Later today I'll be handing back my Huawei laptop and badge; on Monday I'll be attending orientation at Salesforce.com, where I'll be joining the developer evangelism team.

It's been an interesting and productive year at Huawei - if you've been following my blog, you'll know that I've been doing some pretty low level stuff - Linux kernel drivers and server daemons, and I've learned lots about zero-copy technology and semaphores. All fascinating stuff, working with great people and visiting some cool places, but, still, I missed the interaction with a developer community that I enjoyed as Sun's 'OpenSSO Community Guy'.

As is traditional, I've trawled YouTube for an appropriate song to mark the occasion. I almost settled for Led Zep's Babe I'm Gonna Leave You, but stumbled across the Quivver remix of the same song - a very different take on the classic, and well worth a listen. If you like what you hear, you can pick it up on Perfecto Presents Another World.

14Jul/1016

Semaphores on Linux – sem_init() vs sem_open()

Credit to Denelson83 for the image - click for the original

Regular readers will know that I'm working on a Linux server daemon that, amongst other things, moves data back and forth between sockets and files without it appearing in user space, and even 'tees' that data to a second destination, again without a copy to a user space buffer. Now I have multiple instances of my server running, and they need to synchronize access to shared data structures.

The standard mechanism for this is the semaphore. I won't get into a deep discussion of semaphores here; the Wikipedia article linked in the preceding sentence gives a good description. Basically, if you want to ensure that no more than one thread (ok, 'n' threads in the general case) has access to some resource concurrently, you use a semaphore.
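To make the 'n threads in the general case' point concrete, here is a minimal sketch of the counting behaviour, assuming POSIX semaphores on Linux. The helper name count_acquired() is made up for illustration, and it uses the non-blocking sem_trywait() so that it can run in a single thread; on older glibc you may need to build with -pthread.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Hypothetical helper (not from the original post): try to take 'attempts'
 * slots from a semaphore initialised to 'slots', without blocking.
 * Returns how many attempts succeeded, or -1 if sem_init() failed. */
int count_acquired(unsigned int slots, int attempts)
{
    sem_t sem;
    int acquired = 0;

    assert(attempts >= 0);

    /* pshared = 0: the semaphore is only used inside this process */
    if (sem_init(&sem, 0, slots) < 0)
        return -1;

    for (int i = 0; i < attempts; i++) {
        if (sem_trywait(&sem) == 0)
            acquired++;          /* got a slot; the count drops by one */
        else if (errno != EAGAIN)
            break;               /* unexpected error */
        /* EAGAIN: the count is zero, so a blocking sem_wait() would sleep */
    }

    sem_destroy(&sem);
    return acquired;
}
```

With one slot, only the first of several attempts succeeds, which is the mutual exclusion case; with two slots, two succeed, and so on.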

Looking for an example of semaphores on Linux, I found the aptly named Semaphores in Linux, by Vikram Shukla, on the O'Reilly Linux DevCenter. This is a very useful article, explaining the general semaphore concept and comparing the System V and POSIX semaphore implementations.

Guided by the article, in particular, the 'Related Process' example, which closely matched my use case, I wrote a quick test program using the POSIX sem_init() call to initialize a semaphore and sem_wait()/sem_post() to decrement/increment the semaphore respectively. Only one problem. It didn't work - my processes had concurrent access to the shared resource!

Going back to Vikram's example, and reading the sem_init() man page very carefully, the issue seems to be that the semaphore is a local variable on the stack of the parent process, which is not shared memory. When the child is forked, it gets its own copy of the semaphore, not a reference to the parent's semaphore. Adding a few sleep() and printf() calls to the example highlights the problem:

#include <semaphore.h>
#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>

#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>

int main(int argc, char **argv)
{
  int fd, i,count=0,nloop=10,zero=0,*ptr;
  sem_t mutex;

  //open a file and map it into memory

  fd = open("log.txt",O_RDWR|O_CREAT,S_IRWXU);
  write(fd,&zero,sizeof(int));
  ptr = mmap(NULL,sizeof(int),PROT_READ |PROT_WRITE,MAP_SHARED,fd,0);
  close(fd);

  /* create, initialize semaphore */
  if( sem_init(&mutex,1,1) < 0)
    {
      perror("semaphore initialization");
      exit(1);
    }
  if (fork() == 0) { /* child process*/
    for (i = 0; i < nloop; i++) {
      sem_wait(&mutex);
      printf("child entered critical section: %d\n", (*ptr)++);
      sleep(2);
      printf("child leaving critical section\n");
      sem_post(&mutex);
      sleep(1);
    }
    exit(0);
  }
  /* back to parent process */
  for (i = 0; i < nloop; i++) {
    sem_wait(&mutex);
    printf("parent entered critical section: %d\n", (*ptr)++);
    sleep(2);
    printf("parent leaving critical section\n");
    sem_post(&mutex);
    sleep(1);
  }
  exit(0);
}

Running this shows that both the parent and the child are in the critical section at the same time:

child entered critical section: 0
parent entered critical section: 1
parent leaving critical section
child leaving critical section
parent entered critical section: 2
child entered critical section: 3
...

The explanation is in the sem_init() man page:

If pshared is non-zero, then the semaphore is shared between processes, and should be located in a region of shared memory (see shm_open(3), mmap(2), and shmget(2)). (Since a child created by fork(2) inherits its parent's memory mappings, it can also access the semaphore.) Any process that can access the shared memory region can operate on the semaphore using sem_post(3), sem_wait(3), etc.

The key here is that the semaphore must be in a region of shared memory, even if you're accessing it from related processes such as a parent and its child.

There are two ways of fixing the problem. The first is to use shm_open(), ftruncate() and mmap() to create a shared memory region and obtain a pointer to it:

  int shm;
  sem_t * mutex;

  ...

  if ((shm = shm_open("myshm", O_RDWR | O_CREAT, S_IRWXU)) < 0) {
    perror("shm_open");
    exit(1);
  }

  if ( ftruncate(shm, sizeof(sem_t)) < 0 ) {
    perror("ftruncate");
    exit(1);
  }

  if ((mutex = mmap(NULL, sizeof(sem_t), PROT_READ | PROT_WRITE,
      MAP_SHARED, shm, 0)) == MAP_FAILED) {
    perror("mmap");
    exit(1);
  }

  if (sem_init(mutex, 1, 1) < 0) {
    perror("semaphore initialization");
    exit(1);
  }

  ...
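The fragment above leaves out the fork and the cleanup, so here is a rough, self-contained sketch of the whole shm_open() approach, assuming Linux. The struct, the helper name run_shared_counter() and the object name "/myshm-example" are assumptions for illustration, not part of the original test program; the sleeps and printfs are replaced by a shared counter so the result can be checked. On older glibc, build with -pthread -lrt.

```c
#include <assert.h>
#include <fcntl.h>
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

/* Everything both processes touch lives in the shared region */
struct shared {
    sem_t mutex;
    int counter;
};

/* Hypothetical helper: parent and child each add 1 to a shared counter
 * 'nloop' times under the semaphore. Returns the final count or -1. */
int run_shared_counter(int nloop)
{
    const char *name = "/myshm-example";   /* assumed name */
    struct shared *sh;
    int shm, i, result;
    pid_t pid;

    assert(nloop >= 0);

    if ((shm = shm_open(name, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR)) < 0)
        return -1;
    shm_unlink(name);   /* the name is only needed to get the descriptor */

    if (ftruncate(shm, sizeof(*sh)) < 0) {
        close(shm);
        return -1;
    }
    sh = mmap(NULL, sizeof(*sh), PROT_READ | PROT_WRITE, MAP_SHARED, shm, 0);
    close(shm);         /* the mapping stays valid after close() */
    if (sh == MAP_FAILED)
        return -1;

    sh->counter = 0;
    if (sem_init(&sh->mutex, 1, 1) < 0) {  /* pshared = 1, initial value 1 */
        munmap(sh, sizeof(*sh));
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        sem_destroy(&sh->mutex);
        munmap(sh, sizeof(*sh));
        return -1;
    }
    if (pid == 0) {                        /* child */
        for (i = 0; i < nloop; i++) {
            sem_wait(&sh->mutex);
            sh->counter++;
            sem_post(&sh->mutex);
        }
        _exit(0);
    }
    for (i = 0; i < nloop; i++) {          /* parent */
        sem_wait(&sh->mutex);
        sh->counter++;
        sem_post(&sh->mutex);
    }
    waitpid(pid, NULL, 0);

    result = sh->counter;
    sem_destroy(&sh->mutex);
    munmap(sh, sizeof(*sh));
    return result;
}
```

Because the semaphore itself sits inside the mapped region, the child's sem_wait() and the parent's sem_wait() operate on the same object, and the final count should be exactly twice nloop.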

The other, simpler, solution is to just use sem_open(), which Vikram describes in the next section of the article:

  if ((mutex = sem_open("mysemaphore", O_CREAT, 0644, 1)) == SEM_FAILED) {
    perror("semaphore initialization");
    exit(1);
  }
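One thing to watch with sem_open(): if a semaphore with that name already exists, O_CREAT opens it as it is and the mode and initial value are ignored, so a semaphore left behind by a crashed run can start at zero. The following is a hedged sketch of one way to guard against that, assuming Linux; the helper name run_named_semaphore(), the name "/mysemaphore-example" and the anonymous shared mapping used for the counter are illustrative assumptions, not from the original test program.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <fcntl.h>
#include <semaphore.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: parent and child each add 1 to a shared counter
 * 'nloop' times under a named semaphore. Returns the final count or -1. */
int run_named_semaphore(int nloop)
{
    const char *name = "/mysemaphore-example";   /* assumed name */
    sem_t *mutex;
    int *counter;
    int i, result;
    pid_t pid;

    assert(nloop >= 0);

    sem_unlink(name);   /* drop any stale semaphore; failure is fine here */
    mutex = sem_open(name, O_CREAT | O_EXCL, 0644, 1);
    if (mutex == SEM_FAILED)
        return -1;

    /* The counter still needs shared memory; the semaphore does not */
    counter = mmap(NULL, sizeof(*counter), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (counter == MAP_FAILED) {
        sem_close(mutex);
        sem_unlink(name);
        return -1;
    }
    *counter = 0;

    pid = fork();
    if (pid < 0) {
        munmap(counter, sizeof(*counter));
        sem_close(mutex);
        sem_unlink(name);
        return -1;
    }
    if (pid == 0) {                    /* child inherits the open semaphore */
        for (i = 0; i < nloop; i++) {
            sem_wait(mutex);
            (*counter)++;
            sem_post(mutex);
        }
        sem_close(mutex);
        _exit(0);
    }
    for (i = 0; i < nloop; i++) {      /* parent */
        sem_wait(mutex);
        (*counter)++;
        sem_post(mutex);
    }
    waitpid(pid, NULL, 0);

    result = *counter;
    munmap(counter, sizeof(*counter));
    sem_close(mutex);
    sem_unlink(name);                  /* remove the name when finished */
    return result;
}
```

Unrelated processes would open the semaphore by name instead of inheriting it across fork(), in which case only the last one out should call sem_unlink().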

Either of these approaches gives the desired result:

child entered crit section: 0
child leaving crit section
parent entered crit section: 1
parent leaving crit section
child entered crit section: 2
child leaving crit section
parent entered crit section: 3
...

Postscript: this is a minor flaw in an otherwise excellent and very useful article. I address it here, rather than in a comment on the article, due to the amount of space required for a full explanation.

8Jul/100

A Cup of tee() and a splice() of Cake

Credit to dajobe for the photo - click the image for the original

Apologies for the terrible pun in the title - I just couldn't resist :-)

I was hard at work on my current project the other day, a user-mode Linux server daemon, when I realized that I would need to both copy incoming data to disk and forward it to another daemon via a socket. This caused me a moment's consternation, since I was using splice() to move incoming data from a socket to a file without needing an intermediate copy in a user-mode buffer, but then I remembered mention of tee(), a companion to splice().

Where splice() moves data directly from a socket (or file) to a pipe (or vice versa), tee() copies data from one pipe to another leaving the data intact in the source pipe. You can then use splice() again to move the data from tee()'s destination pipe to another file descriptor.

It was the work of a few minutes to code up a quick sample app to test this. Since it's short, and there seems to be a dearth of tee()/splice() examples, here it is in its entirety:

#define _GNU_SOURCE // needed for splice
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void test_splice(int in, int out, int out2, int number_of_bytes) {
    int rcvd = 0, sent = 0, teed = 0, remaining = number_of_bytes;
    int pipe1[2], pipe2[2];

    if (pipe(pipe1) < 0) {
        perror("pipe");
        exit(EXIT_FAILURE);
    }

    if (pipe(pipe2) < 0) {
        perror("pipe");
        exit(EXIT_FAILURE);
    }

    while (remaining > 0) {
        if ((rcvd = splice(in, NULL, pipe1[1], NULL, remaining,
                SPLICE_F_MORE | SPLICE_F_MOVE)) < 0) {
            perror("splice");
            exit(EXIT_FAILURE);
        }

        if (rcvd == 0) {
            printf("Reached end of input file\n");
            break;
        }

        printf("Wrote %d bytes to pipe1\n", rcvd);

        if ((teed = tee(pipe1[0], pipe2[1], rcvd, 0)) < 0) {
            perror("tee");
            exit(EXIT_FAILURE);
        }

        printf("Copied %d bytes from pipe1 to pipe2\n", teed);

        if ((sent = splice(pipe1[0], NULL, out, NULL, rcvd, SPLICE_F_MORE
                | SPLICE_F_MOVE)) < 0) {
            perror("splice");
            exit(EXIT_FAILURE);
        }

        printf("Read %d bytes from pipe1\n", sent);

        if ((sent = splice(pipe2[0], NULL, out2, NULL, teed, SPLICE_F_MORE
                | SPLICE_F_MOVE)) < 0) {
            perror("splice");
            exit(EXIT_FAILURE);
        }

        printf("Read %d bytes from pipe2\n", sent);

        remaining -= rcvd;
    }
}

int main(int argc, char *argv[]) {
    int infile, outfile1, outfile2, number_of_bytes = INT_MAX;

    if (argc < 4) {
        fprintf(stderr,
                "Usage: %s infile outfile1 outfile2 [number_of_bytes]\n",
                argv[0]);
        exit(EXIT_FAILURE);
    }

    if ((infile = open(argv[1], O_RDONLY)) < 0) {
        fprintf(stderr, "Can't open %s for reading\n", argv[1]);
        exit(EXIT_FAILURE);
    }

    if ((outfile1 = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0) {
        fprintf(stderr, "Can't create/open %s for writing\n", argv[2]);
        exit(EXIT_FAILURE);
    }

    if ((outfile2 = open(argv[3], O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0) {
        fprintf(stderr, "Can't create/open %s for writing\n", argv[3]);
        exit(EXIT_FAILURE);
    }

    if (argc > 4) {
        number_of_bytes = atoi(argv[4]);
    }

    test_splice(infile, outfile1, outfile2, number_of_bytes);

    return EXIT_SUCCESS;
}

The example doesn't need much explanation; I added the 'number_of_bytes' parameter so that you can copy a limited amount of data from an infinite source such as /dev/zero or /dev/urandom. Note that a 'real' implementation needs a bit more code, since it's not safe to assume that all the bytes get moved to the destination pipes in one hit, but that would obscure the example :-)
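As a rough idea of what that 'bit more code' might look like, here is a minimal sketch of a loop that keeps calling splice() until the requested number of bytes has moved. The name splice_all() is hypothetical, and the sketch assumes blocking file descriptors, so it retries on EINTR but treats anything else as an error.

```c
#define _GNU_SOURCE   /* needed for splice() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: move 'len' bytes from 'from' to 'to', where at least
 * one of the two is a pipe. Returns the number of bytes moved, which is
 * less than 'len' only if the input ended early, or -1 on error. */
ssize_t splice_all(int from, int to, size_t len)
{
    size_t done = 0;

    assert(from >= 0 && to >= 0);

    while (done < len) {
        ssize_t n = splice(from, NULL, to, NULL, len - done,
                           SPLICE_F_MORE | SPLICE_F_MOVE);
        if (n < 0) {
            if (errno == EINTR)
                continue;    /* interrupted before moving data: retry */
            return -1;
        }
        if (n == 0)
            break;           /* end of input */
        done += (size_t)n;   /* partial transfer: go round again */
    }
    return (ssize_t)done;
}
```

In test_splice() above, each of the two splice() calls that drain pipe1 and pipe2 could be replaced by a call like splice_all(pipe1[0], out, rcvd). The tee() call would need a similar loop of its own, since it can also copy fewer bytes than requested.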

1Jun/1042

Zero-Copy in Linux with sendfile() and splice()

After my recent excursion to Kernelspace, I'm back in Userland working on a server process that copies data back and forth between a file and a socket. The traditional way to do this is to copy data from the source file descriptor to a buffer, then from the buffer to the destination file descriptor - like this:

// do_read and do_write are simple wrappers on the read() and 
// write() functions that keep reading/writing the file descriptor
// until all the data is processed.
do_read(source_fd, buffer, len);
do_write(destination_fd, buffer, len);
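The do_read and do_write wrappers aren't shown in this post. A minimal sketch of what such wrappers typically look like, assuming blocking file descriptors and the argument order used above (the return conventions here are an assumption):

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Sketch: read until 'len' bytes have arrived or the input ends.
 * Returns the number of bytes read (short only at end of input) or -1. */
ssize_t do_read(int fd, void *buffer, size_t len)
{
    char *p = buffer;
    size_t done = 0;

    assert(buffer != NULL);

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;    /* interrupted: retry */
            return -1;
        }
        if (n == 0)
            break;           /* end of file, or peer closed the socket */
        done += (size_t)n;
    }
    return (ssize_t)done;
}

/* Sketch: write until all 'len' bytes have been accepted.
 * Returns 'len' on success or -1 on error. */
ssize_t do_write(int fd, const void *buffer, size_t len)
{
    const char *p = buffer;
    size_t done = 0;

    assert(buffer != NULL);

    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;    /* interrupted: retry */
            return -1;
        }
        done += (size_t)n;   /* short write: send the rest next time round */
    }
    return (ssize_t)done;
}
```

Every byte passes through 'buffer' twice here, once on the way out of the kernel and once on the way back in.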

While this is very simple and straightforward, it is somewhat inefficient - we are copying data from the kernel buffer for source_fd into a buffer located in user space, then immediately copying it from that buffer to the kernel buffers for destination_fd. We aren't examining or altering the data in any way - buffer is just a bit bucket we use to get data from a socket to a file or vice versa. While working on this code, a colleague clued me in to a better way of doing this - zero-copy.

As its name implies, zero-copy allows us to operate on data without copying it, or, at least, by minimizing the amount of copying going on. Zero Copy I: User-Mode Perspective describes the technique, with some nice diagrams and a description of the sendfile() system call.

Rewriting my example above with sendfile() gives us the following:

ssize_t do_sendfile(int out_fd, int in_fd, off_t offset, size_t count) {
    ssize_t bytes_sent;
    size_t total_bytes_sent = 0;
    while (total_bytes_sent < count) {
        if ((bytes_sent = sendfile(out_fd, in_fd, &offset,
                count - total_bytes_sent)) <= 0) {
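            if (bytes_sent == 0) {
                // Nothing more to send (e.g. offset is at end of file);
                // errno is not set in this case, so stop here
                break;
            }
            if (errno == EINTR || errno == EAGAIN) {
                // Interrupted system call/try again
                // Just skip to the top of the loop and try again
                continue;
            }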
            perror("sendfile");
            return -1;
        }
        total_bytes_sent += bytes_sent;
    }
    return total_bytes_sent;
}

//...

// Send 'len' bytes starting at 'offset' from 'file_fd' to 'socket_fd'
do_sendfile(socket_fd, file_fd, offset, len);

Now, as the man page states, there's a limitation here: "Presently (Linux 2.6.9 [and, in fact, as of this writing in June 2010]): in_fd, must correspond to a file which supports mmap()-like operations (i.e., it cannot be a socket); and out_fd must refer to a socket." So, we can only use sendfile() for reading data from our file and sending it to the socket.

It turns out that sendfile() significantly outperforms read()/write() - I was seeing about 8% higher throughput on a fairly informal read test. Great stuff, but our write operations are still bouncing unnecessarily through userland. After some googling around, I came across splice(), which turns out to be the primitive underlying sendfile(). An lkml thread back in 2006 carries a detailed explanation of splice() from Linus himself, but the basic gist is that splice() allows you to move data between kernel buffers (via a pipe) with no copy to userland. It's a more primitive (and therefore flexible) system call than sendfile(), and requires a bit of wrapping to be useful - here's my first attempt to write data from a socket to a file:


// Our pipe - a pair of file descriptors in an array - see pipe()
static int pipefd[2];

//...

ssize_t do_recvfile(int out_fd, int in_fd, off_t offset, size_t count) {
    ssize_t bytes, bytes_sent, bytes_in_pipe;
    size_t total_bytes_sent = 0;

    // Splice the data from in_fd into the pipe
    while (total_bytes_sent < count) {
        if ((bytes_sent = splice(in_fd, NULL, pipefd[1], NULL,
                count - total_bytes_sent, 
                SPLICE_F_MORE | SPLICE_F_MOVE)) <= 0) {
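            if (bytes_sent == 0) {
                // End of input (e.g. the peer closed the socket);
                // errno is not set in this case, so stop here
                break;
            }
            if (errno == EINTR || errno == EAGAIN) {
                // Interrupted system call/try again
                // Just skip to the top of the loop and try again
                continue;
            }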
            perror("splice");
            return -1;
        }

        // Splice the data from the pipe into out_fd
        bytes_in_pipe = bytes_sent;
        while (bytes_in_pipe > 0) {
            if ((bytes = splice(pipefd[0], NULL, out_fd, &offset, bytes_in_pipe,
                    SPLICE_F_MORE | SPLICE_F_MOVE)) <= 0) {
                if (errno == EINTR || errno == EAGAIN) {
                    // Interrupted system call/try again
                    // Just skip to the top of the loop and try again
                    continue;
                }
                perror("splice");
                return -1;
            }
            bytes_in_pipe -= bytes;
        }
        total_bytes_sent += bytes_sent;
    }
    return total_bytes_sent;
}

//...

// Setup the pipe at initialization time
if ( pipe(pipefd) < 0 ) {
    perror("pipe");
    exit(1);
}

//...

// Send 'len' bytes from 'socket_fd' to 'offset' in 'file_fd'
do_recvfile(file_fd, socket_fd, offset, len);

This almost worked on my system, and it may work fine on yours, but there is a bug in kernel 2.6.31 that makes the first splice() call hang when you ask for all of the data on the socket. The Samba guys worked around this by simply limiting the data read from the socket to 16k. Modifying our first splice call similarly fixes the issue:

    // MIN() comes from <sys/param.h> on Linux
    if ((bytes_sent = splice(in_fd, NULL, pipefd[1], NULL,
            MIN(count - total_bytes_sent, 16384),
            SPLICE_F_MORE | SPLICE_F_MOVE)) <= 0) {

I haven't benchmarked the 'write' speed yet, but, on reads, splice() performed just a little slower than sendfile(), which I attribute to the additional user/kernel context switching. It was still, again, significantly faster than read()/write().

As is often the case, I'm merely standing on the shoulders of giants here, collating hints and fragments, but I hope you find this post useful!

7May/100

Bookmarks for May 6th 2010

These are my links for May 6th 2010: