Belay Lessons: Smarter Web Programming

Posted on 18 December 2011.

This post comes from the keyboard of Matt Carroll, who has worked with us for the past two years. He's the main implementer of desugaring for S5, and spent this semester rebuilding and improving in-house Brown PLT web applications. He writes about his experience here.

The Brown computer science department uses a home-grown web application called Resume to conduct its faculty recruitment process. This semester, Joe and I re-wrote Resume with Belay. Belay is the product of Joe and Arjun's summer research at Google: it's an ongoing inquiry into web development best practices, specifically concerning identity, account management, and security. From my perspective (that of a novice web programmer), getting to grips with the Belay philosophy was a thought-provoking experience, and a great education in the pitfalls that a web developer must (unfortunately) bear in mind.

I Am Not My Cookies

Standard web applications make use of cookies for authentication. When you visit a site and enter your credentials, the site's response sets a session cookie in your browser. Subsequent requests to the site use the information in the cookie to determine 'who you are' and whether 'you' are allowed to do what 'your' request is trying to do. I use quotations in the prior sentence to highlight the fact that HTTP cookies are a poor method of establishing user identity. If another, malicious, web site you visit manages to trick you into sending a request to the original site, that request will contain your cookie, and the good site may treat that request as legitimate and execute it. This is the infamous cross-site request forgery (CSRF) attack.

Belay applications eschew the use of cookies, especially for authentication, and thus they are secure by design against this type of vulnerability. This begs the question: without cookies, how do Belay applications decide whether a request is authenticated? The answer may shock you (as it did me): all requests that reach request handler code are treated as legitimate. At this point, we must examine the server-side of Belay apps in greater detail.

Web Capabilities

Your everyday possibly-CSRF-vulnerable site probably has a URL scheme with well-known endpoints that lead directly to application functionality. For example, to post to your blog, you (typically via your browser) send a POST request to www.blog.com/post with your cookies and the blog body's text. The server-side handler finds your account in the database using your cookie, checks that your account can post to that blog, and adds a new post. If the whole surface of the site's URL space is well-known, a CSRF-ing attacker can excercise the entirety of a user's view of the site with one compromised cookie.

In contrast, Belay applications have few well-known URLs, corresponding to the public entry points to the site (the login and contact pages, for instance). Instead, Belay's libraries allow server-side code to dynamically generate random unique URLs and map them to request handler functions. Each of these handlers services a particular type of request for a particular set of data. The randomly generated "capability" urls are embedded in the JavaScript or markup returned to the browser. In a well-designed Belay application, each page has the minimum necessary set of capabilities to carry out its mission, and the capabilities are scoped to the minimum set of data with which they need concern themselves. After you successfully log in to a Belay site, the response will contain the set of capabilities needed by the page, and scoped to only that data which is needed by the page's functionality and associated with your user account. No cookies are necessary to identify you as a user or to authenticate your requests.

A Belay app uses its limited URL scheme as its primary security mechanism, ignoring requests unless they come along trusted capability URLs created by a prior, explicit grant. As long as we can rely on our platform's ability to generate unguessable large random numbers, attackers are out of luck. And, even if a capability URL is leaked from its page, it is scoped to only a small set of data on the server, so the vulnerability is limited. This is a much-improved situation compared to a site using cookie-based authentication---leaking a cookie leaks access to the user's entire view of the site.

Grants and Cap Handlers

Here's a Belay request handler, taken from Resume:

class GetLetterHandler(bcap.CapHandler):
  def get(self, reference):
    filename = get_letter_filename(reference)
    return file_response(filename, 'letter')

This handler simply looks up the filename associated with a reference and returns it (using a few helper functions). Accessing a letter written by an applicant's reference is quite a sensitive operation---letting the wrong person access a letter would be a serious security bug. Yet, GetLetterHandler is a two-liner with no apparent security checks or guards. How can this be safe?

To answer this, we need to consider how a client can cause GetLetterHandler to be invoked. The Belay server library will only invoke this handler via capability URLs created with a grant to GetLetterHandler. So, we can search the codebase for code that granted such access. A quick search shows one spot:

class GetApplicantsHandler(bcap.CapHandler):
  def get(self, reviewer):
    applicants_json = []
    for applicant in reviewer.get_applicants(): 
      # ... some processing
      refs_json = []
      for ref in applicants.get_references():
        refs_json.append({
          'refName': ref.name,
          'getLetter': bcap.grant(GetLetterHandler, ref))
        })
      # ... add some things to applicants_json

    return bcap.bcapResponse(applicants_json)

When GetApplicantsHandler is invoked, it will return a structure that, for each applicant, shows something like:

{
name: 'Theodore Roosevelt',
getLetter:
'https://resume.cs.brown.edu/cap/f7327056-4b91-ad57-e5e4f6c514b6'
}

On the server, the string f7327056-4b91-ad57-e5e4f6c514b6 was created and mapped to the pair of GetLetterHandler and the Reference database item for Theodore Roosevelt. A GET request to the URL above will return the reference letter. Note a nice feature of this setup: the server doesn't use any information from the client, other than the capability URL, to decide which reference's letter to return. Thus, a client cannot try providing different id's or other parameters to explore which letters they have access to. Only those explicitly granted are accessible.

Poking around in the codebase more, we can see that GetApplicantsHandler is only granted to reviewers, who can only create accounts via an email from the administrator. This reasoning is how we convince ourselves, as developers, that we haven't screwed up and given away the ability to see a letter to the wrong user. We do all of this without worrying about a check on accessing the letter, instead relying on the unguessability of the URLs generated by grant to enforce our access restrictions.

This may seem like a new-concept overload, and indeed, I had that exact reaction at first. Over time I gained familiarity with the Belay style, and I became more and more convinced by the benefits it offers. Porting Resume became a fairly straightforward process of identifying each server-side request handler, converting it to a Belay handler, and ensuring that whatever pages needed that functionality received grants to call the handler. There were wrinkles, many due to the fact that Resume also uses Flapjax (a language/library for reactive programming in the browser). Flapjax is another Brown PLT product and it is certainly worthy of its own blog post. We had to account for the interaction between Belay's client-side library and Flapjax.

Note that Belay isn't the first place these ideas have surfaced. Belay builds on foundational research: Waterken and PLT Web Server both support cookie-less, capability-based web interactions. The Belay project addresses broader goals in identity management and sharing on the web, but we've leveraged its libraries to build a more robust system for ourselves.

At the end, the benefits of the redesigned Resume are numerous. Cookies are no longer involved. JavaScript code doesn't know or care about unique IDs for picking items out of the database. Random HTTP request probes result in a 404 response and a line in the server's log, instead of a possible data corruption. You can open as many tabs as you like, with each one logged into its own Resume account, and experience no unwanted interference. We were able to realize these improvements while re-using a significant portion of the original Resume code, unchanged.

After my experience with the Resume port, I'm certainly a Belay fan. The project has more to say about topics such as cross-site authorization, sharing, and multi-site identity management, so check out their site and stay tuned for future updates:

Belay Research