OAuth and JavaScript

A couple of days ago, Alex Bilbie posted OAuth and JavaScript about why you shouldn’t use OAuth APIs directly from JavaScript applications.

I may have missed the point — and that’s a distinct possibility — but I’m not sure I agree.

As far as I can tell, the security for web applications using OAuth 2 comes entirely from the redirect URL. There’s no way an attacker can get any tokens sent to them without either:

  1. being able to intercept and decrypt the HTTPS traffic, or
  2. being able to fiddle with the DNS of the client to get it to request the URL for the redirect from a web server under the attacker’s control (and have a valid HTTPS certificate for that domain)

In either of these situations, you probably have bigger problems.

In our pure JS web app world, we store the access token in the browser. Were an attacker to get hold of it, they’d be able to do anything the user has granted permission for the original client to do. Alex’s post suggests that you should use a thin API proxy that hides away all the OAuth stuff, and the client then authenticates against that with an encrypted cookie. But in each case, the exposure is the same. If an attacker can get access to local storage, they can get access to the cookie. (Okay, not quite true. If there’s a JavaScript injection vulnerability in your web app, the cookie can still be protected with the HttpOnly flag. But the attacker can access all the data that comes back, so you’re not that much better off.)

The client secret is a red herring when it comes to web applications. The user can extract it, as can anyone else who wants to use it. It’s pointless in this context, and that’s presumably why the implicit grant exists. The implicit grant still has the protection of the redirect URL, and that’s enough.
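To make the redirect URL argument concrete, here is a minimal sketch of the check an authorization server might perform before sending a token anywhere. The registry, client ID and URLs are made up for illustration; a real server would look them up from wherever it stores client registrations.

```python
# Hypothetical client registry: the client ID and URLs are illustrative only.
REGISTERED_REDIRECT_URIS = {
    "example-js-client": {"https://client.example.com/oauth/return"},
}


def is_allowed_redirect(client_id, redirect_uri):
    """Return True only if redirect_uri exactly matches one registered for the client."""
    return redirect_uri in REGISTERED_REDIRECT_URIS.get(client_id, set())
```

Because the comparison is an exact match, an attacker who knows the client ID (and even the client secret) still can’t have the token delivered to a URL of their choosing.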

Posted in Uncategorized | 4 Comments

Cross-domain OAuth2 in JavaScript, and Internet Explorer 8

I apologise for the rant that follows.

I’m building a pure JavaScript web application that needs the user to authenticate to access a non-public API. The cool part of all this is that we can loosely couple the API and the UI presented through the client; they can be hosted on separate domains, and there can be multiple versions of the client floating around the web.

To do this, I’ve implemented OAuth2 on the server side, and I’m using CORS from JavaScript to prod the API. For a previously unauthenticated user, the whole thing looks like this:

  1. User visits the client
  2. Client attempts to access the protected resource with an Accept header to match the type of content they want back (e.g. XML or JSON)
  3. The server responds with a 401 Unauthorized response, and a WWW-Authenticate header giving details of how to authenticate using OAuth2. It also includes the right Access-Control-Allow-* headers so the client can get at the headers and make non-simple CORS requests (if the client can’t see the headers or the response, it assumes a 401)
  4. The client catches the 401 and sends the user off to the OAuth2 authorize endpoint
  5. The authorize endpoint needs the user to be authenticated, so redirects them to a login page with a query parameter so they can get back again
  6. The authorize endpoint then asks the user whether they want to grant permission to the client
  7. The user agrees, and the service sends them back to the OAuth2 return URL for the client with a token
  8. The client takes that token and exchanges it for access and refresh tokens using the token endpoint
  9. The client can then replay the original request, but this time it attaches the access token
  10. The API can now use the provided token to authorize the request on behalf of the user, and grants access to the resource
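Step 4 amounts to building a URL and pointing the browser at it. The real client does this in JavaScript; the sketch below uses Python purely to show the shape of the request, and assumes the authorization code grant from the OAuth 2 specification (which matches the exchange in step 8). The endpoint, client ID and return URL are placeholders, not the real service’s.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_authorize_url(authorize_endpoint, client_id, redirect_uri, state):
    """Build the URL the user is sent to in order to grant access (step 4)."""
    params = {
        "response_type": "code",       # ask for a code to exchange in step 8
        "client_id": client_id,
        "redirect_uri": redirect_uri,  # where the service sends the user back (step 7)
        "state": state,                # opaque value echoed back to the client
    }
    return authorize_endpoint + "?" + urlencode(params)


# Placeholder values, for illustration only.
url = build_authorize_url(
    "https://api.example.com/oauth/authorize",
    "example-js-client",
    "https://client.example.com/oauth/return",
    "xyz",
)
query = parse_qs(urlsplit(url).query)
```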

Simples. So what’s the problem?

Notice that in both steps 3 and 5, the remote service is presented with a request that needs the agent to be authenticated. In the first instance it’s a robot making the request, so it responds with a 401 Unauthorized. In the second it’s a human using a browser, so it responds with a 302 Found redirect to the login page. How does it tell them apart?

So far, the service has been using the Accept header. Requests made by the client use XMLHttpRequest, and say “XML please!”. Requests made directly by the browser have the browser-default Accept header that says “HTML please, or whatever else you’ve got available”. It would be fair to assume that requests for HTML are being made to be shown to humans, and so a login form is most appropriate.

Internet Explorer is where this all falls down. Firstly, it doesn’t support CORS until IE10. Instead, it uses XDomainRequest, which is intended to have feature-parity with HTML forms (e.g. only GET or POST, can’t set or view headers). Secondly, IE8’s Accept header doesn’t ask for HTML, but rather “I’ll have a Word document, or an Excel spreadsheet, or a JPEG image, or …, or whatever else you’ve got”.

This all means that we can’t tell the two types of request apart without doing something bespoke and non-standard. It also means that instead of being sent to the login page, the user gets a page saying “You need to authenticate” but no obvious way to do so.
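Here’s a rough sketch of the Accept-header heuristic and how it misfires. It isn’t the service’s actual code, the function names are my own, and the header values are illustrative rather than captured from real browsers.

```python
def wants_html(accept_header):
    """Crude check: does the Accept header name text/html explicitly?"""
    media_types = [part.split(";")[0].strip().lower()
                   for part in accept_header.split(",")]
    return "text/html" in media_types


def unauthenticated_response(accept_header):
    """Status a server using this heuristic would send to an unauthenticated agent."""
    return "302 Found" if wants_html(accept_header) else "401 Unauthorized"


# Illustrative header values only.
TYPICAL_BROWSER = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
API_CLIENT = "application/xml"
IE8_LIKE = "image/jpeg, application/msword, application/vnd.ms-excel, */*"
```

With these values the typical browser gets the redirect and the API client gets the 401, but the IE8-like header never mentions text/html, so a human using that browser is treated as a robot.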

It’s all a bit of a mess.

Posted in Uncategorized | 1 Comment

Accessing SharePoint with WebDAV, without files being returned as HTML

When mounting a SharePoint form library using gvfs over WebDAV, files stored as XML on the server are returned as the HTML presented to browsers. Other WebDAV clients on Mac and Windows successfully retrieve XML versions. What’s going on‽

Well, SharePoint supports a proprietary Translate: f request header to tell the server that it should return files verbatim. This is sent by Mac and Windows WebDAV clients, but not gvfs.

To test using curl, compare:

curl https://sharepoint.example.com/path/to/site/Library/foo.xml \
    -u"username:password"

and

curl https://sharepoint.example.com/path/to/site/Library/foo.xml \
    -u"username:password" -H"Translate: f"

I’ve created bug 688045 in the GNOME Bugzilla to ask that gvfs add this header to all requests. In the meantime, we’ll probably have to proxy SharePoint and add the header ourselves, or patch gvfs. Yay.
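If you’re scripting against SharePoint yourself, the same header is easy to add. A minimal sketch using Python 3’s standard library, which only builds the request: authentication and actually sending it are left out, and the URL is the same placeholder as in the curl examples.

```python
from urllib.request import Request


def verbatim_request(url):
    """Build a GET request asking for the stored file rather than rendered HTML."""
    return Request(url, headers={"Translate": "f"})


request = verbatim_request(
    "https://sharepoint.example.com/path/to/site/Library/foo.xml")
```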

Posted in SharePoint and Exchange | 1 Comment

Proxying HTTPS sites for debugging using Wireshark

I’m currently trying to work out how to poke the University’s SharePoint instance to get data out in useful formats. Often, it’s useful to see how other tools (e.g. WebDAV clients and SharePoint Designer) do it. As SharePoint is (sensibly) only available over HTTPS, I’ve had to set up a local Apache instance to act as a proxy.

Here’s the config:

SSLProxyEngine on
ProxyRequests on
ProxyPass / https://sharepoint.nexus.ox.ac.uk/
ProxyPassReverse / https://sharepoint.nexus.ox.ac.uk/
ProxyPassReverseCookieDomain .nexus.ox.ac.uk localhost
Header edit Set-Cookie secure ""

Line-by-line (almost):

SSLProxyEngine on
ProxyRequests on
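SSLProxyEngine lets Apache proxy to HTTPS sites. ProxyRequests enables forward proxying, which ProxyPass doesn’t actually need; it’s best switched off anywhere other than a private debugging box, as it can leave you running an open proxy.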
ProxyPass / https://sharepoint.nexus.ox.ac.uk/
ProxyPassReverse / https://sharepoint.nexus.ox.ac.uk/
ProxyPass maps everything under / onto the remote server (making this a reverse proxy), and ProxyPassReverse makes sure that redirects are rewritten.
ProxyPassReverseCookieDomain .nexus.ox.ac.uk 192.168.122.1
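Rewrites the domain on cookies set using the Set-Cookie header so that they’ll be sent on subsequent requests. The IP address is that of the host on my virtual machine bridge network; the listing above uses localhost instead, so use whichever name your client connects to.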
Header edit Set-Cookie secure ""
Removes the “secure” flag from cookies, so that clients send them on the HTTP leg to the Apache server.

This allows me to run clients against the local URL with traffic in the clear for me to snoop using Wireshark. It’s worked for gvfs, now to check SharePoint Designer (running on a local Windows 7 VM).
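To see what the Header edit line does to a cookie, here’s a rough Python equivalent. It assumes, as I understand mod_headers, that edit treats its argument as a case-sensitive regular expression and replaces only the first match; the cookie values are made up.

```python
import re


def strip_secure_once(set_cookie_value):
    """Mimic `Header edit Set-Cookie secure ""`: drop the first match of 'secure'."""
    return re.sub(r"secure", "", set_cookie_value, count=1)


example = strip_secure_once("session=abc123; path=/; secure; HttpOnly")
# A cookie whose *name* contains "secure" is mangled instead, and keeps its flag.
mangled = strip_secure_once("secure_id=1; secure")
```

The pattern is blunt: it leaves an empty attribute behind, and it will happily eat the word out of a cookie name or value. That’s fine for a throwaway debugging proxy, but worth knowing about if a client misbehaves.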

Posted in SharePoint and Exchange | Leave a comment

Programmatic access to Exchange 2010 using EWS, SOAP, and Python

I’ve previously blogged about accessing Exchange (2007) using suds and Python. Turns out that things have changed slightly in Exchange 2010, so here’s an update.

First, you’ll need to use Alex Koshelev’s EWS-specific fork of suds, which you can grab from BitBucket. Next, you’ll need code a little like this:

import urllib2

from suds.client import Client
from suds.sax.element import Element
from suds.transport.http import HttpTransport

class Transport(HttpTransport):
    def __init__(self, **kwargs):
        realm, uri = kwargs.pop('realm'), kwargs.pop('uri')
        HttpTransport.__init__(self, **kwargs)
        self.handler = urllib2.HTTPBasicAuthHandler()
        self.handler.add_password(realm=realm,
                                  user=self.options.username,
                                  passwd=self.options.password,
                                  uri=uri)
        self.urlopener = urllib2.build_opener(self.handler)

transport = Transport(realm='nexus.ox.ac.uk',
                      uri='https://nexus.ox.ac.uk/',
                      username='abcd0123',
                      password='secret')
client = Client("https://nexus.ox.ac.uk/EWS/Services.wsdl",
                transport=transport)

ns = ('t', 'http://schemas.microsoft.com/exchange/services/2006/types')
soap_headers = Element('RequestServerVersion', ns=ns)
soap_headers.attributes.append('Version="Exchange2010_SP1"')
client.set_options(soapheaders=soap_headers)

address = client.factory.create('t:EmailAddress')
address.Address = 'first.last@unit.ox.ac.uk'

client.service.GetUserOofSettings(address)

Differences from the previous post are:

  • Passing SOAP headers seems to be necessary in some circumstances. This post on StackOverflow came in handy in working out what to do. The MSDN documentation (e.g. this page about GetRoomLists) tells you which SOAP headers you can send.
  • Namespaces seem to be working better. We’ve created a t:EmailAddress element this time, not a ns1:EmailAddress.
  • The patch mentioned in my previous blog post has been applied, so there’s now no need to apply it yourself.
Posted in Exchange Web Services, SharePoint and Exchange | Leave a comment

Cookie-like behaviour, without cookies

I was at the University’s Webmasters’ Workshop event at the OeRC on Friday, and got talking to Dan Q of the Bodleian Libraries about the soon-to-be-enforced ‘cookie law’. We realised that it’s possible to achieve cookie-like behaviour without actually setting a cookie. We’d initially thought that this would circumvent the ‘cookie law’, but having looked at the text of the legislation as quoted in the ICO’s guidance on cookies it appears that this cookie-less approach would also be unlawful, and is certainly against the spirit of the law. I present the idea here as a thought experiment, and to point out that one might need to be careful before implementing any ‘workarounds’ to continue to track visitors.

The idea

A cookie is simply an arbitrary bit of data handed to a browser that it will then hand back on subsequent requests. The cookie can be used to store a (semi-)permanent identifier that can be used to track the user, and it’s this functionality we want to duplicate.

In this approach, each page on a site pulls in a bit of JavaScript that uses XMLHttpRequest to retrieve /track/. This returns a never-expiring 301 Moved Permanently response with a redirect to a URL containing a tracking identifier, say /track/sgnklsfg/. The browser retrieves this URL, and receives another never-expiring document. The document is a bit of XML containing the identifier, which can be retrieved from the original XMLHttpRequest object.

This uses the browser’s caching to maintain the identifier unchanged indefinitely. With the onset of Cross-Origin Resource Sharing, this would also allow the site owner to track users across domains. Dan Q also reckons it could be used to implement a shim around Google Analytics to eschew the use of cookies, which would be useful were the cookie law only about cookies.
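Purely to illustrate the thought experiment, the server side might look something like the sketch below. The handler name, the XML shape and the one-year max-age (standing in for “never-expiring”) are all my own assumptions, and it returns plain tuples rather than plugging into any particular web framework.

```python
import uuid

# One year, as an arbitrary stand-in for a "never-expiring" lifetime.
FOREVER = "public, max-age=31536000"


def handle_track(path, new_id=lambda: uuid.uuid4().hex):
    """Return (status, headers, body) for the two hypothetical /track/ URLs."""
    if path == "/track/":
        # First visit: mint an identifier and redirect to a URL containing it.
        location = "/track/%s/" % new_id()
        headers = {"Location": location, "Cache-Control": FOREVER}
        return "301 Moved Permanently", headers, ""
    if path.startswith("/track/") and path.endswith("/"):
        # Cached follow-up: echo the identifier from the URL back as XML.
        identifier = path[len("/track/"):-1]
        headers = {"Content-Type": "application/xml", "Cache-Control": FOREVER}
        return "200 OK", headers, "<id>%s</id>" % identifier
    return "404 Not Found", {}, ""
```

Once the browser has cached both responses, later requests for /track/ never reach the server, so the identifier stays the same for as long as the cache entries survive.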

Update: Dave King points out that similar functionality could be achieved using web storage.

Further update: The redirect is probably unnecessary. There’s also the possibility that the cached resource containing the identifier might drop off the bottom of the browser cache after a relatively short time. In this case, Dave’s suggestion is probably a more reliable way to track a user.

The legislation

The law is complicated, and I am not a lawyer. This is my interpretation of the law, and it is liable to differ from that of professionals.

The relevant section of the Privacy and Electronic Communications (EC Directive) Regulations 2003, as amended, is:

    A. Subject to paragraph (D), a person shall not store or gain access to information stored, in the terminal equipment of a subscriber or user unless the requirements of paragraph (B) are met.
    B. The requirements are that the subscriber or user of that terminal equipment–
      1. is provided with clear and comprehensive information about the purposes of the storage of, or access to, that information; and
      2. has given his or her consent.
    C. Where an electronic communications network is used by the same person to store or access information in the terminal equipment of a subscriber or user on more than one occasion, it is sufficient for the purposes of this regulation that the requirements of paragraph (B) are met in respect of the initial use.
      For the purposes of paragraph (B), consent may be signified by a subscriber who amends or sets controls on the internet browser which the subscriber uses or by using another application or programme to signify consent.
    D. Paragraph (A) shall not apply to the technical storage of, or access to, information–
      1. for the sole purpose of carrying out the transmission of a communication over an electronic communications network; or
      2. where such storage or access is strictly necessary for the provision of an information society service requested by the subscriber or user.

This doesn’t mention cookies by name, only the act of storing information in, or retrieving it from, the user’s browser without consent, unless it is necessary in order to provide the requested service. A broad interpretation might be that, as CSS generally contains no semantic content, it is not strictly necessary, and so requires the permission of the user. Likewise advertising. Other techniques for identifying the user, such as browser fingerprinting, access information stored in the terminal equipment without permission, and so are presumably unlawful. Likewise, subscribing to orientation events would be forbidden as it isn’t “strictly necessary” for providing a service, just convenient. It all seems a bit too woolly and all-encompassing. You might be interested in Silktide’s page on what is affected by the “Cookie Law”.

As mentioned earlier, the wording of the legislation would seem to suggest that this cookie-less approach would still be as unlawful as the equivalent using cookies.

Posted in Musings | 2 Comments

Python decorators with optional arguments

Ever seen decorators which can be used like this?

@baked
def get_cake(flavour):
    # …

@baked(temperature=180, duration=25)
def get_cake(flavour):
    # …

Django’s template filter registration decorator is a good example of this, where it can be called as either a decorator, or a function that returns a decorator (specifically, a function that returns a function that takes a function and returns a function :D).

All these levels of indirection can get a little confusing. First, let’s look at a simple decorator function:

import functools

def baked(method):
    @functools.wraps(method)
    def f(*args, **kwargs):
        thing_to_be_baked = method(*args, **kwargs)
        return bake(thing_to_be_baked)
    return f

functools.wraps is a useful utility function that copies attributes (such as the name and docstring) from the wrapped function to the wrapping function.
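To see what functools.wraps buys you, compare a decorator that uses it with one that doesn’t. The decorator and function names here are made up for the comparison.

```python
import functools


def plain(method):
    # No functools.wraps: the returned function keeps its own name and docstring.
    def f(*args, **kwargs):
        return method(*args, **kwargs)
    return f


def wrapped(method):
    # functools.wraps copies __name__, __doc__ and friends from method onto f.
    @functools.wraps(method)
    def f(*args, **kwargs):
        return method(*args, **kwargs)
    return f


@plain
def get_scone():
    """Return a scone."""
    return "scone"


@wrapped
def get_cake():
    """Return a cake."""
    return "cake"
```

Here get_scone.__name__ is 'f' and its docstring is lost, whereas get_cake.__name__ is still 'get_cake', which makes tracebacks and help() output far less confusing.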

Next, here’s one that takes additional arguments:

import functools

def baked(temperature=None, duration=None):
    def decorator(method):
        @functools.wraps(method)
        def f(*args, **kwargs):
            thing_to_be_baked = method(*args, **kwargs)
            return bake(thing_to_be_baked, temperature, duration)
        return f
    return decorator

Here, baked is the function that returns a function (decorator) that takes a function (method) and returns another function (f). This is a lot of nesting, and still doesn’t handle the case where the user doesn’t want to supply the optional arguments.

We can reduce the nesting using functools.partial, while at the same time making the arguments optional:

import functools

def baked(method=None, temperature=None, duration=None):
    # If called without method, we've been called with optional arguments.
    # We return a decorator with the optional arguments filled in.
    # Next time round we'll be decorating method.
    if method is None:
        return functools.partial(baked, temperature=temperature, duration=duration)
    @functools.wraps(method)
    def f(*args, **kwargs):
        thing_to_be_baked = method(*args, **kwargs)
        return bake(thing_to_be_baked, temperature, duration)
    return f
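
Here’s the whole thing as a self-contained sketch you can run, showing both ways of applying the decorator. The bake function is a stand-in I’ve made up so that the results are visible, and I’m assuming the intent is for temperature and duration to be passed through to it.

```python
import functools


def bake(thing, temperature=None, duration=None):
    """Stand-in for a real bake(): just reports what it was given."""
    return (thing, temperature, duration)


def baked(method=None, temperature=None, duration=None):
    if method is None:
        # Called as @baked(...): return a decorator with the options filled in.
        return functools.partial(baked, temperature=temperature, duration=duration)

    @functools.wraps(method)
    def f(*args, **kwargs):
        return bake(method(*args, **kwargs), temperature, duration)
    return f


@baked
def get_sponge(flavour):
    return flavour + " sponge"


@baked(temperature=180, duration=25)
def get_cake(flavour):
    return flavour + " cake"
```

One caveat with this pattern: the optional arguments have to be passed by keyword. Writing @baked(180) would treat 180 as the method to be decorated.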
Posted in Python | Leave a comment

DevXS

I spent the weekend at DevXS, a student developer event hosted by the lovely people at the University of Lincoln.

All in all, it was a great event, and I look forward to there being more of them. Joss Winn commented afterwards that it’s also quite likely a good way to encourage young developers to work in higher education. At the very least it’s going to make the attendees more aware that there are things they can build that will improve the student experience for them and their peers.

Want to know more? Tony Hirst has penned a blog post with his thoughts, and the official blog has a closing video, the list of winners — there were £1500 worth of prizes(!) — and loads more stuff about the event.

Arduino and JeeNode lightning talk

While there I knocked together a lightning talk about how easy it is to build stuff that interfaces with the physical world, based on my very positive experience with JeeNodes. The slides are available as a PDF.

Posted in Conferences | 1 Comment

Quick comments on using pylibacl on Mac OS X

This is mostly a note to myself, though it might be useful for anyone else trying to get it working.

pylibacl is a Python module for accessing and modifying POSIX.1e Access Control Lists. We’re using these ACLs in the DataFlow project as we need finer-grained access control than is afforded by the standard Unix permissions model.

One of our developers was trying to get pylibacl working on Mac OS X, and ran into a bit of trouble when compiling. Basically, it isn’t supported, and won’t work. The pylibacl homepage says:

Todo: while Linux support is quite good, other OSes are not; this should be remedied…

The longer explanation is that OS X has a set of permissions that are different to those that pylibacl expects:

// </usr/include/sys/acl.h> on Mac OS X 10.6.7

typedef enum {
    ACL_READ_DATA = …,
    ACL_LIST_DIRECTORY = …,
    ACL_WRITE_DATA = …,
    …
} acl_perm_t;

On my GNU/Linux 2.6.35.14 box:

// </usr/include/sys/acl.h> on GNU/Linux 2.6.35.14

#define ACL_READ        (0x04)
#define ACL_WRITE       (0x02)
#define ACL_EXECUTE     (0x01)

This leads to the following fun:

snow-leopard:~ root# pip install pylibacl
Downloading/unpacking pylibacl
  Downloading pylibacl-0.4.0.tar.gz
  Running setup.py egg_info for package pylibacl
    warning: no files found matching 'MANIFEST'
Installing collected packages: pylibacl
  Running setup.py install for pylibacl
    building 'posix1e' extension
    gcc-4.2 -fno-strict-aliasing -fno-common -dynamic -DNDEBUG -g -fwrapv -Os -Wall -Wstrict-prototypes -DENABLE_DTRACE -arch i386 -arch ppc -arch x86_64 -pipe -I/System/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 -c acl.c -o build/temp.macosx-10.6-universal-2.6/acl.o
    acl.c:52: error: ‘ACL_READ’ undeclared here (not in a function)
    acl.c:53: error: ‘ACL_WRITE’ undeclared here (not in a function)
    /usr/libexec/gcc/powerpc-apple-darwin10/4.2.1/as: assembler (/usr/bin/../libexec/gcc/darwin/ppc/as or /usr/bin/../local/libexec/gcc/darwin/ppc/as) for architecture ppc not installed
    Installed assemblers are:
    /usr/bin/../libexec/gcc/darwin/x86_64/as for architecture x86_64
    /usr/bin/../libexec/gcc/darwin/i386/as for architecture i386
    acl.c:52: error: ‘ACL_READ’ undeclared here (not in a function)
    acl.c:53: error: ‘ACL_WRITE’ undeclared here (not in a function)
    acl.c:1615: fatal error: error closing -: Broken pipe
    compilation terminated.
    acl.c:52: error: ‘ACL_READ’ undeclared here (not in a function)
    acl.c:53: error: ‘ACL_WRITE’ undeclared here (not in a function)
    lipo: can't open input file: /var/tmp//ccWGzg6m.out (No such file or directory)
    error: command 'gcc-4.2' failed with exit status 1

Wikipedia also claims that OS X supports NFSv4 ACLs, which may also offer some explanation; I don’t know enough about NFSv4 ACLs to be able to tell!

In the longer term an interested party could probably fix pylibacl to work on OS X without too much difficulty — their code is available in a Git repository at git://git.k1024.org/pylibacl.git, and it already does a bit of platform-specific stuff. However, C isn’t exactly my area of expertise, Mac OS X isn’t one of our target platforms, and we’ve got plenty of other things to be getting on with.
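
In the meantime, code that has to run on both platforms can at least avoid falling over at import time. A minimal sketch, assuming you’re happy to switch ACL handling off where the module is missing; posix1e is the extension module that pylibacl builds, while the helper and flag names are my own.

```python
import importlib


def load_optional(module_name):
    """Import module_name if it is installed; return None otherwise."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# posix1e is the module pylibacl provides; it won't be there on OS X.
posix1e = load_optional("posix1e")
ACLS_AVAILABLE = posix1e is not None
```

Anything that needs ACLs can then check ACLS_AVAILABLE first and fall back to plain Unix permissions, or refuse to run.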

Posted in Uncategorized | Leave a comment