{"id":798,"date":"2016-04-04T13:07:21","date_gmt":"2016-04-04T12:07:21","guid":{"rendered":"http:\/\/blogs.it.ox.ac.uk\/syslog\/?p=798"},"modified":"2016-12-10T10:38:04","modified_gmt":"2016-12-10T10:38:04","slug":"shibboleth-identity-provider-upgrades","status":"publish","type":"post","link":"https:\/\/blogs.it.ox.ac.uk\/syslog\/2016\/04\/04\/shibboleth-identity-provider-upgrades\/","title":{"rendered":"Shibboleth Identity Provider upgrades"},"content":{"rendered":"<p>After some slight prompting by both the Networks team and colleagues in Sysdev, the IAM team felt that we should write some blog posts of our own about our own work to upgrade the University&#8217;s authentication infrastructure.\u00a0 The first of these is on our work to upgrade the Shibboleth service.\u00a0 This work ensures that we are running a fully-supported version of Shibboleth, as well as enabling new features in the future, such as single log-out.\u00a0 The upgrade will also make our Shibboleth servers highly available, which should improve service reliability, and allow us to consolidate our existing servers to an extent.<\/p>\n<p>The upgraded service will go live on the 5th April after much testing over the past few months.\u00a0 No Shibboleth-protected services should be affected by this work, and the upgrade should be transparent to end users.<\/p>\n<h2>What is Shibboleth?<\/h2>\n<p>Shibboleth currently sits at the top of Oxford&#8217;s Single Sign-On (SSO) stack, on top of both Kerberos and Webauth.\u00a0 The original purpose of Shibboleth was to extend SSO to services outside the University, such as journal access.\u00a0 However, Shibboleth is also frequently used for services within the University as well, not least to provide SSO to systems that lack support for Webauth.\u00a0 Although Windows servers are the most common case of servers without Webauth support, other systems such as Bradford Campus Manager also fall within this group.\u00a0 Shibboleth is based on the idea of 
&#8220;claims-based authentication&#8221; using <a href=\"https:\/\/en.wikipedia.org\/wiki\/Security_Assertion_Markup_Language\">SAML<\/a>, where a Service Provider (or SP) is given a signed &#8220;assertion&#8221; from a trusted Identity Provider (IdP).\u00a0 This assertion contains details (known as attributes) about the end-user such as username, name and email address that can then be used to make decisions about access.<\/p>\n<p>For Shibboleth to work, the IdP and SP need to know certain details about each other, such as where they may be found and the certificates used for signing assertions.\u00a0 This information is known as the metadata for a given server.\u00a0 It is possible to share this manually between the two servers if needed, but when this is scaled up to a large number of services and identity providers it becomes unwieldy to manage the metadata swapping.\u00a0 To solve this problem, Shibboleth servers generally have their metadata published by one or more &#8220;federations&#8221;, which act as a single trusted source of metadata.\u00a0 The individual Shibboleth servers then fetch signed metadata from the federations they trust.<\/p>\n<p>Since Shibboleth may be used to log in to many different SPs with varying levels of trust, the software is privacy-preserving by default.\u00a0 This means that attributes that could be used to identify end users must be explicitly &#8220;released&#8221; to a given service provider.\u00a0 As a result, instead of a normal username, services are typically presented with an opaque persistent ID, which is generated by a one-way hash of the service provider&#8217;s &#8220;entityID&#8221; (an identifier for that particular identity or service provider) and the Oxford SSO username.\u00a0 This prevents separate SPs from working together to de-anonymise users.<\/p>\n<h2>Why upgrade Shibboleth?<\/h2>\n<p>About a year ago, we received the news that updates and support for version 2 of the <a 
href=\"http:\/\/shibboleth.net\">Shibboleth Identity Provider<\/a> (IdP) server would be discontinued by July 2016.\u00a0 This meant that we had to start work on migrating to the new version of the software (IdP v3), since running supported software is a good idea.<\/p>\n<p>In addition to the obvious desire to run a supported version of the IdP software, the upgrade also means we can make resiliency improvements.\u00a0 At present, almost all Oxford Shibboleth authentication is handled by a single server.\u00a0 This is mostly down to the difficulties in setting up an IdP v2 cluster, but is also down to avoidance of load-balancers in the past.\u00a0 (For historical reasons, there is also a completely separate IdP pair that is used for some internal business systems, with manual switching between the two servers.)\u00a0 However, the popularity of Shibboleth for new services means that the current single point of failure is no longer a sensible option today.\u00a0 The IdP v3\u00a0 software is also rather easier to cluster than the previous version, and no longer requires a complicated state-sharing mechanism for clustering.<\/p>\n<p>Finally, the upgrade process provides an opportunity to consolidate our existing Shibboleth environments.\u00a0 Currently, we have three environments, which look like the following:<\/p>\n<ul>\n<li>Main IdP\n<ul>\n<li>Live (1 server)<\/li>\n<li>Test IdP (1 server)<\/li>\n<li>Development (1 server)<\/li>\n<\/ul>\n<\/li>\n<li>Business Systems IdP\n<ul>\n<li>Live (2 servers)<\/li>\n<li>Test (1 server)<\/li>\n<\/ul>\n<\/li>\n<li>IAM test stack (1 server)<\/li>\n<\/ul>\n<p>As mentioned earlier, we have historically run a separate IdP for business systems that required a high-availability authentication service.\u00a0 However, as the upgrade will bring high-availability features to the main IdP, we should be able to remove the additional environment:<\/p>\n<ul>\n<li>Main IdP\n<ul>\n<li>Live (3 servers)<\/li>\n<li>Test IdP (2 
servers)<\/li>\n<li>Development (1 server)<\/li>\n<\/ul>\n<\/li>\n<li>IAM test stack (1 server)<\/li>\n<\/ul>\n<p>While the total number of servers is identical, the elimination of the two business systems environments improves manageability of the service.<\/p>\n<h2>Load balancing and improving resiliency<\/h2>\n<p>The new service uses the Netscaler load-balancing device run by the Business Systems Operations Team, which is also used by WebLearn and other services.\u00a0 The Netscaler supports both session stickiness (necessary for avoiding server switches mid-authentication) and content-based switching, which is useful for allowing users to choose between old and new servers as well as separating out SAML1 and SAML2 requests for testing.\u00a0 For services using SAML2, the attributes are transferred between the IdP and SP via the end-user&#8217;s browser.\u00a0 However, in the case of SPs using SAML1, the SP must contact the IdP directly via a back-channel to obtain attributes.\u00a0 All the necessary state is stored on the client side, so no shared server state is required.\u00a0 The only exception to this is the authentication process, which must be performed on a single server.<\/p>\n<p>One interesting question is how the IdP maps an attribute query on the back-channel to the SAML1 authentication request on the front-channel.\u00a0 The answer is that the front-channel returns a transient ID which is reversibly encrypted.\u00a0 The back-channel process then decrypts this transient ID to find out which user the request applies to.<\/p>\n<h2>Problems we saw<\/h2>\n<p>While the upgrade was slow, we encountered relatively few problems along the way.\u00a0 In several cases, the upgrade to IdP v3 improved compatibility with external services.\u00a0 For example, some service providers require particular types of authentication or require certain forms of user identifier to define the &#8220;subject&#8221; of an assertion.\u00a0 However, there were 
some problems that we saw during the upgrade.<\/p>\n<h3>SAML1<\/h3>\n<p>The first problem was how to test service providers that still use the old SAML1 protocol.\u00a0 Because these servers communicate directly with the IdP to retrieve attributes, it is generally difficult to test whether these behave as intended with the new service.\u00a0 The solution we came up with was to test specific development servers against the new IdP cluster, before testing external systems later in the rollout process.\u00a0 Ideally, we would have tested external sites with a separate test IdP.\u00a0 Unfortunately, some providers set strict limits on the number of IdPs that can be trusted (often 1) for a given organization, which makes this impossible.<\/p>\n<h3>Assertion signature algorithm<\/h3>\n<p>Another problem we saw was a lack of support for assertion signatures based on SHA-2.\u00a0 This is fairly rare, but affected one relatively important Service Provider: the cloud-based software used by our centralised helpdesk.\u00a0 While some may consider a lack of visible queries to answer a good thing at times, the Service Desk team may beg to differ!\u00a0 We fixed this by modifying relying-party.xml, as documented in the <a href=\"https:\/\/wiki.shibboleth.net\/confluence\/display\/IDP30\/SecurityConfiguration#SecurityConfiguration-SigningandEncryptionConfiguration\">Shibboleth wiki<\/a>:<\/p>\n<pre style=\"font-size: 80%\">&lt;!-- SHA-1 support bean --&gt;\r\n&lt;bean id=\"SHA1SecurityConfig\" parent=\"shibboleth.DefaultSecurityConfiguration\"\r\n  p:signatureSigningConfiguration-ref=\"shibboleth.SigningConfiguration.SHA1\" \/&gt;\r\n\r\n&lt;util:list id=\"shibboleth.RelyingPartyOverrides\"&gt;\r\n  &lt;bean parent=\"RelyingPartyByName\" c:relyingPartyIds=\"<strong>entityID here<\/strong>\"&gt;\r\n    &lt;property name=\"profileConfigurations\"&gt;\r\n      &lt;list&gt;\r\n        &lt;bean parent=\"SAML2.SSO\" p:securityConfiguration-ref=\"SHA1SecurityConfig\" \/&gt;\r\n      
&lt;\/list&gt;\r\n    &lt;\/property&gt;\r\n  &lt;\/bean&gt;\r\n&lt;\/util:list&gt;<\/pre>\n<h3>Persistent IDs<\/h3>\n<p>The third issue we saw concerned our generation of opaque persistent IDs, which include an IdP-specific salt value.\u00a0 This is needed so that SPs cannot trivially reverse the persistent ID by brute force.\u00a0 For historical reasons, we use a random binary salt as opposed to the text-based salt more typically used, and accommodating this required some minor modifications to the IdP software.<\/p>\n<h3>Additional Verification<\/h3>\n<p>The final problem we saw was with the Additional Verification service, which provides multi-factor authentication.\u00a0 Although this service is rather limited at present, Additional Verification is currently used by WebLearn to protect examination setting and marking.\u00a0 The service is currently based on a custom-written Java servlet that sends one-time codes via text message.\u00a0 As the new IdP version changed the authentication interfaces used, the servlet required some modifications to work correctly.\u00a0 As a side-effect, the service was also restyled to match the current Webauth service.<\/p>\n<h2>The roll-out process<\/h2>\n<p>We started the process on the 8th March by placing our existing IdP behind the Netscaler load balancer.\u00a0 The existing server kept its IP address, but the DNS entries were modified to point at the load balancer.\u00a0 The reason we did this was to avoid problems with SPs that use the older SAML1 protocol, which include several journals and library resources, along with the Bodleian&#8217;s SOLO portal and this blog.\u00a0 Since some SPs cache DNS responses for up to seven days, a grace period is needed to make sure that the back-channel and front-channel connections both use the load balancer.<\/p>\n<div id=\"attachment_823\" style=\"width: 610px\" class=\"wp-caption alignnone\"><a href=\"http:\/\/blogs.it.ox.ac.uk\/syslog\/files\/2016\/04\/idp3-4.png\"><img 
loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-823\" class=\"size-medium wp-image-823\" title=\"Netscaler setup before IdP v3 go-live (click to enlarge)\" alt=\"Netscaler setup before IdP v3 go-live\" src=\"http:\/\/blogs.it.ox.ac.uk\/syslog\/files\/2016\/04\/idp3-4.png\" width=\"600\" height=\"384\" srcset=\"https:\/\/blogs.it.ox.ac.uk\/syslog\/files\/2016\/04\/idp3-4.png 951w, https:\/\/blogs.it.ox.ac.uk\/syslog\/files\/2016\/04\/idp3-4-300x192.png 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/a><p id=\"caption-attachment-823\" class=\"wp-caption-text\">Netscaler setup before IdP v3 go-live (courtesy of Julian)<\/p><\/div>\n<p>The next step was to test that services using the old SAML1 protocol still worked using the new servers.\u00a0 On the 22nd March, we temporarily switched requests for SAML1 authentication (including back-channel requests) to the new servers during the maintenance window.\u00a0 This let us test the new servers against external journals and confirm that those sites still worked.<\/p>\n<p>The final step will be to switch traffic from the old IdP server to the new cluster.\u00a0 Barring any last-minute problems, this will happen on the 5th April during the 7 a.m.-9 a.m. maintenance window, which will allow us time to test the new service and revert if anything does go wrong. 
The resulting Netscaler setup will look like this:<\/p>\n<div id=\"attachment_824\" style=\"width: 610px\" class=\"wp-caption alignnone\"><a href=\"http:\/\/blogs.it.ox.ac.uk\/syslog\/files\/2016\/04\/idp3-5.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-824\" class=\"size-medium wp-image-824\" title=\"Netscaler setup after IdP v3 go-live (click to enlarge)\" alt=\"Netscaler setup after IdP v3 go-live\" src=\"http:\/\/blogs.it.ox.ac.uk\/syslog\/files\/2016\/04\/idp3-5.png\" width=\"600\" height=\"384\" srcset=\"https:\/\/blogs.it.ox.ac.uk\/syslog\/files\/2016\/04\/idp3-5.png 951w, https:\/\/blogs.it.ox.ac.uk\/syslog\/files\/2016\/04\/idp3-5-300x192.png 300w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/a><p id=\"caption-attachment-824\" class=\"wp-caption-text\">Netscaler setup after IdP v3 go-live (courtesy of Julian)<\/p><\/div>\n","protected":false},"excerpt":{"rendered":"<p>After some slight prompting by both the Networks team and colleagues in Sysdev, the IAM team felt that we should write some blog posts of our own about our own work to upgrade the University&#8217;s authentication infrastructure.\u00a0 The first of &hellip; <a href=\"https:\/\/blogs.it.ox.ac.uk\/syslog\/2016\/04\/04\/shibboleth-identity-provider-upgrades\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":244,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[38583],"tags":[38586,38582,38585],"class_list":["post-798","post","type-post","status-publish","format-standard","hentry","category-service-improvement","tag-idp","tag-shibboleth","tag-sso"],"_links":{"self":[{"href":"https:\/\/blogs.it.ox.ac.uk\/syslog\/wp-json\/wp\/v2\/posts\/798","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blogs.it.ox.ac.uk\/syslog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.it.ox.ac.uk\/syslog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.it.ox.ac.uk\/syslog\/wp-json\/wp\/v2\/users\/244"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.it.ox.ac.uk\/syslog\/wp-json\/wp\/v2\/comments?post=798"}],"version-history":[{"count":34,"href":"https:\/\/blogs.it.ox.ac.uk\/syslog\/wp-json\/wp\/v2\/posts\/798\/revisions"}],"predecessor-version":[{"id":837,"href":"https:\/\/blogs.it.ox.ac.uk\/syslog\/wp-json\/wp\/v2\/posts\/798\/revisions\/837"}],"wp:attachment":[{"href":"https:\/\/blogs.it.ox.ac.uk\/syslog\/wp-json\/wp\/v2\/media?parent=798"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.it.ox.ac.uk\/syslog\/wp-json\/wp\/v2\/categories?post=798"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.it.ox.ac.uk\/syslog\/wp-json\/wp\/v2\/tags?post=798"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}