laststation.net také česky.

Amazon EC2 vulnerable to UDP flood attack [FIXED]

UPDATE 12 October 2009: I’m happy to let you know this post is not longer relevant. Amazon AWS team successfully deployed the fix and the scenario used to simulate Denial of Service attack using UDP flood isn’t applicable anymore. All that in less than 24 hours after publishing the link on Twitter. Good job!

Original post follows.

Unfortunate events surrounding the DDoS attack against BitBucket kicked-off heated discussions about the nature of this vulnerability. Where Amazon officially acknowledged this to be a single isolated incident, many others started asking questions why did it happen in first place?

Both personal and professional interest led me to find out more. Having designed series of tests how to replicate this scenario, I’ve started first instance and set up the target environment.

instance : c1.medium (us-east-1d)
EBS volume : 200 GB attached to (/dev/sdf)
monitoring : vmstat, netstat, iptraf, Amazon CloudWatch
security group : allowed SSH only (port 22/TCP)

UDP flood set up to be generated from the second instance (c1.medium) using simple Perl script, managing to generate whopping traffic of 650mbit per second (according to iptraf) using 1KB packets to random ports on the target IP.

Test 1. Let it run has been successful in a way there was no visibility on target machine. Still surprised by the traffic level generated on the source box, I’ve pointed the UDP flood to another machine – with security group allowing UDP traffic (ports 0 – 65535) – to check if the network traffic is able to reach another box. And it was. Not only from the same availability zone, but even from the different ones (tested us-east-1c and us-east-1b).

Test 2. consisted of formatting the prepared EBS, 5 samples for both scenario with and without UDP flood.

No traffic (1m15s)
UDP Flood (2m54s)

Test 3. Bonnie++ performance test of the EBS volume. Running with no incoming traffic, it took around 8 minutes to produce quite reasonable report. Having switched on the UDP flood I’ve repeated the same tests and my expectation was to see some results in similar time. Fifteen minutes later and bonnie still haven’t even finished third step (rewriting). Another 10 minutes without any significant progress pointed me to do some research what’s going on. The box wasn’t performing virtually any IO operations, and time spent waiting for IO topped 100% every second reading (1s delay). Bingo!

To verify if the problem is really caused by incoming UDP flood, I’ve stopped the traffic for a brief interval (around 7 seconds) and monitored using vmstat:

vmstats

As you can see on line 5 the IO traffic resumed, roughly correlating to the time incoming traffic stopped. Seven seconds later with the UDP traffic back on the box tried to keep up for another quarter of minute before giving it up.

Nothing! Based on my notes the first bonnie run occured at 10:40, switched on the UDP flood at 10:50, and started second bonnie run at 10:52. My patience ran out before 11:30 where there’s small peak caused by interactive iptraf session.

At this point there were no reasons to continue testing. All IO operations to/from EBS volume seemed to be blocked by UDP traffic generated by a single instance!

Conclusion

BitBucket guys had every reason to be angry. Blocking UDP in the security group configuration only hides the problem. Contraindicating the Jesper Nøhr statement, during this experiment there were no peaks visible using paid monitoring service – Amazon CloudWatch (see above). Which was probably the amount of information available to AWS 1st line of support.

This corresponds to the ‘black box’ described by Jesper. Looking back on the results it’s obvious that

To be fair, it’s been the first incident of such a magnitude. Let’s hope Amazon AWS team will come up with the architecture fix before somebody use the vulnerability in much wider and devastating attack. In mean time, the only workaround we can apply is to hide our instances as much as we can. Load-balancers and proxies in front of the worker instances should be enough, as long as you don’t share the same host machine.

Have a good weekend and best of luck protecting your instance’s IPs!

Published on Sunday, 11 October 2009   - Permalink
blog comments powered by Disqus