About this Document
Starting on January 25th, 2004, the number of hits per day at www.fourmilab.ch exploded from the typical weekday level of around 650,000 first to 823,000 on the 25th, then 1,051,992 on the 26th and comparable levels on subsequent days (with the typical drop-off expected on the week-end). The anomalous jump in hits is immediately apparent from the daily usage chart for January 2004.
Examination of the Web access logs revealed that the increment in accesses was composed entirely of requests to the site's home page, most occurring over and over from a given IP address at intervals of four minutes (or from proxy servers relaying such requests from hosts behind them, but I didn't understand that until later in the analysis process). These hosts requested nothing but the home page, which is highly anomalous, since that page on this site consists of nothing but a <frameset> container for which any browser will, immediately upon receiving it, request the content pages to which it links. This can easily be seen in the daily usage chart above in that the kilobytes transferred per day (red bar chart at bottom) were barely affected at all by the attack, as opposed to the other measures which do not take into account the volume of data transferred.
This pattern of accesses was highly suspicious, since at the very time the attack erupted against this site, the Mydoom.A worm was spreading across the Internet, and had been determined to contain code which would launch a distributed denial of service attack against the www.sco.com site on February 1, 2004 with precisely the form of requests I was seeing: repeated hits to the home page without any other requests to the site. However, my understanding was that Mydoom hit the SCO site as quickly as it could, not at the modest rate of one hit per four minutes I was seeing. Given the similarity of the pattern and the fact that I'd been hit several days before SCO was to be targeted, this raised the suspicion in my mind that my site might be the victim of an early test version of the worm, deployed to check it out prior to release as the fully-virulent Mydoom.
After mentioning this on a discussion list relating to the Mydoom worm, I was contacted by the administrator of another site who reported precisely the same pattern of hits: requests to his site's home page over and over, four minutes apart, totaling on the order of 100,000 per day (lower than the 400,000 I was seeing). However, this site had been under a consistent attack of this kind for almost a year, with two inexplicable intervals during which the attack stopped and then resumed with full intensity. We exchanged lists of IP addresses attacking us and compared them, and found no meaningful commonality. However, knowing that another site had been attacked for such a long period of time prompted me to analyse historical logs (I have logs of every hit to the site since it opened to the public on November 28th, 1994) to check for evidence of precursors to the main attack. The results were very interesting, indeed. In the daily status report for February 6, 2004, I report the discovery of a sequence of accesses identical to those of the attack dating back to December 6, 2002, and continuing from the same IP address until November 25, 2003. Other sites were observed to make similar accesses throughout 2003, and have been summarised in the early attacks report.
During the attack, I wrote to a number of system administrators who might be able to identify machines responsible for the attack and provide information about the possible cause. I received only one reply, from a site which tracked down the accesses to a single PC behind their firewall which was found to be infected with a variety of “spyware” and “adware” which, when cleansed from the machine, immediately stopped the hits originating there. Details, including a log of the cleanup of the machine, may be found in the daily update for January 30, 2004.
After these investigations and revelations, I mostly ran out of ideas apart from monitoring the statistics on a daily basis. Given that the Mydoom worm had a self-destruct date of February 12th, I was intensely interested to see what would happen to the hits on my site as that date approached. The hits started here first; would they also end before the 12th? Well, no, but on Friday the Thirteenth of February 2004, the attack began to deflate along a trajectory almost symmetric to the initial ramp-up, and has continued to do so subsequently. The end appears to be at hand. Still, one must temper one's optimism; recall that the other site hit by an almost identical attack saw pauses, then resumption of the assault. In any case, with the attack having ended or at the very least going on hiatus, there's nothing to do other than keep an eye on the logs to see if it comes back. The daily usage chart for February shows the abrupt drop in hit rate as the attack ended, then resumed on the 17th. Entries for the last week of February are perturbed by the introduction of countermeasures against the attack.
The main text of the document which follows was written on January 27, 2004 as an initial incident report on the attack. The historical evolution chart and the analysis reports linked to from the document were updated as the attack progressed, but subsequent revelations were not integrated into the main text. Discoveries made as the attack progressed and forensic analyses of the attack in progress and its history as recorded in the logs are chronicled in the daily updates appended to the original incident report.
It's back! On February 18th, the attack, moribund for several days, ramped back up with a vengeance and reached half its peak level of the first round. I've resumed posting daily updates, the first of which for the second round was posted on 2004-02-18.
Starting around January 21st, 2004, my www.fourmilab.ch site has been under what appears to be an accelerating Distributed Denial of Service (DDoS) Attack with very curious properties which make me wonder if it is part of a test of an attack network eventually intended to be deployed with much more serious consequences against this site and/or others.
I am not aware of any particular reason my site would merit being the target of an attack. Although I do publish Annoyance Filter which may irritate spammers, and The Digital Imprimatur, which could offend idiots who fail to comprehend it is a cautionary tale written in the style of a dystopia, neither of these is a recent addition to the site. The only significant change of late was the discontinuation of Speak Freely on January 15th, 2004, but that was announced in the End of Life announcement of August 1st, 2003, and in any case occasioned no acrimonious reactions I am aware of, particularly since the final version of the program is in the public domain and available from an archive site on SourceForge. All of this is by way of saying that I can't see any reason my site should be a probable candidate for attack, which makes me suspect (along with the nature of the attack I'm seeing) that I'm simply a target of opportunity, in all likelihood being used to test an attack network.
I first became aware of the apparent attack when I noticed the number of hits per day in the site statistics explode from the typical average of 600,000 to 650,000 per day to 823,012 on January 25 and 1,051,992 on the 26th. (Since then, 1,096,133 on the 27th and 1,097,562 on the 28th.) Simply watching the HTTP log scroll by, it was obvious that a grossly disproportionate number of these hits were requests to my site's home page with the request:
GET / HTTP/1.1
When I started to notice this on the evening (GMT +1 time zone) of January 26th, two things were immediately apparent. First, the requests were coming from a large number of different IP addresses which, for those whose host names I was able to resolve, were located all over the world. Second, these hosts' accesses to the site consisted exclusively of retrieving the home page; they did not make any other HTTP requests. Now this is downright weird, because the home page document at this site is just a <frameset> container which has no content unless one fetches the <frame>s it references. And most of these sites that were hitting me were requesting the home page over and over, without asking for anything else, but at a rate which was rather modest, not a flat out bombardment like you see when somebody's venomous spider program is in a loop blasting requests at you. Here is the "trajectory" of an individual IP address:
View request trajectory
which began sending requests at 2004-01-26 03:15:28 and was still sending them 40 hours later at 2004-01-27 20:03:27 when I last scanned the log file. Note how the requests seem to arrive almost precisely four minutes apart, to within a second or so. This pattern is common, but not universal. Here's an extract of a second trajectory, this of a site which has sent a total of 1543 hits to the home page, which seems to send pairs of requests four minutes apart, as if in some circumstances multiple instances of the process which is generating the requests can be running.
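The four-minute spacing described above is easy to check mechanically. The following is a minimal Python sketch, not one of the analysis programs used for these reports; the function names, the timestamp format, the 240-second period with a two-second tolerance, and the sample times are all assumptions made for illustration.

```python
from datetime import datetime

def parse_time(stamp):
    """Parse a timestamp written as 'YYYY-MM-DD HH:MM:SS'."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")

def intervals(stamps):
    """Return the gaps, in seconds, between consecutive hits from one host."""
    times = sorted(parse_time(s) for s in stamps)
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]

def looks_periodic(stamps, period=240, tolerance=2):
    """True if every gap is within `tolerance` seconds of `period`."""
    gaps = intervals(stamps)
    return bool(gaps) and all(abs(g - period) <= tolerance for g in gaps)

# Hypothetical hit times, invented for illustration (not from the real log).
steady = ["2004-01-01 00:00:00", "2004-01-01 00:04:00",
          "2004-01-01 00:08:01", "2004-01-01 00:12:00"]
burst = ["2004-01-01 00:00:00", "2004-01-01 00:00:01"]

print(intervals(steady))        # gaps in seconds between successive hits
print(looks_periodic(steady))   # steady four-minute spacing
print(looks_periodic(burst))    # back-to-back requests, spider style
```

A host sending pairs of requests four minutes apart would fail this strict test, since every second gap would be near zero; such a trajectory would need the pairs collapsed first.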
The apparent attack started out slowly and only began to surge on the 26th; it has grown more or less steadily ever since, although not monotonically. As I write this at 21:00 UTC on January 27th, the request rate is about 20,000 hits per hour, or about 5.56 hits per second. Here is a data table showing the acceleration of the attack over time. The following chart shows the development of the attack over time:
I'll update this document and chart as the situation evolves.
Everywhere! I cranked a list of unique IP addresses through logresolve to obtain fully qualified domain names for all which would resolve, then used the resulting database to annotate a report of attacking sites sorted by total number of hits. I report only sites which made 500 or more hits on the home page between January 21st and the last update to this file; those with fewer hits may be legitimate visitors to the site.
View “heavy hitters” report
The “Hours Active” column gives the number of hours between the time of the first hit from this site and the most recent, and “Seconds per hit” the mean time in seconds between hits from this site. IP addresses which could not be resolved to host names via reverse DNS lookup are shown as “?”; host names which I haven't yet tried to resolve (the logresolve process is done as a batch job in the background) appear as blank.
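The two derived columns can be reproduced from a list of home page hits. This is a hedged sketch rather than the program behind the report: the function name, the input shape (pairs of IP address and time), and the reading of “Seconds per hit” as elapsed time divided by the number of gaps between hits are assumptions, and the addresses and times are invented.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def heavy_hitters(hits, min_hits=500):
    """Summarise hosts with at least `min_hits` home page hits.

    `hits` is an iterable of (ip, datetime) pairs.
    """
    by_ip = defaultdict(list)
    for ip, when in hits:
        by_ip[ip].append(when)

    rows = []
    for ip, times in by_ip.items():
        if len(times) < min_hits:
            continue
        span = (max(times) - min(times)).total_seconds()
        gaps = len(times) - 1
        rows.append({
            "ip": ip,
            "hits": len(times),
            "hours_active": span / 3600.0,
            # Assumed definition: mean gap between consecutive hits.
            "seconds_per_hit": span / gaps if gaps else 0.0,
        })
    # Sort by total number of hits, heaviest first.
    rows.sort(key=lambda row: row["hits"], reverse=True)
    return rows

# Invented sample: one host hitting every 240 seconds, one casual visitor.
start = datetime(2004, 1, 1)
sample_hits = [("192.0.2.1", start + timedelta(seconds=240 * i)) for i in range(4)]
sample_hits += [("192.0.2.2", start), ("192.0.2.2", start + timedelta(seconds=30))]

report = heavy_hitters(sample_hits, min_hits=3)
for row in report:
    print(row)
```

The threshold is a parameter so the same function covers both the 500-hit cutoff mentioned above and any other cutoff one might choose.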
When this began to erupt, I naturally suspected it might have something to do with the outbreak of the Novarg/Mydoom worm, which is apparently intended to launch a DDoS attack against sco.com on the first of February. Perhaps, I thought, I was a "test case" for this worm. Well, I don't think so. First of all, the hits to the home page began well before the worm was seen in the wild (but of course, that's to be expected if I was being used to test it). But I've run port scans (naughty John!) on numerous hosts sending the requests, and some of them have had the TCP ports (3127–3198) open which the worm seems to use for remote control. I've run scans of all ports on a few of these hosts, and I don't see anything suspicious on most, which might indicate that they are infected by a test version of the worm listening on a different port. But none of this is definitive. Port scans of these hosts all have the profile of Windows machines, as far as I can determine. Here are some port scans I have run on attacking hosts. Note that one (192.115.62.211) does indeed appear to be infected with the Novarg/Mydoom Worm, and two have the “subseven” control port open. Interesting. (I excluded from port scanning all hosts to which a telnet timed out without response, as they're probably behind a firewall or NAT box.)
Update: A disassembly of the worm payload indicates it creates a thread which simply requests endless copies of the www.sco.com home page. This is precisely the pattern of requests I'm seeing here, albeit at a much slower rate. Perhaps this is evidence we're seeing a test version of MyDoom here.
One extremely curious aspect of this incident is that if it is indeed a distributed denial of service attack, it is just about as benign as such a one could possibly be. The sites hitting me are all requesting just the <frameset> of the home page which, at 646 bytes, is one of the smallest files on the site. While 20,000 bogus hits per hour is nothing to sniff at, as long as they're for a file this small it's actually well within the capacity of my 2 Mbit/sec leased line and hasn't materially injured response time. But it easily could have. If the requests were for any of the sizable downloads available from this site, or for a CGI request which consumed substantial CPU time on the server (and, no, I'm not going to give you handy-dandy ready-to-(ab)use links to examples of such), this request rate would have already brought Fourmilab to its little knobby arthropod knees.
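The arithmetic behind that claim can be sketched in a few lines of Python. The figures of 20,000 hits per hour, 646 bytes, and 2 Mbit/sec come from the text; the function names are made up, the line is assumed to carry 2,000,000 bits per second, and HTTP response headers and TCP/IP overhead are ignored, so the real load is somewhat higher than this estimate.

```python
def hits_per_second(hits_per_hour):
    """Convert an hourly hit rate to hits per second."""
    return hits_per_hour / 3600.0

def payload_bits_per_second(hits_per_hour, bytes_per_hit):
    """Bits per second of document payload only (headers not counted)."""
    return hits_per_second(hits_per_hour) * bytes_per_hit * 8

def line_fraction(hits_per_hour, bytes_per_hit, line_bits_per_second):
    """Fraction of the line's capacity consumed by the payload."""
    return payload_bits_per_second(hits_per_hour, bytes_per_hit) / line_bits_per_second

LINE = 2_000_000  # assumed: 2 Mbit/sec taken as 2,000,000 bits per second

print(round(hits_per_second(20_000), 2))              # about 5.56 hits per second
print(round(payload_bits_per_second(20_000, 646)))    # about 28,711 bits per second
print(round(line_fraction(20_000, 646, LINE) * 100, 2), "percent of the line")
```

For comparison, the same request rate aimed at a hypothetical 1,000,000 byte file gives `line_fraction(20_000, 1_000_000, LINE)` of roughly 22, which is to say about twenty-two times what the line can carry.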
So again, I wonder, is this a test? “This is only a test. If this had been a real attack, your site would be a squashed bug by now.” Makes ya' wonder. Makes me worry.
If you're seeing anything like this in your server logs or have come across any information relevant to attacks of this kind, please let me know at: Bravo Uniform Golf Sierra @ Foxtrot Oscar Uniform Romeo Mike India Lima Alpha Bravo Decimal Charlie Hotel. (See this if you have trouble figuring out what I just said.)
I'll append updates at the end of this page as things develop and information is gleaned as to what's going on here.
I should note that I received an unsolicited E-mail “defacement alert” shortly before the attack commenced. I dismissed it at the time because my site had obviously not been defaced (it's a hard nut to crack, if I say so myself), but this may have referred to my site's appearing as a target on fora read by miscreants. Here's the message; make of it what you will.
I circulate this only for its informative value; I do not endorse the use of the word “hacker” to denote a perpetrator of criminal conduct.
I've updated the statistics text documents linked to this document to reflect the situation up to 15:00 UTC on January 28th. The time evolution report now shows the number of unique hosts who hit the home page 10 times or more during each hour, and I've added the unique hosts to the time evolution chart above.
I've updated the situation as of 19:00 UTC, adding a “new hosts” column to the time evolution report which shows how many never previously seen IP addresses sent 10 or more hits in each hour. New hosts counts are also plotted in the chart above. This permits tracking “recruitment” of new hosts to the apparent DDoS network. I also computed the total number of unique IP addresses seen so far, which adds up to 2894 as of this report.
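The hourly columns described in these updates could be computed along the following lines. This Python sketch is an illustration under stated assumptions, not the report generator itself: the function name and input shape are invented, and a “new host” is taken to mean an address that reaches the threshold in some hour without having reached it in any earlier hour.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

def time_evolution(hits, threshold=10):
    """Return (hour, total_hits, unique_hosts, new_hosts) rows in time order.

    `hits` is an iterable of (ip, datetime) pairs for home page requests.
    Only hosts with at least `threshold` hits in an hour are counted.
    """
    per_hour = defaultdict(Counter)
    for ip, when in hits:
        hour = when.replace(minute=0, second=0, microsecond=0)
        per_hour[hour][ip] += 1

    seen = set()   # hosts that met the threshold in any earlier hour
    rows = []
    for hour in sorted(per_hour):
        counts = per_hour[hour]
        active = {ip for ip, n in counts.items() if n >= threshold}
        new = active - seen
        seen |= active
        rows.append((hour, sum(counts.values()), len(active), len(new)))
    return rows

# Invented sample data, with a threshold of 2 to keep it small.
h0 = datetime(2004, 1, 1, 0)
h1 = h0 + timedelta(hours=1)
minutes = lambda base, m: base + timedelta(minutes=m)
sample_hits = [
    ("192.0.2.1", minutes(h0, 5)), ("192.0.2.1", minutes(h0, 9)),
    ("192.0.2.2", minutes(h0, 7)),
    ("192.0.2.1", minutes(h1, 1)), ("192.0.2.1", minutes(h1, 5)),
    ("192.0.2.2", minutes(h1, 2)), ("192.0.2.2", minutes(h1, 6)),
]
rows = time_evolution(sample_hits, threshold=2)
for row in rows:
    print(row)
```

Because hosts are keyed by IP address, a machine with a floating address is counted as new each time its address changes, which inflates the new hosts column.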
Looking over the chart so far, it looks like the real onset of the attack was at around 09:00 on 2004-01-24 with a large jump in recruited hosts beginning around 12:00 on the 25th. The first peak at over 20,000 hits per hour occurred at 20:00 on the 26th, and the hit rate has followed a diurnal pattern since then, apparently neither growing nor shrinking. Recruitment of new hosts (never seen before in the attack) appears to be at a constant rate, following the same diurnal pattern. Since hosts are distinguished by IP address, some of this may be dial-up or other machines with floating IP addresses which appear to be new hosts each time they connect.
All of the linked files and the chart have been updated as of this time. It looks like the attack has reached a steady state and is now varying on a diurnal schedule as Windows machines are on-line around the world. We'll see how this develops. Based on disassembly of the MyDoom worm, I've revised my initial conclusion that there was no connection between this attack and that worm. See my Slashdot post for details.
The reports linked to this page and the attack history chart have been updated through 2004-01-30 00:00 UTC (note that times in these reports are local time, which is one hour later than UTC). The hit rate appears to be continuing its consistent daily fluctuation without any substantial growth or diminution.
Now here's something extremely interesting. One of the machines consistently hitting me was behind the firewall of a commercial site, so I wrote that site's administrator to report the situation. They tracked down the source of the packets to a single PC which was found to be running a variety of “spyware” and “adware” packages when scanned by Ad-aware and Spybot-S&D. Further, when these products were instructed to remove the “malware” packages from the computer, it immediately ceased sending packets and hasn't sent one since. Consequently, the working assumption at this point must be that one of the packages identified and removed by these system cleaning tools is responsible for the attack, whether deliberately or accidentally. (One can easily imagine a spyware package checking in with headquarters every four minutes and using an HTTP connection, which sails through most firewalls, to do so. Now suppose somehow it got my IP address instead of the one to which it was intended to “phone home”….) Here is a copy of the Spybot log from the run on that machine. To protect their privacy, I have replaced with “REDACTED” any text which might identify the site which so kindly shared this information with me or the user whose machine was infected; none of the redacted information is relevant to identifying the source of the attack. We now have a list of suspects; perhaps tracking down other machines involved in the attack and cleaning them will permit winnowing the line-up to the actual perp. (Note: by publishing this list of spyware and adware cleaned from the machine which, after said ablution, ceased to sin against Fourmilab, I am not asserting that one of the packages named in this report is, in fact, responsible for the attack or imputing blame to their creators. I'm simply reporting the observation that this machine consistently sent packets every four minutes for more than 50 hours and, immediately after the Ad-aware and Spybot run, ceased to send them. You never know; perhaps the machine was rebooted after the malware was removed, and the actual culprit was a program which had been running but didn't restart after the reboot. We shall see.)
The first evidence that other sites are being hit with this came today in an E-mail responding to my Slashdot post. (Again, I keep the identity of the sender and site confidential.) The administrator of this site reports that he's seen the same pattern of hits to that site's home page for almost a year, starting slowly with several hundred a day and rising to on the order of 100,000 per day at present. The hits have the same characteristic four minute delay I'm seeing. He says he's heard of no other site being hit with these requests prior to my report.
What to make of all of this? Well, the rapid onset I saw looks very much like the propagation of a worm or virus, but may be the uptake of a command into a network of remote-controlled zombies. The malware connection is certainly worth investigating further when the opportunity to do a forensic analysis of another culprit host presents itself. I'm also going to analyse archived access logs to see if there's evidence for this attack prior to the January 21 date which I've used as the starting point for all analyses to date. Wouldn't it be cool if the perpetrator of this attack ran a test from their own machine before loosing this thing into the wild? The game is afoot.
All of the databases and charts are updated as of the date above. There haven't been a lot of great enlightenments in the last 24 hours. I've had a lot of other matters to attend to, and was able to spend only a little time on this investigation and nothing relevant warped in from any of the other sources I monitor. If I were a stock trader, I'd look at the hit chart as a descending triangle which established its base and peak on the 26th and has been following that pattern ever since. Based on this advanced numerological technology, we'd expect a major violation of the baseline of the triangle around 12,500 to signal the end of the attack. The down-spikes yesterday were due to crashes of the regrettable 3Com firewall appliance whose hideous defects I will chronicle in excruciating detail once this adventure is over.
The administrator of the other site (see the last update) who's been seeing these hits for a year furnished me a list of IP addresses which have been hitting them. I cross-correlated this with my list of “heavy hitters” and found these sites to have hit both of us since January 21st, 2004. There's nothing obvious to be gleaned from this list: two are in Brazil and one in Portugal. Some kind of weird lusophone conspiracy? I think not.
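Cross-correlating two lists of addresses amounts to a set intersection. Here is a small Python sketch of that idea; the function names, the one-address-per-line format with optional `#` comments, and the sample addresses (taken from documentation-only ranges) are assumptions, not the actual lists exchanged.

```python
def load_ips(text):
    """Collect the addresses in a one-per-line list, skipping blanks and # comments."""
    ips = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            ips.add(line)
    return ips

def common_hosts(mine, theirs):
    """Return the addresses which appear in both lists, sorted as strings."""
    return sorted(load_ips(mine) & load_ips(theirs))

# Invented sample lists for illustration only.
my_list = """
192.0.2.1
192.0.2.7
198.51.100.3   # heavy hitter
"""
their_list = """
198.51.100.3
203.0.113.9
192.0.2.7
"""

shared = common_hosts(my_list, their_list)
print(shared)
```

A comparison by exact address will miss a host whose floating IP address differed between the two observations, so a short list of matches does not by itself show that the two attacks are unrelated.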
I've updated the status reports and charts for today's hits but, occupied with other matters, haven't had a chance to do much new analysis. From the chart, the magnitude of the attack appears to be subsiding, but that may merely be the effect of the week-end, which causes a comparable diminution of legitimate hits on my home page.
The chart and status reports have been updated. The “descending triangle” pattern I mentioned in the update for 2004-01-31 has now been violated by a higher high on February 1st than that of the previous day. It will be interesting to see what happens when the work week starts once again. Also intriguing is the discorrelation which developed over the week-end between the number of unique hosts hitting me and the total number of hits. These tracked closely before, but now it seems they've parted ways. It may be that fewer hosts are alive on the week-end, and those which hit me more frequently account for a disproportionate share of the total hits. Note that the new hosts haven't changed much but, as I noted previously, some new hosts are almost certainly machines with floating IP addresses which appear as new each time they connect to the Internet.
There have been reports of extortion threats against on-line gambling sites which threatened to take them down with a DDoS attack during the Super Bowl of American football. This made me wonder if the sites attacking me might be a DDoS network in “idle mode” waiting to be deployed against gambling sites which didn't pay up. A drastic decrease in hits on my site during the Super Bowl would be evidence for this. At this writing, the Super Bowl is still in the first quarter and I haven't seen a material decrease in the number of hits. I'll update this tomorrow when I have data for the entire duration of the game in hand.
I've updated the analysis tools available for download from the link at the bottom of this page to include a few new programs I've written. These programs are still abysmally documented, but better than the last time around. The onset.pl program is my attempt to determine the first plausible instance of the attack. I've run this not just on the log starting on January 21st which I've used for most of these analyses, but on logs dating back to December 8th, 2003, and I've found a few potential precursors of the main attack. The activity in this report from 66.250.131.50 is interesting, to say the least. On December 10th, this site hit my home page 300 times without ever hitting anything else for one hour with a mean time of 12 seconds between hits. Then, on Christmas eve, a series of 3761 hits began which ran for 594 consecutive hours until January 18th, 2004, all without a single hit to any other page and an average time between hits of 569 seconds. Nothing remotely like this shows up in the historical logs I've scanned, and this IP address hasn't been seen since. The IP address resolves to a class C network belonging to pilosoft.com, an ISP in New York City. I shall be writing them tomorrow.
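The kind of scan described here, finding hosts which request the home page repeatedly and nothing else, might look something like the Python sketch below. It is not a translation of onset.pl, whose internals are not shown in this document. It assumes the log is in Common Log Format; the function names, the regular expression, the default threshold, and the sample log lines are all invented for illustration.

```python
import re
from datetime import datetime

# Assumed Common Log Format: host ident user [time] "METHOD path proto" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" \d{3} \S+')

def parse(line):
    """Return (ip, time, method, path), or None if the line does not parse."""
    m = LOG_LINE.match(line)
    if not m:
        return None
    ip, stamp, method, path = m.groups()
    return ip, datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z"), method, path

def home_page_only(lines, min_hits=100, methods=("GET", "HEAD")):
    """Map ip -> (hits, first, last) for hosts which requested only '/'."""
    stats = {}            # ip -> [hits, first, last]
    disqualified = set()  # hosts seen requesting anything else
    for line in lines:
        rec = parse(line)
        if rec is None:
            continue
        ip, when, method, path = rec
        if path == "/" and method in methods:
            s = stats.setdefault(ip, [0, when, when])
            s[0] += 1
            s[1] = min(s[1], when)
            s[2] = max(s[2], when)
        else:
            disqualified.add(ip)
    return {ip: tuple(s) for ip, s in stats.items()
            if ip not in disqualified and s[0] >= min_hits}

# Invented log lines; the second host fetches another (hypothetical) page.
sample_log = [
    '192.0.2.10 - - [10/Dec/2003:12:00:00 +0100] "GET / HTTP/1.1" 200 646',
    '192.0.2.10 - - [10/Dec/2003:12:00:12 +0100] "GET / HTTP/1.1" 200 646',
    '192.0.2.20 - - [10/Dec/2003:12:00:05 +0100] "GET / HTTP/1.1" 200 646',
    '192.0.2.20 - - [10/Dec/2003:12:00:06 +0100] "GET /example.html HTTP/1.1" 200 1234',
]
suspects = home_page_only(sample_log, min_hits=2)
for ip, (hits, first, last) in suspects.items():
    print(ip, hits, first.isoformat(), last.isoformat())
```

The mean time between hits for a suspect follows from the first and last times and the hit count, and the `methods` parameter lets the same scan be restricted to one request type.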
The chart and status reports have been updated. With the week-end over, the number of hits has begun to rise to the levels seen late last week (although so far lower than the peak). Interestingly, the divergence between the number of unique hosts and the number of hits which first appeared on 2004-01-30 has persisted. If you examine the heavy hitters report you'll see that the sites that lead the list seem to be hitting much more frequently than the typical rate of once every four minutes. It's as if whatever is doing this can have multiple infections of a given system or launch multiple independent processes which hit independently of one another. As the “multiple hitters” come to dominate the list, the number of hits and number of independent hosts decouple.
From yesterday's data, it's clear there wasn't a drop-off in the attacks during the Super Bowl as I speculated might happen if this was a DDoS network assembled for attacking gambling sites. Of course, maybe the extortionists didn't have a target to direct this particular network toward.
I've run my program which looks for early instances of attacks following the pattern on archived historical HTTP access logs (I have all of them, back to 1994 when the site opened) as far back as August 11th, 2003, and as far back as I've gone I've seen small numbers of sites hitting with the classic profile of the current mass attack. See the early attacks report for details. (Note: I haven't merged items from different log files, so some IP addresses are reported multiple times from the individual logs they were found within. The list is in ascending order of the date of the first packet received from the site.)
All the usual stuff has been updated. I've spent today scanning historical logs to see if I can identify earlier and earlier instances of accesses which match the pattern of the attack. I've found one as early as 2003-01-12 but none in logs from mid-2002. It takes a long time to scan these logs, and I'll post additional details tomorrow. Meanwhile, I've updated the early attacks page to show the older instances I found today.
I also used “snoop” (the Solaris equivalent of “tcpdump”) to monitor and dump incoming packets from a variety of hosts hitting the site. Packets from hosts hitting about every four minutes are identical. The mystery of why a few IP addresses hit much more frequently is now solved: they are proxy servers relaying requests from hosts behind them. I've posted packet dumps of examples of direct hit and proxied requests. Note that the HTTP GET request includes the specification “Pragma: no-cache”, which forces the request, if relayed through a proxy server, to go ahead and hit my site anyway rather than serving a cached copy. Also note that that's the only specification in the HTTP header: no referer, no user agent, or anything else which might identify the program sending the request. Needless to say, these hits don't look like they're coming from a Web browser. Finally, note that the host name is specified: the request is not routed by IP address but includes the host name. This means that the strategy of moving the target site from an IP address wildcard to name-based virtual hosting and discarding all non-“Host:” specified requests, as suggested by one comment on Slashdot, will not block these requests.
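The request profile described above can be expressed as a simple test on the request text. The Python sketch below is only an illustration: the function names are invented, and both sample requests are constructed from the description in the text rather than copied from the packet dumps, so the exact header order and values in them are assumptions.

```python
def parse_request(raw):
    """Split a raw HTTP request into (method, path, headers) with lower-case header names."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    parts = lines[0].split(" ", 2)
    if len(parts) != 3:
        raise ValueError("malformed request line")
    method, path, _version = parts
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, headers

def matches_attack_profile(raw):
    """True for a GET of '/' with Host and 'Pragma: no-cache' but no browser-style headers."""
    method, path, headers = parse_request(raw)
    return (method == "GET"
            and path == "/"
            and headers.get("pragma", "").lower() == "no-cache"
            and "host" in headers
            and "user-agent" not in headers
            and "referer" not in headers)

# Constructed examples, not actual captured packets.
attack_like = ("GET / HTTP/1.1\r\n"
               "Host: www.fourmilab.ch\r\n"
               "Pragma: no-cache\r\n\r\n")
browser_like = ("GET / HTTP/1.1\r\n"
                "Host: www.fourmilab.ch\r\n"
                "User-Agent: ExampleBrowser/1.0\r\n\r\n")

print(matches_attack_profile(attack_like))
print(matches_attack_profile(browser_like))
```

Since the attack-like request carries a Host header, a filter which merely rejects requests lacking one would let it through, which is the point made about name-based virtual hosting.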
I still don't know for sure whether changing the server's IP address would help, but given that the HTTP requests include the domain name, that now seems far less probable. An IP address change might escape currently-hitting hosts, but as you can see from the chart and time evolution table, new hosts are continuously being recruited, and it's hard to imagine that they would include the domain name in the HTTP request but hard-code the IP address rather than doing a DNS lookup for it when the attacking program starts.
The heavy hitters and time evolution reports and the chart earlier in the document have been updated as of an hour ago. I've removed sites with fewer than 400 hits from the heavy hitter list since it was getting a bit long to download.
The peak hit rate remains steady at about 20,000 hits per hour with the divergence between unique hosts and hit rate remaining constant. There's no obvious change in the trend of newly recruited hosts. As I've observed before, there's no way to distinguish a genuinely new host from one with a floating IP address which has changed from one session to the next.
I have continued to scan historical logs and have not so far found anything matching the pattern of the attack earlier than the 2003-01-12 access I reported yesterday. By tomorrow a complete scan from August 2002 through the present should be complete and I'll update the early attacks document to include everything in that interval. The programs I use to scan historical logs are now included in the analysis tools download at the bottom of this page, but they are extremely specific to how the servers are set up here and may be completely useless at sites with different configurations.
A slight fall-off today, although still above the levels of last week-end, as the updated chart and reports show.
Today's real news is the completion of the historical log scan and discovery of the date of the first access which matches the pattern of the current attack. Complete details are in the time evolution document—please read it. Because this information is of ongoing value, I'll leave the details there for any reader who consults that report, as opposed to including them only in this daily update. Eventually, I'll probably integrate that information into this document, but heaven knows when I'll find the time for that. In short, the evidence so far is that the attack began at 18:16:31 local time (17:16:31 UTC) on 2002-12-06 from IP address 68.35.92.91 which resolves to bgp01391858bgs.sequoa01.nm.comcast.net with “HEAD” requests. On 2003-01-12 at 06:22:55 local time, these switched to “GET” requests with only a few “HEAD”s seen in the next few days. This same IP address continued to pound away until 2003-11-25 at 10:49:02 local time, sending a total of 43,249 home page hits during this interval without a single hit to any other page on the site. This IP address has never been seen since in an attack log.
A couple more extremely arcane Perl programs have been added to the analysis tools download at the bottom of this page. These allow “retro-resolving” IP addresses that weren't resolved at the time a report was prepared and then merging the domain names back into the report. The attack onset program now looks for “HEAD” as well as “GET” requests.
Everything has been updated. The hit statistics are about the same as yesterday. There was a sharp down spike in the number of unique hosts and new hosts recruited around 08:00 local time today; I have no idea what this means. Similar down spikes appear elsewhere on the graph, and not necessarily at the same time of day.
I've extended the search for the first attack as far back as 2002-09-27 and found nothing earlier than the 2002-12-06 attack discovered yesterday. This appears to be when it all began.
The number of unique hosts with 100 or more hits passed the 10,000 mark today.
I've been occupied by a multitude of other matters today, so apart from updating the chart and reports, I've done no new analysis nor had any new insights to pursue. I think I may make a program which analyses packet dumps of proxy hosts to tease out the original requesters behind the proxies. This would permit contacting the abuse desks of some of the ISPs which are hitting particularly hard and furnishing them a list of culprits behind their proxies in the hope they may force those users to run antivirus and adware/spyware removal programs, with the goal of obtaining logs which narrow down which is actually responsible for the hits.
There's no significant change in the intensity of the attack today compared to yesterday.
Once again, I've done little other than update the status reports. Today's low was lower than yesterday's, but the high was higher. It will be interesting to see how this develops as we get back into the work week. A discussion with a friend caused me to look at all the hits from .MIL sites; presumably, their system administrators would be highly motivated to track down infected hosts on their networks. There have been a few, but interestingly they all seem to stop before too long, and the most recent of them (an unusually infrequent hitter) last hit on the evening of February 7th. I'll keep an eye on this. The vast majority of sites remain what appear to be home Windows machines.
Still occupied trying to push a particularly recalcitrant project out the door, I've done nothing other than update the chart and reports. Today's high was back at the 20,000 hits per hour high seen last week, so there's no sign of abatement. To avoid hitting the 2 GB file size wall, I had to cycle the HTTP log file tonight, so starting tomorrow the reports will be split into a 2004-01-21 to 2004-02-09 historical report and an ongoing report starting at 2004-02-10.
So much for hopes of the attack abating. Today's update marks the fourth consecutive day of higher highs and the third of higher lows, with the peak hit rate popping above 20,000 hits per hour for the first time since January 28th. Interestingly, while the number of unique hosts tracks the hit rate quite closely, there's no corresponding trend in newly seen IP addresses (although they certainly do continue to show up—today the total number of unique IP addresses seen since the start of the attack passed 14,000).
After a little reflection, I figured out how to merge the old and new HTTP logs when preparing the reports, so I'll stay with merged data until it proves unwieldy.
Today's peak was lower, but there was a brief network outage right around the time of the peak (you can see it in the down-spike in unique hosts and new hosts in today's data on the chart). I'm still working on other things, and in any case haven't had any brilliant insights on how to proceed. None of the commercial sites I've written to report what appear to me to be easy to identify infected hosts have replied to my inquiries so far.
Maximum hits per hour was back over 20,000 for two consecutive hours. And now we sail into Friday the Thirteenth.
Well, I started this Friday the 13th by putting away a ladder, and this afternoon I saw a black cat. Far more interesting, however, is what happened today on the attack front—the end may be at hand. After yesterday's peak, the hits per hour, unique hosts, and newly recruited hosts all declined on the customary diurnal pattern, turned up at a higher low than the previous two days, and then abruptly began to decline like somebody pulled the drain plug out of the bathtub.
Starting at 2004-02-13 13:00 local time (which, interestingly enough, is 12:00 UTC, and hence midnight of the 14th at the International Date Line), hits per hour and unique hosts hitting the site began to fall rapidly and have continued to drop monotonically ever since. Further, new hosts (attacks from IP addresses never seen in earlier attacks) dropped from 104 in the 12:00 local time bin to 34 in the 13:00 time bin, then 23 and 16 in the next two hours. Since then the number has continued to fall, although not monotonically, to a mean of around 10 in the last few hours. All of these are levels not seen since the attack began spooling up in earnest on 2004-01-25. The chart shows how dramatic this abatement has been.
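The three per-hour series discussed here (hits, unique hosts, and hosts never seen before) can be derived from the access log by binning. What follows is only a minimal sketch, written in Python rather than the Perl of the actual analysis tools. The function name `hourly_stats` and the `(timestamp, ip)` record layout are assumptions made for illustration, and the records are taken to be already filtered down to hits belonging to the attack.

```python
from collections import defaultdict
from datetime import datetime


def hourly_stats(records):
    """Bin (timestamp, ip) records by hour.

    Returns {hour: (hits, unique_hosts, new_hosts)}, where new_hosts counts
    addresses that did not appear in any earlier hourly bin.
    """
    hits = defaultdict(int)
    hosts = defaultdict(set)
    for ts, ip in records:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        hits[hour] += 1
        hosts[hour].add(ip)

    seen = set()          # every address observed in earlier bins
    stats = {}
    for hour in sorted(hits):
        new = hosts[hour] - seen
        stats[hour] = (hits[hour], len(hosts[hour]), len(new))
        seen |= hosts[hour]
    return stats


if __name__ == "__main__":
    # Hypothetical sample records, not taken from the real logs.
    sample = [
        (datetime(2004, 2, 13, 12, 5), "192.0.2.1"),
        (datetime(2004, 2, 13, 12, 9), "192.0.2.1"),
        (datetime(2004, 2, 13, 13, 2), "192.0.2.7"),
    ]
    for hour, row in hourly_stats(sample).items():
        print(hour, row)
```

A sudden collapse of the third number while the first two decay more slowly is the pattern described in this update.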
These events put the “Mydoom precursor” hypothesis back into play. Recall that the main attack began a couple of days before Mydoom was unleashed in its full ferocity (although, as noted in earlier updates, there is abundant evidence of low-level attacks matching this pattern for over a year before the main assault began). Further, the requests this site was hit with were precisely of the form directed at SCO—repeated hits on the home page without any other requests. Disassembly and analysis of the Mydoom worm indicated that its attack on SCO had a timeout—it was pre-programmed to cease on February 12th. And lo and behold, if the current trend continues, it looks like the attack against this site began to wind down on a specific date—in this case as February 14 arrived around the globe. (Or, perhaps, it was triggered to stop at noon local time on the 13th, and we're seeing noon pass through the regions most heavily populated with hosts. It should be possible to tease this out of the heavy hitters report by analysing the time of the last packet sent from hosts whose time zone can be inferred from their top level domain.) Of course any analysis based on the local time of Windows machines is necessarily fuzzy because most of these machines do not run NTP or any other clock synchronisation protocol and may have clocks set mildly or wildly incorrectly.
While things are definitely looking up, it bears keeping in mind that the administrator of the other site who's seen similar attacks (see the update for 2004-01-30) observed two intervals during which the attack against his site almost entirely ceased, then spontaneously resumed. We'll see what the morrow brings.
And so, that's it then, apparently. The hit rate and unique hosts have continued to decline almost symmetrically to the ramp-up at the start of the attack, while the new hosts recruited has dropped to near zero as the attack drains down. Who knows where it came from? Who knows what motivated it? Who knows why Fourmilab was the target? Who knows why it started and then stopped when it did? Let's hope this is indeed the end, and not a pause. In any case, complete logs covering the entire period of the attack will remain available for whatever retrospective forensic analysis might help identify the perpetrator of this attack and thereby deter future aggression against other sites.
The hit rate having continued to crater, this appears to be the end of the attack. On that happy (albeit perhaps premature) note, I've linked this page to the Fourmilab home page and updated all the reports with the intent that this will be the last update unless this attack, not over as it presently appears to be, assaults us again.
One should never, ever, be overly optimistic about the spontaneous renunciation of evil. The attack, which started winding down on the 13th of February and rapidly reached negligible levels, came back on the 17th, ramping up as quickly as the initial onset and peaking (at this writing) at half the maximum rate of the first go-round. I've updated the chart and reports, and tomorrow I'll see what insights I can find in the logs from the second coming of the attack.
There's no doubt about it—the second wave of the attack has arrived. So far, the maximum hit rate per hour has been 13183, substantially lower than the peaks near 20000 seen in the first go-round. Unique hosts and new IP addresses seen per hour are correspondingly lower. What is fascinating is that the other site that's being hit with this attack (see the update for 2004-01-30) observed precisely the same cessation and resumption of the attack, at the same time it was seen here. It is virtually certain that both of our sites are being hit with the same thing, but what and why remains a mystery.
The attack continues, with the third day of higher highs since resumption of the attack on 2004-02-17. I've still received no reply from the abuse desk of the ISP whose proxy servers are responsible for the largest number of hits, and they're still hitting me. I did a number of port scans on currently-hitting hosts, and haven't had time to analyse them in depth. I had one big false alarm when I noticed that several machines had port 5000 open, a port which has been used by known trojans. Well, it turns out this port is now used by the “Universal Plug and Play Device Host” service which, idiotically, appears to be on by default in Windows XP. This seems to talk some kind of HTTP protocol, because if you telnet to this port and type in “GET” followed by two carriage returns, you get back an error message which says “HTTP/1.1 400 Bad Request”. All the machines with port 5000 open responded this way. There were a variety of other ports open, but I haven't had a chance to look at them in detail. I found resources which list ports used by known trojans here and here. I also came across this in-depth analysis of a distributed denial of service which provides both technical details of how the attack zombie hosts are remote-controlled and offers an insight into the motivation and psychology of those behind these attacks.
Today's peak was lower, but well within the normal fluctuation and/or wind down toward the week-end. I've spent all day doing signature analysis and investigating amelioration and countermeasure strategies. The tools download has been updated to include a program I'm using in this study, but as it requires a custom Apache log configuration, you can't use it unless you're willing to modify your Apache configuration file accordingly.
The attack continues. The almost complete lack of new hosts recruited between 10:00 and 13:00 local time was the most interesting aspect of the data for 2004-02-22. There were no Internet outages or anomalies during this period, and existing heavy hitters continued to pound away during this interval. I spent all day in the guts of Apache working on mitigation strategies.
The level of the attack is more or less unchanged since yesterday, as the updated chart and reports show. Today I emerged from two days' immersion in the Apache source code with a patch which identifies request packets belonging to the attack whether sent directly from a host or relayed through a proxy. This permits experimentation with “active measures” to mitigate the impact of the attack and/or investigate or retaliate in real time. The first of these active measures was put into operation on 2004-02-22 at 21:46 UTC. I shall not discuss the nature of this or other active measures here, as one never knows who's in the audience. Administrators of other sites under this form of attack may request details and source code (in the form of patches to Apache HTTP Server 2.0.39) by writing me at the E-mail address given above at the end of the original Incident Report document.
This is the first full day since I installed the first Apache fix to respond to the attack. The results are about what I expected. While I was willing to entertain the wildly optimistic notion that it would stop the attack, I never really expected that. What it did do, however, was precisely what I intended it to accomplish—reduce the volume of outbound data for each hit to zero bytes (other than TCP/IP handshaking traffic), and eliminate all overhead within Apache related to retrieving and serving the home page to the requester. You can see the impact of this in the usage statistics chart for February. Note that prior to February 23rd, the Pages, Files, and Hits in the top chart closely tracked one another, but on the 23rd, while the Hits remained high, the Pages and Files counts dropped to typical pre-attack levels; that's the patch in action. Starting around 18:00 local time on the 23rd, the unique IP addresses and new IP addresses per hour started an atypical decline (best seen on the time evolution chart—look at the very right and notice how the blue and green lines made a spike and then began to decline away from the red line, as opposed to following its general shape as before). Tomorrow we'll see if this means anything or is just a blip. I'm still researching alternatives within Apache—I know what I want to try next, but I haven't figured out how to shoehorn it into Apache's memory management and process model. The total number of unique IP addresses seen since the start of the attack topped 20,000 today.
The curious divergence between the hit rate (which continues unchanged) and the number of unique IP addresses and new addresses per hour continues, and at the moment I don't know what to make of it. It is barely (but just barely) possible to envision a scenario where the active measure code I put online at 21:46 UTC on the 22nd is having some kind of delayed effect on the recruitment of new hosts (which would explain why it didn't have an immediate effect), while not affecting hosts behind proxies and/or even causing the proxies to hit harder. It's a real stretch to imagine this, and especially to believe the two effects would just cancel out leaving the total hit rate and diurnal profile the same. In any case, this is something which can be determined from the logs, but it will require putting together a special purpose analysis program since I don't presently distinguish direct hits from those through a proxy (although for the last few days I have been logging information which permits doing so). Tomorrow I'll write and run such a program and report what I find in tomorrow's update.
Today saw a new recent high for the second wave of the attack, with the divergence between hits per hour and unique IP addresses still in evidence. I wrote a program to analyse the contribution of hosts hitting directly compared to those hitting through HTTP proxy servers and produced a chart showing proxy and direct hits and the sum. (The chart begins on 2004-02-22 because that's when I started collecting data which permits distinguishing proxy and direct hits.) As a glance at the chart will reveal, there's been no secular change since that date in the mix of proxy and direct hits, so that cannot explain the divergence of hit rate and unique IPs seen recently. Further, it's clear that although proxies occupy the top slots in the heavy hitters table, they account for a small fraction of the total hits, most of which come from individual machines hitting directly. This is useful information, as one of the active measures I'm thinking about deploying will not work on hosts connected through a proxy. With the total contribution of proxies so small, proceeding with anything which might reduce hits, even if only from directly connecting hosts, is justified. The program used to produce the proxy and direct hit chart has been added to the analysis tools download at the bottom of the page. Note, however, that it requires an HTTP server configured to write a special "forensic log", not one of the standard log formats. Instructions for configuring the Apache HTTP server (version 2) to write such a log are included in the program, proxymix.pl.
Today marked another higher low and higher high. The intensity continues to creep up by all measures, but we're still well below the peaks of the first wave. I've spent all day researching mitigation strategies and hope to begin to test two specific approaches tomorrow using the backup server as a guinea pig. Meanwhile, the total number of hits to the home page since the attack began in late January (counting only those identified as part of the attack, excluding home page references from any IP address which requested any other content from the site) exceeded 10 million yesterday.
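The counting rule used for that total (home page hits only, from addresses which never requested anything else) can be sketched as follows. This is an illustration under stated assumptions, not the program used to produce the reports: the name `count_attack_hits`, the `(ip, path)` record layout, and the choice of `/` as the home page path are all hypothetical.

```python
from collections import Counter


def count_attack_hits(requests, home_paths=("/",)):
    """Count home page hits from addresses that requested nothing else.

    requests: iterable of (ip, path) pairs from the access log.
    Returns (total_attack_hits, {ip: hits}) for the suspected attackers.
    """
    home_hits = Counter()
    fetched_other_content = set()
    for ip, path in requests:
        if path in home_paths:
            home_hits[ip] += 1
        else:
            fetched_other_content.add(ip)

    # An address which ever asked for other content is treated as a
    # legitimate visitor, and all of its home page hits are excluded.
    attackers = {ip: n for ip, n in home_hits.items()
                 if ip not in fetched_other_content}
    return sum(attackers.values()), attackers
```

Taken literally, the rule also counts a visitor who loaded the home page once and left; a stricter variant could require a minimum number of hits per address.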
Today's peak was a bit lower than that of the last two days; this is consistent with what we've seen for previous Fridays. I've spent all day testing the next round of active measures on the backup server, and things have progressed better than I'd expected. Barring any unforeseen “gotchas”, I may be able to migrate these facilities to the main server later today (the 28th) or tomorrow.
If you've been following the time evolution chart, you'll notice some inconsistencies between previous days' data and today's. In the process of checking out the latest round of active measures code, I discovered that in the second wave of the attack, hits may be submitted in either HTTP/1.1 or HTTP/1.0 protocol. In the first wave, only HTTP/1.1 was used. I modified the analysis programs to take into account hits in both protocols, and this revised upward the hit counts for the second wave.
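As a rough illustration of the change, a request-line pattern which accepts either protocol version might look like the sketch below. The pattern and the name `HOME_PAGE_REQUEST` are assumptions, not the expression used in the actual analysis programs; the sketch also accepts “HEAD” as well as “GET”, following the change to the attack onset program mentioned in an earlier update.

```python
import re

# Matches a request for the home page only, in HTTP/1.0 or HTTP/1.1.
HOME_PAGE_REQUEST = re.compile(r"^(GET|HEAD) / HTTP/1\.[01]$")


def is_home_page_request(request_line):
    """True if the request line asks for "/" via GET or HEAD."""
    return HOME_PAGE_REQUEST.match(request_line) is not None
```

A pattern which hard-codes `HTTP/1.1` would silently miss every HTTP/1.0 hit, which is the kind of undercount described above.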
The last few hours are, as a glance at the chart will reveal, a mess. This is due to my rolling out the second round of active measures, which identifies the IP addresses of attacking hosts and automatically blocks them from ever reaching the HTTP server. This is presently in very crude form, and I'm running it mostly to stumble into bugs in the design and implementation. While it's running and doing its job, however, the attack hits should be drastically reduced. I'll have more to say about this attack blocker once it's checked out and put into routine service.
Major perturbations in the Force today. The second round of Active Measures was rolled out, with several hiccups, drastically reducing the impact of the attack upon the site. I'll write more about this experiment after it's had a bit more time to run and interact with the attackers. Meanwhile, if you're really interested in what I'm doing, download the analysis tools and ponder the Perl program gardol.pl. Gardol relies upon the most excellent system-independent kernel-level packet filtering provided by IP Filter.
In the last two days, I have installed tools to identify and block the attack against my site. Introduction of these measures has rendered reports previously used to track the attack misleading. I will update the status of the attack once I have tools in hand which adequately take into account its mitigation by the tools recently deployed.
The attack remediation tools have been in production without any difficulties for the last three days. The following chart shows the attack in its second wave on 2004-02-26 through 2004-02-28 when I first began to test the facilities to block attacking hosts. There were some fits and starts as I tested various versions of the program and pulled them offline for adjustments and refinement. Since early on 2004-03-02 the automatic blocker has been in continuous operation and the chart indicates how effective it has been in attenuating the impact of the attack (which, however, in terms of the rate of arrival of attacking packets continues unabated).
The residual level of attack persists for two principal reasons. First, to avoid accidentally identifying a legitimate user as an attacker, the packet monitor requires that packets both conform to a signature which the attackers match and repeat a specified number of times without sending any other packets not matching the signature. This means that each host newly recruited into the attack will slip a few packets past the filter before it is unambiguously identified as an attacker and blocked. Second, and more significant, is that attack packets which are relayed through ISP HTTP proxy servers are not blocked at the IP address level as are those originating from directly-connected attackers. If proxy servers found to be forwarding attack packets were blocked, innocent users of those servers would find themselves blocked without any explanation. So, I permit the proxy requests to reach the HTTP server, where the second level of defence against the attack discards them. This causes them to show up in the log and appear in the chart above. Since they are discarded immediately upon being identified as attack packets, their impact on the server and outbound network bandwidth (the resources most scarce at this and most Web sites, and hence most vulnerable to an attack of this kind) is negligible. It would be easy enough to block the proxy servers and, say, generate an automatic E-mail to the abuse desk of the ISP they belong to indicating why. If I don't get any more response, not to speak of assistance, on the part of the ISPs who are relaying these attacks by their customers, it may come to that.
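The two-level behaviour described above can be sketched as a small state machine. This is an illustration only, in Python rather than the Perl of the actual tools, and it is not the logic of gardol.pl: the class name `AttackBlocker`, the threshold of 3, and the three outcome labels are assumptions chosen for the example.

```python
class AttackBlocker:
    """Toy model of signature-plus-repetition blocking.

    Outcomes returned by observe():
      "served"    - request did not match the attack signature
      "discarded" - matched the signature and reached the HTTP server,
                    which throws it away (so it still appears in the log)
      "dropped"   - sender is already blocked at the IP level
    """

    def __init__(self, threshold=3):   # threshold value is hypothetical
        self.threshold = threshold
        self.streak = {}               # consecutive signature matches per IP
        self.blocked = set()

    def observe(self, ip, matches_signature, via_proxy=False):
        if ip in self.blocked:
            return "dropped"
        if not matches_signature:
            # Any non-matching request clears the suspicion for this host.
            self.streak.pop(ip, None)
            return "served"
        if via_proxy:
            # Never block a proxy: innocent users share its address.
            return "discarded"
        self.streak[ip] = self.streak.get(ip, 0) + 1
        if self.streak[ip] >= self.threshold:
            self.blocked.add(ip)
        return "discarded"
```

With a threshold of 3, a newly recruited direct attacker gets three hits into the log before it disappears, while hits relayed by a proxy keep appearing indefinitely; both effects contribute to the residual level in the chart.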
In other news, today I heard from an administrator at yet another site which is experiencing this same attack—thousands of hosts hitting, most every four minutes. They are currently seeing about 3200 distinct IP addresses hitting them, which is consistent with what I saw in the second phase of the attack, and also the attack reported by the other site I discussed in the update for 2004-01-30. That makes three sites so far under this attack. One wonders how many more of us there are.
Today the attack blocker intercepted and discarded its five millionth packet since being put into service. I'm in the process of documenting the tools I've developed to respond to the attack, which I'll make available to other sites under attack. Having suffered occasional DDoS attacks on other occasions over the last few years (although nothing remotely as severe as this one), I've tried to make the detection and blocking facility flexible enough to be deployed against most attacks which do not saturate a site's inbound bandwidth.
The remediation tools (attack identifier and dynamic IP blocker) continue to run smoothly and reduce the impact of the attack on server load and outbound bandwidth to almost the noise level. Today, the IP filter blocked its ten millionth attack packet since remediation began. I've updated the post-remediation chart above through 19:00 local time on 2004-03-09.
Having minimised the impact of the attack on the site about as much as possible by measures within the target zone, I've spent the last few days researching possible sources of the attack, and I have found a very promising candidate—not the person or persons responsible for launching the attack, but the software which may be responsible for sending the packets hitting the site. At this point, I have no probative evidence that this software is responsible, but every characteristic of the software in question is completely consistent with the nature of the attack, and I have discovered no discrepancies between what I'm seeing on the receiving end and what I would expect to see were this software responsible for the attack.
I am sure you will be neither shocked nor stunned to learn who is responsible for this software: Microsoft. The facility in question is a "feature" introduced with Windows XP (and which may be retrofitted to Windows 98 and Me) called Universal Plug and Play (UPnP), which had such a catastrophic potential impact that the United States Federal Bureau of Investigation's National Infrastructure Protection Center (now part of the Department of Homeland Security) issued a warning about it on December 20, 2001. Basically, any unpatched XP system (or 98/Me with UPnP installed), becomes a wide-open Web server which can be used to bombard any host on the Internet with packets. Detailed information about the UPnP vulnerability is available in the following documents.
Now, at this point, I have absolutely no hard proof whatsoever that the UPnP vulnerability is what's causing the attack. But what is obvious is that it could be used to mount such an attack and, if it were, the resulting attack would have precisely the characteristics of the attack seen at this and the two other sites who are enduring it. All a potential attacker needs to do is take the already-posted exploit program, plug in the URL of the site to be attacked, then aim it at ranges of IP addresses likely to contain vulnerable PCs, of which there are doubtless bazillions, since any original installation of XP which has not been patched and can receive a UDP packet on port 1900 can be recruited into the attack. What's more, the very first evidence of this kind of attack was seen in the log file here on 2002-12-06, about a year after the vulnerability was publicly announced (see the update for 2004-02-06), and the attack on one of the other sites who've been hit became apparent in early 2002, shortly after the vulnerability was announced and the exploit code published. “If it looks like a duck, and it walks like a duck, and it quacks like a duck…”.
Only actually identifying a host participating in the attack and inspecting its registry settings (or wherever the UPnP poll address is kept) can determine for certain whether UPnP is the source of the attack. But, if it is, one can make some predictions which are falsifiable on the receiving end of the attack. If the evidence seen is inconsistent with the predictions, then UPnP can be ruled out as the cause. The most obvious prediction is the following: Any attacking host which is not behind a firewall, NAT box, or other filter, should have the ports used by UPnP (5000/tcp and 1900/udp) open. If a large population of attacking hosts do not fit this profile, then they are probably not XP machines running UPnP and could not be the source of an attack exploiting it.
For a period of 16 hours, I monitored each newly attacking host and, immediately after receiving an attack packet from it, launched an nmap port scan upon it, checking ports 1900 and 5000. Port 5000 is a TCP port, for which there are three possible results from the scan: “open”, which means the machine will accept connections on that port, “closed”, which means no service is listening on that port, and “filtered”, which generally indicates the port is behind a firewall of some kind. I scanned a total of 3666 attacking hosts, and found the following for port 5000:
Port 5000/tcp result | Hosts |
---|---|
Open | 1010 |
Closed | 753 |
Filtered | 1903 |
The high fraction of “Filtered” results is consistent with most of these hosts being home PCs connected to the Internet with NAT boxes which do not permit inbound connections on TCP ports. A “Closed” result does not necessarily mean the machine is not listening on a port—many firewalls, including the one at this site, respond to attempted connections on prohibited ports as if they were closed—these could be, for example, machines on university networks behind a campus firewall (but which could be recruited into the attack by another machine on the local network behind the firewall). Port scanning takes a while, and since I scanned hosts serially, there was often a delay between receipt of the attack packet and the port scan of its sender. It's possible, therefore, that some hosts may have disconnected in the interim, which would yield a “Closed” status.
Port 1900, where the “NOTIFY” request is sent to direct a vulnerable machine toward the target site, is a UDP port. UDP is a “fire and forget” protocol, and there is no way to distinguish a UDP port blocked by a firewall from an open UDP port. Consequently, a port scan will report only “Open” or “Closed” for port 1900. Here's the result:
Port 1900/udp result | Hosts |
---|---|
Open | 2961 |
Closed | 705 |
This result is largely consistent with the hypothesis, as the number of machines with port 1900 found “Open” (2961) is quite close to the sum of machines with port 5000 “Open” or “Filtered” (2913). Additional “data mining” studies are possible based on the port scan database (for example, do all machines connected through a given ISP or .edu domain have the same filtering profile?). I will report whatever I discover in subsequent updates.
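The arithmetic behind that comparison is easy to check. The short sketch below uses only the counts reported in the two tables; the variable and function names are made up for the example.

```python
TOTAL_HOSTS = 3666

port_5000 = {"Open": 1010, "Closed": 753, "Filtered": 1903}   # TCP
port_1900 = {"Open": 2961, "Closed": 705}                     # UDP


def shares(counts, total=TOTAL_HOSTS):
    """Percentage of scanned hosts falling into each scan result."""
    return {result: 100.0 * n / total for result, n in counts.items()}


# Both tables should account for every scanned host.
assert sum(port_5000.values()) == TOTAL_HOSTS
assert sum(port_1900.values()) == TOTAL_HOSTS

# Hosts whose port 5000 was open or hidden behind a filter,
# compared with hosts whose port 1900 was reported open.
open_or_filtered_5000 = port_5000["Open"] + port_5000["Filtered"]
gap = port_1900["Open"] - open_or_filtered_5000

if __name__ == "__main__":
    print(open_or_filtered_5000, gap)   # 2913 48
    for name, table in (("5000/tcp", port_5000), ("1900/udp", port_1900)):
        for result, pct in shares(table).items():
            print(f"{name} {result}: {pct:.1f}%")
```

The two populations differ by only 48 hosts out of 3666, which is the sense in which the result is “largely consistent” with the hypothesis; it does not by itself show that the same individual hosts make up both groups.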
The final question is, if this is indeed the source of the attack, is there anything that can be done to identify who is responsible for it and/or to cause the attack to abate? Several possibilities come to mind in this regard.
Honeypot. Machines would be recruited into a UPnP DDoS attack by sending UDP messages to their port 1900. Wouldn't it be fascinating to know the IP address from which these recruitment messages are originating? Monitoring external attempts to connect to port 1900 on machines on a local network might permit this. For the last two days I have been monitoring my firewall logs for such traffic, and have seen none. I also checked for attempts to connect to port 5000, as that might be used to determine if the machine was an XP host which was potentially vulnerable. But I haven't seen any port 5000 hits in the firewall log either. Of course, the attacker may not be scanning my piddly little Class C network when there are ISP address spaces rich in Windows machines to harvest.
Reach out and whack someone. Given the exploit code, one obvious approach is to respond to a packet from an attacking host by sending a packet back to its port 1900, directing it to hit somebody else. (One obvious candidate comes to mind!) Since UDP is fire and forget, the only confirmation you'd have that this worked is observing the packets ceasing to hit your site (and, if you've directed the attacker at another site you control, arrive there). Ideally, one would like to test locally with a vulnerable XP machine under your control. Unfortunately, all of the XP machines at this site have been patched and I haven't been able to perform this local test so far. I have tried to set a number of attacking hosts to hit a different machine on my network, without success so far. But then there may be some error in what I'm sending to the remote hosts. If this can be made to work, it could put an end to attacks from non-firewalled machines in short order.
Reply in kind. An attacking host “thinks” it's contacting a Universal Plug and Play server. Well, rather than sending our home page, upon detecting that a host was attacking, we could respond as if we were one, ideally sending something back which would cause the attacker to shut up and not bother us any more. This would be an ideal countermeasure, since it would work even for hosts behind firewalls and NAT boxes—since the attacker initiated the connection to us, our reply would always make it back to the attacker. At this moment, I know next to nothing about the UPnP protocol. Obviously I need to study it in detail.
Another approach would be to counter-attack in the same manner the published “chargen” exploit does—send an endless stream of data back to the attacker, eventually filling up its memory and hanging it in a CPU loop (which may get users' attention enough to motivate them to apply the patches which will fix the vulnerability) or, at least, by taking the attacker down, keep it from hitting us. The problem with this is that to take down an attacker, we'd have to send a constant data stream large enough to fill its memory. Few sites, certainly not this one, have sufficient outbound bandwidth to stream hundreds of megabytes of data to each attacking host. Somebody with a big pipe might consider this, but then the impact of this kind of attack on such a site would be tolerable and may not be worth the trouble trying to respond.
Today I heard from the fourth known site under this attack. The site in question belongs to a small hotel in Warsaw, Poland, and is suffering about 5400 hits to its home page (both HTTP/1.1 and HTTP/1.0) per hour. These attacks have the characteristic repetition rate of once every four minutes from most IP addresses, and are based on the site name, not IP address, as this is a name virtual hosted domain. Reviewing access logs which go back to 2001-08-03, the administrator of the site has established that the attack began on 2004-02-17. The attack packets received at this site have precisely the same blank referer and user agent as seen here and at other attacked sites.
After the installation of the remediation software on 2004-03-03, the intensity of the attack (measured now by the greatly-reduced level of hits received before the packet blocking kicks in for an attacking site) remained more or less constant until March 23–24 when it appeared to fall by about a factor of two. (The spike in the chart is an artefact due to my installing a new version of the remediation software, which results in a short-term increase in attacking packets until the new version “learns” the list of currently attacking IP addresses and blocks them.) The attack remained at this reduced level of intensity until around 2004-04-01, when it started to diminish at a roughly linear rate. By 2004-04-12 the attack had diminished to levels not seen since the “holiday” in mid-February. The number of attacking hosts has declined to a “mere” 400 or so from the thousands seen shortly after the packet blocking was implemented. Looking at the chart above, it's tempting to conclude that something switched off on the first of April, with the population of attackers and intensity of the attack declining ever since.
- Heavy Hitters (Most active sites)
- Time Evolution of the attack
- Earliest Apparent Attacks matching the pattern
- Packet Dumps of representative direct and proxy attackers