Bitsquatting: DNS Hijacking without exploitation

Bitsquatting refers to the registration of domain names one bit different from a popular domain. The name comes from typosquatting: the act of registering domain names one key press different from a popular domain. Bitsquatting frequently resolved domain names makes it possible to exploit computer hardware errors via DNS. For more details on my bitsquatting research, please see my Blackhat 2011 whitepaper. Someone has posted a YouTube video of my DEFCON 19 talk about this topic. The slides from my DEFCON 19 talk are also available.

Introduction

Computers suffer from errors that manifest as memory corruption of one or more bits. The causes of these errors range from manufacturing defects to environmental factors such as cosmic rays and overheating. While the probability of a single error is minuscule, the amount of Internet-connected hardware is extremely large: there were approximately 5 billion devices connected to the Internet in 2010. The best way to conceptualize a small probability distributed over many rounds is to think of a lottery. The odds of winning the jackpot are infinitesimally small, but given enough players someone will win.
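
To put rough numbers on the lottery analogy, here is a minimal sketch of the "at least one winner" calculation. The per-device probability is a made-up placeholder, not a measured error rate; only the device count comes from the text above.

```python
import math

def prob_at_least_one(p_single: float, trials: int) -> float:
    """Probability that at least one of `trials` independent events occurs."""
    # Equivalent to 1 - (1 - p)^n, computed with log1p/expm1 so that
    # very small probabilities do not vanish to floating-point rounding.
    return -math.expm1(trials * math.log1p(-p_single))

# ASSUMPTION: hypothetical chance that one device hits a relevant bit-error.
P_SINGLE = 1e-9
# Approximate number of Internet-connected devices in 2010 (from the text).
DEVICES = 5_000_000_000

print(prob_at_least_one(P_SINGLE, DEVICES))
```

Even with a one-in-a-billion chance per device, the probability that some device somewhere is affected comes out close to certainty.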

Researchers have exploited bit-errors before in amazing ways. But bit-errors can be detected and exploited in new ways on an Internet-wide scale. One such way is bitsquatting: registering domain names one bit different from frequently resolved domains.

Theory of Operation

When bit-errors occur they can change memory content. Computer memory content has semantic meaning. Sometimes, that meaning will be a domain name. And applications utilizing that memory will use the wrong domain name.

An example can illustrate this more clearly. The following is a binary representation of cnn.com:

01100011 01101110 01101110 00101110 01100011 01101111 01101101
c        n        n        .        c        o        m

Let's suppose you are using a computer with a bad memory module. You browse to a page (like this one) with a link to cnn.com. You click the link. How many times is the binary representation of cnn.com copied in your computer's memory? As I am writing this, I can think of several:

  • by the TCP/IP stack from kernel to user mode [varies by implementation]
  • by your browser when it parses HTML
  • ... and when it creates an internal representation of the DOM tree
  • ... and when it creates a new HTTP request
  • by your OS APIs during domain resolution

Let's further suppose one of these copy operations writes to the faulty memory module. The binary representation changes by one bit. It now represents con.com.

01100011 01101111 01101110 00101110 01100011 01101111 01101101
c        o        n        .        c        o        m

Upon clicking the link your browser will navigate to con.com instead of cnn.com.
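
The single-bit change above can be reproduced in a few lines. This is only an illustrative sketch; the helper names `to_bits` and `flip_bit` are my own and not part of any tool mentioned here.

```python
def to_bits(name: str) -> str:
    """Render an ASCII name as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in name.encode("ascii"))

def flip_bit(name: str, index: int, bit: int) -> str:
    """Flip one bit (0 = least significant) of the character at `index`.

    Only bits 0-6 are meaningful here; flipping bit 7 leaves the ASCII range.
    """
    data = bytearray(name.encode("ascii"))
    data[index] ^= 1 << bit
    return data.decode("ascii")

original = "cnn.com"
# Flip the lowest bit of the second character: 'n' (0x6E) becomes 'o' (0x6F).
corrupted = flip_bit(original, index=1, bit=0)

print(to_bits(original))
print(to_bits(corrupted))
print(corrupted)  # con.com
```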

Experiment

The concept behind the experiment is simple: if bit-errors are indeed mutating domain names in device memory, then these devices must resolve and connect to these bitsquat domains. Therefore bitsquats of frequently resolved domains should be visited by devices around the world.

Execution of the experiment is not so simple. First is the problem of choosing domains to bitsquat. There is a difference between popular websites and frequently resolved domains. There are many frequently resolved domains that few people type or know. These domains belong to the content delivery and advertising networks of the Internet; domains such as fbcdn.net, 2mdn.net, and akamai.com. Content delivery and ad domains also make the best experimental targets as these domains are extremely unlikely to be typed by people. Second, every DNS query must be answered with two answers: one for the original domain and one for the bitsquat domain. This is because the original requestor may be requesting a response for the original name, and will discard responses for invalid domains. For more on this, please read the whitepaper or look at the slides.

For my experiment I registered the following domains. Note: the registration for these has since expired and they are no longer owned by me.

Bitsquat Domain   Original Domain
ikamai.net        akamai.net
aeazon.com        amazon.com
a-azon.com        amazon.com
amazgn.com        amazon.com
microsmft.com     microsoft.com
micrgsoft.com     microsoft.com
miarosoft.com     microsoft.com
iicrosoft.com     microsoft.com
microsnft.com     microsoft.com
mhcrosoft.com     microsoft.com
eicrosoft.com     microsoft.com
mic2osoft.com     microsoft.com
micro3oft.com     microsoft.com
li6e.com          live.com
0mdn.net          2mdn.net
2-dn.net          2mdn.net
2edn.net          2mdn.net
2ldn.net          2mdn.net
2mfn.net          2mdn.net
2mln.net          2mdn.net
2odn.net          2mdn.net
6mdn.net          2mdn.net
fbbdn.net         fbcdn.net
fbgdn.net         fbcdn.net
gbcdn.net         fbcdn.net
fjcdn.net         fbcdn.net
dbcdn.net         fbcdn.net
roop-servers.net  root-servers.net
doublechick.net   doubleclick.net
do5bleclick.net   doubleclick.net
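
Each registered name above differs from its original by a single flipped bit. The following is a rough sketch of how such candidates could be enumerated. It is not the bitsquat.py tool, and as a simplification it only flips bits in the part of the name before the first dot.

```python
import string

# Characters allowed in a hostname label: lowercase letters, digits, hyphen.
VALID = set(string.ascii_lowercase + string.digits + "-")

def bitsquat_candidates(domain: str) -> list[str]:
    """Return names one bit away from `domain` that are still valid hostnames."""
    label, _, suffix = domain.lower().partition(".")
    found = set()
    for i, ch in enumerate(label):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit))
            # Skip results outside the hostname alphabet. This also drops
            # case-only changes, since DNS names are case-insensitive.
            if flipped not in VALID:
                continue
            candidate = label[:i] + flipped + label[i + 1:]
            # A label may not begin or end with a hyphen.
            if candidate.startswith("-") or candidate.endswith("-"):
                continue
            found.add(f"{candidate}.{suffix}")
    return sorted(found)

print(bitsquat_candidates("2mdn.net"))
```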

I used a Python script to answer DNS queries and Apache to log incoming HTTP requests, then waited for connections. To my surprise, devices connected.
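
The script itself is not reproduced here. As a rough illustration of the two-answer response described earlier, a reply could be assembled along the following lines. Everything in this sketch is an assumption: the function names, the 192.0.2.1 documentation address, and the choice to mark the reply authoritative. A real server would also need a UDP socket bound to port 53 and proper error handling.

```python
import socket
import struct

def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed DNS labels."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"

def a_record(name: str, ip: str, ttl: int = 60) -> bytes:
    """Build one A resource record (TYPE=1, CLASS=IN)."""
    rdata = socket.inet_aton(ip)
    return encode_name(name) + struct.pack("!HHIH", 1, 1, ttl, len(rdata)) + rdata

def question_section(query: bytes) -> bytes:
    """Extract the first question (name, QTYPE, QCLASS) from a query."""
    pos = 12  # the DNS header is always 12 bytes
    while query[pos] != 0:
        pos += 1 + query[pos]
    return query[12:pos + 5]  # terminating zero byte + 2-byte type + 2-byte class

def build_response(query: bytes, names: list[str], ip: str) -> bytes:
    """Answer `query` with one A record per entry in `names`."""
    flags = 0x8400  # QR=1 (response), AA=1 (authoritative), RCODE=0
    header = query[:2] + struct.pack("!HHHHH", flags, 1, len(names), 0, 0)
    answers = b"".join(a_record(n, ip) for n in names)
    return header + question_section(query) + answers

# Hypothetical query for a bitsquat name, answered for both names.
query = (struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
         + encode_name("micrgsoft.com") + struct.pack("!HH", 1, 1))
response = build_response(query, ["micrgsoft.com", "microsoft.com"], "192.0.2.1")
print(len(response), "bytes")
```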

Experimental Findings

The following findings are based on my Apache logs from September 26, 2010 to May 5, 2011. Log entries due to search engine crawlers and web-app vulnerability scans were manually filtered. As the process was manual, some crawler/scanner requests may still be counted in these statistics.

Finding 1: Bit-errors can be exploited via DNS

During the logging period there were a total of 52,317 bitsquat requests from 12,949 unique IP addresses. When not counting 3 events that caused extraordinary amounts of traffic, an average of 59 unique IPs per day made HTTP requests to my 32 bitsquat domains. These requests were not typos or other manually entered URLs, and some show signs of several bit errors. Here are some actual examples (with personal data removed):

static.ak.fjcdn.net 109.242.50.xxx "GET /rsrc.php/z67NS/hash/4ys0envq.js HTTP/1.1" "http://www.facebook.com/profile.php?id=xxxxxxxxxx" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; WOW64; Trident/4.0; GTB6.5; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.30729; .NET CLR 3.5.30729; InfoPath.2; Hotbar 11.0.78.0; OfficeLiveConnector.1.5; OfficeLivePatch.1.3; AskTbZTV/5.8.0.12304)"
msgr.dlservice.mic2osoft.com 213.178.224.xxx "GET /download/A/6/1/A616CCD4-B0CA-4A3D-B975-3EDB38081B38/ar/wlsetup-cvr.exe HTTP/1.1" 404 268 "Microsoft BITS/6.6"
s0.2ldn.net 66.82.9.xxx "GET /879366/flashwrite_1_2.js HTTP/1.1" "http://webmail.satx.rr.com/_uac/adpage.html" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; HPNTDF; AskTB5.2)"
mmv.admob.com 109.175.185.xxx "GET /static/iphone/img/app@2x.png HTTP/1.1" "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_1 like Mac OS X; HW iPhone2,1; en_gb) AppleWebKit/525.18.1 (KHTML, like Gecko) (AdMob-iSDK-20101108; iphoneos4.2)"

Finding 2: Not all bit-errors are created equal

Some machines control considerably more traffic than others. While a bit-error in the memory of a PC or phone will only affect one user, a bit-error in a proxy, recursive DNS server, or a database cache may affect thousands of users. Bit-errors in web application caches, DNS resolvers, and a proxy server were all observed in my experiment. For instance, a bit-error changing fbcdn.net to fbbdn.net led more than a thousand Farmville players to make requests to my server.

Finding 3: Mobile and embedded devices may be more affected than traditional hardware

The following graphic shows a comparison of HTTP User-Agents from visitors to Wikipedia during March of 2011 to User-Agents visiting my bitsquat domains. The "other" column, which includes various phones, game consoles, and other embedded devices, was considerably more prevalent in the bitsquat visitors. Curiously, there were considerably fewer MacOS User-Agents visiting bitsquat domains than there were visiting Wikipedia. I do not have an explanation as to why this is so.

Finding 4: Bitsquat traffic represents a slice of normal traffic

The visitors to my bitsquat domains came from all over the world and included every major operating system and embedded platform. While there were considerable differences in the percentage of visitors using MacOS and mobile platforms, the percentage of visitors using Windows, Linux, Android and iPhones was approximately the same as that of Wikipedia visitors. Additionally, for the visitors determined to be in the United States via a geoip database, a diurnal pattern corresponding to computer use can be observed.

Finding 5: HTTPS/TLS will not help. DNSSEC will help a tiny bit.

HTTP/1.1 includes a header field called Host. This field is populated with the domain the client thinks it connected to. If the Host header contains the bitsquat domain, then a bit-error occurred before domain resolution. If the Host header contains the original domain, the error occurred during domain resolution. In 96% of the cases, the bit-error had occurred prior to DNS resolution.
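
The Host header test can be expressed as a small classifier. This is a sketch only: the `BITSQUATS` mapping holds a few pairs from the table of registered domains, and the function name and return labels are my own choices for illustration.

```python
# A few bitsquat -> original pairs taken from the table of registered domains.
BITSQUATS = {
    "fbbdn.net": "fbcdn.net",
    "mic2osoft.com": "microsoft.com",
    "2ldn.net": "2mdn.net",
}

def classify(host_header: str) -> str:
    """Guess where the bit-error happened from the HTTP Host header value."""
    # Normalise: lowercase, drop an optional :port and a trailing dot.
    host = host_header.lower().split(":")[0].rstrip(".")
    for squat, original in BITSQUATS.items():
        if host == squat or host.endswith("." + squat):
            # The client already believed it was talking to the bitsquat name.
            return "before resolution"
        if host == original or host.endswith("." + original):
            # The client asked for the right name but reached the bitsquat server.
            return "during resolution"
    return "unknown"

print(classify("static.ak.fbbdn.net"))
print(classify("s0.2mdn.net:80"))
```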

Transport security technologies such as SSL and TLS are designed to protect the confidentiality, authenticity and integrity of data moving between two nodes. Bit-errors most frequently happen to data when it is "at rest" on one of the nodes. DNSSEC will only resolve the 4% of the cases where a bit error occurred in the resolution process.

The Data

The PCAPs of all DNS traffic are available here: (dnslogs.tar.7z, 56 MB, 7zip compressed)
The HTTP log entries may contain personal information of random people and hence will not be publicly released. If you have legitimate research interest in the HTTP logs, please contact me.

A tool for quick identification of potential bitsquat domains is available here: (bitsquat.py, github)

Further research

Duane Wessels of Verisign looked for evidence of network level bit-errors in DNS queries as seen at domain roots. His findings indicate "that bit-level errors in the network are relatively rare and occur at an expected rate." (emphasis mine). The goal of his research was to determine if the 4% of requests with a non-bitsquat Host header were due to corruption of UDP packets after transmission. The final determination was that the packets were very unlikely to be corrupted during transmission on the network. In his own words: "We believe that UDP checksums are effective at preventing 'bitsquat' attacks and other types of errors that occur after a DNS query leaves a DNS resolver and enters the network. Bitsquat errors that occur prior to entering the network, however, will not benefit from UDP checksums since the sender calculates its checksum over the erroneous data."
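
The point about checksums can be illustrated with a toy example. The sketch below uses the standard 16-bit one's-complement Internet checksum, but as a simplification it is computed over the payload alone. A real UDP checksum also covers a pseudo-header and the UDP header. The helper names are my own.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum, as used by UDP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def flip_bit(data: bytes, index: int, bit: int) -> bytes:
    """Return a copy of `data` with one bit flipped."""
    out = bytearray(data)
    out[index] ^= 1 << bit
    return bytes(out)

payload = b"cnn.com"

# Case 1: corruption on the network, after the sender computed the checksum.
sent_checksum = internet_checksum(payload)
received = flip_bit(payload, 1, 0)  # b"con.com"
detected_in_transit = internet_checksum(received) != sent_checksum

# Case 2: corruption in the sender's memory, before the checksum was computed.
corrupted_at_sender = flip_bit(payload, 1, 0)
sent_checksum = internet_checksum(corrupted_at_sender)
detected_at_sender = internet_checksum(corrupted_at_sender) != sent_checksum

print(detected_in_transit, detected_at_sender)  # True False
```

In the second case the receiver has nothing to compare against: the checksum faithfully describes data that was already wrong.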

I fully encourage anyone reading this to replicate my experiments and share their findings. If you would like more information, please feel free to contact me.