DNS tunneling is a method used to exchange data on top of DNS traffic. It's generally used to bypass filtering equipment and communicate between two hosts.
This article is about the detection of DNS tunnels. Can we find DNS tunnels by analyzing a pcap offline? We will see that the first basic solution (counting subdomains for a domain) is prone to a high number of false positives when applied to real DNS data. We will then look at other approaches which could help with DNS tunnel detection.
1/ DNS tunnels: the theoretical part
The theory here is super easy. If a client in a protected network can send DNS requests and receive responses, then you can mount a DNS tunnel.
The client sends encoded data as a specific subdomain of the attacker's domain and waits for the answer. This way, data is exchanged. The client just has to take care to make each DNS request unique, in order to avoid caching by an intermediary DNS server. Thanks to the multiple types of DNS requests, and to the size of requests, you can get enough bandwidth to run a decent ssh session.
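To make the idea concrete, here is a minimal sketch of how a chunk of data could be packed into a query name. This is not how iodine, dnscat2 or any real tool encodes its traffic: the hex encoding, the sequence-number label (used here only to keep every name unique and so defeat caching), the function names and the domain tunnel.example are all assumptions made for illustration.

```python
import binascii
from typing import Tuple

MAX_LABEL = 63   # a single DNS label cannot exceed 63 bytes
MAX_NAME = 253   # practical limit for a full name in text form


def encode_chunk(data: bytes, seq: int, domain: str) -> str:
    """Build one query name: <seq>.<hex labels...>.<domain> (hypothetical layout)."""
    hexdata = binascii.hexlify(data).decode("ascii")
    # 62 is the largest even size below MAX_LABEL, so a byte is never split
    size = MAX_LABEL - 1
    labels = [hexdata[i:i + size] for i in range(0, len(hexdata), size)]
    # The sequence label changes on every request, so resolvers cannot cache it
    name = ".".join([format(seq, "04x")] + labels + [domain])
    if len(name) > MAX_NAME:
        raise ValueError("chunk too large for a single DNS name")
    return name


def decode_name(name: str, domain: str) -> Tuple[int, bytes]:
    """Server side: recover the sequence number and the payload."""
    prefix = name[: -(len(domain) + 1)]
    seq_label, *labels = prefix.split(".")
    return int(seq_label, 16), binascii.unhexlify("".join(labels))


if __name__ == "__main__":
    query = encode_chunk(b"uname -a", 1, "tunnel.example")
    print(query)
    print(decode_name(query, "tunnel.example"))
```

Whatever the encoding, the important property for detection is the same: every chunk of data produces a new, never-seen-before name under the attacker's domain.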
You can find a lot of resources for building DNS tunnels with a lot of tools: iodine, dnscapy, dnscat2, etc. These tools usually provide some encryption and helpers to exchange data.
2/ Finding them should be super easy by counting requests
The most obvious way to spot a DNS tunnel should be counting the number of subdomains for a specific domain. For each chunk of data, the client has to build a new DNS name under the domain owned by the attacker.
I captured some DNS traffic while using a DNS tunnel (dnscat2 in that case). I estimate that we can type a lot of harmful bash commands in dnscat2 in less than 50 DNS requests. Let's use 50 as a threshold: if a domain has more than 50 subdomains, then you can flag it as a DNS tunnel.
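The counting idea can be sketched in a few lines of Python. This is a minimal illustration and not necessarily what the analysis script used in this post does: the function names are hypothetical, and skipping bare TLDs through min_labels is my own assumption to keep "com" or "net" out of the result.

```python
from collections import defaultdict

THRESHOLD = 50  # flag a domain with more than 50 distinct subdomains


def count_subdomains(names):
    """Map every parent domain to the set of distinct names seen below it."""
    children = defaultdict(set)
    # Normalize and deduplicate: a name queried 1000 times still counts once
    unique = {n.strip().lower().rstrip(".") for n in names}
    for name in unique:
        if not name:
            continue
        labels = name.split(".")
        # Every proper suffix of the name is one of its parent domains
        for i in range(1, len(labels)):
            children[".".join(labels[i:])].add(name)
    return children


def suspicious(names, threshold=THRESHOLD, min_labels=2):
    """Return (domain, count) pairs above the threshold, biggest first."""
    flagged = [
        (parent, len(subs))
        for parent, subs in count_subdomains(names).items()
        if len(subs) > threshold and parent.count(".") + 1 >= min_labels
    ]
    return sorted(flagged, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = ["%04x.t.example" % i for i in range(60)] + ["www.site.example"]
    for domain, count in suspicious(sample):
        print("%s (%d subdomains)" % (domain, count))
```

The input is simply an iterable of query names, one per entry, so the lines of a flat file can be fed to it directly.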
3/ Ok, let's do it the dirty way: bash+python
There are a lot of ways to analyze data. I chose a split approach:
- first, extract the interesting data with tshark and bash to a flat file
tshark -2 -r "$1" -R "dns.flags.response == 1" -T fields -e dns.qry.name > doms_qry
- then analyze data with python
This way is convenient for me because I have easy access to the flat file after extraction and before analysis.
mitsurugi@dojo:~/DNS_analyzer/analyze_doms$ ./do_flat.sh ../dnscat2.pcap
Copied unique domains in doms_qry
Number of domains copied: 244 doms_qry
mitsurugi@dojo:~/DNS_analyzer/analyze_doms$ ./analyze.py
###################results#####################
Longest domains is constitued of 6 labels
210a01a0afcdbdc7ba0c620018e34268b55725c38e3452f6fb89aaddc8dd.7d5f153eaafe56a1e7d7e3599d60b753f400516596638349fd29a0d3468b.0edcc3ced25269ca7e8e75121f858c2f5d5c142f14c21d964051c8a15c0b.97601ee531b3fb8e7c321622c48abf5acc99c629.dnscat2.tux
91db01a0afeebc809c889b001975661a47a87c35a271ab34972c8d321c8c.ae573af81e94756d71c633d32188bb0554f1ade983d00e23f206d5cf20c4.a4b41c7b5960c85fb5b7283286c76ac27a37b49b1a6075a9e5727878430d.bd0a421c95229477fd7bbee07c5768ee411e5d7b.dnscat2.tux
15b601e0afcc2d43946b4a001aa7894e4255e74d2d1a481cd8dc578e7c45.82768a7af119a392e3284104fc821624a12d3deaafd91bf138b99f13e6a2.487a65e38f2e594e6d3728a7660b2d5a62595460b7a133670dbfc308dc4b.6447719f4f6e.dnscat2.tux
a9670169561ddb4bb9e05b0013eed9143bb125481944c4c3d12b21664f9f.0bab3f63a255e1f239b71ba4f3db28d5c3f8415346af21e80ed504912211.e9ea5b237042d1f0ee9fb5351ff93b6faf1c5d1ad26120a4a066973ca5ee.e9962c0ca2ee14724031a6269e54e7c31a1d75dc.dnscat2.tux
acd901695e08afb6284a450014bf31ca85d18fb7f6bc023bb69ddbeb4194.91e90cb2788515712616907396bfa530fe785512ba5846280ee953c79f6a.e469e9dcb61508a2b532dbafc82eebb336f0812678cdbc81f58efd73eac6.7ba69eeac8388fa1deb5267771a1d8c866572cfd.dnscat2.tux
Longest domain name is 235 char long:
210a01a0afbdadc7ba0c620018e34268b55725c38e3452f6fb89aaddc8dd.7d5f153eaafe56a157d7e3599d60b753f40051659663a349fd29a0d3468b.0edcc3ccd25269ca7e8e75121f858c2f5d5c142f14c21d964051c8a15c0b.97601ee531b3fb8e7c321622c48abf5acc99c629.dnscat2.tux
91db01a0afdeba809c889b001975661a47a87d35a271ab34972c8d321c8c.ae573af81e94756d31c633d32188bb0554f1ade982d01e23f206d5cf20c4.a4b41c7b5950c85fb5b7283286c76ac27a37b49b1a6075a9e5727878430d.bd0a421c95229477fd7bbee07c5768ee411e5d7b.dnscat2.tux
a96701695e1dda4bb9e05b0013eed9143bb12f481944c4c3d12b21664f9f.0bab3f63a255e1f2f9b71ba4f3db28d5c3f8415346afb1e80ed504912211.e9ea5b237062d1f0ee9fb5351ff93b6faf1c5d1ad26120a4a066973ca5ee.e9962c0ca2ee14724031a6269e54e7c31a1d75dc.dnscat2.tux
acd901695e08afa1284a450014bf31ca85d18ab7f6bc023bb69ddbeb4194.91e90cb278851571b616907396bfa530fe785512ba6856280ee953c79f6a.e469e9dbb6d508a2b532dbafc82eebb336f0812678cdbc81f58efd73eac6.7ba69eeac8388fa1deb5267771a1d8c866572cfd.dnscat2.tux
######### Following domains ###########
Interesting tld for rank 0
tux(243 subdomains)
######### Following domains ###########
Interesting tld for rank 1
dnscat2.tux(243 subdomains)
You should investigate on domains
dnscat2.tux (243 subdomains)
mitsurugi@dojo:~/DNS_analyzer/analyze_doms$
That works remarkably well. If you find this kind of data in your DNS logs, you should investigate quickly.
4/ But, does this work in real life? (spoiler: no)
Well, this works in a test lab. Now, how does it work in real life?
I have to say that it's incredibly hard to find real DNS data. See the end of this post for details, but I had the chance to get my hands on two full pcap captures from a university. I have to name and thank ISCX (http://www.unb.ca/research/iscx/dataset/iscx-dataset.html): the data is legit for research and analysis. Really, a big thanks.
Now, let's use the exact same programs and see if we can find DNS tunnels in those captures:
mitsurugi@dojo:~/DNS_analyzer/analyze_doms$ ./analyze.py
###################results#####################
Longest domains is constitued of 8 labels
www.deloitte.com.edgekey.net.globalredir.akadns.net
images.apple.com.edgesuite.net.globalredir.akadns.net
images.apple.com.edgesuite.net.globalredir.akadns.net
images.apple.com.edgesuite.net.globalredir.akadns.net
images.apple.com.edgesuite.net.globalredir.akadns.net
images.apple.com.edgesuite.net.globalredir.akadns.net
www.intel-sino.com.edgesuite.net.chinaredirector.akadns.net
s3.getmiro.3.0.com.s3.amazonaws.com
s3.getmiro.3.0.com.s3.amazonaws.com
s3.getmiro.3.0.com.s3.amazonaws.com
Longest domain name is 59 char long:
community-powered-web-search-swicki-swicki.socialsearch.com
community-powered-web-search-swicki-swicki.socialsearch.com
community-powered-web-search-swicki-swicki.socialsearch.com
www.intel-sino.com.edgesuite.net.chinaredirector.akadns.net
######### Following domains ###########
Interesting tld for rank 0
au(54 subdomains); net(1502 subdomains); gov(53 subdomains); jp(96 subdomains); org(195 subdomains); com(3181 subdomains); uk(91 subdomains)
######### Following domains ###########
Interesting tld for rank 1
wordpress.com(52 subdomains); akamaiedge.net(105 subdomains); msn.com(54 subdomains); akamai.net(324 subdomains); edgesuite.net(172 subdomains); akadns.net(98 subdomains); co.uk(73 subdomains); akam.net(51 subdomains); google.com(68 subdomains); yahoo.com(87 subdomains)
######### Following domains ###########
Interesting tld for rank 2
com.edgesuite.net(146 subdomains); g.akamai.net(121 subdomains)
You should investigate on domains
com.edgesuite.net (146 subdomains)
g.akamai.net (121 subdomains)
akamaiedge.net (105 subdomains)
akadns.net (98 subdomains)
yahoo.com (87 subdomains)
co.uk (73 subdomains)
google.com (68 subdomains)
msn.com (54 subdomains)
wordpress.com (52 subdomains)
akam.net (51 subdomains)
mitsurugi@dojo:~/DNS_analyzer/analyze_doms$
OK, so that's a full list of false positives. If I raise the threshold, I may miss some short-lived DNS tunnels.
I think we can clearly say that counting the number of subdomains doesn't work in practice.
5/ Can we do better? (spoiler: yes, sort of)
We have some options here:
5/1/ whitelisting domains
I don't like the idea of whitelisting. Usually, maintaining such lists is a PITA, but it can help you.
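As a rough sketch, whitelisting can be a simple suffix match applied to the flagged domains. The entries below are only the CDN names that showed up as false positives in the capture above, not a vetted list, and the function names are hypothetical.

```python
# Illustrative entries only, taken from the false positives seen above
WHITELIST = {"akamai.net", "akamaiedge.net", "edgesuite.net", "akadns.net", "akam.net"}


def is_whitelisted(domain, whitelist=WHITELIST):
    """True if the domain is a whitelisted entry or a subdomain of one."""
    domain = domain.lower().rstrip(".")
    # Match on whole labels: "notakamai.net" must not match "akamai.net"
    return any(domain == entry or domain.endswith("." + entry) for entry in whitelist)


def filter_flagged(flagged, whitelist=WHITELIST):
    """Drop whitelisted domains from a list of (domain, count) pairs."""
    return [(d, c) for d, c in flagged if not is_whitelisted(d, whitelist)]


if __name__ == "__main__":
    flagged = [("com.edgesuite.net", 146), ("g.akamai.net", 121),
               ("yahoo.com", 87), ("dnscat2.tux", 243)]
    print(filter_flagged(flagged))
```

Matching on whole labels matters: a plain string suffix test would let an attacker register a look-alike name ending in a whitelisted string.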
5/2/ analyzing only requests of type other than 'A' (and 'AAAA').
Well, you'll still end up with a lot of MX and/or SPF records, but it's a good way to lower the noise, so it should help. I didn't see any DNS tunnels relying on A or AAAA types, but hackers have a lot of imagination.
5/3/ calculating entropy of domain.
Yeah, it should work. But you will face other problems: CDN names can look random, and MX names can look totally legit while being DNS tunnel data. More and more file reputation services and cloud antivirus products rely on TXT DNS requests... and they look totally like DNS tunnels (a lot of different TXT requests going to the same domain). So, try at your own risk.
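For reference, a minimal sketch of the entropy calculation: Shannon entropy over the characters of the subdomain part, in bits per character. The function names are hypothetical, and no cut-off value is suggested here because, as said above, legit CDN names can score as high as tunnel data.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def subdomain_entropy(name, parent):
    """Entropy of the labels left of `parent`, with the dots removed."""
    name = name.lower().rstrip(".")
    parent = parent.lower().rstrip(".")
    if not name.endswith("." + parent):
        return 0.0  # name is not below parent, nothing to measure
    sub = name[: -(len(parent) + 1)].replace(".", "")
    return shannon_entropy(sub)


if __name__ == "__main__":
    tunnel = "97601ee531b3fb8e7c321622c48abf5acc99c629.dnscat2.tux"
    print(subdomain_entropy(tunnel, "dnscat2.tux"))
    print(subdomain_entropy("images.apple.com", "apple.com"))
```

Hex-encoded data can never exceed 4 bits per character, so an entropy score is only meaningful when compared with what the rest of your traffic looks like.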
5/4/ Analyzing DNS response instead of requests.
It would be a lot better, combined with the previous one. Still, a hacker would be able to use A or AAAA records to exchange data (the throughput of the tunnel would be lowered by an order of magnitude, but hackers have a lot of time...)
5/5/ Raising threshold
It could work, but we have to agree on a number that is neither too high nor too low. In the end, we must rely on human analysis, and it can't work that way.
6/ Lessons learned
There are some lessons here:
Counting subdomains doesn't work
No, counting subdomains doesn't work at all. It can help to lower the size of the data to analyze, but you can't trust it for sure.
It's hard to grab DNS data (really.)
Sad but true. If you want to share DNS data with me (or know a place where I can get some), I would be grateful. Beware, DNS data is an impressive source of metadata and raises a lot of privacy concerns.
Parsing DNS data is not that easy.
Everybody thinks they know DNS: "you ask for a FQDN, you get an IP" and it's WRONG! You can end up with a lot of RRs, multiple queries, UDP and TCP, endless CNAME requests, and so on. For this kind of research a flat file is enough, but I have to find a better way to parse and analyze this kind of structured data.
...and beyond
You can find a LOT of things inside DNS data (more blog posts to come)
0xMitsurugi
You fool! Don't make me laugh!
~Soulcalibur - Mitsurugi