UK Honeynet Project

Apr 2005 to Oct 2005

1.0 DEPLOYMENTS

1.1a Current technologies deployed (today)

During this period we have had between one and three active honeynets deployed in the UK at any time. This has been a mixture of GenII (Eeyore) systems and GenIII (Roo) systems, all using the Honeynet Project’s bootable Honeywall CDROMs. A range of honeypots was deployed within each of the three active UK honeynets during this reporting period, and Autumn 2005 sees us mid-migration from Eeyore to Roo on our remaining GenII systems. We are also about to start a large new distributed honeynet research project, “Honeynet Central”, and are in the middle of changing ISPs and network provisioning. Our current deployments are therefore:

UKA – UK ISP ADSL with new full class C network

Data Control and Capture
- Roo Honeywall
- Sebek v3
Data Analysis
- Walleye web interface
Honeypots
- Redhat 7.3 unpatched
- Fedora Core 1 unpatched
- Solaris 9 on Sun U10 unpatched
- Nepenthes v0.1.2 (~250 IP)

UKB – UK ISP Data Centre

Down for upgrade to Roo in November, and depending on ISP circuit and hardware delivery lead times, probably replaced with “Honeynet Central” in the next reporting period.

UKC – UK University

Decommissioned due to lack of interesting traffic and shortage of power/aircon/rack space! UKC will be replaced by one of the nodes in “Honeynet Central”.

1.1b Current technologies deployed (entire period)

During the period March 2005 to October 2005 our previously deployed honeynets were:

UKA – UK ISP ADSL – Partial Class C Network

Data Control and Capture
- Honeywall v0.69 – Partial Class C
- Sebek 2.1.7 or 2.5.3
- Test Roo Honeywall – Partial Class C
- Sebek v3
- Roo Honeywall – Partial and New Full Class C
- Sebek v3
Data Analysis
- ACID console
- Sebek web interface
- Walleye web interface
- Honeywall upload script
- Honeywall daily analysis script + email alerting
- Honeysnap daily analysis script
Honeypots
- Redhat 7.3 unpatched
- Fedora Core 1 unpatched
- Fedora Core 3 unpatched
- Windows XP Professional SP2 fully patched
- Solaris 9 on Sun U10 unpatched
- Nepenthes v0.x (~1-250 IP)
- MWcollect v2.x (1 IP)

UKB – UK ISP Data Centre

Data Control and Capture
- Honeywall v0.69 – Partial Class C
- Sebek 2.1.7 or 2.5.3
Data Analysis
- ACID console
- Sebek web interface
- Honeywall upload script
- Honeywall daily analysis script + email alerting
- Honeysnap daily analysis script
Honeypots
- Redhat 7.3 unpatched
- Fedora Core 2 unpatched
- Solaris 9 on Sun U10 unpatched
- Solaris 8 on Sun U5 unpatched

UKC – UK University

Data Control and Capture
- Honeywall v0.69 – Partial Class C
- Sebek 2.1.7
Data Analysis
- ACID console
- Sebek web interface
- Honeywall upload script
- Honeywall daily analysis script + email alerting
- Homebrew scripts
Honeypots
- Redhat 7.3 unpatched

During this reporting period we remained on Eeyore v0.69 for UKB and UKC but routinely upgraded the UKA honeynet during Roo Honeywall development, testing many revised GenIII Roo Honeywall CDROM releases up to and including the current version (and yum updates).

First generation “eeyore” deployments UKA (roo-002a) and UKB (roo-002b) were actively submitting obfuscated data to the Honeynet Project’s centralised database on Kanga on a daily basis, until this facility was replaced by GenIII technology.

1.2 Lessons learned from the technology, what you like about it.

The GenIII Roo system added many useful features (hard disk installation, Fedora based OS, dynamic patching, real time patch management and security updates, Sebek v3 support, web interface, attack trees, etc).

Sebek v3 works well for us, and the arrival of a Windows port means we can begin deploying Microsoft based honeypots again during the next reporting period.

Walleye and Roo should help with more in-depth incident analysis, and we look forward to capturing our first interesting or novel compromise with full Roo logging.

We still find Honeysnap to be the best initial daily indicator of when analysts need to spend time investigating activity on our honeypots. Arthur had been hard at work on a Perl version, but the recent port to Python and subsequent development should ensure that this becomes a better supported and more fully featured tool during the next six months.

We have been particularly impressed with the work done by the MWcollect and Nepenthes teams in Germany (and their active rate of change). Previously we had collected little in the way of Windows based malware as we rarely deployed Microsoft based honeypots due to the levels of unwanted noise they attracted. Vulnerability emulation provides a quick and easy way to monitor malware activity on our netblocks, and we expect to post daily analysis reports and Google Maps in the very near future.

Nepenthes – automatic submission of malware for checking is a nice feature, although again we could do with much better realtime stats. Norman Sandbox and VirusTotal continue to be our first port of call for malware analysis, although Nepenthes has taken over much of the routine work. The daily slew of inbound Norman reports makes understanding what activity is occurring very easy.

Larger networks make a big difference, especially for malware collection.

Our newly developed Honeystick is handy and great for honeynet concept and technology demonstrations. We need to investigate its potential uses, from hands-on training to incident response. Although it works for us, we also need to see how it performs in the real world after a public release and decide if anything further can be produced (such as bootable USB malware collectors, honeyclients, including an attack host with pre-built exploits and adding an entire virtual Internet using honeyd). Any good ideas welcome!

The VMware stealth patch produced by the French Honeynet Project (http://honeynet.rstack.org/tools/vmpatch.c) is useful for quickly disguising VMware honeypots, and hopefully making their use a little safer. We still routinely use the excellent VMware for many development, testing and temporary deployment purposes, and the v5 snapshot improvements are particularly useful.

Plone/Wiki continues to be very useful – we’re starting to share more information, but there is still a long way to go (RSS feeds to push information out, wider-reaching CVS, better mail lists on the way, more (any!) dynamically updated statistical content, Plone based public web site, etc)

Google Maps with dynamic honeynet data are cool, but what useful or interesting information can we extract from them once the wow factor has worn off? We are working on latitude / longitude and ISP / Domain / Country analysis and will add this data to our web site soon. Drillable overlays on a world map would also be useful (such as pie charts, etc). This is one area of research where we hope to make big gains in the coming months.

Cacti / RRD seems to be rather useful for monitoring the health of machines and quickly viewing trends. We plan to make more use of this in the coming months and will merge our current work into the upcoming Roo releases to add web based system management and capacity planning reporting, and into Honeysnap for at-a-glance status monitoring.

1.3 Lessons learned from the technology, what is lacking, what you would like to see improved

A quick cut and paste from our previous report – “Data analysis still remains our main concern. We are good at building and operating honeynets, but analysis of captured data is still very time consuming and requires too much human analyst time. Better data analysis and reporting tools, with greater cross group communication and incident response are essential. Near real time alerting and responses are also highly desirable, which again require more powerful automated analysis tools and techniques”. Although we have recorded fewer incidents during this period, this is still probably our biggest challenge today. The last six months seem to have done little to improve the situation, but we hope that new tools and techniques will begin to make a difference.

System monitoring is also an issue, and we could do with more of this in the Honeywall; for instance, there’s no real point having a honeywall running when its hflowd daemon died several days ago. The same problems occur if log generation fills up the machine’s disk space or another key component is not working correctly.
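The two failure modes above (a dead daemon and a full disk) are easy to check for from a cron job. The following is only a minimal sketch of the idea, not part of the Honeywall: the function names are hypothetical, the process lookup assumes a Linux /proc filesystem that exposes /proc/<pid>/comm, and the 90% threshold is an arbitrary example value.

```python
import os
import shutil


def disk_alert(path, max_used_fraction=0.9):
    """Return a warning string if the filesystem holding `path` is too full."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return None
    used_fraction = usage.used / usage.total
    if used_fraction >= max_used_fraction:
        return "disk holding %s is %.0f%% full" % (path, used_fraction * 100)
    return None


def running_process_names(proc_root="/proc"):
    """Collect process names from /proc/<pid>/comm (assumption: Linux only)."""
    names = set()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                names.add(fh.read().strip())
        except OSError:
            # The process exited between listdir() and open(); ignore it.
            continue
    return names


def missing_daemons(required, running):
    """Return the required daemon names that are not in `running`."""
    running = set(running)
    return [name for name in required if name not in running]
```

A nightly job could call missing_daemons(["hflowd"], running_process_names()) and disk_alert("/var/log") and mail any non-empty result to the analysts; the path and the daemon list here are examples only.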

Recently concerns have been raised regarding the need to shut down a honeypot quickly once it’s suspected that crimes are being committed. Unless you have minute-by-minute honeynet alerting, it’s all too easy to miss illegal actions until some considerable time later when you go through the logs.

In future, don’t move forward to new versions until you have ensured the previous features still exist! Roo sometimes felt a little less stable than Eeyore and operated slightly differently to the way we had previously used our honeywalls (i.e. no centralised data uploads, changes in nightly Honeysnap reporting, etc). The upgrade didn’t always seem to be a full improvement.

Moving to a more complex software system obviously increases the need for good pre-release testing and QA. Sometimes Roo upgrades felt like life on the bleeding edge and it was occasionally a question of crossing your fingers on a live upgrade. Hopefully the Project’s new development and QA process will promote a more structured and systematic approach in the future, although release cycles are likely to slow down as the technology stabilises and is more robustly unit tested.

Having strong, cutting-edge technology is great, but what you do with it is more important. We plan to go back over our historic data and see what extra value we can extract in the coming months (statistics, trends, KYE on IRC activity, etc). Previously we have focused on discrete incidents and events but we need to also complement this work at a macro level.

Probably already been asked, but is there anyone out there who would like to produce a Sebek v3 client for Sparc Solaris? We may well have to stop deploying Sun honeypots if Sebek does not keep up with changes in Roo, and changes in data formats make dual v2/v3 Sebek clients less desirable to deploy in the wild.

When running malware collectors, it’s pretty important to gauge the volume of traffic you’ll be receiving. An inbuilt connection or bandwidth throttle would be a useful addition (as would out of the box reporting if more people are to deploy such sensors).
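To illustrate the kind of throttle we have in mind, here is a minimal token bucket sketch. It is not a Nepenthes or MWcollect feature; the class name and parameters are hypothetical, and the caller supplies the clock value so the behaviour is easy to reason about.

```python
class TokenBucket:
    """Allow at most `rate` new connections per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum burst size
        self.tokens = float(capacity)    # start with a full bucket
        self.last = None                 # time of the previous call

    def allow(self, now):
        """Return True if a connection arriving at time `now` (seconds) may proceed."""
        if self.last is not None:
            elapsed = max(0.0, now - self.last)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A sensor would call allow(time.monotonic()) for each inbound connection and drop or delay those that return False. A bandwidth throttle works the same way with bytes in place of connections.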

2.0 FINDINGS

2.1 Number and type of systems compromised during six month period

UKA
April and May using Eeyore: one honeypot compromised. June to October using Roo: no compromises.

Incident #1 was an unpatched Solaris 9 Sun Ultra 10 system, deployed at 20:50 on the 03/01/05 and compromised at 08:19:58 on the 02/04/05. The attack was a dtspcd exploit by Italian blackhats, who then downloaded and ran “ntpstat”, “psybnc”, “patch-is” and “patch-sun” to install an IRC server and rootkit. Data shared with Italian Honeynet Project and members of the main Honeynet Project.

This was a particularly quiet period for UKA, even after moving to a different full class C network.

UKB
No compromises

This is the first reporting period when we have not recorded a compromised honeypot, although we started the period with a compromised Solaris server still running IRC.

UKC
No compromises (but decommissioned over the summer as university security policy meant network was no longer worth monitoring).

2.2 Highlight any unique findings, attacks, tools, or methods

We have a large amount of black hat IRC activity logged over the previous 18-24 months, including a number of distinct types of blackhat activity. We are currently working on a KYE white paper on IRC trends for publication in 2006.

In terms of malware collection, from a single nepenthes v0.1.2 sensor with 250 IP addresses on a class C network, we recorded the following network worm activity in the last 48 days:

3120416 logged downloads
1248099 logged submissions
3737 unique hexdumps
472 unique binaries

All have been submitted to Norman Sandbox. This equates to around 11 compromises per IP per hour for a vulnerable Windows system on UK ADSL IP space, and an expected life span of around 5 minutes once connected to the Internet.
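The arithmetic behind those two figures can be reproduced as follows. This is a sketch of our working only, and it assumes that each logged download stands for one successful compromise of a vulnerable host, which is the simplification the estimate rests on.

```python
def compromises_per_ip_per_hour(downloads, ip_count, days):
    """Average logged downloads per monitored IP address per hour."""
    return downloads / (ip_count * days * 24)


def minutes_between_compromises(rate_per_hour):
    """Average gap, in minutes, between events at the given hourly rate."""
    return 60 / rate_per_hour


# Figures from the Nepenthes sensor above: 3120416 downloads, 250 IPs, 48 days.
rate = compromises_per_ip_per_hour(3120416, 250, 48)
gap = minutes_between_compromises(rate)
print("%.1f per IP per hour, one every %.1f minutes" % (rate, gap))
```

The rate works out at a little under 11 per IP per hour, with a gap of between 5 and 6 minutes.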

A virus check using ClamAV (0.84/858) identified the binary files as follows:

2.3 Any trends seen in the past six months

During this reporting period we have not documented any particularly unusual activity, and generally our unix based honeypots appear to have remained unhacked for much longer than in previous cycles. Conversely, much more Windows based malware activity has been observed, so we have shifted some of our attention to malware collection and analysis. Current malware collection data and graphical historical traffic analysis reports will be posted to our web site in the coming weeks.

2.4 Document data analysis tools and methods being used

Standard forensic tools: TCPdump, Ethereal, tcprelay, ACID, etc plus Sebek, Chaosreader, Privmsg, Honeysnap and Roo/HFlow/Walleye.

Due to the volume of IRC data we are parsing, basic shell based unix text processing and pattern matching has been in surprisingly regular use!

2.5 For data analysis what tools work well, and what still needs to be developed?

Basic traffic and malware analysis using Cacti makes spotting trends in data very easy. Tools that alert or report on change/statistical variance would be useful. For example, using Nepenthes from 30th October:

This graph clearly illustrates a sudden jump of 550 hex dumps in one hour, and is a zoomed view of the graph presented earlier in this report.
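As a rough illustration of the sort of variance alerting we would like, the sketch below flags any hour whose count is well above the mean of the preceding window. It is not an existing Cacti or Honeysnap feature; the function name, window size and threshold are assumptions, and the sample counts are made up for the example.

```python
from statistics import mean, pstdev


def spikes(counts, window=24, threshold=3.0):
    """Return indexes of counts exceeding mean + threshold * stddev of the prior window."""
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        # Floor the deviation at 1 so a completely flat history
        # does not turn every small change into an alert.
        sigma = max(pstdev(history), 1.0)
        if counts[i] > mu + threshold * sigma:
            flagged.append(i)
    return flagged


# Invented hourly hex dump counts with one sudden jump at index 6.
hourly = [10, 12, 11, 9, 10, 11, 560, 10]
print(spikes(hourly, window=6))
```

A fixed multiple of the standard deviation is a crude test and will need tuning against real traffic, but it is enough to turn a graph that somebody has to look at into an email that arrives on its own.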

During the Chicago 2005 workshop we added a number of new features to the original shell version of Honeysnap, and we think that the following features need to be completed in v1 of the python port:

Basic IRC analysis (via new privmsg.pl)
privmsg messages
timestamp output
spots PING/PONG/NICK/JOIN/MODE/PART/QUIT/NOTICE
attempts to spot bot commands and multiple repeat messages
Spot IRC on other ports (via tethereal)
Dealing with new file format
Possibility to run snort over the input file
maybe not needed in py version?
file extraction and naming of files
ftp filenames
still bugs in PASV filename handling?
http filenames
smtp flows
sebek file extraction (via sbk_extract & friends)
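For readers unfamiliar with the basic IRC analysis listed above, the sketch below shows the general approach in Python: split each raw IRC protocol line into prefix, command and parameters, then count the commands of interest. This is not the code of privmsg.pl or Honeysnap; the function names are hypothetical and the input is assumed to be IRC lines already extracted from the capture.

```python
from collections import Counter

# Commands listed above, plus PRIVMSG itself.
TRACKED = {"PRIVMSG", "PING", "PONG", "NICK", "JOIN",
           "MODE", "PART", "QUIT", "NOTICE"}


def parse_irc_line(line):
    """Split one raw IRC line into (prefix, command, params), or None if empty."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # Optional sender prefix, e.g. nick!user@host
        prefix, _, line = line[1:].partition(" ")
    # Everything after the first " :" is one trailing parameter.
    head, sep, trailing = line.partition(" :")
    parts = head.split()
    if not parts:
        return None
    command = parts[0].upper()
    params = parts[1:]
    if sep:
        params.append(trailing)
    return prefix, command, params


def count_commands(lines):
    """Count occurrences of each tracked IRC command in `lines`."""
    counts = Counter()
    for line in lines:
        parsed = parse_irc_line(line)
        if parsed and parsed[1] in TRACKED:
            counts[parsed[1]] += 1
    return counts
```

Spotting bot commands and repeated messages then becomes a matter of inspecting the last parameter of each PRIVMSG.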

We would also like to see the following additional features developed for v2 of honeysnap in python:

Extra IRC reporting options:
- number of messages
- number of unique talkers
- number of unique hosts
- number of unique channels
- count of messages per channel
- count of talkers per channel
- new channels seen today
- new talkers seen today per channel
- new hosts seen today per channel
- sysops names for each channel
- splits/joins
- average rate of messages per channel
- average rate of messages per talker
- alerts when these rates significantly alter from historical averages
- number of unique key words
- number of unique key words per channel
- top 10 key words per channel
- new words seen per day per channel
Outbound URL reporting (and split GET/POST + USER/PASS requests)
Better download handling (check for success codes first)
FTP username and password reporting
FTP directory listing reporting
String checking against a DB of known:
- IP addresses
- DNS names
- IRC names
- IRC keywords
- Remote servers for download
- Downloaded file MD5sums
- HTTP URLs
- Usernames / passwords
- Filenames
Files in remote directory listings
Mail senders / recipients / subjects
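Several of the IRC reporting options above reduce to simple counting once messages have been parsed. The sketch below covers three of them (messages per channel, unique talkers per channel, new talkers seen today per channel). It is an illustration only, not Honeysnap code; the function name and the (channel, nick, text) tuple layout are assumptions.

```python
from collections import Counter, defaultdict


def channel_report(messages, known_talkers=None):
    """Summarise (channel, nick, text) tuples per channel.

    `known_talkers` maps a channel to the set of nicks seen on previous days.
    """
    known = known_talkers or {}
    per_channel = Counter()
    talkers = defaultdict(set)
    for channel, nick, _text in messages:
        per_channel[channel] += 1
        talkers[channel].add(nick)
    report = {}
    for channel, count in per_channel.items():
        seen_before = set(known.get(channel, ()))
        report[channel] = {
            "messages": count,
            "unique_talkers": len(talkers[channel]),
            "new_talkers": sorted(talkers[channel] - seen_before),
        }
    return report
```

The same pattern extends to hosts and key words, and comparing today's counts against stored historical averages gives the rate change alerts in the list.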

A ‘graphviz’-style connection modeller would also be interesting.

We also intend to start building a local “forensic database” for tracking and correlating distributed data such as probes, attacks, userids, passwords, hostnames, URLs, downloaded files/shellcode, filesums/details, IRC channels, unique strings, etc. Once we have a working prototype we would like to look at making this more open and encouraging some form of (potentially sanitised) data sharing on a global level.
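We have not settled on a design for the forensic database, but a first prototype could be as simple as one table of typed observations per sensor. The sketch below uses SQLite from the Python standard library; the table and column names are hypothetical and chosen only to show how values seen at more than one honeynet could be correlated.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the observation store."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS observation (
               kind    TEXT NOT NULL,  -- e.g. 'md5', 'url', 'irc_channel'
               value   TEXT NOT NULL,
               sensor  TEXT NOT NULL,  -- e.g. 'UKA', 'UKB'
               seen_at TEXT NOT NULL   -- ISO 8601 timestamp
           )"""
    )
    return conn


def record(conn, kind, value, sensor, seen_at):
    """Store one observation."""
    conn.execute(
        "INSERT INTO observation (kind, value, sensor, seen_at) VALUES (?, ?, ?, ?)",
        (kind, value, sensor, seen_at),
    )


def seen_on_multiple_sensors(conn):
    """Return (kind, value, sensor_count) for values seen on more than one sensor."""
    rows = conn.execute(
        """SELECT kind, value, COUNT(DISTINCT sensor)
           FROM observation
           GROUP BY kind, value
           HAVING COUNT(DISTINCT sensor) > 1
           ORDER BY kind, value"""
    )
    return rows.fetchall()
```

Keeping every data type in one narrow table makes cross-sensor queries trivial, at the cost of losing type-specific detail; a real schema would probably add per-type tables alongside it.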

Ethereal continues to be an essential item in the analyst’s toolkit. We have also discussed building an ethereal-like GUI for IRC analysis – something that lets you rapidly and intuitively filter privmsg logs by host/sender/channel/keyword, build groups, follow threads, etc, report on the items above against an applied filter, etc. This would probably be quite useful for IRC heavy log analysis. Anyone interested in writing some code, please get in touch! 🙂

3.0 MISC ACTIVITIES

3.1 Presenting at conferences

Attended Annual Honeynet workshop in Chicago in September 2005.

3.2 Developing, testing or releasing code

Developing:

Honeysnap development – shell and Perl
Modified version of privmsg.pl
Honeystick development and build process documentation
Data visualisation – eg. Google Maps, Cacti / RRD integration

Testing:

Roo (all versions)
Nepenthes v0.x
MWcollect v2.x

Releasing Code:

Honeysnap
New privmsg.pl (Arthur effectively maintaining now)
Honeystick (this is a do-it-yourself concept, rather than code) – see http://www.ukhoneynet.org/honeystick.htm

3.3 Publication of papers

The UK Honeynet Project co-authored the Honeynet Project’s KYE Phishing white paper, published May 2005.

http://www.honeynet.org/papers/phishing

3.4 Involvement in SotM challenges

None in this period.

3.5 Other

Infrastructure:

Steve Mumford has been running an internal Honeynet Project workstream to improve the core Project infrastructure and increase remote collaboration:

Investigation of content management systems that would satisfy the Project’s requirements for a communication server able to store and centralise information
Working on a specification for an internal server suitable for the above solution (thanks to Lance and the Honeynet Project for funding this!)
Thanks to Ryan Smith, Tony Petz and the other UT guys for installing and hosting the hardware
Initial setup of server with Plone, Mailman and AWStats
Honeynet Workshop – working on layout, structure, htDig searching and PGP encryption for mail lists

Future work:

Fulltext indexing for the Plone system
Rolling out maillists to replace current versions
Documentation / howtos
Project management plugins (eg. work trackers)

HoneyStick:

We have created HoneyStick – a bootable USB honeynet system. For more details see http://www.ukhoneynet.org/honeystick.htm

IP Geolocation database:

We have recently purchased a subscription for the IP2LOCATION database and have been experimenting with latitude, longitude, country and ISP analysis. We will make the results of this public once we’ve finalised the presentation.

Other Organisations

During this period we have started to work with the following organisations:

The mwcollect alliance (https://alliance.mwcollect.org/). Joined and have data to submit once everything is ready, and will deploy a v3 sensor next week.
The Leurre Honeynet Project (http://www.leurrecom.org). NDA signed, hardware ready and CDROM downloaded and ready to deploy next week.
Agreed to share UK collected malware with CERT.

4.0 ORGANIZATIONAL

4.1 Changes in the structure of your organization

No significant changes during this period, except for more face to face meetings and free time for Honeynet RnD.

5.0 LESSONS LEARNED

5.1 What positive things can you share with the community, so they can replicate your success

Regular face to face brainstorming sessions are extremely valuable and can generate a lot of interesting ideas – especially over a curry 🙂 We make more concerted progress when we meet up regularly and keep our momentum going.
The Chicago workshop was a great catalyst; perhaps more frequent, regionalised get-togethers would be useful (or meeting up at major events/conferences for an extra couple of days of workshops)?
Our internal group Wiki is proving very useful for group communication and collaboration, especially during a period of time when people were geographically separated.
Tools like honeysnap can take out a lot of the manual labour in operating honeynets, allowing us to easily focus our (fairly limited) resources on incidents worthy of further investigation.
Data captured from IRC sessions has provided us with a much richer data set than IDS and firewall logs, exploits or Sebek keystroke logs. Much of what we have learned about blackhat activity during this period has come from IRC.

5.2 What mistakes can you share with the community, so they don’t make the same mistakes

Although it might be time consuming, always securely wipe a hard drive before installing a new honeypot. In addition, always make a full disk image after Sebek has been tested but before the honeypot is made live, to avoid being unable to easily confirm if a file on the file system was changed.
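Alongside the disk image, a manifest of file hashes taken at the same moment makes the later comparison quick. The sketch below is one way to do it, not a description of our own procedure; the function names are hypothetical, and it is intended to be run against a read-only mount of the image rather than on the live honeypot.

```python
import hashlib
import os


def hash_tree(root):
    """Map each regular file under `root` (by relative path) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks, sockets and device nodes.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            digest = hashlib.sha256()
            with open(full, "rb") as fh:
                for chunk in iter(lambda: fh.read(65536), b""):
                    digest.update(chunk)
            digests[os.path.relpath(full, root)] = digest.hexdigest()
    return digests


def compare(baseline, current):
    """Report paths added, removed or modified between two manifests."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(
            path for path in baseline
            if path in current and baseline[path] != current[path]
        ),
    }
```

Running hash_tree() on the pre-deployment image and again on the post-compromise image, then passing both results to compare(), gives a short list of files to examine first.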

When working on whitepapers, particularly with international groups speaking different languages, agree on a common tool set and control method for changing documents, concrete goals and version control. If you don’t already have it, set up CVS/SVN and use this to manage your drafts.

Always assume that your initial estimates for analysis time will be massively shorter than the reality of sizeable PCAP files or endless Romanian IRC! 🙂

6.0 FUTURE GOALS

6.1a Plans/Goals for previous six months

Re-activate our public web site [met]
Release a small number of useful tools and papers [met]
Work on platform development and testing of next-gen Honeywall CDROM, plus QA process [met]
Participate in Honeynet conferences in Europe (Aachen May 2004) and America (Chicago Autumn 2004) [met]
Start publishing basic attack statistics [almost ready]
Make more time for honeynet research and other activity [ongoing]
Improve timeliness of incident response and data analysis [ongoing]
Design and begin deploying a much larger distributed honeynet system in the UK (“Honeynet Central”), if possible with funding [ongoing]
Begin fundraising and seeking external sponsorship [ongoing]
Produce a bootable LiveCD version of the next generation “roo” platform [not met, but HoneyStick released]
Bring in a couple of additional core team members [not met]
Help to organise logistics for another UK Honeynet conference with the Network Defence team at GCHQ (September 2004) [not met]
Form connections with legal experts in the UK and assess legal position of honeynet data capture – particularly IRC [not met]

6.1b Plans/Goals for next six months

External Systems

Upgrade public website (Plone) and start to post material to it regularly
Use news publishing interface to keep news current and allow more people to contribute
Upload statistics graphs / charts / data to the website automatically
Add dynamic statistics and reporting
Set up / revive a general mail list for interested parties
Start publishing basic attack statistics

Internal Systems

Move internal Wiki to Plone
Improve internal systems reporting
Move to a bigger server for data processing
Improve data backups

Organisational

Make more time for honeynet research and other activity
Continue fundraising and seeking external sponsorship
Bring in a couple of additional core team members
Improve timeliness of incident response and data analysis
Start “handlers diary” for incidents
Develop contacts with organisations such as Jill Dando Cybercrime Research Centre and National Hi-Tech Crime Unit

Events

Participate in DoD / DoE / NSA conference (Washington State Dec 2005)
Help to organise logistics for another UK Honeynet conference with the Network Defence team at GCHQ (Feb 2006)
Form connections with legal experts in the UK and assess legal position of honeynet data capture – particularly IRC

Deployments

Continue to operate existing Roo honeynets
Upgrade UKB to GenIII
Begin deploying a much larger distributed honeynet system in the UK (“Honeynet Central”), if possible with improved funding. More details to be posted to the web site in the coming weeks.
Future honeypots for UKA:
- Ground hog honeypot (self deploying virtual honeypot that automatically extracts downloaded malware and restarts itself)
- Mac Mini honeypot (ready to buy hardware)
- XBox honeypot (ready to buy hardware)
- SGI honeypot (ready to deploy now)
- GSX Server honeypots with FreeBSD, WinXP, Win98 (ready to deploy now)

RnD

Continue developing Honeysnap features and working with Jed Haile to create stable production version for public release
Move to python internally for as much development as possible
Create new data processing scripts for log analysis and data presentation
Release a number of useful tools and articles
Complete KYE Blackhat IRC white paper
Submit a paper to IEEE West Point
Begin another KYE whitepaper
Continue developing improvements to HoneyStick
Produce a bootable LiveCD version of the next generation “roo” platform
Investigate automatic deployment / decommissioning of shortlived honeypots
Evaluate building a dedicated IRC analysis tool
Begin developing “forensic database” concept and produce prototype