Mar 2006 to Apr 2007

1.0 DEPLOYMENTS

1.1a Previous technologies deployed

During this reporting period we changed our previous approach to our UK honeynet deployments and phased out our remaining active physical high interaction honeynets (a mixture of GenII (Eeyore) systems and GenIII (Roo) systems, all using the Honeynet Project’s bootable Honeywall CDROMs).

Our end of life physical honeynet deployment configurations were:

UKA – UK ISP ADSL (full class C network)

  • Data Control and Capture
    • Roo Honeywall
    • Sebek v3
  • Data Analysis
    • Walleye web interface
    • Honeywall daily analysis script + email alerting
    • Honeysnap daily analysis script
  • Honeypots
    • Redhat 7.3 on Intel unpatched
    • Fedora Core 1 on Intel unpatched
    • Solaris 9 on Sun U10 unpatched
    • Malware collection on Debian Intel using Nepenthes (250 IP)
    • Leurre.com UK node (4 IP)

UKB – UK ISP Co-Locate (partial class C network)

This long running UKB Eeyore based honeynet was decommissioned within the first month of this reporting period, with little valuable data being recorded.

1.1b Current technologies deployed

Our aging physical honeynet deployments were replaced with a combination of three main streams of activity:

  1. Ongoing malware collection using subsequent generations of Nepenthes software running on dedicated physical or virtual machines. This activity ran almost continually, based on between 1 and 250 public IP addresses, and submitted malware for analysis to various shared malware repositories such as alliance.mwcollect.org, Norman Sandbox, CWSandbox and VirusTotal.
  2. Regular development and testing deployments of our soon to be released VMWare Server based Groundhog automated high interaction virtual honeynet.
  3. From September 2006, development and test nodes for the Global Distributed Honeynet (GDH) project technology, now live. In a departure from our historical deployment activities, this project is also based on VMWare Server powered virtual honeynets.

Individual members have also deployed technology such as:

  • Honeybow sensor (Windows client honeypot)
  • Various types of sandboxes/sandnets
  • CaptureHPC (Windows client honeypot)
  • GoogleHackHoneypot (single IP)
  • Phphop (single IP)

in more limited local deployments.

1.2 Activity Timeline

The early part of this reporting period was focused on decommissioning our existing honeynet systems and analysing any remaining data. We then moved initially to malware collection and evaluation of various low and high interaction solutions for highly automated malware collection, including development of an automated high interaction honeypot system called Groundhog, which we plan to release to the public and publish more detailed information about in the near future.

The UK Honeynet Project were the official Honeywall CDROM test centre for Roo version 1.1, ending February 2007, helping to establish a more formal QA test process and generating 50+ associated defects in the Bugzilla tracking system.

During April 2006 we set up two identical malware collection sensors with 100 IP addresses each, alternately interspersed on a Class C network with 2Mbit/sec of Internet connectivity. One sensor system was built using the latest version of Nepenthes and one was built using the latest version of MWCollect, to enable a direct comparison of both technologies. The Nepenthes sensor had collected roughly 4x as many malware samples when the test was stopped due to the MWCollect and Nepenthes teams deciding to merge their development efforts into a single Nepenthes work stream. We therefore chose Nepenthes for our long term automated malware collection needs, and have operated a range of sensors continually on multiple Fedora Core and Debian systems since then.

We continued to spend time tracking and investigating various spam and phishing scams, particularly those related to the early period of Rock Phish, and tracked various underworld forums and botnets used for carding, phishing and identity theft. This included regular sharing of data and discussions with other Research Alliance groups.

In August 2006 we proposed that the Honeynet Project and Research Alliance should attempt to design, deploy and operate a distributed honeynet based on existing GenIII honeynet technology. The Global Distributed Honeynet (GDH) initiative grew from this alternative to the on-hold ‘Honeynet Central’ project, and almost all UK honeynet R&D activity has been focused on the GDH project between October 2006 and April 2007, with the UK team leading all aspects of this long term international research project. GDH Phase One is expected to complete during May/June 2007, with monthly internal status reporting having been delivered, and a full status report will be included in our next public bi-annual report.

We have continued to work on Point of Sale (POS) systems and the investigation of their potential as high value honeypots, and hope to publish and present some initial findings in the coming months.

2.0 Findings

2.1 Highlight any unique findings, attacks, tools, or methods

Due to the UK’s full time commitment to GDH for the past six months, almost all unique findings will form part of this project’s status reports and will eventually be published as a public Know Your Enemy: GDH whitepaper later in 2007. Internal status reports about activity around GDH to date are available on the internal project server, with a summary report on the past three months’ activity to be completed by the end of June 2007.

2.2 Any trends seen in the past twelve months

Windows malware attack rates remained significant, with the expected mean time to compromise of an unpatched Windows XP host remaining between 5 and 10 minutes. However, the addition of Windows XP SP2 has improved this situation considerably, and we have just started a study of this increased survivability.

We still experience high volumes of SSH mass scanning activity, coupled with regular brute force or dictionary attacks, on many of our honeypot and production systems. We now run the Kojoney honeypot to analyse some of these incidents.

Compromised honeypots still almost always seem to have a psyBNC IRC bouncer installation, and the most common attacker language continues to be Romanian. Blackhat IRC traffic continues to suggest that mass scanning, compromises of a wide range of systems and malicious activity aimed at financial fraud remain commonplace on the Internet.

2.3 What are you using for data analysis? What is working well, and what is missing, what data analysis functionality would you like to see developed?

Honeysnap continues to be our main daily indicator of when an analyst needs to spend time investigating activity on our honeypots. This tool is under daily development, with Arthur Clune now leading Python activities and a number of others making regular code contributions too. Many new features have been recently added, such as better IRC handling, SOCKS proxy reporting and DNS query extraction, and we have also recently merged in most of the capabilities of our unreleased Honeymine tool.

We recently forked the public honeysnap code base to support GDH’s ongoing data analysis requirements, and a database driven honeysnap_db variant is now well under way, using SQLAlchemy as the ORM and TurboGears for the front end interface. There are a large number of requirements currently being addressed, and we’ll include our todo list and wish list in the first public release. We hope to release working code and more details in the next reporting period, as part of our external publications on the GDH project.

Currently the biggest data analysis issue we face is producing a viable solution to easily determine the attacker’s source IP address for a particular Sebek keystroke at the command line. We have so far failed to find anyone willing or able to solve this challenge (ideally a modification to sebek_extract + sbk_ks_log.pl that optionally inserts the source IP address as a field in each output line), and we firmly believe that this represents the most critical flaw in the Honeynet Project’s technology today. In an ideal world, including the estimated source OS variant (determined via p0f) as well would round things off nicely, so if anyone reading this is interested and up for a challenge, please get in touch!
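
To illustrate the kind of correlation described above, here is a minimal Python sketch. It is not a modification to sebek_extract or sbk_ks_log.pl, and it does not reproduce their real output formats. It assumes that keystroke records and inbound connection records have already been parsed into simple tuples with timestamps on a common clock. The record layouts, field names and the attribute_keystrokes function are all hypothetical.

```python
from bisect import bisect_right
from collections import namedtuple

# Hypothetical, simplified record layouts (NOT the real Sebek or pcap formats).
Keystroke = namedtuple("Keystroke", "timestamp honeypot_ip text")
Connection = namedtuple("Connection", "start end honeypot_ip source_ip")


def attribute_keystrokes(keystrokes, connections):
    """Attach a candidate source IP to each keystroke.

    Heuristic: a keystroke is attributed to the most recently started
    inbound connection to the same honeypot that was still open at the
    keystroke's timestamp. Returns (keystroke, source_ip or None) pairs.
    """
    # Group connections per honeypot, ordered by start time.
    by_host = {}
    for conn in sorted(connections, key=lambda c: c.start):
        by_host.setdefault(conn.honeypot_ip, []).append(conn)

    results = []
    for ks in keystrokes:
        conns = by_host.get(ks.honeypot_ip, [])
        starts = [c.start for c in conns]
        # Index of the first connection that started after the keystroke.
        idx = bisect_right(starts, ks.timestamp)
        source = None
        # Walk backwards through earlier connections to find one still open.
        for conn in reversed(conns[:idx]):
            if conn.end >= ks.timestamp:
                source = conn.source_ip
                break
        results.append((ks, source))
    return results
```

This is only a time-window heuristic. It cannot separate two attackers logged in to the same honeypot at the same moment, and it depends on the keystroke and network timestamps being synchronised.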

3.0 Lessons Learned

3.1 What new positive things can you share with the community, so they can replicate your success?

Managing lots of honeynets in GDH is extremely time consuming. GenIII technology needs to be further developed and improved before this becomes practical in general.

Deploying international nodes through volunteers, without standardised hardware platforms, throws up many logistical challenges. Large scale honeynet deployments are almost certainly easier when standardised honeyfarms can be utilised.

Current data analysis tools are really not yet up to the job of high volume, distributed data analysis of multiple linked incidents. Until our DA tools improve, expect to spend a lot of time manually investigating incidents!

3.2 What new mistakes can you share with the community, so they don’t make the same mistakes?

Don’t assume that all modern PCs and servers will support USB device booting or have floppy disks available. Always assume the worst and offer support for as many different media types as possible.

When processing pcap files, never exclude particular ports by default, in case you miss a rich vein of IRC activity that could be occurring on that particular port!
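
As a minimal sketch of the alternative to port based filtering, the following Python flags IRC-like traffic from payload content alone. It assumes TCP payloads have already been reassembled into byte strings by some other tool. The function names, the command list and the min_hits threshold are illustrative assumptions, not part of Honeysnap or any other released tool.

```python
import re

# A handful of IRC protocol commands, matched at the start of a line,
# optionally preceded by a ":prefix " as sent by servers.
IRC_LINE = re.compile(
    rb"^(?::\S+ )?(?:NICK|USER|JOIN|PRIVMSG|PING|PONG|MODE|NOTICE) ",
    re.MULTILINE,
)


def looks_like_irc(payload: bytes, min_hits: int = 2) -> bool:
    """Return True if a reassembled TCP payload resembles IRC traffic."""
    return len(IRC_LINE.findall(payload)) >= min_hits


def irc_ports(flows):
    """Given (dst_port, payload) pairs, return the ports carrying IRC-like data."""
    return {port for port, payload in flows if looks_like_irc(payload)}
```

Because no port is excluded up front, IRC sessions on unusual ports are reported alongside those on the conventional ones.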

When releasing software, assume no-one will bother to read the installation instructions and code accordingly (ie extremely defensively!).

Try to avoid putting large amounts of effort into R&D activities when a project, tool or service is about to suddenly disappear or stop supporting open access to its data/services!

For distributed honeynet deployments, particularly when using virtual machines (which experience significant clock skew over time), it is essential to enable NTP on the honeypots in addition to the other infrastructure elements. Ideally traffic should be to a single known good NTP source and whitelisted, helping to make it easier to link different types of log data after an incident – particularly Sebek keystroke data.
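
To show why unsynchronised clocks make log correlation harder, here is a small Python sketch that merges events from several sources after correcting each source by a known clock offset. It assumes the offsets have been measured separately, which is exactly the information that is missing when NTP was never enabled. The normalise function, the sensor names and the tuple layout are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def normalise(events, offsets):
    """Shift each event onto the reference clock and merge in time order.

    events  : iterable of (sensor_name, local_timestamp, message)
    offsets : dict mapping sensor_name to the number of seconds by which
              that sensor's clock runs ahead of the reference clock
              (negative if it runs behind); missing sensors are assumed
              to be in sync
    """
    corrected = [
        (ts - timedelta(seconds=offsets.get(sensor, 0.0)), sensor, msg)
        for sensor, ts, msg in events
    ]
    return sorted(corrected, key=lambda e: e[0])
```

For example, a keystroke logged by a honeypot whose clock runs 35 seconds behind can appear to precede the inbound connection that carried it. Once the offset is applied, the two events fall back into the correct order.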

3.3 Are there any research ideas you would like to see developed?

We recently proposed a set of research topics for the Google Summer of Code, which can be found on the Honeynet Project’s main website.

The lack of effective pcap analysis tools remains the achilles heel of current honeynet technology, and much more focused and co-ordinated activity is needed here before we can begin to truly push the boundaries with our research.

Once the Sebek attack source IP to keystroke mapping issue is resolved, the Honeynet Project and research communities should look at the rapidly evolving work on hypervisor based rootkits, such as SubVirt or BluePill, and investigate whether more stealthy solutions to covertly trojaning honeypots can be developed.

A freely distributable and license free open source version of VirusTotal and CWSandbox that could be set up locally would be of substantial benefit to groups like ours with large volumes of malware samples they wish to regularly re-analyse.

Continued development and integration of web application honeypots such as Googlehack Honeypots or PHP.HoP would be ideal, as would development of web crawlers and other forms of Honeyclients, for a balanced set of low/high interaction and manual/automated honeypot research.

We still want to see a Sebek v3 client for Sparc Solaris, since we have effectively had to stop deploying Sun honeypots because Sebek did not keep up with changes in Roo, and changes in data formats made dual v2/v3 Sebek clients less desirable to deploy in the wild. Are there other people out there still operating Solaris Sparc honeypots?

Finally, how about a MacOS version of Sebek too, for the people out there interested in deploying Mac honeypots?

4.0 NEW TOOLS

4.1 What new tools or technology are you working on?

GDH and Honeysnap_db represent our main areas of development focus at the moment, with substantial amounts of work being done in many areas. Small, tactical tools still form the basis of most analysis but we are moving towards the start of more automated data analysis and attack profiling approaches. We intend to continue building a local “forensic database” for tracking and correlating distributed data such as probes, attacks, userids, passwords, hostnames, URLs, downloaded files/shellcode, filesums/details, IRC channels, unique strings, etc. Once we have a working prototype we would like to look at making this more open and encouraging some form of (potentially sanitised) data sharing on a global level.
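
One possible shape for such a “forensic database” is sketched below in Python, using the standard library sqlite3 module. This is an illustrative assumption only and does not describe the honeysnap_db schema. The table name, column names, indicator kinds and helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical single-table layout: one row per observed indicator.
SCHEMA = """
CREATE TABLE IF NOT EXISTS observation (
    id      INTEGER PRIMARY KEY,
    sensor  TEXT NOT NULL,  -- which honeynet node reported it
    seen_at TEXT NOT NULL,  -- ISO 8601 timestamp (UTC)
    kind    TEXT NOT NULL,  -- e.g. 'password', 'url', 'irc_channel', 'filesum'
    value   TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_observation_kind_value
    ON observation (kind, value);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and ensure the schema exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record(conn, sensor, seen_at, kind, value):
    """Store one observed indicator."""
    conn.execute(
        "INSERT INTO observation (sensor, seen_at, kind, value) "
        "VALUES (?, ?, ?, ?)",
        (sensor, seen_at, kind, value),
    )


def seen_on_multiple_sensors(conn):
    """Return (kind, value, sensor_count) for indicators seen on more than one sensor."""
    return conn.execute(
        "SELECT kind, value, COUNT(DISTINCT sensor) AS n "
        "FROM observation GROUP BY kind, value "
        "HAVING COUNT(DISTINCT sensor) > 1 "
        "ORDER BY n DESC, kind, value"
    ).fetchall()
```

Keeping every indicator type in one table makes cross-sensor correlation a single query. A real system would probably need richer per-type tables as well.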

A cut and paste from last time, but: Cacti / RRD into Roo for monitoring honeywall health and quickly viewing trends. We still plan to make more use of this in the coming months and will merge our current work into the upcoming Roo releases to add web based system management and capacity planning reporting, and into Honeysnap for at-a-glance status monitoring. Distributed systems management is now a major requirement for GDH Phase Two too.

Honeynet Widgets – ie Mac or Windows desktop widgets for showing system status, alerts, etc – had to take a back seat for now, but we would still like to develop something in this area eventually, along with delivering alerting and incident summary data via Blackberry or SMS. We hope that this might improve our response to interesting honeynet activity.

We continue to work on a number of standalone non-GDH honeypots, such as Groundhog and POS related honeynet technology, and we intend to launch a new Research Alliance wide initiative on centralised malware collection and analysis (in partnership with the German Honeynet Project) very soon.

4.2 Would you like to integrate this with any other tools, or are you looking for help or collaboration with others in testing or developing the tool?

The centralised malware collection project is going to require the co-operation of many groups, and the development of a number of new tools for reporting and analysis, so assistance would be appreciated here once we make a formal announcement of the project.

We’d still like to see Honeysnap become a core part of the Honeywall, and eventually have the next generation analysis and profiling tools integrated too. Any feedback on real world usage experience, bug reports or feature ideas would be useful. The same goes for Groundhog when it is finally released.

Further testing of Honeystick, general comments on usage experience, etc would be appreciated – particularly with the recent changes in VMWare licensing around Player and Server plus the latest Debian release.

We’d love to see the French Honeynet Project (or others) continue developing their VMWare VM obfuscation patches for the current code releases (hint, hint guys!) 🙂

5.0 PAPERS AND PRESENTATIONS

5.1 Are you working on any papers to be published, such as KYE or academic papers?

Yes. Upcoming papers will be KYE: GDH and potentially some collaborative work with other Honeynet Project and Research Alliance members on the latest botnet and malware observations. Also in the pipeline is an article on IDS implementation and evasion. Still on the back burner is KYE: Blackhat IRC (which may become KYE: Romanians or KYE: Profiling, or another separate paper).

5.2 Are you looking for any data or people to help with your papers?

Our primary goal this year is the ongoing development and operation of GDH, so we are particularly looking for people interested in improving our incident analysis and attacker profiling capabilities, and we will continue to need help supporting GDH (node deployments, data analysis and tool development). Feedback on and improvements to Honeysnap would also be welcomed. More generally, data on specific incidents is always welcome, as are co-authors.

5.3 Where did you publish/present honeypot-related material?

  • March 2006 – DoE/DoD/NSA Honeynet Workshop, Washington State (David Watson – “Attacker Profiling” and “New Honeynet Concepts”)
  • April 2006 – JTF/GNO Honeynet Workshop, Virginia (David Watson – “Attacker Profiling”)
  • April 2006 – Infosec Europe, London (Tareque Choudry and David Watson)
  • August 2006 – Honeynet Project Annual Workshop, Chicago (David Watson, Arthur Clune, Steve Mumford and Jamie Riden)
  • December 2006 – Honeynet Workshop, Calgary (David Watson)
  • January 2007 – ISOI2 Workshop, Seattle (David Watson)
  • January 2007 – UKNOF Workshop, Southampton (Chas Tomlin)
  • February 2007 – JANET CERT Workshop, London (Arthur Clune)
  • March 2007 – EuSecWest07, London (David Watson – “Lightning Talk: GDH”)
  • March 2007 – KYE: Web Application Threats (Jamie Riden)
  • April 2007 – CanSecWest07, Vancouver (David Watson – “Lightning Talk: GDH”)
  • April 2007 – NetWorkshop35, Exeter (Chas Tomlin and Arthur Clune)
  • May 2007 – SOCA Honeynet Workshop, York (David Watson)

6.0 ORGANIZATIONAL

6.1 Changes in the structure of your organization

None, although we gained three new members (Chas Tomlin, Jamie Riden and Jon Stearn).

6.2 Your feedback on Alliance activities.

Plenty of interest from new groups wanting to join the Research Alliance, and another very productive annual workshop, so it is good to see the organisation continuing to grow. Our first annual workshop outside Chicago (Costa Rica, December 2007) should hopefully be an excellent event! 😉

6.3 Any suggestions for improving the Alliance?

 

  • Complete the current Honeynet Project restructuring and remove some unnecessary internal barriers to team working.
  • More data sharing and research collaboration.
  • More regular face to face meetings.
  • More timely release of new tools (ie Honeywall CDROM) and publications of new materials.
  • More use of the internal Plone intranet.
  • Some kind of “dating service” that actively tracks who is interested in what, working on something new or has good ideas going spare that need development assistance – to encourage more inter-Alliance collaboration.
7.0 GOALS

    Three of the UK Honeynet Project members are currently performing leading roles on the international Honeynet Project’s core activities:

    • GDH (David Watson)
    • Honeysnap (Arthur Clune)
    • Internal systems migration (Steve Mumford)

    Because of this, our research agendas have to a large extent merged. The postponed “Honeynet Central” concept was successfully implemented this year as GDH Phase One and, much as expected, 10+ honeynets collecting real time data pretty much swamp the operators with raw data. We expect to expend most of our time and resources in the coming year on addressing this problem.

    7.1 Which of your goals did you meet for the last six months?

    Internal Systems

    • Move internal Wiki to Plone
    • Improve internal systems reporting
    • Move to a more powerful dedicated server for data processing
    • Improve data backups

    Organisational

    • Make more time for honeynet research and other activity
    • Continue fundraising and seeking external sponsorship
    • Bring in a couple of additional core team members
    • Improve timeliness of incident response and data analysis
    • Start “handlers diary” for incidents
    • Improve contacts with CERTs, phishing data collectors, etc
    • Develop contacts with organisations such as Jill Dando Cybercrime Research Centre and National Hi-Tech Crime Unit
    • Form connections with legal experts in the UK and assess legal position of honeynet data capture – particularly IRC

    Events

    • Honeynet Profiling Workshop in Washington DC (April 2006)
    • Present at Infosec Europe 2006 (April 2006)
    • Run at least one track at annual Honeynet meeting in Chicago (August 2006)

    Deployments

    • Continue to operate existing Roo honeynets
    • Upgrade UKB to GenIII (if Honeynet Central still on hold)
    • Decide when to begin deploying a much larger distributed honeynet system in the UK (“Honeynet Central”)
    • Future honeypots for UKA:
      • Groundhog honeypot (self deploying virtual Windows XP honeypot that automatically extracts downloaded malware and restarts itself)
      • Move to Nepenthes only for malware collection

    R&D

    • Continue developing Honeysnap features and working with Jed Haile to create stable production version for public release
    • Move to python internally for as much development as possible
    • Create new data processing scripts for log analysis and data presentation
    • Begin developing “forensic database” concept and produce prototypes of Honeymine
    • Deploy live HoneyPOS systems in the wild and analyse activity
    • Release a number of useful tools and articles
    • Investigate automatic deployment / decommissioning of shortlived honeypots
    • Evaluate building a dedicated IRC analysis tool

    7.2 Which of your goals did you not meet for the last six months?

    External Systems

    • Upgrade public website (Plone) and start to post material to it regularly / blog
    • Use news publishing interface to keep news current and allow more people to contribute
    • Upload statistics graphs / charts / data to the website automatically
    • Add dynamic statistics and reporting
    • Set up / revive a general mail list for interested parties
    • Start regularly publishing basic attack statistics

    Events

    • IEEE S&P Honeynet Track at West Point (June 2006)
    • Begin regular European honeynet meetings (starting in Paris this spring)
    • Help to organise logistics for another UK Honeynet conference with the Network Defence team at GCHQ

    Deployments

    • Future honeypots for UKA:
      • Mac Mini honeypot (ready to buy hardware)
      • XBox honeypot (ready to buy hardware)
      • PS3 honeypot (ready to buy hardware)
      • SGI honeypot (ready to deploy now)
      • GSX Server honeypots with FreeBSD, WinXP, Win98 (ready to deploy now)

    R&D

    • Complete KYE Blackhat IRC white paper
    • Begin another KYE whitepaper (Profiling?)
    • Continue developing improvements to HoneyStick

    7.3 Goals for the next six months

    Déjà vu, but yet another cut and paste from our previous bi-annual reports – “Data analysis still remains our main concern. We are good at building and operating honeynets, but analysis of captured data is still very time consuming and requires too much human analyst time. Better data analysis and reporting tools, with greater cross group communication and incident response are essential. Near real time alerting and responses are also highly desirable, which again require more powerful automated analysis tools and techniques. Data Analysis remains our biggest challenge today.”

    This section contains some items from the last status report, with some additional newer items too:

    External Systems

    • Replace current public website with WordPress blog and start to post material to it regularly
    • Use news publishing interface to keep news current and enable all members to contribute
    • Set up / revive a general mail list for interested parties
    • Start regularly publishing basic attack statistics

    Internal Systems

    • Revamp internal Plone system and consider potential future replacement systems
    • Improve internal systems reporting
    • Move dedicated data analysis server to 3Ware 9650SE RAID6
    • Improve data backups

    Organisational

    • Complete Honeynet Project restructuring and have significant input on future development of Honeynet Project research
    • Better integrate our newer, non-York based members
    • Continue fundraising and seeking external sponsorship
    • Continue to improve timeliness of incident response and data analysis

    Events

    • Attend a number of major upcoming Infosec events (Syscan, BlackHat USA, DefCon, PacSec07, BlackHat Japan)
    • Present on GDH and POS research at one or more upcoming Infosec events (Syscan, BlackHat USA, DefCon, PacSec07, BlackHat Japan)

    Deployments

    • Complete GDH Phase One and 3 month status report
    • Begin planning and rollout of GDH Phase Two
    • Finish rolling out our larger UK Nepenthes deployment (UKC)
    • Launch new centralised malware collection and analysis initiative within the Honeynet Project / Research Alliance
    • Release and publish information on our Groundhog honeypot (self deploying virtual Windows XP honeypot that automatically extracts downloaded malware and restarts itself)
    • Experiment with future honeypots:
      • Mac Mini honeypot (ready to buy hardware)
      • XBox honeypot (ready to buy hardware)
      • SGI honeypot (ready to deploy now)
      • GSX Server honeypots with FreeBSD, WinXP, Win98 (ready to deploy now)

    R&D

    • Continue developing Honeysnap
    • Continue developing “forensic database” concept
    • Deploy live HoneyPOS systems in the wild and analyse activity
    • Decide when to release KYE: GDH whitepaper
    • Consider releasing additional KYE whitepapers
    • Upgrade and re-release HoneyStick
    • Evaluate building a dedicated IRC analysis tool

    8.0 MISC ACTIVITIES

    8.1 Anything else not covered you would like to share.

    Systems support work has continued on the Honeynet Project internal systems over the last year, once again mainly around keeping Plone up-to-date, general systems updates and administration, creation and administration of mailing lists in Mailman, and maintaining backups. Plone extensions for polling / voting seem to work as planned.

    The Honeynet Project began migrating various public and private systems to a new centralised ISP hosting facility in January 2007, and Steve Mumford has been leading this activity for the past few months. Hardware has been sourced and installed, firewalls and VLAN infrastructure are now configured, and we hope to begin moving live services soon.

    The UK group switched from Plone to WordPress for its own internal site in Autumn 2006, and intends to re-launch its public website using WordPress in the coming weeks.