The Dorothy2 Wiki


Dorothy2 is a framework for analysing suspicious binaries. Its main strengths are a very flexible modular environment and an interactive investigation framework that pays particular attention to network analysis. It can also recognise newly spawned processes by comparing the running processes with a previously created baseline. Static binary analysis and improved system behaviour analysis are planned for upcoming versions. Dorothy2 analyses binaries using pre-configured analysis profiles. An analysis profile is composed of the following elements:

  • A certain sandbox OS type
  • A certain sandbox OS version
  • A certain sandbox OS language
  • A fixed analysis timeout (how long to wait before reverting the VM)
  • The number of screenshots requested (and the delay between them)
  • A list of the supported extensions, and how the guest OS should execute them
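
To make the list above concrete, here is a minimal sketch of how such a profile could be modelled in Ruby. The struct, its field names, and the command templates are assumptions made for illustration; they are not Dorothy2's actual configuration format.

```ruby
# Hypothetical model of an analysis profile. Field names and command
# templates are illustrative only, not Dorothy2's real configuration keys.
AnalysisProfile = Struct.new(
  :os_type, :os_version, :os_lang,
  :timeout, :screenshots, :screenshot_delay, :extensions,
  keyword_init: true
)

profile = AnalysisProfile.new(
  os_type: "Windows",
  os_version: "XP",
  os_lang: "en",
  timeout: 120,           # seconds to wait before reverting the VM
  screenshots: 3,         # number of screenshots requested
  screenshot_delay: 30,   # seconds between screenshots
  # supported extension => how the guest OS should execute the file
  extensions: { "exe" => "cmd /c start %s", "dll" => "rundll32 %s" }
)

# Returns the guest command for a file, or nil if the profile
# does not support the file's extension.
def launch_command(profile, filename)
  ext = File.extname(filename).delete_prefix(".").downcase
  template = profile.extensions[ext]
  template && format(template, filename)
end
```

With this sketch, a file whose extension is not listed in the profile simply yields nil, so the caller can skip it.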

Profiles let the researcher analyse a set of binaries in different environments. This matters because some malware is configured to run only in a specific environment. A CSIRT might use profiles to test suspicious samples only against an environment that reflects that of its customers. Sources can also be configured to be analysed automatically with certain profiles (e.g. use Profile_Windows_30sc for all the binaries retrieved by Kippo_source).
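
The source-to-profile binding mentioned above could look roughly like the following sketch. The constant names, the "default" fallback profile, and the lookup helper are hypothetical; only Profile_Windows_30sc and Kippo_source come from the example in the text.

```ruby
# Hypothetical binding of sources to the profiles that should analyse them.
SOURCE_PROFILES = {
  "Kippo_source" => ["Profile_Windows_30sc"]
}.freeze

# Assumed fallback for sources without an explicit binding.
DEFAULT_PROFILES = ["default"].freeze

# Returns the list of profile names to use for binaries from +source+.
def profiles_for(source)
  SOURCE_PROFILES.fetch(source, DEFAULT_PROFILES)
end
```

A source may map to several profiles, in which case each retrieved binary would be queued once per profile.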

Dorothy2 is a continuation of my Bachelor's degree final project (Dorothy: inside the Storm), which I presented in February 2009. More information about the whole project can be found on the Italian Honeyproject website.


The framework is composed mainly of five modules, which can also be executed separately. The following picture gives an overview of the current modules and how they are connected to each other.

The Binary Fetcher Module (BFM)

In charge of retrieving the binaries from the configured sources. Currently a “binary source” can be a system folder, an email box, or a host reachable over SSH. Once the binaries have been retrieved, the BFM populates the analysis queue.
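
As a rough illustration of the simplest case, a system folder source, the sketch below walks a directory and appends each file to an in-memory queue. The method name, the queue entry layout, and the SHA-256 de-duplication are assumptions made for this example and do not describe the BFM's real implementation.

```ruby
require "digest"
require "set"

# Hypothetical folder fetcher: enqueues every regular file found in +dir+,
# skipping files whose content (by SHA-256) has already been seen.
def fetch_from_folder(dir, queue, seen = Set.new)
  Dir.children(dir).sort.each do |name|
    path = File.join(dir, name)
    next unless File.file?(path)            # ignore sub-directories

    sha = Digest::SHA256.file(path).hexdigest
    next unless seen.add?(sha)              # add? returns nil for duplicates

    queue << { path: path, sha256: sha }
  end
  queue
end
```

Passing the same `seen` set across several calls would keep a binary that arrives from two different sources from being queued twice.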

The Dorothy analysis engine

In charge of processing the queue by executing the scheduled binaries in a sandbox, and then storing the generated network traffic and screenshots in the analysis folder. It also populates Dorothive with the basic information about each file, and CouchDB with the network pcaps.

The (network) Data Extraction Module (old dparser)

In charge of dissecting the pcap files and storing the most relevant information (flow data, GeoIP info, etc.) in Dorothive. In addition, it extracts all the files downloaded by the sandbox through HTTP/HTTPS and stores them in the binary file's analysis folder.

The (dummy) Webgui

A simple Sinatra application that gives an interactive overview of all the acquired data. WARNING: this module is intended to be run in a controlled environment. The author strongly discourages exposing it on the Internet.

The Java Dorothy Drone

This module was mainly coded by Patrizia Martemucci and Domenico Chiarito. It is not part of this gem and is not publicly available.
The JDrone is our botnet infiltration module; refer to this ppt presentation for an overview.

The first four modules are publicly released under the GPL 3 license as a tribute to the Honeynet Project Alliance. All the information generated by the framework (i.e. binary info, timestamps, dissected network analysis) is stored in a PostgreSQL DB (Dorothive) so that it can be used for further analysis. A NoSQL database (CouchDB) is also used for bulk storage of all the traffic dumps, thanks to the pcapr/xtractr technology.

I started coding this project in late 2009 while learning Ruby at the same time. Since then, I've been changing and improving it as my Ruby coding skills improved. Because of that, you may find some parts of the code not-really-tidy :)




Updated by Marco Riccardi about 9 years ago · 5 revisions