Android Dynamic Code Analysis - Mastering DroidBox

In this article I'll take a closer look at DroidBox, which provides a mobile sandbox for examining Android applications. In the previous post I dealt with static code analysis. This time we'll run our malicious application and look at the "noise" it generates, namely:

  • file system access
  • network activity
  • interaction with the operating system
  • interaction with other applications
  • etc.

DroidBox is very easy to use. It consists of its own system image and kernel, built to log a single application's activities. Using adb logcat, DroidBox looks for certain debug messages and collects anything related to the monitored app. However, I must say that the logged data isn't always complete. Sometimes you'll get only a truncated version of the data that caused the activity. In that case it's almost impossible to take a deep look at, for example, the network traffic (especially HTTP): you won't be able to reconstruct a full request-response sequence because data is missing. Nevertheless, you can use DroidBox to get an overview of the malicious activities triggered by the app. For a more technical analysis of the data you'll need additional tools (more to come in future posts).

Requirements for DroidBox

First you'll have to install a few things DroidBox depends on. Start by making sure the relevant system packages are installed:

~# apt-get install python-virtualenv libatlas-dev liblapack-dev libblas-dev

You'll need those in order to use scipy, matplotlib and numpy along with DroidBox. Now create a virtual environment and install the Python dependencies:

~/work/apk# mkdir env
~/work/apk# virtualenv env
...
~/work/apk# source env/bin/activate
(env) ~/work/apk# pip install numpy scipy matplotlib

Install DroidBox

Download the package:

(env) ~/work/apk# wget https://droidbox.googlecode.com/files/DroidBox411RC.tar.gz

Setup PATH

In [2]:
import os
import sys

# Append the Android SDK tool directories to PATH
sdk = "/root/work/apk/SDK/android-sdk-linux"
sdk_dirs = [sdk + "/tools", sdk + "/platform-tools", sdk + "/build-tools/19.1.0"]
os.environ['PATH'] = os.pathsep.join([os.environ['PATH']] + sdk_dirs)

# Change working directory
os.chdir("/root/work/apk/DroidBox_4.1.1/")

Setup IPython settings

In [415]:
%pylab inline
import binascii
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import networkx as nx
import datetime as dt
import time
import ipy_table
from IPython.display import display_pretty, display_html, display_jpeg, display_png, display_json, display_latex, display_svg
from IPython.display import HTML
from IPython.core.magic import register_cell_magic, Magics, magics_class, cell_magic
import jinja2

# pandas display settings ('display.height' is deprecated and therefore omitted)
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.max_colwidth', 100)
pd.set_option('display.width', 1000)
pd.set_option('display.column_space', 1000)
Populating the interactive namespace from numpy and matplotlib

External extensions

In [5]:
# Install
%install_ext https://raw.githubusercontent.com/dorneanu/ipython/master/extensions/diagmagic.py
    
# Then load extensions
%load_ext diagmagic
Installed diagmagic.py. To use it, type:
  %load_ext diagmagic

Utilities

In [416]:

Create Android Virtual Device (AVD)

Now you'll have to create a virtual Android device in order to analyze the APK. Assuming you installed the SDK in the previous step, you should now have some targets available on your machine. If not (as in my case), make sure you have an X session running and run android from the console. In my case I fired up VNC and connected to the Kali machine.

This is what I've got:

In [22]:
%%bash
android list targets | head -n 10
Available Android targets:
----------
id: 1 or "android-16"
     Name: Android 4.1.2
     Type: Platform
     API level: 16
     Revision: 4
     Skins: WXGA800-7in, WQVGA400, WVGA800 (default), WXGA800, HVGA, WSVGA, WVGA854, WQVGA432, WXGA720, QVGA
 Tag/ABIs : default/armeabi-v7a
----------

Now we create the AVD using the following command:

# android create avd --abi default/armeabi-v7a -n android-4.1.2-droidbox -t 1 -c 1000M
Android 4.1.2 is a basic Android platform.
Do you wish to create a custom hardware profile [no]
Created AVD 'android-4.1.2-droidbox' based on Android 4.1.2, ARM (armeabi-v7a) processor,
with the following hardware config:
hw.lcd.density=240
hw.ramSize=512
hw.sdCard=yes
vm.heapSize=48

In [14]:
%%bash
android list avd
Available Android Virtual Devices:
    Name: android-4.1.2-droidbox
    Path: /root/.android/avd/android-4.1.2-droidbox.avd
  Target: Android 4.1.2 (API level 16)
 Tag/ABI: default/armeabi-v7a
    Skin: WVGA800
  Sdcard: 1000M

Start the emulator

In DroidBox's package directory you'll find startemu.sh. Open it and add your favourite parameters.

In [5]:
%%bash
cat startemu.sh
#!/usr/bin/env bash

emulator -avd $1 -system images/system.img -ramdisk images/ramdisk.img -wipe-data -prop dalvik.vm.execution-mode=int:portable &

Afterwards make sure you have an X session running and start the emulator with the AVD you created earlier:

(env) ~/work/apk/DroidBox# ./startemu.sh android-4.1.2-droidbox
...

Now you should see your emulator booting ...

Run DroidBox

In [10]:
!./droidbox.sh /root/work/apk/DroidBox_4.1.1/APK/FakeBanker.apk
 ____                        __  ____
/\  _`\               __    /\ \/\  _`\
\ \ \/\ \  _ __  ___ /\_\   \_\ \ \ \L\ \   ___   __  _
 \ \ \ \ \/\`'__\ __`\/\ \  /'_` \ \  _ <' / __`\/\ \/'\
  \ \ \_\ \ \ \/\ \L\ \ \ \/\ \L\ \ \ \L\ \ \L\ \/>  </
   \ \____/\ \_\ \____/\ \_\ \___,_\ \____/ \____//\_/\_\
    \/___/  \/_/\/___/  \/_/\/__,_ /\/___/ \/___/ \//\/_/
Waiting for the device...
Installing the application /root/work/apk/DroidBox_4.1.1/APK/FakeBanker.apk...
Running the component com.gmail.xpack/com.gmail.xpack.MainActivity...
Starting the activity com.gmail.xpack.MainActivity...
Application started
Analyzing the application during infinite time seconds...
^C

DroidBox will then listen for activities until you stop it with ^C.

While it was running I interacted with the app and saw DroidBox collecting logs during those interactions. DroidBox writes its results to a JSON file. I've uploaded the results to pastebin.com. Now let's have some fun and take a look at them.

Before analyzing the output, keep in mind that:

[...] all data received/sent, read/written are shown in hexadecimal since the handled data can contain binary data.

(Source: https://github.com/floe/mobile-sandbox/blob/master/DroidBox_4.1.1/scripts/droidbox.py)
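As a minimal sketch of what that means in practice, a hex-encoded field can be turned back into text with the standard library. The helper name hex_to_text is my own illustration and not part of DroidBox, and the sample value is produced by hex-encoding a request line seen later in this analysis rather than copied from the report (Python 3 syntax).

```python
import binascii


def hex_to_text(hex_string):
    """Decode a hex string into text, replacing undecodable bytes.

    Hypothetical helper for illustration; not a DroidBox function.
    """
    raw = binascii.unhexlify(hex_string.strip())
    return raw.decode("utf-8", errors="replace")


# Build a sample by hex-encoding a known request line (illustrative value).
sample_hex = binascii.hexlify(b"POST /images/1.php HTTP/1.1\r\n").decode("ascii")
print(hex_to_text(sample_hex))
```

Using errors="replace" keeps the helper from failing on fields that really do contain binary data, at the cost of showing replacement characters for those bytes.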

Results analysis

First let's download the data and let Python parse it:

In [27]:
Out[27]:
[u'apkName',
 u'enfperm',
 u'opennet',
 u'cryptousage',
 u'sendsms',
 u'servicestart',
 u'sendnet',
 u'closenet',
 u'accessedfiles',
 u'fdaccess',
 u'dataleaks',
 u'recvnet',
 u'dexclass',
 u'hashes',
 u'recvsaction',
 u'phonecalls']
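The code of this cell is not included here, so the following is only a sketch of how such a report could be loaded and its categories listed, assuming the JSON has already been saved locally. The file name droidbox_report.json and the helper names are placeholders of mine (Python 3 syntax).

```python
import json
import os


def load_report(path):
    """Load a DroidBox JSON report from a local file."""
    with open(path) as handle:
        return json.load(handle)


def list_categories(report):
    """Return the top-level category names of the report, sorted."""
    return sorted(report.keys())


# Hypothetical file name; nothing happens if the file is not present.
REPORT_PATH = "droidbox_report.json"
if os.path.exists(REPORT_PATH):
    print(list_categories(load_report(REPORT_PATH)))
```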

So we have different categories of activities to look at. After analyzing the JSON content, I consider the following activities the most important.

File system activities

Let's have a look at the file system accesses triggered by the application. Due to DroidBox's limitations I couldn't look at the complete raw data.

In [421]:
Out[421]:
Timestamp operation path rawdata
0 0:00:03 read /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<string name="DOWNLOADDOMAIN">c...
1 0:00:04 read /proc/1184/cmdline com.gmail.xpackp/FakeBanker.apkain...
2 0:00:04 read /proc/1197/cmdline logcatDroidBox:Wdalvikvm:WActivityManager:Ip/FakeBanker.apkain...
3 0:00:08 read /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<string name="DOWNLOADDOMAIN">c...
4 0:00:09 write /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<string name="DOWNLOADDOMAIN">c...
5 0:00:10 write /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<string name="DOWNLOADDOMAIN">c...
6 0:00:10 read /proc/1205/cmdline com.gmail.xpack:remotep/FakeBanker.apkain...
7 0:00:36 write /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<int name="PASSADDED" value="10...
8 0:00:36 read /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<string name="DOWNLOADDOMAIN">c...
9 0:01:05 read /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<int name="PASSADDED" value="10...
10 0:15:06 write /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<int name="PASSADDED" value="10...
11 0:15:06 write /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<int name="PASSADDED" value="10...
12 0:15:24 read /dev/urandom E0�2qV���4!=�Nd��V
13 0:15:27 read /proc/1239/cmdline com.android.exchangep/FakeBanker.apkain...
14 0:15:35 read /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<int name="PASSADDED" value="10...
15 0:15:37 write /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<int name="PASSADDED" value="10...
16 0:15:37 write /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<int name="PASSADDED" value="10...
17 0:15:58 read /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<int name="PASSADDED" value="10...
18 0:16:00 write /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<int name="PASSADDED" value="10...
19 0:16:01 write /data/data/com.gmail.xpack/shared_prefs/MainPref.xml <?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<int name="PASSADDED" value="10...
20 0:16:10 read /proc/wakelocks name\tcount\texpire_count\twake_count\tactive_since\ttotal_time\tsleep_time\tmax_time\tlast_chan...

Network activities

Opened connections (opennet)

In [405]:
Out[405]:
   Timestamp  desthost      destport  fd
0  0:00:08    80.74.128.17  80        17
1  0:15:06    80.74.128.17  80        23
2  0:15:36    80.74.128.17  80        28
3  0:16:00    80.74.128.17  80        33

Sent data (sendnet)

Here you can have a look at the sent data. Again, the POST/GET requests are truncated.

In [404]:
Out[404]:
Timestamp desthost destport fd operation type rawdata
0 0:00:08 80.74.128.17 80 17 send net write POST /images/1.php HTTP/1.1\r\nUser-agent: Mozilla/4.76 (Java; U;Linux armv7l 2.6.29-gc497e41; r...
1 0:15:06 80.74.128.17 80 23 send net write POST /images/1.php HTTP/1.1\r\nUser-agent: Mozilla/4.76 (Java; U;Linux armv7l 2.6.29-gc497e41; r...
2 0:15:36 80.74.128.17 80 28 send net write POST /images/1.php HTTP/1.1\r\nUser-agent: Mozilla/4.76 (Java; U;Linux armv7l 2.6.29-gc497e41; r...
3 0:16:00 80.74.128.17 80 33 send net write POST /images/1.php HTTP/1.1\r\nUser-agent: Mozilla/4.76 (Java; U;Linux armv7l 2.6.29-gc497e41; r...

Received data (recvnet)

In [403]:
Out[403]:
Timestamp host port type rawdata
0 0:00:08 80.74.128.17 80 net read HTTP/1.1 406 Not Acceptable\r\nDate: Mon, 28 Jul 2014 13:29:38 GMT\r\nServer: Apache\r\nContent-...
1 0:00:08 80.74.128.17 80 net read x=10\r\nConnection: Keep-Alive\r\nContent-Type: text/html; charset=iso-8859-1\r\n\r\n<!DOCTYPE H...
2 0:15:06 80.74.128.17 80 net read x=10\r\nConnection: Keep-Alive\r\nContent-Type: text/html; charset=iso-8859-1\r\n\r\n<!DOCTYPE H...
3 0:15:06 80.74.128.17 80 net read HTTP/1.1 406 Not Acceptable\r\nDate: Mon, 28 Jul 2014 13:44:36 GMT\r\nServer: Apache\r\nContent-...
4 0:15:36 80.74.128.17 80 net read HTTP/1.1 406 Not Acceptable\r\nDate: Mon, 28 Jul 2014 13:45:06 GMT\r\nServer: Apache\r\nContent-...
5 0:15:36 80.74.128.17 80 net read x=10\r\nConnection: Keep-Alive\r\nContent-Type: text/html; charset=iso-8859-1\r\n\r\n<!DOCTYPE H...
6 0:16:00 80.74.128.17 80 net read HTTP/1.1 406 Not Acceptable\r\nDate: Mon, 28 Jul 2014 13:45:30 GMT\r\nServer: Apache\r\nContent-...
7 0:16:00 80.74.128.17 80 net read x=10\r\nConnection: Keep-Alive\r\nContent-Type: text/html; charset=iso-8859-1\r\n\r\n<!DOCTYPE H...

Requests sequence

Since I was not able to get the full contents of the POST/GET requests (and the corresponding responses), I had to rely on information from a previous analysis. Below is a short sequence diagram describing the general flow of the communication. Keep in mind that the sequence is only meant to give you a short overview of the data exchange between the process and the web server.

In [73]:

And now a complete request/response pair:

Request:

POST /gallery/4.php HTTP/1.1
User-agent: Mozilla/4.76 (Java; U;Linux i686 3.0.36-android-x86-eeepc+; ru; The Android Project 0)
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Pragma: no-cache
Host: best-invest-int.com
Connection: Keep-Alive
Accept-Encoding: gzip
Content-Length: 86
Data Raw: 64 61 74 61 3d 55 32 6c 74 55 33 52 68 64 47 55 67 50 53 42 4f 54 31 51 67 55 6b 56 42 52 46 6b 67 43 67 25 33 44 25 33 44 25 30 41 26 4c 6f 67 43 6f 64 65 3d 43 4f 4e 46 26 4c 6f 67 54 65 78 74 3d 43 68 65 63 6b 2b 70 75 6c 6c 2b 6f 66 66 2b 75 72 6c 73 26 
Data Ascii: data=U2ltU3RhdGUgPSBOT1QgUkVBRFkgCg%3D%3D%0A&LogCode=CONF&LogText=Check+pull+off+urls&

Response:

HTTP/1.1 403 Forbidden
Date: Thu, 21 Nov 2013 12:37:26 GMT
Server: Apache/2.2.3 (CentOS)
Content-Length: 299
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><html><head><title>403 Forbidden</title></head><body><h1>Forbidden</h1><p>You don't have permission to access /gallery/4.php on this server.</p><hr><address>Apache/2.2.3 (CentOS) Server at best-invest-int.com Port 80</address></body></html>
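
The request body above can be unpacked a little further. The sketch below URL-decodes the form fields from the "Data Ascii" line and then Base64-decodes the data field. That the field is Base64 is my assumption, based on its alphabet and the trailing "==" padding (Python 3 syntax).

```python
import base64
from urllib.parse import parse_qs

# Request body copied from the "Data Ascii" line above.
body = ("data=U2ltU3RhdGUgPSBOT1QgUkVBRFkgCg%3D%3D%0A"
        "&LogCode=CONF&LogText=Check+pull+off+urls&")

# parse_qs undoes the URL encoding (%3D -> "=", "+" -> " ").
fields = {name: values[0] for name, values in parse_qs(body).items()}

# Assumption: the "data" value is Base64-encoded text.
decoded = base64.b64decode(fields["data"].strip()).decode("utf-8")

print(fields["LogCode"], "|", fields["LogText"], "|", repr(decoded))
```

Under that assumption the data field decodes to a short status string about the SIM state, which fits the LogText value sent alongside it.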

Crypto activities

In [417]:
Out[417]:
Timestamp algorithm key operation type
0 0:00:09 Blowfish 52, 101, 54, 54, 55, 54, 54, 101, 54, 98, 54, 97, 54, 99, 54, 101, 55, 54, 54, 98, 54, 97, 52, 9... keyalgo crypto
1 0:15:06 Blowfish 52, 101, 54, 54, 55, 54, 54, 101, 54, 98, 54, 97, 54, 99, 54, 101, 55, 54, 54, 98, 54, 97, 52, 9... keyalgo crypto
2 0:15:36 Blowfish 52, 101, 54, 54, 55, 54, 54, 101, 54, 98, 54, 97, 54, 99, 54, 101, 55, 54, 54, 98, 54, 97, 52, 9... keyalgo crypto
3 0:16:00 Blowfish 52, 101, 54, 54, 55, 54, 54, 101, 54, 98, 54, 97, 54, 99, 54, 101, 55, 54, 54, 98, 54, 97, 52, 9... keyalgo crypto
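
The key column is worth a second look: read as ASCII, the visible byte values are themselves hexadecimal digits. The sketch below decodes only the part of the key that is visible in the table, since the output is cut off. Whether the app uses the hex string itself or the decoded text as the Blowfish key is not something this output can tell us (Python 3 syntax).

```python
import binascii

# Key bytes visible in the table above. The real key is longer, but the
# output is truncated, so only this prefix is decoded.
visible_key_bytes = [52, 101, 54, 54, 55, 54, 54, 101, 54, 98, 54, 97,
                     54, 99, 54, 101, 55, 54, 54, 98, 54, 97, 52]

# Interpreted as ASCII, the bytes are hex digits.
as_ascii = bytes(visible_key_bytes).decode("ascii")

# Drop the dangling half-byte left over from the truncation, then decode.
even_length = len(as_ascii) - (len(as_ascii) % 2)
key_prefix = binascii.unhexlify(as_ascii[:even_length]).decode("ascii")

print(as_ascii)    # hex digits as text
print(key_prefix)  # decoded prefix of the key material
```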

Activities chart

Now let's look at the order in which the activities took place. Below you'll find a table containing the timestamp, operation and category of each activity (e.g. file system access, network read/write etc.).

In [418]:
Out[418]:
Timestamp Operation Category
0 0:00:03 read file system
1 0:00:04 read file system
2 0:00:04 read file system
3 0:00:08 read file system
21 0:00:08 net open network
25 0:00:08 net write network
29 0:00:08 net read network
30 0:00:08 net read network
4 0:00:09 write file system
37 0:00:09 Blowfish crypto
5 0:00:10 write file system
6 0:00:10 read file system
7 0:00:36 write file system
8 0:00:36 read file system
9 0:01:05 read file system
10 0:15:06 write file system
11 0:15:06 write file system
22 0:15:06 net open network
26 0:15:06 net write network
31 0:15:06 net read network
32 0:15:06 net read network
38 0:15:06 Blowfish crypto
12 0:15:24 read file system
13 0:15:27 read file system
14 0:15:35 read file system
23 0:15:36 net open network
27 0:15:36 net write network
33 0:15:36 net read network
34 0:15:36 net read network
39 0:15:36 Blowfish crypto
15 0:15:37 write file system
16 0:15:37 write file system
17 0:15:58 read file system
18 0:16:00 write file system
24 0:16:00 net open network
28 0:16:00 net write network
35 0:16:00 net read network
36 0:16:00 net read network
40 0:16:00 Blowfish crypto
19 0:16:01 write file system
20 0:16:10 read file system
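
A table like the one above could be built roughly as follows. This is a sketch, not the code of the cell: it assumes that each report section maps an elapsed time in seconds to a dictionary describing the event, and the sample report at the bottom is made up for illustration (Python 3 syntax).

```python
from datetime import timedelta

# (report key, category, fixed label, field used when there is no fixed label)
# The keys come from the report; the inner layout is an assumption.
SECTIONS = [
    ("fdaccess", "file system", None, "operation"),
    ("opennet", "network", "net open", None),
    ("sendnet", "network", "net write", None),
    ("recvnet", "network", "net read", None),
    ("cryptousage", "crypto", None, "algorithm"),
]


def build_timeline(report):
    """Merge all sections into one list of events sorted by elapsed time."""
    events = []
    for section, category, fixed_label, field in SECTIONS:
        for elapsed, event in report.get(section, {}).items():
            if fixed_label is not None:
                operation = fixed_label
            else:
                operation = event.get(field, "unknown")
            events.append((float(elapsed), operation, category))
    events.sort(key=lambda item: item[0])
    return [(str(timedelta(seconds=int(seconds))), operation, category)
            for seconds, operation, category in events]


# Made-up sample in the assumed layout.
sample = {
    "fdaccess": {"3.2": {"operation": "read"}},
    "opennet": {"8.1": {"desthost": "80.74.128.17"}},
    "cryptousage": {"9.0": {"algorithm": "Blowfish"}},
}
for row in build_timeline(sample):
    print(row)
```

Sorting on the numeric elapsed time before formatting avoids the ordering problems you would get from sorting the formatted timestamp strings.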

A fancier overview ...

In [419]:
%%jinja html
<!-- collapse=True -->
<html>
<head>
  <script src="http://d3js.org/d3.v3.min.js"></script>
  <script src="http://dimplejs.org/dist/dimple.v2.1.0.min.js"></script>
<title>{{ title }}</title>
</head>
<body>
<div id="bar_chart"></div>
  <script type="text/javascript">
    var json_data  = {{ json_data }};
    var svg = dimple.newSvg("#bar_chart", 800, 800);
    var myChart = new dimple.chart(svg, json_data);
    myChart.setBounds(150, 50, 700, 680)
    myChart.addCategoryAxis("x", ["Category", "Operation"]);
    myChart.addCategoryAxis("y", "Timestamp");
    myChart.addSeries("Operation", dimple.plot.bar);
    myChart.addLegend(170, 10, 630, 20, "right");
    myChart.draw();
  </script>
</body>
</html>
Out[419]:

A few observations:

  • file system accesses (both reads and writes) take place all the time
  • the crypto routines are apparently involved whenever data is sent over the network or received
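
The second observation can be checked with a few lines of code. This sketch uses only the timestamps from the opennet and crypto tables above and computes, for each crypto event, the distance in seconds to the closest opened connection. The helper name to_seconds is mine (Python 3 syntax).

```python
from datetime import timedelta


def to_seconds(stamp):
    """Convert an 'H:MM:SS' timestamp from the tables into seconds."""
    hours, minutes, seconds = (int(part) for part in stamp.split(":"))
    delta = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return int(delta.total_seconds())


# Timestamps copied from the opennet and crypto tables above.
net_open = ["0:00:08", "0:15:06", "0:15:36", "0:16:00"]
crypto = ["0:00:09", "0:15:06", "0:15:36", "0:16:00"]

# For each crypto event, distance to the closest opened connection.
gaps = [min(abs(to_seconds(c) - to_seconds(n)) for n in net_open)
        for c in crypto]
print(gaps)
```

Every crypto event falls within one second of a connection being opened, which is consistent with the observation but of course does not prove what is being encrypted.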

Conclusion

I think DroidBox is a very good tool for dealing with Android APKs and analyzing their behaviour at run-time. It comes with a working mobile sandbox meant to inspect and monitor an application's activities. However, during my analysis I had to rely on a previous analysis, since the results didn't contain the full details. Not only the network traffic but also the contents read from files were incomplete. In order to fully understand a piece of malware I need complete details about its behaviour. For example, I got the following response from the server, which is completely useless in this form:

HTTP/1.1 406 Not Acceptable\r\nDate: Mon, 28 Jul 2014 13:29:38 GMT\r\nServer: Apache\r\nContent-...

Besides that, I was indeed able to see that the application was reading from some file. But the delivered content was once again truncated:

<?xml version='1.0' encoding='utf-8' standalone='yes' ?>\n<map>\n<string name="DOWNLOADDOMAIN">c...

I hope the developers will see this as a vital necessity and update DroidBox as soon as possible. Furthermore, I look forward to trying other mobile sandboxes that have data instrumentation capabilities. Next time I'll take a deeper look at Android's DDMS.


Published:
2014-08-05 00:00