IcedID Initial Attack Chain Analysis

Analysis of the IcedID attack chain all the way to the loading of the core module.


Introduction

This post focuses on the initial attack chain IcedID uses to execute its main payload. The chain passes through multiple components, starting from an ISO file, before the core module begins executing.

Specifically, the article will be divided into the following sections:

  • Initial Execution: Walkthrough of the initial execution of IcedID.
  • First Stage Execution - Unpacking: Unpacking routine of the IcedID first stage.
  • First Stage Execution - Main Logic Execution: Core execution of the IcedID first stage.
  • Second Stage Payload Downloading: Download and execution of the second stage.
  • Execution of Core Module: Execution of the core IcedID module.

The sample used for analysis was retrieved from a Malware Traffic Analysis post. The initial ISO file used during this infection has the SHA-256 hash E2963BA47D2E07A98EAFBD2EF56FFC6AE0E0C483E5E8E1FB1F24F8516ABD246A.

Initial Execution

The initial execution of the IcedID sample begins with an ISO file that is mounted by a user.

IcedID ISO File

Inside the mounted ISO file the user is presented with a LNK and a hidden folder.

Contents of IcedID ISO File

The hidden folder contains more files, including the IcedID payload and other files that facilitate its execution.

Contents of Sub Folder in IcedID ISO File

When the document.lnk is clicked it will invoke a JS file formingGuying.js in the scabs folder.

LNK Execution Path

When the formingGuying.js file is invoked by document.lnk, it in turn calls a Batch script and passes a parameter to it.

Javascript Executed by LNK File

The Batch script invokes the packed IcedID first stage, which exists in the form of a DLL file. The rundll command is not fully hardcoded in this Batch script; a portion of it is passed in as a parameter. As can be seen, only the ll is present in the command, and the rest comes from the parameter passed to the Batch script.

Batch Script Executed by Javascript File

The z.txt file packaged in the ISO is not used and appears to contain random text. This may be an attempt to lower the overall entropy of the ISO, which the packed DLL file raises.

Benign File with Text

The roars.jpg also appears to be benign and unused during the process. This may also be an attempt to reduce entropy.

Benign Image File
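
The entropy hypothesis above can be checked with a short calculation. The following is a minimal sketch of Shannon entropy over bytes (0 to 8 bits per byte); the function name is hypothetical and nothing here is taken from the sample. Packed or encrypted data tends to score close to 8, while plain text scores noticeably lower, so padding a container with text pulls its average down.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum -p * log2(p) over every byte value that occurs in the data.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Illustrative inputs only: uniform byte values versus repetitive text.
high = shannon_entropy(bytes(range(256)))
low = shannon_entropy(b"the quick brown fox " * 50)
print(f"uniform bytes: {high:.2f}, repetitive text: {low:.2f}")
```

Running the function over the DLL alone and then over the whole ISO contents would show how much the filler files lower the overall figure.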

The following acts as a diagram to summarize the initial execution chain:

Initial Access Diagram

First Stage Execution - Unpacking

The first stage IcedID payload takes the form of a DLL with an extremely large .data section. This is the result of the true first stage being packed and hidden within the .data section.

Large .data Section in IcedID First Stage

The packed DLL contains multiple exports. However, only the export with ordinal 1 is used, since that is the one invoked by the Batch script in the ISO file.

Exports from Packed IcedID First Stage DLL

Shellcode Decoding

During the execution of the packed IcedID first stage DLL, encoded shellcode is placed into a memory buffer allocated by VirtualAlloc with Read, Write, and Execute permissions.

Packed IcedID DLL Allocates RWX Memory Space

In the packed first stage DLL, the shellcode is stored in an encoded format. Decoding is done through a simple XOR loop with the key 0x1B.

Decode Unpacking Shellcode via an XOR Loop
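
The XOR loop described above can be sketched in a few lines of Python. This is only an illustration, assuming the single-byte key 0x1B is applied to every byte of the buffer; the function name and the example bytes are hypothetical and not taken from the sample.

```python
def xor_decode(buffer: bytes, key: int = 0x1B) -> bytes:
    """XOR every byte of the buffer with a single-byte key."""
    return bytes(b ^ key for b in buffer)

# Hypothetical placeholder bytes standing in for shellcode.
original = b"\x90\x90\xc3"
encoded = xor_decode(original)   # encoding and decoding are the same operation
decoded = xor_decode(encoded)
print(encoded.hex(), decoded.hex())
```

Because XOR is its own inverse, the same routine can be used to re-encode a buffer when testing a decoder against known data.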

After the shellcode is copied into memory and decoded, a call is used to enter the buffer with the shellcode.

Begin Execution Decoding Shellcode

Inside the shellcode the first stage loader from the .d section will be decoded and mapped into memory. Following this the shellcode will transfer control to the entrypoint of the first stage loader.

Execution of Unpacked IcedID First Stage

At this point the IcedID sample has been unpacked and reveals a new executable.

Unpacked IcedID First Stage Section Layout

First Stage Execution - Main Logic Execution

Config Decryption

The IcedID first stage stores its configuration in a PE data section; in this case it is the last section, named .d. The total length of the configuration is 0x80 bytes, with the first 0x40 bytes being the encoded configuration and the latter 0x40 bytes being the XOR key used to decode it.

IcedID Configuration and XOR Key from .d Section

The decoding routine will loop through each byte, and use the value 0x40 bytes ahead of the current byte as the XOR key value. The following is a script that will decode the configuration file:

```python
import pefile
import binascii

file_name = "sample.bin"
section_config_name = ".d"

def byte_xor(ba1, ba2):
    return bytes([_a ^ _b for _a, _b in zip(ba1, ba2)])

def decode_config(config_data):

    # The sample has a hardcoded size of 0x80, 0x40 for the config and 0x40 for the XOR key
    encoded_config_raw = config_data[0:0x40] 
    xor_key_raw = config_data[0x40:0x80]
    encoded_config = binascii.hexlify(config_data[0:0x40])
    xor_key = binascii.hexlify(config_data[0x40:0x80])

    print("Encoded Config: ", encoded_config.decode())
    print("XOR Key Stream: ", xor_key.decode())

    decoded_config = byte_xor(bytes(list(encoded_config_raw)), bytes(list(xor_key_raw)))
    return decoded_config

def config_extract(filename):
    pe = pefile.PE(filename)
    for pe_section in pe.sections:
        if pe_section.Name.decode("utf-8").strip("\x00") == section_config_name:
            print("[+] Found Section: " + pe_section.Name.decode("utf-8").strip("\x00"))

            # Get raw decoded config
            raw_config = decode_config(pe_section.get_data())

            # Print campaign ID (stored as a little-endian DWORD at offset 0)
            print("Campaign ID: ", int.from_bytes(raw_config[0:4], "little"))

            # Print C2 domain (null-terminated ASCII string starting at offset 4);
            # splitting on the first null avoids printing the terminator itself
            c2_domain = raw_config[4:].split(b"\x00", 1)[0]
            print("C2 Domain:", c2_domain.decode("ascii", errors="replace"))

config_extract(file_name) # Tested with unpacked sample 406e5f1b61eb7dda498b525c8af5d8f8611a3c41ff2cdeafbc37081bdd375da7
```
IcedID Configuration Decoder

The following is a demonstration of the configuration decoder executing.

IcedID Configuration File Decoded

Information Collection

When the IcedID loader initially reaches out to the command and control server, it collects information about the system and embeds it into an HTTP request as multiple cookie values. In the following screenshot six cookie values have been observed:

Information About System Sent to IcedID C2 Server

This section will cover all the cookie values and the information that is embedded in each of them.

Host and Campaign Information - __gads

The __gads cookie value will contain information about the host and IcedID campaign that was extracted from the embedded configuration. All the stored information is separated by a : character.

Host and Campaign Information
  • [FIRST]: Campaign tracking value from the configuration.
  • [SECOND]: Hardcoded “1” value – A possible version number for the software.
  • [THIRD]: System Uptime in Seconds (Retrieved via GetTickCount).
  • [FOURTH]: Number of running processes on the system.
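
The field layout above lends itself to a small parser, which can help when triaging captured traffic. The sketch below assumes that each of the four fields is a decimal integer; the post only states that the fields are separated by a : character, so the numeric format, the type name, and the example value are all assumptions.

```python
from typing import NamedTuple

class GadsCookie(NamedTuple):
    campaign_id: int      # [FIRST]  campaign tracking value from the configuration
    version_flag: int     # [SECOND] hardcoded "1" value
    uptime_seconds: int   # [THIRD]  system uptime
    process_count: int    # [FOURTH] number of running processes

def parse_gads(value: str) -> GadsCookie:
    """Split a __gads cookie value on ':' into its four fields."""
    parts = value.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}")
    return GadsCookie(*(int(p) for p in parts))

# Hypothetical cookie value, not taken from the sample's traffic.
print(parse_gads("123456789:1:4520:87"))
```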

Windows Version Information - _gat

The _gat cookie value will store the major and minor version of the Windows operating system.

Windows Version Information

User and Computer Name Information - _u

The _u cookie value will store information about the username and hostname of the computer, along with an anti-sandbox value that is likely to be checked by the server.

User and Computer Name Information
  • [FIRST]: Computer NetBIOS name, represented as ASCII in hex.
  • [SECOND]: Username IcedID is currently running under, represented as ASCII in hex.
  • [THIRD]: Anti-sandbox value generated from RDTSC operations.
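
Since the first two fields are hex-encoded ASCII, they can be recovered directly from a captured cookie. The following sketch assumes the fields are separated by a : character, as in the __gads cookie; the separator, the function name, and the example names are assumptions rather than details confirmed for this cookie.

```python
def parse_u_cookie(value: str, sep: str = ":") -> dict:
    """Decode the hex-encoded computer name and username from a _u cookie value."""
    computer_hex, user_hex, anti_sandbox = value.split(sep)
    return {
        "computer_name": bytes.fromhex(computer_hex).decode("ascii", errors="replace"),
        "username": bytes.fromhex(user_hex).decode("ascii", errors="replace"),
        "anti_sandbox": anti_sandbox,  # left as-is; its meaning is server-side
    }

# Hypothetical value built from made-up names.
example = ":".join([b"DESKTOP-01".hex().upper(), b"analyst".hex().upper(), "17"])
print(parse_u_cookie(example))
```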

SID Information - __io

The __io cookie value will hold a portion of the SID of the current running user. The cookie value will look like the following:

SID Information

This value was retrieved from the following SID value:

SID Account of Current User

Processor Information - _ga

The _ga cookie will hold information about the processor, along with an anti-sandbox value.

Processor Information
  • [FIRST]: A three-bit flag value, with each bit indicating a specific result from CPUID. The bits are OR'd with each other to form the final decimal number.
    • Bit 0: If set, CPUID.0.EBX contains “Genu”.
    • Bit 1: If set, the execute disable bit is available, as checked in CPUID.80000001.EDX[20].
    • Bit 2: If set, the digital temperature sensor is supported, as checked in CPUID.6.EAX[0].
  • [SECOND]: DWORD value from CPUID.1.EAX.
  • [THIRD]: Anti-sandbox value calculated from a combination of RDTSC loops that run CPUID.
  • [FOURTH]: Hypervisor detection value. Uses CPUID.40000000.EBX to determine the hypervisor model used.
    • A zero value will indicate a physical host, and a non-zero value will indicate a hypervisor.
    • In this case, '1238' refers to the ASCII bytes 'MV' from the 'VMwa' output of CPUID.40000000.EBX.
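
The bit layout of the first field and the zero/non-zero rule for the fourth field can be expressed as a small decoder. This is a sketch under the assumption that both fields arrive as decimal integers; the names used here are hypothetical.

```python
from typing import List

# Bit meanings as described in the list above.
CPU_FLAGS = {
    0x1: "CPUID.0.EBX contains 'Genu'",
    0x2: "execute disable bit available",
    0x4: "digital temperature sensor supported",
}

def decode_cpu_flags(value: int) -> List[str]:
    """Return the description of every flag bit set in the first _ga field."""
    return [label for bit, label in CPU_FLAGS.items() if value & bit]

def is_virtualized(hypervisor_value: int) -> bool:
    """A zero value indicates a physical host; non-zero indicates a hypervisor."""
    return hypervisor_value != 0

print(decode_cpu_flags(5))   # bits 0 and 2 set
print(is_virtualized(1238))  # the value observed in this analysis
```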

MAC Address Information - _gid

The _gid will hold a list of MAC Addresses from the host, each encoded and separated with a : character.

The cookie value will appear as follows:

MAC Address Information

The highlighted MAC Address 00685951A0C2 is actually the encoded MAC Address 00-0C-29-27-10-53.

MAC Address of Each Interface

The encoding algorithm takes the original MAC address and transforms each byte in order to create the encoded version. To decode, the same operations are performed in reverse.

The following code acts as a proof of concept of the encoding algorithm used:

```cpp
#include <cstdio>
#include <vector>
#include <Windows.h>

void PrintVector(const std::vector<BYTE>& targetVector) {
    for (auto value : targetVector) {
        printf("%02X ", value);
    }
}

int main()
{

    std::vector<BYTE> macAddress{ 0x00, 0x0C, 0x29, 0x27, 0x10, 0x53 }; // MAC Address to Encode
    std::vector<BYTE> encodedMacAddress;

    // Encode MAC Address
    for (size_t i = 0; i < macAddress.size(); i++) {

        // Main encoding logic: add the byte's index, then rotate left by 3 bits
        BYTE encodedMacAddressByte = static_cast<BYTE>(macAddress[i] + i);
        encodedMacAddressByte = static_cast<BYTE>((encodedMacAddressByte << 3) | (encodedMacAddressByte >> 5));

        encodedMacAddress.push_back(encodedMacAddressByte);
    }

    // Print original MAC Address
    printf("Original MAC Address:\t");
    PrintVector(macAddress);
    printf("\n");

    // Print encoded MAC Address
    printf("Encoded MAC Address:\t");
    PrintVector(encodedMacAddress);
    printf("\n");

    return 0;
}
```
Mac Address Encoding

The following demonstrates the output of the example program:

Mac Address Encoding Running
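
For decoding captured _gid values, the inverse operation is the more useful direction. The following Python sketch implements both directions based on the description above (add the byte index, rotate left by 3 bits; to decode, rotate right by 3 bits and subtract the index). The function names are hypothetical; the example bytes are the MAC address pair quoted earlier in this section.

```python
def rol8(value: int, count: int) -> int:
    """Rotate an 8-bit value left by count bits."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF

def ror8(value: int, count: int) -> int:
    """Rotate an 8-bit value right by count bits."""
    count %= 8
    return ((value >> count) | (value << (8 - count))) & 0xFF

def encode_mac(mac: bytes) -> bytes:
    """Add each byte's index, then rotate the result left by 3 bits."""
    return bytes(rol8((b + i) & 0xFF, 3) for i, b in enumerate(mac))

def decode_mac(encoded: bytes) -> bytes:
    """Undo encode_mac: rotate right by 3 bits, then subtract the index."""
    return bytes((ror8(b, 3) - i) & 0xFF for i, b in enumerate(encoded))

encoded = encode_mac(bytes.fromhex("000C29271053"))
print(encoded.hex().upper())                             # 00685951A0C2
print(decode_mac(encoded).hex("-").upper())              # 00-0C-29-27-10-53
```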

Second Stage Payload Downloading

The first stage downloads the second stage payload over HTTP. The request advertises the payload as GZIP, but the downloaded data is not actually in GZIP format; this is just a field added to the HTTP request. In reality the second stage is XOR encoded and needs to be decoded.

IcedID C2 Emulation

During the analysis of this sample the original download server was already offline. To facilitate dynamic analysis, a web server was set up to emulate the IcedID second stage download.

The following GitHub repository contains the Go program used to serve the second stage.

Second Stage Download via HTTP

Once the second stage payload is downloaded successfully it is decoded in memory. If the download fails the program will sleep and retry.

Second Stage Decode Loop

After the second stage is downloaded, its first two bytes are checked to confirm that they match 0x1F8B, the magic number that normally begins a GZIP stream.

First Two Bytes of Second Stage

Once the second stage is decoded it contains the following data at the associated offsets:

  • Second Stage Offsets
    • 0x2: DWORD that indicates the size of license.dat
    • 0x6: DWORD that indicates the size of vote32.tmp
    • 0xA: Name of the folder in AppData (SignDinner)
    • 0x2A: Name of the IcedID core payload on disk (license.dat)
    • 0xC9: Name of the core module loader (vote32.tmp)
    • 0x2C6: Beginning of the encoded licence.dat
    • 0x2C6 + size of licence.dat: Beginning of the vote32.tmp file
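
The offsets above can be turned into a small parser for carving the two files out of a decoded second stage. This is a sketch under a few assumptions that the post does not state outright: the size DWORDs are little-endian, the names are null-terminated ASCII strings, and the 0x1F8B marker sits at offset 0 of the decoded buffer. All function and key names are hypothetical.

```python
import struct

GZIP_MAGIC = b"\x1f\x8b"
PAYLOAD_OFFSET = 0x2C6  # start of the encoded core payload

def read_cstring(blob: bytes, offset: int) -> str:
    """Read a null-terminated ASCII string starting at offset."""
    end = blob.index(b"\x00", offset)
    return blob[offset:end].decode("ascii", errors="replace")

def parse_second_stage(blob: bytes) -> dict:
    """Split a decoded second stage into names and embedded files."""
    core_size, loader_size = struct.unpack_from("<II", blob, 0x2)
    loader_offset = PAYLOAD_OFFSET + core_size
    return {
        "has_magic": blob[:2] == GZIP_MAGIC,       # assumed position of the marker
        "folder_name": read_cstring(blob, 0xA),
        "core_name": read_cstring(blob, 0x2A),
        "loader_name": read_cstring(blob, 0xC9),
        "core_data": blob[PAYLOAD_OFFSET:loader_offset],           # still XOR encoded
        "loader_data": blob[loader_offset:loader_offset + loader_size],
    }
```

The returned core_data is still in its encoded state and would need to go through the licence.dat decoding routine before it can be inspected.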

Both the licence.dat file and vote32.tmp file will be written to disk before starting the routine to decode the core module and execute it.

Write Licence.dat and Vote.tmp to Disk, before decoding and running core module

For visualization purposes the following screenshot is the beginning of the decoded second stage. The highlighted text reveals null terminated strings of the licence.dat file name, the SingDinner folder name where licence.dat will be written to, and the vote32.tmp file name. The authors of IcedID can easily change any of these names, however, the licence.dat is usually unchanged based on public reporting and other samples published online.

Beginning of the Decoded IcedID Second Stage

Licence.dat

The licence.dat file is arguably the most important file of the attack chain, because it contains the final IcedID core module that will be executed. The file itself is stored as a PE file with a customized header: the standard PE header and section headers are stripped, and their fields are stored in a non-standard format.

Because the core module is stored in this customized format, Windows cannot execute it natively. Initially the core module is loaded and executed by the first stage; later, licence.dat is passed to a dedicated loader that is run via a scheduled task.

The licence.dat file is always stored in an encoded state and must be decoded using XOR before use.

The key for decoding this file is stored in the last 16 bytes of the file: 4 DWORDs act as the key to the XOR decoding stream. Furthermore, every time a byte is XOR'd, two DWORDs from the key are rotated right.

The following is an example of the 16-byte key (4 DWORDs) used in the sample analyzed in this post:

4 DWORD Bytes that act as an XOR Key

The following is a decoding function that will parse the XOR key and decode each byte of the licence.dat file. Note that the loop covers sizeOfSecondStage - 0x10 bytes of the file, since the last 16 bytes (0x10 bytes) are the key, and the bytes used as the key stream must not be XOR'd themselves.

```cpp
#include <Windows.h>
#include <stdlib.h> // _rotr (MSVC rotate-right intrinsic)

int DecodeRoutine(PBYTE pSecondStage, DWORD sizeOfSecondStage) {

    // The last 0x10 bytes hold the key (4 DWORDs) and are not decoded themselves
    DWORD sizeOfSecondStageMinusXORBytes = sizeOfSecondStage - 0x10;
    PDWORD pXORBytes = (PDWORD) ((PBYTE)pSecondStage + sizeOfSecondStageMinusXORBytes);

    for (DWORD i = 0; i < sizeOfSecondStageMinusXORBytes; i++) {

        DWORD xorBytesIndex1 = i % 3;
        DWORD xorBytesIndex2 = (i + 1) % 3;

        // Construct an XOR byte from the low bytes of the two selected key DWORDs
        BYTE xorByte = *((PBYTE)&pXORBytes[xorBytesIndex1]) + *((PBYTE)&pXORBytes[xorBytesIndex2]);

        // XOR the licence.dat buffer with the XOR byte
        pSecondStage[i] = pSecondStage[i] ^ xorByte;

        // Rotate the DWORD that xorBytesIndex1 points to right, then increment it
        BYTE rollNumber = *((PBYTE)&pXORBytes[xorBytesIndex2]) & 0x7;
        DWORD bytesToRoll = _rotr(pXORBytes[xorBytesIndex1], rollNumber);
        pXORBytes[xorBytesIndex1] = ++bytesToRoll;

        // Rotate the DWORD that xorBytesIndex2 points to right, then increment it
        rollNumber = bytesToRoll & 0x7;
        bytesToRoll = _rotr(pXORBytes[xorBytesIndex2], rollNumber);
        pXORBytes[xorBytesIndex2] = ++bytesToRoll;

    }

    return 0;
}
```
Decoding Function for Licence.dat
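
For analysts who prefer to script the decoding, the routine can be transliterated into Python. The sketch below is a direct port of the listing above rather than an independently verified implementation: it keeps the listing's index arithmetic (modulo 3) as the default and exposes it as the hypothetical parameter index_modulus, since the text describes a key of four DWORDs. The key is assumed to be stored as little-endian DWORDs.

```python
import struct

def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    count &= 31
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

def decode_licence(blob: bytes, index_modulus: int = 3) -> bytes:
    """Decode a licence.dat buffer whose last 16 bytes hold the XOR key."""
    body = bytearray(blob[:-16])
    key = list(struct.unpack("<4I", blob[-16:]))  # 4 DWORDs, assumed little-endian
    for i in range(len(body)):
        a = i % index_modulus
        b = (i + 1) % index_modulus
        # The XOR byte is the sum of the low bytes of both key DWORDs.
        body[i] ^= (key[a] + key[b]) & 0xFF
        # Rotate each DWORD right by the low 3 bits of the other, then increment.
        key[a] = (ror32(key[a], key[b] & 7) + 1) & 0xFFFFFFFF
        key[b] = (ror32(key[b], key[a] & 7) + 1) & 0xFFFFFFFF
    return bytes(body)
```

Because the key stream depends only on the key and the byte position, running the function twice with the same key restores the input, which gives a quick sanity check when no reference output is available.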

Once decoded the licence.dat file can be separated into the following parts:

  • Red: The first 0x81 bytes in the file are skipped and not used.
  • Green: Contains the PE header data, the organization of these header attributes is specific to IcedID and does not follow a traditional PE header format.
  • Purple: Beginning of the data sections.
Format of the Decoded Licence.dat

The headerless PE data structure can be represented using the following struct. Only the minimum required fields are present which are used to find the associated sections, map them to memory, and then begin execution of the entrypoint.

```cpp
struct PEHeader {
    ULONGLONG ImageBase;
    DWORD SizeOfImage;
    DWORD AddressOfEntryPoint;
    DWORD IATRva;
    DWORD BaseRelocationRVA;
    DWORD BaseRelocationSize;
    DWORD NumberOfSections;
};
```
PE Header in Headerless IcedID File

The section headers begin right after the PEHeader struct. There are a total of PEHeader->NumberOfSections number of sections, each having the size of 0x11 bytes.

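```cpp
#pragma pack(push, 1) // Without packing the compiler pads the struct past 0x11 bytes
struct SectionHeader {
    DWORD VirtualAddress;
    DWORD VirtualSize;
    DWORD PointerToRawData;
    DWORD SizeOfRawData;
    BYTE SectionPageProtection;
};
#pragma pack(pop)
```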
Section Headers in Headerless IcedID File

Finally, after the section headers finish the raw data of each section begins - just like in regular PE files.
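
The two structures can be read with Python's struct module as a first step toward rebuilding a regular PE file. The sketch below assumes the header begins at offset 0x81 of the decoded file (directly after the skipped bytes), that all fields are little-endian, and that the structures are packed without padding; the function name and dictionary keys are hypothetical.

```python
import struct

PE_HEADER = struct.Struct("<QIIIIII")     # 32 bytes, fields as in PEHeader
SECTION_HEADER = struct.Struct("<IIIIB")  # 0x11 bytes, fields as in SectionHeader

PE_FIELDS = ("ImageBase", "SizeOfImage", "AddressOfEntryPoint", "IATRva",
             "BaseRelocationRVA", "BaseRelocationSize", "NumberOfSections")
SECTION_FIELDS = ("VirtualAddress", "VirtualSize", "PointerToRawData",
                  "SizeOfRawData", "SectionPageProtection")

def parse_icedid_headers(blob: bytes, header_offset: int = 0x81):
    """Parse the custom header and section headers of a decoded licence.dat."""
    header = dict(zip(PE_FIELDS, PE_HEADER.unpack_from(blob, header_offset)))
    sections = []
    offset = header_offset + PE_HEADER.size  # section headers follow immediately
    for _ in range(header["NumberOfSections"]):
        values = SECTION_HEADER.unpack_from(blob, offset)
        sections.append(dict(zip(SECTION_FIELDS, values)))
        offset += SECTION_HEADER.size
    return header, sections
```

Each section's PointerToRawData and SizeOfRawData can then be used to slice the raw section bytes out of the decoded buffer.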

During this analysis a decoder was created to convert the licence.dat file into a regular PE file in order to facilitate further static analysis. This GitHub repository has the source code.

The main functions of the decoder are:

  • Load the licence.dat file from disk.
  • Decode the licence.dat file in memory.
  • Parse the PE headers and Section headers.
  • Rebuild a valid PE file based on the parsed headers.

The following is an example video of the decoder in action:

Vote32.tmp

Vote32.tmp is the file that comes after licence.dat in the second stage download. Vote32.tmp is a DLL file that acts as a loader for licence.dat. The first time IcedID executes, licence.dat is loaded and run by the first stage. However, the core module then establishes persistence in the form of a scheduled task, which uses Vote32.tmp to load and run the licence.dat file.

The following is an example of the persistence that the core module sets up. The DLL being executed is a copy of Vote32.tmp that takes licence.dat as input.

IcedID Core Module Persistence

Execution of Core Module

Once the customized IcedID core module header has been parsed and the sections themselves have been mapped into memory, execution of the module can begin by entering the entrypoint of the file.

Entry into Core IcedID Module Entrypoint