Cryptographic Ransomware Encryption Detection: Survey: on Crypto-ransomware Behavior and Methodology

15 Jun 2024

Authors:

(1) Kenan Begovic, currently a Ph.D. candidate in Computer Science at Qatar University. He received his MS in Information and Computer Security from the University of Liverpool;

(2) Abdulaziz Al-Ali received the Ph.D. degree in machine learning from the University of Miami, FL, USA, in 2016 and is currently an Assistant Professor in the Computer Science and Engineering Department and director of the KINDI Center for Computing Research at Qatar University;

(3) Qutaibah Malluhi, a Professor at the Department of Computer Science and Engineering at Qatar University (QU).

2. On crypto-ransomware behavior and methodology

The crypto-ransomware attack is characterized by a specific action of encrypting victims’ data with the intention to extort financial or other benefits as a ransom for decryption. Researchers have observed distinct actions that mark noticeable separate phases of a ransomware attack (Moussaileb et al., 2021; Berrueta et al., 2019; Al-rimy et al., 2018; Eze et al., 2018). After a careful examination of different proposals for ransomware-specific kill chains, as well as the growing tendency of ransomware groups to carefully choose the target and emulate Advanced Persistent Threat (Sophos, 2021), we synthesized our findings and, as a result, identified four distinct phases of a crypto-ransomware attack. Our proposal for a kill chain is shown in Fig. 2.

2.1. Phases of the attack

Many other ransomware surveys use kill chains with different numbers and scopes of phases. We propose a kill chain with four distinct phases for a ransomware attack. This kill chain, presented in Fig. 2, was found to be the best fit for focusing on the detection of encryption as the defining characteristic of cryptographic ransomware attacks. The following subsections explain in detail the essential characteristics of each of the four phases of our kill chain, namely Initial compromise, Establishing foothold, Encryption, and Extortion, to present ransomware’s lifecycle and emphasize the importance of the Encryption phase.

2.1.1. Initial compromise

Initial compromise marks the phase in which a ransomware attack compromises the first computer. Various methods for delivering and executing initial compromise include phishing, spearphishing, corrupt web pages, and actual security bugs and system misconfigurations (vulnerabilities). Fig. 3 shows the most common methods of initial compromise based on original research by the authors, covering the years between 2013 and 2021.

As presented in Fig. 3, phishing is the most common method for initial compromise, often combined with exploiting vulnerabilities or corrupted websites.

Fig. 2. Ransomware kill chain.

Fig. 3. Initial compromise attack vectors.

Spear-phishing is rare, but that could result from many researchers not focusing on segregating spam (unsolicited emails), phishing, and spear-phishing.

Because the terminology describing initial compromise techniques is not standardized, applying one consistent classification across all sources is difficult. As a result, some ransomware attacks fall into more than one category. However, for the sake of clarity, we placed each such ransomware attack into the category corresponding to its most defining characteristic.

2.1.2. Establishing foothold

In most cases, after the initial compromise, the attacker attempts to establish a permanent foothold in the compromised system and to move laterally or otherwise spread. The activity usually, but not necessarily, starts with connecting to command and control (C2) servers. C2 is an Internet host or an entire infrastructure built to control the ransomware’s behavior by issuing commands, generating, distributing, and/or storing encryption keys, and collecting information about the ransomware victim.

Ransomware attacks that do not utilize C2 reduce detection surface to the host detection capabilities only, entirely avoiding network detection measures that focus on communication detection between the initial intrusion code and C2 (Berrueta et al., 2019). If this type of ransomware propagates and establishes a foothold in the manner of a worm, then network controls can detect it (Alotaibi and Vassilakis, 2021). Notable ransomware attacks that do not use C2 are BadRabbit (Alotaibi and Vassilakis, 2021), CTBLocker (Upadhyaya and Jain, 2016), Bart (Labs, 2017a), KillDisk (The rise of TeleBots, 2016), Patcher (New crypto–ransomware hits macOS, 2017), Revenge (Revenge Ransomware, a CryptoMix Variant, Being Distributed by RIG Exploit Kit), Spora (Lemmou et al., 2021), BTCWare (Wood and Eze, 2020), Crysis (Wood and Eze, 2020), NotPetya (NotPetya Ransomware Attack [Technical Analysis], 2017), GlobeImposter (Dargahi et al., 2019), Sage2.0 (Sage 2.0 Ransomware), Scarab (Lemmou et al., 2021), LockerGoga (Adamov et al., 2019), Jigsaw (Berrueta et al., 2019), Ryuk (A Targeted Campaign Break-Down - Ryuk Ransomware, 2018), and Zeoticus 2.0 (Walter).

Ransomware attacks that use C2 to establish control over compromised hosts and direct further actions take one of three approaches: static C2 server IP addresses, static DNS domains for C2 servers, or dynamically generated domain names. With static C2 server IP addresses, the IPs to which ransomware attempts to connect in this attack phase are hard-coded within attack tools and files downloaded in previous stages. In a recent example, the ransomware dubbed Maze was widely distributed in Italy during 2020, using a list of static IPs to connect to C2 servers and share information about the victim host immediately after the encryption (Ransomware Maze, 2020). Unlike Maze ransomware, WannaCry used static hard-coded DNS domains instead of IP addresses to access C2 servers. Incidentally, another static DNS domain, iuqerfsodp9ifjaposdfjhgosurijfaewrwergwea.com, was hard-coded into WannaCry. Subsequently, ransomware researchers found this domain name to be a kill switch for WannaCry propagation (Akbanov et al., 2019).

Fig. 4. Usage of encryption algorithms by major ransomware families 1989 - 2021

Finally, dynamically generated domain names characterize ransomware families that aim to make both static binary analysis and network detection difficult. Examples like Locky and TeslaCrypt ransomware (Berrueta et al., 2019) utilize domain generation algorithms (DGA) to create domain names dynamically. The purpose of a DGA is to make it difficult for defenders to discover and block C2 servers’ names and/or IP addresses. To keep its activity hard to detect and yet avoid total randomness, a DGA uses some of the following building elements:

  • Seed, which can be a word(s) and/or number(s), is a building element introduced by ransomware DGA writers, and it can be changed to segregate C2 domain names between different versions or groups of victims.

  • Time-based is the element that changes dynamically with time. It does not need to be necessarily influenced by time or date, and some other event can trigger it; the only condition is that it changes over a period of time.

  • Top-level domains (TLDs) are the final part of DGA-created domain names. The first two create the body of a domain name by being combined, and then a predetermined TLD is added. TLDs like “.xyz,” “.top,” and “.bid” are very popular when creating DGA (Arntz, 2016).
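From a defender's point of view, the three building elements above can be combined to pre-compute the domain names a known DGA would produce on a given day, for example to populate a deny-list or a DNS sinkhole. The following is a minimal sketch under stated assumptions: the function name `candidate_domains`, the use of SHA-256, and the way the seed, date, and counter are joined are all hypothetical and do not reproduce the algorithm of any ransomware family named in this section.

```python
import hashlib
from datetime import date


def candidate_domains(seed, day, tlds=("xyz", "top", "bid"), count=5, length=12):
    """Enumerate the domains a toy seed + date DGA would yield for one day.

    seed   -- the fixed word/number element (hypothetical value chosen by the analyst)
    day    -- the time-based element, here simply the calendar date
    tlds   -- predetermined top-level domains appended to each generated label
    """
    domains = []
    for index in range(count):
        # Combine the seed and the time-based element into one input string.
        material = f"{seed}|{day.isoformat()}|{index}".encode("ascii")
        digest = hashlib.sha256(material).hexdigest()
        # Map each hex digit (0-15) to a letter a-p so the label is a valid hostname.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:length])
        domains.append(f"{label}.{tlds[index % len(tlds)]}")
    return domains


if __name__ == "__main__":
    for name in candidate_domains("campaign-7", date(2021, 6, 15)):
        print(name)
```

Because the output depends only on the seed and the date, an analyst who recovers both from a sample can generate the same list as the malware and block the names before they are registered.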

Ransomware C2 servers’ communication plays a prominent role in many proposed ransomware detection mechanisms that detect C2 IPs and domain names in the ransomware tools and network traffic. These can be used in activities from deny-listing all the way to detecting DGA-created domain names in DNS queries to be used with DNS sinkholes (Dynamic Resolution: Domain Generation Algorithms, Sub-technique T1568.002 - Enterprise | MITRE ATT&CK®).
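One simple way to flag DGA-like names in DNS queries is to measure how random the queried label looks. The sketch below is an illustrative assumption rather than a method taken from the cited works: it computes the Shannon entropy of the leftmost label and flags long, high-entropy labels. The function names and the thresholds (`min_length=10`, `entropy_threshold=3.0`) are hypothetical and would need tuning against real traffic.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Return the Shannon entropy of a string in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_generated(domain, min_length=10, entropy_threshold=3.0):
    """Heuristic: a long, high-entropy leftmost label suggests a generated name."""
    label = domain.lower().rstrip(".").split(".")[0]
    return len(label) >= min_length and shannon_entropy(label) >= entropy_threshold


if __name__ == "__main__":
    for queried in ("google.com", "iuqerfsodp9ifjaposdfjhgosurijfaewrwergwea.com"):
        print(queried, looks_generated(queried))
```

A heuristic like this produces false positives on legitimate machine-generated hostnames (content delivery networks, for instance), so in practice it would be one signal among several rather than a verdict on its own.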

2.1.3. Encryption

The Encryption phase of a ransomware attack includes the following steps: encryption key generation, obtaining a public key from the C2 server, searching the file system, encryption, exfiltration of data with specific extensions or in particular folders, and deletion of possible backups like shadow volumes.

Different ransomware families use various encryption schemes to encrypt their victims’ data. Whether the attacker chooses symmetric encryption, asymmetric encryption, or a combination of both directly influences cryptographic key generation and management during the Encryption phase of the attack. Table 1 names prominent ransomware families since 1989 and their choice of encryption. Symmetric encryption alone and a combination of symmetric and asymmetric encryption are the most commonly used schemes, while asymmetric encryption alone is used much less. While researching sources for the information contained in Table 1, the authors compiled data from these sources to create Fig. 4, which shows the distribution of various encryption algorithms’ usage from the first ransomware attack in 1989 to the end of 2021. When symmetric encryption is used exclusively, key generation is done either with the local operating system’s cryptographic capabilities or with a custom implementation of cryptographic algorithms.

In Microsoft Windows, ransomware uses the function BCryptGenRandom from the Cryptography API: Next Generation (CNG), as exemplified by Noberus ransomware (Noberus), or CryptGenRandom, as in Maze ransomware (Ransomware Maze, 2020). In Apple’s macOS and iOS, the SecRandom function offers capabilities similar to CryptGenRandom, while Linux and several other UNIX-like operating systems implement getrandom as a system call. Ransomware for the latter operating systems uses open-source libraries like mbedtls, with examples seen in KeRanger (New OS X Ransomware KeRanger Infected Transmission BitTorrent Client Installer, 2016) and RansomEXX (RansomEXX Trojan attacks Linux systems, n.d.). When an asymmetric encryption scheme is used, the secret key is sometimes protected in remote secure storage. When the key pair is generated locally, the secret key is encrypted with another, C2-provided public key. The public key is either locally generated together with a secret key, supplied by a C2 server, or both. Ransomware like Cerber used a C2-supplied RSA public key to encrypt a locally generated RSA secret key, which was used to encrypt the locally generated RC4 key used to encrypt the victim’s files (Sala). On the other hand, CryptoWall ransomware would not start encryption unless a 2048-bit RSA key was received from C2 (Cabaj et al., 2015).

In most cases, successful ransomware attacks combine symmetric ciphers like Rijndael, ChaCha/Salsa20, or RC4 with asymmetric ciphers like RSA or ECC. This is primarily due to the speed advantage that a symmetric cipher has over asymmetric encryption when encrypting a large volume of data. In scenarios where the secret key remains on C2, asymmetric encryption is a good option for encrypting the symmetric key. This way, the responders on the victim’s side cannot recover the symmetric key and use it for decryption without paying the ransom. Speed is also a factor in locating the files to be encrypted by ransomware. Some attackers infect all drives alphabetically (in Windows-based attacks), while some limit infection to specific user folders like Desktop or Documents. The most sophisticated ransomware uses an allow-list to exclude specific system folders and system configuration files, so that the operating system remains functional after the encryption (Lemmou et al., 2021).

During the actual encryption, ransomware performs four operations: reading, encrypting in memory, writing to the file system, and removing original files. When reading, ransomware like CryptoWall tries to read each file in a single read, reducing the number of read/write operations (Lemmou et al., 2021). Alternatively, ransomware can use fixed block lengths for reading and writing files during encryption. WannaCry and LockerGoga ransomware read files in 256 KB and 64 KB blocks, respectively (Loman, 2019). In a third approach, ransomware reads a fixed buffer from the beginning or from the end of the file twice before committing a write to the file system. This behavior has been observed in Spora ransomware, which performs two reads from the end of each file before encryption, checking for marker values that the ransomware itself adds to the file content, to establish whether the file is already encrypted (Lemmou et al., 2021).
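The read behaviors described above leave different footprints in a per-file I/O trace. The following sketch is a hypothetical illustration, not a detector from the cited works: given the sizes of consecutive reads against one file, it labels the trace as a single whole-file read, a fixed-block read, or something else. The function name, the labels, and the input format are assumptions.

```python
def classify_reads(read_sizes, file_size):
    """Label a per-file sequence of read sizes (in bytes).

    read_sizes -- sizes of consecutive read operations observed for one file
    file_size  -- size of the file on disk, in bytes
    """
    if not read_sizes:
        return "no_reads"
    # One read that covers the whole file (CryptoWall-style behavior in the text).
    if len(read_sizes) == 1 and read_sizes[0] >= file_size:
        return "single_read"
    # All reads except the last share one size; the last may be a shorter tail.
    body = read_sizes[:-1]
    if body and len(set(body)) == 1 and read_sizes[-1] <= body[0]:
        return f"fixed_block:{body[0]}"
    return "other"


if __name__ == "__main__":
    print(classify_reads([1000], 1000))
    print(classify_reads([65536, 65536, 100], 131172))
    print(classify_reads([10, 500, 3], 513))
```

A label on its own says little, since backup tools and archivers also read in fixed blocks; the pattern becomes meaningful when it repeats across many files in a short time and is followed by writes of similar volume.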

Finally, ransomware destroys the original files in one of several ways. It can write encrypted data directly to the original file and then optionally rename it. Another way is to save the encrypted file in a new location and then delete, move, or overwrite the original. A third method involves moving the original file to some temporary location, overwriting it with encrypted data, and then moving it back to its original place in the file system.
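The three destruction methods can be told apart by the order of write, rename, and delete events recorded for a single file. The sketch below assumes a hypothetical event format, tuples such as `("write", path)`, `("rename", src, dst)`, and `("delete", path)`, and hypothetical label names; it simplifies the second method by treating only deletion or moving of the original as removal.

```python
def classify_destruction(events, original):
    """Classify how the original file at path `original` was destroyed."""
    current = original  # path where the original file object currently lives
    wrote_at_home = wrote_away = wrote_copy = removed = False
    for event in events:
        op, args = event[0], event[1:]
        if op == "write":
            if args[0] == current:
                # The original file object itself is being overwritten.
                if current == original:
                    wrote_at_home = True
                else:
                    wrote_away = True
            else:
                # Data is written to a different file, i.e. an encrypted copy.
                wrote_copy = True
        elif op == "rename" and args[0] == current:
            current = args[1]
        elif op == "delete" and args[0] == current:
            removed = True
    if wrote_away and current == original:
        return "move_overwrite_return"
    if wrote_at_home:
        return "in_place_overwrite"
    if wrote_copy and (removed or current != original):
        return "copy_then_remove"
    return "unknown"


if __name__ == "__main__":
    trace = [("rename", "a.doc", "tmp/x"), ("write", "tmp/x"), ("rename", "tmp/x", "a.doc")]
    print(classify_destruction(trace, "a.doc"))
```

Tracking the file object through renames, rather than matching on path names alone, is what lets the third method be distinguished from an ordinary in-place overwrite.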

RIPlace is a new technique for replacing the original files with encrypted files that has been able to bypass all of the known protection systems for the Windows family of operating systems (CISOMAG, 2019). Found in ransomware like Thanos (Walter), RIPlace utilizes the IRP_MJ_SET_INFORMATION system callback in combination with the legacy DefineDosDevice function to delete original files. At the same time, renaming is performed on both the original and the encrypted files.

Table 1. Major ransomware families’ usage of encryption schemes during the Encryption phase of the attack in chronological order 1989 - 2021.

Deletion of backup files most commonly occurs with the deletion of Windows Volume Shadow Copy using operating system tools or through encryption of shared drives when some sort of NAS solution is deployed for backup purposes.
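Shadow copy deletion through operating system tools is visible in process command lines, which makes it a practical host-side signal. The sketch below is an assumption-laden illustration: the three patterns correspond to commonly reported invocations of the Windows tools `vssadmin`, `wmic`, and `wbadmin`, they are not taken from the sources cited in this section, and the function name `is_backup_deletion` is hypothetical.

```python
import re

# Commonly reported command lines for removing Windows backups (illustrative, not exhaustive).
SUSPICIOUS_PATTERNS = [
    re.compile(r"\bvssadmin(\.exe)?\b.*\bdelete\s+shadows\b", re.IGNORECASE),
    re.compile(r"\bwmic(\.exe)?\b.*\bshadowcopy\b.*\bdelete\b", re.IGNORECASE),
    re.compile(r"\bwbadmin(\.exe)?\b.*\bdelete\s+catalog\b", re.IGNORECASE),
]


def is_backup_deletion(command_line):
    """Return True if a process command line matches a known backup-deletion pattern."""
    return any(pattern.search(command_line) for pattern in SUSPICIOUS_PATTERNS)


if __name__ == "__main__":
    print(is_backup_deletion("vssadmin.exe Delete Shadows /All /Quiet"))
    print(is_backup_deletion("vssadmin list shadows"))
```

Command-line matching does not cover deletion performed through direct API calls or the encryption of NAS-hosted backups, so it complements file system monitoring rather than replacing it.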

2.1.4. Extortion

Once the files are entirely or, in some cases, partially encrypted, the ransomware creates a ransom note as a text or HTML file instructing the victim on what to do in order to retrieve their data.

Payment of the ransom in the Extortion phase of a ransomware attack has posed a difficulty for cyber-criminals since ransomware’s first appearance in 1989. Before the appearance of cryptocurrency, the inability to remain anonymous pushed early ransomware attackers to use payment means like premium-rate text messages or pre-paid vouchers like Paysafe cards (Oz et al., 2022). After the introduction of Bitcoin in 2009, most ransomware attackers moved towards cryptocurrency ransom payments in the Extortion phase of the attack. In 2012, the locker ransomware Reveton was the first Ransomware-as-a-Service (RaaS) and the first ransomware to demand payment in Bitcoin. Among cryptographic ransomware, CryptoLocker in 2013 was the most advanced and among the first to strongly emphasize payment by Bitcoin (Liao et al., 2016).

Section 2.1 has outlined the main characteristics of all four kill chain phases. We identified the most common instances of crypto-ransomware behavior and methodology. However, to adapt this kill chain into the actionable recommendations necessary for the effective prevention of ransomware, the following sections introduce a novel approach in which ransomware detection concentrates on detecting Encryption as conceptualized in this section.

This paper is available on arXiv under the CC BY 4.0 DEED license.