The curious case of BLATSTING's RSA implementation

Among BLATSTING’s modules is one named crypto_rsa. Going by the name, one would expect it to implement the well-known asymmetric cryptosystem.

RSA

One would expect an encryption operation

o = m^e (mod n)

and a matching decryption operation

o = m^d (mod n)

where m is the input (the message for encryption, the ciphertext for decryption), o is the output, n is the RSA modulus (p times q), and e and d are the encryption and decryption exponents respectively, parameters computed during key generation and stored in a public or private key structure.
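For reference, here is a minimal sketch of those two operations in Python. The tiny primes and the exponent 17 are textbook toy values chosen for illustration, not anything taken from BLATSTING.

```python
# Toy RSA with tiny textbook primes; real keys use primes of 512 bits or more.
p, q = 61, 53
n = p * q                      # RSA modulus
phi = (p - 1) * (q - 1)
e = 17                         # public (encryption) exponent, coprime to phi
d = pow(e, -1, phi)            # private (decryption) exponent, needs Python 3.8+

m = 65                         # message, must satisfy 0 <= m < n
c = pow(m, e, n)               # encryption: o = m^e (mod n)
assert pow(c, d, n) == m       # decryption: o = c^d (mod n) recovers m
```

Signature verification uses the same public-exponent operation as encryption, which is why an interface offering only "encrypt" can still be useful.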

The interface

crypto_rsa offers four functions:

i01020000 crypto_rsa

  + 0x14: 2 args create_ctx(keyblock,size)
      Allocates a context, endian-swaps and copies the key data.

  + 0x18: 4 args encrypt(ctx, inptr, outptr, inlen)
      Performs (supposedly) RSA encryption using a fixed exponent e=65537

  + 0x1c: 0 args dummy()
      Unimplemented, just "ret"

  + 0x20: 1 args free_ctx(ctx)
      Frees the context ctx

Apparently the interface only offers encryption, likely used for signature verification. The keyblock is a structure of 544 bytes containing an RSA key of up to 1024 bits, with various bignum parameters represented by arrays of 32-bit integers:

struct key_block {
    uint32_t n[32];    // modulus
    uint32_t unk0[32]; // Unused
    uint32_t x[32];    // ?
    uint32_t unk1[32]; // Unused
    uint32_t unk2;     // Unused
    uint32_t fudge;    // ?
    uint32_t padding[6];
};
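To make the layout concrete, here is a rough sketch of how such a block could be unpacked into Python integers. The byte order (little-endian words) and the word order (least-significant word first) are assumptions on my part, and parse_key_block and build_key_block are hypothetical helper names; the real create_ctx does an endian swap, so the stored order may well be the opposite.

```python
import struct

WORDS = 32                                   # 32 x 32-bit words = 1024 bits
# Four bignums, two scalars, six padding words. "<" = little-endian (assumption).
KEY_BLOCK_FMT = "<" + "I" * (4 * WORDS + 2 + 6)
assert struct.calcsize(KEY_BLOCK_FMT) == 544

def words_to_int(words):
    # Assumption: least-significant word first.
    return sum(w << (32 * i) for i, w in enumerate(words))

def int_to_words(value, count=WORDS):
    return [(value >> (32 * i)) & 0xFFFFFFFF for i in range(count)]

def parse_key_block(blob):
    """Split a 544-byte key block into its fields (hypothetical helper)."""
    w = struct.unpack(KEY_BLOCK_FMT, blob)
    return {
        "n": words_to_int(w[0:32]),
        "unk0": words_to_int(w[32:64]),
        "x": words_to_int(w[64:96]),
        "unk1": words_to_int(w[96:128]),
        "unk2": w[128],
        "fudge": w[129],
    }

def build_key_block(n, unk0, x, unk1, unk2, fudge):
    """Inverse of parse_key_block, handy for round-trip testing."""
    words = (int_to_words(n) + int_to_words(unk0) + int_to_words(x)
             + int_to_words(unk1) + [unk2, fudge] + [0] * 6)
    return struct.pack(KEY_BLOCK_FMT, *words)
```

The size check confirms that four 32-word arrays, two scalars and six padding words add up to the 544 bytes mentioned above.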

The fixed exponent e is not encoded in this structure.

But what are the extra fields x and fudge for?

Curiouser and curiouser

Here’s a Python version of what encrypt does:

def weirdmod(a, b, fudge): # function at 0x08000cd4
    # tally: 32 integer multiplications, 32 bn_muls, 32 bn_adds, 1 bn_compare, 1 bn_sub
    v = a
    for i in range(32):
        v = ((((fudge * (v&0xffffffff))&0xffffffff) * b) + v) >> 32
    if v > b:
        v -= b
    return v

# RSA according to BLATSTING
def bs_rsa_encrypt(ctx, temp): # function factored out for clarity
    # pre-multiplication
    temp = weirdmod(temp * ctx.x, ctx.n, ctx.fudge)

    # m ** 65537 mod n
    to = temp
    for i in range(16): 
        to = weirdmod(to * to, ctx.n, ctx.fudge)
    temp = weirdmod(temp * to, ctx.n, ctx.fudge)

    return weirdmod(temp, ctx.n, ctx.fudge)

def bs_rsa_encrypt_outer(ctx, inptr, outptr, inlen): # function at 0x08000170
    # memcpy_bswap4_in/out stand in for the byte-swapping copies between buffers and bignums
    temp = memcpy_bswap4_in(inptr, inlen)
    temp = bs_rsa_encrypt(ctx, temp)
    memcpy_bswap4_out(outptr, temp, inlen)

Broadly it looks like an RSA encrypt operation with a hard-coded exponent of 65537 (which is standard), except that an unconventional pre-multiplication with x is done. After each operation a mod is applied to bring the result back within the range [0..n-1].

But wait: note that weirdmod does not actually, as one would first expect, implement a modulus operator. I’m honestly not sure what it is. Unlike mod, applying it repeatedly to a value does not yield the same result, and applying it to 1 does not yield 1. What use would they have for such a mutilated version of RSA?

Call site

The only user of this module is TADAQUEOUS, which calls it from the hooked function __add_ipsec_sa. It supplies the following parameters:

class Context:
    n = 0xd257c42f17e16815bef4c2f3fede55b5b7ed35fa4ae040aac0515a7bc662f564ac4e98272b61c24b666581479b295833ba9f22d6df733dacd819599fcc757e40a63f88fcbd3007ce7775688e5288a5810add8c1badb4773bff9abb067cf35f5d51bc23f02192cf67a54fbc4e54d7933023511b8812e0f6de8cc8ea1ef2361241
    unk0 = 0x2da83bd0e81e97ea410b3d0c0121aa4a4812ca05b51fbf553faea584399d0a9b53b167d8d49e3db4999a7eb864d6a7cc4560dd29208cc25327e6a660338a81bf59c0770342cff831888a9771ad775a7ef52273e4524b88c4006544f9830ca0a2ae43dc0fde6d30985ab043b1ab286ccfdcaee477ed1f0921733715e10dc9edbf
    x = 0x76bc66dabca44047215cedfe4b6182cee4a9af38201d5b83ea8b3ab5ad7a05e835327be2337d8c302adb02625af5d206d6d28393e570308d66f99f5f368f14b129e85067e8e662d33a8f7de7db52d2dffee4637e276ac79d490654da2e4bedfbb293ec461fe848979ba81b39e1a8bebe6a8940f12391e436772ca7d14c42c0eb
    unk1 = 0
    unk2 = 1
    fudge = 0xbb4d023f

Surprise

So imagine my surprise when I tried it out and compared, using the above parameters:

import random

# Conventional RSA
def rsa_encrypt(ctx, m):
    return pow(m, 65537, ctx.n)

# Try with random 1024-bit value
ctx = Context()
m = random.randint(0, (1<<1024)-1)

# Compare results
assert(rsa_encrypt(ctx, m) == bs_rsa_encrypt(ctx, m))

The result matches conventional RSA without pre-multiplication and with a normal expmod operator! So it is some kind of optimization, but I had not seen it before, which doesn’t say that much, but it’s not part of e.g. OpenSSL. Edit: it is; according to k240df and martins_m on reddit this is Montgomery reduction, which is in OpenSSL under crypto/bn/bn_mont.c. The thought that it might be Montgomery reduction did come up while writing this, but I did not recognize it as such.
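If the Montgomery reading is right, the extra key fields fall into place: with R = 2^1024, fudge would be -n^-1 mod 2^32 and x would be R^2 mod n, and the reduction itself computes a * R^-1 mod n. The sketch below checks this on a randomly generated odd modulus rather than the TADAQUEOUS key. mont_reduce and mont_params are names made up here, and the final comparison uses >= where the transcription above has >.

```python
import random

WORD_BITS, WORDS = 32, 32
WORD_MASK = (1 << WORD_BITS) - 1
R = 1 << (WORD_BITS * WORDS)              # R = 2^1024

def mont_reduce(a, n, n0inv):
    """Return a * R^-1 mod n, for 0 <= a < n * R and odd n."""
    v = a
    for _ in range(WORDS):
        k = (n0inv * (v & WORD_MASK)) & WORD_MASK   # makes the low word of v + k*n zero
        v = (v + k * n) >> WORD_BITS                # exact division by 2^32
    return v - n if v >= n else v

def mont_params(n):
    """Derive the per-key constants (assumed meaning of fudge and x)."""
    n0inv = (-pow(n, -1, 1 << WORD_BITS)) % (1 << WORD_BITS)   # Python 3.8+
    r2 = (R * R) % n
    return n0inv, r2

rng = random.Random(1)
n = rng.getrandbits(1024) | (1 << 1023) | 1        # arbitrary odd 1024-bit modulus
n0inv, r2 = mont_params(n)
m = rng.randrange(n)

# Multiplying by R^2 and reducing moves m into Montgomery form m*R mod n.
m_mont = mont_reduce(m * r2, n, n0inv)
assert m_mont == (m * R) % n

# 16 squarings and one multiply give m^65537, still in Montgomery form.
acc = m_mont
for _ in range(16):
    acc = mont_reduce(acc * acc, n, n0inv)
acc = mont_reduce(acc * m_mont, n, n0inv)

# A final reduction strips the factor R again.
assert mont_reduce(acc, n, n0inv) == pow(m, 65537, n)

# The published fudge times the low word of the published n is -1 mod 2^32.
assert (0xbb4d023f * 0xf2361241) & WORD_MASK == WORD_MASK
```

This would also explain why applying the reduction to 1 does not yield 1: it yields R^-1 mod n instead.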

I’m not up to date with the state of the art in efficient bignum arithmetic. Assuming 1024-bit numbers: a naive modulus implementation based on long division would take up to 1024 bignum comparisons and 1024 bignum subtractions, whereas the weirdmod operation takes 32 integer multiplications, 32 bignum muls, 32 bignum adds, 1 bignum compare, and 1 bignum sub. Whether it is a win depends on how bignum multiplication is implemented. A naive bignum multiplication would take up to 32*32 integer multiplications and 32 bignum adds, in which case it would not really help. I have not studied the particular bn_mul implementation in BLATSTING (address 0x080004a0*).
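One way to put a number on this is to count 32x32-bit multiplications in a word-level version of the reduction. In the Python transcription the multiplier in each round is a single 32-bit word, so each "bignum mul" there is one word times a 32-word number. The sketch below is my own word-array rendering under that assumption (it says nothing about how BLATSTING's bn_mul actually works), and the helper names are invented for the example.

```python
import random

MASK = 0xFFFFFFFF

def to_words(value, count):
    # Little-endian list of 32-bit words.
    return [(value >> (32 * i)) & MASK for i in range(count)]

def from_words(words):
    return sum(w << (32 * i) for i, w in enumerate(words))

def mont_reduce_words(a, n, n0inv, s=32):
    """Return (a * 2^(-32*s) mod n, number of 32x32-bit multiplications)."""
    t = to_words(a, 2 * s + 1)          # one spare word for the top carry
    nw = to_words(n, s)
    mults = 0
    for i in range(s):
        k = (t[i] * n0inv) & MASK       # one word multiply per round
        mults += 1
        carry = 0
        for j in range(s):              # add k * n at word offset i
            acc = t[i + j] + k * nw[j] + carry
            mults += 1
            t[i + j] = acc & MASK
            carry = acc >> 32
        j = i + s
        while carry:                    # ripple the carry upwards
            acc = t[j] + carry
            t[j] = acc & MASK
            carry = acc >> 32
            j += 1
    v = from_words(t[s:])               # dropping the low s words divides by 2^(32*s)
    if v >= n:
        v -= n
    return v, mults

rng = random.Random(2)
n = rng.getrandbits(1024) | (1 << 1023) | 1
n0inv = (-pow(n, -1, 1 << 32)) % (1 << 32)
a = rng.randrange(n << 1024)
v, mults = mont_reduce_words(a, n, n0inv)
print(mults)                            # 32 * 32 + 32 = 1056
```

Under these assumptions one reduction costs about as many word multiplications as one schoolbook 1024x1024-bit multiplication (32*32 = 1024), with no division or trial subtraction loop.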

Independent of the performance characteristics, I think this alternative implementation is worth highlighting, as it is in things like this that the Equation Group stays true to its name. It looks like a form of Barrett reduction, turning divisions into multiplies and precomputing a multiplicand for the exact number of modular reductions required. Edit: apparently this is a well-known optimization called Montgomery reduction. Disappointing, I had hoped to catch at least some crypto magic in the act.

* All mentioned memory addresses are as shown by radare2, which loads the ELF part of Firewall/BLATSTING/BLATSTING_201381/LP/lpconfig/m01020000/m01020000.impmod at 0x08000000.

Written on September 13, 2016