Fast Crypto

2019-07-22 :: Jarrod Pas (ydob0n) from CyBRICS CTF Quals 2019

Here you have some modern encryption software. Actually, it’s even too modern for your hardware.

Try to find out how to decode the WAV file with a secret message:

From the description we can see that is this a crypto challenge with some custom rolled encryption. Lets take a look at what we have inside

$ unzip
Archive:  ../
  inflating: enc.wav
  inflating: public.key

We have enc.wav the encrypted data, the cipher, and public.key an RSA public key in JSON format.

Taking a closer look at we can see a few interesting things:

def get_next(a, power, N):
    b = a ** power % N
    return b, b % 256

Which is an incredibly inefficient1 RSA computation, you can do much better with modular exponentiation via pow(a, power, N).

Finally, the encryption algorithm:

    # calculate offset
    for _ in range(key['O']):
        seed = get_next(seed, power, key['N'])[0]

    with Path(sys.argv[2]).open('wb') as w:
        for i in tqdm(range(len(data))):
            seed, bt = get_next(seed, power, key['N'])
            w.write(bytes([data[i] ^ bt]))

Instead of actually using RSA, this algorithm uses RSA as a random number generator and uses the least significant byte of each step as a key byte. Since we know the output file is a WAV file we know what that the first 4 bytes are RIFF in ASCII, which we can use to determine that the first 4 bytes of the key stream are 14CA0F02!

The script also prompts the user for some input on the seed value for the RNG and which power to use, but we can think of these as our private key! The seed is a 16-bit integer and the power is between 2 and 16 so the number of possible keys is roughly 220, which is small enough that we can brute force it!

At this point I go and write the brute forcer in Julia since it is almost as fast as C but has easy constructs for parallelism. To get an idea of how long the brute force will take we will benchmark different implementations of the offset computation for each possible power.

samples = 30

println("original code ported")
@btime for power in 2:16
    state = rand(Int16)
    for _ in 1:O
        state = (state^power) % N
end samples=samples

println("looped powermod")
@btime for power in 2:16
    state = rand(Int16)
    for _ in 1:O
        state = powermod(state, power, N)
end samples=samples

println("powermod with powering offset")
@btime for power in 2:16
    state = rand(Int16)
    state = powermod(state, BigInt(power)^O, N)
end samples=samples

println("powermod with precomputed offset powers")
offset = collect(map(x -> BigInt(x)^O, 1:16))
@btime for power in 2:16
    state = rand(Int16)
    state = powermod(state, $offset[power], N)
end samples=samples

Which has following results:

original code ported
  2.708 s (4089858 allocations: 487.64 MiB)
looped powermod
  1.758 s (3745181 allocations: 121.70 MiB)
powermod with powering offset
  684.417 ms (194 allocations: 1.17 MiB)
powermod with precomputed offset powers
  656.555 ms (105 allocations: 1023.28 KiB)

Taking the final result as an estimate for each seed we can make the estimate that it will take about 12 hours to complete on a single core which is small enough that we could finish it before the CTF ends.

This is the brute forcing routine. tfilter is a helper function for a threaded calling of the key checker and progress monitoring. The full source can be seen here.

function find_keys(range)
    offset = collect(map(x -> BigInt(x)^O, 1:16))

    tfilter(range, v -> !isnothing(v)) do seed
        for power in 2:16
            state = powermod(seed, offset[power], N)

            state = powermod(state, power, N)
            if state & 0xff != 0x14 continue end

            state = powermod(state, power, N)
            if state & 0xff != 0xca continue end

            state = powermod(state, power, N)
            if state & 0xff != 0x0f continue end

            state = powermod(state, power, N)
            if state & 0xff != 0x02 continue end

            return seed, power

        return nothing

Now with this in hand we can throw a ton of CPUs at it and find the key! If you watch diligently it will come out at around 7%, but after an about an hour on a machine with 48 threads we exhaused the search space!

We found that seed is 4485 and power is 7. Now all we need to do is throw them into the

$ python3 enc.wav dec.wav
Input seed value (int16): 4485
Input power value (2-16): 7

At last we can listen to dec.wav and hear:

the flag is cybrics{blum_blum_crypto}

  1. When done this way it computes a ** power first, then computes the remainder which consumes an exponential amount of memory. ^