An Intermediate Guide To RSA

The venerable RSA public key encryption algorithm is very elegant. It requires a basic understanding of modular arithmetic, which may sound scary if you haven’t studied it. It reduces to taking the remainder after integer long division. The RSA Wikipedia article describes five simple steps to generate the keys. Encryption and decryption are a matter of basic exponentiation. There’s no advanced math, and it’s easy to understand their example of working with small numbers.

So when I was looking for a simple project to practice a new programming language, I thought, “Hey, RSA isn’t that hard, let me try implementing that.”

Yeah.

I quickly learned that those five simple steps do not map to simple code. There is a gap in the online materials documenting RSA. Most give the “beginner” version from Wikipedia. The rest are highly technical, with either deep discussions proving the math or the complex realities a truly secure production RSA implementation requires.

Nothing I read used variable names that were meaningful at all. The symbology of mathematics was designed (in the days before tab-complete) for ease of hand-writing. It pains me no end to read python code that ruin the true beauty of the language—it’s readability forgettable one-letter variables. I used to think my brain was too small for advanced mathematics, but it actually just needs better mnemonics.

I wanted an intermediate description of the RSA algorithm oriented towards programmers. Instead, I wrote one. This is it.

Standard Caveat: This code is provided entirely for pedagogical purposes. It is not intended to be cryptographically secure, do not use it in production.

Goals

This article provides:

comprehensible Python code implementing basic RSA
implementations of arithmetic algorithms (the only external imports are namedtuple and secrets).
unit tested code
a detailed walk-through of the RSA algorithm, mapping code to theory

The RSA algorithm in (sort-of) plain English

Here are snippets of the RSA algorithm, copy pasted from Simple English Wikipedia:

Generating a key:

Choose two different large random prime numbers p and q
Calculate n = pq
- n is the modulus for the public key and the private keys
Calculate the totient: ϕ(n) = (p−1)(q−1)
Choose an integer e such that 1 < e < ϕ(n), and e co-prime to ϕ(n) i.e: e and ϕ(n) share no factors other than 1; gcd(e ,ϕ(n) = 1.
- e is released as the public key exponent
Compute d to satisfy the congruence relation de ≡ 1 (modϕ(n)) i.e: de = 1+kϕ(n) for some integer k. (Simply to say : Calculate d = (1+kϕ(n))/e)
- d is kept as the private key exponent

Encrypting messages:

c = m ^e mod n

This can be done quickly using the method of exponentiation by squaring.

Decrypting messages:

m = c^d mod n

That may or may not make much sense, but if you read through the working example, it should come clear. It’s basic math, with the possible exception of de ≡ 1 (mod ϕ(n)). The ≡ symbol in modular arithmetic means “is congruent to” and the whole equation means “find d such that de mod ϕ(n) = 1”. “mod” means remainder. Python uses the % symbol for it.

Breaking it down, with code

The key to the steps above is to expect more computational work than I gave it credit for. None of the steps requires exponential brute forcing of a solution or anything (except, of course, hacking the original message from the cipher, but that is supposed to be hard), but they aren’t simple mathematical formulae. Each step is an algorithm in its own right. Only step 2, “Calculate n = pq is obviously straightforward, though step 3 is close.

Choose two different large random prime numbers

The only way to generate a random prime is to “guess and check”, which seems inefficient. Generating secure random data requires a little thought (though not much, in Python). Creating numbers as large as those needed for cryptography isn’t straightforward either. Making sure the number is prime is more computationally expensive than I expected.

Random numbers

Most people know not to use the python random module to create randomness for cryptography. The random module creates pseudo-random numbers. If you know the algorithm used to generate a random number (there are many), it’s straightforward to regenerate the sequence. These numbers are mathematically related, and are therefore not cryptographically secure.

Luckily, modern operating systems can create truly random numbers. I don’t know the techniques, but I believe it’s based on entropy (randomness) generated from the unpredictable flow of i/o through the various peripherals, storage drives, and network. Maybe someone can give us more information in the comments. Python (3.6 and above) kindly makes this random data available in the secrets module.

Large numbers

So making a number random is easy if you know the secrets, but how do we define “large number”? I may come across as not very bright here, but it took me a bit of research to understand. It’s not just a matter of picking some minimum and maximum “large enough” integers and picking an integer between them using randint. Aside from the fact that secrets doesn’t have a randint function, how do we arbitrarily decide what the minimum and maximum values are?

Recall that cryptographers usually talk about encryption in terms of bits. When I remember SSL becoming a thing, “really secure, like banks use” meant 128-bit encryption, which sounds funny today. I checked https://docs.python.org/ (I had it open to the secrets module, obviously), which uses 2048-bit encryption. That, plus the fact that secrets.randbits is a function is a hint as to how we can generate our “large random numbers”.

Remember, we need two numbers p and q which get multiplied together to get n. Those variables are meaningless to me, so I’m going to say prime1 and prime2 which are multiplied together to get the key_modulus instead. The key_modulus will be used as the modulus part of the key during encryption and decryption.

“2048-bit encryption” refers to the number of bits in this key_modulus. In order to make key_modulus have 2048 bits (give or take), the two prime numbers each need 1024 bits. A first attempt might look like this:

from secrets import randbits

def random_prime(num_bits:int) -> int:
return randbits(num_bits)

The problem with this code is that, while it returns 1024 random bits, the bits could be 1023 zeros and 1 one. It doesn’t matter how many zeros are in front of a 1, it’s still a 1. Just guess how cryptographically secure the number 1 is.

The solution is to “set the high bit”, that is: ensure the bit in the 1024th position is a 1. I don’t do a ton of bitwise arithmetic in Python, so I had to look up the syntax: value | 1 << (num_bits - 1).

To break that down, imagine numbits is 5. The 1 << x creates a number with x zeros. So 1 << 4 results in the binary number 10000, which is the integer 16. The value | x is the “bitwise or” operator. It yields a number where at any bit position, if either of the two numbers has a 1, the resulting number is also a 1. So the bitwise or for the integers 6 and 16 is as follows:

110
10000
-----
10110

which represents the integer number 22.

(If testing in the python interpreter, you can see the binary form of an integer using bin(the_integer) or f"{the_integer:b}")

Since I was doing bitwise arithmetic anyway, I added a couple extra bit shifts and a wrapper function to make my API a bit (no pun intended) cleaner:

def guarantee_bits(value: int, num_bits: int) -> int:
return value & (1 << num_bits) - 1 | (1 | 1 << (num_bits - 1))

def exact_randbits(num_bits: int) -> int:
return guarantee_bits(randbits(num_bits), num_bits)

In addition to setting the high, the bitwise operators in guarantee_bits do the following:

If the value has too many bits, truncate the extras. This situation can’t happen in the secrets.randbits scenario, but I wanted my unit test to be complete for the hypothetical case.
Make sure the lowest bit is set, which forces an odd number. This cuts the search space for a prime number in half (all even numbers above two are not prime).

Prime numbers

Having generated a random number, the next step is to check whether or not it is prime and discard it if not. Primality testing is also not trivial.

To be prime (by definition), an integer must not be divisible by any integer other than itself and one. I started with an implementation of the fairly obvious trial division method, which checks if any possible divisor evenly divides it. Some tricks can be applied to reduce the number of factors that need to be checked, but it’s still not efficient.

My initial implementation of this algorithm looked like so:

FIRST_PRIMES = [2, 3, 5, 7, 13, 17, 19, 23]

def trial_division(number: int) -> int:
if number <= 1:
return False
for prime in FIRST_PRIMES:
if number == prime:
return True
if not number % prime:
return False

    redundant_above = math.ceil(math.sqrt(number))
    for i in range(5, redundant_above + 1, 6):
        if not number % i or not number % (i + 2):
            return False
    return True

This works. It passed my unit test, which checks it against a first thousand primes list I found online. But it is too slow for cryptography. I gave up after waiting a few minutes for it to test one prime number.

In spite of popular belief, I rarely find that Python is “too slow”, but when it is, I reach for Cython. A Cython implementation of the algorithm gave more than an order of magnitude speedup. I think it might even have been “fast enough” for “educational purposes” except for one small issue:

Python knows how to do math on integers of any size. C (which Cython transpiles to), on the other hand has 64-bit, integers. 64 is obviously nowhere near the 1024 or more bits we need. I decided it wasn’t worth studying libraries for arbitrary precision integer arithmetic in C and set that solution aside. I won’t include it here, but my Cython code is in the GitHub repository.

Instead I turned to what is usually the correct solution when faced with Python code that is “too slow”: use a smarter algorithm.

Most of the RSA resources I read said to use the Miller-Rabin test, but a few suggested the Rabin-Miller test! I didn’t originally follow those suggestions, assuming that any algorithm named after two people is probably hard to understand and implement (RSA, after all is named for the initials of three people). I’m not looking to create a production implementation of RSA here, just to understand and demonstrate how it works.

However, after reading the Miller-Rabin Wikipedia page as well as most of the top Google results, I was able to understand the algorithm. This link, especially, gives an easier-to-understand description, although I rearranged it for my own code (probably introducing some errors along the way).

The Miller-Rabin test does not prove that a number is definitely prime. It can prove a number is composite (not prime), though. If you run the test multiple times, seeded with a random number, you can get near enough to proof that it doesn’t matter.

My implementation looks like this:

FIRST_PRIMES = [2, 3, 5, 7, 13, 17, 19, 23]

def miller_rabin(number: int) -> int:
if number <= 1:
return False
for prime in FIRST_PRIMES:
if number == prime:
return True
if not number % prime:
return False

    odd_factor = number - 1
    mod_levels = 0
    while odd_factor % 2 == 0:
        odd_factor = odd_factor // 2
        mod_levels += 1

    for trials in range(40):
        witness = random.randrange(2, number - 1)
        mod_pow = pow(witness, odd_factor, number)
        if mod_pow == 1:
            continue
        for i in range(mod_levels):
            if mod_pow == (number - 1):
                break
            else:
                i = i + 1
                mod_pow = (mod_pow ** 2) % number
        else:
            return False
    return True

The first section checks some trivial cases. It iterates the first few hard-code primes and tests if any of them evenly divides the candidate prime (if number % known_prime = 0 then number is composite).

The second section divides the number in half repeatedly down to an odd number. We know number - 1 is even because number is odd: all even numbers would have been weeded out by being multiples by the 2 in FIRST_PRIMES. This gives us the number of ’levels’ of modular operations that need to be checked by the inner loop.

The third section does the actual test. It iterates 40 times, enough to “virtually guarantee” that the candidate number is truly prime. It picks a random number as a witness. If the witness proves the number is composite, we discard it, returning False.

I briefly understood the math of the inner loop, but you should probably look to Wikipedia or a search engine to get a decent proof.

You may not have seen the three-argument version of the pow function used in the first highlighted line. It does exponentiation with a modulus operator (i.e: x^y mod z). The Python pow function can calculate x ** y % z more efficiently.

The for...else syntax may also be unfamiliar. I usually avoid this construct because it isn’t widely known, but seemed the most concise way to implement this algorithm. The else clause on a for loop is only executed in the event that the loop ran to completion without encountering a break statement.

If this loop reaches a state of mod_pow == (number - 1), then the number might be prime and it breaks to test another witness. If the loop passes through all the mod_levels without reaching that state, then the number is definitely composite and we can return False.

If 40 such witnesses cannot prove the number is composite, we assume (with tremendous confidence: my favourite stack overflow answer indicates a bit flip in your CPU is more likely) that the number is prime and return True.

I probably shouldn’t have gone all the way down that rabbit hole. The sympy.isprime function is certainly better. But rabbit holes can be fun and educational, and it was worth it.

Now the earlier random_prime function can finally be extended to loop until it finds a prime number:

from .is_prime import miller_rabin as is_prime

def random_prime(num_bits: int) -> int:
number = exact_randbits(num_bits)
while not is_prime(number):
number = exact_randbits(num_bits)
return number

With random_prime available and unit tested, we can start the generate_key function coordinating the four steps:

def generate_key(num_bits: int) -> RSAKey:
prime1 = random_prime(num_bits // 2)
prime2 = random_prime(num_bits // 2)
while prime2 == prime1:
prime2 = random_prime(num_bits // 2)

I threw the while loop in their for the unlikely scenario of getting the same 1024 bits twice. This makes the unit tests more robust when testing against small numbers.

A note on the return value

I’m using type hints throughout the article. That is overkill since every parameter and argument is an integer except this RSAKey construct returned here. `RSAKey is a dataclass (before Python 3.7 I would have used a namedtuple) containing the numbers required to perform encryption and decryption:

from dataclasses import dataclass

@dataclass
class RSAKey:
modulus: int
pub_exponent: int
priv_exponent: int

RSA’s public key and private key each require two numbers: the modulus and the appropriate exponent.

Calculating n and ϕ

Having two large prime numbers, steps 2 and 3 of the key generation algorithm are trivial. Here’s the beginnings of my generate_key function:

def generate_key(num_bits: int) -> RSAKey:
prime1 = random_prime(num_bits // 2)
prime2 = random_prime(num_bits // 2)
while prime2 == prime1:
prime2 = random_prime(num_bits // 2)
key_modulus = prime1 _ prime2
totient = (prime1 - 1) _ (prime2 - 1)

The totient, short for Euler’s totient represents the number of integers smaller than a number that are co-prime, such that the greatest common divisor of the two numbers needs to be 1.

The totient is an important link between modular arithmetic and a mathematical discipline known as group theory that is critical to proving cryptography algorithms, but not to understanding them.

For this purpose, we just need totient = (prime1 - 1) * (prime2 - 1). For prime numbers, the totient is prime - 1, and we can multiply the two prime totients to get that of the modulus.

Choose e, the public exponent

Researching this step was probably the most annoying part. Apparently, most algorithms just set e to 65537 and be done with it. The pub_exponent (as I prefer to call e) must have no common factors with the totient we just calculated in order to have a valid decryption key. My understanding is that most algorithms assign 65537 to pub_exponent before calculating the primes and totient. If the totient is invalid, they just pick different primes.

I felt like that was cheating! I’m no security expert, but it also seems less secure since it reduces the search space if you want to brute force the two primes. For fun, I decided to make pub_exponent choose a random number in my implementation since it matches the text description of the algorithm.

So pub_exponent and totient must have no common factors, which means the greatest common divisor must be 1.

I learned about gcd in grade school a quarter century ago, but that doesn’t mean I still remember how to do it. I could get away with using the math.gcd python library function, but I had it in my head to understand all the math I’m doing.

The algorithm to calculate a greatest common divisor between two numbers is known as the Euclidean Algorithm. It’s easy:

def gcd(a, b):
if b == 0:
return a
else:
return gcd_recursive(b, a % b)

I didn’t use this version, though. It would hit recursion limits with the ridiculously large numbers we are using. In the next section, I use an iterative version of the Extended Euclidean Algorithm to calculate modular inverse, so I get the gcd from that as well.

Assuming a working gcd function, Step 4 adds the following lines of code to generate_key:

pub_exponent = random_prime(num_bits - 2)
while pub_exponent >= totient or gcd(pub_exponent, totient) != 1:
pub_exponent = random_prime(num_bits - 2)

Subtracting 2 from num_bits ensures pub_exponent is lower than totient, a requirement for the algorithm to work.

Compute d to satisfy the congruence relation

Looking at step 5 in the RSA description, I originally thought that it was simple mathematics. After all Wikipedia actually uses the word “Simply”: “Simply to say : Calculate d = (1+kϕ(n))/e)”. Unfortunately, it’s not that simple because we don’t know what k is, so we can’t just plug it into the simple equation.

The value we are trying to find, priv_exponent (rather than d) is called the modular inverse of pub_exponent. The easy way to calculate this is to use the sympy.numbers.mod_inverse function. Honestly, if I were you, I’d just do that if I were you and skip the following details. ;-)

This site gives a pretty good English overview of what modular inverse is. It can be calculated using the Extended Euclidean Algorithm, described well here. This also gives us the gcd function to calculate the public key above.

In addition to the greatest common divisor returned by the normal Euclidean Algorithm, the extended version calculates two numbers, x and y such that the equation ax + by = gcd(a, b). If a is the pub_exponent and b is the totient, then the x returned by the algorithm is the modular inverse. This is only true if the gcd is 1, but that was guaranteed by the previous step.

The recursive definition of egcd is as follows:

def egcd(a, b):
if a == 0:
return (b, 0, 1)
else:
g, x, y = egcd(b % a, a)
return (g, y - (b // a) \* x, x)

It may be easy to understand, but it causes recursion errors on our huge numbers. The article I linked gives the full iterative algorithm, but I shortened it a bit since modinv doesn’t need the y. This code listing also includes the gcd and modinv functions called from generate_key:

def \_xgcd(b: int, a: int) -> typing.Tuple[int, int]:
x0, x1 = 1, 0
while a != 0:
x0, x1 = x1, x0 - b // a \* x1
b, a = a, b % a
return b, x0

def gcd(b: int, a: int) -> int:
return \_xgcd(b, a)[0]

def modinv(base: int, modulus: int) -> int:
gcd, x = \_xgcd(base, modulus)
if gcd != 1:
raise ValueError(f"No modular inverse for {base} {modulus}")
return x % modulus

With these helper functions in place, generate_key can be completed as follows(new lines highlighted):

def generate_key(num_bits: int) -> RSAKey:
prime1 = random_prime(num_bits // 2)
prime2 = random_prime(num_bits // 2)
while prime2 == prime1:
prime2 = random_prime(num_bits // 2)
key_modulus = prime1 _ prime2
totient = (prime1 - 1) _ (prime2 - 1)
pub_exponent = random_prime(num_bits - 2)
while pub_exponent >= totient or gcd(pub_exponent, totient) != 1:
pub_exponent = random_prime(num_bits - 2)
priv_exponent = modinv(pub_exponent, totient)
return RSAKey(key_modulus, pub_exponent, priv_exponent)

Encryption and Decryption

Encrypting and decrypting are as “easy” as raising the message to the power of the public or secret exponent mod the modulus. The pow function can do this modular arithmetic. It’s a good thing pow is a python builtin so it doesn’t violate my “don’t import any extra math modules” commitment!

Here are the encrypt and decrypt functions:

def encrypt(key: RSAKey, message: int) -> int:
return pow(message, key.pub_exponent, key.modulus)

def decrypt(key: RSAKey, ciphertext: int) -> int:
return pow(ciphertext, key.priv_exponent, key.modulus)

You’ll notice that message and ciphertext are both integers and may wonder how that is useful to encrypt a text message (or a jpg, for that matter). I’m not going to dive into the details, but the simple answer is they are just bits. Represent your string bytes as an integer and you’re set.

Almost. There are a couple caveats.

You can’t just throw int.to_bytes and int.from_bytes at it because the bytes won’t necessarily line up (you’ll end up with extra zeroes). Do a web search for “RSA padding schemes” to solve this (and improve the security).

In addition, the message (and ciphertext) integer must be smaller than the modulus. If your modulus is 2048 bits, that’s 256 bytes. That’s at most 256 characters using ASCII text, fewer if you like emojii. It’s not even enough to offend Twitter’s character limit. How do you send anything meaningful?

One obvious answer is to split the message into 256 byte chunks and encode each into packets to be reassembled at the other end. But that’s not the normal approach. Encryption using RSA is CPU intensive. There are much faster symmetric key encryption schemes (search for Advanced Encryption Standard). The problem with symmetric key encryption, compared to public key encryption is that you need to securely share the secret key with the receiving party. Asymmetric key cryptography doesn’t have that problem. If an attacker knows the public key, they still can’t decrypt the message without the private key.

So just take the best of both worlds. Create a random AES secret key, then encrypt and share it using the recipient’s public key. Once you both have the symmetric key, all communication can be securely encrypted using it. The maximum size of an AES key is 256 bits, which can easily be encrypted with a 2048-bit public key.

Conclusion

I hope this article helped clarify your understanding of RSA. Moreover, I hope you can implement it with less frustration than I experienced while reading the deceivingly simple descriptions I could find.

My implementation is definitely not production-grade. I designed it, as much as possible, to be readable and educational, not secure or optimal.

I am probably more of a novice in security and cryptography than some people reading this. Please point out any errors, omissions, or confusion in the comments and I’ll try to improve it.

All code available on GitHub.