Jonathan Leto: January 2009 Archives

Primes, Primality and Pseudoprimes

| | Comments (1) | TrackBacks (0)
Primes are simple, right? A prime is a number that is divisible by only 1 and itself. The devil is in the details, though.

What if your number is really big and someone else does not believe you that it is prime? How do you prove it? It turns out to be really easy to prove something is composite ( the opposite of prime, a.k.a not prime ), all you have to do is show them a single factor of the number.

If a shadowy figure emerges from a dark alley and shows you a strip of paper:

x =742376437546875526384762834762837468273648992

and says "I'll sell you this big fat prime for only five bucks" you can say "it is divisible by 2 and therefore composite. QED and go away."

But if you are trying to be an upstanding prime-number-selling business and therefore want to go the extra mile and prove that the numbers you sell are indeed prime, you will quickly go out of business. I guess that is why there are still sheisty guys selling primes in trench coats...

The reason is that to prove that x=N is a prime number, you are actually making a statement about x=N as well as saying that all numbers y such that y < N are not factors of x . This is a lot of information!

Currently it takes at most a few seconds to prove that large numbers are composite, but to prove that a similarly-sized number is prime could take thousands of CPU-hours!!!  For those that like specific numbers, it took 6 CPU-years to prove that  the 20,562-digit Mills' prime was indeed prime.

Even for numbers in the range of current cryptography systems (a few hundred digits), proving something is a prime is just too damn slow. Factoring integers is often a subalgorithm of many number theory and cryptography algorithms, it has to be as fast as possible. When a software algorithm factors a number, it often does some kind of check that all of the numbers it is returning are indeed prime, but it cannot afford to prove it. This is where pseudoprimes and primality testing come in, stage left.

Primality testing is mainly concerned with finding algorithms that are fast but have low probability to incorrectly declare that a composite number is prime. A pseudoprime is such a number that "fakes out" a primality test, also known as a false positive. Obviously, people developing cryptography software should be very concerned with how often they are going to produce pseudoprimes, more on this later.

Modern primality testing started with Pierre de Fermat, but Chinese mathematicians described the beginnings of these algorithms 2000 years before Fermat was on the scene. Numbers which "fake out" the Fermat Primality Test are called Carmichael numbers,  and in 1994 a group of mathematicians proved that there are infinitely many of them, i.e. there is not only a few isolated numbers that "fake out" the test, the more your search for them, the more you find.

Some software packages may still use the Fermat Primality test for some algorithms, but it has been mostly superceded by the Rabin-Miller primality test.

This test allows you to run k repetitions of the algorithm and on average, the probability of it incorrectly concluding that a composite number is prime is P = 4^{-k} . For k=20 this is roughly a chance of one in a trillion (1000000000000, a 1 with 12 zeros) that it will be wrong. This ain't bad.

But, as it turns out, there is a better way.

The better way is called  The Baillie-PSW primality test (BPSW) which is actually just four independent steps which can be parrallelized quite easily. The BPSW first checks that the number is not 1, 2 or an even number, then checks to see if it has any small prime divisors, where small is defined as p < 1000 . It then performs a single Rabin-Miller primality test in base 2 and then a Lucas Primality test, which is another primality test that I have not described that uses Lucas Numbers (very closely related to Fibonacci numbers) as the basis for a test.

According to MathWorld "the multiple Rabin-Miller test in bases 2 and 3 combined with a Lucas pseudoprime test as the primality test have been used by the function PrimeQ[n]" since Mathematica 2.2 .

So why is the BPSW test so cool? For starters, people have looked hard and still cannot find one BPSW pseudoprime. According to recent findings, there are no BPSW pseudoprimes less than N=10^{15} . As for software that uses the Miller-Rabin test (for instance the mpz_probab_prime_p function in GMP ), Thomas R. Nicely (of pentium fdiv bug fame) has done extensive research and discovered many pseudoprimes in GMP as well as providing his data openly before publication, which is really cool! An example of GPL software that uses BPSW is the isprime() function of GP/PARI, a number theory library originally written by Henri Cohen and other members of the University of Bordeaux Mathematics Department.

The problem with using software that use Miller-Rabin tests is such:  the bases and number of repetitions used internally may change from one version of the software to the next, which means that different versions of your software may think the same number is either prime or composite. This is obviously Very Bad. If you always specify the bases and number of repetitions to use in the Miller-Rabin test then this will not occur, but that is not the case for most existing software.

So it would almost seem that BPSW is the perfect primality test, it is pretty fast and perhaps there is no number can fake out all four steps of the test. All is not peaches and cream, though. Carl Pomerance, the P in BPSW and one of the mathematicians that proved there are infinitely many Carmichael numbers, published a small article in 1984 giving an informal proof that BPSW pseudoprimes must exist, even though no one has ever found one. He could not construct one then and no has been able to in last 25 years either, but that is why mathematicians prove stuff. To be sure. One day someone will find a BPSW pseudoprime, how long will it take?

So what is the moral of the story? Modern cryptography algorithms should be using some variant of the BPSW Primality Test for fast primality testing, always keeping a lookout for a pseudoprime and perhaps another step to add to the test to make it even better.




About this Archive

This page is a archive of recent entries written by Jonathan Leto in January 2009.

Jonathan Leto: December 2008 is the previous archive.

Jonathan Leto: February 2009 is the next archive.

Find recent content on the main index or look in the archives to find all content.

Clicky Web Analytics Screw you, spammers! 42