Recently in math Category

Primes, Primality and Pseudoprimes

Primes are simple, right? A prime is a number greater than 1 that is divisible only by 1 and itself. The devil is in the details, though.

What if your number is really big and someone else does not believe you that it is prime? How do you prove it? It turns out to be really easy to prove that something is composite (the opposite of prime, a.k.a. not prime): all you have to do is show them a single factor of the number.

If a shadowy figure emerges from a dark alley and shows you a strip of paper:

x = 742376437546875526384762834762837468273648992

and says "I'll sell you this big fat prime for only five bucks" you can say "it is divisible by 2 and therefore composite. QED and go away."

But if you are trying to be an upstanding prime-number-selling business and therefore want to go the extra mile and prove that the numbers you sell are indeed prime, you will quickly go out of business. I guess that is why there are still sheisty guys selling primes in trench coats...

The reason is that to prove that x = N is a prime number, you are not just making a statement about N itself; you are also saying that no number y with 1 < y < N is a factor of N. This is a lot of information!
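To get a feel for how much information that is, here is a minimal trial-division sketch (the function name is just for illustration). It proves primality by exhausting every candidate factor up to \sqrt{N}, which is hopeless once N has hundreds of digits:

```python
from math import isqrt

def is_prime_by_trial_division(n):
    """Prove primality the slow way: rule out every possible factor up to sqrt(n)."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    # Any composite n has a factor no larger than sqrt(n), so this check is exhaustive.
    for y in range(3, isqrt(n) + 1, 2):
        if n % y == 0:
            return False   # y is a witness that n is composite
    return True

print(is_prime_by_trial_division(104729))   # True: 104729 is the 10,000th prime
```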

Currently it takes at most a few seconds to prove that a large number is composite, but proving that a similarly-sized number is prime can take thousands of CPU-hours! For those who like specific numbers, it took 6 CPU-years to prove that the 20,562-digit Mills' prime was indeed prime.

Even for numbers in the range of current cryptography systems (a few hundred digits), proving that something is prime is just too damn slow. Factoring integers is a subroutine of many number theory and cryptography algorithms, so it has to be as fast as possible. When a piece of software factors a number, it often does some kind of check that the factors it returns are indeed prime, but it cannot afford to prove it. This is where pseudoprimes and primality testing come in, stage left.

Primality testing is mainly concerned with finding algorithms that are fast but have a low probability of incorrectly declaring a composite number prime. A pseudoprime is a composite number that "fakes out" a primality test, also known as a false positive. Obviously, people developing cryptography software should be very concerned with how often they are going to run into pseudoprimes; more on this later.

Modern primality testing started with Pierre de Fermat, although Chinese mathematicians described the beginnings of these ideas 2000 years before Fermat was on the scene. Composite numbers that "fake out" the Fermat primality test for every base coprime to them are called Carmichael numbers, and in 1994 a group of mathematicians proved that there are infinitely many of them, i.e. there are not just a few isolated numbers that "fake out" the test: the more you search for them, the more you find.

Some software packages may still use the Fermat primality test for some algorithms, but it has been mostly superseded by the Miller-Rabin primality test.

This test allows you to run k repetitions of the algorithm, and the probability that it incorrectly concludes that a composite number is prime is at most P = 4^{-k} . For k=20 this is roughly a one-in-a-trillion (1,000,000,000,000, a 1 with 12 zeros) chance that it will be wrong. This ain't bad.
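Here is a minimal sketch of the Miller-Rabin idea with k randomly chosen bases (the function name is mine, and real libraries add more care, such as fixed base sets and small-prime sieving):

```python
import random

def is_probably_prime(n, k=20):
    """Miller-Rabin: returns False if n is definitely composite, True if n is
    probably prime (a composite slips through with probability at most 4**-k)."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2**s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(k):
        a = random.randrange(2, n - 1)       # random base
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                     # a witnesses that n is composite
    return True

print(is_probably_prime(2**61 - 1))          # True: 2**61 - 1 is a Mersenne prime
```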

But, as it turns out, there is a better way.

The better way is called the Baillie-PSW primality test (BPSW), which is actually just four independent steps that can be parallelized quite easily. BPSW first checks that the number is not 1, 2 or an even number, then checks whether it has any small prime divisors, where small is defined as p < 1000 . It then performs a single Miller-Rabin primality test in base 2, followed by a Lucas primality test, another primality test (which I have not described) that uses Lucas sequences (very closely related to the Fibonacci numbers) as its basis.
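Here is a sketch of that four-step structure, leaning on SymPy for the two probable-prime steps. It assumes the helpers mr and is_strong_lucas_prp from sympy.ntheory.primetest (present in recent SymPy versions); it is not how any particular library implements BPSW internally, and the small-prime list is truncated for brevity:

```python
from sympy.ntheory.primetest import mr, is_strong_lucas_prp

# Small primes for the trial-division step (just the first few here; a real
# implementation would use all primes p < 1000, as described above).
SMALL_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

def bpsw(n):
    """A BPSW-style test: trivial checks, small-prime trial division,
    one Miller-Rabin round in base 2, then a strong Lucas test."""
    if n < 2 or (n > 2 and n % 2 == 0):
        return False
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    return mr(n, [2]) and is_strong_lucas_prp(n)

print(bpsw(2**89 - 1))   # True: another Mersenne prime
```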

According to MathWorld, "the multiple Rabin-Miller test in bases 2 and 3 combined with a Lucas pseudoprime test as the primality test have been used by the function PrimeQ[n]" since Mathematica 2.2.

So why is the BPSW test so cool? For starters, people have looked hard and still cannot find a single BPSW pseudoprime. According to recent findings, there are no BPSW pseudoprimes less than N=10^{15} . As for software that uses the Miller-Rabin test (for instance the mpz_probab_prime_p function in GMP), Thomas R. Nicely (of Pentium FDIV bug fame) has done extensive research and found many pseudoprimes produced by GMP, as well as providing his data openly before publication, which is really cool! An example of GPL software that uses BPSW is the isprime() function of GP/PARI, a number theory library originally written by Henri Cohen and other members of the University of Bordeaux Mathematics Department.

The problem with software that uses Miller-Rabin tests is this: the bases and number of repetitions used internally may change from one version of the software to the next, which means that different versions of your software may disagree about whether the same number is prime or composite. This is obviously Very Bad. If you always specify the bases and the number of repetitions to use in the Miller-Rabin test then this will not occur, but most existing software does not do that.
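For example, with the SymPy helper from the sketch above (again assuming sympy.ntheory.primetest is available), pinning down the bases makes the verdict reproducible across versions and machines:

```python
from sympy.ntheory.primetest import mr

n = 2**127 - 1                    # a Mersenne prime, just as a stand-in
print(mr(n, [2, 3, 5, 7, 11]))    # True, and the same answer every time,
                                  # because the bases are spelled out explicitly
```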

So it would almost seem that BPSW is the perfect primality test: it is pretty fast, and perhaps there is no number that can fake out all four steps of the test. All is not peaches and cream, though. Carl Pomerance, the P in BPSW and one of the mathematicians who proved there are infinitely many Carmichael numbers, published a short article in 1984 giving a heuristic argument that BPSW pseudoprimes must exist, even though no one has ever found one. He could not construct one then, and no one has been able to in the last 25 years either, but that is why mathematicians prove stuff. To be sure. One day someone will find a BPSW pseudoprime; how long will it take?

So what is the moral of the story? Modern cryptography algorithms should be using some variant of the BPSW primality test for fast primality testing, always keeping a lookout for a pseudoprime, and perhaps for another step that could be added to make the test even better.




A metric tensor g_{mn} is used to measure distances in a given coordinate system. In terms of the Jacobian, the metric tensor can be found from g_{mn} = (J^T J)_{mn} where J^T is the transpose of the Jacobian. Since J^T J is a symmetric matrix for any matrix J , the metric tensor is always symmetric. (In fancy-pants math lingo this is called a symmetric bilinear form.) What are the real-life consequences of this? The distance from a to b is always the same as the distance from b to a , no matter what kind of crazy coordinate system you are living in!
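As a quick sanity check (a SymPy sketch, not part of the original derivation): for polar coordinates (x, y) = (r\cos\theta, r\sin\theta), the recipe g = J^T J produces the familiar metric diag(1, r^2), which is indeed symmetric.

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)

# Cartesian coordinates written in terms of polar coordinates
x = r * sp.cos(theta)
y = r * sp.sin(theta)

# Jacobian of (x, y) with respect to (r, theta)
J = sp.Matrix([x, y]).jacobian([r, theta])

# Metric tensor g = J^T J
g = (J.T * J).applyfunc(sp.simplify)
print(g)   # Matrix([[1, 0], [0, r**2]]), symmetric as promised
```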
If we want to calculate the length of a parameterized curve x^r = x^r(u) where u is a parameter with respect to some coordinate system, then we can write an infinitesimal displacement element as dx^r = p^r(u) du , where p^r = dx^r/du . The length of this displacement is ds = \sqrt{g_{mn} p^m p^n} du  and the length of the curve from u=u_1 to u=u_2 is L =  \int_{u_1}^{u_2} ds = \int_{u_1}^{u_2} \sqrt{g_{mn} p^m p^n} du .
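And a sanity check of that arc-length formula, continuing the polar example above (again a SymPy sketch): the circle (r, \theta) = (R, u) for u from 0 to 2\pi should come out to length 2\pi R.

```python
import sympy as sp

r, u, R = sp.symbols('r u R', positive=True)

# Polar-coordinate metric from the previous snippet: g = diag(1, r^2)
g = sp.Matrix([[1, 0], [0, r**2]])

# The curve (r, theta) = (R, u), so p = d(r, theta)/du = (0, 1)
p = sp.Matrix([0, 1])

# ds = sqrt(g_{mn} p^m p^n) du, evaluated on the curve where r = R
ds = sp.sqrt((p.T * g * p)[0, 0]).subs(r, R)

L = sp.integrate(ds, (u, 0, 2 * sp.pi))
print(sp.simplify(L))   # 2*pi*R, the circumference of a circle of radius R
```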
So we need the metric tensor to define distance along a curve when we are in non-cartesian coordinate systems, such as spherical or toroidal. From the metric tensor one can then start to study the "curvature" of a coordinate system. More soon!

In Tensor Calculus, a Jacobian is a matrix defined as J_{mn} = \frac{\partial y^m}{\partial x^n} , where the y^i 's are one coordinate system written as functions of another coordinate system, the x^i 's. (Note that the indices are superscripts: x^2 denotes the second element of the \mathbf{x} vector, not x squared.) The determinant of this matrix, J = \det([J_{mn}]) , is important because it is the constant of proportionality between volumes measured in the two coordinate systems. As a simple example, take polar coordinates, with the transformation (x,y) = (r \cos{\theta}, r \sin{\theta}) . When we want to perform an integration in cartesian coordinates by transforming into polar coordinates, we write \int f(x,y)\, dx\, dy = \int f(r,\theta)\, r\, dr\, d\theta , and the factor of r is precisely the Jacobian determinant. So in general, to convert an integral over the y^i coordinates into an integral over the x^i coordinates we have \int f(\mathbf{y})\, d\mathbf{y} = \int f(\mathbf{y}(\mathbf{x}))\, J(\mathbf{x})\, d\mathbf{x} , where \mathbf{x} = (x^1, x^2, \cdots, x^n) and \mathbf{y} = (y^1, y^2, \cdots, y^n) .
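A small sketch of that last formula in SymPy, sticking with the polar-coordinate example: the Jacobian determinant comes out to r, and the Gaussian integral of e^{-(x^2+y^2)} over the plane, which equals \pi, is easy once the factor of r is in place.

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)

# (x, y) written in terms of (r, theta)
x = r * sp.cos(theta)
y = r * sp.sin(theta)

# Jacobian matrix and its determinant: the factor of r in dx dy = r dr dtheta
J = sp.Matrix([x, y]).jacobian([r, theta])
detJ = sp.simplify(J.det())
print(detJ)   # r

# Change of variables: exp(-(x^2 + y^2)) becomes exp(-r^2) times the Jacobian factor
I = sp.integrate(sp.exp(-r**2) * detJ, (r, 0, sp.oo), (theta, 0, 2 * sp.pi))
print(I)      # pi
```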

In differential equation theory, the Jacobian matrix plays a key role in determining the stability of solutions. As a simple example, consider the matrix ordinary differential equation \dot{\mathbf{x}} = A \mathbf{x} where A = \left(\begin{array}{cc} a & b \\ c & d \end{array}\right) . Because this is a linear system, the solution will always be a linear combination of exponentials, \mathbf{x} = \mathbf{v_1} e^{\lambda_1 t} + \mathbf{v_2} e^{\lambda_2 t} , where \mathbf{v_{1,2}} are the eigenvectors and \lambda_{1,2} the eigenvalues of A , and t is time, which is positive. Since t > 0 , we must have Re(\lambda_{1,2}) < 0 for the solution to decay to a steady state. If this is violated for any \lambda_i , then that exponential "blows up" as t \rightarrow \infty , since it is an exponential with an ever-increasing positive argument. Note that if the eigenvalues are complex, then the imaginary part of the eigenvalue is related to the oscillatory part of the solution.
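A tiny numerical sketch of that criterion (the matrix entries are made up, purely for illustration): compute the eigenvalues of A and check the sign of their real parts.

```python
import numpy as np

# A hypothetical 2x2 system x' = A x; the numbers are arbitrary.
A = np.array([[-1.0,  2.0],
              [-2.0, -1.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)                      # the eigenvalues are -1 + 2j and -1 - 2j

# Stable (solutions decay) exactly when every eigenvalue has a negative real part;
# the nonzero imaginary parts mean the decay is oscillatory.
print(np.all(eigenvalues.real < 0))     # True
```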

Such a \lambda_i (one with a positive real part) is called an unstable eigenvalue of the system, and its eigenvector spans part of the vector space called the unstable manifold. Instability is the tendency of a system to shoot away from a certain state when it is slightly disturbed (perturbed) from that state. Stability is the tendency of a system to come back to a certain state when it is slightly perturbed from that state.

Stability has important consequences for applications, because it can determine whether your chemical or combustion reaction will go out of control, or whether the species in your mathematical model go extinct, and so on, depending on what the differential equations describe. Whatever your equations describe, knowing whether the solutions are stable is pretty important and is usually the first step of an analysis. If you have stable solutions, then you can reasonably trust the numerics, but if you are trying to numerically simulate an unstable solution of an equation, you must be much more careful. In these situations you want to use specialized integration algorithms that preserve certain properties of your solutions, such as being symplectic. More about this later!

About this Archive

This page is an archive of recent entries in the math category.
