Polynomial root-finding
Finding the roots of polynomials is a long-standing problem that has been extensively studied throughout history and has substantially influenced the development of mathematics. It involves determining either a numerical approximation or a closed-form expression of the roots of a univariate polynomial, i.e., determining approximate or closed-form solutions of the equation

$$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = 0,$$

where the coefficients $a_0, a_1, \ldots, a_n$ are either real or complex numbers.
Efforts to understand and solve polynomial equations led to the development of important mathematical concepts, including irrational and complex numbers, as well as foundational structures in modern algebra such as fields, rings, and groups. Despite its historical importance, finding the roots of higher-degree polynomials no longer plays a central role in mathematics and computational mathematics, with one major exception: computer algebra.[1]
Overview
Closed-form formulas
Closed-form formulas exist only when the degree of the polynomial is less than 5. The quadratic formula has been known since antiquity, and the cubic and quartic formulas were discovered in full generality during the 16th century.
When the degree of the polynomial is at least 5, a closed-form expression of the roots in terms of the polynomial coefficients does not exist if one uses only additions, subtractions, multiplications, divisions, and radicals (taking n-th roots). This is the content of the celebrated Abel–Ruffini theorem. On the other hand, the fundamental theorem of algebra shows that every nonconstant polynomial has at least one root. Therefore, root-finding algorithms consist of finding numerical solutions in most cases.
Numerical algorithms
Root-finding algorithms can be broadly categorized according to the goal of the computation. Some methods aim to find a single root, while others are designed to find all complex roots at once. In certain cases, the objective may be to find roots within a specific region of the complex plane. It is often desirable and even necessary to select an algorithm specific to the computational task, for reasons of efficiency and accuracy. See Root Finding Methods for a summary of the existing methods available in each case.
History
Closed-form formulas
The root-finding problem of polynomials was first recognized by the Sumerians and then the Babylonians. Since then, the search for closed-form formulas for polynomial equations lasted for thousands of years.
The quadratics
The Babylonians and Egyptians were able to solve specific quadratic equations in the second millennium BCE, and their solutions essentially correspond to the quadratic formula.[2]
However, it took two millennia of effort to state the quadratic formula in an explicit form similar to the modern formulation, provided by the Indian mathematician Brahmagupta in his book Brāhmasphuṭasiddhānta (625 CE). The full recognition of the quadratic formula requires the introduction of complex numbers, which took another millennium.
The cubics and the quartics
The first breakthrough in obtaining a closed-form formula for polynomials of degree higher than 2 took place in Italy. In the early 16th century, the Italian mathematician Scipione del Ferro found a closed-form formula for cubic equations of the form $x^3 + px = q$, where $p$ and $q$ are nonnegative numbers. Later, Niccolò Tartaglia also discovered methods to solve such cubic equations, and Gerolamo Cardano summarized and published their work in his book Ars Magna in 1545.
Meanwhile, Cardano's student Lodovico Ferrari discovered a closed-form formula for quartic equations in 1540. His solution rests on the closed-form formula for the cubic, and thus it had to wait until the cubic formula was published.
In Ars Magna, Cardano noticed that Tartaglia's method sometimes involves extracting the square root of a negative number. In fact, this can happen even when the roots themselves are real. Later, the Italian mathematician Rafael Bombelli investigated these mathematical objects further, giving explicit arithmetic rules for them in his book Algebra, published in 1572. These objects are now known as complex numbers, which are foundational in mathematics, physics, and engineering.
Insolvability of the quintics
Since the discovery of the cubic and quartic formulas, solving quintic equations in closed form had been a major problem in algebra. The French lawyer François Viète, who first formulated the root formula for cubics in modern language and applied trigonometric methods to root-solving, believed that his methods generalized to a closed-form formula in radicals for polynomials of arbitrary degree. Descartes held the same opinion.[3]
However, Lagrange noticed the flaws in these arguments in his 1771 paper Reflections on the Algebraic Theory of Equations, where he analyzed why the methods used to solve the cubics and quartics could not work for the quintics. His argument involved studying the permutations of the roots of polynomial equations. Nevertheless, Lagrange still believed that a closed-form formula in radicals for the quintics existed. Gauss seems to have been the first prominent mathematician to suspect the insolvability of the quintics, as stated in his 1799 doctoral dissertation.
The first serious attempt at proving the insolvability of the quintic was made by the Italian mathematician Paolo Ruffini. He published six versions of his proof between 1799 and 1813, yet his proof was not widely accepted: the writing was long and difficult to understand, and it turned out to have a gap.
The first rigorous and accepted proof of the insolvability of the quintic was famously given by Niels Henrik Abel in 1824, using ideas about field extensions that were later subsumed into Galois theory. In the paper, Abel proved that polynomials of degree greater than 4 do not have a closed-form root formula in radicals in general. This put an end to the search for closed-form formulas for the roots of polynomials in terms of radicals of the polynomial coefficients.
Numerical methods
Since finding a closed-form formula for higher-degree polynomials is significantly harder than for quadratic equations, the earliest attempts to solve cubic equations were either geometrical or numerical. Also, for practical purposes, numerical solutions are necessary.
Iterative methods
The earliest iterative approximation methods of root-finding were developed to compute square roots. In Heron of Alexandria's book Metrica (1st–2nd century CE), approximate values of square roots were computed by iteratively improving an initial estimate.[4] Jamshīd al-Kāshī presented a generalized version of the method to compute $n$-th roots. A similar method was also found in Henry Briggs's publication Trigonometria Britannica in 1633. Franciscus Vieta also developed an approximation method that is almost identical to Newton's method.
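The following minimal Python sketch illustrates an iteration in the spirit of Heron's method; the function name, starting value, and tolerance are illustrative choices, not taken from the historical sources.

```python
def heron_sqrt(a, x0=1.0, tol=1e-12, max_iter=100):
    """Approximate sqrt(a) by repeatedly averaging x with a/x."""
    x = x0
    for _ in range(max_iter):
        x_next = 0.5 * (x + a / x)  # improved estimate
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

print(heron_sqrt(2.0))  # 1.4142135623730951
```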
Newton further generalized the method to compute the roots of arbitrary polynomials in De analysi per aequationes numero terminorum infinitas (written in 1669, published in 1711), now known as Newton's method. In 1690, Joseph Raphson published a refinement of Newton's method, presenting it in a form that more closely aligned with the modern version used today.[5]
In 1879, the English mathematician Arthur Cayley noticed the difficulties in generalizing Newton's method to complex roots of polynomials with degree greater than 2 and complex initial values in his paper The Newton–Fourier imaginary problem. This opened the way to the study of the theory of iterations of rational functions.
Real-root isolation methods
A class of methods for finding numerical values of real roots is based on real-root isolation. The first example of such a method was given by René Descartes in 1637, now known as Descartes' rule of signs. It bounds the number of positive real roots of a polynomial by counting the sign changes in its coefficients. In 1807, the French mathematician François Budan de Boislaurent generalized Descartes' result into Budan's theorem, which counts the real roots in a half-open interval (a, b]. However, neither method by itself is suitable as an effective algorithm.
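As an illustration of the counting idea, the following Python sketch counts sign changes in a coefficient sequence; the function name is illustrative. The number of positive real roots, counted with multiplicity, equals this count or is less than it by an even number.

```python
def sign_changes(coeffs):
    """Count sign changes in a coefficient sequence, ignoring zeros."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# p(x) = x^3 - 3x + 2 = (x - 1)^2 (x + 2) has two positive roots
# (x = 1 counted twice), matching its two sign changes.
print(sign_changes([1, 0, -3, 2]))  # 2
```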
The first complete real-root isolation algorithm was given by Jacques Charles François Sturm in 1829, and is known as Sturm's theorem.
In 1836, the French mathematician A. J. H. Vincent proposed a method for isolating real roots of polynomials using continued fractions, a result now known as Vincent's theorem. The work was largely forgotten until it was rediscovered over a century later by J. V. Uspensky, who included it in his 1948 textbook Theory of Equations. The theorem was subsequently brought to wider academic attention by the American mathematician Alkiviadis G. Akritas, who recognized its significance while studying Uspensky's account.[6][7] The first implementation of a real-root isolation method on modern computers was given by G. E. Collins and Alkiviadis G. Akritas in 1976, where they proved an effective version of Vincent's theorem. Variants of the algorithm were subsequently studied.[8]
Root-finding algorithms
Finding one root
The most widely used method for computing a root of any differentiable function $f$ is Newton's method, in which an initial guess $x_0$ is iteratively refined. At each iteration, the tangent line to $f$ at $x_k$ is used as a linear approximation of $f$, and its root is used as the succeeding guess $x_{k+1}$:

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}.$$

In general, the value of $x_k$ will converge to a root of $f$.
In particular, the method can be applied to compute a root of polynomial functions. In this case, the computations in Newton's method can be accelerated using Horner's method or evaluation with preprocessing for computing the polynomial and its derivative in each iteration.
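A minimal Python sketch of this combination, where each Newton step evaluates the polynomial and its derivative together in a single Horner pass; the names and stopping rule are illustrative.

```python
def horner_with_derivative(coeffs, x):
    """Evaluate p(x) and p'(x) together, coeffs = [a_n, ..., a_0]."""
    p, dp = coeffs[0], 0.0
    for c in coeffs[1:]:
        dp = dp * x + p   # derivative recurrence: update dp before p
        p = p * x + c
    return p, dp

def newton_poly_root(coeffs, x0, tol=1e-12, max_iter=100):
    x = x0
    for _ in range(max_iter):
        p, dp = horner_with_derivative(coeffs, x)
        if dp == 0:
            break                      # flat tangent: restart elsewhere
        x_next = x - p / dp
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# Newton's own example, x^3 - 2x - 5, starting from x0 = 2:
print(newton_poly_root([1.0, 0.0, -2.0, -5.0], 2.0))  # ~2.0945514815423265
```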
Though the rate of convergence of Newton's method is generally quadratic, it may converge much more slowly or even not converge at all. In particular, if the polynomial has no real root and $x_0$ is chosen to be a real number, then Newton's method cannot converge. However, if the polynomial has a real root that is larger than the largest real root of its derivative, then Newton's method converges quadratically to this largest root if $x_0$ is larger than it (there are easy ways of computing an upper bound on the roots; see Properties of polynomial roots). This is the starting point of Horner's method for computing the roots.
Closely related to Newton's method are Halley's method and Laguerre's method. Both use the polynomial and its first two derivatives for an iterative process that has cubic convergence. Combining two consecutive steps of these methods into a single test, one gets a rate of convergence of 9, at the cost of 6 polynomial evaluations (with Horner's rule). On the other hand, combining three steps of Newton's method gives a rate of convergence of 8 at the cost of the same number of polynomial evaluations. This gives a slight advantage to these methods (less clear for Laguerre's method, as a square root has to be computed at each step).
When applying these methods to polynomials with real coefficients and real starting points, Newton's and Halley's methods stay on the real line. One has to choose complex starting points to find complex roots. In contrast, the Laguerre method, with a square root in its evaluation, will leave the real axis of its own accord.
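For comparison, a minimal sketch of Halley's iteration, with $p$, $p'$, and $p''$ obtained in one Horner-style pass; the names and stopping rule are again illustrative.

```python
def horner_p_dp_ddp(coeffs, x):
    """Evaluate p, p', and p'' at x in one Horner-style pass."""
    p, dp, ddp = coeffs[0], 0.0, 0.0
    for c in coeffs[1:]:
        ddp = ddp * x + 2 * dp
        dp = dp * x + p
        p = p * x + c
    return p, dp, ddp

def halley_root(coeffs, x0, tol=1e-12, max_iter=50):
    """Halley iteration: x <- x - 2 p p' / (2 p'^2 - p p'')."""
    x = x0
    for _ in range(max_iter):
        p, dp, ddp = horner_p_dp_ddp(coeffs, x)
        denom = 2 * dp * dp - p * ddp
        if denom == 0:
            break
        x_next = x - 2 * p * dp / denom
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

print(halley_root([1.0, 0.0, -2.0, -5.0], 2.0))  # ~2.0945514815423265
```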
Finding all complex roots
Methods using complex-number arithmetic
Both the Aberth method and the similar yet simpler Durand–Kerner method simultaneously find all of the roots using only simple complex number arithmetic. The Aberth method is presently the most efficient method. Accelerated algorithms for multi-point evaluation and interpolation similar to the fast Fourier transform can help speed them up for large degrees of the polynomial.
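A minimal sketch of the Durand–Kerner iteration for a monic polynomial, using only plain complex arithmetic; the starting values are the customary powers of 0.4 + 0.9i, and the names are illustrative.

```python
def durand_kerner(coeffs, tol=1e-12, max_iter=200):
    """All roots of a monic polynomial, coeffs = [1, c_(n-1), ..., c_0]."""
    n = len(coeffs) - 1
    z = [(0.4 + 0.9j) ** k for k in range(n)]   # customary starting points

    def p(x):
        result = 0j
        for c in coeffs:                         # Horner evaluation
            result = result * x + c
        return result

    for _ in range(max_iter):
        done = True
        for i in range(n):
            denom = 1.0 + 0j
            for j in range(n):
                if j != i:
                    denom *= z[i] - z[j]
            delta = p(z[i]) / denom              # Weierstrass correction
            z[i] -= delta
            if abs(delta) > tol:
                done = False
        if done:
            break
    return z

# Roots of x^3 - 3x^2 + 3x - 5 (one real root and a conjugate pair):
print(durand_kerner([1.0, -3.0, 3.0, -5.0]))
```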
A free implementation of Aberth's method is available under the name MPSolve. This is a reference implementation, which can routinely find the roots of polynomials of degree larger than 1,000, with more than 1,000 significant decimal digits.
Another method in this style is the Dandelin–Gräffe method (sometimes also ascribed to Lobachevsky), which uses polynomial transformations to repeatedly and implicitly square the roots, greatly magnifying the differences between the moduli of the roots. Applying Viète's formulas, one obtains easy approximations of the modulus of each root, and with some more effort, of the roots themselves.
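A hedged sketch of one Graeffe root-squaring step, implemented by reading $p(x)\,p(-x)$ as a polynomial in $x^2$, followed by the Viète-based modulus estimates; the coefficient conventions and names are illustrative.

```python
def graeffe_step(coeffs):
    """One Dandelin-Graeffe iteration: return coefficients (leading first)
    of a polynomial whose roots are the squares of the roots of p."""
    n = len(coeffs) - 1
    # Coefficients of p(-x): the coefficient of x^(n-i) picks up (-1)^(n-i).
    neg = [c if (n - i) % 2 == 0 else -c for i, c in enumerate(coeffs)]
    prod = [0.0] * (2 * n + 1)
    for i, a in enumerate(coeffs):           # convolve p with p(-x)
        for j, b in enumerate(neg):
            prod[i + j] += a * b
    q = prod[::2]                            # keep even powers: x^(2n), x^(2n-2), ...
    return [c / q[0] for c in q]             # renormalize to a monic polynomial

# Roots of (x-1)(x-2)(x-3); after m squarings, Viète's formulas give
# |r_k| ~ |c_k / c_(k-1)|**(1/2**m) from the coefficients c.
c, m = [1.0, -6.0, 11.0, -6.0], 5
for _ in range(m):
    c = graeffe_step(c)
print([abs(c[k] / c[k - 1]) ** (1.0 / 2 ** m) for k in range(1, len(c))])
# approximately [3.0, 2.0, 1.0]
```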
Methods using linear algebra
Arguably the most reliable method to find all roots of a polynomial is to compute the eigenvalues of the companion matrix of its monic form, which coincide with the roots of the polynomial. There are plenty of algorithms for computing the eigenvalues of matrices. The standard method for finding all roots of a polynomial in MATLAB uses the Francis QR algorithm to compute the eigenvalues of the corresponding companion matrix of the polynomial.[9]
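A minimal sketch of this approach using NumPy: build the companion matrix of a monic polynomial and hand it to a general eigenvalue routine (numpy.roots proceeds in essentially the same way); the function name is illustrative.

```python
import numpy as np

def roots_via_companion(coeffs):
    """coeffs = [1, c_(n-1), ..., c_0] of a monic polynomial; the
    eigenvalues of its companion matrix are exactly its roots."""
    n = len(coeffs) - 1
    C = np.zeros((n, n))
    C[1:, :-1] = np.eye(n - 1)               # ones on the subdiagonal
    C[:, -1] = -np.array(coeffs[1:][::-1])   # last column: -c_0, ..., -c_(n-1)
    return np.linalg.eigvals(C)

# x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3)
print(np.sort(roots_via_companion([1.0, -6.0, 11.0, -6.0]).real))  # ~[1. 2. 3.]
```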
In principle, one can use any eigenvalue algorithm to find the roots of the polynomial. However, for efficiency reasons one prefers methods that exploit the structure of the matrix, that is, that can be implemented in matrix-free form. Among these methods is the power method, whose application to the transpose of the companion matrix is the classical Bernoulli's method for finding the root of greatest modulus. The inverse power method with shifts, which finds some smallest root first, is what drives the complex (cpoly) variant of the Jenkins–Traub algorithm and gives it its numerical stability. Additionally, it converges quickly, with order $1 + \varphi \approx 2.6$ (where $\varphi$ is the golden ratio), even in the presence of clustered roots. This fast convergence comes at the cost of three polynomial evaluations per step, resulting in a residual of $O(|f(x)|^{2+3\varphi})$, which is slower convergence than with three steps of Newton's method.
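A minimal sketch of Bernoulli's method as described above: iterating the linear recurrence attached to the polynomial amounts to the power method on the transposed companion matrix, and successive ratios converge to the root of greatest modulus when that root is unique and simple. Names and the iteration count are illustrative.

```python
def bernoulli_dominant_root(coeffs, iters=100):
    """coeffs = [1, c_(n-1), ..., c_0] of a monic polynomial; iterate
    s_k = -(c_(n-1) s_(k-1) + ... + c_0 s_(k-n)) and return the last
    ratio s_k / s_(k-1)."""
    n = len(coeffs) - 1
    s = [0.0] * (n - 1) + [1.0]   # arbitrary nonzero starting values
    for _ in range(iters):
        s.append(-sum(coeffs[i] * s[-i] for i in range(1, n + 1)))
    return s[-1] / s[-2]

# Dominant root of x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3):
print(bernoulli_dominant_root([1.0, -6.0, 11.0, -6.0]))  # ~3.0
```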
Limitations of iterative methods for finding all roots
The oldest method of finding all roots is to start by finding a single root. When a root r has been found, it can be removed from the polynomial by dividing out the binomial x – r. The resulting polynomial contains the remaining roots, which can be found by iterating this process. This idea, despite being common in theoretical derivations, does not work well in numerical computations because of the phenomenon of numerical instability: Wilkinson's polynomial shows that a very small modification of one coefficient may change dramatically not only the value of the roots, but also their nature (real or complex). Also, even with a good approximation, when one evaluates a polynomial at an approximate root, one may get a result that is far from zero. For example, if a polynomial of degree 20 (the degree of Wilkinson's polynomial) has a root close to 10, the derivative of the polynomial at the root may be of the order of $10^{12}$; this implies that an error of $10^{-10}$ in the value of the root may produce a value of the polynomial at the approximate root that is of the order of $10^{2}$.
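A small numerical experiment illustrating this instability: Wilkinson's polynomial is built from its roots, one coefficient is perturbed slightly, and the roots are recomputed. numpy.roots is used here purely for demonstration.

```python
import numpy as np

w = np.poly(np.arange(1, 21))    # coefficients of (x-1)(x-2)...(x-20)
w_perturbed = w.copy()
w_perturbed[1] += 1e-7           # tiny change to the x^19 coefficient

r0 = np.sort_complex(np.roots(w))
r1 = np.sort_complex(np.roots(w_perturbed))
# Several of the larger roots move by O(1) and typically turn into
# complex-conjugate pairs, even though the coefficient change is tiny.
print(np.max(np.abs(r1 - r0)))
```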
Finding all real roots
Finding the real roots of a polynomial with real coefficients is a problem that has received much attention since the beginning of the 19th century, and it is still an active domain of research.
Methods for finding all complex roots can provide the real roots. However, because of the numerical instability of polynomials, one may need arbitrary-precision arithmetic to decide whether a root with a small imaginary part is real or not. Moreover, as the number of real roots is, on average, proportional to the logarithm of the degree,[10] it is a waste of computer resources to compute the non-real roots when one is interested only in the real roots.
The standard way of computing real roots is to first compute disjoint intervals, called isolating intervals, such that each one contains exactly one real root and together they contain all the real roots. This computation is called real-root isolation. Having an isolating interval, one may use fast numerical methods, such as Newton's method, for improving the precision of the result.
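A minimal sketch of this two-stage refinement, assuming an isolating interval with a sign change is already known; the helper names and iteration counts are illustrative.

```python
def refine_root(p, dp, a, b, bisections=30, newton_steps=5):
    """p, dp: callables for the polynomial and its derivative; (a, b)
    is an isolating interval with p(a) and p(b) of opposite signs."""
    fa = p(a)
    for _ in range(bisections):       # safe phase: halve the interval
        m = 0.5 * (a + b)
        fm = p(m)
        if fa * fm <= 0:
            b = m
        else:
            a, fa = m, fm
    x = 0.5 * (a + b)
    for _ in range(newton_steps):     # fast phase: Newton polishing
        d = dp(x)
        if d == 0:
            break
        x -= p(x) / d
    return x

# The real root of x^3 - 2x - 5 lies in the isolating interval (2, 3):
print(refine_root(lambda x: x**3 - 2*x - 5, lambda x: 3*x**2 - 2, 2.0, 3.0))
```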
The oldest complete algorithm for real-root isolation results from Sturm's theorem. However, it appears to be much less efficient than the methods based on Descartes' rule of signs and its extensions, Budan's and Vincent's theorems. These methods divide into two main classes, one using continued fractions and the other using bisection. Both classes of methods have been dramatically improved since the beginning of the 21st century. With these improvements they reach a computational complexity that is similar to that of the best algorithms for computing all the roots (even when all roots are real).
These algorithms have been implemented and are available in Mathematica (continued fraction method) and Maple (bisection method), as well as in other major computer algebra systems (SageMath, PARI/GP). Both implementations can routinely find the real roots of polynomials of degree higher than 1,000.
Finding roots in a restricted domain
Several fast tests exist that tell if a segment of the real line or a region of the complex plane contains no roots. By bounding the modulus of the roots and recursively subdividing the initial region indicated by these bounds, one can isolate small regions that may contain roots and then apply other methods to locate them exactly.
All these methods involve finding the coefficients of shifted and scaled versions of the polynomial. For large degrees, FFT-based accelerated methods become viable.
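A minimal sketch of the classical quadratic-time version of this coefficient transformation, computing p(x + c) by repeated synthetic division (the FFT-based methods mentioned above replace this for large degrees); names are illustrative.

```python
def taylor_shift(coeffs, c):
    """Coefficients of p(x + c), coeffs = [a_n, ..., a_0], by repeated
    synthetic division (classical O(n^2) Taylor shift)."""
    out = list(coeffs)
    n = len(out) - 1
    for i in range(n):
        for j in range(1, n - i + 1):
            out[j] += c * out[j - 1]
    return out

def scale(coeffs, s):
    """Coefficients of p(s * x): the coefficient of x^k is scaled by s^k."""
    n = len(coeffs) - 1
    return [a * s ** (n - i) for i, a in enumerate(coeffs)]

# p(x) = x^2:  p(x + 1) = x^2 + 2x + 1,  p(2x) = 4x^2
print(taylor_shift([1.0, 0.0, 0.0], 1.0))  # [1.0, 2.0, 1.0]
print(scale([1.0, 0.0, 0.0], 2.0))         # [4.0, 0.0, 0.0]
```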
The Lehmer–Schur algorithm uses the Schur–Cohn test for circles; a variant, Wilf's global bisection algorithm, uses a winding number computation for rectangular regions in the complex plane.
The splitting circle method uses FFT-based polynomial transformations to find large-degree factors corresponding to clusters of roots. The precision of the factorization is maximized using a Newton-type iteration. This method is useful for finding the roots of polynomials of high degree to arbitrary precision; it has almost optimal complexity in this setting.[citation needed]
Finding complex roots in pairs
If the given polynomial has only real coefficients, one may wish to avoid computations with complex numbers. To that effect, one has to find quadratic factors for pairs of conjugate complex roots. The application of the multidimensional Newton's method to this task results in Bairstow's method.
The real variant of the Jenkins–Traub algorithm is an improvement of this method.
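A simplified sketch of the idea behind Bairstow's method: search for a real quadratic factor $x^2 + ux + v$ by driving the linear remainder of the division to zero with two-dimensional Newton steps. For brevity this sketch uses a finite-difference Jacobian instead of Bairstow's classical recurrences for the partial derivatives, and the starting values and names are illustrative.

```python
def quad_remainder(a, u, v):
    """Divide p (coeffs a, leading first, degree >= 3) by x^2 + u x + v;
    return (r, s) with remainder r*x + s."""
    b = [a[0], a[1] - u * a[0]]
    for k in range(2, len(a) - 2):
        b.append(a[k] - u * b[-1] - v * b[-2])
    r = a[-2] - u * b[-1] - v * b[-2]
    s = a[-1] - v * b[-1]
    return r, s

def bairstow_factor(a, u=0.0, v=0.0, tol=1e-12, max_iter=100, h=1e-7):
    """Drive the remainder (r, s) to zero with 2-D Newton steps."""
    for _ in range(max_iter):
        r, s = quad_remainder(a, u, v)
        if abs(r) + abs(s) < tol:
            break
        ru, su = quad_remainder(a, u + h, v)   # finite-difference Jacobian
        rv, sv = quad_remainder(a, u, v + h)
        j11, j12 = (ru - r) / h, (rv - r) / h
        j21, j22 = (su - s) / h, (sv - s) / h
        det = j11 * j22 - j12 * j21
        if det == 0:
            break
        u -= (r * j22 - s * j12) / det         # Cramer's rule for the step
        v -= (s * j11 - r * j21) / det
    return u, v   # x^2 + u x + v is (approximately) a quadratic factor

# x^4 - 1 = (x^2 + 1)(x^2 - 1); starting near (0, 1) recovers x^2 + 1:
print(bairstow_factor([1.0, 0.0, 0.0, 0.0, -1.0], u=0.1, v=0.9))
```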
Polynomials with rational coefficients
For polynomials whose coefficients are exactly given as integers or rational numbers, there is an efficient method to factorize them into factors that have only simple roots and whose coefficients are also exactly given. This method, called square-free factorization, is based on the fact that the multiple roots of a polynomial are the roots of the greatest common divisor of the polynomial and its derivative.
The square-free factorization of a polynomial p is a factorization $p = p_1 p_2^2 \cdots p_k^k$, where each $p_i$ is either 1 or a polynomial without multiple roots, and two different $p_i$ do not have any common root.
An efficient method to compute this factorization is Yun's algorithm.
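As an illustration, SymPy's built-in square-free factorization (sqf_list) implements a Yun-style algorithm based on repeated gcds of the polynomial and its derivative:

```python
from sympy import symbols, sqf_list, expand

x = symbols('x')
p = expand((x - 1) * (x - 2)**2 * (x - 3)**3)
print(sqf_list(p))
# (1, [(x - 1, 1), (x - 2, 2), (x - 3, 3)])
```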
References
- ^ Pan, Victor Y. (January 1997). "Solving a Polynomial Equation: Some History and Recent Progress". SIAM Review. 39 (2): 187–220. doi:10.1137/S0036144595288554. ISSN 0036-1445.
- ^ Berriman, A. E. (1956). "The Babylonian Quadratic Equation". The Mathematical Gazette. 40 (333): 185–192. doi:10.2307/3608807. ISSN 0025-5572.
- ^ Brown, Jim (2000). "Abel and the insolvability of the quintic" (PDF).
- ^ Fowler, David; Robson, Eleanor (November 1998). "Square Root Approximations in Old Babylonian Mathematics: YBC 7289 in Context". Historia Mathematica. 25 (4): 366–378. doi:10.1006/hmat.1998.2209.
- ^ Cajori, Florian (1911-02-01). "Historical Note on the Newton-Raphson Method of Approximation". The American Mathematical Monthly. 18 (2): 29–32. doi:10.1080/00029890.1911.11997596. ISSN 0002-9890.
- ^ Akritas, Alkiviadis G.; Danielopoulos, Stylianos D. (1978-11-01). "On the forgotten theorem of Mr. Vincent". Historia Mathematica. 5 (4): 427–435. doi:10.1016/0315-0860(78)90211-2. ISSN 0315-0860.
- ^ Uspensky, J. V. (1948). Theory of Equations. New York: McGraw-Hill Book Co.
- ^ Rouillier, Fabrice; Zimmermann, Paul (January 2004). "Efficient isolation of polynomial's real roots". Journal of Computational and Applied Mathematics. 162 (1): 33–50. doi:10.1016/j.cam.2003.08.015.
- ^ "Polynomial roots - MATLAB roots". MathWorks. 2021-03-01. Retrieved 2021-09-20.
- ^ Nguyen, Hoi; Nguyen, Oanh; Vu, Van (2016). "On the number of real roots of random polynomials". Communications in Contemporary Mathematics. 18 (4): 1550052. arXiv:1402.4628. doi:10.1142/S0219199715500522. ISSN 0219-1997.