1. Introduction
The inversion of a nonlinear monotonic functionf(x) is a fundamental mathematical problem that appears in many scientific and technological applications. A famous example is Kepler’s equation [1,2,3], describing bound orbital motion in the two body approximation,
y=x−esinx≡f(x),
wheree∈[0,1)is the eccentricity;y≡hp2(e2−1)32t , called the mean anomaly, is a dimensionless quantity—more precisely, an angular measure—of the time t elapsed since periapsis; h and p are the angular momentum per unit mass and the distance at periapsis, respectively (see, e.g., page 151 of reference [3]); and x is the eccentric anomaly, which is related to the angleθbetween the direction of periapsis and the position of the orbiting body at time t (also called the true anomaly) through the equation,
tanx2=1−e1+etanθ2.
Therefore, obtaining the angular position of an orbiting body for any time is equivalent to computing the inverse of the functionf(x) in Equation (1),x=f−1(y).
In general, a common approach is to solve the equationy=f(x) iteratively for each value of y, using generalizations of the Newton–Raphson algorithm [4,5,6,7,8], or multipoint methods [9,10,11,12,13,14,15,16,17], or a combination of k-vector search with another inversion method, such as Newton–Raphson [18,19,20]. Moreover, special algorithms can be designed for a given function. In the case of Kepler’s equation, several numerical algorithms have been crafted throughout the centuries [1]. Efforts of the last few decades have been focused on developing ever faster computational strategies, mostly based on the Newton–Raphson’s method for the inversion, combined either with a previous analytical stage [21,22], or with the use of pre-computed tables for efficient first guesses [23,24,25], or with CORDIC-like methods for avoiding the evaluation of transcendental functions [26]. As discussed in reference [26], these efforts can be important for tasks that require the solution of Kepler’s equation for a large number of points, such as the search for exoplanets, the modeling of planet formation, or the study of the evolution of star clusters.
Recently, a fast switch and spline inversion (FSSI) scheme has been proposed as an efficient method to compute the values of the inverse of a monotonic function over an entire interval or for a large number of points [27]. This algorithm does not require an initial guess, it is non-iterative, and it solves the equationy=f(x)for all points y at once over a given co-domain[ymin,ymax] . At the algorithm’s core, are the computations of the spline coefficients and breakpoints of the inverse function interpolation. As implemented, it produces a much lower error than that predicted from a naive and direct application of spline theory [27]. Shortly after its first publication in preprint form, the FSSI method [27] was cited and used for high-throughput calculation of many high-eccentricity orbits [28], and considered as a viable high-performance option for accelerating orbital computations of exoplanet modeling [29].
In reference [27], the FSSI scheme was implemented in Python, by using library routines from SciPy [30,31] for the evaluation of the piecewise polynomial in each point y of the co-domain. These library routines use the bisection method [32]—also called the binary search method—for finding the interval between two different breakpointsyjandyj+1of the FSSI spline that contains the evaluation point y such thatyj≤y<yj+1.
In this work, significant improvements in the FSSI algorithm are described. The first improvement is the use of a faster search method for finding the interval of j of the FSSI spline to which the evaluation point belongs. The use of a k-vector search for bracketing, followed by the last step of binary search, enables a significant execution time reduction compared to the use of only a bisection method.
Note that both the FSSI and k-vector methods can be used to invert functions in combination with another method. The FSSI, as described in reference [27], requires the use of a search algorithm, while the scheme proposed in references [18,19,20] uses k-vector search followed by a Newton–Raphson inversion. The combination of the FSSI with the k-vector search is quite natural, considering that the setup phases of both methods can be merged. The resulting reduction in generation time can be important, without significantly increasing the pre-processing time cost of either the FSSI or the k-vector algorithms. A further reduction in the execution time can also be obtained when the inverse function is computed for a sorted array Y consisting of a very large number of N elements. In the latter case, the FSSI implementations that use a bisection method with or without a k-vector bracketing become equivalent in terms of generation time. These improvements can be applied to any monotonic function.
For the case of Kepler’s equation, we designed a multistep method that reduces the number of breakpoints used by the FSSI. This reduction is important for values of the eccentricityeclose to the limite→1, and allows the FSSI to solve Kepler’s equation accurately for every value of the eccentricitye∈[0,1−ϵ], withϵbeing the intrinsic computational error (ϵ=2.22×10−16in double precision arithmetic). Fore≲0.9, the accuracy can be set to10−15rad, while fore≳0.9it can be as low asO(10−14−10−13)radfor most values ofeand y. Only foreclose to the limit1−ϵ , which is also stiff in other methods [33,34,35], is there a small region,y≲10−9rad, for which the accuracy is found to be limited to ∼2×10−11rad.
For the numerical calculations, all the routines have been implemented in Cython [36], a superset of the Python language. This language provides special data types and control-structure semantics for efficient and optimized translation to pure C language with Python bindings (through Python’s C/API function call library). The result of this strategy is that when the Cython code is compiled as a shared-object binary (with the GNU C-compiler gcc), the code executes with the speed of a normal C binary (without the Python interpreter) but can be imported and called from a Python script.
2. Overview of the FSSI Algorithm
Letf(x)be an input function that is monotonic over a domainx∈xmin,xmax. Hereafter,f(x)will also be assumed to be ascending, as in the case of Kepler’s equation. The inversion of a descending function can be computed similarly, with simple changes, or by noting thatf−1=−(−f)−1and applying to−fthe results that will be obtained for an ascending function.
The FSSI algorithm [27] can be used to obtain a piecewise cubic spline interpolationS(y)of the inverse functionf−1(y)for any y belonging to the co-domainymin,ymax, withymin=f(xmin)andymax=f(xmax). This algorithm consists of two parts, which will be called “setup” and “generation” procedures, respectively.
2.1. Setup Procedure
When the functionf(x)and its derivativef′(x)are known continuous functions, such as the case of Kepler’s problem, the setup procedure of the FSSI algorithm consists of the following basic steps:
(i)
Grid generation. A grid{xj}ofn+1points is generated where(x0,⋯,xn), withx0=xminandxn=xmax . In the examples given in reference [27], these points were chosen to be equally spaced. Here, the step sizeshj=xj+1−xj, forj=0,⋯,n−1, are allowed to be a function of j. The values ofhj are chosen in order to control the error at the desired level over the entire interval, as shown in Section 3.
(ii)
Computation of the breakpoints and coefficients. First, the values of the function,yj=f(xj), which will provide the breakpoints of the spline interpolation off−1, and of the derivative of its inverse,dj=1/f′(xj), are computed on the grid{xj}. This is done given the assumption thatf′(xj)≠0over the entire domain. In practice, this demands that|f′(xj)|should be larger than the numerical errorϵover the intervalxmin,xmax . When applied to the solution of Kepler’s equation (1), this constraint becomesf′(x)=1−ecosx>ϵfor every x, or equivalently,e<1−ϵ. In fact, the multistep version of the FSSI algorithm provides a manner for obtaining the solution of Kepler’s equation even in the limiting casee=1−ϵwith an accuracy down to ∼10−11rad.
Second, following reference [27], and definingδyj=yj+1−yj, the coefficients of the spline interpolationSj(y)=∑q=03 cj(q)y−yjqof the inverse function in the intervalyj≤y<yj+1are computed using the equations
cj(0)=xj,cj(1)=dj,cj(2)=−2dj+dj+1δyj+3hj(δyj)2,cj(3)=dj+dj+1(δyj)2−2hj(δyj)3,
forj=0,⋯,n−1. These formulas have been derived assuming that none of the differencesδyj=yj+1−yjis zero. Numerically, the expressions for the coefficientscj(2)andcj(3) in Equation (3) seem to require not only|δyj|≳ϵ, but also(δyj)2≳ϵand|δyj |3≳ϵfor every j. In double precision arithmetic, this would imply that the grid spacing should be controlled to keep|δyj|≳ϵ1/3≃3×10−5. Fortunately, this requirement can be relaxed sincedj=1/f′(xj)=hj/δyj+O(hj) , so that the opposite signs in the last two Equation (3) imply that
cj(2)=hj (δyj)2−2dj+dj+1δyj hj+3=Ohj2 (δyj)2,cj(3)=hj (δyj)3dj+dj+1δyj hj−2=Ohj2 (δyj)3.
Since these two coefficients have to be multiplied by(y−yj)2and(y−yj)3, respectively, withy∈[yj,yj+1), their contributions to the interpolation of the inverse function,S(x), are expected to be at most of the orderhj2. Therefore,cj(2)andcj(3)can be set to zero whenhj2is smaller than the numerical precision, i.e., whenhj<1.49×10−8given in double precision arithmetic. These considerations are later used in the numerical implementation of the algorithm.
(iii)
Computation of the k-vector. If the generation procedure includes an implementation of the k-vector search algorithm [18,19,20] (as described below), the k-vector should also be computed at the setup stage. As discussed in Section 3, the main part of this computation can be performed during the j loop in point (ii) above. The implementation of the k-vector search in the FSSI algorithm is quite natural since they both use a setup process to minimize their generation time.
Since the array y of then+1 breakpoints obtained in point (ii) is given in ascending order, following references [18,19,20] we can define its associated k-vector k as an array of sizen+1whose elementklis given by the maximum index j, such thatyj≤ml+q, forl=0,⋯,n. The constantsξ, m, and q are chosen to beξ=ϵ(ymax−ymin),m=(ymax−ymin+2ξ)/nandq=ymin−ξ. Thusml+qdescribes a line of equally spaced points such thaty0=ymin−ξandyn=ymax+ξ, which can be used for a fast interpolation search as discussed below in point (i) of the generation procedure.
These setup computations are executed once for a given functionf(x)and a given error level, thereby generating the breakpointsyjand the coefficients of the spline interpolation S of the inverse functionf−1 , along with the values of the k-vector. These values can be stored as tables of constants either in a file or in the RAM memory. As seen in Section 4 for the Kepler equation, high accuracy can be obtained with onlyO(103−104)grid points.
2.2. Generation Procedure
The spline breakpoints and coefficients obtained in the setup stage can then be used to evaluateS(Y)on any set (or array, in computer language) of N pointsY={Ya},a=0,⋯,N−1belonging to the intervalymin,ymax. The generation procedure to compute the values ofS(Y)on the output array Y consists of the following steps:
(i)
Search for the interval. For each pointYa∈Y, find the indexjaof the breakpointsyjsuch thatyja ≤Ya<yja+1, withjaan integer in the range0,⋯,n−1and the possible values of a beinga=0,⋯,N−1 . This search can be executed using the bisection method [32], as in the examples of reference [27], or with a different search algorithm, such as the combination of the k-vector search and the bisection method (see Section 3). In the latter case, the k-vector is used for a first bracketing of the possible values ofjaby computing the integer part (floor)jy=mYa+qand the k-vector valueskjy andkjy+1 . Following references [18,19,20], the indexjais constrained to be in the intervalkjy ≤ja<kjy+1 . As discussed in Section 4, a higher accuracy can be obtained with FSSI by selecting a slightly more conservative choice, including one more point in the bracketing,kjy −1≤ja<kjy+1. Finally, a binary search is performed within this reduced interval to obtain the value of the indexja. This last execution is very fast since it only involves a very small number of points.
(ii)
Evaluation. Evaluate the values ofS(Ya)fora=0,⋯,N−1with the formula
S(Ya)=∑q=03cja(q)Ya−yja q.
This task only requires a few multiplications and a sum involving numbers that have been computed previously and are given in a table; thus, it is very fast.
2.3. Errors of the FSSI Algorithm
If the functionf(y) is infinitely differentiable, as for Kepler’s equation, an excellent approximate upper limit for the accuracy of the FSSI algorithm can be given analytically [27].
|f−1(y)−S(y)|⪅1384max0≤j≤n−1xj+1−xj4×maxx0≤x≤xn−15f″ (x)3f′ (x)3+10f(3)(x)f″(x)f′ (x)2−f(4)(x)f′(x),
wheref′,f″,f(3), andf(4) are the derivatives of f of order from 1 to 4, respectively. As said in reference [27], this error is in general several orders of magnitude smaller than that expected from general spline interpolation theory.
Equation (6) can be applied to the errorEj=maxyj≤y≤yj+1|f−1(y)−S(y)|in each intervalxj≤x<xj+1, so that
Ej⪅hj4384maxxj≤x≤xj+1−15f″ (x)3f′ (x)3+10f(3)(x)f″(x)f′ (x)2−f(4)(x)f′(x).
This is due to the fact that the FSSI algorithm can be used to invert the function in each grid interval, for whichxminandxmaxcan be replaced withxjandxj+1, respectively. Once the numerical interpolationS(y)of the inverse function is obtained, this prediction for the error can be compared with the numerical error, which can be computed a posteriori by evaluatingS(Y)−S(f(S(Y))) on the output array Y [27].
One consequence of Equation (7) is that the error of the FSSI algorithm,Ej, scales as the fourth power of the grid intervalhj , which can be thought of as the error of x previous to executing the FSSI spline interpolation. As mentioned above, and shown in Section 4, the spline interpolation provides an average accuracy improvement of more than 10 orders of magnitude, from an averagehj=O(10−4)−O(10−3)toE=O(10−15)−O(10−14), with a marginal additional cost in the generation time.
Equation (7) can also be used to optimize the FSSI algorithm by choosing a variable grid spacinghj=h(xj). The maximum step size compatible with a given global error levelE can be obtained by inverting Equation (7) as,
hjmax=384E1/4 maxxj≤x≤xj+1−15f″ (x)3f′ (x)3+10f(3)(x)f″(x)f′ (x)2−f(4)(x)f′(x)−1/4.
This optimal choice of the grid spacinghj , entering point 1 in the setup procedure as outlined above, is applied to Kepler’s equation in Section 3.
3. Methods
In this section, the detailed implementation of the FSSI algorithm is discussed, including improvements such as the multistep optimization and the use of the k-vector search. Simple pseudocode listings are provided with array indices following similar conventions to those used in C and Python languages. Most of these routines can be applied to a general monotonic functionf(x)with minimal or no modifications. An exception is the multistep optimization, which is given specifically for Kepler’s problem, although the procedure described in this case can be adapted to more general monotonic functions.
In the case of Kepler’s problem, the functionf(x)=x−esinxsatisfies the equationsf(−x)=−f(x), andf(x+qπ)=f(x)+qπ, for every integer q. This also implies the periodicity equations
f−1(−y)=−f−1(y)andf−1(y+2qπ)=f−1(y)+2qπ,
for every integer q. Therefore, without any loss of generality, it is sufficient to invert f in the interval [0,π], withxmin=ymin=0andxmax=ymax=π, and then the solution for everyx,y can be obtained using Equation (9).
For convenience, the eccentricity is considered an input variable to the program, such that the function f and its derivativef′≡ dfdx are functions of(e,x) . The eccentricity is denoted ec in the code to distinguish it from Euler’s number. The pseudocode defining Kepler’s functions is shown in Listing 1.
3.1. Grid Generation with Multistep Optimization for Kepler’s Equation
In the case of Kepler’s Equation (1), the function f and its derivatives are known analytically, and the expression (8) for the step size compatible with a given error level can be written as
hj≤hjmax=384E1/4 minxj≤x≤xj+1(1−ecosx)3(1−15e2+6e2 cos2x+8ecosx)esinx1/4.
This expression has been obtained from Equation (6), which has been shown to provide a good approximation whenhj is small enough [27]. Therefore, it has to be implemented with a few changes to enforce the validity of this approximation and to avoid divisions by small numbers that may be close to machine precision. A slightly more conservative—i.e., smaller— expression forhj than the maximum value given in Equation (10) is shown in Listing 2.
The values of ec=e , ec2=e2 , and hgfac=4.4E1/4 shall be passed with the calling of the function, as shown below. The last if statement has been chosen empirically to keephjsmall enough so that the approximations for the error are still valid for everye.
Function hstep_kepler, as shown in Listing 2, can be used to generate the array of grid intervals hj. For a small enoughhj , the minimum value in Equation (10) is obtained in one of the endpoints,xjorxj+1 , except in a few intervals. The additional factors that have been introduced in the function hstep_kepler, as compared to Equation (10), have been found to be sufficient to ensure that the choice of the endpoint that minimizes hstep_kepler always gives an error belowE, if the latter is set at a level ≥10−13rad, except foreequal or very close to1−ϵand y very close to zero, in which case a limiting accuracy ∼2×10−11rad is found when using double precision arithmetic—see Section 4. Remarkably, this function works for everye∈[0,1−ϵ], thereby including the highly nonlinear regime of eccentricity values close to 1 that is within the computational error in double precision.
The array of grid intervalshj can then be generated with the pseudocode of Listing 3.
In a computer implementation of the algorithm, the array h can be initialized by dynamic memory allocation, or by assigning a size that is greater than the maximum value of n that may be required fore∈[0,1−ϵ]and for error level down to10−15rad . This value can be chosen to be 26,000, as shall be seen in Section 4. After running the routine hstep_kepler with the while loop shown above, the value of the required number of grid intervals n will be available, along with the values ofhj≡ h[j]. This value of n will be used to allocate the memory for the arrays of the breakpoints, of the coefficients, and the k-vector.
3.2. Computation of the Breakpoints and Coefficients
Pseudocode implementing the formulas given in Section 2 for the computation of the arrays of the breakpointsyj , called y[j] in the code, and of the coefficientscj(q) , called c[q,j], is given in Listing 4.
As written, the routine FSSI_coef of Listing 4 is applied here to Kepler’s function f(ec, x). However, it can also be used to obtain the breakpoints of more general monotonic functions, if an array of grid steps h and the values of the derivatives are provided.
The arrays y and c have to be initialized by allocatingn+1and4×(n+1)“double” values, respectively, with the value of n that has been obtained in stage 1 of the setup phase. Then the breakpoints and the coefficients for the spline interpolation of the inverse of Kepler’s function can be obtained with the call,
y, c = FSSI_coef(0., PI, n, ec, h) |
by providing the number of grid intervals n, the array of grid steps h, and the input eccentricity ec.
3.3. Computation of the k-Vector
When the interval search in the generation procedure is performed using the k-vector method, a pre-processing computation of the k-vector has to be implemented in the setup phase. This can be done within the routine FSSI_coef, using the same for loop to compute the contributions to the k-vector. The pseudocode for computing breakpoints, coefficients of the spline, and k-vector is called with the following routine:
function FSSI_coef_kv(xmin, xmax, n, ec, h, mkv, qkv): |
This function is based upon the pseudocode FSSI_coef, but with the modifications in the for loop over igrid that are provided in Listing 5.
In Listing 5, the dots … represent the same code as in routine FSSI_coef.
As in the previous case, the arrays y and c must be initialized together with the function FSSI_coef, while the integer arrays kv and deltakv of sizen+1 are allocated and initialized to zero. Similar to FSSI_coef, this routine can be applied to any function and is not specific to Kepler’s equation as described here.
3.4. Generation Procedure: Search of the Interval
The search of the indexjacorresponding to the interval to which a given pointYa∈Ybelongs, i.e.,yja ≤Ya<yja+1 , can be executed using the bisection method [32] with the routine described by the pseudocode shown in Listing 6.
Here, left and right are two indices that are known to bracket the correct interval, and that have to be defined in the function call of the routine, as shown below. Alternatively, as discussed in Section 2, the search can be performed using the k-vector method followed by a bisection search, as shown in Listing 7.
When the evaluation array Y is sorted and very large, these two routines, find_interval_BS and find_interval_kv, can be made significantly faster by adding the following line in the beginning of the routine:
if y[left + 1] > yval: return left |
Indeed, for Y large and sorted, many successive pointsYaandYa+1∈Yfall into the same interval between breakpoints, so thatja+1=ja . As such, it is convenient to check this condition, thereby exiting further evaluations in the loop if true. The versions of the routines find_interval_BS and find_interval_kv with this additional conditional check are referred to as find_interval_BS_us and find_interval_kv_us, respectively, where the suffix _us denotes “use sorted”.
3.5. Generation Procedure: Evaluation
Once the correct interval has been found, the value of the inverse function in each of the N points of the array Y (called Yp below) can be performed as shown in Listing 8.
c and y are the coefficients and breakpoints arrays that have been computed in the setup phase. The if yval < yvali environment makes use of the result of the previous step for a first bracketing, and it works both for sorted and for unsorted Y arrays. Similar pseudocode uses the k-vector routine find_interval_kv for the search. In this case, the values m, q, and the k-vector arraykvhave to be passed in the definition of the function,
function poly_evaluate_kv(c, y, kx, mkx, qkv, Yp): |
The only remaining changes with respect to the routine poly_evaluate_BS will be lines calling find_interval_kv, which will be modified as shown in Listing 9.
Note, this generation part of the code is not specific to function inversion. It provides a general method for cubic spline interpolation when the coefficients and breakpoints are known. If these coefficients and breakpoints are those computed with the setup phase of the FSSI algorithm when applied to a function f, the result is the interpolation of the inversef−1.
4. Results
All the routines described in Section 2 and Section 3 have been implemented from scratch in Cython [36] using double-precision arithmetic, and have been applied to the solution of Kepler’s equation for different values of the eccentricitye.
Briefly, a Cython program resembles a Python program but contains specialized directives, data type syntax, and function commands (that act as wrappers, called decorators). This collection of extra syntax allows the CPython interpreter (together with other standard libraries) to generate a pure C program file that can be compiled with the GNU gcc compiler, thereby producing a shared-object binary file. In all our computer code described here (of Section 2 and Section 3), we wrote highly customized Cython code with explicit type and function referencing that would guarantee optimized compilation results that would produce the lowest function call overhead possible.
Additionally, the translated C file from the Cython code is compiled using appropriate gcc optimization switches that could in principle take advantage of low-level vectorization. The result is a shared-object binary that can be imported into a normal Python script, but when executed, runs as a C binary. As mentioned, on appropriate hardware, further acceleration can be obtained by compiling the Cython code with specialized C libraries such as the Intel vector extension and the Intel Math Kernel library. The numerical results presented below were obtained on a standard laptop computer with modest hardware (a 64 bit Intel i7-8565U CPU 1.8 GHz, with 16 GB memory, and with a Linux Mint operating system with 4.15.0-118 kernel).
Figure 1 shows the interpolationS(y)of the solutionf−1(y)of Kepler’s equation for three representative values ofe, namely,e=0.5,0.9, and1−ϵ, withϵ=2.22×10−16. The diagonal line representsx=y, corresponding to the solution fore=0, describing a circular orbit. Higher deviations of the solution from this linear regime are observed for increasing values ofe , corresponding to more severe nonlinearity, a well-known behaviour that is described in textbooks [1,2,3]. For this reason, the multistep method that we have described in Section 3 is most important in the regimee≳0.9, especially for values close to the limite→1.
Table 1 shows the values of the minimum (fifth column) and maximum (sixth column) step sizes, along with the total number n of steps (fourth column), obtained with the multistep routine for Kepler’s equation that has been described in Section 3. These results are given for four different values of the eccentricity (first column) and five different values of the input error level (second column). Fore≤0.9, the actual error (third column), obtained by computing the maximum value of|S(y)−S(f(S(y)))|over a large array of y points, is always smaller or at most equal to the input error level because of the conservative assumptions that were made in the design of the routine. Fore=0.9and input accuracy10−15radthey become equal. For valuese>0.9, and even close to the limite→1within the numerical error, this multistep routine is able to reach an overall error as small as ∼2×10−11rad, which can be reduced toO(10−14−10−13)radexcept for values ofevery close to1−ϵand fory≲10−9rad.
Obtaining higher accuracy, if needed, would require further adjustments or data types higher than double precision.
Fore≤0.9, the values ofhminandhmaxdo not vary by more than a factor 10. In this case, the higher cost in setup time of the multistep routine is not compensated by the reduction of the number of steps, and a single step routine with step size equal tohminmay be preferred, as shown below. However, the multistep routine is very important fore>0.9, and especially close to the limite→1, where the ratiohmax/hmincan be very large. Fore=1−ϵand input error level10−13rad, for example, a single step routine would requiren=π/hmin∼2.2×109steps, while the multitep routine can attain the same precision with justn=7874steps. From a practical point of view, we can conclude that the multistep routine is only needed fore>0.9, and that it is more and more necessary the closer iseto 1.
Figure 2 shows the results for the CPU execution time,tCPU, of the Cython code corresponding to different variants of the FSSI algorithm, when applied to computing the solutionf−1(y)of Kepler’s equation for eccentricitye=0.9for an increasing number N of points of the co-domain. The input accuracy level is fixed to the value10−15rad.
The prefix ms- indicates the routines using the multistep method described in Section 3, as opposed to those indicated with the prefix ss- that use a single steph=(xmax−xmin)/nss, withnss=13,500 (fore=0.9) to attain the same error level.
The suffixes _bs and _kv label the search routines that are used to find the interval for building the spline in each case, based on the bisection method (_bs), or on the use of k-vector followed by bisection (_kv), respectively. These routines do not exploit the fact the Y array is sorted; it usually is since it is the time variable. However, they can be modified to account for sorted arrays with the addition of the conditional test, if x[left + 1] > xval: return left, as mentioned in Section 3. The resulting _bs_us and _kv_us routines turn out to have the same performance, within the uncertainty; thus, they are indicated with the suffix _us and without the indices _bs and _kv in Figure 2.
ForN≲104, the CPU execution time is essentially dominated by the setup time. In this regime, fore=0.9 , the single-step routines are faster than the multistep implementations by a factor ∼1.6. In this N regime, and for these values of the eccentricity and error level, the _bs and _kv routines have approximately the same speed.
In the large N regime, the CPU execution time is dominated by the generation time. In this case, the single-step and multistep versions of any of the FSSI routines have the same speed, within a computational error. In this regime, the determining factor of execution time is the search routine. Without the _us enhancement, the FSSI_kv routines using k-vector search followed by bisection are faster than those using bisection, FSSI_bs, by a factor ∼1.6. For sorted Y arrays, however, the _us versions are significantly faster than the "bare" routines _bs and _kv.
For reference, the CPU execution times for a Cython implementation of the Newton–Raphson method—called Newton—are also given in Figure 2. (A similar result, with almost the same values for the execution time, can be obtained with an implementation of Brent’s method [10], which is then not indicated in the figure). Moreover, Newton_us indicates a routine that uses the result of the previous search as a starting guess for each y, resulting in a significant reduction—by a factor ∼3.3 forN=108 —of the execution time for large N and sorted Y array, similar to the improvement that has been obtained with the _us variant for the FSSI algorithm. (Such an improvement is not obtained for the Brent method). The routines ms-FSSI_bs_us, ss-FSSI_bs_us, ms-FSSI_kv_us, and FSSI_kv_us are faster by a factor ∼28, forN=108 , than the routine Newton_us. Without the _us enhancement, the routines ss-FSSI_kv and ms-FSSI_kv are ∼37 times faster than Newton.
Similar results for the execution time can be obtained for any value ofe. In this case, the main difference is the comparison between the single-step and the multistep methods, which becomes more and more favorable to the multistep routine fore→1, as discussed previously.
Deeper insight into the performance and the convergence properties of the FSSI algorithm and its variants can be obtained by analyzing how its setup and generation times depend on the accuracy.
Figure 3 shows the dependence of the “setup time,”tsetupCPU, on the input error levelE. This is defined as the part of the CPU execution time that is dedicated to the computation of the spline coefficients and breakpoints and includes the time spent in the multistep routine. In theory,tsetupcan be expected to be proportional to n, which is proportional toE−1/4. As shown in the figure, this dependence is satisfied, to a good approximation, for input accuracy below ∼10−10rad. The deviation from this dependence above this input error level is due to the introduction of the minimum step appearing in the actual multi-step routine, which also implies the—conservative—discrepancy betweenEand the actual error. For this value ofe , the multistep routine still requires a larger setup time than the single-step routine. As said above and shown in Table 1, this is not the case for higher values of the eccentricity, where the multistep routine becomes more convenient. Finally, Figure 3 shows the small additional CPU-time cost of the setup of the k-vector, as compared with the setup routine that is used together with the bisection method. This additional cost intsetupCPUmay be justified when the number of evaluation points N is such that the generation time is higher than the setup time.
Finally, Figure 4 shows the dependence of the “generation time” per point of the output array Y,tgenerationCPU/N, on the input error levelE. More precisely, the value given corresponds totCPU/NforN=107points. Even with the modest hardware, we used in our numerical computations,tgenerationCPU/N is just a few nanoseconds per point. Remarkably, for the routines using the k-vector method or the _us enhancement for sorted arrays,tgenerationCPU/N turns out to be practically independent of the error level. This is because in these cases the error level is fixed by the setup phase, while the generation phase only requires a few arithmetic operations between numbers that have already been computed in the setup phase, together with a search that, for the _kv and _us methods, only involves a small number of points. In contrast, for the _bs routine,tgenerationCPU/Nbehaves similarly to the bisection method, whose execution time is proportional tolog(n) [32], which is proportional to−14logE . This behavior can be observed in Figure 4 forE≲10−10 , the deviations for higher error level being due to the imposition of a maximum value of h in the multistep routine discussed in Section 3.
5. Discussion
When applied to just a single grid interval,n=1, the FSSI algorithm provides a simple analytical approximation of the inverse function:
S(y)=∑q=03c(q) y−yminq,
whose coefficients can be computed through Equation (3); note that in this casexj=x0=xmin,dj=d0=1/f′(xmin),dj+1=d1=1/f′(xmax),hj=h=xmax−xmin, andδyj=f(xmax)−f(xmin). For Kepler’s equation, this gives,xj=x0=0,dj=d0=1/(1−e),dj+1=d1=1/(1+e),hj=h=π, andδyj=π; thus,
c(0)=0,c(1)=11−e,c(2)=−e(1+3e)π(1−e2),c(3)=2e2π2(1−e2).
This approximation can be shown to have an accuracy ∼0.06 rad fore=0.5, but becomes completely unreliable fore≳0.9 . Recently, excellent approximate analytical solutions of Kepler’s equation using a cubic polynomial have been proposed in the literature [37]. The FSSI algorithm, when applied to Kepler’s equation, can be thought of as a generalization of such solutions, using piecewise cubic approximations in suitable sub-intervals.
The FSSI can be convenient for computing the inverse of a monotonic function over an entire interval or a large set of points. If the inversion has to be repeated many times in different sessions, the FSSI method is efficient once the coefficients, the breakpoints, and possibly the k-vector are computed and stored. Then, all subsequent evaluations for obtaining values of the inverse function at each point are very fast, since they would correspond to generation times in the range of a few nanoseconds per point, even with the modest hardware. However, if the values of the inverse function are only required in a reduced number of points, point-by-point methods may be preferred since they can have lower setup times.
A common strategy for accelerating expensive computations of a nonlinear function is to use pre-computed lookup tables (LUT) at evenly distributed intervals and then use interpolation [38,39]. Similarly, the FSSI may constitute a useful tool for creating libraries for inverse functions. Similar libraries could also be created by using other methods, such as Newton–Raphson, by obtaining the inverse in a given grid of points and then extrapolating. However, the FSSI has unique advantages for this purpose, since the values of the inverse function in the n grid points are obtained nearly for free, consisting of just an interchange ofxjandyjwith no generated error, other than the intrinsic computational errorϵ . Moreover, the FSSI algorithm, with its choice of the breakpoints and coefficients, has been shown to provide a huge enhancement in precision, as compared to the prediction of error with general spline interpolation [27].
Although the FSSI algorithm, as described in reference [27] and here, has been designed to invert monotonic functions, such as Kepler’s equation, it may be interesting to explore its possible use in more general cases. In this sense, it is interesting to note that the k-vector method has been used for the multi-valued inversion of non-monotonic functions [20] when applied in conjunction with the Newton–Raphson algorithm. It would then be natural to explore the possibility of replacing the latter with the FSSI and applying the methods of reference [20] to the combination of the FSSI and the k-vector search for inverting non-monotonic functions.
6. Conclusions
In contrast to point-by-point methods that solve the equationy=f(x) for each value of y, the FSSI algorithm [27] generates a piecewise polynomial interpolation of the inversef−1(y)of a monotonic functionf(x)over an entire interval at once. Here, two significant improvements to this algorithm have been described. First, we introduced the use of a k-vector search for bracketing, followed in the last step by a binary search, for finding the interval of the FSSI spline to which an evaluation point y belongs. Second, in the particular case of Kepler’s equation for orbital motion, we designed a multistep mechanism that significantly reduces the number of breakpoints needed for the interpolation of the inverse function.
For eccentricitye≲0.9, the accuracy of the FSSI can be set to10−15rad. In this case, a single step routine can be preferred to a multistep method since its setup time can be significantly smaller. However, fore>0.9, the multistep mechanism becomes ever more efficient. It can solve Kepler’s equation with an error ofO(10−14−10−13)radfor most values of the eccentricitye>0.9and mean anomaly y. Only foreclose to the limit1−ϵ, whereϵis the intrinsic computational error, is there a small region,y≲10−9rad, for which the accuracy is found to be limited to ∼2×10−11rad.
The use of a k-vector search for bracketing followed by bisection for finding the correct interval significantly reduces the generation time, especially when very high accuracy is desired. When the array of y points (i.e., that which contains the inverse function to be evaluated) is large and sorted, the generation time can be reduced considerably more. As this is an important case, we have separate search algorithm routines with this variant denoted _us.
Even with modest computer hardware, the Cython implementations of these versions of the FSSI algorithm efficiently solve Kepler’s equation for a very large number of N of y values with a generation time of the order of a few nanoseconds per point. Remarkably, the generation times of the implementations of the FSSI algorithm using k-vector bracketing, or using the _us variants of the search routines, are practically independent of the error level. Moreover, in the large N regime, these generation times are smaller by more than an order of magnitude than those of point-to-point inversion methods like Newton–Raphson’s or Brent’s.
These results may be of interest, e.g., for the orbital modeling of exoplanets, which requires solving Kepler’s equation for a large number of points [28,29].
Figure 1. Solutionx=f-1(y)of Kepler's equation for different values of the eccentricity (ϵ=2.22×10-16).
Figure 2. CPU execution time(s) for different routines as a function of the number N of y values in which the inverse off(x)=x-esinxhas to be computed. These results correspond toe=0.9and an error level10-15rad. All results have been obtained with the computer hardware specified in the text.
Figure 3. CPU execution time (s) as a function of the error level for the setup of different routines, as applied to the inversion off(x)=x-esinxwithe=0.9. These results have been obtained with the computer hardware specified in the text.
Figure 4. Generation time per point(s/N)as a function of the error level for different routines, as applied to the inversion off(x)=x-esinxwithe=0.9. The plotted values correspond to the CPU execution time for generatingN=107 points, divided by N. These results have been obtained with the computer hardware specified in the text.
e | Error Level | Actual Error | n | hmin | hmax |
---|---|---|---|---|---|
0.5 | 1.0 ×10−7 | 5.3 ×10−8 | 49 | 5.2 ×10−2 | 7.1 ×10−2 |
0.5 | 1.0 ×10−9 | 5.3 ×10−10 | 144 | 1.6 ×10−2 | 4.2 ×10−2 |
0.5 | 1.0 ×10−11 | 5.3 ×10−12 | 450 | 5.2 ×10−3 | 1.7 ×10−2 |
0.5 | 1.0 ×10−13 | 5.3 ×10−14 | 1416 | 1.6 ×10−3 | 6.8 ×10−3 |
0.5 | 1.0 ×10−15 | 8.9 ×10−16 | 4469 | 5.2 ×10−4 | 2.8 ×10−3 |
0.9 | 1.0 ×10−7 | 3.5 ×10−8 | 104 | 2.1 ×10−2 | 3.9 ×10−2 |
0.9 | 1.0 ×10−9 | 3.5 ×10−10 | 293 | 6.6 ×10−3 | 3.2 ×10−2 |
0.9 | 1.0 ×10−11 | 3.5 ×10−12 | 922 | 2.1 ×10−3 | 1.5 ×10−2 |
0.9 | 1.0 ×10−13 | 3.6 ×10−14 | 2905 | 6.6 ×10−4 | 5.2 ×10−3 |
0.9 | 1.0 ×10−15 | 1.0 ×10−15 | 9177 | 2.1 ×10−4 | 2.1 ×10−3 |
0.99 | 1.0 ×10−7 | 3.1 ×10−8 | 151 | 8.2 ×10−3 | 3.5 ×10−2 |
0.99 | 1.0 ×10−9 | 3.1 ×10−10 | 435 | 2.6 ×10−3 | 3.5 ×10−2 |
0.99 | 1.0 ×10−11 | 3.1 ×10−12 | 1366 | 8.2 ×10−4 | 1.3 ×10−2 |
0.99 | 1.0 ×10−13 | 3.3 ×10−14 | 4311 | 2.6 ×10−4 | 5.2 ×10−3 |
0.99 | 1.0 ×10−15 | 2.7 ×10−15 | 13,621 | 8.2 ×10−5 | 2.2 ×10−3 |
1−ϵ | 1.0 ×10−7 | 3.0 ×10−8 | 271 | 1.6 ×10−6 | 3.4 ×10−2 |
1−ϵ | 1.0 ×10−9 | 3.1 ×10−10 | 813 | 1.1 ×10−7 | 3.4 ×10−2 |
1−ϵ | 1.0 ×10−11 | 2.0 ×10−11(3.2 ×10−12) * | 2572 | 7.6 ×10−9 | 1.3 ×10−2 |
1−ϵ | 1.0 ×10−13 | 2.0 ×10−11(2.4 ×10−13) * | 7874 | 1.4 ×10−9 | 5.6 ×10−3 |
1−ϵ | 1.0 ×10−15 | 2.0 ×10−11(2.2 ×10−13) * | 25,305 | 4.3 ×10−10 | 2.0 ×10−3 |
Author Contributions
Conceptualization, D.T. and D.N.O.; methodology, D.T. and D.N.O.; software, D.T. and D.N.O.; validation, D.T. and D.N.O.; formal analysis, D.T. and D.N.O.; investigation, D.T. and D.N.O.; resources, D.T. and D.N.O.; data curation, D.T. and D.N.O.; writing-original draft preparation, D.T. and D.N.O.; writing-review and editing, D.T. and D.N.O. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by Ministerio de Economia, Industria y Competitividad, Spain, grant number FIS2017-83762-P, and by Conselleria de Educación, Universidade e Formacion Profesional, Xunta de Galicia, grant number ED431B 2018/57.
Acknowledgments
We thank Angel Paredes for discussions.
Conflicts of Interest
The authors declare no conflict of interest.
1. Colwell, P. Solving Kepler's Equation Over Three Centuries; Willmann-Bell Inc.: Richmond, VA, USA, 1993.
2. Prussing, J.E.; Conway, B.A. Orbital Mechanics, 2nd ed.; Oxford University Press: Oxford, UK, 2012.
3. Curtis, H.D. Orbital Mechanics for Engineering Students, 3rd ed.; Elsevier: Amsterdam, The Netherlands, 2014.
4. Mathews, J.H.; Fink, K.D. Numerical Methods Using MATLAB, 4th ed.; Pearson: London, UK, 2004.
5. Gerlach, J. Accelerated convergence in Newton's method. Siam Rev. 1994, 36, 272-276.
6. Palacios, M. Kepler equation and accelerated Newton method. J. Comput. Appl. Math. 2002, 138, 335-346.
7. Abbasbandy, S. Improving Newton-Raphson method for nonlinear equations by modified Adomian decomposition method. Appl. Math. Comput. 2003, 145, 887-893.
8. Abbasbandy, S.; Tan, Y.; Liao, S. Newton-homotopy analysis method for nonlinear equations. Appl. Math. Comput. 2007, 188, 1794-1800.
9. Ostrowski, A. Solutions of Equations and System of Equations; Academic Press: New York, NY, USA, 1960.
10. Brent, R.P. Algorithms for Minimization without Derivatives; Prentice-Hall: Englewood Cliffs, NJ, USA, 1973.
11. Amat, S.; Busquier, S.; Gutierrez, J. Geometric constructions of iterative functions to solve nonlinear equations. J. Comput. Appl. Math. 2003, 157, 197-205.
12. Neta, B.; Petkovic, M. Construction of optimal order nonlinear solvers using inverse interpolation. Appl. Math. Comput. 2010, 217, 2448-2455.
13. Zheng, Q.; Li, J.; Huang, F. An optimal steffensen-type family for solving nonlinear equations. Appl. Math. Comput. 2011, 217, 9592-9597.
14. Petkovic, M.; Neta, B.; Petkovic, L.; Dzunic, J. Multipoint Methods for Solving Nonlinear Equations; Academic Press, Elsevier: Amsterdam, The Netherlands, 2013.
15. Kansal, M.; Kanwar, V.; Bhatia, S. New modifications of Hansen-Patrick's family with optimal fourth and eighth orders of convergence. Appl. Math. Comput. 2015, 269, 507-519.
16. Sharma, J.R.; Arora, H. A new family of optimal eighth order methods with dynamics for nonlinear equations. Appl. Math. Comput. 2016, 273, 924-933.
17. Sharifi, S.; Salimi, M.; Siegmund, S.; Lotfi, T. A new class of optimal four-point methods with convergence order 16 for solving nonlinear equations. Math. Comput. Simul. 2016, 119, 69-90.
18. Mortari, D.; Neta, B. K-vector range searching techniques. Adv. Astronaut. Sci. 2000, 105, 449-464.
19. Mortari, D.; Rogers, J. A k-vector approach to sampling, interpolation, and approximation. J. Astronaut. Sci. 2013, 60, 686-706.
20. Arnas, D.; Mortari, D. Nonlinear function inversion using k-vector. Appl. Math. Comput. 2018, 320, 754-768.
21. Boyd, J.P. Rootfinding for a transcendental equation without a first guess: Polynomialization of Kepler's equation through Chebyshev polynomial equation of the sine. Appl. Numer. Math. 2007, 57, 12-18.
22. Raposo-Pulido, V.; Pelaez, J. An efficient code to solve the Kepler equation. Elliptic case. Mon. Not. R. Astron. Soc. 2017, 467, 1702-1713.
23. Fukushima, T. A method solving Kepler's equation without transcendental function evaluations. Celest. Mech. Dyn. Astron. 1997, 66, 309-319.
24. Fukushima, T. Fast procedure solving universal Kepler's equation. Celest. Mech. Dyn. Astron. 1999, 75, 201-226.
25. Feinstein, S.A.; McLaughlin, C.A. Dynamic discretization method for solving Kepler's equation. Celest. Mech. Dyn. Astron. 2006, 96, 49-62.
26. Zechmeister, M. CORDIC-like method for solving Kepler's equation. Astron. Astrophys. 2018, 619, A128.
27. Tommasini, D.; Olivieri, D.N. Fast switch and spline scheme for accurate inversion of nonlinear functions: The new first choice solution to Kepler's equation. Appl. Math. Comput. 2020, 364, 124677.
28. Makarov, V.V.; Veras, D. Chaotic rotation and evolution of asteroids and small planets in high-eccentricity orbits around white dwarfs. Astrophys. J. 2019, 886, 127.
29. Eastman, J.D.; Rodriguez, J.E.; Agol, E.; Stassun, K.G.; Beatty, T.G.; Vanderburg, A.; Gaudi, S.; Collins, K.A.; Luger, R. Exofastv2: A public, generalized, publication-quality exoplanet modeling code. arXiv 2019, arXiv:1907.09480.
30. Numpy. Version 1.19. Available online: https://numpy.org/doc/stable/reference/generated/numpy.searchsorted.html (accessed on 1 October 2020).
31. SciPy. Version 1.5.3. Available online: https://docs.scipy.org/doc/scipy-1.5.3/reference/generated/scipy.interpolate.PPoly.html (accessed on 1 October 2020).
32. Knuth, D. Sorting and searching. In The Art of Computer Programming, 2nd ed.; Addison-Wesley Professional: Boston, MA, USA, 1998.
33. Conway, B.A. An improved algorithm due to Laguerre for the solution of Kepler's equation. Celest. Mech. 1986, 39, 199-211.
34. Charles, E.D.; Tatum, J.B. The convergence of Newton-Raphson iteration with Kepler's equation. Celest. Mech. Dyn. Astron. 1998, 69, 357-372.
35. Stumpf, L. Chaotic behaviour in the Newton iterative function associated with Kepler's equation. Celest. Mech. Dyn. Astron. 1999, 74, 95-109.
36. Cython. Version 3.0.0. Available online: https://Cython.org/ (accessed on 1 October 2020).
37. Lynden-Bell, D. An approximate analytic inversion of Kepler's equation. Mon. Not. R. Astron. Soc. 2015, 447, 363-365.
38. Press, W.H.; Teukolsky, S.A.; Vetterling, W.T.; Flannery, B.P. Numerical Recipes: The Art of Scientific Computing, 3rd ed.; Cambridge University Press: Cambridge, UK, 2007.
39. Pharr, M.; Fernando, R. GPU Gems 2: Programming Techniques for High-Performance Graphics and General-Purpose Computation, Addison-Wesley Professional: Boston, MA, USA, 2005.
Daniele Tommasini1,* and David N. Olivieri2,3
1Applied Physics Department, School of Aeronautic and Space Engineering, Universidade de Vigo, As Lagoas s/n, 32004 Ourense, Spain
2Computer Science Department, School of Informatics (ESEI), Universidade de Vigo, As Lagoas s/n, 32004 Ourense, Spain
3Centro de Intelixencia Artificial, La Molinera, s/n, 32004 Ourense, Spain
*Author to whom correspondence should be addressed.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
© 2020. This work is licensed under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Obtaining the inverse of a nonlinear monotonic functionf(x)over a given interval is a common problem in pure and applied mathematics, the most famous example being Kepler’s description of orbital motion in the two-body approximation. In traditional numerical approaches, this problem is reduced to solving the nonlinear equationf(x)−y=0in each point y of the co-domain. However, modern applications of orbital mechanics for Kepler’s equation, especially in many-body problems, require highly optimized numerical performance. Ongoing efforts continually attempt to improve such performance. Recently, we introduced a novel method for computing the inverse of a one-dimensional function, called the fast switch and spline inversion (FSSI) algorithm. It works by obtaining an accurate interpolation of the inverse functionf−1(y)over an entire interval with a very small generation time. Here, we describe two significant improvements with respect to the performance of the original algorithm. First, the indices of the intervals for building the spline are obtained by k-vector search combined with bisection, thereby making the generation time even smaller. Second, in the case of Kepler’s equation, a multistep method for the optimized calculation of the breakpoints of the spline polynomial was designed and implemented in Cython. We demonstrate results that accurately solve Kepler’s equation for any value of the eccentricitye∈[0,1−ϵ], withϵ=2.22×10−16, which is the limiting error in double precision. Even with modest current hardware, the CPU generation time for obtaining the solution with high accuracy in a large number of points of the co-domain can be kept to around a few nanoseconds per point.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer