The $n$th Taylor expansion of $f$ around $x_0$ is given by
$$f(x)=\left(\sum_{k=0}^{n}\frac{f^{(k)}(x_0)}{k!}(x-x_0)^k\right)+\frac{f^{(n+1)}(\xi_x)}{(n+1)!}(x-x_0)^{n+1}$$
for $\xi_x$ between $x_0$ and $x$.
Lagrange Interpolating Polynomials
Say we have a set of $n+1$ points $x_0,x_1,x_2,\dots,x_n$ with known values $f(x_0),f(x_1),\dots,f(x_n)$. Then we can create a unique polynomial which goes through these points.
The construction is illustrated as follows. Say we have $x_0,x_1,x_2$ with values $y_0,y_1,y_2$. The constructed polynomial is
$$P(x)=\frac{(x-x_1)(x-x_2)}{(x_0-x_1)(x_0-x_2)}\cdot y_0+\frac{(x-x_0)(x-x_2)}{(x_1-x_0)(x_1-x_2)}\cdot y_1+\frac{(x-x_0)(x-x_1)}{(x_2-x_0)(x_2-x_1)}\cdot y_2$$
The idea is that when, for instance, $x=x_2$, the $(x-x_2)$ factor in the first and second terms causes those terms to become zero, while the fraction in the third term is one (its numerator and denominator are identical there), making the function as a whole produce $y_2$.
This gives a degree-at-most-n polynomial.
The error term for this polynomial is
$$E(x)=\frac{f^{(n+1)}(\xi_x)}{(n+1)!}(x-x_0)(x-x_1)\cdots(x-x_n)$$
for some $\xi_x$. Note how similar it looks to the Taylor error term! Worth memorizing this one, probably.
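As a concrete companion to the formula, here is a minimal Python sketch that evaluates the Lagrange form directly (the function name and the $f(x)=1/x$ example are my own choices, not from the text):

```python
def lagrange_interp(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs[i], ys[i]) at x."""
    total = 0.0
    for i in range(len(xs)):
        # Basis polynomial L_i(x): product of (x - x_j)/(x_i - x_j) over j != i
        term = ys[i]
        for j in range(len(xs)):
            if j != i:
                term *= (x - xs[j]) / (xs[i] - xs[j])
        total += term
    return total

# Interpolate f(x) = 1/x at three nodes and evaluate near the middle
xs = [2.0, 2.75, 4.0]
ys = [1.0 / x for x in xs]
print(lagrange_interp(xs, ys, 3.0))  # ~0.3295; true value is 1/3
```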
Neville’s Method (sec 3.2, PDF pg 133)
Let $P_{a,b,c}$ be the interpolating polynomial agreeing with $f$ at the points $x_a,x_b,x_c$, etc.
Choose $i\neq j$; then we have the identity
$$P_{0,1,\dots,n}(x)=\frac{(x-x_i)\,P_{0,\dots,i-1,i+1,\dots,n}(x)-(x-x_j)\,P_{0,\dots,j-1,j+1,\dots,n}(x)}{x_j-x_i}$$
Now say we have input points $x_0,x_1,x_2,x_3$ and know $f(x_0),\dots,f(x_3)$. We define
$$P_0(x)=f(x_0),\quad P_1(x)=f(x_1),\quad P_2(x)=f(x_2),\quad P_3(x)=f(x_3)$$
Now for any $x$ we can use the mentioned identity to compute the table
$$\begin{array}{llll}
P_0(x) & & & \\
P_1(x) & P_{0,1}(x) & & \\
P_2(x) & P_{1,2}(x) & P_{0,1,2}(x) & \\
P_3(x) & P_{2,3}(x) & P_{1,2,3}(x) & P_{0,1,2,3}(x)
\end{array}$$
where each column is computed based on the previous. The result is an approximation
$$P_{0,1,2,3}(x)\approx f(x)$$
Note that $P_{0,1,2,3}$ is exactly the Lagrange interpolating polynomial agreeing with the input points.
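A minimal sketch of the table-building, assuming the recurrence above (here `Q[i][j]` plays the role of $P_{i-j,\dots,i}(x)$; the names are mine):

```python
def neville(xs, ys, x):
    """Neville's method: evaluate the interpolating polynomial at x.

    Q[i][j] holds P_{i-j,...,i}(x); each column j is built from column j-1.
    """
    n = len(xs)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Q[i][0] = ys[i]
    for j in range(1, n):
        for i in range(j, n):
            Q[i][j] = ((x - xs[i - j]) * Q[i][j - 1]
                       - (x - xs[i]) * Q[i - 1][j - 1]) / (xs[i] - xs[i - j])
    return Q[n - 1][n - 1]

xs = [2.0, 2.75, 4.0]
print(neville(xs, [1.0 / x for x in xs], 3.0))  # same result as the Lagrange form
```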
Newton Divided Differences (sec 3.3, PDF page 140)
Again say we know $f(x_0),\dots,f(x_3)$. Define
$$f[x_0]=f(x_0),\quad f[x_1]=f(x_1),\quad f[x_2]=f(x_2),\quad f[x_3]=f(x_3)$$
and then
$$f[x_n,\dots,x_{n+d}]=\frac{f[x_{n+1},\dots,x_{n+d}]-f[x_n,\dots,x_{n+d-1}]}{x_{n+d}-x_n}$$
We can build up these values as successive columns of the table
$$\begin{array}{llll}
f[x_0] & & & \\
f[x_1] & f[x_0,x_1] & & \\
f[x_2] & f[x_1,x_2] & f[x_0,x_1,x_2] & \\
f[x_3] & f[x_2,x_3] & f[x_1,x_2,x_3] & f[x_0,x_1,x_2,x_3]
\end{array}$$
Once the table is computed we can use the diagonal to express the interpolating polynomial P as
$$P(x)=f[x_0]+f[x_0,x_1](x-x_0)+f[x_0,x_1,x_2](x-x_0)(x-x_1)+f[x_0,x_1,x_2,x_3](x-x_0)(x-x_1)(x-x_2)$$
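A sketch of building the diagonal coefficients and then evaluating $P$ with Horner-style nesting (function names are mine):

```python
def divided_differences(xs, ys):
    """Return the diagonal coefficients f[x_0], f[x_0,x_1], ..., f[x_0,...,x_n]."""
    n = len(xs)
    col = list(ys)          # current column of the table
    coeffs = [col[0]]
    for d in range(1, n):
        # Update in place, bottom-up: col[i] becomes f[x_{i-d}, ..., x_i]
        for i in range(n - 1, d - 1, -1):
            col[i] = (col[i] - col[i - 1]) / (xs[i] - xs[i - d])
        coeffs.append(col[d])
    return coeffs

def newton_eval(coeffs, xs, x):
    """Evaluate the Newton form using nested multiplication."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[k]) + coeffs[k]
    return result

xs = [2.0, 2.75, 4.0]
cs = divided_differences(xs, [1.0 / x for x in xs])
print(newton_eval(cs, xs, 3.0))  # again matches the Lagrange form
```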
Hermite Polynomials (sec 3.4; PDF page 152)
Given $n+1$ datapoints $(x_i,f(x_i),f'(x_i))$, the Hermite polynomial $H_{2n+1}$ is the unique polynomial of degree at most $2n+1$ which agrees with $f$ and whose derivative agrees with $f'$ on all of $x_0,\dots,x_n$.
Cubic Spline Interpolation (sec 3.5; PDF page 160)
Given $n+1$ datapoints $(x_i,f(x_i))$, we can approximate $f$ on each subinterval $[x_i,x_{i+1}]$ with a single cubic polynomial $S_i$ (ie, $S_i$ is of the form $ax^3+bx^2+cx+d$).
Each $S_i$ is required to agree with $f$ at its endpoints, and also to agree with its neighbors on value, derivative, and second derivative at its endpoints (eg, $S_0(x_1)=S_1(x_1)$, and the same for first and second derivatives).
The piecewise composition of all $S_i$ is denoted $S$.
We then choose to impose one of the following two conditions. The first is
$$S''(x_0)=S''(x_n)=0$$
in which case we say $S$ has a "free" or "natural" boundary; or,
$$S'(x_0)=f'(x_0)\quad\text{and}\quad S'(x_n)=f'(x_n)$$
in which case we say $S$ has a "clamped" boundary.
Imposing a boundary condition constrains S just enough to induce a unique solution.
The induced interpolating function is then $S(x)=S_i(x)$ for $x\in[x_i,x_{i+1}]$.
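Rather than solving the tridiagonal system by hand, here is a quick sketch using SciPy's `CubicSpline` (a real API; its `bc_type` argument selects the boundary condition; the $e^x$ sample data is my own choice):

```python
import numpy as np
from scipy.interpolate import CubicSpline

xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.exp(xs)  # sample f(x) = e^x at the nodes

# Natural (free) boundary: S''(x_0) = S''(x_n) = 0
natural = CubicSpline(xs, ys, bc_type='natural')

# Clamped boundary: S'(x_0) = f'(x_0) and S'(x_n) = f'(x_n); here f' = f = exp
clamped = CubicSpline(xs, ys, bc_type=((1, 1.0), (1, float(np.exp(3.0)))))

print(natural(1.5), clamped(1.5), np.exp(1.5))  # clamped tracks f more closely
```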
Numerical Differentiation (sec 4.1)
Forward difference formula (also called the backward difference formula for $h<0$):
$$f'(x)=\frac{f(x+h)-f(x)}{h}-\frac{h}{2}f''(\xi(x))$$
for $\xi(x)\in[x,x+h]$.
Three-point endpoint formula:
$$f'(x)=\frac{1}{2h}\left(-3f(x)+4f(x+h)-f(x+2h)\right)+\frac{h^2}{3}f^{(3)}(\xi(x))$$
for $\xi(x)\in[x,x+2h]$.
Three-point midpoint formula:
$$f'(x)=\frac{1}{2h}\left(f(x+h)-f(x-h)\right)-\frac{h^2}{6}f^{(3)}(\xi(x))$$
for $\xi(x)\in[x-h,x+h]$.
Five-point endpoint and five-point midpoint formulas also exist.
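A small sketch of both three-point formulas (the test function is my own choice):

```python
import math

def three_point_midpoint(f, x, h):
    """O(h^2) approximation to f'(x) using nodes x-h and x+h."""
    return (f(x + h) - f(x - h)) / (2 * h)

def three_point_endpoint(f, x, h):
    """O(h^2) approximation to f'(x) using nodes x, x+h, x+2h.

    Useful at the edge of a table, where f(x - h) is unavailable.
    """
    return (-3 * f(x) + 4 * f(x + h) - f(x + 2 * h)) / (2 * h)

# f = exp, so f'(2) = e^2 = 7.38905...
print(three_point_midpoint(math.exp, 2.0, 0.1))
print(three_point_endpoint(math.exp, 2.0, 0.1))
```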
Richardson Extrapolation (sec 4.2, PDF page 201)
Say we have a value $M$ we're approximating by some $N_1(h)$, where $h$ can be anything (typically some small "step" parameter):
$$M\approx N_1(h)$$
Say we know the exact error term. Say, for instance,
$$M=N_1(h)+ah+bh^2+ch^3+\cdots$$
This makes $N_1(h)$ an $O(h)$ approximation to $M$. (Here $O(h)$ is as $h\to0$.)
Plugging in $h/2$ we get
$$M=N_1(h/2)+\frac{1}{2}ah+\frac{1}{4}bh^2+\frac{1}{8}ch^3+\cdots$$
Now subtracting the first equation from twice the second gives
$$M=\left[2N_1(h/2)-N_1(h)\right]+0\cdot ah-\frac{1}{2}bh^2-\frac{3}{4}ch^3-\cdots$$
Note that the $ah$ term has been killed. This means that the approximation
$$M\approx 2N_1(h/2)-N_1(h)=:N_2(h)$$
is $O(h^2)$. We have "upgraded" our approximation from $O(h)$ to $O(h^2)$.
We may repeat the process on $N_2$ to produce an $O(h^3)$ approximation, and so on. (To do this, we would consider $N_2(h/2)$ and combine it with $N_2(h)$ in a way that eliminates the $h^2$ term.)
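A sketch of the repeated process, assuming the error expansion contains every power of $h$, so that level $k$ uses the combination $(2^kN_k(h/2)-N_k(h))/(2^k-1)$ (the function name and the forward-difference example are mine):

```python
def richardson(N1, h, levels):
    """Extrapolate an O(h) approximation N1(h) through `levels` levels.

    Assumes M = N1(h) + a*h + b*h^2 + ...; the returned value is O(h^levels).
    """
    table = [[N1(h / 2**i) for i in range(levels)]]
    for k in range(1, levels):
        prev = table[-1]
        # Combining N_k(h/2) with N_k(h) cancels the h^k error term
        row = [(2**k * prev[i + 1] - prev[i]) / (2**k - 1)
               for i in range(len(prev) - 1)]
        table.append(row)
    return table[-1][0]

import math
# Upgrade the O(h) forward difference for f = exp at x = 1; true answer is e
d = richardson(lambda h: (math.exp(1 + h) - math.exp(1)) / h, 0.1, 4)
print(d, math.e)
```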
Numerical Quadrature (sec 4.3, PDF page 209)
This section is just formulas for approximating integrals.
Typically we make approximations based off of equally-spaced nodes $x_0,x_1,\dots,x_n$ with $x_{i+1}=x_i+h$.
One approximation is the Trapezoidal rule, using two input nodes:
$$\int_{x_0}^{x_1}f(x)\,dx=\frac{h}{2}\left(f(x_0)+f(x_1)\right)-\frac{h^3}{12}f''(\xi)$$
for $\xi\in(x_0,x_1)$. Another is Simpson's rule, using three input nodes:
$$\int_{x_0}^{x_2}f(x)\,dx=\frac{h}{3}\left(f(x_0)+4f(x_1)+f(x_2)\right)-\frac{h^5}{90}f^{(4)}(\xi)$$
for $\xi\in(x_0,x_2)$.
Both of these are "closed", meaning that they include the endpoints of the interval in their approximation. There are also "open" formulas which don't, such as the midpoint rule:
$$\int_{x_{-1}}^{x_1}f(x)\,dx=2hf(x_0)+\frac{h^3}{3}f''(\xi)$$
This uses the single node $x_0$ in its approximation.
Accuracy of an approximation method is measured by its degree of precision (or degree of accuracy), which is the highest $k$ for which the approximation is exact when applied to $f(x)=x^k$. (It is a theorem that, given some niceness assumptions, this is equivalent to asking for exactness on all degree-$k$ polynomials.)
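A quick sketch checking degrees of precision empirically on monomials over $[0,1]$ (whose exact integral is $1/(k+1)$):

```python
def trapezoid(f, a, b):
    """Trapezoidal rule on [a, b]."""
    return (b - a) / 2 * (f(a) + f(b))

def simpson(f, a, b):
    """Simpson's rule on [a, b]."""
    h = (b - a) / 2
    return h / 3 * (f(a) + 4 * f(a + h) + f(b))

for k in range(6):
    exact = 1 / (k + 1)
    t_ok = abs(trapezoid(lambda x: x**k, 0, 1) - exact) < 1e-12
    s_ok = abs(simpson(lambda x: x**k, 0, 1) - exact) < 1e-12
    print(k, t_ok, s_ok)
# Trapezoid is exact through k = 1 (precision 1); Simpson through k = 3 (precision 3)
```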
Composite Integration (sec 4.4)
If you have some way to approximate an integral, you can apply it n times to n subintervals to produce a better approximation.
For the following two we have $h=(b-a)/n$ and $x_i=a+ih$.
Composite Simpson’s rule (n must be even)
$$\int_a^bf(x)\,dx=\frac{h}{3}\left[f(a)+2\sum_{j=1}^{(n/2)-1}f(x_{2j})+4\sum_{j=1}^{n/2}f(x_{2j-1})+f(b)\right]-\frac{b-a}{180}h^4f^{(4)}(\mu)\quad\text{for some }\mu\in(a,b)$$
Composite Trapezoidal rule
$$\int_a^bf(x)\,dx=\frac{h}{2}\left[f(a)+2\sum_{j=1}^{n-1}f(x_j)+f(b)\right]-\frac{b-a}{12}h^2f''(\mu)\quad\text{for some }\mu\in(a,b)$$
For the last one we have $h=(b-a)/(n+2)$ and $x_i=a+(i+1)h$.
Composite Midpoint rule (n must be even)
$$\int_a^bf(x)\,dx=2h\sum_{j=0}^{n/2}f(x_{2j})+\frac{b-a}{6}h^2f''(\mu)\quad\text{for some }\mu\in(a,b)$$
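A sketch of composite Simpson (the 4-2-4-2 weight pattern comes straight from the two sums above):

```python
import math

def composite_simpson(f, a, b, n):
    """Composite Simpson's rule with n subintervals (n must be even)."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        # Odd-indexed interior nodes get weight 4, even-indexed get weight 2
        total += (4 if j % 2 == 1 else 2) * f(a + j * h)
    return h / 3 * total

print(composite_simpson(math.sin, 0, math.pi, 10))  # true value is 2; error is O(h^4)
```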
Romberg Integration
Is just Richardson extrapolation applied to the composite trapezoidal rule.
Adaptive Quadrature (4.6, pdf page 237)
Using equally-spaced nodes for computing integrals doesn't make sense for a function with non-constant variability.
We do some magic to detect which parts of the function are more/less variable and subdivide more finely where needed. We get some vaguely-strong guarantees about error bounds.
Gaussian Quadrature (4.7, pdf page 246)
The Trapezoidal rule, Simpson’s rule, midpoint rule, etc, all have the form
$$\int_a^bf(x)\,dx\approx\sum_{i=1}^{n}c_if(x_i)$$
for some choice of $\{c_i\}$ and $\{x_i\}$.
Gaussian Quadrature aims to find the optimal approximation of that form. It does so by maximizing the approximation's degree of accuracy.
For $n=2$ and integrating over $[-1,1]$ the resultant approximation is
$$\int_{-1}^{1}f(x)\,dx\approx f(-\sqrt{3}/3)+f(\sqrt{3}/3)$$
with degree of precision three.
In general the result for $n$ nodes and integrating over $[-1,1]$ is given as follows. $\{x_i\}$ are the roots of the $n$th Legendre polynomial (the first five are given by
$$P_0(x)=1,\quad P_1(x)=x,\quad P_2(x)=x^2-\tfrac{1}{3},\quad P_3(x)=x^3-\tfrac{3}{5}x,\quad P_4(x)=x^4-\tfrac{6}{7}x^2+\tfrac{3}{35}$$
) and $\{c_i\}$ are
$$c_i=\int_{-1}^{1}\ \prod_{j=1,\,j\neq i}^{n}\frac{x-x_j}{x_i-x_j}\,dx$$
This approximation has degree of precision $2n-1$.
The fact that the resulting approximation is over $[-1,1]$ is superficial; we can refit it to any integral with
$$\int_a^bf(x)\,dx=\int_{-1}^{1}f\!\left(\frac{(b-a)t+(b+a)}{2}\right)\frac{b-a}{2}\,dt$$
The values $\{x_i\},\{c_i\}$ can also be derived (without restricting to $[-1,1]$!) by solving for the $2n$ unknowns $\{x_i\},\{c_i\}$ under the constraint that the solution have degree of precision $2n-1$.
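A sketch using NumPy's `leggauss` (a real API returning the Legendre roots and weights on $[-1,1]$), combined with the change of variables above:

```python
import numpy as np

def gauss_legendre(f, a, b, n):
    """n-point Gaussian quadrature of f over [a, b]."""
    t, w = np.polynomial.legendre.leggauss(n)   # nodes and weights on [-1, 1]
    x = ((b - a) * t + (b + a)) / 2             # refit nodes to [a, b]
    return (b - a) / 2 * np.sum(w * f(x))

# Three nodes already integrate degree-5 polynomials exactly (precision 2n-1 = 5)
print(gauss_legendre(lambda x: x**5 + x**2, 0.0, 1.0, 3))  # exact: 1/6 + 1/3 = 0.5
```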
Multiple Integrals (4.8, pdf page 253)
This is just Gaussian Quadrature applied in multiple dimensions
The technique is nothing beyond using
$$\iint_Rf(x,y)\,dA=\int_a^b\int_{c(x)}^{d(x)}f(x,y)\,dy\,dx$$
on e.g. composite Simpson’s. The results are hairy though.
Integration techniques!
Chain rule
$$\int F(y(t))\,\frac{dy}{dt}\,dt=\int F(y)\,dy$$
(1) Isolate $y$: re-express the problem as $F(y(t))\frac{dy}{dt}=G(t)$ (ie, all $y$ on the side with $\frac{dy}{dt}$); (2) integrate both sides wrt $t$; (3) apply the chain rule to the left-hand side.
Example. To solve $\frac{dy}{dt}=y\cos t$, first isolate $y$ by dividing, to give $y^{-1}\frac{dy}{dt}=\cos t$. Now integrate for $\int y^{-1}\frac{dy}{dt}\,dt=\int\cos t\,dt$. The $\frac{dy}{dt}$ and $dt$ cancel out, giving $\int y^{-1}\,dy=\int\cos t\,dt$; evaluating both sides gives $\ln|y|=\sin t+C$, so $y=Ce^{\sin t}$.
FTC:
$$\int\frac{dF}{dt}(t)\,dt=F(t)$$
Integrating factors. (Streamlinization of the inverse product rule)
If wanting to solve
$$\frac{dy}{dt}+p(t)y=q(t)$$
then let $v(t)=e^{\int p(t)\,dt}$ ($v$ is called the integrating factor); multiply both sides by $v$ and apply the reverse product rule for
$$\frac{d}{dt}\left(y\cdot v(t)\right)=v(t)q(t)$$
Then go from there
u-substitution
Choose u=u(t) so that (1) holds; then you may conclude (2).
$$\int F(t)\,dt\overset{(1)}{=}\int F(u(t))\frac{du}{dt}\,dt\overset{(2)}{=}\int F(u)\,du$$
Integration by parts
$$\int f(t)g'(t)\,dt=f(t)g(t)-\int g(t)f'(t)\,dt$$
Actually, I have yet to find this one useful.
Initial-Value Problems
An initial-value problem provides a differential equation and an initial condition
$$\frac{dy}{dt}=f(t,y),\quad a\le t\le b,\quad y(a)=y_0$$
and asks what functions $y=y(t)$ abide by these constraints.
Lipschitz condition — if a function $f=f(t,y)$ admits a set $D\subseteq\mathbb{R}^2$ and a constant $L>0$ so that
$$|f(t,y_1)-f(t,y_2)|\le L|y_1-y_2|$$
for $(t,y_1),(t,y_2)\in D$, then $f$ is said to satisfy a Lipschitz condition in $y$ over $D$.
Intuition: if $f$ satisfies a Lipschitz condition in $y$ over $D$, then we can bound a difference of images along each fixed-$t$ line, without knowing $t$.
Convex — A subset $D\subseteq\mathbb{R}^2$ is convex if for every pair of points $a,b\in D$ the entire line segment $\overline{ab}$ lies in $D$.
Perturbed problem, well-posed problem For an initial-value problem of the form
$$\frac{dy}{dt}=f(t,y),\quad a\le t\le b,\quad y(a)=y_0$$
a perturbed problem is one of the form
$$\frac{dy}{dt}=f(t,y)+\delta(t),\quad a\le t\le b,\quad y(a)=y_0+\delta_0$$
A perturbed problem represents "the possibility of an error being introduced in the statement of the differential equation as well as an error $\delta_0$ being present in the initial condition."
An initial-value problem is called well-posed if both:
(A) it has a unique solution $y(t)$; and
(B) small perturbations induce correspondingly-small changes in the solution: specifically, there exist $\varepsilon_0,k>0$ so that for any $0\le\varepsilon\le\varepsilon_0$ and $\delta=\delta(t),\delta_0$ with $\delta$ continuous and $|\delta(t)|,|\delta_0|\le\varepsilon$, the problem perturbed by $\delta,\delta_0$ has a unique solution $z(t)$ satisfying
$$|z(t)-y(t)|<k\varepsilon,\quad a\le t\le b$$
Theorem If $f$ is continuous and satisfies a Lipschitz condition in $y$ over $D=[a,b]\times\mathbb{R}$, then the initial-value problem
$$\frac{dy}{dt}=f(t,y),\quad a\le t\le b,\quad y(a)=y_0$$
is well-posed.
Euler’s method (sec 5.2, pdf page 284)
Given an initial-value problem
$$\frac{dy}{dt}=f(t,y),\quad a\le t\le b,\quad y(a)=y_0$$
and choosing some step size $h=\frac{b-a}{N}$, we fix time values $t_k=a+kh$ for $k\in\{0,\dots,N\}$ and approximate the values $y(t_k)$ as
$$y(t_0)\approx w_0:=y_0,\qquad y(t_{k+1})\approx w_{k+1}:=w_k+hf(t_k,w_k)$$
This is essentially just “executing” the differential equation with a small time step.
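A sketch of Euler's method; the test problem $y'=y-t^2+1$, $y(0)=0.5$ (exact solution $y(t)=(t+1)^2-0.5e^t$) is a standard one, but any $f$ works:

```python
import math

def euler(f, a, b, y0, N):
    """Euler's method for y' = f(t, y), y(a) = y0, with N steps of size h."""
    h = (b - a) / N
    w = y0
    ws = [w]
    for k in range(N):
        w = w + h * f(a + k * h, w)   # "execute" the ODE with a small time step
        ws.append(w)
    return ws

ws = euler(lambda t, y: y - t**2 + 1, 0.0, 2.0, 0.5, 10)
print(ws[-1], 9 - 0.5 * math.exp(2))  # approximation vs exact y(2)
```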
Local truncation error, Taylor’s method (sec 5.3, pdf page 293)
Local truncation error. We characterize a difference method
$$w_0:=y_0,\qquad w_{k+1}:=w_k+h\,\phi(t_k,w_k)$$
by its local truncation error $\tau_k$, which measures the error in the approximation $w_k$ assuming that $w_{k-1}$ is exact. Letting $y$ be the solution to the initial-value problem under approximation, we define
$$\tau_{k+1}:=\frac{y(t_{k+1})-\left[y(t_k)+h\,\phi(t_k,y(t_k))\right]}{h}=\frac{y(t_{k+1})-y(t_k)}{h}-\phi(t_k,y(t_k))$$
Note that the local truncation error is normalized wrt $h$. Recalling that $N=(b-a)/h$, we can interpret the local truncation error as (up to the constant factor $b-a$) the expected absolute error of the method as a whole, if every step had the absolute error of the $(k+1)$st step. Alternatively, we can think of the normalization as acting to ensure that our truncation error is roughly invariant wrt the choice of $h$.
Taylor's Method is an extension of Euler's method. For any order $n$, we approximate $y(t_{i+1})$ with
$$y(t_{i+1})\approx w_{i+1}:=w_i+h\left[f(t_i,w_i)+\frac{h}{2}(\partial_tf)(t_i,w_i)+\cdots+\frac{h^{n-1}}{n!}(\partial_t^{n-1}f)(t_i,w_i)\right]$$
where $\partial_t$ here denotes the total derivative of $f(t,y(t))$ along the solution.
Taylor's method of order $n$ has local truncation error $O(h^n)$. Euler's method is just Taylor's method with $n=1$.
Runge-Kutta Methods (sec 5.4, pdf page 300)
Uses multivariate versions of Taylor’s Theorem to derive difference methods with low local truncation error.
Midpoint method:
$$w_0:=y_0,\qquad w_{k+1}:=w_k+h\,f\!\left(t_k+\frac{h}{2},\ w_k+\frac{h}{2}f(t_k,w_k)\right)$$
Modified Euler method:
$$w_0:=y_0,\qquad w_{k+1}:=w_k+\frac{h}{2}\left[f(t_k,w_k)+f\!\left(t_{k+1},\ w_k+hf(t_k,w_k)\right)\right]$$
Both have local truncation error $O(h^2)$. Heun's method gives local truncation error $O(h^3)$. But the most often used Runge-Kutta method is one with local truncation error $O(h^4)$, since it achieves an optimum of precision of results vs computational demands.
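A sketch of that $O(h^4)$ method, the classical fourth-order Runge-Kutta scheme ("RK4"):

```python
import math

def rk4(f, a, b, y0, N):
    """Classical fourth-order Runge-Kutta for y' = f(t, y), y(a) = y0."""
    h = (b - a) / N
    t, w = a, y0
    for _ in range(N):
        k1 = h * f(t, w)
        k2 = h * f(t + h / 2, w + k1 / 2)
        k3 = h * f(t + h / 2, w + k2 / 2)
        k4 = h * f(t + h, w + k3)
        w += (k1 + 2 * k2 + 2 * k3 + k4) / 6   # weighted average of slopes
        t += h
    return w

# Same test problem as above; RK4 is far more accurate than Euler at equal N
print(rk4(lambda t, y: y - t**2 + 1, 0.0, 2.0, 0.5, 10), 9 - 0.5 * math.exp(2))
```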
The Runge-Kutta-Fehlberg Method (sec 5.5, pdf page 312)
Does some algebra and some approximations to produce an adaptive method which chooses mesh points dynamically to approximately bound the global error within some chosen $\varepsilon$.
Multistep Methods (sec 5.6, pdf page 320)
Multistep Methods: Difference methods where $w_i$ is defined using more than one of the previous values $w_k$, $k<i$.
A definition is given for multistep methods, alongside multistep versions of local truncation error and probably something else. None of it is interesting. Actually, not true. A little of it is interesting!
5.7
Dynamic multistep methods? Who cares
5.9
Runge-Kutta for systems of equations; ie, problems like
$$\partial_tu_1=f_1(t,u_1,\dots,u_n),\quad \partial_tu_2=f_2(t,u_1,\dots,u_n),\quad \dots,\quad \partial_tu_n=f_n(t,u_1,\dots,u_n)$$
over some $a\le t\le b$ with initial conditions
$$u_1(a)=\alpha_1,\quad\dots,\quad u_n(a)=\alpha_n$$
Note that we can use such an algorithm to also solve problems like
$$a\,\partial_t^3y+b\,\partial_t^2y+c\,\partial_ty+dy+e=0$$
by eg. setting
$$u_1(t)=y(t),\quad u_2(t)=(\partial_ty)(t),\quad u_3(t)=(\partial_t^2y)(t)$$
so that
$$\begin{aligned}
\partial_tu_1&=\partial_ty=u_2\\
\partial_tu_2&=\partial_t^2y=u_3\\
\partial_tu_3&=\partial_t^3y=-\tfrac{1}{a}\left(b\,\partial_t^2y+c\,\partial_ty+dy+e\right)=-\tfrac{1}{a}\left(bu_3+cu_2+du_1+e\right)
\end{aligned}$$
and then applying the algorithm.
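A sketch of the reduction plus a vector-valued RK4 (the coefficient values and the test equation $y'''+y'=0$, whose solution under these initial conditions is $\sin t$, are my own choices):

```python
import numpy as np

# Hypothetical coefficients for a*y''' + b*y'' + c*y' + d*y + e = 0
a_, b_, c_, d_, e_ = 1.0, 0.0, 1.0, 0.0, 0.0   # reduces to y''' + y' = 0

def F(t, u):
    """u = [y, y', y''] maps to [y', y'', y''']."""
    return np.array([u[1], u[2],
                     -(b_ * u[2] + c_ * u[1] + d_ * u[0] + e_) / a_])

def rk4_system(F, a, b, u0, N):
    """Classical RK4 applied componentwise to a vector-valued F."""
    h = (b - a) / N
    t, u = a, np.asarray(u0, dtype=float)
    for _ in range(N):
        k1 = h * F(t, u)
        k2 = h * F(t + h / 2, u + k1 / 2)
        k3 = h * F(t + h / 2, u + k2 / 2)
        k4 = h * F(t + h, u + k3)
        u = u + (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return u

# y(0)=0, y'(0)=1, y''(0)=0 gives y = sin t; compare component u[0] at t = 1
print(rk4_system(F, 0.0, 1.0, [0.0, 1.0, 0.0], 100)[0], np.sin(1.0))
```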
Stability (sec 5.10, pdf page ~358)
Consistency: local truncation error $\to0$ as $h\to0$. (For multistep methods we also require the errors in the starting values $\alpha_k$ to go to $0$.)
Convergence: $w_i\to y(t_i)$ as $h\to0$.
Stability: "small perturbations in the initial conditions produce correspondingly small changes in the approximations."
If a method is consistent, then it is stable iff it is convergent. (Thm 5.24)
For a single-step method $w_{i+1}=w_i+h\phi(t_i,w_i,h)$, if $\phi$ is continuous and satisfies a Lipschitz condition in $w$ with constant $L$ on a domain with $h$ bounded as $0\le h\le h_0$ for some $h_0$, then: (1) the method is stable; (2) consistent $\Leftrightarrow$ convergent; (3) bounds on the local truncation error extend to bounds on the global error.
In particular, if there exists $T$ with $|\tau_k(h)|\le T(h)$ when $0\le h\le h_0$, then
$$|y(t_i)-w_i|\le\frac{T(h)}{L}e^{L(t_i-a)}$$
Local truncation error of a multistep method is as expected; eg for
$$w_{i+1}=2w_i-w_{i-1}+\frac{h}{2}\left(f(t_i,w_i)+f(t_{i-1},w_{i-1})\right)$$
the L.T.E. is $\frac{1}{h}$ times the difference between $y(t_{i+1})$ and what $w_{i+1}$ would be if the previous $w$'s were exact; ie,
$$\tau_{i+1}(h)=\frac{y(t_{i+1})-\left[2y(t_i)-y(t_{i-1})+\frac{h}{2}\left(f(t_i,y(t_i))+f(t_{i-1},y(t_{i-1}))\right)\right]}{h}$$
The characteristic polynomial of a multistep method eg
$$w_{i+1}=Aw_i+Bw_{i-1}+Cw_{i-2}+hF(t_i,\{w_j\})$$
is defined to be eg
$$P(\lambda)=\lambda^3-A\lambda^2-B\lambda-C$$
generalizing in the obvious manner.
If all roots of $P$ are of magnitude $\le1$, and all roots with magnitude $1$ are simple roots, then the method satisfies the root condition.
A method is stable iff it satisfies the root condition.
Gaussian elimination: say you want to solve
$$\left[\begin{array}{cccc|c}
1 & 1 & 0 & 1 & 2\\
2 & 1 & -1 & 1 & 1\\
-1 & 2 & 3 & -1 & 4\\
3 & -1 & -1 & 2 & -3
\end{array}\right]$$
where a row represents $a_1x_1+a_2x_2+a_3x_3+a_4x_4=b$.
Start by replacing row (2) with (2)−2⋅(1) to eliminate the first variable. Repeat on the rest of the rows. Now do the same to eliminate the second variable, etc, until the matrix is upper triangular (has values only on and above the diagonal). Now back-substitute.
If at any point you cannot continue because the relevant diagonal entry is zero, try swapping with a lower row. If none exist with a nonzero entry in that column, the system has no unique solution.
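A minimal sketch of elimination plus back-substitution, swapping rows only when a pivot is exactly zero (the example reuses the matrix above, under my column-major reading of it):

```python
import numpy as np

def gaussian_elimination(A, b):
    """Solve Ax = b: reduce to upper-triangular form, then back-substitute."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    for k in range(n - 1):
        if A[k, k] == 0:
            # Try swapping with a lower row that has a nonzero entry in column k
            rows = np.nonzero(A[k + 1:, k])[0]
            if len(rows) == 0:
                raise ValueError("no unique solution")
            s = k + 1 + rows[0]
            A[[k, s]], b[[k, s]] = A[[s, k]], b[[s, k]]
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]            # multiplier for the row replacement
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):           # back-substitution
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

A = [[1, 1, 0, 1], [2, 1, -1, 1], [-1, 2, 3, -1], [3, -1, -1, 2]]
print(gaussian_elimination(A, [2, 1, 4, -3]))  # the system above; gives [-1, 2, 0, 1]
```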
Pivoting (6.2)
When performing Gaussian elimination and eliminating the $n$th variable, the diagonal entry at $(n,n)$ is called the pivot element; variables are eliminated by dividing by it.
Hence, if the pivot is too small one can run into floating-point error. Strategies for avoiding this include:
partial pivoting — before doing an elimination, optionally swap the pivot row with a row below it to maximize magnitude of the pivot element
scaled partial pivoting — before doing an elimination, optionally swap the pivot row with a row below it to maximize magnitude of the ratio between the pivot element and the maximum element of that row
complete pivoting — before doing an elimination, optionally swap the pivot row with a row below it and the pivot column with one to the right (excluding the right-most column), in order to maximize the magnitude of the pivot element. (Note that a column swap induces a variable swap)
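A sketch of the partial-pivoting variant: the same routine as above, but always promoting the largest-magnitude candidate pivot rather than swapping only on zero:

```python
import numpy as np

def gaussian_elimination_pp(A, b):
    """Gaussian elimination with partial pivoting, then back-substitution."""
    A, b = np.array(A, dtype=float), np.array(b, dtype=float)
    n = len(b)
    for k in range(n - 1):
        # Pivot on the largest-magnitude entry in column k among rows k..n-1
        p = k + int(np.argmax(np.abs(A[k:, k])))
        if A[p, k] == 0:
            raise ValueError("no unique solution")
        if p != k:
            A[[k, p]], b[[k, p]] = A[[p, k]], b[[p, k]]
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x
```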
Rate of convergence: $(a_n)\to a$ with rate $O(b_n)$ if there exists $K$ st $|a_n-a|\le K|b_n|$ always.
Order of convergence: highest $\alpha$ st
$$\lim_{n\to\infty}\frac{|p_{n+1}-p|}{|p_n-p|^\alpha}=\lambda$$
for some "asymptotic error constant" $\lambda$. If $\alpha=1,\lambda=0$ then we have superlinear convergence.
Polynomials n shit
"Taylor poly": The $n$th expansion of $f$ around $x_0$ is given by
$$f(x)=\left(\sum_{k=0}^{n}\frac{f^{(k)}(x_0)}{k!}(x-x_0)^k\right)+\frac{f^{(n+1)}(\xi_x)}{(n+1)!}(x-x_0)^{n+1}$$
for $\xi_x$ between $x_0$ and $x$.
"Lagrange Interp Polys": unique deg-at-most-$n$ poly going through known $(x,y)$ nodes. Given by
$$P(x)=\frac{(x-x_1)(x-x_2)}{(x_0-x_1)(x_0-x_2)}\cdot y_0+\frac{(x-x_0)(x-x_2)}{(x_1-x_0)(x_1-x_2)}\cdot y_1+\frac{(x-x_0)(x-x_1)}{(x_2-x_0)(x_2-x_1)}\cdot y_2$$
"Hermite poly": Given $n+1$ datapoints $(x_i,f(x_i),f'(x_i))$, the Hermite polynomial $H_{2n+1}$ is the unique poly of deg at most $2n+1$ which agrees with $f$ and whose derivative agrees with $f'$ on all of $x_0,\dots,x_n$.
"Neville's Method": Let $P_{a,b,c}$ be the interpolating poly agreeing with points $x_a,x_b,x_c$, etc. Choose $i\neq j$; then we have the identity
$$P_{0,1,\dots,n}(x)=\frac{(x-x_i)\,P_{0,\dots,i-1,i+1,\dots,n}(x)-(x-x_j)\,P_{0,\dots,j-1,j+1,\dots,n}(x)}{x_j-x_i}$$
Use to compute $P_{0,\dots,n}$ from lower $P$'s.
Newton Divided Differences: Say we know $f(x_0),\dots,f(x_3)$. Define $f[x_n]=f(x_n)$ and
$$f[x_n,\dots,x_{n+d}]=\frac{f[x_{n+1},\dots,x_{n+d}]-f[x_n,\dots,x_{n+d-1}]}{x_{n+d}-x_n}$$
then build up $f[x_0,\dots,x_3]$ from lower forms and express
$$P(x)=f[x_0]+f[x_0,x_1](x-x_0)+f[x_0,x_1,x_2](x-x_0)(x-x_1)+f[x_0,x_1,x_2,x_3](x-x_0)(x-x_1)(x-x_2)$$
Cubic Spline Interpolation — approximate subintervals with cubic polys. Must agree with oracle at endpoints, and neighbors on value, derivative, and second derivative.
Composite Midpoint rule ($n$ must be even, $h=(b-a)/(n+2)$ and $x_i=a+(i+1)h$)
$$\int_a^bf(x)\,dx=2h\sum_{j=0}^{n/2}f(x_{2j})+\frac{b-a}{6}h^2f''(\mu)\quad\text{for some }\mu\in(a,b)$$
“Romberg Integration”: extrapolation applied to the composite trapezoidal rule
“adaptive quad”: dynamically-chosen mesh points
review gaussian quad
Gaussian Quadrature: finds $\{c_i\}$, $\{x_i\}$ optimizing the degree of accuracy of an approximation of the form
$$\int_a^bf(x)\,dx\approx\sum_{i=1}^{n}c_if(x_i)$$
(Trapezoidal, Simpson's, midpoint have this form for some $\{c_i\}$, $\{x_i\}$.)
For $n=2$ we get
$$\int_{-1}^{1}f(x)\,dx\approx f(-\sqrt{3}/3)+f(\sqrt{3}/3)$$
with deg of precision $2n-1=3$.
In general can find the solution by constraining it to have deg of precision $2n-1$ (which is expressed with a number of formulas) and then solving for the $2n$ unknowns.
Richardson Extrapolation. Say we have approximation $M\approx N_1(h)$ with known error term, eg
$$M=N_1(h)+ah+bh^2+ch^3+\cdots$$
which gives an $O(h)$ approximation to $M$. Plugging in $h/2$ we get
$$M=N_1(h/2)+\frac{1}{2}ah+\frac{1}{4}bh^2+\frac{1}{8}ch^3+\cdots$$
Now subtracting the first equation from twice the second gives
$$M=\left[2N_1(h/2)-N_1(h)\right]+0\cdot ah-\frac{1}{2}bh^2-\frac{3}{4}ch^3-\cdots$$
Note that the $ah$ term has been killed. This means that the approximation
$$M\approx 2N_1(h/2)-N_1(h)=:N_2(h)$$
is $O(h^2)$. We have "upgraded" our approximation from $O(h)$ to $O(h^2)$. Repeat to produce an $O(h^3)$ approximation, and so on.
Integration techniques
Chain rule
$$\int F(y(t))\,\frac{dy}{dt}\,dt=\int F(y)\,dy$$
(1) Isolate $y$: re-express the problem as $F(y(t))\frac{dy}{dt}=G(t)$ (ie, all $y$ on the side with $\frac{dy}{dt}$); (2) integrate both sides wrt $t$; (3) apply the chain rule to the left-hand side.
Example. To solve $\frac{dy}{dt}=y\cos t$, first isolate $y$ by dividing, to give $y^{-1}\frac{dy}{dt}=\cos t$. Now integrate for $\int y^{-1}\frac{dy}{dt}\,dt=\int\cos t\,dt$. The $\frac{dy}{dt}$ and $dt$ cancel out, giving $\int y^{-1}\,dy=\int\cos t\,dt$; evaluating gives $\ln|y|=\sin t+C$, so $y=Ce^{\sin t}$.
FTC:
$$\int\frac{dF}{dt}(t)\,dt=F(t)$$
Integrating factors (inverse product rule). If wanting to solve
$$\frac{dy}{dt}+p(t)y=q(t)$$
then let $v(t)=e^{\int p(t)\,dt}$ (the integrating factor); multiply both sides by $v$ and apply the reverse product rule for
$$\frac{d}{dt}\left(y\cdot v(t)\right)=v(t)q(t)$$
u-substitution. Choose $u=u(t)$ so that (1) holds; then you may conclude (2).
$$\int F(t)\,dt\overset{(1)}{=}\int F(u(t))\frac{du}{dt}\,dt\overset{(2)}{=}\int F(u)\,du$$
For $D\subseteq\mathbb{R}^2$: if there exists $L>0$ st when $(t,y_1),(t,y_2)\in D$ then $|f(t,y_1)-f(t,y_2)|\le L|y_1-y_2|$, then $f$ "satisfies a Lipschitz condition" in $y$ over $D$.
IVP "well-posed" if it has a unique solution over $a\le t\le b$ and small changes in the problem induce small changes in the solution. (Thm: $f$ continuous and Lipschitz $\Rightarrow$ IVP well-posed)
Local truncation error: if $w_{k+1}=w_k+h\phi(t_k,w_k)$ then $\tau_{k+1}(h)=\frac{1}{h}\left(y(t_{k+1})-y(t_k)\right)-\phi(t_k,y(t_k))$
“What abs error would be if every step had err of this step”
Taylor's Method of order $n$ (local trunc error $O(h^n)$): $w_{i+1}:=w_i+h\left[f(t_i,w_i)+\frac{h}{2}(\partial_tf)(t_i,w_i)+\cdots+\frac{h^{n-1}}{n!}(\partial_t^{n-1}f)(t_i,w_i)\right]$
For $n=1$, this recovers Euler's method.
Runge-Kutta methods (based on multivariate Taylor’s Theorem)