Chebyshev/Orthogonal Polynomial Model

This section provides details on the Chebyshev/Orthogonal Polynomial model approximation technique used in Isight.


Overview

Orthogonal polynomial approximation is a type of regression technique. Orthogonal polynomials minimize the autocorrelation between the response values that exists because of the sampling locations. An advantage of using orthogonal functions as a basis for fitting is that the inputs can be decoupled in the analysis of variance (ANOVA) (Nakajima, 2006).

Chebyshev orthogonal polynomials are a common type of orthogonal polynomials that are particularly useful for equally spaced sample points. They are used when the sampling strategy is an orthogonal array. Isight implements Taguchi’s method (Taguchi, 1987) of fitting Chebyshev polynomials from an orthogonal array.

Isight also provides the capability to compute orthogonal polynomial approximations for other sampling strategies. In such cases the following approximation models are available:

Chebyshev Polynomials

The model is constructed as a linear regression of standard Chebyshev polynomials.

Successive Orthogonal Polynomials

A model constructed from a series of polynomials that are generated to be orthogonal with respect to the sample points. These polynomials are used as basis functions to obtain an approximation for the responses. The orthogonal basis functions depend only on the sample locations, not on the response values.

Chebyshev Polynomial Approximation

Chebyshev polynomials are a set of orthogonal polynomials that are solutions of a special kind of Sturm-Liouville differential equation called a Chebyshev differential equation.

The equation is

$$(1 - x^2)\,y'' - x\,y' + n^2 y = 0.$$

Chebyshev polynomials are of two kinds. In one dimension they are defined as follows:

Polynomials of the first kind

$$T_0(x) = 1,\qquad T_1(x) = x,\qquad T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x).$$

Polynomials of the second kind

$$U_0(x) = 1,\qquad U_1(x) = 2x,\qquad U_{n+1}(x) = 2x\,U_n(x) - U_{n-1}(x).$$
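Both recurrences translate directly into a three-term loop. The following Python sketch is illustrative only (the function names are not part of Isight):

```python
def chebyshev_first_kind(n, x):
    """T_n(x): T_0 = 1, T_1 = x, T_{n+1} = 2x*T_n - T_{n-1}."""
    t_prev, t_curr = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr

def chebyshev_second_kind(n, x):
    """U_n(x): U_0 = 1, U_1 = 2x, U_{n+1} = 2x*U_n - U_{n-1}."""
    u_prev, u_curr = 1.0, 2.0 * x
    if n == 0:
        return u_prev
    for _ in range(n - 1):
        u_prev, u_curr = u_curr, 2.0 * x * u_curr - u_prev
    return u_curr

# Spot check: T_3(x) = 4x^3 - 3x, so T_3(0.5) = -1.0
assert abs(chebyshev_first_kind(3, 0.5) - (-1.0)) < 1e-12
```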

The roots of these polynomials are not equally spaced. Taguchi describes a set of one-dimensional polynomials, which he calls Chebyshev, that have equally spaced roots. When these equally spaced roots are assumed to be the factor levels in an orthogonal array, a quadrature procedure is available for approximating a response using Chebyshev polynomials as individual terms.

In general, the quadrature method of fitting an approximation is more efficient and stable compared to a regression-based approach. However, the quadrature approach dictates that the function being approximated be evaluated at pre-defined locations. For Chebyshev polynomials, these positions correspond exactly to a sample obtained using an orthogonal array.

The following equations show the Chebyshev polynomials with equally spaced roots in one dimension:

$$\begin{aligned}
T_1(x) &= (x - \bar{x})\\
T_2(x) &= (x - \bar{x})^2 - b_2\\
T_3(x) &= (x - \bar{x})^3 - b_{31}(x - \bar{x})\\
T_4(x) &= (x - \bar{x})^4 - b_{41}(x - \bar{x})^2\\
T_5(x) &= (x - \bar{x})^5 - b_{51}(x - \bar{x})^3 - b_{52}(x - \bar{x})
\end{aligned}$$

where $\bar{x}$ is the average value of the levels. Taguchi generates multivariate polynomials by taking products of the one-dimensional Chebyshev polynomials listed above in each variable. Taguchi also provides tables for computing the coefficients of these terms for an orthogonal array.
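The constants $b_2$, $b_{31}$, $b_{41}$, $\ldots$ are chosen so that the $T_i$ are mutually orthogonal over the equally spaced levels. The following Python sketch recovers the first few constants numerically under that orthogonality condition; the level values are an arbitrary example, not Isight data:

```python
import numpy as np

# Illustrative sketch: recover the constants in T_1..T_3 by enforcing
# discrete orthogonality over equally spaced factor levels.

levels = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # equally spaced levels
xc = levels - levels.mean()                    # x - xbar

# Orthogonality of T_2 to the constant polynomial: sum(xc^2 - b2) = 0
b2 = np.mean(xc**2)

# Orthogonality of T_3 to T_1: sum((xc^3 - b31*xc) * xc) = 0
b31 = np.sum(xc**4) / np.sum(xc**2)

T1 = xc
T2 = xc**2 - b2
T3 = xc**3 - b31 * xc

# The resulting polynomials are pairwise orthogonal over the levels.
assert abs(np.sum(T1 * T2)) < 1e-9
assert abs(np.sum(T1 * T3)) < 1e-9
assert abs(np.sum(T2 * T3)) < 1e-9
```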

For example, suppose we have three variables $x_1$, $x_2$, and $x_3$ to which we want to fit a response $g$. We can generate the following multivariate polynomial basis:

$$\begin{aligned}
\text{Linear: } &\{\,T_1(x_1);\ T_1(x_2);\ T_1(x_3)\,\} && \text{(3 pure terms)}\\
\text{Quadratic: } &\{\,T_2(x_1);\ T_2(x_2);\ T_2(x_3); && \text{(3 pure terms)}\\
&\ \ \bigl(T_1(x_1)T_1(x_2)\bigr);\ \bigl(T_1(x_1)T_1(x_3)\bigr);\ \bigl(T_1(x_2)T_1(x_3)\bigr)\,\} && \text{(3 cross terms)}\\
\text{Cubic: } &\{\,T_3(x_1);\ T_3(x_2);\ T_3(x_3); && \text{(3 pure terms)}\\
&\ \ \bigl(T_2(x_1)T_1(x_2)\bigr);\ \bigl(T_1(x_1)T_2(x_2)\bigr);\ \bigl(T_2(x_1)T_1(x_3)\bigr);\ \bigl(T_1(x_1)T_2(x_3)\bigr);\\
&\ \ \bigl(T_2(x_2)T_1(x_3)\bigr);\ \bigl(T_1(x_2)T_2(x_3)\bigr);\ \bigl(T_1(x_1)T_1(x_2)T_1(x_3)\bigr)\,\} && \text{(7 cross terms)}
\end{aligned}$$

Therefore, the function g is approximated as

$$\begin{aligned}
g(x_1, x_2, x_3) \approx \hat{g} ={}& a_{11}T_1(x_1) + \cdots + a_{13}T_1(x_3)\\
&+ a_{21}T_2(x_1) + \cdots + a_{24}\bigl(T_1(x_1)T_1(x_2)\bigr) + \cdots + a_{26}\bigl(T_1(x_2)T_1(x_3)\bigr)\\
&+ a_{31}T_3(x_1) + \cdots + a_{34}\bigl(T_2(x_1)T_1(x_2)\bigr) + \cdots + a_{3,10}\bigl(T_1(x_1)T_1(x_2)T_1(x_3)\bigr).
\end{aligned}$$

Isight uses Taguchi’s tables to calculate the coefficients $a_{ij}$ when orthogonal array sampling is used, and least squares regression for all other sampling techniques. A term-by-term ANOVA can also be computed for a Chebyshev polynomial approximation when orthogonal array sampling is used.

Note: All points in an orthogonal array must be available to use Taguchi’s method, so the error value obtained with the cross-validation approach is not meaningful for orthogonal arrays: when cross-validation is used, the sample set for fitting the approximation no longer contains all the points. Isight therefore uses the regression approach to build the approximations used during cross-validation, and Taguchi’s approach for the model built from all the points.
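For samplings other than orthogonal arrays (and for the cross-validation fits described in the note above), the regression path amounts to assembling the basis terms as columns of a design matrix and solving for the coefficients by least squares. The following Python sketch is illustrative only; the sample data and the intercept column are assumptions of the example, not Isight internals:

```python
import numpy as np

# Sketch of the regression path: build the Linear/Quadratic terms listed
# above as design-matrix columns and solve for the coefficients by least
# squares. The response g here is an arbitrary stand-in.

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(40, 3))        # 40 samples, 3 inputs
g = X[:, 0]**2 + X[:, 1] * X[:, 2]              # response to approximate

def T1(x):
    return x - x.mean()                          # T1(x) = (x - xbar)

def T2(x):
    xc = x - x.mean()
    return xc**2 - np.mean(xc**2)                # T2(x) = (x - xbar)^2 - b2

cols = [np.ones(len(X))]                         # intercept (sketch only)
cols += [T1(X[:, i]) for i in range(3)]          # 3 pure linear terms
cols += [T2(X[:, i]) for i in range(3)]          # 3 pure quadratic terms
cols += [T1(X[:, i]) * T1(X[:, j])               # 3 cross terms
         for i in range(3) for j in range(i + 1, 3)]
A = np.column_stack(cols)

coeffs, *_ = np.linalg.lstsq(A, g, rcond=None)   # least squares estimates
g_hat = A @ coeffs
print("max abs fit error:", np.max(np.abs(g_hat - g)))  # ~machine precision
```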

Successive Orthogonal Polynomial Approximation

Isight provides the capability to use an arbitrary set of orthogonal polynomials to construct an approximation. Taguchi describes a multivariable method for the linear case only; the approach has been extended in Isight to arbitrary degree. For example, suppose we have three variables $x_1$, $x_2$, and $x_3$ to which we want to fit a response $g$. To simplify the expressions, assume that $x_1$, $x_2$, and $x_3$ have a mean of zero.

We can generate the following sequence of orthogonal functions (sometimes called a contrast):

$$\begin{aligned}
\text{linear:}\quad
f_1 &= x_1\\
f_2 &= x_2 - b_{2,1}f_1, & \mathrm{LT}(f_2) &= x_2\\
f_3 &= x_3 - b_{3,1}f_1 - b_{3,2}f_2, & \mathrm{LT}(f_3) &= x_3\\
\text{quadratic:}\quad
f_4 &= x_1^2 - b_{4,1}f_1 - b_{4,2}f_2 - b_{4,3}f_3, & \mathrm{LT}(f_4) &= x_1^2\\
f_5 &= x_2^2 - b_{5,1}f_1 - b_{5,2}f_2 - b_{5,3}f_3 - b_{5,4}f_4, & \mathrm{LT}(f_5) &= x_2^2\\
f_6 &= x_3^2 - b_{6,1}f_1 - b_{6,2}f_2 - b_{6,3}f_3 - b_{6,4}f_4 - b_{6,5}f_5, & \mathrm{LT}(f_6) &= x_3^2\\
f_7 &= x_1x_2 - \sum_{i=1}^{6} b_{7,i}f_i, & \mathrm{LT}(f_7) &= x_1x_2\\
f_8 &= x_1x_3 - \sum_{i=1}^{7} b_{8,i}f_i, & \mathrm{LT}(f_8) &= x_1x_3\\
f_9 &= x_2x_3 - \sum_{i=1}^{8} b_{9,i}f_i, & \mathrm{LT}(f_9) &= x_2x_3
\end{aligned}$$

Here $\mathrm{LT}$ indicates the Leading Term of that polynomial. If $x_1$, $x_2$, and $x_3$ are permuted, the general form of the polynomials will be different. The coefficients of the orthogonal polynomials, $\{b_{i,j};\ i = 1 \ldots n,\ j = 1 \ldots (i-1)\}$, can be computed using the discrete orthogonality condition as follows ($\bar{X}_k$ is the $k$th vector in the input data):

$$\sum_{k=1}^{N} f_i(\bar{X}_k)\,f_j(\bar{X}_k) = 0
\;\Longrightarrow\;
\sum_{k=1}^{N} f_j(\bar{X}_k)\,\mathrm{LT}_i(\bar{X}_k) - b_{i,j}\sum_{k=1}^{N} f_j^2(\bar{X}_k) = 0
\;\Longrightarrow\;
b_{i,j} = \frac{\displaystyle\sum_{k=1}^{N} f_j(\bar{X}_k)\,\mathrm{LT}_i(\bar{X}_k)}{\displaystyle\sum_{k=1}^{N} f_j^2(\bar{X}_k)}$$

After solving for these values, we can obtain the coefficients $\{a_i;\ i = 1 \ldots n\}$ of the fit

$$g(x_1, x_2, x_3) \approx \sum_{i=1}^{n} a_i f_i$$

using the following equation:

$$a_i = \frac{\displaystyle\sum_{k=1}^{N} f_i(\bar{X}_k)\,g(\bar{X}_k)}{\displaystyle\sum_{k=1}^{N} f_i^2(\bar{X}_k)}.$$
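In effect, the construction is a Gram-Schmidt orthogonalization of the leading terms over the sample points, followed by a projection of the response onto each $f_i$. The following Python sketch illustrates this reading; the data, the term ordering, and the added constant leading term are assumptions of the sketch, not Isight internals:

```python
import numpy as np

# Illustrative sketch: orthogonalize the leading terms over the sample
# points (computing the b_{i,j} as above), then project the response onto
# each f_i to get the a_i.

rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(50, 3))
X -= X.mean(axis=0)                       # zero-mean inputs, as assumed above
g = 1.0 + X[:, 0] + 0.5 * X[:, 1] * X[:, 2]   # stand-in response

# Leading terms LT(f_i) at the samples: a constant (added here so the mean
# of g is captured), linear, pure quadratic, then pairwise cross terms.
LT = [np.ones(len(X))]
LT += [X[:, i] for i in range(3)]
LT += [X[:, i]**2 for i in range(3)]
LT += [X[:, i] * X[:, j] for i in range(3) for j in range(i + 1, 3)]

F = []                                    # the orthogonal f_i as vectors
for lt in LT:
    f = lt.copy()
    for fj in F:
        b = (fj @ lt) / (fj @ fj)         # b_{i,j} = sum f_j LT_i / sum f_j^2
        f -= b * fj
    F.append(f)

a = np.array([(f @ g) / (f @ f) for f in F])  # a_i = sum f_i g / sum f_i^2
g_hat = sum(ai * f for ai, f in zip(a, F))
print("max abs fit error:", np.max(np.abs(g_hat - g)))  # ~machine precision
```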

The time taken to generate these successive orthogonal polynomials increases drastically with the number of input variables of the approximation. Therefore, this approximation is best suited for models with a small number of input variables and a large number of data points (or responses).