.. currentmodule:: sofia_redux.toolkit.fitting.polynomial Generally, polynomial algorithms in the :mod:`sofia_redux.toolkit.fitting` module either fit polynomial coefficients to a sample distribution, or evaluate polynomial coefficients at a given location. The :func:`polyfitnd` function is used to derive polynomial coefficients, while :func:`poly1d` or :func:`polynd` functions evaluate coefficients in 1 or N dimensions respectively. The :class:`Polyfit` class is capable of both deriving and evaluating polynomials. However, if one only wishes to derive and then evaluate a polynomial fit to data (i.e. resample) but does not require the actual coefficients using piecewise fits, it is recommended to instead refer to :mod:`sofia_redux.toolkit.resampling`, developed for just that purpose. Polynomial Coefficients ----------------------- In this scheme, coefficients are stored as an N-dimensional array with the size of each dimension equal to the polynomial degree (order + 1) set for that dimension. For example, if we wish to fit a set of 2-dimensional data with a 2nd order (degree=3) polynomial in x, and a 3rd order (degree=4) polynomial in y, an array of shape (3, 4) would be used to contain the polynomial coefficients necessary to model the data with the desired polynomial. Polynomial functions model distribution :math:`x` as: .. math:: f(x) = \sum_{p_{1}=0}^{o_{1}} \sum_{p_{2}=0}^{o_{2}} ... \sum_{p_{N}=0}^{o_{N}} a_{(p_{1},p_{2},...,p_{N})} x_{1}^{p_{1}} x_{2}^{p_{2}} ... x_{N}^{p_{N}} in :math:`N` dimensions where :math:`o_{i}` is the polynomial order fitted to dimension :math:`i`, and :math:`a` are the polynomial coefficients. For example, polynomial coefficients could be used to model the function .. math:: f(x) = 5 + 2x + 3xy + 0.5y^{2} as .. math:: a = \begin{bmatrix} 5 & 2 & 0 \\ 0 & 3 & 0 \\ 0.5 & 0 & 0 \end{bmatrix} using a second order polynomial in both x and y. The following example uses :func:`polyfitnd` to derive these coefficients: .. code-block:: python from sofia_redux.toolkit.fitting.polynomial import polyfitnd import numpy as np y, x = np.mgrid[:5, :5] z = 5 + (2 * x) + (3 * x * y) + (0.5 * y ** 2) a = polyfitnd(x, y, z, 2) assert np.allclose(a, [[5, 2, 0], [0, 3, 0], [0.5, 0, 0]]) These coefficients can then be evaluated using :func:`polynd`: .. code-block:: python from sofia_redux.toolkit.fitting.polynomial import polynd fitted = polynd(np.stack((y, x)), a) assert np.allclose(z, fitted) Both derivation and evaluation can be performed by the :class:`Polyfit` class: .. code-block:: python from sofia_redux.toolkit.fitting.polynomial import Polyfit pfit = Polyfit(x, y, z, 2) assert np.allclose(pfit.get_coefficients(), a) assert np.allclose(pfit(x, y), z) Please note however, that in all the above examples, we have used the default redundant polynomial representation such that no coefficients will be calculated for :math:`a` when :math:`\sum_{i=1}^{N}{p_i} > max(o)`. So for a redundant (default) polynomial fit of order 2 in both dimensions .. math:: a = \begin{bmatrix} 5 & 2 & 0 \\ 0 & 3 & - \\ 0.5 & - & - \end{bmatrix} where values marked with :math:`-` will always be set to zero since they are defined as redundant. This default redundancy behaviour may be overridden by explicitly providing terms for which coefficients must be derived. The format of these terms should be provided as an integer array of shape (n_terms, n_dimensions), with each value corresponding to the power of a single term in a specific dimension. For example, in 3-dimensions, the value [1, 2, 3] would result in a coefficient being calculated for term :math:`xy^2z^3`. The :func:`sofia_redux.toolkit.utilities.func.polyexp` may be used for the purposes of creating either full or redundant set of terms. .. code-block:: python from sofia_redux.toolkit.fitting.polynomial import polyexp # Create a redundant set of terms redundant = polyexp(2, ndim=2) print(redundant) # [[0 0] # [1 0] # [2 0] # [0 1] # [1 1] # [0 2]] The above redundant set results in a function of the form: .. math:: f(x) = a_{0,0} + a_{1,0}x + a_{2,0}x^2 + a_{0,1}y + a_{1,1}xy + a_{0,2}y^2 .. code-block:: python # Create a full set of terms full = polyexp([2, 2]) print(full) # [[0 0] # [1 0] # [2 0] # [0 1] # [1 1] # [2 1] # [0 2] # [1 2] # [2 2]] The above full set results in a function of the form: .. math:: f(x) = a_{0,0} + a_{1,0}x + a_{2,0}x^2 + a_{0,1}y + a_{1,1}xy + a_{2,1}x^2y + a_{0,2}y^2 + a_{1,2}xy^2 + a_{2,2}x^2y^2 Alternatively, terms may be explicitly defined by the user: .. code-block:: python user = [[2, 0], [1, 1]] In this case, the user has specified that coefficients should only by derived for the terms in the following function: .. math:: f(x) = a_{2,0}x^2 + a_{1,1}xy For example, attempting to fit the function .. math:: f(x) = 1 + 2xy + 0.1x^2y^2 can only be done with either a full or user defined set of terms since :math:`x^2y^2` will never appear in a redundant set of a 2nd order 2-dimensional polynomial function: .. code-block:: python y, x = np.mgrid[:5, :5] z = 1 + (2 * x * y) + (0.1 * x ** 2 * y ** 2) redundant_pfit = Polyfit(x, y, z, 2) print(redundant_pfit.get_coefficients().round(decimals=2)) # [[ 3.8 -3.2 0.6] # [-3.2 3.6 0. ] # [ 0.6 0. 0. ]] full_pfit = Polyfit(x, y, z, polyexp([2, 2]), set_exponents=True) print(full_pfit.get_coefficients().round(decimals=2)) # [[ 1. -0. 0. ] # [-0. 2. -0. ] # [ 0. -0. 0.1]] user_pfit = Polyfit(x, y, z, [[0, 0], [1, 1], [2, 2]], set_exponents=True) print(user_pfit.get_coefficients().round(decimals=2)) # [[1. 0. 0. ] # [0. 2. 0. ] # [0. 0. 0.1]] As can be seen above, the default redundant set of terms failed to derive the correct coefficients, but made its best attempt using those terms available. The full and user defined sets both returned the correct values. Note that in order to override the default redundant behaviour, `set_exponents=True` must be set as a keyword argument. Also note that while the redundant fit failed, the resulting fit is the "best fit" with the terms available as can be seen below: .. plot:: :include-source: import matplotlib.pyplot as plt import numpy as np from sofia_redux.toolkit.fitting.polynomial import Polyfit y, x = np.mgrid[:5, :5] z = 1 + (2 * x * y) + (0.1 * x ** 2 * y ** 2) redundant_pfit = Polyfit(x, y, z, 2) plt.figure(figsize=(5, 5)) plt.plot(z, label='Original') plt.plot(redundant_pfit(x, y), '--', label="Redundant fit") plt.xlabel('x') plt.ylabel('f(x, y)') plt.title("Redundant term set fit") plt.legend() Plots for the full and user defined sets would reproduce the original data. Fit Statistics -------------- In addition to performing and evaluating a polynomial fit, :class:`Polyfit` can also return useful statistics including covariance terms. The :attr:`Polyfit.stats` attribute is a namespace containing the following statistics: - `n`: Number of samples - `dof`: Degrees of Freedom - `fit`: An array containing the fit to the original data - `residuals`: An array containing the difference between the data and the fit - `sigma`: An array containing the error of each polynomial coefficient - `rms`: The root-mean-square error on the fit - `chi2`: The :math:`\chi^2` statistic on the fit - `rchi2`: The reduced :math:`\chi^2` statistic on the fit - `q`: Goodness of fit, or survival function. The probability (0->1) that one of the samples is greater than :math:`\chi^2` away from the fit. These can be accessed directly or printed to screen by applying the standard Python :func:`print` function to an initialized :class:`Polyfit` object. For example: .. plot:: :include-source: import numpy as np import matplotlib.pyplot as plt from sofia_redux.toolkit.fitting.polynomial import Polyfit # Create function x = np.linspace(0, 2, 256) y = -0.5 + (x - 2) ** 2 # add noise with a normal distribution rand = np.random.RandomState(42) noise = rand.normal(loc=0, scale=0.5, size=x.size) y += noise # Fit a 2nd order polynomial to the noisy data # Since we know the scale of the error, it may be included pfit = Polyfit(x, y, 2, error=0.5) fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(10, 5)) fig.tight_layout() ax[0].plot(x, y, '.', label="Samples") ax[0].plot(x, pfit.stats.fit, '-', label="Fit") ax[0].legend() ax[0].set_title("Polynomial fit to noisy data") ax[1].plot(x, pfit.stats.residuals, '.') ax[1].set_title("Residuals to the fit") print(pfit) Textual output:: Name: Polyfit Statistics -------------------------------- Number of original points : 256 Number of NaNs : 0 Number of outliers : 0 Number of points fit : 256 Degrees of freedom : 253 Chi-Squared : 240.743406 Reduced Chi-Squared : 0.951555 Goodness-of-fit (Q) : 0.300082 RMS deviation of fit : 0.485822 Exponents : Coefficients -------------------------------- (0,) : 3.396036 +/- 0.093022 (1,) : -3.840006 +/- 0.214879 (2,) : 0.958356 +/- 0.104002 Covariance ^^^^^^^^^^ The full covariance matrix is available in the :attr:`Polyfit.covariance` attribute. Note that this is an (n_coefficient, n_coefficient) giving the covariance between terms, with each axis ordered in the same manner as the terms. The order of the terms can be determined from the :attr:`Polyfit.exponents` attribute. For example: .. code-block:: python import numpy as np from sofia_redux.toolkit.fitting.polynomial import Polyfit # Create a function y, x = np.mgrid[:10, :10] z = 1 - (5 * x) + (3 * y) pfit = Polyfit(x, y, z, 1) print("Exponents:\n%s\n" % pfit.exponents) print("Covariance:\n%s" % pfit.covariance) Output:: Exponents: [[0 0] [1 0] [0 1]] Covariance: [[ 5.90909091e-02 -5.45454545e-03 -5.45454545e-03] [-5.45454545e-03 1.21212121e-03 4.33680869e-19] [-5.45454545e-03 -0.00000000e+00 1.21212121e-03]] In the above example, the rows and columns of the covariance matrix are ordered as :math:`a_{0,0}, a_{1, 0}, a_{0,1}` for the terms :math:`x^0y^0, x, y`. Disabling Statistics ^^^^^^^^^^^^^^^^^^^^ If statistics are not required, there is no need for a covariance matrix, and processing speed is important, statistical calculations may be disabled by setting `stats=False` and/or covar=False during intialization. If statistics are disabled, they will no longer be calculated or displayed. If covariance calculations are disabled, the :attr:`Polyfit.covariance` attribute will be set to `None`. .. code-block:: python import numpy as np from sofia_redux.toolkit.fitting.polynomial import Polyfit y, x = np.mgrid[:10, :10] z = 1 - (5 * x) + (3 * y) pfit = Polyfit(x, y, z, 1, stats=False, covar=False) assert pfit.covariance is None print(pfit) Output:: Name: Polyfit Exponents : Coefficients -------------------------------- (0, 0) : 1.000000 (1, 0) : -5.000000 (0, 1) : 3.000000 However, if robust outlier rejection is enabled, by necessity, both statistics and covariance will be calculated regardless of user wishes. Robust Outlier Rejection ^^^^^^^^^^^^^^^^^^^^^^^^ Robust outlier rejection may be enabled by setting `robust` > 0 during :class:`Polyfit` initialization. If set, outliers are identified through an iterative process: fitting the data then excluding any samples that fall outside `robust` sigmas from the fit before repeating. Iterations are terminated after: 1. A certain number of iterations have occurred. 2. The relative delta between successive residual RMS values is less than a set value. 4. Too many samples are excluded such that fitting the desired order of polynomial will not be possible. 5. The fit failed. Whether the iteration succeeded can be determined from the :attr:`Polyfit.success` attribute, while the termination condition encountered is in the :attr:`Polyfit.termination` attribute. Which samples were excluded during the rejection process can be determined from the :attr:`Polyfit.mask` attribute. Additional statistics will also be displayed if statistics are printed to screen. For example: .. plot:: :include-source: import numpy as np from sofia_redux.toolkit.fitting.polynomial import Polyfit # Create a function x = np.linspace(0, 5, 256) y = 1 + x rand = np.random.RandomState(42) # Add noise noise = rand.normal(loc=0, scale=0.5, size=y.shape) y += noise # Add some obvious outliers, throwing off the fit inds = rand.randint(0, 255, 25) y[inds] += 5 # Standard fit pfit = Polyfit(x, y, 1) # Robust fit with 3 sigma outlier rejection rfit = Polyfit(x, y, 1, robust=3) outliers = np.argwhere(~rfit.mask)[:, 0] plt.figure(figsize=(5, 5)) plt.plot(x, y, '.', label='Samples') plt.plot(x, pfit.stats.fit, label='Standard Fit') plt.plot(x, rfit.stats.fit, label='Robust Fit') plt.plot(x[outliers], y[outliers], 'x', color='red', label='Outliers') plt.title("Robust Outlier Rejection") plt.xlabel("x") plt.ylabel("f(x)") plt.legend() # Display robust statistics print(rfit) Output:: Name: Polyfit Statistics -------------------------------- Number of origial points : 256 Number of NaNs : 0 Number of outliers : 24 Number of points fit : 232 Degrees of freedom : 230 Chi-Squared : 53.825532 Reduced Chi-Squared : 0.234024 Goodness-of-fit (Q) : 0.000000 RMS deviation of fit : 0.482712 Outlier sigma threshold : 3 eps (delta_sigma/sigma) : 0.01 Iterations : 3 Iteration termination : delta_rms = 0 Exponents : Coefficients -------------------------------- (0,) : 0.941421 +/- 0.129497 (1,) : 1.026215 +/- 0.045540 The above report indicates 24 outliers were found in which the residual was greater than the :math:`3\sigma` limit. 3 iterations were required, terminated when no further outliers could be found (delta_rms = 0). Special Case 1-D and 2-D Functions ---------------------------------- For quick and easy derivation and evaluation of polynomial coefficients in one or two dimensions, several functions are available that deviate from the more general API presented above. 1-Dimensional Polynomial Derivation and Evaluation ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ For fitting a 1-dimensional polynomial to data where no special considerations are necessary, there are already many functions available in standard Python packages such as :func:`numpy.polyfit`, so no attempt has been made to recreate another version. :func:`polyfit` is perfectly capable of handling 1 dimensional cases with the added benefits of :class:`sofia_redux.toolkit.fitting.polynomial.Polyfit` functionality. However, for 1-dimensional evaluation, :func:`poly1d` is available to evaluate coefficients, and optional variance if either the full covariance matrix is provided, or the diagonal of the covariance matrix is given. .. plot:: :include-source: import numpy as np import matplotlib.pyplot as plt from sofia_redux.toolkit.fitting.polynomial import poly1d, polyfitnd x = np.linspace(0, 4 * np.pi, 1000) y = np.sin(x) error = np.random.RandomState(41).normal(loc=0, scale=0.1, size=x.size) y += error # use polyfitnd to fit a polynomial and get the covariance on the fit # coefficients coeffs, cvar = polyfitnd(x, y, 7, covar=True) yfit, yvar = poly1d(x, coeffs, covar=cvar) error = np.sqrt(yvar) plt.figure(figsize=(7, 5)) plt.plot(x, y, '.', markersize=3, label='data', color='blue') plt.fill_between(x, yfit - error, yfit + error, color='red', label=r'$1\sigma$ fit error') plt.plot(x, yfit, label='fit', color='lime') plt.legend(loc='lower left') plt.title("7th order polynomial fit and fit error") plt.xlabel('x') plt.ylabel('y') Please note that :func:`polynd` and :class:`Polyfit` are both capable of producing the same results as :func:`poly1d`. However, :func:`poly1d` is a lighter version using :func:`numpy.poly1d` to evaluate coefficients rather than the engine used by :func:`polynd`. 2-Dimensional Polynomial Derivation and Evaluation ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ For 2-dimensional data, the :func:`polyfit2d` and :func:`poly2d` distinguish between the full set of polynomial coefficients, and the more robust redundancy controlled term coefficients in the same way as :func:`polyfitnd`. This distinction is controlled by the `full` keyword. The following table gives an example of the coefficients for a 2\ :sup:`nd` order polynomial with `full=True` and `full=False`. Note that the order of coefficients in the output table is different from :func:`polyfitnd`, with x coefficients running along columns and y coefficients by rows, where the exponent of the coefficient term is given by it's index. The order of polynomial must be given separately for the x and y directions via the `kx` and `ky` keyword arguments (default=2). .. list-table:: **Polynomial Coefficients** (``full=True``) :header-rows: 0 * - - **x**\ :sup:`0` - **x**\ :sup:`1` - **x**\ :sup:`2` * - **y**\ :sup:`0` - c\ :sub:`0,0` - c\ :sub:`1,0` - c\ :sub:`2,0` * - **y**\ :sup:`1` - c\ :sub:`0,1` - c\ :sub:`1,1` - c\ :sub:`2,1` * - **y**\ :sup:`2` - c\ :sub:`0,2` - c\ :sub:`1,2` - c\ :sub:`2,2` .. list-table:: **Polynomial Coefficients** (``full=False``) :header-rows: 0 * - - **x**\ :sup:`0` - **x**\ :sup:`1` - **x**\ :sup:`2` * - **y**\ :sup:`0` - c\ :sub:`0,0` - c\ :sub:`1,0` - c\ :sub:`2,0` * - **y**\ :sup:`1` - c\ :sub:`0,1` - c\ :sub:`1,1` - * - **y**\ :sup:`2` - c\ :sub:`0,2` - - 2-Dimensional Examples ^^^^^^^^^^^^^^^^^^^^^^ - :func:`polyfit2d` derives polynomial coefficients describing a surface: >>> import numpy as np >>> from sofia_redux.toolkit.fitting.polynomial import polyfit2d >>> y, x = np.mgrid[:5, :5] >>> z = 0.5 + (0.4 * x) + (0.3 * y) + (0.2 * x * y) + (0.1 * x ** 2) >>> c = polyfit2d(x, y, z, kx=2, ky=2) >>> print(np.abs(np.round(c, decimals=3))) [[0.5 0.4 0.1] [0.3 0.2 0. ] [0. 0. 0. ]] - :func:`poly2d` evaluates 2-dimensional polynomial coefficients: >>> from sofia_redux.toolkit.fitting.polynomial import poly2d >>> np.allclose(poly2d(x, y, c), z) True - :func:`polyinterp2d` derivates and evaluates 2-D polynomial coefficients: >>> from sofia_redux.toolkit.fitting.polynomial import polyinterp2d >>> np.allclose(polyinterp2d(x, y, z, x, y, ky=2, kx=2), z) True These 2-dimensional functions are designed as lighter versions of :func:`polyfitnd` and :func:`polynd`, so should fit and evaluate data slightly faster.