Mask and NaN Image Filling¶

There are two different algorithms for filling holes in an image. Holes are defined by either a 2-D array mask of bool values, the same shape as the image, in which False indicates a hole or bad value, and True indicates a value that is valid. The alternative is to use NaN values in the image itself to identify holes in the image that should be filled in.

Aside from the functions mentioned here, the sofia_redux.toolkit.convolve module contains classes and functions that would perform similarly. However, the user would have to select the correct kernel for their purposes and ensure it is sufficiently large enough to cover the largest hole.

2-D Clough-Tocher Interpolation (`image_naninterp()`)¶

image_naninterp() implements the Clough-Tocher scheme which first creates a triangulation of the available (non-masked/finite) data values in the image. For a single triangle, a 12 term cubic polynomial is created consisting of the value and derivatives at each vertex (\(f, f_x^\prime, f_y^\prime\)) and the cross-boundary derivatives (\(\partial f / \partial n\)) at the edge midpoints.

The final result can be evaluated fairly quickly as a piecewise polynomial, \(C^1\) surface with quadratic precision. Unfortunately, since this is only generated for the convex hull of points, extrapolation beyond the convex hull is not possible. Therefore, if a hole exist at the edge of the image, it may not be able to be filled using this method, or the same interpolation rules cannot be guaranteed.

import numpy as np
import matplotlib.pyplot as plt
from sofia_redux.toolkit.image.fill import image_naninterp
from sofia_redux.toolkit.utilities.func import julia_fractal
from astropy.modeling.models import Gaussian2D

image = julia_fractal(300, 300)
g = Gaussian2D(x_mean=250, y_mean=250, x_stddev=10, y_stddev=10)
image += g(*np.meshgrid(np.arange(300), np.arange(300)))
original = image.copy()

# add a few single pixel holes
rand = np.random.RandomState(41)
mask = rand.rand(*image.shape[:2]) < 0.1
image[mask] = np.nan

# add some larger holes
image[210:235, 0:25] = np.nan  # edge example
image[100:140, 160:200] = np.nan  # large structure example
image[230:255, 230:255] = np.nan  # smooth example
bad = image.copy()
image = np.clip(image_naninterp(image), 0, 1)

fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(15, 5))
c = 'cubehelix' #'jet'
ax[0].imshow(original, cmap=c)
ax[0].set_title("Original Image")
ax[1].imshow(bad, cmap=c)
ax[1].set_title("Holey Image")
ax[2].imshow(image, cmap=c)
ax[2].set_title("Corrected Image (image_naninterp)")

(Source code, png, hires.png, pdf)

In the above example it can be seen that all small holes were filled fairly well, but the larger holes near the left and middle of the image show obviously incorrect structure, with severe striations near the edge. However, the smooth feature to the bottom-right was reconstructed nicely. Bear in mind that nan_interp() can only fit to quadratic precision, so the missing section of the Gaussian near the bottom-right was fit adequately, but it would be impossible to correct for fine structure over a large scale. Therefore, care should be taken when using nan_interp() on noisy images or when prominent small scale features are present.

Iterative Filling (`maskinterp()`)¶

The maskinterp() function is a Python adaption of the IDL spextool function of the same name with added functionality. The function takes an image in which bad values are marked by either NaNs, or a supplied mask, but not both. Be careful if supplying a mask to an image containing NaNs where a mask value of True corresponds to a NaN value in the image.

The algorithm works iteratively: Each bad pixel is corrected using the surrounding pixels within a circular aperture centered over the bad pixel. The function used for this task is up to the user (polynomial fit by default). There are three conditions that need to be met before a replacement value for the bad pixel is determined:

1. There are enough good (finite/unmasked) pixels within the aperture to perform a successful fit. For example, a 3rd order polynomial fit requires 16 pixels (4 in the x and y directions). The minimum number of points allowable is set by the minpoints parameter.

2. The bad pixel is within a certain distance from the center-of-mass of good pixels within the aperture. For example, this would stop extrapolation in the case where there were enough good pixels to theoretically perform a fit, but they were all to one side of the bad pixel. The default distance limit is 1 pixel and may be set with the cdis parameter.

3. The fraction of unmasked/finite pixels inside the aperture must be greater than minfrac (default=0.2).

(Source code, png, hires.png, pdf)

If a fit cannot be performed due to one of the above limitations, the aperture radius is increased by apstep pixels (default=1) and the procedure is repeated. Iteration is halted once all bad values have been filled or the aperture radius is greater than maxap. If a pixel cannot be filled, it is replaced with cval (NaN by default).

Example¶

import matplotlib.pyplot as plt
import numpy as np
import imageio
from sofia_redux.toolkit.image.fill import maskinterp

image = imageio.imread('imageio:camera.png').astype(float)
image /= image.max()
original = image.copy()
rand = np.random.RandomState(41)
badpix = rand.rand(100, 100) > 0.5
cut = image[75:175, 180:280]
cut[badpix] = np.nan

result = maskinterp(image, kx=2, ky=2, minpoints=9)
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(15, 5))
c = 'copper'
ax[0].imshow(original, cmap=c)
ax[0].set_title("Original Image")
ax[1].imshow(image, cmap=c)
ax[1].set_title("Holey Image")
ax[2].imshow(result, cmap=c)
ax[2].set_title("Corrected Image (maskinterp)")
for a in ax:
    a.set_xlim(165, 295)
    a.set_ylim(190, 60)

(Source code, png, hires.png, pdf)

The above example fits a second order polynomial using the maskinterp algorithm. We require that a minimum of 9 points is used to fit good points (3 in X * 3 in Y).

Maskinterp Fitting Function¶

The fitting function can be set with the func parameter. There are two types of function available:

Statistical: fit = f(pixel_values, **kwargs)
Dependent: fit = f(pixel_values, cin, cout, **kwargs)

Statistical functions only require the pixel values within the aperture, while dependent functions require the input pixel coordinates and output pixel coordinates. For maskinterp() cin and cout must always be (N, 2) arrays where cin[:, 0] contains the y coordinates of the input pixels, and cin[:, 1] contains the x coordinates of the input pixels. i.e., create a fit based on the data values and cin, then interpolate onto any values contained in cout.

For example, the following function defines the default polynomial fitting function:

def spline_interp_2dfunc(d, cin, cout, **kwargs):
    return polyinterp2d(
        cin[:, 1], cin[:, 0], d, cout[:, 1], cout[:, 0], **kwargs)

A statistical function could be defined as shown in the following example:

median_filled = maskinterp(image, func=np.median, statistical=True)
plt.figure(figsize=(5, 5))
plt.imshow(median_filled[60:190, 165:295], cmap='copper')
plt.title("Maskinterp with user defined median function")

(Source code, png, hires.png, pdf)

In order to declare that the function being used does not rely on input/output coordinates, statistical=True must be set.

Performance¶

The processing time required for maskinterp() is highly dependent on the number of masked points in an image, the scale of any holes (number of iterations), and the function used to fit. image_naninterp() is only dependent on the size of the image. Therefore, if performance is a consideration, testing is required. Otherwise, the sofia_redux.toolkit.convolve module also allows replacement of masked/NaN values through convolution of valid points with a given kernel including a Savitzky-Golay polynomial approximation.

Polygon Filling¶

The polyfillaa() function is used find all pixels within a given polygon, or sets of multiple polygons. It can optionally also return the area of the pixel enclosed by a/each polygon.

Output results¶

By default all pixels on an infinite grid contained or partially contained by a polygon will be reported in the output results. If this is not desired, the limits of the grid should be set by the xrange and yrange parameters.

If a single polygon is supplied, the pixels contained within will be returned as an (N, 2) array of N total pixels where result[i, 0] corresponds to the y-coordinate of pixel i and result[i, 1] corresponds to the x-coordinate of pixel i. If the area is returned, it will be reported as a 1-dimensional array of shape (N) where area[i] is the area of pixel i enclosed by the polygon.

If multiple polygons were supplied, the output result is a dict where the key corresponds to the polygon index and the values are the same as those described for the single polygon output. For example, result[0] would contain the results for the first polygon specified by the user.

Single Polygons¶

The px and py arguments are used to pass in the x and y coordinates of the polygon vertices. For example,

import matplotlib.pyplot as plt
import numpy as np
from sofia_redux.toolkit.image.fill import polyfillaa

# Define a polygon
px = [5, 3, 9]
py = [2, 10, 5]

pixels, areas = polyfillaa(px, py, area=True)
grid = np.full((11, 11), np.nan)
grid[pixels[:, 0], pixels[:, 1]] = areas

fig, ax = plt.subplots(figsize=(7, 5))
ax.set_xticks(np.arange(-0.5, 10, 1))
ax.set_yticks(np.arange(-0.5, 10, 1))
ax.set_xticklabels(np.arange(11))
ax.set_yticklabels(np.arange(11))

ax.grid(which='major', axis='both', linestyle='--',
        color='k', linewidth=1)
img = ax.imshow(grid, cmap='cividis', origin='lower')

ax.plot(np.array(px + [px[0]]) - 0.5, np.array(py + [py[0]]) - 0.5,
        '-o', color='red', linewidth=3, markersize=10)
fig.colorbar(img, ax=ax)
ax.set_title("Pixels within polygon and fractional area")

(Source code, png, hires.png, pdf)

In the above plot, the fraction of pixel within the triangle is color coded.

Multiple Polygons¶

If multiple polygons are to be calculated, there are two ways of indicating which points belong to which polygons. The first method allows px and py to still be provided as 1-dimensional arrays, but an additional parameter start_indices, is used to provide the starting index of each polygon within. For example:

px = [0, 0, 2, 2, 4, 2, 8, 12, 12, 14, 14]
py = [0, 2, 2, 0, 2, 9, 5,  4,  5,  5,  4]
#    |-- 0 ----||--- 1 --||------2-------|
start_indices = [0, 4, 7]

The second method uses a 2-level nested list, where each sublist contains the points for a polygon. For example:

px = [[0, 0, 2, 2], [4, 2, 8], [12, 12, 14, 14]]
py = [[0, 2, 2, 0], [2, 9, 5], [4,  5,  5,  4]]

The above examples all define the same shapes. The first is a square with vertices at (0, 0) (0, 2) (2, 2) (2 0), the second is a triangle (5, 2) (3, 9) (9, 5), and the list is a rectangle (100, 300) (100, 400) (200, 400) (200, 300).

Results will be output as a dict as discribed above. For example:

from sofia_redux.toolkit.image.fill import polyfillaa
import numpy as np
import matplotlib.pyplot as plt

px = [[0, 0, 2, 2], [4, 2, 8], [10, 10, 12, 12]]
py = [[0, 2, 2, 0], [2, 9, 5], [4,  5,  5,  4]]

# Add a pentagon
def mypoly(x, y, r, n):
    ang = (np.arange(n) + 1) * 2 * np.pi / n
    return list(r * np.cos(ang) + x), list(r * np.sin(ang) + y)

hx, hy = mypoly(10.5, 10.5, 3, 5)
px.append(hx)
py.append(hy)

result, areas = polyfillaa(px, py, area=True)

grid = np.full((15, 15), np.nan)
for i in range(len(result.keys())):
    grid[result[i][:,0], result[i][:,1]] = areas[i]

fig, ax = plt.subplots(figsize=(7, 5))
ax.set_xticks(np.arange(-0.5, 15, 1))
ax.set_yticks(np.arange(-0.5, 15, 1))
ax.set_xticklabels(np.arange(16))
ax.set_yticklabels(np.arange(16))

ax.grid(which='major', axis='both', linestyle='--',
        color='k', linewidth=1)
img = ax.imshow(grid, cmap='cividis', origin='lower')

for i in range(len(px)):
    x = px[i] + [px[i][0]]
    y = py[i] + [py[i][0]]
    ax.plot(np.array(x) - 0.5, np.array(y) - 0.5,
            '-o', color='red', linewidth=3, markersize=7)

fig.colorbar(img, ax=ax)
ax.set_title("Multiple polygons and fractional area")

(Source code, png, hires.png, pdf)

Note, that the center of a pixel cell is defined at (x + 0.5, y + 0.5) while (x, y) is defined as the bottom-left corner of a pixel cell. matplotlib defined (x, y) as the center of a cell, which is why there are various 0.5 corrections dotted throughout the examples above for plotting.

Navigation

Mask and NaN Image Filling¶

2-D Clough-Tocher Interpolation (image_naninterp())¶

Iterative Filling (maskinterp())¶

Example¶

Maskinterp Fitting Function¶

Performance¶

Polygon Filling¶

Output results¶

Single Polygons¶

Multiple Polygons¶

2-D Clough-Tocher Interpolation (`image_naninterp()`)¶

Iterative Filling (`maskinterp()`)¶