Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Having trouble with natural neighbor gridding #346

Open
SBFRF opened this issue Mar 13, 2017 · 32 comments
Open

Having trouble with natural neighbor gridding #346

SBFRF opened this issue Mar 13, 2017 · 32 comments
Labels
Area: Gridding Pertains to calculating values on a regular grid Type: Bug Something is not working like it should

Comments

@SBFRF
Copy link

SBFRF commented Mar 13, 2017

Hello,
I'm trying to do a natural neighbor interpolation. I'm a new user to the library, so maybe I'm doing something stupid. I've tried a couple files, and I thought also it might be that I can't hand it a 0 value, that didnt' seem to fix the problem. It seems as though the algorithms are producing redundant points and trying to triangulate them So not being all that familiar with the internals of computing the delaunay triangles I was hoping some one could help me out. Is there something I can preprocess to prevent this?

@jrleeman
Copy link
Contributor

Hello! What kind of data/format are you using? Is there a minimal example of the NN interpolation failing that we can use as a jumping off point on this?

@SBFRF
Copy link
Author

SBFRF commented Mar 13, 2017

sure lemme write out the variables to a text file, and I'll attach them

@SBFRF
Copy link
Author

SBFRF commented Mar 13, 2017

Attached are 3 files, the xnodes(modelGridX = gridX.txt), ynodes(modelGridY=gridY.txt) of the grid and the input data in csv (xIn xyzGridparms.txt[0], yIn=xyzGridparms.txt[1], zIn=xyzGridparms.txt[2])

    modelXX, modelYY = np.meshgrid(modelGridX, modelGridY)
    newBathyGrid = gridding.natural_neighbor(xIn, yIn, zIn, modelXX, modelYY)

gridX.txt
gridY.txt
xyzGridparms.txt

@jrleeman
Copy link
Contributor

Thanks - having a look now.

@jrleeman
Copy link
Contributor

Are you seeing a bunch of warnings and finally:

    205 
    206     if d_div == 0:
--> 207         raise ZeroDivisionError
    208 
    209     d_inv = 0.5 / d_div

ZeroDivisionError:

If so - I can reproduce. Using interpolate I can get linear interpolation just fine.

@dopplershift
Copy link
Member

Sorry it's not working for you--we're still working out the kinks. Thanks for the sample data--that should make it easier to see where it's failing.

@ahaberlie Any insight you could add on why it would hit that particular error?

@jrleeman
Copy link
Contributor

jrleeman commented Mar 13, 2017

I can get it to work for about 10 data points (with a warning resulting from zero division), above that it blows up. More error dump:

Grids less than 10km may be slow to load at synoptic scale.
Error during processing of a grid. Interpolation will continue but be mindful of errors in output. qhull precision warning: 
The initial hull is narrow (cosine of min. angle is 1.0000000000000000).
Is the input lower dimensional (e.g., on a plane in 3-d)?  Qhull may
produce a wide facet.  Options 'QbB' (scale to unit box) or 'Qbb' (scale
last coordinate) may remove this warning.  Use 'Pp' to skip this warning.
See 'Limitations' in qh-impre.htm.
QH6114 qhull precision error: initial simplex is not convex. Distance=-5.6

While executing:  | qhull i Qt
Options selected for Qhull 2015.2.r 2016/01/18:
  run-id 1952136003  incidence  Qtriangulate  _pre-merge  _zero-centrum
  _max-width 1.3e+17  Error-roundoff 86  _one-merge 4.3e+02
  _near-inside 2.1e+03  Visible-distance 1.7e+02  U-coplanar-distance 1.7e+02
  Width-outside 3.4e+02  _wide-facet 1e+03  _narrow-hull  0

precision problems (corrected unless 'Q0' or an error)
      1 flipped facets

The input to qhull appears to be less than 2 dimensional, or a
computation has overflowed.

Qhull could not construct a clearly convex simplex from points:
- p2(v2):    52 2.4e+02
- p0(v1): 1.3e+17    -0
- p1(v0):    49 2.3e+02

The center point is coplanar with a facet, or a vertex is coplanar
with a neighboring facet.  The maximum round off error for
computing distances is 86.  The center point, facets and distances
to the center point are as follows:

center point 4.267e+16    158.4

facet p0 p1 distance= -1.9
facet p2 p1 distance= -3.8e+16
facet p2 p0 distance= -1.9

These points either have a maximum or minimum x-coordinate, or
they maximize the determinant for k coordinates.  Trial points
are first selected from points that maximize a coordinate.

The min and max coordinates for each dimension are:
  0:     48.96  1.28e+17  difference= 1.28e+17
  1:        -0     240.4  difference= 240.4

If the input should be full dimensional, you have several options that
may determine an initial simplex:
  - use 'QJ'  to joggle the input and make it full dimensional
  - use 'QbB' to scale the points to the unit cube
  - use 'QR0' to randomly rotate the input for different maximum points
  - use 'Qs'  to search all points for the initial simplex
  - use 'En'  to specify a maximum roundoff error less than 86.
  - trace execution with 'T3' to see the determinant for each point.

If the input is lower dimensional:
  - use 'QJ' to joggle the input and make it full dimensional
  - use 'Qbk:0Bk:0' to delete coordinate k from the input.  You should
    pick the coordinate with the least range.  The hull will have the
    correct topology.
  - determine the flat containing the points, rotate the points
    into a coordinate plane, and delete the other coordinates.
  - add one or more points to make the input full dimensional.

@dopplershift
Copy link
Member

Definitely getting some weird points in Qhull. I can say with certainty that this has been tested with plenty of points, so that's not the issue by itself. I wonder if there's something weird in the input points. @SBFRF Can you shine more light on the source of the input locations?

@jrleeman
Copy link
Contributor

I have make sure all values in my array of 10 points are positive values (shouldn't be an issue) and have obtained a slightly different message: Error during processing of a grid. Interpolation will continue but be mindful of errors in output. QH6114 qhull precision error: initial simplex is not convex. Distance=0

@dopplershift
Copy link
Member

@SBFRF it looks like the input data are on some kind of grid already, is this correct?

While I feel like the code should work for that case, you may be better off from a performance perspective just using linear interpolation. Natural neighbor gives the most benefits for irregularly-spaced points.

@jrleeman
Copy link
Contributor

jrleeman commented Mar 13, 2017

Looks like a pseudo-grid. Looks like ship tracks? I see a few deviations and grid spacing changes. Either way with 36k + points NN will take a LONG time.

data_coords

@dopplershift
Copy link
Member

Ok. With that spacing, though, linear is still just as good as natural neighbor.

@SBFRF
Copy link
Author

SBFRF commented Mar 15, 2017

Thanks but I'm trying to create an automated, universal process in which sometimes the data are gridded (at different resolutions) and sometimes they data are gridded and in transects.

@jrleeman
Copy link
Contributor

Sorry you're having trouble with this. I'm continuing to look into it this morning and hopefully we can get a workable solution. Thank you for your patience!

@jrleeman
Copy link
Contributor

Quick update - Error is resulting from the calculation of the triangles when the grid requested goes up against the edge of the data domain. Basically, we're making triangles that have identical y coordinates, but separate x coordinates. Verifying that changing the grinding domain works with your data now. I'll post an example and the dive in for the root cause in the triangles bits.

@jrleeman
Copy link
Contributor

jrleeman commented Mar 15, 2017

Ok - after playing some and chatting with @dopplershift we have it nailed down. The grid you are requesting has points in common with the data, resulting in malformed triangles. We're looking at what we can do to remedy this, but using a different grid things work fine. If you slightly offset your grid points it should work. Give this a shot and let me know:

import numpy as np
import matplotlib.pyplot as plt
from metpy.gridding import natural_neighbor

x_grid = np.loadtxt('SBFRF_Grid/gridX.txt', delimiter=',', usecols=[0])
y_grid = np.loadtxt('SBFRF_Grid/gridY.txt', delimiter=',', usecols=[0])
data_x, data_y, data_z = np.loadtxt('SBFRF_Grid/xyzGridparms.txt',
                                    delimiter=',', usecols=[0,1,2], unpack=True)

# Offset grid from data
x_grid += 0.5
y_grid += 0.5

x_grid, y_grid = np.meshgrid(x_grid, y_grid)
gz = natural_neighbor(data_x, data_y, data_z, x_grid, y_grid)

plt.imshow(gz)
plt.show()

Gridding to whole numbers will do it as well since they don't lie in your data. The initial thoughts about being outside of the domain were a red herring.

@jrleeman
Copy link
Contributor

@dopplershift : Here's the test case I made for this - at the very least we should raise an exception or warning that there are grid and data points in common, leading to failure. Ideally we can fix it to deal with those cases.
https://gist.github.com/jrleeman/0c09c66511075b163eba2e616c1aeccf

@ahaberlie
Copy link
Contributor

@dopplershift @jrleeman very nice catch!

This was my fault (yes, blame the intern) for assuming input data would be comprised of irregularly spaced observations, as @dopplershift mentioned, and not checking for coordinate conflicts or at least mentioning the caveat, since that would definitely result in an epic failure.

That being said, I can't think of a good way to address coordinate conflicts. An offset in one direction could then create a new conflict.

@jrleeman
Copy link
Contributor

I was thinking to check for them before starting the whole process. If identical coordinates are found, assign them the value that was measured there (since it really is truth), and remove it from the requested coordinates. Implementation may be a bit of a pain, but seems reasonable?

@dopplershift
Copy link
Member

Since the implementation isn't vectorized (we're looping over all request points IIRC), you could just special case it there. The challenge is defining "identical" for floating point, esp. in the context of the algorithm.

@ahaberlie
Copy link
Contributor

ahaberlie commented Mar 15, 2017

I think maybe this is a great case for units, because determining the tolerance for something like numpy.isclose would depend a lot on the grid spacing. For example, three decimal places if using kilometers (~1 meter) is not nearly as close as three decimal places when using meters (1 mm). Until xarray/units are implemented, might have to punt on that though and just use an ad hoc tolerance that seems reasonable.

And if 1 meter doesn't seem significant, the coordinates @SBFRF is using, if in meters, represent about ~5 m grid spacing with some of the obs spaced as low as 12 m from what I've seen. This could really influence the nn interpolation values for the expected grid coordinates.

Careful though--I did a bit of testing on the @jrleeman code and found an offset value of 0.00000001 still results in a failure.

@dopplershift
Copy link
Member

So floating point gives you 6 digits of precision (I think 15 for double), so the absolute magnitude of differences shouldn't be an issues--just the relative spacing. (i.e. cm-scale triangles are the same as m-scale triangles to the algorithm)

@jrleeman
Copy link
Contributor

Catching the zero division exception and placing a NaN in the output works, but results in pages of warnings.

@dopplershift
Copy link
Member

So would it be possible to easily do a linear interpolation somewhere in there?

@jrleeman jrleeman added this to the 0.5.0 milestone Mar 16, 2017
@SBFRF
Copy link
Author

SBFRF commented Mar 16, 2017

@jrleeman So i tried the code you posted, that seemed to work for me with my data. @ahaberlie the data are measured in meters. I appreciate all of the help here!

@jrleeman
Copy link
Contributor

Excellent! We're happy to help! If you have no data precisely on the even meter, I'd grid to that. Given your spacing here NN and linear will probably give about the same results. It would be an interesting comparison.

We are going to improve this based on this issue - with either fixed interpolation for points on an edge like this and/or with a more useful warning. Hopefully we'll squeeze it into 0.5 coming out at the end of the month.

@dopplershift dopplershift added Area: Gridding Pertains to calculating values on a regular grid Type: Bug Something is not working like it should labels Mar 16, 2017
@ahaberlie
Copy link
Contributor

@SBFRF that is great news! How long, roughly, did it take to process your entire domain?

I would like to echo what @jrleeman said, bringing up issues like these are a huge help! Thank you!

@SBFRF
Copy link
Author

SBFRF commented Mar 17, 2017

@jrleeman "the worked on my data" unfortunately was not a solution to my problem, rather i was confirming that i was getting the same results as you. I'm kind of limited to my grid locations need to be the same as grid (gridx, gridY) otherwise, as often as i'm doing this domain update, my grid would migrate down the coast.

@ ahaberlie it took ~3 minutes 45 seconds to grid the data on a quad core E5430@2.66Ghz (old pc)

@jrleeman
Copy link
Contributor

@SBFRF In that case, linear interpolation should yield the same result. Natural neighbor interpolation can't have interpolation points that lie directly on Delaunay edges which is what results here.

@dopplershift
Copy link
Member

@SBFRF I'm not sure what your timeline is, but we do plan on addressing this. It likely won't be in the short term as we're focused on a series of upcoming workshops. After things are in shape for that (say mid-April) we may be able to focus on a proper fix for this--it's just not a simple fix.

@SBFRF
Copy link
Author

SBFRF commented Mar 17, 2017

@dopplershift Thanks, I'll work on something in the intermediate term. but keep me posted as you all get a fix in place i'd love to work it into some testing that i'm doing. again many thanks!

@jrleeman jrleeman modified the milestones: 0.5.0, Summer 2017 Mar 20, 2017
@dopplershift dopplershift modified the milestones: 0.6, Fall 2017 Jul 19, 2017
@jrleeman jrleeman modified the milestones: Fall 2017, Winter 2017 Oct 26, 2017
@jrleeman jrleeman removed this from the 0.7 milestone Nov 15, 2017
@deeplycloudy
Copy link
Collaborator

deeplycloudy commented Oct 14, 2019

I found some more weirdness with NN interpolation. With the same input arrays, linear and nearest work fine, while I get a nearly uniform field for NN. I am taking polar coordinate data and oversampling to 3x the range resolution on a cartesian grid. The figure makes that obvious. All NN values are in the range -17.00046 to -16.40036.

I attach the data in question. MetPy version is 0.11.0

image

for interp_method in ('linear', 'nearest', 'natural_neighbor'):
    interp = interpolate_to_points(data_ctrs[subset,:],
                                   interp_var.data.flatten()[subset], 
                                   interp_ctrs,                    
                                   interp_type=interp_method)
np.savetxt('data_locations.txt', data_ctrs[subset,:])
np.savetxt('data_values.txt', interp_var.data.flatten()[subset])
np.savetxt('interp_locations.txt', interp_ctrs)

data_locations.txt
data_values.txt
interp_locations.txt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Gridding Pertains to calculating values on a regular grid Type: Bug Something is not working like it should
Projects
None yet
Development

No branches or pull requests

5 participants