Measuring distance for OSM dataset, paper #77

Open
micvbang opened this issue Dec 12, 2016 · 2 comments

Comments

micvbang commented Dec 12, 2016

Hello Simba developers,

I am currently writing my master's thesis on parallelization of spatial queries. During this work, I am considering using Simba as one of the DBMSs to test.

In your paper, you show how Simba compares to other systems when performing spatial joins on the OSM dataset. As far as I can tell, Simba does not support geospatial queries, only spatial queries. How have you ensured that the distance of your spatial join is consistent across systems, i.e. that every system searches for objects that are within exactly 1000 m of each other?

Have you, for example, used an equirectangular map projection, allowing you to estimate distances using Euclidean distance across all systems? From your paper, I cannot figure out how this was done!

Thank you,
Michael
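
For readers following along, the equirectangular idea raised in the question can be sketched roughly as below. This is only an illustration, not how the paper's experiments were run: the function names, the choice of a single reference latitude, and the mean Earth radius constant are all assumptions made for this example.

```python
import math

# Assumed constant: mean Earth radius in metres (not taken from the thread).
EARTH_RADIUS_M = 6_371_008.8


def equirectangular_xy(lat_deg, lon_deg, ref_lat_deg):
    """Project (lat, lon) in degrees to planar metres.

    Longitudes are scaled by cos(reference latitude), so the result is
    only a good approximation close to that latitude.
    """
    x = EARTH_RADIUS_M * math.radians(lon_deg) * math.cos(math.radians(ref_lat_deg))
    y = EARTH_RADIUS_M * math.radians(lat_deg)
    return x, y


def planar_distance_m(p, q, ref_lat_deg):
    """Euclidean distance in metres between two (lat, lon) points after projection."""
    x1, y1 = equirectangular_xy(p[0], p[1], ref_lat_deg)
    x2, y2 = equirectangular_xy(q[0], q[1], ref_lat_deg)
    return math.hypot(x2 - x1, y2 - y1)
```

Once every system works on the projected x/y values, a plain Euclidean threshold of 1000 means roughly 1000 m near the reference latitude; the error grows as the data moves away from it.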

@dongx-psu
Member

Simba is currently an experimental system built on Euclidean space. Thus, our current implementation does not use any equirectangular map projection to calculate distances in different coordinate systems.

@micvbang
Author

Sorry, I forgot to reply to this issue.

Thank you for the quick response!

I think my solution will be to use the filter-and-refine approach. First, I will filter my data using Simba's CIRCLERANGE, which gives an approximate distance between points when they are stored in a projection with units in meters, e.g. EPSG:3857. Then I will refine the candidates using a more expensive and exact distance calculation, e.g. the Haversine formula, implemented as a UDF, perhaps using an existing implementation such as spatial4j.
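
A minimal sketch of that filter-and-refine plan in plain Python, in case it helps others. It is an illustration under stated assumptions, not Simba code: the list comprehension in the filter step merely stands in for a CIRCLERANGE query, and `web_mercator_xy`, `haversine_m` and `within_distance` are hypothetical helper names. One detail worth noting is that EPSG:3857 stretches distances by about 1/cos(latitude), so the sketch inflates the filter radius accordingly to avoid dropping true matches. It also assumes a single sphere radius for both steps and ignores the antimeridian.

```python
import math

# Assumption: the sphere radius used by EPSG:3857, reused for Haversine
# so that the filter and refine steps agree on the Earth model.
EARTH_RADIUS_M = 6_378_137.0


def web_mercator_xy(lat_deg, lon_deg):
    """Spherical Web Mercator (EPSG:3857) forward projection, in metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_distance(query, points, radius_m):
    """Return the points within radius_m of query, using filter and refine."""
    qlat, qlon = query
    qx, qy = web_mercator_xy(qlat, qlon)

    # Mercator scale grows as 1/cos(lat); inflate the radius using the
    # highest latitude any true match could reach (clamped below the pole).
    max_lat = min(abs(qlat) + math.degrees(radius_m / EARTH_RADIUS_M), 89.0)
    filter_radius = radius_m / math.cos(math.radians(max_lat))

    # Filter: cheap Euclidean test on projected coordinates
    # (this is the part a CIRCLERANGE query would play).
    candidates = []
    for lat, lon in points:
        px, py = web_mercator_xy(lat, lon)
        if math.hypot(px - qx, py - qy) <= filter_radius:
            candidates.append((lat, lon))

    # Refine: exact spherical distance on the surviving candidates only.
    return [p for p in candidates if haversine_m(qlat, qlon, p[0], p[1]) <= radius_m]
```

For example, `within_distance((55.0, 12.0), points, 1000.0)` would keep only the points whose Haversine distance to the query is at most 1000 m, while the expensive check runs only on the candidates that pass the projected-coordinate filter.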
