Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subAtlas(Polygon) missing reverse edge and performance #251

Merged
merged 9 commits into from
Oct 23, 2018

Conversation

matthieun
Copy link
Collaborator

@matthieun matthieun commented Oct 22, 2018

Description:

This fixes two issues:

  • subAtlas(Polygon) would in some cases include a reverse edge without its master edge, in a very corner case where JTS would match the reverse edge to the Polygon, but not the master edge.
org.openstreetmap.atlas.geography.atlas.exception.AtlasIntegrityException: Cannot build an Atlas with a negative edge without its positive counterpart: -XXXXXXXXX
	at org.openstreetmap.atlas.geography.atlas.packed.PackedAtlasBuilder.lambda$verifyNegativeEdgesHaveMasterEdge$1(PackedAtlasBuilder.java:319)
	at java.lang.Iterable.forEach(Iterable.java:75)
	at org.openstreetmap.atlas.geography.atlas.packed.PackedAtlasBuilder.verifyNegativeEdgesHaveMasterEdge(PackedAtlasBuilder.java:312)
	at org.openstreetmap.atlas.geography.atlas.packed.PackedAtlasBuilder.get(PackedAtlasBuilder.java:211)
	at org.openstreetmap.atlas.geography.atlas.BareAtlas.subAtlas(BareAtlas.java:629)
  • Improve subAtlas performance by caching the edges intersecting the polygon, instead of re-computing them 3 times.

Potential Impact:

A more accurate subAtlas in very corner cases. A faster subAtlas in most cases.

Unit Test Approach:

Added one unit test that exposes the corner case, and verifies that the master edge gets added.

Test Results:

Tests succeed.


In doubt: Contributing Guidelines

Copy link
Contributor

@adahn6 adahn6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, just one question regarding the supplier logic

}
else
{
return (Iterable<Member>) Iterables.stream(memberIdentifiersIntersecting.get(0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused about the purpose of the List wrapper around the Set here. Since this pattern limits the List to fetching only one set, why not just use the Set directly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the object that goes from outside to inside the lambda has to be final. I cannot make the set final as I use the fact it might not be there as a cause for initializing it. So wrapping it in a final list is a trick that might be ugly but works :) I am open to any other more elegant suggestion!

Copy link
Contributor

@adahn6 adahn6 Oct 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm... wouldn't it be possible to just use the isEmpty() check for initializing? a final Set can still be modified via addAll(), so you could do something like

final Set<Long> intersecting = new HashSet<>();
@SuppressWarnings("unchecked")
final Supplier<Iterable<Member>> result = () -> {
     if(intersecting.isEmpty()) {
           intersecting.addAll(Iterables.asSet(source));
     }
     return (Iterable<Member>) Iterables.stream(intersecting);
}

Not sure that code compiles or works, just spitballing here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also curious to know if what @adahn6 suggests works or not.

lucaspcram
lucaspcram previously approved these changes Oct 23, 2018
// Add the edges
edgesIntersecting(boundary).forEach(
edge -> builder.addEdge(edge.getIdentifier(), edge.asPolyLine(), edge.getTags()));
// Add the edges. Use a consumer that makes sure master edges are always added first.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not totally following how this consumer ensures the master edge is always added first. It seems to me that if neither edge 1 nor -1 have been added to the builder, and the consumer hits edge 1 first, then this logic will add edge -1 before adding edge 1. Where have I gotten confused?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read it the same way.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if the 1 and -1 have not been added yet there are multiple use cases:

  • The consumer is told to add 1 first.
    • If 1 has a reverse edge, it will add -1 (if not added already), then 1 (if not added already)
    • If 1 does not have a reverse edge, it will just add 1 (if not added already).
  • The consumer is told to add -1 first.
    • It will add 1 first (if not added already) then -1 (if not added already)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in sub-bullet point 1 under bullet point 1, you still add the reverse edge before the master edge. Is this OK in this case? I guess I am just confused by the discrepancy between the comment and what actually happens.

Also, I am curious why we need to make sure the master gets added first. Sorry to be probing, I just want to make sure I understand!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So a reverse edge always has a master, it is actually enforced by the PackedAtlasBuilder. The issue this bug fixed came from the PackedAtlasBuilder refusing to build an Atlas where the reverse edge was added, but the forward was not. Technically those two have the same geometry, so they should both be added, but sometimes JTS includes one polyline and excludes the other which caused the bug.

Copy link
Contributor

@lucaspcram lucaspcram Oct 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh now I understand. The problem was that the Iterable returned by edgesIntersecting is sometimes missing the master edge. So it doesn't actually matter which is strictly added first, just that we always add the reverse edge/master edge together if both exist.

{
final Set<Long> intersecting = new HashSet<>();
memberIdentifiersIntersecting.add(intersecting);
return Iterables.stream(source).map(member ->
Copy link
Contributor

@lucaspcram lucaspcram Oct 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it a little confusing that the call to map does not actually perform a mapping operation, and instead produces a side effect on some external state. I think you are doing this to avoid calling Iterables.stream() twice (Once for a forEach and once for the collect). So here, you are using map as a modified forEach that returns an identical stream (so that collect can be called in the same line)?

If this is the case, maybe leave a comment that explains this, since this use of map is bit non-idiomatic :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works :) I will add a comment.

private <Member extends AtlasEntity> Supplier<Iterable<Member>> getCachingSupplier(
final Iterable<Member> source, final ItemType type)
{
final List<Set<Long>> memberIdentifiersIntersecting = new ArrayList<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this also work as an Optional? Might make things a little clearer, since it's a singleton List anyway.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well it would not work as an optional given the fact that you cannot swap the internals of an Optional object without changing the optional handle... or can you?
Basically, like in https://github.com/osmlab/atlas/pull/251/files#r227476980, the issue is that I can share objects within the lambda only when they are final. Open to other ideas though!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you can't, so that wouldn't work.... Darn

Copy link
Contributor

@MikeGost MikeGost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great change, echoing some of the comments already left and a question.

{
if (builder.peek().edge(edge.getIdentifier()) == null)
{
if (edge.getIdentifier() != 0 && edge.hasReverseEdge())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm probably missing something obvious, what is the significance of edge.getIdentifier() != 0? When can an edge identifier be 0? Are you looking for !isMasterEdge()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.

It seems to me that swapping edge.getIdentifier() != 0 with !isMasterEdge() would correct this code segment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a lot of unit tests that unfortunately create edges with id 0 using @TestAtlas. The whole idea of having a + and - for forward and reverse edges breaks down unfortunately for edge 0, and all the tests break. This might be worth creating an issue to track it, however the impact is pretty small. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, it's impossible to know that without you explaining - I was trying to wrap my head around why 0 was there - it's worth putting that as a comment here. Otherwise, is it possible to clean up tests now? If not, creating an issue like you said to clean up the test and update this piece of code to not rely on 0 might be worthwhile.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleaning up the tests will be a full task. Created an issue #252, will add comments too.

// Add the edges
edgesIntersecting(boundary).forEach(
edge -> builder.addEdge(edge.getIdentifier(), edge.asPolyLine(), edge.getTags()));
// Add the edges. Use a consumer that makes sure master edges are always added first.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read it the same way.

final long pointNumber = Iterables.size(pointsWithin(boundary));
final long relationNumber = Iterables.size(relationsWithEntitiesIntersecting(boundary));
final double ratioBuffer = 1.2;
final long nodeNumber = Math.round(Iterables.size(nodesWithin(boundary)) * ratioBuffer);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I should use the caching for all the other types as well!

MikeGost
MikeGost previously approved these changes Oct 23, 2018
Copy link
Contributor

@MikeGost MikeGost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

edgesIntersecting(boundary), ItemType.EDGE);
final Supplier<Iterable<Area>> areasIntersecting = getCachingSupplier(
areasIntersecting(boundary), ItemType.AREA);
final Supplier<Iterable<Line>> linesIntersecting = getCachingSupplier(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we can add the same caching strategy to the Predicate subAtlas call.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the filtering is expensive and we re-filter many times I'd say yes. But we'd want to run the profiler on a big atlas first to make sure it is a worthwhile change.

Copy link
Contributor

@MikeGost MikeGost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks @matthieun!

@MikeGost MikeGost merged commit 8e92621 into osmlab:dev Oct 23, 2018
@matthieun matthieun added this to the 5.2.0 milestone Oct 24, 2018
@matthieun matthieun deleted the hotspot branch November 12, 2018 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants