Skip to content

Commit

Permalink
Intersects: faster queries for negative case
Browse files Browse the repository at this point in the history
This PR decreases the time for me to build my road layer for Colorado
from 471 seconds to 11 seconds.

I'm not sure if this ought to be the new default, or if it should have a
config knob, so hoping to use this PR for discussion.

I'm starting to really get my hands dirty with tilemaker, and have come
across a performance hotspot with `Intersects` against a layer with many
(relatively) small polygons.

In #593 (reply in thread),
you mentioned:

> The Lua methods like FindIntersecting etc. will need some thought here.
> They're really useful for region-specific processing so I'd like to keep them,
> but they do require keeping those particular geometries in memory. Generally
> there won't be too many of them - rough country polygons etc. - so the impact
> on memory isn't huge.

I think my use differs from the norm a bit. I maintain two GeoJSON datasets
that list national parks (50K items) and city parks (150K items) for North America.

So instead of ~200 country polygons that densely cover the entire planet, I have
~200,000 park polygons that sparsely cover patches of the planet.

When rendering road/path/POIs, I test whether or not they're in a park to
write different attributes to my tileset.

I'm not very familiar with the R-Tree data structure. Some Googling
makes me think that it's expensive to query it, even in the case of a
negative result.

This PR makes negative queries faster for sparse layers. It maintains a
bitmap of your z14 (or whatever your basezoom is) tiles, indicating
which might possibly intersect with a shape in the layer.

This dramatically speeds up the sparse case, but at some cost:

- memory: a z14 layer requires an extra 33MB to store the bitmap
- runtime: when ingesting the shapes, we compute the shape's envelope
    and update the bitmap

For dense layers like countries, this will likely not speed things up
at all. In fact, I'd expect a small slowdown. It will also use memory,
varying by the amount of your basezoom.

For z14, it's 33MB. Not a big deal, IMO. If your basezoom was z16,
though, it'd be 528MB. Probably a big deal, IMO.

That said -- we don't have to index at the basezoom level. We could
index at the lower of the basezoom or z14.

Or we could introduce a knob in the layer config, like:

```
isIndexed: "sparse"
```

or

```
isIndexed: "dense"
```

that tries to capture whether or not it's worth creating the bitmap.
  • Loading branch information
cldellow committed Jan 6, 2024
1 parent 65829e4 commit ebf64a7
Show file tree
Hide file tree
Showing 3 changed files with 55 additions and 0 deletions.
2 changes: 2 additions & 0 deletions include/shp_mem_tiles.h
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ class ShpMemTiles : public TileDataSource
std::function<std::vector<IndexValue>(const RTree& rtree)> indexQuery,
std::function<bool(const OutputObject& oo)> checkQuery
) const;
bool mayIntersect(const std::string& layerName, const Box& box) const;
std::vector<std::string> namesOfGeometries(const std::vector<uint>& ids) const;

template <typename GeometryT>
Expand Down Expand Up @@ -65,6 +66,7 @@ class ShpMemTiles : public TileDataSource
std::map<uint, std::string> indexedGeometryNames; // | optional names for each one
std::map<std::string, RTree> indices; // Spatial indices, boost::geometry::index objects for shapefile indices
std::mutex indexMutex;
std::map<std::string, std::vector<bool>> bitIndices; // A bit it set if the z14 (or base zoom) tiles at x*width + y contains a shape. This lets us quickly reject negative Intersects queryes
};

#endif //_OSM_MEM_TILES
Expand Down
3 changes: 3 additions & 0 deletions src/osm_lua_processing.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -177,6 +177,9 @@ double OsmLuaProcessing::AreaIntersecting(const string &layerName) {
template <typename GeometryT>
std::vector<uint> OsmLuaProcessing::intersectsQuery(const string &layerName, bool once, GeometryT &geom) const {
Box box; geom::envelope(geom, box);
if (!shpMemTiles.mayIntersect(layerName, box))
return std::vector<uint>();

std::vector<uint> ids = shpMemTiles.QueryMatchingGeometries(layerName, once, box,
[&](const RTree &rtree) { // indexQuery
vector<IndexValue> results;
Expand Down
50 changes: 50 additions & 0 deletions src/shp_mem_tiles.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,36 @@ vector<string> ShpMemTiles::namesOfGeometries(const vector<uint>& ids) const {

void ShpMemTiles::CreateNamedLayerIndex(const std::string& layerName) {
indices[layerName]=RTree();

bitIndices[layerName] = std::vector<bool>();
const uint8_t indexZoom = std::min(baseZoom, 14u);
bitIndices[layerName].resize((1 << indexZoom) * (1 << indexZoom));
}

bool ShpMemTiles::mayIntersect(const std::string& layerName, const Box& box) const {
// Check if any tiles in the bitmap might intersect this shape.
// If none, downstream code can skip querying the r-tree.
auto& bitvec = bitIndices.at(layerName);
const uint8_t indexZoom = std::min(baseZoom, 14u);

double lon1 = box.min_corner().x();
double lat1 = box.min_corner().y();
double lon2 = box.max_corner().x();
double lat2 = box.max_corner().y();

uint32_t x1 = lon2tilex(lon1, indexZoom);
uint32_t x2 = lon2tilex(lon2, indexZoom);
uint32_t y1 = lat2tiley(lat1, indexZoom);
uint32_t y2 = lat2tiley(lat2, indexZoom);

for (int x = std::min(x1, x2); x < std::max(x1, x2); x++) {
for (int y = std::min(y1, y2); y < std::max(y1, y2); y++) {
if (bitvec[x * (1 << indexZoom) + y])
return true;
}
}

return false;
}

void ShpMemTiles::StoreGeometry(
Expand Down Expand Up @@ -126,4 +156,24 @@ void ShpMemTiles::StoreGeometry(
indices.at(layerName).insert(std::make_pair(box, id));
if (hasName) { indexedGeometryNames[id] = name; }
indexedGeometries.push_back(*oo);

// Store a bitmap of which tiles at the basezoom might intersect
// this shape.
auto& bitvec = bitIndices.at(layerName);
const uint8_t indexZoom = std::min(baseZoom, 14u);
double lon1 = box.min_corner().x();
double lat1 = box.min_corner().y();
double lon2 = box.max_corner().x();
double lat2 = box.max_corner().y();

uint32_t x1 = lon2tilex(lon1, indexZoom);
uint32_t x2 = lon2tilex(lon2, indexZoom);
uint32_t y1 = lat2tiley(lat1, indexZoom);
uint32_t y2 = lat2tiley(lat2, indexZoom);

for (int x = std::min(x1, x2); x < std::max(x1, x2); x++) {
for (int y = std::min(y1, y2); y < std::max(y1, y2); y++) {
bitvec[x * (1 << indexZoom) + y] = true;
}
}
}

0 comments on commit ebf64a7

Please sign in to comment.