pmtiles root/leaf index size issue ? #653

Open
cquest opened this issue Jan 20, 2024 · 4 comments

@cquest

cquest commented Jan 20, 2024

I've compared a full planet pmtiles file generated by pmtiles and one generated directly by tilemaker.

The nginx log shows regular multi-megabyte range requests on the tilemaker version, and nothing like that on the pmtiles version.

I suspect the tile index not being hierarchical enough.

You can query both files here if you want to have a look :

@systemed
Owner

systemed commented Jan 20, 2024

I think this is probably because tilemaker doesn't generate clustered pmtiles archives:

> Setting the clustered property of a PMTiles archive means that the ordering of the tile data on disk matches the directories

Because tilemaker's tile generation is multi-threaded, and some tiles may take much longer to generate than others (due to complex geometries), we can't guarantee that tiles will be output in any particular order. Therefore we don't get the efficiency gains that a clustered archive would give.

For a clustered archive, you'd need to create an .mbtiles with tilemaker, then use go-pmtiles to turn that into a clustered .pmtiles.
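
To make "clustered" concrete, here is a minimal sketch of a check for it. The `(tile_id, offset, length)` tuples and the `is_clustered` helper are hypothetical, not part of tilemaker or go-pmtiles. It assumes that a clustered archive writes tile data in tile ID order, and that backward offsets are allowed because deduplicated tiles reuse content written earlier.

```python
from typing import List, Tuple

# Hypothetical directory entry: (tile_id, offset, length).
Entry = Tuple[int, int, int]


def is_clustered(entries: List[Entry]) -> bool:
    """Return True if the tile data order on disk follows tile_id order."""
    next_offset = 0
    for _tile_id, offset, length in sorted(entries):
        if offset == next_offset:
            # New content written directly after the previous tile.
            next_offset = offset + length
        elif offset > next_offset:
            # The data jumps forward on disk: not in tile_id order.
            return False
        # offset < next_offset: a deduplicated tile reusing earlier content.
    return True


# Tiles written in tile_id order; tile 2 reuses the bytes of tile 0.
in_order = [(0, 0, 10), (1, 10, 5), (2, 0, 10), (3, 15, 7)]
# Tiles written in whatever order the worker threads finished.
as_finished = [(0, 15, 7), (1, 0, 10), (2, 10, 5)]

print(is_clustered(in_order), is_clustered(as_finished))
```

The second list is what multi-threaded output tends to look like: every tile is present, but the position on disk no longer tracks the tile ID.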

There's some discussion of this in the original PMTiles PR, #620, in particular:

> Threading means tilemaker's tile output order isn't sequential, so we can't set .clustered

> The only consequences here should be that the directories take up more bytes, and that you can't use pmtiles extract on the output. For cloud storage there isn't a huge locality advantage in accessing nearby parts of the same file
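
A rough sketch of why unordered output makes directories larger. It assumes the PMTiles v3 directory encoding, where an entry's offset is stored as the varint `0` when it directly follows the previous entry's data, and as `offset + 1` otherwise. The helper names and the 1000-byte tile size are made up for illustration, and real directories are compressed afterwards, so actual sizes will differ.

```python
def varint_len(n: int) -> int:
    """Bytes needed to encode n as an unsigned LEB128 varint."""
    size = 1
    while n >= 0x80:
        n >>= 7
        size += 1
    return size


def offset_column_bytes(entries) -> int:
    """Encoded size of the offset column for (offset, length) pairs,
    given in tile ID order."""
    total = 0
    prev_end = None
    for offset, length in entries:
        if prev_end is not None and offset == prev_end:
            total += varint_len(0)           # contiguous: a single zero byte
        else:
            total += varint_len(offset + 1)  # otherwise the full offset
        prev_end = offset + length
    return total


# 1000 hypothetical tiles of 1000 bytes each.
clustered = [(i * 1000, 1000) for i in range(1000)]
# Same tiles, but written to disk in the reverse of tile ID order.
unordered = list(reversed(clustered))

print(offset_column_bytes(clustered), offset_column_bytes(unordered))
```

In the clustered case every offset costs one byte; in the unordered case almost every entry has to spell out its full offset.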

@cquest
Author

cquest commented Jan 20, 2024

Thanks Richard, I'll go that way, mbtiles + pmtiles conversion.

One way to deal with unordered generation is to add an intermediate queue.
Worker threads fill the queue, and another thread takes whatever is available and next in order from the queue and writes it to the final pmtiles file.
(easy to say, more work to implement it)
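
A minimal sketch of that queue idea, not tilemaker's actual code. It assumes tile IDs are a contiguous range starting at 0; `worker` and `write_in_order` are hypothetical names. Workers push finished tiles in any order, and a single writer holds them in a min-heap until the next expected ID arrives.

```python
import heapq
import queue
import random
import threading


def worker(tile_ids, results):
    """Pretend to render tiles and hand them over in completion order."""
    for tile_id in tile_ids:
        results.put((tile_id, b"tile-%d" % tile_id))


def write_in_order(results, total, sink):
    """Append tiles to sink in tile_id order; return the peak buffer size."""
    heap = []
    next_id = 0
    peak = 0
    while next_id < total:
        heapq.heappush(heap, results.get())
        peak = max(peak, len(heap))
        # Flush every tile that is now next in line.
        while heap and heap[0][0] == next_id:
            sink.append(heapq.heappop(heap))
            next_id += 1
    return peak


ids = list(range(100))
random.shuffle(ids)  # stand-in for unpredictable completion order
results = queue.Queue()
threads = [
    threading.Thread(target=worker, args=(ids[i::4], results))
    for i in range(4)
]
for t in threads:
    t.start()

sink = []
peak = write_in_order(results, len(ids), sink)
for t in threads:
    t.join()

print("tiles written:", len(sink), "peak buffered:", peak)
```

The peak buffer size is the cost of this approach: if one tile is slow to generate, every later tile that finishes first has to wait in memory behind it.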

@systemed
Owner

> (easy to say, more work to implement it)

😁 Yes, you're right. I'm slightly anxious about a queue getting blocked on a tile with a really horrible multipolygon geometry (Saimaa or the US National Forests, that sort of thing) but there are possibilities for the future.

@bdon
Contributor

bdon commented Jan 31, 2025

I created a WIP branch of a new pmtiles cluster CLI command: protomaps/go-pmtiles#207 and printed out some diagnostics:

(Note: for small archives you will need the tilemaker main branch to get the fix from #795)

For canada.pmtiles, a 7 GB archive in total:

go run main.go cluster canada.pmtiles
total directory size 54945551
 100% |██████████████████████████████████████████████| (14237172/14237172, 107585 it/s)
2025/01/31 12:17:48 convert.go:350: # of addressed tiles:  14237172
2025/01/31 12:17:48 convert.go:351: # of tile entries (after RLE):  14187464
2025/01/31 12:17:48 convert.go:352: # of tile contents:  14071412
2025/01/31 12:17:51 convert.go:367: Root dir bytes:  9538
2025/01/31 12:17:51 convert.go:368: Leaves dir bytes:  11472102
2025/01/31 12:17:51 convert.go:369: Num leaf dirs:  3464
2025/01/31 12:17:51 convert.go:370: Total dir bytes:  11481640
2025/01/31 12:17:51 convert.go:371: Average leaf dir bytes:  3311
2025/01/31 12:17:51 convert.go:372: Average bytes per addressed tile: 0.81
total directory size 11481640 (20.896396% of original)

So the total directory savings are almost 80% for a large country (~55 MB -> 11 MB), which should also mean partial directory downloads are much smaller. You're welcome to experiment with the PR on different areas.
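
For anyone wanting to check the arithmetic, the derived figures can be recomputed from the raw numbers in the log above. The variable names here are just labels for those log values.

```python
# Raw values copied from the diagnostic output.
dir_bytes_before = 54945551   # total directory size of the original archive
root_dir_bytes = 9538
leaves_dir_bytes = 11472102
num_leaf_dirs = 3464
addressed_tiles = 14237172

dir_bytes_after = root_dir_bytes + leaves_dir_bytes  # 11481640
ratio = dir_bytes_after / dir_bytes_before           # about 0.209
savings = 1 - ratio                                   # about 0.79

avg_leaf_bytes = leaves_dir_bytes // num_leaf_dirs   # 3311 (integer division)
bytes_per_tile = dir_bytes_after / addressed_tiles   # about 0.81

print(f"total dir bytes: {dir_bytes_after}")
print(f"size relative to original: {ratio:.1%}")
print(f"savings: {savings:.1%}")
print(f"average leaf dir bytes: {avg_leaf_bytes}")
print(f"average bytes per addressed tile: {bytes_per_tile:.2f}")
```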
