Don't store (almost) duplicate geometry data #2927
Labels
No labels
Accessibility
Accessibility
Address
Address
Android
Android
Android Auto
Android Auto
Android Automotive (AAOS)
Android Automotive (AAOS)
API
API
AppGallery
AppGallery
AppStore
AppStore
Battery and Performance
Battery and Performance
Blocker
Blocker
Bookmarks and Tracks
Bookmarks and Tracks
Borders
Borders
Bug
Bug
Build
Build
CarPlay
CarPlay
Classificator
Classificator
Community
Community
Core
Core
CrashReports
CrashReports
Cycling
Cycling
Desktop
Desktop
DevEx
DevEx
DevOps
DevOps
dev_sandbox
dev_sandbox
Directions
Directions
Documentation
Documentation
Downloader
Downloader
Drape
Drape
Driving
Driving
Duplicate
Duplicate
Editor
Editor
Elevation
Elevation
Enhancement
Enhancement
Epic
Epic
External Map Datasets
External Map Datasets
F-Droid
F-Droid
Fonts
Fonts
Frequently User Reported
Frequently User Reported
Fund
Fund
Generator
Generator
Good first issue
Good first issue
Google Play
Google Play
GPS
GPS
GSoC
GSoC
iCloud
iCloud
Icons
Icons
iOS
iOS
Legal
Legal
Linux Desktop
Linux Desktop
Linux packaging
Linux packaging
Linux Phone
Linux Phone
Mac OS
Mac OS
Map Data
Map Data
Metro
Metro
Navigation
Navigation
Need Feedback
Need Feedback
Night Mode
Night Mode
NLnet 2024-06-281
NLnet 2024-06-281
No Feature Parity
No Feature Parity
Opening Hours
Opening Hours
Outdoors
Outdoors
POI Info
POI Info
Privacy
Privacy
Public Transport
Public Transport
Raw Idea
Raw Idea
Refactoring
Refactoring
Regional
Regional
Regression
Regression
Releases
Releases
RoboTest
RoboTest
Route Planning
Route Planning
Routing
Routing
Ruler
Ruler
Search
Search
Security
Security
Styles
Styles
Tests
Tests
Track Recording
Track Recording
Translations
Translations
TTS
TTS
UI
UI
UX
UX
Walk Navigation
Walk Navigation
Watches
Watches
Web
Web
Wikipedia
Wikipedia
Windows
Windows
Won't fix
Won't fix
World Map
World Map
No milestone
No project
No assignees
2 participants
Due date
No due date set.
Dependencies
No dependencies set.
Reference: organicmaps/organicmaps-tmp#2927
Loading…
Add table
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
OM has 4 versions of geometry (points / triangles) for each feature.
The idea for that is to use simplified feature shapes on lower zooms to speed up by reading less data from the map file and rendering less elements.
geo0
is the simplest / worst geometry andgeo3
is the best / most detailed one.However it looks like that often
geo2
andgeo3
geometries are (almost) identical. Perhaps due to the original OSM geometry being detailed just enough to satisfy OM's geo2's precision requirements but not geo3's.So almost identical geo2 & 3 add storage size and cache overhead without actual performance benefits.
E.g. do you see a difference between those patches of forest in the middle and the river lines?

I do - but it took me a while to find one dot difference for the forest :)
For such situations it makes sense to omit geo2 and make the renderer use geo3.
(I guess for some simple features like a big square building all geo0..geo3 could be very well identical, so this optimization doesn't have to be limited to geo2/3 case only).
For comparison, this is how a difference between geo1 and geo2 (zl 12 & 13) looks like:

Drastic! 3 times less nodes perhaps.
We can define thresholds with some experimentation, but I think a lower geo should be at least twice more simple compared to the previous one.
Could the finer (more detailed) layers use the data of the coarser (less detailed) layers?
Geo1 would just contain the data necessary to refine geo0. Geo2 would in turn contain the data to refine the geo0 + geo1 data. This would solve the mentioned problem of duplicated geometry because the finer layers of the examples would contain almost no data.
It could also allow for more levels without high storage consumption, which would result in smoother transitions.
I don't know what algorithm is used to simplify the map files, but if the coarser maps just use a subset of the vertices, combining different layers shouldn't involve too much computational cost.
The problem is the algo will need to know somehow where to insert the refining elements.
Let's take a line as a simpler example.
E.g. we have just 3 points for geo0 and 10 more for geo1. We'll need to add extra data to geo1 to define where these extra points go, e.g. it could be
count
of points in each geo1 block andpn
a number of geo0 point after which they should be inserted.Depending on the points counts ratio between geo levels this extra info could take similar or even more space than just storing a full geo1 geometry.
Another approach is to add 2 extra bits for each point, these 2 bits store geo level index (0..3). So its easy to filter all points for e.g. geo0..2 except geo3. This approach is used already for short lines as their points are stored inline in feature's header whereas points for longer lines are stored "outside" at some offset in the file (each geo has its own offset).
If we use this approach for longer lines too then we won't have duplication, but we will lose a major benefit of having to read and load less data for lower zoom levels.
And it gets much more complicated for triangles (area objects) as for each geo level the area needs to be tessellated into triangles and run-time tesselation doesn't look like a good idea.
Well, there is a broad field for experimentation here for sure.
And actually there should be established best practices approaches for these problems as its not an OM-specific thing.
Do you know them maybe?
Its better to discuss in a separate issue or discussion thread perhaps.
Unfortunately not. I have looked a bit but it seems to be difficult to search for.
I stupidly didn't take into account that the vector data is not numbered. I don't know how efficient you save/compress the vector data but if we assume that you use a simple representation and use 24 bits of data for every vector, a very simple absolute count (where every vector is numbered) would add anywhere from 25% - 85% overhead depending on line length. I am sure that this can be greatly optimised at the cost of a bit more computation. How much space do geo0 - geo2 need compared to geo3?
This would have to be discussed with the community, but if it really is not possible to optimise it sufficiently, it could very well be that a slightly higher storage consumption is acceptable if you get more layers and smoother transitions.
Unfortunately, I don't have much time at the moment, the university exam period is starting now, so I can't start any major projects. Maybe I can take a closer look when I have more time.
TBH I don't think we need more geometry versions. Actually in my testing of #2798 I was surprised how little was the performance impact of resorting to a more detailed geo level instead of a "regular" one for a given zoom level.
What makes a "transition" look smooth is not number of geometry versions but at which point its being switched. A switch to a simpler geometry should take place not before it has enough points to look smooth at a given zoom level.
When I have a little more time, I will look into it. I am optimistic that it will be possible to save a little storage space compared to simply duplicating the data.
I got some sample stats via #2935.
For a big city:
87.8MB map file (geometry takes ~21% total size).
0.3MB (0.3% of map file size) potential saving at 1.5x threshold, 0.6Mb (0.7%) at 2.0x.
For a small region:
26.5MB map file (geometry takes ~70% total size).
1.4MB (5% of map file size) potential saving at 1.5x threshold, 2.3Mb (9%) at 2.0x.
Need to assess performance impact (data reading, rendering) somehow to get a better idea of a feasible threshold level.
Here are very rough performance benchmarks I did, but they can give some idea anyway.
I've prepared a build (see branch
pastk-styles-geometryfallback-1more-detailed
) that allows to switch between a "regular" geometry or force to one level more detailed, e.g. for a zoom level 10 instead ofgeo0
ageo1
(regularly meant for zl 12) will be used.Upd: I've added results for forcing use of the most detailed (best)
geo3
geometry on any zl.I measured with a stop-watch :) a time it takes to render a center of a very big city (Moscow) when changing zoom level 11 to 10 and 10 to 11. Some (if not most) data was likely cached by the app and Android. So results could be different for a "cold" render.
So we can say that:
It'll be great to have this kind of performance benchmark automated.
Update: added stats for using the most detailed geometry always.