Skip articles that haven't changed between dumps #9
Reference: organicmaps/wikiparser#9
The dump schema includes a date_modified timestamp and other revision metadata.

To reduce disk I/O, we could store some metadata along with the articles, compare it against the new metadata when processing, and skip articles that haven't changed.
One way to do this would be to store the date_modified timestamp as the modified attribute of the article file.

An interesting optimization, but it may not be worth it. We need to prove its benefits first. Let's leave it at a very low priority for now.
Understood. I've been thinking about it since you mentioned it here; we'll see what the profiling shows for the workflow.
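
For reference, the approach proposed in the issue description (store date_modified as the file's modified time, then compare on the next run) could look roughly like the sketch below. This is only an illustration in Python, not the project's code: the function names parse_timestamp and write_if_changed are hypothetical, and it assumes date_modified arrives as an ISO 8601 string. The comparison uses whole seconds because filesystems differ in modification-time resolution.

```python
import os
from datetime import datetime
from pathlib import Path


def parse_timestamp(date_modified: str) -> float:
    """Convert an ISO 8601 string (assumed format) to epoch seconds."""
    # Older Python versions reject a trailing 'Z', so normalize it first.
    return datetime.fromisoformat(date_modified.replace("Z", "+00:00")).timestamp()


def write_if_changed(path: Path, text: str, date_modified: str) -> bool:
    """Write the article unless the file's mtime already matches date_modified.

    Returns True if the file was written, False if it was skipped.
    Hypothetical helper for illustration only.
    """
    new_mtime = parse_timestamp(date_modified)
    try:
        # Compare at one-second resolution to tolerate coarse filesystems.
        if int(path.stat().st_mtime) == int(new_mtime):
            return False  # unchanged since the previous dump
    except FileNotFoundError:
        pass  # first time this article is seen

    path.write_text(text, encoding="utf-8")
    # Record the dump's timestamp as the file's modification time.
    os.utime(path, (new_mtime, new_mtime))
    return True
```

Note that this only saves the write; the article still has to be read from the dump and its metadata parsed, which is why profiling is needed to show whether the skip is worthwhile.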