All resources are on the project wiki now

This commit is contained in:
Bryan Housel 2021-03-19 22:51:32 -04:00
parent d1a38e5973
commit 27c6dca7ca
6 changed files with 25 additions and 962 deletions


@ -1,672 +1,14 @@
# Contributing
<!--
### tl;dr
##### :raised_hand: &nbsp; How to help:
We're always looking for help!
* [Prerequisites & installation instruction in the README](https://github.com/osmlab/name-suggestion-index#prerequisites)
* `npm run build` will reprocess the files and output warnings
* Remove generic names - [show me](#hocho--remove-generic-names)
* Add `brand:wikidata` and `brand:wikipedia` tags - [show me](#female_detective--add-wiki-tags)
* Add missing brands - [show me](#convenience_store--add-missing-brands)
* Edit Wikidata in compliance with their policies - [show me](#memo--edit-wikidata)
- Read [the Code of Conduct](CODE_OF_CONDUCT.md) and remember to be kind to one another.
- See [the project wiki](https://github.com/osmlab/name-suggestion-index/wiki) for info about how to contribute to this index.
👉 Tip: You can browse the index at https://nsi.guide/ to see which brands are missing Wikidata links, or have incomplete Wikipedia pages.
If you have any questions or want to reach out to a maintainer, ping
[@bhousel][@bhousel], [@1ec5][@1ec5], or [@tas50][@tas50] on:
- [OpenStreetMap US Slack](https://slack.openstreetmap.us/) (`#poi` or `#general` channels)
-->
&nbsp;
## Background
## :world_map: &nbsp; About OpenStreetMap
[OpenStreetMap](https://openstreetmap.org) is a free, editable map of the whole world that is being built by volunteers.
Features on the map are defined using _tags_. Each tag is a `key=value` pair of text strings.
For example, a McDonald's restaurant might have these tags:
```js
"amenity": "fast_food",
"cuisine": "burger",
"name": "McDonald's",
// and more tags to record its address, opening hours, and so on…
```
The `amenity=fast_food` tag identifies it as a fast food restaurant.
&nbsp;
## :bulb: &nbsp; About the name-suggestion-index
The goal of this project is to define the _most correct tags_ for common features,
and to link these features to a [Wikidata](https://www.wikidata.org/) QID identifier.
- This helps people contribute to OpenStreetMap, because they can pick "McDonald's"
from a list and not need to worry about the tags being added.
- This helps the OpenStreetMap project because consumers can use and understand the
data better when it is tagged consistently.
&nbsp;
## :card_file_box: &nbsp; Data files
### Organization
The `data/*` folder contains a lot of files - one file per category.
_Category files_ are organized in a tree/key/value path. Each category file contains all the items that share an OpenStreetMap key/value tag.
- _tree_ - The highest level of organization - each tree contains categories that follow a similar approach to naming and linking to Wikidata.
- _key_ - An OpenStreetMap tag key (e.g. "amenity")
- _value_ - An OpenStreetMap tag value (e.g. "fast_food")
The name-suggestion-index currently supports these trees:
- _brands_ - Branded businesses like restaurants, banks, fuel stations, shops
identified by `brand`/`brand:wikidata` tags
- _operators_ - Organizations like post offices, police departments, hospitals
identified by `operator`/`operator:wikidata` tags
- _flags_ - Flagpoles hoisting common kinds of flags (national, regional, religious, advertising)
identified by `flag:wikidata` tag
- _transit_ - Transit networks (bus, rail, ferry, etc.) and related infrastructure
identified by `network`/`network:wikidata` tags
For example:
- `data/`
- `brands/amenity/fast_food.json`
- `brands/shop/supermarket.json`
- `operators/amenity/post_office.json`
- `flags/man_made/flagpole.json`
- `transit/route/bus.json`
- and so on…
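As a rough illustration of the tree/key/value layout, the sketch below builds a category file path and splits a category path back into its parts. The helper names `categoryFilePath` and `parseCategoryPath` are made up for this example and are not part of the project.

```js
// Hypothetical helpers, for illustration only.
// The list of trees is copied from the section above.
const TREES = ['brands', 'operators', 'flags', 'transit'];

// Build the location of a category file from its tree, key, and value.
function categoryFilePath(tree, key, value) {
  if (!TREES.includes(tree)) {
    throw new Error(`Unknown tree: ${tree}`);
  }
  return `data/${tree}/${key}/${value}.json`;
}

// Split a "tree/key/value" category path into its three parts.
function parseCategoryPath(path) {
  const [tree, key, value] = path.split('/');
  return { tree, key, value };
}

console.log(categoryFilePath('brands', 'amenity', 'fast_food'));
console.log(parseCategoryPath('operators/amenity/post_office'));
```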
<!--
##### Collecting items from OpenStreetMap
These files are created by a multi-step process:
- Process the OpenStreetMap "planet" data to collect common tags -> for example, `dist/collected/names_all.json`
- Filter all the tags into -> `dist/filtered/names_keep.json` and `dist/filtered/names_discard.json`
- Merge the items we are keeping into -> `data/**/*.json` files for us to decide what to do with them
The data files are organized by topic and OpenStreetMap tag:
* `data/brands/*` - Source files for each kind of branded business, organized by OpenStreetMap tag
* `amenity/*.json`
* `leisure/*.json`
* `shop/*.json`
* and so on…
-->
### Category file contents
Each category file contains:
- `properties` - Object containing category-wide properties
- `items` - Array containing the items in the category
For example `brands/amenity/fast_food.json` _(comments added for clarity)_:
```js
"properties": { // CATEGORY PROPERTIES:
"path": "brands/amenity/fast_food" // "path" - the tree/key/value path for this category
},
"items": [ // An array of items belonging to this category
{ // ITEM PROPERTIES:
"displayName": "McDonald's", // "displayName" - Name to display in summary screens and lists
"id": "mcdonalds-658eea", // "id" - a unique identifier added and generated automatically
"locationSet": {"include": ["001"]}, // "locationSet" - defines where this brand is valid ("001" = worldwide)
"tags": { // "tags" - OpenStreetMap tags that every McDonald's should have
"amenity": "fast_food", // The OpenStreetMap tag for a "fast food" restaurant
"brand": "McDonald's", // `brand` - Brand name in the local language (English)
"brand:wikidata": "Q38076", // `brand:wikidata` - Universal Wikidata identifier
"brand:wikipedia": "en:McDonald's", // `brand:wikipedia` - Reference to English Wikipedia
"cuisine": "burger", // `cuisine` - What kind of fast food is served here
"name": "McDonald's" // `name` - Display name, also in the local language (English)
}
},
```
There may also be items for McDonald's in other languages!
For example, this is how McDonald's should be mapped in Japan:
```js
{ // ITEM PROPERTIES:
"displayName": "マクドナルド", // "displayName" - Name to display in summary screens and lists
"id": "マクドナルド-3e7699", // "id" - a unique identifier added and generated automatically
"locationSet": { "include": ["jp"] }, // "locationSet" - defines where this brand is valid ("jp" = Japan)
"tags": {
"amenity": "fast_food",
"brand": "マクドナルド", // `brand` - Brand name in the local language (Japanese)
"brand:en": "McDonald's", // `brand:en` - For non-English brands, tag the English version too
"brand:ja": "マクドナルド", // `brand:ja` - Add at least one `brand:xx` tag that matches `brand`
"brand:wikidata": "Q38076", // `brand:wikidata` - Same Universal wikidata identifier
"brand:wikipedia": "ja:マクドナルド", // `brand:wikipedia` - Reference to Japanese Wikipedia
"cuisine": "burger",
"name": "マクドナルド", // `name` - Display name, also in the local language (Japanese)
"name:en": "McDonald's", // `name:en` - For non-English names, tag the English version too
"name:ja": "マクドナルド" // `name:ja` - Add at least one `name:xx` tag that matches `name`
}
},
```
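The "add at least one `brand:xx` / `name:xx` tag that matches" convention can be checked mechanically. The sketch below is one possible check, intended for non-English items only. `missingLocalizedTags` is a hypothetical helper, not something the build script is known to provide, and the language-suffix pattern is a simplifying assumption.

```js
// Hypothetical check for non-English items: for each of `brand` and `name`,
// is there a language-suffixed tag (e.g. `brand:ja`) with the same value?
function missingLocalizedTags(tags) {
  const missing = [];
  for (const base of ['brand', 'name']) {
    if (tags[base] === undefined) continue;
    // Assumed shape of a language suffix: 2-3 letters, optional "-Script".
    // This deliberately does not match keys like `brand:wikidata`.
    const langTag = new RegExp(`^${base}:[a-z]{2,3}(-[A-Za-z]+)?$`);
    const hasMatch = Object.keys(tags).some(
      (k) => langTag.test(k) && tags[k] === tags[base]
    );
    if (!hasMatch) missing.push(base);
  }
  return missing;
}

const tags = {
  'amenity': 'fast_food',
  'brand': 'マクドナルド',
  'brand:en': "McDonald's",
  'brand:ja': 'マクドナルド',
  'brand:wikidata': 'Q38076',
  'name': 'マクドナルド',
  'name:en': "McDonald's"
};
console.log(missingLocalizedTags(tags)); // `name:ja` is absent here, so "name" is reported
```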
&nbsp;
#### Item properties
##### `displayName` (required)
The `displayName` can contain anything, but it should be a short text appropriate for display in lists or as preset names in editor software. This is different from the OpenStreetMap `name` tag.
By convention, if you need to disambiguate between multiple brands with the same name, we add text in parentheses. Here there are 2 items named "Target", but they have been assigned different display names to tell them apart.
In `brands/shop/department_store.json`:
```js
"items": [
{
"displayName": "Target (Australia)",
"id": "target-c93bbd",
"locationSet": {"include": ["au"]},
"tags": {
"brand": "Target",
"brand:wikidata": "Q7685854",
"brand:wikipedia": "en:Target Australia",
"name": "Target",
"shop": "department_store"
}
},
{
"displayName": "Target (USA)",
"id": "target-592fe0",
"locationSet": {"include": ["us"]},
"tags": {
"brand": "Target",
"brand:wikidata": "Q1046951",
"brand:wikipedia": "en:Target Corporation",
"name": "Target",
"shop": "department_store"
}
},
```
##### `id` (generated)
Each item has a unique `id` generated for it.
When adding new data, don't add the `id` line (key and value).
Then run `npm run build` which will add the key and generate the value automatically.
The identifiers are stable unless the name, key, value, or locationSet change.
##### `locationSet` (required)
Each item requires a `locationSet` to define where the item is available. You can define the `locationSet` as an Object with `include` and `exclude` properties:
```js
"locationSet": {
"include": [ Array of locations ],
"exclude": [ Array of locations ]
}
```
The "locations" can be any of the following:
* Strings recognized by the [country-coder library](https://github.com/ideditor/country-coder#readme). These should be [ISO 3166-1 2 or 3 letter country codes](https://en.wikipedia.org/wiki/List_of_countries_by_United_Nations_geoscheme) or [UN M.49 numeric codes](https://en.wikipedia.org/wiki/UN_M49).<br/>_Example: `"de"`_<br/>Tip: The M49 code for the whole world is `"001"`.
* Filenames for custom `.geojson` features. If you want to use a custom feature, you'll need to add these under the `features/` folder (see ["Features"](#features) below for more details). Each `Feature` must have an `id` property that ends in `.geojson`.<br/>_Example: `"de-hamburg.geojson"`_<br/>Tip: You can use [geojson.io](http://geojson.io) or other tools to create these.
You can view examples and learn more about working with `locationSets` in the [@ideditor/location-conflation](https://github.com/ideditor/location-conflation/blob/main/README.md) project.
⚡️ You can test locationSets on this interactive map: https://ideditor.github.io/location-conflation/
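For a quick sanity check before building, the sketch below classifies each location string by its shape. It is only an approximation: `classifyLocation` and `invalidLocations` are hypothetical helpers, and the real country-coder library may accept identifiers that this shape check rejects.

```js
// Rough shape check only - an assumption-laden sketch, not the project's validator.
function classifyLocation(location) {
  if (typeof location !== 'string') return 'invalid';
  if (/\.geojson$/.test(location)) return 'geojson';   // custom feature file
  if (/^\d{3}$/.test(location)) return 'm49';          // UN M.49 numeric code
  if (/^[a-z]{2,3}$/i.test(location)) return 'iso';    // ISO 3166-1 2 or 3 letter code
  return 'invalid';
}

// Return every location in `include` or `exclude` that fails the shape check.
function invalidLocations(locationSet) {
  const all = [...(locationSet.include || []), ...(locationSet.exclude || [])];
  return all.filter((loc) => classifyLocation(loc) === 'invalid');
}

console.log(classifyLocation('de'));
console.log(classifyLocation('001'));
console.log(classifyLocation('de-hamburg.geojson'));
console.log(invalidLocations({ include: ['us', 'nowhere!'], exclude: ['de-hamburg.geojson'] }));
```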
##### `tags` (required)
Each item requires a `tags` value. This is just an Object containing all the OpenStreetMap tags that should be set on the feature.
##### `matchNames`/`matchTags` (optional)
Brands are often tagged inconsistently in OpenStreetMap. For example, some mappers write "International House of Pancakes" and others write "IHOP".
This project includes a "fuzzy" matcher that can match alternate names and tags to a single entry in the name-suggestion-index. The matcher keeps duplicate items out of the index and is used in the iD editor to help suggest tag improvements.
`matchNames` and `matchTags` properties can be used to list the less-preferred alternatives.
```js
"properties": {
"path": "brands/amenity/fast_food" // all items in this file will match the tag `amenity=fast_food`
},
"items": [
{
"displayName": "Honey Baked Ham",
"id": "honeybakedham-4d2ff4",
"locationSet": { "include": ["us"] },
"matchNames": ["honey baked ham company"], // also match these less-preferred names
"matchTags": ["shop/butcher", "shop/deli"], // also match these less-preferred tags
"tags": {
"alt_name": "HoneyBaked Ham", // match `alt_name`
"amenity": "fast_food",
"brand": "Honey Baked Ham", // match `brand`
"brand:wikidata": "Q5893363",
"brand:wikipedia": "en:The Honey Baked Ham Company",
"cuisine": "american",
"name": "Honey Baked Ham", // match `name`
"official_name": "The Honey Baked Ham Company" // match `official_name`
}
},
```
👉 The matcher code also has some useful automatic behaviors…
You don't need to add `matchNames` for:
- variations in capitalization, punctuation, spacing (the middots common in Japanese names count as punctuation, so "V・ドラッグ" already matches "vドラッグ")
- variations that already appear in the `name`, `brand`, `operator`, or `network` tags
- variations that already appear in an alternate name tag (e.g. `alt_name`, `short_name`, `official_name`, etc.)
- variations that already appear in any international version of those tags (e.g. `name:en`, `official_name:ja`, etc.)
- variations in diacritic marks (e.g. "Häagen-Dazs" already matches "Haagen-Dazs")
- variations in `&` vs. `and`
You don't need to add `matchTags` for:
- Tags assigned to _match groups_ (defined in `config/matchGroups.json`).
For example, you don't need to add `matchTags: ["shop/doityourself"]` to every "shop/hardware" and vice versa.
_Tags in a match group will automatically match any other tags in the same match group._
👉 Bonus: The build script will automatically remove extra `matchNames` and `matchTags` that are unnecessary.
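To get a feel for why these variations match without a `matchNames` entry, here is a minimal sketch of name normalization. It is not the project's matcher code. `simplifyName` is a hypothetical function that only mimics the behaviors listed above (case, punctuation, spacing, Latin diacritics, `&` vs. `and`).

```js
// Illustrative normalization, assuming the behaviors described above.
function simplifyName(str) {
  if (typeof str !== 'string') return '';
  return str
    .replace(/&/g, 'and')              // treat "&" and "and" alike
    .normalize('NFD')                  // split letters from combining marks
    .replace(/[\u0300-\u036f]/g, '')   // drop Latin-style diacritics only
    .normalize('NFC')                  // recompose everything else (e.g. kana)
    .replace(/[\s\p{P}]+/gu, '')       // drop whitespace and punctuation
    .toLowerCase();
}

console.log(simplifyName('Häagen-Dazs') === simplifyName('Haagen-Dazs')); // true
console.log(simplifyName('V・ドラッグ') === simplifyName('vドラッグ'));       // true
console.log(simplifyName('A&W') === simplifyName('A and W'));             // true
```

Two names that simplify to different strings, such as "IHOP" and "International House of Pancakes", are the cases where a `matchNames` entry is still needed.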
##### `note` (optional)
You can optionally add a `note` property to any item. The note can contain any text useful for maintaining the index - for example, information about the brand's status, or a link to a GitHub issue.
The notes just stay with the name-suggestion-index; they aren't OpenStreetMap tags or used by other software.
```js
{
"displayName": "United Bank (Connecticut)",
"id": "unitedbank-28419b",
"locationSet": { "include": ["peoples_united_bank_ct.geojson"] },
"note": "Merged into People's United Bank (Q7165802) in 2019, see https://en.wikipedia.org/wiki/United_Financial_Bancorp",
"tags": {
}
},
```
&nbsp;
#### Identical names, multiple brands
Sometimes multiple brands use the same name - this is okay!
Make sure each entry has a distinct `locationSet`, and the index will generate unique identifiers for each one.
You should also give each entry a unique `displayName`, so everyone can tell them apart.
```js
{
"displayName": "Price Chopper (Kansas City)",
"id": "pricechopper-9554e9",
"locationSet": { "include": ["price_chopper_ks_mo.geojson"] },
"tags": {
"brand": "Price Chopper",
"brand:wikidata": "Q7242572",
"brand:wikipedia": "en:Price Chopper (supermarket)",
"name": "Price Chopper",
"shop": "supermarket"
}
},
{
"displayName": "Price Chopper (New York)",
"id": "pricechopper-f86a3e",
"locationSet": { "include": ["price_chopper_ny.geojson"] },
"tags": {
"brand": "Price Chopper",
"brand:wikidata": "Q7242574",
"brand:wikipedia": "en:Price Chopper Supermarkets",
"name": "Price Chopper",
"shop": "supermarket"
}
},
```
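If you want to double-check a category before building, something like the sketch below can flag items that share both a `name` and a `locationSet`. `findIndistinctItems` is a hypothetical helper, and comparing `locationSet` values with `JSON.stringify` is a simplification that treats differently ordered but equivalent sets as distinct.

```js
// Hypothetical check: report pairs of items with the same `name` tag and an
// identical locationSet (compared as JSON text, so array order matters).
function findIndistinctItems(items) {
  const seen = new Map();
  const problems = [];
  for (const item of items) {
    const key = `${item.tags.name}|${JSON.stringify(item.locationSet)}`;
    if (seen.has(key)) {
      problems.push([seen.get(key).displayName, item.displayName]);
    } else {
      seen.set(key, item);
    }
  }
  return problems;
}

const priceChoppers = [
  {
    displayName: 'Price Chopper (Kansas City)',
    locationSet: { include: ['price_chopper_ks_mo.geojson'] },
    tags: { name: 'Price Chopper', shop: 'supermarket' }
  },
  {
    displayName: 'Price Chopper (New York)',
    locationSet: { include: ['price_chopper_ny.geojson'] },
    tags: { name: 'Price Chopper', shop: 'supermarket' }
  }
];
console.log(findIndistinctItems(priceChoppers).length); // 0 - the locationSets differ
```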
&nbsp;
## Features
These are optional `.geojson` files found under the `features/` folder. Each feature file must contain a single GeoJSON `Feature` for a region where a brand is active. Only `Polygon` and `MultiPolygon` geometries are supported.
Feature files look like this:
```js
{
"type": "Feature",
"id": "new_jersey.geojson",
"properties": {},
"geometry": {
"type": "Polygon",
"coordinates": […]
}
}
```
Note: A `FeatureCollection` containing a single `Feature` is ok too - the build script can handle this.
The build script will automatically generate an `id` property to match the filename.
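The rules above can be summarized in a short sketch. This is not the build script's code. `normalizeFeature` is a hypothetical function that unwraps a single-feature `FeatureCollection`, checks the geometry type, and sets the `id` from a filename you pass in.

```js
// Illustrative only: mirrors the rules described above under stated assumptions.
function normalizeFeature(geojson, filename) {
  let feature = geojson;
  if (geojson.type === 'FeatureCollection') {
    if (!Array.isArray(geojson.features) || geojson.features.length !== 1) {
      throw new Error('FeatureCollection must contain exactly one Feature');
    }
    feature = geojson.features[0];
  }
  if (feature.type !== 'Feature') {
    throw new Error('Expected a GeoJSON Feature');
  }
  const geometryType = feature.geometry && feature.geometry.type;
  if (geometryType !== 'Polygon' && geometryType !== 'MultiPolygon') {
    throw new Error(`Unsupported geometry: ${geometryType}`);
  }
  // The id is made to match the filename, e.g. "new_jersey.geojson".
  return { ...feature, id: filename };
}

const collection = {
  type: 'FeatureCollection',
  features: [{
    type: 'Feature',
    properties: {},
    geometry: { type: 'Polygon', coordinates: [[[0, 0], [1, 0], [1, 1], [0, 0]]] }
  }]
};
console.log(normalizeFeature(collection, 'example.geojson').id);
```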
👉 GeoJSON Protips:
* There are many online tools to create or modify `.geojson` files.
* You can draw and edit GeoJSON polygons with [geojson.io](http://geojson.io) - (Editing MultiPolygons does not work in drawing mode, but you can edit the code directly).
* You can simplify GeoJSON files with [mapshaper.org](https://mapshaper.org/)
* [More than you ever wanted to know about GeoJSON](https://macwright.org/2015/03/23/geojson-second-bite.html)
<!-- &nbsp;
## What you can do
### :building_construction: &nbsp; Building the project
To rebuild the index, run:
* `npm run build`
This will output a lot of warnings, which you can help fix!
-->
<!--
### :hocho: &nbsp; Remove generic names
Some of the common names in the index might not actually be brand names. We want to remove these
generic words from the index, so they are not suggested to mappers.
For example, "Универмаг" is just a Russian word for "Department store":
```js
"path": "brands/shop/department_store",
"items": [
{
"displayName": "Универмаг",
"id": "универмаг-d5eaac",
"locationSet": { "include": ["ru"] },
"tags": {
"brand": "Универмаг",
"name": "Универмаг",
"shop": "department_store"
}
},
```
To remove this generic name:
1. Delete the item from the appropriate file, in this case `data/brands/shop/department_store.json`
2. Edit `config/genericWords.json`. Add a regular expression matching the generic name.
3. Run `npm run build` - if the filter is working, the name will not be put back into `data/brands/shop/department_store.json`
4. `git diff` - to make sure that the items you wanted to discard are gone (and no others are affected)
5. If all looks ok, submit a pull request with your changes.
-->
&nbsp;
### :female_detective: &nbsp; Add wiki tags
Contributing `*:wikipedia` and `*:wikidata` tags is a very useful task that anybody can help with.
#### Example #1 - Worldwide / English brands...
1. Find an entry in a brand file that is missing these tags:
In `brands/amenity/fast_food.json`:
```js
{
"displayName": "Chipotle",
"id": "chipotle-658eea",
"locationSet": { "include": ["us"] },
"matchNames": ["chipotle mexican grill"],
"tags": {
"amenity": "fast_food",
"brand": "Chipotle",
"cuisine": "mexican",
"name": "Chipotle"
}
},
```
2. Google for that brand - if you are lucky, you might find the Wikipedia page right away.
<img width="600px" alt="Google for Chipotle" src="https://raw.githubusercontent.com/osmlab/name-suggestion-index/main/docs/img/chipotle_1.png"/>
3. From the Wikipedia page URL, you can identify the `brand:wikipedia` value.
OpenStreetMap expects this tag to be formatted like `"en:Chipotle Mexican Grill"`.
* Copy the page name from the URL.
* Add the language prefix - "en:" for the English Wikipedia.
* Replace the underscores '\_' with spaces.
On the brand's Wikipedia page, you can also find its "Wikidata item" link. This appears
under the "tools" menu in the sidebar.
:point_right: protip: [@maxerickson] has created a user script to make copying these values even easier - see [#1881]
[#1881]: https://github.com/osmlab/name-suggestion-index/issues/1881
[@maxerickson]: https://github.com/maxerickson
<img width="600px" alt="Chipotle Wikipedia" src="https://raw.githubusercontent.com/osmlab/name-suggestion-index/main/docs/img/chipotle_2.png"/>
4. On the brand's Wikidata page, you can identify the `brand:wikidata` value. It is a code starting with 'Q' and several numbers.
<img width="600px" alt="Chipotle Wikidata" src="https://raw.githubusercontent.com/osmlab/name-suggestion-index/main/docs/img/chipotle_3.png"/>
5. Update the brand file, in this case `brands/amenity/fast_food.json`:
We can add the `"brand:wikipedia"` and `"brand:wikidata"` tags.
```js
{
"displayName": "Chipotle",
"id": "chipotle-658eea",
"locationSet": { "include": ["us"] },
"matchNames": ["chipotle mexican grill"],
"tags": {
"amenity": "fast_food",
"brand": "Chipotle",
"brand:wikidata": "Q465751", // added
"brand:wikipedia": "en:Chipotle Mexican Grill", // added
"cuisine": "mexican",
"name": "Chipotle"
}
},
```
_(comments added for clarity)_
6. Rebuild and submit a pull request.
* Run `npm run build`
* If it does not fail with an error, you can submit a pull request with your changes (warnings are OK).
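The URL-to-tag steps from this example (copy the page name, add the language prefix, replace underscores with spaces) can be sketched in a few lines. `wikipediaTagFromUrl` and `isQid` are hypothetical helpers written for this guide. They assume a desktop Wikipedia URL of the form `https://<lang>.wikipedia.org/wiki/<Page_name>`.

```js
// Convert a Wikipedia page URL into a `*:wikipedia` tag value.
function wikipediaTagFromUrl(url) {
  const { hostname, pathname } = new URL(url);
  const match = hostname.match(/^([a-z-]+)\.wikipedia\.org$/);
  if (!match || !pathname.startsWith('/wiki/')) {
    throw new Error(`Not a Wikipedia article URL: ${url}`);
  }
  const title = decodeURIComponent(pathname.slice('/wiki/'.length))
    .replace(/_/g, ' ');              // underscores become spaces
  return `${match[1]}:${title}`;      // language prefix, e.g. "en:"
}

// A Wikidata QID is the letter Q followed by digits.
function isQid(value) {
  return /^Q\d+$/.test(value);
}

console.log(wikipediaTagFromUrl('https://en.wikipedia.org/wiki/Chipotle_Mexican_Grill'));
// → en:Chipotle Mexican Grill
console.log(isQid('Q465751')); // → true
```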
&nbsp;
#### Example #2 - Regional / non-English brands...
This example uses a brand "かっぱ寿司". I don't know what that is, so I will do some research.
1. Find an entry in a brand file that is missing these tags:
In `brands/amenity/fast_food.json`:
```js
{
"displayName": "かっぱ寿司",
"id": "e7198e-3e7699",
"locationSet": {"include": ["jp"]},
"tags": {
"amenity": "fast_food",
"brand": "かっぱ寿司",
"name": "かっぱ寿司"
}
},
```
2. Google for that brand - if you are lucky, you might find the Wikipedia page right away.
Tip: You might want to narrow your search by Googling with a `site:` filter: `"かっぱ寿司 site:ja.wikipedia.org"`
From these results, we can know that the brand is "Kappazushi", owned by a Japanese company
called "Kappa Create". We can also find the Wikipedia page.
<img width="600px" alt="Google for かっぱ寿司" src="https://raw.githubusercontent.com/osmlab/name-suggestion-index/main/docs/img/kappa_1.png"/>
3. As with English brands, you can identify the `brand:wikipedia` value from the URL.
Because this is a Japanese brand, we will link to the Japanese Wikipedia page.
OpenStreetMap expects this tag to be formatted like `"ja:かっぱ寿司"`.
* Copy the page name from the URL.
* Add the language prefix "ja:".
* Replace the underscores "\_" with spaces.
Although I cannot read Japanese, I can identify the "Wikidata item" link because
it always appears in the sidebar and mouseover will show the Wikidata 'Q' code in the URL.
<img width="600px" alt="Kappa Sushi Wikipedia" src="https://raw.githubusercontent.com/osmlab/name-suggestion-index/main/docs/img/kappa_3.png"/>
4. On the brand's Wikidata page, you can identify the `brand:wikidata` value. It is a code starting with 'Q' and several numbers.
Note: The Wikidata page looks a bit sparse - you can edit this too if you want to help!
<img width="600px" alt="Kappa Sushi Wikidata" src="https://raw.githubusercontent.com/osmlab/name-suggestion-index/main/docs/img/kappa_4.png"/>
5. Update the brand file, in this case `brands/amenity/fast_food.json`:
We can add:
* `"brand:en"` and `"name:en"` tags to contain the English name "Kappazushi"
* `"name:ja"` and `"brand:ja"` tags to contain the local name "かっぱ寿司"
* `"brand:wikipedia"` and `"brand:wikidata"` tags
* `"cuisine": "sushi"` OpenStreetMap tag
* Also check the `"locationSet"` property to make sure it is accurate.
```js
{
"displayName": "かっぱ寿司",
"id": "kappazushi-3e7699",
"locationSet": {"include": ["jp"]},
"tags": {
"amenity": "fast_food",
"brand": "かっぱ寿司",
"brand:en": "Kappazushi", // added
"brand:ja": "かっぱ寿司", // added
"brand:wikipedia": "ja:かっぱ寿司", // added
"brand:wikidata": "Q11263916", // added
"cuisine": "sushi", // added
"name": "かっぱ寿司",
"name:en": "Kappazushi", // added
"name:ja": "かっぱ寿司" // added
}
},
```
_(comments added for clarity)_
6. Rebuild and submit a pull request.
* Run `npm run build`
* If it does not fail with an error, you can submit a pull request with your changes (warnings are OK).
&nbsp;
## :convenience_store: &nbsp; Add missing brands
If it exists, we want to know about it!
Some brands haven't been mapped enough on OpenStreetMap (50+ times) to be automatically added to the index.
If it is a notable brand, you can add it manually to establish a preferred tagging.
1. Before adding a new brand, the minimum information you should know is the correct tagging required for instances of the brand (`name`, `brand` and what it is - e.g. `amenity=fast_food`). Ideally you also have `brand:wikidata` and `brand:wikipedia` tags for the brand and any other appropriate tags - e.g. `cuisine`.
2. Add your new entry anywhere into the appropriate file under `data/**/*` (the files will be sorted alphabetically later) and using the `"tags"` key add all appropriate OSM tags. Refer to [here](#card_file_box--about-the-data-files) if you're not familiar with the syntax.
3. If the brand only has locations in a known set of countries, list them in the `include` array of the `"locationSet"` property. This takes [ISO 3166-1 alpha-2 country codes](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2) in lowercase (e.g. `{"include": ["de", "at", "nl"]}`).
4. If instances of this brand are commonly mistagged add the `"matchNames": []` key to list these. Again, refer to [here](#card_file_box--about-the-data-files) for syntax.
5. Run `npm run build`
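As a rough sketch of what a hand-written entry looks like before the build adds its `id`, the helper below assembles an item from the pieces listed in the steps above. `newBrandItem` and the brand "Example Burger" are both made up for illustration. Check the real tagging and Wikidata values for any brand you add.

```js
// Hypothetical helper: assemble a new item. Note there is no `id` property -
// `npm run build` generates it.
function newBrandItem({ name, key, value, countries = ['001'], extraTags = {} }) {
  return {
    displayName: name,
    locationSet: { include: countries.map((c) => c.toLowerCase()) },
    tags: { ...extraTags, [key]: value, brand: name, name: name }
  };
}

// "Example Burger" is a placeholder, not a real entry in the index.
const item = newBrandItem({
  name: 'Example Burger',
  key: 'amenity',
  value: 'fast_food',
  countries: ['DE', 'AT'],
  extraTags: { cuisine: 'burger' }
});
console.log(JSON.stringify(item, null, 2));
```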
&nbsp;
### Using Overpass Turbo
Sometimes you might want to know the locations where a brand name exists in OpenStreetMap.
Overpass Turbo can show them on a map:
1. Go to https://overpass-turbo.eu/
2. Enter your query like this, replacing the `name` and other OpenStreetMap tags.
Because we don't specify a bounding box, this will perform a global query.
```
nwr["name"="かっぱ寿司"]["amenity"="fast_food"];
out center;
```
Tip: The browsable index at https://nsi.guide/ can open Overpass Turbo with the query already set up for you.
3. Click run to view the results.
As expected, the "かっぱ寿司" (Kappazushi) locations are all concentrated in Japan.
<img width="600px" alt="Overpass search for かっぱ寿司" src="https://raw.githubusercontent.com/osmlab/name-suggestion-index/main/docs/img/overpass.png"/>
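If you run many of these searches, the query text can be generated from a set of tags. The sketch below builds the same kind of global query shown in step 2. `overpassQuery` is a hypothetical helper, and the escaping it applies (backslashes and double quotes) is an assumption about what is needed for typical tag values.

```js
// Build a global Overpass QL query that matches all of the given tags.
function overpassQuery(tags) {
  const escape = (s) => String(s).replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  const filters = Object.entries(tags)
    .map(([k, v]) => `["${escape(k)}"="${escape(v)}"]`)
    .join('');
  return `nwr${filters};\nout center;`;
}

console.log(overpassQuery({ name: 'かっぱ寿司', amenity: 'fast_food' }));
```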
&nbsp;
## :memo: &nbsp; Edit Wikidata
Editing brand pages on Wikidata is something that anybody can do. It helps not just our project, but anybody who uses this data for other purposes too! You can read more about contributing to Wikidata [here](https://www.wikidata.org/wiki/Wikidata:Contribute).
- Add Wikidata pages for items that don't yet have them.
- Improve the labels and descriptions on the Wikidata pages.
- Translate the labels and descriptions to more languages.
- Add social media accounts under the "Identifiers" section. If a brand or organization has a Facebook or Twitter account, we can fetch its logo automatically.
- Add the NSI identifier (P8253). This should be a short string like "chipotle-658eea", not a URL.
Tip: The browsable index at https://nsi.guide/ can show you where the Wikidata information is missing or incomplete.
### Adding properties to Wikidata
Social media accounts may be used to automatically fetch logos, which are used by the iD Editor.
<img width="800px" alt="Adding information on Wikidata" src="https://raw.githubusercontent.com/osmlab/name-suggestion-index/main/docs/img/wikidata.gif"/>
Social media links are often displayed on the official web site of a brand, making them easy to find. When adding an entry for a social media account, it might be worth checking if that account has a "verified badge" which indicates a verified social media account, and if it does, this can be added via the "add qualifier" option, using "has quality" along with either "verified account" or "verified badge".
<img width="730px" alt="Checking Twitter references in Wikidata" src="https://raw.githubusercontent.com/osmlab/name-suggestion-index/main/docs/img/wikidata-applebees-twitter.png"/>
### Adding references to Wikidata
Wikidata pages without a matching Wikipedia article should have some additional references by independent sources. For our purposes, the easiest one to add is usually something in the form of "this shop brand had N shops on some specific date".
<img width="800px" alt="Adding references on Wikidata" src="https://raw.githubusercontent.com/osmlab/name-suggestion-index/main/docs/img/wikidata_references.gif"/>
<!--See https://www.wikidata.org/w/index.php?title=Wikidata:Administrators%27_noticeboard&oldid=941582891#Entries_that_should_be_now_fixed for discussion on Wikidata-->
If adding a general reference, you can use "described at URL" (P973) as a top-level claim. The ideal would be links to articles in major newspapers that are primarily about the brand in question. Press releases and articles guest-written by the CEO are not as good.
### Creating Wikidata pages
For minor brands there may be no Wikipedia article and it may be [impossible](https://en.wikipedia.org/wiki/Wikipedia:Notability) to create one. In such cases one may still go to [Wikidata](https://www.wikidata.org) and select "[Create a new item](https://www.wikidata.org/wiki/Special:NewItem)" in the menu. For such entries it is mandatory to add external identifiers and references in order to comply with Wikidata's notability policies (see section above with animation showing how it can be done).
[@bhousel]: https://github.com/bhousel
[@1ec5]: https://github.com/1ec5
[@tas50]: https://github.com/tas50


@ -1,91 +0,0 @@
# Info for Developers
This file contains useful information for developers who want to use the name-suggestion-index in another project.
- [Distributed files](#distributed-files)
- [Downloading the data](#downloading-the-data)
- [API Reference](#api-reference)
## Distributed Files
The files under `dist/*` are generated:
- `nsi.json` - The complete index
- `dissolved.json` - List of items that we believe may be dissolved based on Wikidata claims
- `featureCollection.json` - A GeoJSON FeatureCollection containing all the custom features (geofences)
- `taginfo.json` - List of all tags this project supports (see: https://taginfo.openstreetmap.org/)
- `wikidata.json` - Cached data retrieved from Wikidata (names, social accounts, logos)
- `collected/*` - Frequently occurring name tags collected from OpenStreetMap
- `filtered/*` - Subset of name tags that we are keeping or discarding
- `presets/*` - Preset files generated for iD and JOSM editors
These files from the `config/` folder are also copied over to the `dist/` folder:
- `genericWords.json` - Regular expressions to match generic names (e.g. "store", "noname")
- `matchGroups.json` - Groups of OpenStreetMap tags that are considered equivalent for purposes of matching
- `replacements.json` - Mapping of old Wikidata QIDs to replacement new Wikidata/Wikipedia values
- `trees.json` - Metadata about subtrees supported in this project
Each file is available in both regular `.json` or minified `.min.json` format.
### Metadata
Each JSON file contains a block of metadata like:
```js
"_meta": {
"version": "5.0.20210315",
"generated": "2021-03-15T18:21:03.025Z",
"url": "https://raw.githubusercontent.com/osmlab/name-suggestion-index/main/dist/featureCollection.json",
"hash": "c215297c0b7292e4c2c3033ec534d411"
}
```
- `version` - the semantic version of project when the file was generated:
`major.minor.patch` where patch is the date in `yyyymmdd` format
- `generated` - the date that the file was generated
- `url` - source url where the file is available
- `hash` - MD5 hash of the file
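Since the patch part of `version` encodes a date, downstream code can recover it with a small parser. The sketch below is one way to do that. `parseNsiVersion` is a hypothetical helper and assumes the `major.minor.yyyymmdd` shape described above.

```js
// Split a version like "5.0.20210315" into its parts.
function parseNsiVersion(version) {
  const match = /^(\d+)\.(\d+)\.(\d{4})(\d{2})(\d{2})$/.exec(version);
  if (!match) {
    throw new Error(`Unexpected version format: ${version}`);
  }
  const [, major, minor, yyyy, mm, dd] = match;
  return {
    major: Number(major),
    minor: Number(minor),
    date: `${yyyy}-${mm}-${dd}`   // the patch component, as an ISO date
  };
}

console.log(parseNsiVersion('5.0.20210315'));
// → { major: 5, minor: 0, date: '2021-03-15' }
```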
## Downloading the data
You can download the files from the index directly from GitHub or use a CDN.
### Latest published release (stable forever):
Direct from GitHub <sub><sup>([docs](https://stackoverflow.com/questions/39065921/what-do-raw-githubusercontent-com-urls-represent))</sup></sub>:
```js
https://raw.githubusercontent.com/osmlab/name-suggestion-index/{branch or tag}/{path to file}
https://raw.githubusercontent.com/osmlab/name-suggestion-index/v5.0.20210315/dist/name-suggestions.presets.min.xml
```
Via JSDelivr CDN <sub><sup>([docs](https://www.jsdelivr.com/))</sup></sub>:
```js
https://cdn.jsdelivr.net/npm/name-suggestion-index@{semver}/{path to file}
https://cdn.jsdelivr.net/npm/name-suggestion-index@5.0/dist/name-suggestions.presets.min.xml
```
### Current development version (breaks sometimes!):
Direct from GitHub <sub><sup>([docs](https://stackoverflow.com/questions/39065921/what-do-raw-githubusercontent-com-urls-represent))</sup></sub>:
```js
https://raw.githubusercontent.com/osmlab/name-suggestion-index/{branch or tag}/{path to file}
https://raw.githubusercontent.com/osmlab/name-suggestion-index/main/dist/presets/nsi-josm-presets.min.xml
```
Via JSDelivr CDN <sub><sup>([docs](https://www.jsdelivr.com/?docs=gh))</sup></sub>:
```js
https://cdn.jsdelivr.net/gh/osmlab/name-suggestion-index@{branch or tag}/{path to file}
https://cdn.jsdelivr.net/gh/osmlab/name-suggestion-index@main/dist/presets/nsi-josm-presets.min.xml
```
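The URL patterns above can be wrapped in small helpers so the branch, tag, or version lives in one place. This is only a sketch. The function names are made up for this example, and the output simply follows the patterns shown in this section.

```js
const REPO = 'osmlab/name-suggestion-index';

// Direct from GitHub, for a branch or tag.
const githubRawUrl = (ref, path) =>
  `https://raw.githubusercontent.com/${REPO}/${ref}/${path}`;

// JSDelivr, published NPM release (semver or semver range).
const jsdelivrNpmUrl = (semver, path) =>
  `https://cdn.jsdelivr.net/npm/name-suggestion-index@${semver}/${path}`;

// JSDelivr, straight from the GitHub repository (branch or tag).
const jsdelivrGhUrl = (ref, path) =>
  `https://cdn.jsdelivr.net/gh/${REPO}@${ref}/${path}`;

console.log(githubRawUrl('main', 'dist/presets/nsi-josm-presets.min.xml'));
console.log(jsdelivrGhUrl('main', 'dist/presets/nsi-josm-presets.min.xml'));
```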
## API Reference
Some of the JavaScript code is available in both ES6 module (.mjs) and CommonJS (.js) formats.
More info soon.


@ -1,112 +0,0 @@
# Info for Maintainers
This file contains useful information for maintainers.
You don't need to know any of this if you just want to contribute to the index!
- [Prerequisites](#prerequisites)
- [Project Setup](#project-setup)
- [Building the index](#building-the-index)
- [Syncing with Wikidata](#syncing-with-wikidata)
- [Releasing](#releasing)
- [Building nsi.guide](#building-nsiguide)
- [Other commands](#other-commands)
- [Collecting names from the OSM planet](#collecting-names-from-the-osm-planet)
## Prerequisites
- [Node.js](https://nodejs.org/) version 10 or newer
- [`git`](https://www.atlassian.com/git/tutorials/install-git/) for your platform
## Project Setup
#### Installing
- Clone this project, for example:
`git clone git@github.com:osmlab/name-suggestion-index.git`
- `cd` into the project folder,
- Run `npm install` to install libraries
#### Updates
- `git pull origin --rebase` is a good way to keep your local copy of the code updated
- rerun `npm install` whenever dependencies are updated in `package.json`
## Building the index
- `npm run build`
- Takes a few seconds and should be run whenever the `data/*` or `config/*` files change
- Processes custom locations under `features/**/*.geojson` into `dist/featureCollection.json`
- Sorts `dist/collected/*` name lists into `dist/filtered/*` "keep" and "discard" name lists
- Merges new items found in the "keep" lists into the `data/*` files
- Generates ids
- Outputs warnings to suggest updates to `data/**/*.json`
- Make sure to check in code when done, with something like `git add . && git commit -m 'npm run build'`
## Syncing with Wikidata
- `npm run wikidata`
- Takes about 15 minutes and should be run occasionally to keep NSI in sync with Wikidata
- Fetches related Wikidata names, descriptions, logos, then updates `dist/wikidata.json`
- Updates the Wikidata pages to contain the current NSI identifiers
- Outputs warnings to suggest fixes on Wikidata for missing social accounts, or other common errors
- Make sure to check in code when done, with something like `git add . && git commit -m 'npm run wikidata'`
- (We may try to automate more of this eventually)
## Releasing
- `npm run dist`
- Takes a few seconds and builds all the files in `dist/*`
- The semantic version number of the project is updated automatically:
`major.minor.patch` where patch is the date in `yyyymmdd` format
- Rebuilds iD and JOSM presets, taginfo file, other output files
- Should be run whenever the index is in a good state (the build and the Wikidata sync have both completed successfully)
- Make sure to check in code when done, with something like `git add . && git commit -m 'npm run dist'`
- Projects which pull NSI data from GitHub (such as <https://nsi.guide/>) will appear updated soon after `npm run dist`
- Other downstream projects may pull from `dist/*` too
To publish an official release, follow the steps in [RELEASE.md](RELEASE.md).
- Official releases are stable forever and available via NPM or on CDNs like JSDelivr
- Projects which pull name-suggestion-index from NPM or a CDN (such as iD) will appear updated soon after publishing
- Publishing the code to NPM requires rights to run `npm publish`
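The date-based patch number described above can be sketched as follows. This is an illustration only: `nextVersion` is a hypothetical function, the use of UTC is an assumption, and the real `npm run dist` script may compute the version differently.

```js
// Hypothetical sketch: keep major.minor and set patch to the date as yyyymmdd.
// Assumes UTC; the actual dist script may use a different clock or logic.
function nextVersion(currentVersion, date = new Date()) {
  const [major, minor] = currentVersion.split('.');
  const yyyy = date.getUTCFullYear();
  const mm = String(date.getUTCMonth() + 1).padStart(2, '0');
  const dd = String(date.getUTCDate()).padStart(2, '0');
  return `${major}.${minor}.${yyyy}${mm}${dd}`;
}

// The starting version here is made up for the example.
console.log(nextVersion('5.0.20210301', new Date(Date.UTC(2021, 2, 19))));
// prints "5.0.20210319"
```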
## Building nsi.guide
<https://nsi.guide/> is a web application written in ReactJS that lets anyone browse the index.
- `npm run appbuild`
- Rebuilds the ReactJS code for <https://nsi.guide/>
- The source code for this app can be found under `app/*`
- You only need to rebuild this when the app code changes, not when the index changes
## Other commands
- `npm run lint` - Checks the JavaScript code for correctness
- `npm run test` - Runs tests against the JavaScript code
- `npm run` - Lists other available commands
## Collecting names from the OSM planet
This takes a long time and a lot of disk space. It can be done occasionally by project maintainers.
- Install `osmium` command-line tool and node package (may only be available on some environments)
- `apt-get install osmium-tool` or `brew install osmium-tool` or similar
- `npm install --no-save osmium`
- [Download the planet](http://planet.osm.org/pbf/)
- `curl -L -o planet-latest.osm.pbf https://planet.openstreetmap.org/pbf/planet-latest.osm.pbf`
- Prefilter the planet file to only include named items with keys we are looking for:
- `osmium tags-filter planet-latest.osm.pbf -R name,brand,operator,network -o filtered.osm.pbf`
- Run `node scripts/collect_all.js /path/to/filtered.osm.pbf`
- results will go in `dist/collected/*.json`
- A new challenge:
- Attempt an `npm run build`. Now that unique `id` properties are generated, it is possible that this command will fail.
- This can happen if there are *multiple* new items that end up with the same `id` (e.g. "MetroBus" vs "Metrobus")
- You'll need to just pick one to keep, then keep trying to run `npm run build` until the duplicate `id` issues are gone.
- `git add . && git commit -m 'Collected common names from latest planet'`
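To spot likely `id` collisions before running the build, names that differ only by letter case can be grouped together. This is a rough sketch under a stated assumption: lowercasing the name stands in for the real `id` generation, which is not shown here, and `findIdCollisions` is a hypothetical function.

```js
// Hypothetical sketch: group names that would collide if ids ignored case.
// ASSUMPTION: lowercasing approximates the real id generation, which may differ.
function findIdCollisions(names) {
  const groups = new Map();
  for (const name of names) {
    const key = name.toLowerCase();
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(name);
  }
  // Keep only the groups with more than one spelling.
  return [...groups.values()].filter(group => group.length > 1);
}

console.log(findIdCollisions(['MetroBus', 'Metrobus', 'Metrolink']));
// prints [ [ 'MetroBus', 'Metrobus' ] ]
```

Each returned group lists the spellings to choose between when picking the one item to keep.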


@ -1,7 +1,7 @@
[![build](https://github.com/osmlab/name-suggestion-index/workflows/build/badge.svg)](https://github.com/osmlab/name-suggestion-index/actions?query=workflow%3A%22build%22)
[![npm version](https://badge.fury.io/js/name-suggestion-index.svg)](https://badge.fury.io/js/name-suggestion-index)
# name-suggestion-index (aka "NSI")
# name-suggestion-index ("NSI")
Canonical features for OpenStreetMap
@ -11,7 +11,10 @@ Canonical features for OpenStreetMap
The goal of this project is to maintain a [canonical](https://en.wikipedia.org/wiki/Canonicalization)
list of commonly used features for suggesting consistent spelling and tagging in OpenStreetMap.
[Watch the video](https://2019.stateofthemap.us/program/sat/mapping-brands-with-the-name-suggestion-index.html) from our talk at State of the Map US 2019 to learn more about this project!
> <br></br>
> 👉 &nbsp; [Watch the video](https://2019.stateofthemap.us/program/sat/mapping-brands-with-the-name-suggestion-index.html)
from our talk at State of the Map US 2019 to learn more about this project!<br></br>
> <br></br>
## Browse the index
@ -48,50 +51,24 @@ Currently used in:
## About the index
You can learn more from these pages:
- <https://nsi.guide/> - Browse and search all the data
- [CONTRIBUTING.md](CONTRIBUTING.md) - How to contribute data about brands, transit, and other features to this index
- [DEVELOPING.md](DEVELOPING.md) - If you are a developer and want to use the name-suggestion-index in your project
- [MAINTAINING.md](MAINTAINING.md) - How to set up and build the index, sync with Wikidata, and make releases
### Source files (edit these):
The files under `config/*`, `data/*`, and `features/*` may be edited:
- `data/*` - Data files for each feature category, organized by topic and OpenStreetMap tag
- `brands/**/*.json`
- `flags/**/*.json`
- `operators/**/*.json`
- `transit/**/*.json`
- `features/*` - GeoJSON files that define custom regions (aka [geofences](https://en.wikipedia.org/wiki/Geo-fence))
- `us/new_jersey.geojson`
- `ca/quebec.geojson`
- and so on…
- `config/*`
- `genericWords.json` - Regular expressions to match generic names (e.g. "store", "noname")
- `matchGroups.json` - Groups of OpenStreetMap tags that are considered equivalent for purposes of matching
- `replacements.json` - Mapping of old Wikidata QIDs to replacement new Wikidata/Wikipedia values
- `trees.json` - Metadata about subtrees supported in this project
### Generated files (do not edit):
The files under `dist/*` are generated.
See [DEVELOPING.md](DEVELOPING.md) for info about the generated files.
See [the project wiki](https://github.com/osmlab/name-suggestion-index/wiki) for details.
## Participate!
- Read the project [Code of Conduct](CODE_OF_CONDUCT.md) and remember to be nice to one another.
- See [CONTRIBUTING.md](CONTRIBUTING.md) for info about how to contribute to this index.
We're always looking for help!
We're always looking for help! If you have any questions or want to reach out to a maintainer,
ping `bhousel`, `1ec5`, or `tas50` on:
- Read [the Code of Conduct](CODE_OF_CONDUCT.md) and remember to be kind to one another.
- See [the project wiki](https://github.com/osmlab/name-suggestion-index/wiki) for info about how to contribute to this index.
If you have any questions or want to reach out to a maintainer, ping
[@bhousel][@bhousel], [@1ec5][@1ec5], or [@tas50][@tas50] on:
- [OpenStreetMap US Slack](https://slack.openstreetmap.us/) (`#poi` or `#general` channels)
[@bhousel]: https://github.com/bhousel
[@1ec5]: https://github.com/1ec5
[@tas50]: https://github.com/tas50
## License


@ -1,13 +0,0 @@
## Release Checklist
### Update version, tag, and publish
- [ ] git checkout main
- [ ] git pull origin
- [ ] npm install
- [ ] npm run build
- [ ] npm run wikidata
- [ ] npm run dist _(version number updates automatically and will print to console)_
- [ ] git add . && git commit -m 'A.B.C'
- [ ] git tag A.B.C
- [ ] git push origin main A.B.C
- [ ] npm publish


@ -1,40 +0,0 @@
# Resolving Warnings
## :thinking: &nbsp; Resolve warnings
Warnings mean that you need to edit files under `data/brands/*`.
The warning output gives a clue about how to fix or suppress the warning.
If you aren't sure, just ask on GitHub!
&nbsp;
### Duplicate names
```
Warning - Potential duplicate:
------------------------------------------------------------------------------------------------------
If the items are two different businesses,
make sure they both have accurate locationSets (e.g. "us"/"ca") and wikidata identifiers.
If the items are duplicates of the same business,
add `matchTags`/`matchNames` properties to the item that you want to keep, and delete the unwanted item.
If the duplicate item is a generic word,
add a filter to config/genericWords.json and delete the unwanted item.
------------------------------------------------------------------------------------------------------
"shop/supermarket|Carrefour" -> duplicates? -> "amenity/fuel|Carrefour"
"shop/supermarket|VinMart" -> duplicates? -> "shop/department_store|VinMart"
```
_What it means:_ These names are commonly tagged differently in OpenStreetMap. This might be ok, but it might be a mistake.
For "VinMart" we really prefer for it to be tagged as a supermarket. It's a single brand frequently mistagged.
* Add `"matchTags": ["shop/department_store"]` to the (preferred) `"shop/supermarket|VinMart"` entry
* Delete the (not preferred) entry for `"shop/department_store|VinMart"`
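The two VinMart steps can be sketched in code. This is only an illustration: `mergeDuplicate` is a hypothetical function, and the object keyed by `"tag|name"` strings is an assumed shape borrowed from the warning output, not necessarily how the `data/brands/*` files are laid out.

```js
// Hypothetical sketch: fold a mistagged duplicate into the preferred item.
// ASSUMPTION: items are held in an object keyed like "shop/supermarket|VinMart".
function mergeDuplicate(items, keepKey, dropKey) {
  const keep = items[keepKey];
  if (!keep || !items[dropKey]) {
    throw new Error('Both keys must exist in items');
  }
  // The tag part of the dropped key, e.g. "shop/department_store".
  const dropTag = dropKey.split('|')[0];
  const matchTags = new Set(keep.matchTags || []);
  matchTags.add(dropTag);

  // Return a copy with matchTags added to the kept item and the duplicate removed.
  const result = { ...items, [keepKey]: { ...keep, matchTags: [...matchTags] } };
  delete result[dropKey];
  return result;
}

const items = {
  'shop/supermarket|VinMart': { tags: { brand: 'VinMart' } },
  'shop/department_store|VinMart': { tags: { brand: 'VinMart' } }
};
console.log(mergeDuplicate(items, 'shop/supermarket|VinMart', 'shop/department_store|VinMart'));
```

Using a `Set` keeps `matchTags` free of repeats if the same merge is applied more than once.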
For "Carrefour" we know that it can be both a supermarket and a fuel station. These are two different things.
* Make sure both items have a `brand:wikidata` tag and appropriate `locationSet`.
Existing tagging (you can compare counts in `dist/filtered/names_keep.json`), information at the relevant Wikipedia page or the company's website, and [OpenStreetMap Wiki tag documentation](https://wiki.openstreetmap.org/wiki/Map_Features) all help in deciding how to address duplicate warnings.
If the situation is unclear, one may contact the [local community](https://community.osm.be/) and ask for help.
&nbsp;