[routing] [strings] Add the street name to verbal turn instructions #3130
No reviewers
Labels
No labels
Accessibility
Accessibility
Address
Address
Android
Android
Android Auto
Android Auto
Android Automotive (AAOS)
Android Automotive (AAOS)
API
API
AppGallery
AppGallery
AppStore
AppStore
Battery and Performance
Battery and Performance
Blocker
Blocker
Bookmarks and Tracks
Bookmarks and Tracks
Borders
Borders
Bug
Bug
Build
Build
CarPlay
CarPlay
Classificator
Classificator
Community
Community
Core
Core
CrashReports
CrashReports
Cycling
Cycling
Desktop
Desktop
DevEx
DevEx
DevOps
DevOps
dev_sandbox
dev_sandbox
Directions
Directions
Documentation
Documentation
Downloader
Downloader
Drape
Drape
Driving
Driving
Duplicate
Duplicate
Editor
Editor
Elevation
Elevation
Enhancement
Enhancement
Epic
Epic
External Map Datasets
External Map Datasets
F-Droid
F-Droid
Fonts
Fonts
Frequently User Reported
Frequently User Reported
Fund
Fund
Generator
Generator
Good first issue
Good first issue
Google Play
Google Play
GPS
GPS
GSoC
GSoC
iCloud
iCloud
Icons
Icons
iOS
iOS
Legal
Legal
Linux Desktop
Linux Desktop
Linux packaging
Linux packaging
Linux Phone
Linux Phone
Mac OS
Mac OS
Map Data
Map Data
Metro
Metro
Navigation
Navigation
Need Feedback
Need Feedback
Night Mode
Night Mode
NLnet 2024-06-281
NLnet 2024-06-281
No Feature Parity
No Feature Parity
Opening Hours
Opening Hours
Outdoors
Outdoors
POI Info
POI Info
Privacy
Privacy
Public Transport
Public Transport
Raw Idea
Raw Idea
Refactoring
Refactoring
Regional
Regional
Regression
Regression
Releases
Releases
RoboTest
RoboTest
Route Planning
Route Planning
Routing
Routing
Ruler
Ruler
Search
Search
Security
Security
Styles
Styles
Tests
Tests
Track Recording
Track Recording
Translations
Translations
TTS
TTS
UI
UI
UX
UX
Walk Navigation
Walk Navigation
Watches
Watches
Web
Web
Wikipedia
Wikipedia
Windows
Windows
Won't fix
Won't fix
World Map
World Map
No milestone
No project
No assignees
4 participants
Due date
No due date set.
Dependencies
No dependencies set.
Reference: organicmaps/organicmaps-tmp#3130
Loading…
Add table
Reference in a new issue
No description provided.
Delete branch "verbal_streets"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
(See #3118 for previous discussion.)
Latest beta (FDroid) or debug APK available here to logged-in users (updated 2024-05-02): https://github.com/organicmaps/organicmaps/actions/runs/8934037651?pr=3130 -- this may overwrite/uninstall your existing Beta, back up your bookmarks!
Language Reviewers Read This
Proposal
We propose adding a new locale string for TTS sounds:
dist_direction_onto_street
-- it has four or five parameters in it,%1$s
%2$s
%3$s
%4$s
and optionally%5$s
. This string is used to combine strings into a complete sentence. So before we may have just announcedin 100 meters
+turn left
but now the phrase can be more complex:Turn
+in 100 meters
+left
+onto
+Main Street
. I've guessed what the appropriateonto
word should be, so please check that.Also, existing instructions like
Make a slight right turn.
orExit.
have been short commands at the end of sentences, but now they can sometimes be part of a larger phrase, so many languages will need to customize things. To handle that, you may optionally copy a "direction" string likemake_a_right_turn
ortake_the_1_exit
, add "_street" to the end of the string ID (creating a new string calledmake_a_right_turn_street
which you can see already exists in sound.txt for some languages) and modify it as desired to fit the new sentences. Any direction ending in_street
will override the regular direction when used as part of a street name announcement, somake_a_right_turn = Make a right turn.
can coexist withmake_a_right_turn_street = Turn right
and form a complete phrase likeIn 100 meters
+Turn right
+onto
+Main Street
without affecting the more basicIn 100 meters
+Make a right turn.
There are two more changes to allow language customization: splittings directional instructions in two, and custom directions for turns with exit numbers. Some languages like Dutch and Japanese have different sentence ordering like our earlier "Turn in 100 meters left onto Main Street" example. To allow this, any direction string can have a version named "_street_verb" which will pair with its "_street" version to create a full sentence. So in this example,
turn_left = Turn left
but we create new stringsturn_left_street = left
andturn_left_street_verb = Turn
so that we have two complete versions with and without street name announcements:Turn
+in 100 meters
+left
+onto
+Main Street
andIn 100 meters
+Turn left
+onto
+Main Street
. This new_street_verb
string is not only optional but its positional parameter%5$s
is also optional withindist_direction_onto_street
. Not all languages need or will have that fifth argument which can be placed anywhere (in this example, at the beginning of the string like%5$s %1$s %2$s %3$s %4$s
. Sorry for the odd ordering, it's how it has to be.)The exit number customization is for many languages which announce proper noun turns like "M4 Expressway" and "Main Street" differently from numbered exits like "Exit 12" or "Exit 13B". So there is a new optional string called
take_exit_number
which will override the direction andonto
strings when an exit number is about to be announced. So if we settake_exit_number = take exit
then the awkward "In 100 meters, exit onto 13B: Main Street, Downtown" becomes the more natural "In 100 meters, take exit 13B: Main Street, Downtown".Details
%1$s
will be replaced with the distance string, likein_500_meters
("In 500 meters")%2$s
will be replaced with the direction likemake_a_right_turn_street
ormake_a_right_turn
if unavailable ("Turn right") orexit
for motorways ortake_the_2_exit
for roundabouts. It can also be replaced by directions with*_street
keys likemake_a_right_turn_street
to support custom grammar without affecting regular non-street-announcing instructions, and can additionally be replaced bytake_exit_number
when an exit number is about to be announced.%3$s
will be replaced by theonto
string, or if an exit number is about to be announced it will be blank%4$s
will be replaced with the name/number of the street/exit as displayed onscreen, like "Main Street" for a road or "[123] M4: Main Street > London" for motorway exits. Some formatting is auto-applied to exit strings so they sound more natural like "123: M4, Main Street, London". However the exit and name appear at the top of the screen in Organic Maps is how they will be sent to your phone's TTS engine, so languages may be mixed, however I've been surprised by how well Google and Samsung handle mixed-language strings. Your testing is appreciated!%5$s
will be optionally be replaced with the*_street_verb
strings you create for your language. For example in Dutch we place this string at the beginning so that part of the directional instruction can come first.What we need from you
pref_tts_street_names_title
which is a toggle called "Announce Street Names" andpref_tts_street_names_description
which describes the toggle. When enabled, the TTS engine will speak the new "turn onto Main Street" or "take exit 123: London, City Center, Main Street" strings as configured above. When disabled, the TTS engine behavior will remain as it was ("in 500 feet, turn left" without any street name or fancy formatting.)RTL languages like Arabic, Persian, Hebrew, Kurdish, Pashto, Urdu:
It can be complex to combine RTL and LTR strings. Please take extra care reviewing and testing especially when the Beta is available to make sure the correct order is maintained. We can especially use your help writing test strings that match the C++ output and result in the correct pronunciation: it can be hard to edit and understand on a US keyboard without knowing the language!
Developer todo
onto=in
kuelekea
@ -1401,2 +1600,4 @@
zh-Hans = 取道第四个出口
zh-Hant = 取道第四個出口。
[take_the_4_exit_street]
I'm not familiar with Hindi, but is it ok here and everywhere in PR for hi ?
@ -1401,2 +1600,4 @@
zh-Hans = 取道第四个出口
zh-Hant = 取道第四個出口。
[take_the_4_exit_street]
Yes the
।
character is a period in Hindi, so that has been removed and is the only change.पचास फीट में। दायीं ओर मुड़ें। पर highway 123
is translated asIn fifty feet. Turn right. on highway 123
पचास फीट में दायीं ओर मुड़ें पर highway 123
is translated asturn right in fifty feet onto highway 123
We're fortunate that most TTS seems forgiving of the ASCII space character and C++ string concatenation: I checked that the TTS output seemed mostly sane for Japanese and Arabic despite massively different punctuation in those languages, and thanks to the magic of Unicode even right-to-left languages seem to have appended and pronounced their strings right to left despite the C++ appending things as if they were left to right.
@biodranik @AntonM030481 Do we agree with this approach? Translate "onto" and append street name to the end of voice instruction.
If yes, I can tune C++ part and it will be ready to merge.
@zyphlar Thanks for the clarifications. You don't need to close prev and open new PR after commit changes. Just make force push into your existing branch.
It will read whatever is currently provided visually in the "next turn" display, but we decided so far to simply add a new "onto" string rather than duplicate all directional strings to create full and proper phrasing at this time. So it will take [123]: [I 5] Seattle > City Center and say "in 4000 feet, exit onto one hundred twenty three: eye five: Seattle, City Center"
To do what you're suggesting (which I agree is the best ultimate answer) we would need all these new translations (we're already stretching things with i.e. Japanese where you can't just append prepositions left to right to make sense, it's ideally inverted like "main street (regarding), turn left") and we'd also need to pass the exit into the notification struct separately.
+1 on this. This feature is really great when your TTS engine is in the same language as the street names. Otherwise the TTS engine will simply butcher the street names and it will be very hard to understand.
Maybe we can at least detect if
and exchange this exit number with "onto".
So it will say "in 4000 feet, exit (one hundred twenty three:) onto eye five: Seattle, City Center"
I like that idea but if it's alright I'd like to deploy the smallest possible (acceptable to maintainers) change first and then work on the very real and good problem of localizing complex strings (for example I like the idea of an option to just speak numbers for people with TTS headaches, as well as more nuanced locale strings for languages with different sentence ordering, but that's a large "would be nice" change for my first contribution.)
My first concern right now is I don't know how to add a new preference setting. Could someone with more experience stub that out so I don't go messing around in places I've barely just learned to touch? (I don't have the ability to dev/test on iOS, for example.)
Thanks for your work!
It should be easy to programmatically remove the last dot (all unicode dots actually) in C++ depending on the setting.
@ -1,15 +1,16 @@
#include "routing/turns_sound_settings.hpp"
#include "routing/turns_tts_text.hpp"
Why do you need it?
What is wrong with std::replace?
Any progress?
@ -72,12 +72,14 @@ NotificationManager NotificationManager::CreateNotificationManagerForTesting(
}
Maybe change to:
So you don't have to write
GenerateTurnText(..., "")
if there isn't a next street (see below)@ -50,0 +63,4 @@
{
name.clear();
if (auto const & sh = ftypes::GetRoadShields(road.m_ref); !sh.empty())
I think what @biodranik meant is this:
is doing the same thing as boost's
replace_all
in this case, but it doesn't need the huge boost dependency.No progress yet, help appreciated
Seems all good in PT-BR
@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
In my opinion it would be better to replace
"w"
with "na" because it sounds better with street names and in this case is universal.How does it say?
In Turkish, we say: X Street onto turn right
And translation isn't looks good, looking for an alternative translation.
@MetehanOzyurek thank you for your feedback! Currently we are limited to the following format:
in_500_meters + " " + make_a_right_turn + " " + onto + " " + nextStreet
for example in English "In five hundred meters Turn right onto Main Street" or in Turkish "Beş yüz metre sonra Sağa dönün üzerine Kazım Karabekir Caddesi"I understand that many languages do not have this same sentence structure, unfortunately it will require redoing most of these navigational strings in every language and all of the navigational TTS code to support positional parameters like this:
Is there a temporary fix that will work for Turkish? Perhaps some words in the same order that make sense even if they sound computer-y? (This is also a problem in other languages, if no adequate solution can be found then maybe we do need to redo all the strings and logic.)
@zyphlar how other apps are pronouncing it, and which translations are used in other apps? E.g. in OSMand?
@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
This is updated in the latest, thank you for your feedback! See below for discussion on sentence ordering as well.
@biodranik OsmAnd doesn't attempt to pronounce street names at all (which is one thing that makes it hard to use IMO)
If we can't find a workaround for most of these languages, and we don't want to have a lot of redoing, we could create an "onto" or "turn" suffix for select languages that require it, and do an if/else in code for those cases. I guess it depends on which is more work for our translation teams. I've made a list of languages that need special attention in my top comment, it's about half a dozen.
You're totally wrong. Maybe you haven't used it for a while because it is in navigation settings.
@pm4rcin hmm well I can't get it to work on my phone with the Floater fake GPS. I can see how they might say things like "turn left and go _____" but I don't see a conjunction for "onto StreetName" https://github.com/osmandapp/OsmAnd/search?p=4&q=route_tshl
I've just tested it a moment ago and it works but I also can't find it in their strings. The string that switches the option is
016cfd8b44/OsmAnd/res/values/strings.xml (L3542)
@pm4rcin well however it is they do it, are you able to see what verbiage they announce it with, especially in one of the differently-ordered languages mentioned up top? (Is there an "on to" preposition? Do they always say the street name last even in those languages?)
@MetehanOzyurek I have tested in OSMand and it says something like
"Street name" enine sola den
in turkish. Could you confirm it's correct?What about also adding the ref keyword to street names. So for example when you turn to highway/trunk/primary etc. you'll know the road's number which is really useful and a good preparation for speaking the whole destination tag which is green in screenshot below like Magic Earth does?
Spoiler
@pm4rcin see above discussion, currently "street name" is whatever is displayed at the top of the screen to aid you in turning. So if it already displays
[123]: [I 5] Seattle > City Center
it will pronounce "in 4000 feet, exit onto one hundred twenty three: eye five: Seattle, City Center" -- if we want to fine tune what the "street name" is for turns, that can be a separate PR modifying the nextStreet logic.I'll test locally and see how it works in different places. In your example
123
is an exit number andI 5
is a road ref or the opposite? Or is it totally abstract?@pm4rcin yes Exit #123 onto highway I-5 with some destinations or streets. Again whatever is currently displayed in text is what we have to work with, I simply remove and replace some of the punctuation to make it more pronounceable. So your screenshot would most likely be pronounced "in 475 meters, exit onto A1, Lodz Gorzyczki" and if we want to change anything after "exit onto" that's a separate PR for modifying the nextStreet feature.
What about using fmt library (I guess it's in C++ standard now), where you can have positional arguments, like this:
So one only have to provide the grammar order (it's
{0} {1} {2} {3}
for English but{1} {2} {0} {3}
for Turkish) in the translation resource file and let other translation unmodified.For the beginning, let's collect here the list of how other navigation apps, including Apple and Google, are announcing street names, in different languages. I'm sure their engineers did a lot of work to make it convenient for users. It would be not very wise to invent a bicycle here.
No, OsmAnd translations are wrong. You can check from Google Maps but as you can't understand what Google says, I can listen instead of you:)
Correct translation should be like: Beş yüz metre sonra Kazım Karabekir Caddesi yönünde sağa dönün
English: After 500 meters, turn right onto Kazım Karabekir Street
@zyphlar I see, but it should be like Beş yüz metre sonra Kazım Karabekir Caddesi üzerine (btw, yönünde is better translation) sağa dönün
GMaps:
Take the interchange on the right, keep left, keep right at the fork etc. So it's not really useful for us.
OSMand:
Za 600 metrów skręć w prawo w zjazd 20 w kierunku A1 Łódź.
(After 600 meters turn right onto exit 20 toward A1 Łódź)
Magic Earth:
Po 600 metrach wybierz zjazd 20 na A1 i jedź we wskazywanym na tablicach drogowych kierunku na Łódź.
(After 600 meters take the exit 20 on A1 and follow the signpost towards Łódź.)
Magic Earth's is more verbose but maybe it's how we should do it. It has also elegantly separated road's ref from destination. What do you guys think?
@X-Ryl669 yes I understand, there are many ways to internationalize but as I said with the positional parameters it would still (and probably rightfully should, to get the kind of nice flowing sentence structure @pm4rcin is talking about) involve redoing a fair amount of translations and tts code. I'm willing to do it if the maintainers and translators are up for it, but I wouldn't want to hinge my PR on such a large scope if the willingness isn't there.
@pm4rcin I'm surprised that Google maps doesn't try to say the street name, it usually does a really good job in the US even human-recording many common names. Maybe they offer degraded features in some languages.
@biodranik maybe since this feature is disabled by default, we could use a "beta" label or hide the option for languages that use different orderings, and then address a TTS redesign in a second PR?
@MetehanOzyurek thank you for your help. Is there no way to fit Turkish into the format
in_500_meters + " " + make_a_right_turn + " " + onto + " " + nextStreet
then? Even if it sounds "computery?" Obviously we would like to redo the system to sound good and natural, I'm just asking if there's some wording that would be understandable to Turkish users. (In English we can say "in 500 meters turn left onto Main St" but we can also say "at Main Street, in 500 meters, turn left" even if it's not as natural)@zyphlar sorry, edited your message. I wanted to quote-reply your message🤦♂️😅
Yes it sounds computery :(
At least you need to say
Next Street + onto
Even it's not natural with this format but better
But anyway, I'm looking for an alternative translation. But I don't think I can find a natural translation.
@MetehanOzyurek okay. Do you think it's better temporarily to leave it with the current behavior of "bes yuz metre sonra saga donun" without any street name, to avoid confusion? Or is "bes yuz metre sonra saga donun uzerine StreetName" acceptable temporarily?
This new feature will be disabled by default so 90% of people won't notice it yet, but obviously I don't want something unintelligible or to cause car crashes!
I may be able to pick certain languages to put "onto" or "turn left" after the street name as appropriate, but that adds complexity so I need to hear all the details from the affected people. Do you know how other Turkish GPS apps pronounce things?
@zyphlar users won't use it and complain if the voice won't sound understandable, even if it was initially disabled. How hard is it to use positional parameters? We have a lot of translation volunteers if necessary.
Please also make sure that iOS version works:
@biodranik for iOS you said you'd help, I don't have a Mac available to dev/build/test. It seems similar-ish to how I implemented in Java.
@biodranik it's not too hard for me, as long as the translation volunteers can understand something like:
The way I would code it, the Pirate Locale outcome here would be "Give me a starboard heading in half a click over at Main Street, bring her around" except when this feature is disabled where it would revert to the current "in half a click hang to starboard." It would also generate the desired Turkish text "Beş yüz metre sonra Main Street yönünde Sağa dönün." Some languages like Dutch (NL) probably have different phrasing and positioning between a standalone right turn and inside a phrase, which would require many/all direction phrases be duplicated or retranslated as in the _street pirate string above. There could even be languages with complex "in 500 meters" phrasing, not to mention conjugated/gendered verbs and suffixes.
It just might not be obvious to 100% of translators and it can get confusing quick and we'll have to draw the line somewhere. IDK what translation tools you might have available but it might be a good idea to have a text sample generator to make sure the result still makes sense to native speakers, since you and I can only cross our fingers and use google translate for most of them.
(BTW I highly recommend having a Pirate or L33tSpeak or Emoji-only locale, partly for fun but partly for testing this kind of stuff.)
It won't cause car crashes. Most people can understand what navigation says but it will decrease the translation and voice instructions quality - if you translate it as "üzerine".
Btw, I found a more understandable alternative translation when writing this comment😅
Maybe you can use
ve şu yönde ilerleyin:
It will be understandable and grammatically correct.Still, it's much better to rearrange the sentence but that might be an alternative solution. I'll look for other alternative translations.
I'm not sure what Google or Apple say but I don't remember any mistakes like that. I'l check asap. Also I'm going to ask for an alternative translation on our Organic Maps Turkish chat.
@MetehanOzyurek it looks like enough languages prefer different ordering that I'm going to code this up with positional parameters. Does this make sense to you and seem grammatically best?
Yes this makes sense👍👏
@biodranik this is ready for language review and adding the iOS preference. I'll have a helpful language review tool ready shortly but I have no Mac.
@MetehanOzyurek @pm4rcin @matheusgomesms thank you for all your language help! I've created a tool to show you all the various TTS combinations in each language, and (if your browser supports it, like Chrome) even speak the phrases in TTS. Please review your language and leave feedback in these comments:
https://zyphlar.github.io/organicmaps-locale-viewer/
@zyphlar cool tool, thanks! You can improve it by allowing switching from different languages. It would be better to also use link to
data/strings/sound.txt
instead of a generated JSON file.@biodranik yes it has drop down menus for locale and tts voice. I chose to use the json because it's easier to parse and browse by filename without having to write a custom parser
Is this newline necessary?
Default en translation should be always specified.
Does it make sense to embed this translation into dist_direction_onto_street instead of defining it separately?
@ -266,7 +317,6 @@
nl = Sla linksaf.
pl = Skręć w lewo.
pt = Vire à esquerda.
Does it make sense to embed this translation into dist_direction_onto_street instead of defining it separately?
It looks like %2 can be directly embedded here instead of referencing an external string, right?
Why did you comment it?
Would it be easier if a third parameter will be optional, with an empty string by default?
@ -149,2 +152,4 @@
}
UNIT_TEST(EndsInAcronymOrNumTest)
{
There should be no spaces in those strings, right?
@ -151,0 +247,4 @@
string const nlShortJson =
R"({
"in_300_meters":"Over driehonderd meter",
"in_500_meters":"Over vijfhonderd meter",
Values should be taken from real files, otherwise what is the point of this test?
@ -151,0 +349,4 @@
routing::RouteSegment::RoadNameInfo("1"));
TEST_EQUAL(getTtsText.GetTurnNotification(notificationHu8), "Háromszáz méter után Forduljon jobbra az 1re", ());
Notification const notificationHu8a(300, 0, false, CarDirection::TurnRight, measurement_utils::Units::Metric,
routing::RouteSegment::RoadNameInfo("10"));
If there are some points in sentences to make them sound better, it should be mentioned in a comment in data/strings/sound.txt
Here and in other places, it should fit in one line. Fewer lines of code = easier to read it. We try to not exceed 120 chars, but it is allowed sometimes to improve readability.
@ -173,9 +186,9 @@ struct Notification
bool operator==(Notification const & rhv) const
ditto
@ -4,0 +3,4 @@
#include "routing/turns_sound_settings.hpp"
#include "routing/turns_tts_text_i18n.hpp"
#include "indexer/road_shields_parser.hpp"
@ -50,0 +63,4 @@
{
name.clear();
if (auto const & sh = ftypes::GetRoadShields(road.m_ref); !sh.empty())
No, we don't. See base/string_utils.hpp,
UniString
,UniChar
and other helper methods.codecvt is not implemented well on all systems at the moment and may not work properly.
Where are they come from? From the street name? Does it make sense to cleanup this street name only, instead of processing the whole TTS string, and to save a bit of battery?
Those debug logs are for you only and should be removed in the final versions. You may mark them with TODO.
You can parse json, but lead translators to edit the right file.
No, it's replaced by any direction like turn left, turn right, make a slight right turn, take the 3rd exit, etc. We'd have to duplicate this positional format across every permutation and for 90% of languages it's just unnecessary
This will be the one exception to that rule, I meant to ask what's the safeguard for missing translations but the implemented logic is that we look for a _street translation and if it doesn't exist we fall back to the plain translation. English has no need for separate grammar inside a prepositional phrase but a few languages do.
No in fact we need to make a bunch of other _street strings, probably one for every possible direction. In at least Japanese and Dutch we have a situation where a command like turn left or take the 3rd exit has different grammar from a preposition like turn left onto Main Street, it'll end up being more like "left onto Main Street make a turn." So we need separate strings for that and only that case.
If I'm really lucky, "make a turn" and "take an exit" won't be additional complications requiring even more assembled particles. Will have to double check with the JA and NL experts. But yeah duplicating every direction for every language seems excessive.
@ -14,0 +26,4 @@
"exit":"驶出。",
"onto":"上",
"take_exit_number":"驶出 上",
"take_exit_number_street_verb":"NULL",
I can't judge for a function word's translation, so could you give us a example of what will be in%1$s, %2$s, %3$s
?OK I know, this should be changed.
@ -14,0 +26,4 @@
"exit":"驶出。",
"onto":"上",
"take_exit_number":"驶出 上",
"take_exit_number_street_verb":"NULL",
Yes please view https://zyphlar.github.io/organicmaps-locale-viewer/ in order to preview every possible word combination. This formatter is used to create all of the translations that use the words "Main Street" in that example. You can see more details in the "comment" in sounds.txt
Also, think about easier support and translation to other languages. Sometimes it is easier to write several similar strings than to understand what these special characters mean and how your algorithm combines them together.
Absolutely, if it gets any more complex or if people struggle then I'll do it. I just don't want to give dozens of people dozens of things to duplicate if it can easily be avoided. Placeholders and fragments in i18n are pretty common.
@ -149,2 +152,4 @@
}
UNIT_TEST(EndsInAcronymOrNumTest)
{
It's a hard situation, there have always been spaces in these strings because the algorithm has always been
dist_str + " " + dir_str
and I'm fixing that with the new _street formatter but to fix it for all other TTS output we would have to refactor and use formatter strings for all those as well. As far as I can tell it has no effect on the spoken output so it's just a testing quirk for languages that don't require spaces between words.@ -151,0 +349,4 @@
routing::RouteSegment::RoadNameInfo("1"));
TEST_EQUAL(getTtsText.GetTurnNotification(notificationHu8), "Háromszáz méter után Forduljon jobbra az 1re", ());
Notification const notificationHu8a(300, 0, false, CarDirection::TurnRight, measurement_utils::Units::Metric,
routing::RouteSegment::RoadNameInfo("10"));
What kind of points are you thinking of? This is spoken just fine in English.
@ -151,0 +247,4 @@
string const nlShortJson =
R"({
"in_300_meters":"Over driehonderd meter",
"in_500_meters":"Over vijfhonderd meter",
I didn't create this testing methodology I just used what existed. The point of this test isn't to verify grammar, but to verify the logic of the TTS function. I want to ensure that the correct strings are generated for the provided inputs (for example, no issues with RTL string concatenation, no issues with Unicode regex, no issues with determining when to add the street name vs not add it, etc) the specific strings being used aren't important. If locale json files aren't being parsed we'll know about it pretty quickly, but if Arabic TTS generation suffers a bug we may never know without this test.
@ -149,2 +152,4 @@
}
UNIT_TEST(EndsInAcronymOrNumTest)
{
A proper implementation would take into account the language and if it has or does not have spaces to delimit words.
It is also another argument to use as few placeholders as possible in favor of some partial duplication.
How many almost identical strings will be now if you remove one placeholder?
My proposed solution is that we only duplicate directional strings (with a _street suffix) for those languages who have different syntax for commands versus prepositional phrases, like Japanese and Dutch. 90% of languages will need no such duplication so there will only be a few dozen new strings:
to do your proposed method of removing the %2 directional placeholder, we'd remove the new dist_direction_onto_street translation but need to duplicate every directional string in every language; the sounds.txt file will be 2x larger if not 3x or 4x to handle some future cases like the desired "exit 123 for city center toward main st" alternative to the current "exit onto 123, city center, main street":
I think the best solution is the one currently coded. It's not that uncommon to have multiple placeholders as long as we communicate with our translators properly, it's not too hard to program for, and it opens up a lot of excellent future language possibility.
Ok, let's polish, cleanup, and test it properly then. Very good comments and examples for these translations are needed in the sounds.txt for other translators too.
@ -149,2 +152,4 @@
}
UNIT_TEST(EndsInAcronymOrNumTest)
{
Given that it hasn't caused any problems yet and I'm not changing that part of the code, maybe we can implement these changes as-is for now and move towards a more unified placeholder-based format string methodology as we go on. That's the ultimately best answer, better than plain string concatenation.
Good catch
I'm making the minimal change possible here; this function invocation is often written this way because of all the /* comments */ in and among these calls explaining why they're not all the same. I could redo all invocations but there's not really a great reason for me to do so, I tried and it doesn't improve readability that much.
@ -50,0 +63,4 @@
{
name.clear();
if (auto const & sh = ftypes::GetRoadShields(road.m_ref); !sh.empty())
Yes, street name, like
[123] City Center > Main Street
and yes we replace only characters on the street name string, the whole TTS is only assembled later with the ttsOut variable.Good catch, apologies
@biodranik I believe all outstanding issues have been resolved, ready for further language review and iOS preference implementation
Congrats!
@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
Other translations that I don't really know how to make into a GitHub review:
"leave_the_roundabout":"Zjedź z ronda.",
"exit":"Zjedź.",
@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
This should've been a review of sound.txt - sorry
pl = Oznajmianie nazw ulic
@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
In which context it is said because I'm not sure if it's wrong or correct here? Provide the example in polish with this.
I think we should find a better translation because it's not only about street names but for destination and road numbers.
@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
It is used as a verb, not a noun. See https://zyphlar.github.io/organicmaps-locale-viewer/
Some examples that don't really make sense:
The English version also doesn't specify those things, so it's more of an issue with the original text than the translation
@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
I think it should be improved to say "Zjedź w prawo" because that makes sense in most countries and is probably only used on motorways and trunk roads. Because only this types of roads have no intersections with traffic lights. Then the opposite should be added for "British" countries.
@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
FYI this feature is programmed to announce "in 200 meters, turn left onto Street or Exit Name" or "in 200 meters, exit onto 123, London, City Center, 11th Street" for all streets as long as they have names or exit numbers (we use the verb "exit" or "turn right" or "take the 3rd exit" when a motorway, regular street, or roundabout is detected.) If the distance is far enough and a name appears at the top of the screen in OrganicMaps, it will be sent to the TTS engine when this feature is enabled by the user (disabled by default) with the appropriate verb from the translations file.
I understand that in some countries this behavior could be a little confusing, for example in Japan a lot of navigation is done with named intersections instead of named streets. A future PR will handle such cases as we discover them and consider how to support it.
Also note that this feature will not apply to rapidly executed turns: if there's only a few seconds left before the turn needs to be made, OrganicMaps will say "turn right, then turn left" without a distance or street name. So many successive turns in dense areas won't get names announced. Also alleys and parking lots obviously don't have names as often and so won't have their nonexistent name announced.
Yes in English I don't feel the difference is very important (I guess we could change it to "Announce Street Names and Exit Signs" or something to be most accurate about speaking both "turn right onto 11th Avenue" as well as "exit onto 123, London, City Center, 11th Avenue" instead of just "turn right" and "exit.")
A future improvement may have separate logic for exits so that we can say "turn right onto 11th Avenue" and then separately "take exit 123 onto 11th Avenue for City Center towards London" but that's a high level of parsing and complexity I'd like to avoid for now if at all possible.
It would be great to document all necessary cases to see the whole picture. And maybe put a good foundation to properly support these changes in the future. Our vision is that everything should eventually be perfect, for all countries, and for all languages.
@lunarna-gh what do you think the best way to communicate announcing both street names and exits would be?
@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
@lunarna-gh can you give me a before/after of the strings from https://zyphlar.github.io/organicmaps-locale-viewer/ which don't make sense and how we can improve each? I don't speak Polish so a clear Incorrect vs Correct example will be best for me to make the necessary code/string changes.
I'm not sure, I don't think it's that important, but if needed there could just be smaller text underneath that states "When turned on, the text-to-speech voice will announce street names, exit signs and reference codes." (if it announces reference codes). This could also make the option easier to understand in some languages while avoiding making the setting title too long
@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
Oh there's no need to change the sentences or anything, my example was just in reply to pm4rcin asking what sounds wrong. These sentences will make sense when translated as stated above
leave_the_roundabout: Zjedź z ronda.
exit: Zjedź.
That's what I've been thinking of. With that the title itself is good and we just need to add explanation as you said. It's probably the best way to go ATM. :)
@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
Is it used only on highways and trunk roads? Because we could make it the same as Magic Earth. There it says "Wybierz zjazd".
@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
"Wybierz zjazd" kind of sounds like it's a choice up to the driver instead of guiding the driver through the Organic Maps route though, doesn't it?
While testing I have been thinking about adding support for positional parameters for all strings. It could be used e.g. to add destination to
make_a_slight_left_turn
andmake_a_slight_right_turn
at least. What do you think @zyphlar?@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
After some thinking and testing
zjedź
orzjedź w prawo
is much better. I'd opt for the second because it sounds better in my opinion.@pm4rcin I think it's very tempting, and very powerful when used correctly, but can also increase complexity and confusion a lot, so let's make the smallest change now in this PR that produces a good result and then consider branching out in the future as we gain experience. (I'm still worried for the dozens of other language reviews we've gotta do just to get this change done!)
What stops you from implementing it in the right way from the start? ;-)
@biodranik at a certain point I'll be refactoring the entire app, having never released a C++ or Java project of my own in my life, and that just seems like a bad idea from a project management perspective ;)
Plus you yourself were arguing for the opposite -- duplicating every string in the system to have less parameters and less complexity/confusion -- so I'm trying to strike a balance :p
The idea behind duplicating was to help contributors with translations, the slowest but important part of a good-quality release.
Generally, we value quality here. And the approach may differ depending on the case.
I would highlight several points here:
@biodranik in this case, i think it would be purely cosmetic and mostly save a few lines of translation JSON at the expense of just as many lines of formatting logic and testing. left vs right, slight vs sharp vs plain, the work here is already done and it can be very difficult to communicate how the final strings will be assembled and read aloud, especially across a language barrier. (See the discussion about "is exit a noun or a verb, how is it assembled" above. Editing in context is much easier.)
We could refactor later with hardly consulting translators (just testing that the output is the same), but there's a lot of danger in getting too eager with parameters in i18n: just because we can chop up sentences a certain way in our language doesn't mean that's true for other languages. I found out that Hungarian for example conjugates noun suffixes based on the vowel sounds in the noun, which is simply out of scope at this time. We could easily write thousands of lines to sound completely natural for a single language, and we just can't reasonably go down that rabbit hole just yet.
Oh also @pm4rcin I may have misunderstood you at first: this PR already includes all possible turning directions with all possible distances and destinations. We created an additional optional _street string for languages which need separate instructions for sentences that have noun objects (Japanese and Dutch) but they're not required for most languages:
ada410b582/data/strings/sound.txt (L117)
Refer to all the various combinations of directions, distances, and destinations here: https://zyphlar.github.io/organicmaps-locale-viewer/
All this is enabled because of the positional parameters in dist_direction_onto_street and no more are needed unless we want to get silly :P
@lunarna-gh @pm4rcin please see https://zyphlar.github.io/organicmaps-locale-viewer/ and the updated source files, I've changed what I believe is your consensus so please let me know if it's better or worse.
@ -12,1 +26,4 @@
"exit":"Zjedź zjazdem.",
"onto":"na",
"take_exit_number":"Wybierz zjazd",
"take_exit_number_street_verb":"NULL",
I guess
zjedź w prawo
is a bit clearer, but this would require a separate string for countries which drive on the left with azjedź w lewo
translation (UK tourists, visitors and polish speakers need to be accounted for) which sounds like a lot of extra effort ifzjedź
worksI'll look through it more thoroughly when I get back home
When it exits highway it says "Za 800 metrów zjedź w prawo na A1 w [Destination names]". I think it would be better to say "w kierunku" because it sounds more correct. EDIT: Or just delete the "w"if it's easier.
W kierunku
sounds nice. A comparison of what word exactly is needed to change:Before: "Za 800 metrów zjedź w prawo na A1 w [Destination name]"
After: "Za 800 metrów zjedź w prawo na A1 w kierunku [Destination name]"
That's interesting. It doesn't seem like it's possible to implement the suggested translation without completely redoing the highway exit logic. @pm4rcin Do you still think it's the best option?
Wybierz zjazd.
sounded worse on paper than it actually does while navigating, I tested how Magic Earth handles this and I don't think it's that bad if we want to make it sound natural.@ -500,7 +500,15 @@ void RoutingSession::GenerateNotifications(std::vector<std::string> & notificati
// Generate turns notifications.
std::vector<turns::TurnItemDist> turns;
if (m_route->GetNextTurns(turns))
nit: Please, put block braces from new line here and below. This is our C++ code style.
@ -149,2 +152,4 @@
}
UNIT_TEST(EndsInAcronymOrNumTest)
{
nit: I'd prefer to use C++ R"()" macro for string literals without ugly escaping:
https://en.cppreference.com/w/cpp/language/string_literal
@ -72,12 +72,14 @@ NotificationManager NotificationManager::CreateNotificationManagerForTesting(
}
Hm, default value = "" is not needed in cpp function's definition. Only in declaration.
@ -155,0 +157,4 @@
: m_distanceUnits(distanceUnits)
, m_exitNum(exitNum)
, m_useThenInsteadOfDistance(useThenInsteadOfDistance)
, m_turnDir(turnDir)
Add std::string const & nextStreet = "" into existing constructor above?
@ -50,0 +60,4 @@
* @arg std::string & name - semicolon-delimited string output for the TTS engine
*/
void FormatFullRoadName(RouteSegment::RoadNameInfo & road, std::string & name)
{
dirStr = std::move(dirStreetStr); and without braces
@biodranik of course US must be different in every way than the rest of the world... :D Could you provide an example of such road because it's interesting by itself? Do you know some more strange examples where US stands out in roads?
@pm4rcin I don't remember exact coordinates, but I definitely drove these exits in many countries myself. At least one was somewhere in the Seattle center (it was a carpool lane). From quick visual search I see this:
https://www.openstreetmap.org/search?query=47.2263085%2C-122.4639702#map=19/47.22630/-122.46301
@biodranik @lunarna-gh @pm4rcin be aware that currently OM does not distinguish left highway exits from right highway exits. It's something I'd like to improve in the future but not in-scope for this task. So "take this signed exit name/number/destination off the motorway or roundabout" is the intent whether left or right and the ambiguity can hopefully be solved later.
@vng I'm aware of the Unicode directionality, I've inserted "reset direction" characters into RTL strings to hopefully prevent any weirdness with mixing i.e. Arabic and English strings, but we'll need Arabic speakers to review. I have been reviewing by listening to the debug build TTS speech and it seems correct as is but the warning is good because it's true that every editor I use displays the output differently.
@biodranik the only current blockers I'm aware of are a lack of iphone preference and a lack of review for most languages. I've tested English, Japanese, and Arabic on Android and to my untrained ears they seem fine and the preference applies properly.
Can you please clarify? I thought that all our directions contain the "left" or "right" turn indication.
es= Manténgase a la derecha.
@ -279,3 +329,3 @@
vi = Rẽ trái.
zh-Hans = 左转。
zh-Hans = 左转
zh-Hant = 左轉。
es = Manténgase a la izquierda.
es = Salga.
es = Cámara próxima
Make it the same as es-MX, it sounds better
See my locale tester link. If you're on a motorway ("highway" in en-US) OM will currently say "in 500 meters, exit" when there is a motorway exit node on the route, no matter whether the exit is left, right, or straight. This was actually a big reason for displaying and speaking the exit number/name, because yes arriving into any big US city you will have exits on all three angles within a mile of each other so you need to follow the signage closely. We improve this in this PR to say "in 500 meters, exit onto 123, London, City Center, 17th Avenue" but obviously the eventual goal with enough language/locale/logic/routing support would be more like "in 500 meters, use the left two lanes to take exit 123, onto 17th Avenue, for City Center, towards London" or whatever minor variation thereof we prefer. But that's a lot of work and the app is simply not there yet, it's out of scope for simply taking the step of adding the NextStreet to TTS in a decent way.
I may have caused confusion by saying "highway" which may mean "street" in other locales: you're correct that named surface streets get left, right, slight, and sharp directions spoken as normal. It's only motorway and roundabout exits that don't have a direction associated in OM at this time.
de = Straßennamen ankündigen
de = Wenn aktiviert, wird der Name der Straße oder Ausfahrt, in die abgebogen werden soll, laut angekündigt
Great work, congrats! Could you please update it, adding pt-br and es-mx, following #3976? Then I can review both pt and pt-br (but it seems that in pt everything is fine, so far).
There's a problem with polish translation and left highway exit because it uses the standard
exit
which always says "zjedź w prawo(right)". @zyphlar what about adding a new keywordsleft
andright
which would solve the problem?@pm4rcin This has already been discussed in the last few comments, it's currently not possible, I think
Wybierz zjazd.
is alrightIt was already this value.
@ -279,3 +329,3 @@
vi = Rẽ trái.
zh-Hans = 左转。
zh-Hans = 左转
zh-Hant = 左轉。
It was already this value.
@pm4rcin @lunarna-gh [exit] pl = Zjedź w prawo has been changed to Wybierz zjazd
@pm4rcin @lunarna-gh one issue with your requested "Za 800 metrów zjedź w prawo na A1 w kierunku [Destination name]" -- I presume that A1 is the exit number? Unfortunately we currently can't interject any phrasing between the exit number, street name, and destination names. That's all lumped together and it will be a future project to separate them out nicely. See the "Main Street" phrasing here:-
https://zyphlar.github.io/organicmaps-locale-viewer/?lang=pl
"Za czterysta stóp Zjedź w prawo na Main Street"
our formatter is
%1$s %2$s na %3$s
where %3$s is whatever is displayed at the top of the screen, usually something like "A1; Main Street; London" and that's how it'll be read out loud for now.I agree the next step after all this will be to increase the complexity further and have a formatter like
%1$s %2$s w %3$s na %4$s w kierunku %5$s
where %3 is the exit number, %4 is the street name, and %5 is the destination so that we get very natural language like "In 800 meters, take exit A1 onto Main Street towards London" but there are so many combinations that we need to start small (what if there is no exit number / street name / destination, what if the language does some very odd conjugation where a feminine street and a masculine exit sound different, etc etc etc.)Is it okay if we leave it as "Za 800 metrów zjedź w prawo na A1, Main Street, London" for now?
@ -149,2 +152,4 @@
}
UNIT_TEST(EndsInAcronymOrNumTest)
{
I don't object but I'm also not going to make that change at this time because I'm only continuing the existing style used by previous contributors in these files:
If we want to change how we write json embedded in c, maybe that can be its own cleanup task outside of this PR which I'd really like to push through to completion.
A1 is the ref of motorway. Exit numbers are numbered normally with numbers (+optional letters after numbers)
Yes. But it's Wybierz zjazd as you said now ;)
@lunarna-gh what about "Wymawiaj nazwy ulic" or if you have better translation?
@ -1,17 +1,34 @@
{
"make_a_slight_right_turn":"Trzymaj się prawej strony.",
I think the better version would be "Dotrzesz do celu." or "Dojedziesz do celu.". @lunarna-gh
@ -50,0 +60,4 @@
* @arg std::string & name - semicolon-delimited string output for the TTS engine
*/
void FormatFullRoadName(RouteSegment::RoadNameInfo & road, std::string & name)
{
@vng these changes made in latest
@pm4rcin correct somehow i made the change and it unmade itself.
Za 800 metrów Wybierz zjazd na A1, Main Street, London
is the new wording as far as I can tell. (In my locale viewer, the example is nowZa czterysta stóp Wybierz zjazd na Main Street
https://zyphlar.github.io/organicmaps-locale-viewer/?lang=pl)Language reviewers: I've updated the https://zyphlar.github.io/organicmaps-locale-viewer/ tool to more efficiently allow changes to be communicated. You can make and change your own changes in the browser, and paste the diff into a comment here.
All reviewers: a beta APK for this PR is available for review here: https://drive.google.com/file/d/16bngLUemsA9oU0HnEI2C0xXsPbSk80Pp/view?usp=sharing
Since this was built on my laptop and not any source you've used before, you may need to uninstall any existing Organic Maps (Beta) you have installed (export your bookmarks first!), as well as allow installing from unknown sources (web browser, Google Drive, etc), and redownload your maps. Sorry for the inconvenience.
Once installed, you'll also need to go to Settings, Voice Instructions, and enable the Announce Street Names option since it's disabled by default. No names will be announced for very short or immediate turns (lefts and rights in dense neighborhoods) but when a turn is a few hundred meters out and OM announces "in ___ feet/meters, turn..." then this PR will add the street name, exit number, or highway/destination name to the end as seen in the visualization tool: https://zyphlar.github.io/organicmaps-locale-viewer/
Look forward to your feedback!
@vng @biodranik @dvdmrtnz @lunarna-gh @pm4rcin @j13m126 @TheAdventurer64 @matheusgomesms @MetehanOzyurek @arnaudvergnet @AntonM030481 @X-Ryl669 @TimMagee @marcthespark @sn0ot @nic787 @jimcarst
For Dutch-nl:
Translation change request:
Here are my suggestions for making them work without logic.
Translation change request:
And here are some examples if you can add logic:
In one hundred meters, turn left onto {name} street / road / square / lane / avenue / boulevard
Száz méter múlva forduljon balra a / az {name} utcába / útra / térre / közbe / sugárútra / körútra
In 500 meters, take the M5 motorway.
Ötszáz méter múlva hajtson fel a /az M5-ös/-os/-es/-as autópályára.
In fifty meters Take the first exit onto Highway 99
Ötven méter múlva hajtson ki az első kijáraton a / az Highway 99 -ra / -re
In three hundred meters Go straight onto Main Street
Háromszáz méter múlva, haladjon tovább egyenesen a / az Main Street -án / -en / - on
How are
-ra / -re
,-án / -en / - on
, etc. are rendered/pronounced in your examples for non-Hungarian names? How to properly generate the string for the TTS engine to pronounce them correctly?It's trickier, with foreign names because they are chosen based off the pronounced version of the word so can't be guessed from a string, but we can add them manually:
a Main Street-re
a Highway 99-re
Name Lane-re
If they're written with " - " Google Translate pronounces it right (afair it uses the same engine?).
On jan 2 2023, at 7:56 du, Alexander Borsuk @.***> wrote:
@ -1,17 +1,34 @@
{
"make_a_slight_right_turn":"Trzymaj się prawej strony.",
I wasn't sure of this, because of the example string
In five hundred feet You’ll arrive onto Exit 12, Main Street, London Then Take the first exit.
(pl:Za pięćset stóp Dojedziesz na Exit 12, Main Street, London Następnie Zjedź pierwszym zjazdem.
) -- would it sound good if it was used for midway points as well?I like it, it's clear but
Wymawianie nazw ulic
fits other strings in the settings betterConsistency is a good thing but until we find a better translation let's keep it like I said.
@ -1,17 +1,34 @@
{
"make_a_slight_right_turn":"Trzymaj się prawej strony.",
@lunarna-gh It's only used at the absolute end of the route. It's not used in middle points. It's for example used like "Za 100 metrów jedź prosto, następnie dojedziesz". I think that @zyphlar probably auto-generated that and it made some incorrect artifacts. Look at the english version https://github.com/organicmaps/organicmaps/blob/master/data/sound-strings/en.json/localize.json#L13
?? It is the same translation
@jimcarst I'm trying to implement this but we may have difficulty with the current logic. Is there a faster way to communicate with you, maybe a chat or instant message? My email/Gmail is my github username at gmail.com if that helps
Improvements to norwegian (nb)
Change:
nb = Du ankommer.
to
nb = Ankommer du.
Change:
nb = så
to
nb = deretter
Change:
nb = Kamera foran
to
nb = Fotoboks foran
Thanks for the new tool! I reviewed PT and PT-BR, and I made a small fix which I believe it will sound better.
@ -12,1 +26,4 @@
"exit":"Pegue a saída.",
"onto":"para",
"take_exit_number":"Pegue a saída",
"take_exit_number_street_verb":"NULL",
"dist_direction_onto_street":"%1$s %2$s para %3$s",
@ -12,1 +26,4 @@
"exit":"Saia.",
"onto":"para",
"take_exit_number":"Siga pela saída",
"take_exit_number_street_verb":"NULL",
"dist_direction_onto_street":"%1$s %2$s para %3$s",
pt = Quando ativado, o nome da rua ou saída para virar será falado em voz alta.
pt-BR = Quando ativado, o nome da rua ou saída para virar será falado em voz alta.
I have been using the beta and I have one primary complaint.
I would like it to announce the street name even when making the final announcement. Often on city streets it currently never says the street names because I am making turns too often.
@ -500,7 +500,15 @@ void RoutingSession::GenerateNotifications(std::vector<std::string> & notificati
// Generate turns notifications.
std::vector<turns::TurnItemDist> turns;
if (m_route->GetNextTurns(turns))
If I’m not mistaken,
GetNextTurnStreetName()
walks the routing graph to find the actual next street name:67079e48fb/routing/route.cpp (L175-L181)
This is ideal behavior for the initial instruction, since users don’t necessarily intuit which way would be “heading west”, for example. It’s also great for roundabout instructions. However, it can be misleading in the context of a freeway/expressway exit instruction: the user doesn’t necessarily see a street name in advance on any signs. When they do see a street name, it may not be the same one that they need to take.
In the U.S., street names are very often signposted as freeway/expressway exit destinations, typically but not necessarily listed before other destinations. There’s a long-running dispute about whether to include them in the main
destination
tag or split it out into a separatedestination:street
tag. The latter approach forces routers to assume that the street is listed first and also assume that it isn’t listed in fine print as “alt text” for a route shield.@ -500,7 +500,15 @@ void RoutingSession::GenerateNotifications(std::vector<std::string> & notificati
// Generate turns notifications.
std::vector<turns::TurnItemDist> turns;
if (m_route->GetNextTurns(turns))
@1ec5 if you're concerned that this code would result in behavior like telling a user to exit Highway 101 by saying "in 3000 feet, exit onto Main Street" in the US, don't worry. Some magic within this bit of code (maybe
GetClosestStreetNameAfterIdx()
?) grabs the appropriate tags from the highway exit nodes and so the behavior is to instead say "in 3000 feet, exit onto Exit 123; Main Street; Embarcadero" -- it essentially pronounces the same text that's been displayed at the top of the screen this whole time.Obviously this can be improved further as discussed in the PR (we say "take exit 123 onto Main Street towards Embarcadero," hence all the necessary work in formatter strings and i18n), but the information is there and it's a vast improvement over the status quo of being on a highway and it says "in 3000 feet, exit. ... ... ... exit." and you have to figure out whether they mean Exit 122, Exit 123A, or Exit 123B in the spaghetti streets typical to our experience.
There's a beta APK available in the PR, I encourage you to download and try it out:
organicmaps/organicmaps#3130 (comment)
Updating PT strings
Fixing PT and PT-BR structure. Using "na" is weird, all TTS use "para".
@ -12,1 +26,4 @@
"exit":"Pegue a saída.",
"onto":"para",
"take_exit_number":"Pegue a saída",
"take_exit_number_street_verb":"NULL",
@ -12,1 +26,4 @@
"exit":"Saia.",
"onto":"para",
"take_exit_number":"Siga pela saída",
"take_exit_number_street_verb":"NULL",
Like said in previous comment, just fixing this string that didn't see before.
Pixel 6 - Android 13
French ✅
Please move all generated json files into a separate commit:
[strings] Regenerated TTS json files
@biodranik This is done and should be ready for merge
Am I correctly understanding that some commits override what was done in other commits? It may make sense to squash these related parts.
?
Are empty strings expected? If yes, then it's better to mention in a comment.
Why is it not removed?
Are empty strings really needed?
Do es-MX and pt-BR correctly take es/pt translation if it's equal and missing?
Does it make sense to copypaste everything? Shouldn't the en translation work by default for all languages?
Абвяшчаць назвы вуліц
Проговаривать названия улиц
Проговорювати назви вулиць
Fix TODO?
Fix TODO?
Only get nextStreet if TtsStreetNames is enabled.
Why the spaces are expected/needed?
Why some tests are commented?
@ -149,2 +152,4 @@
}
UNIT_TEST(EndsInAcronymOrNumTest)
{
Please write new code following the best practices, without looking at some old code/approaches. Otherwise, we would now write code using C++ 2003 standard )
When time comes, someone will refactor it. Or older code will be removed. Or changed if modification is needed.
In this case, reading the tests is important. Don't make it hard for reviewers.
nit: indenting looks strange.
nit: strange indent
@ -50,0 +60,4 @@
* @arg std::string & name - semicolon-delimited string output for the TTS engine
*/
void FormatFullRoadName(RouteSegment::RoadNameInfo & road, std::string & name)
{
Space before streetOut?
Can the full string be translated? E.g. "take exit $@", "съезжайте на $@ съезде"?
Can this and other regexes be made static const?
sizeof(ttsOut)/sizeof(ttsOut[0])
Consider caching all regexes either as static const or as local class members (faster).
Please create a variable to avoid doing double allocations/copying.
Space after comma here and in other places.
Should these functions be public in .hpp file? Can they stay static/private in .cpp only?
Some languages don't appear to have an "onto" word so yes sometimes null is possible. I will update the comment.
Was holding it open just in case another language reviewer came along needing it; removing.
Added, just didn't see it somehow
I can check what happens when I take them out, I am somewhat concerned that the automatic fallback logic might put i.e. English words in a Japanese translation if English uses a word like "onto" and Japanese doesn't. The current logic is tested and intentional to say we tested a string in this location but want it explicitly blank. It may look strange from a language perspective, trying to translate all possible strings, but string formatters with optional components mean that some things will need to be specifically empty in order to make sense.
For example we specify Japanese and Dutch for all other
take_*_street_verb
strings, but the Japanese construction apparently does not require it for this specific case, and this documents that. It's not just a missing string, it's a part of a multi-part string that we want to be blank.I haven't tested this,
es-MX
appeared on my radar during development and I wanted to stub it out / handle it. AFAIK es-MX is currently equal to es, however pt-BR is definitely not equal to pt in all cases. I can remove either/or if you desire, I don't intend a lot of changes there.Please don't change PT and PT-BR, I've translated and revised them. Yes, there are some differences.
I don't know, but if en fallback "just works" then it's a big problem for other strings where I do logic like
if (string myTrans = getTranslation("dist_direction_onto_street", locale)){ outputDistDirOntoStr(myTrans, myString); } else { outputPlainDir(myString); }
-- automatic fallback would say "oh sure, we have your string. You wanted English right?" when I'm explicitly checking for the string in my (often non-English) locale. And if en fallback doesn't "just work" then the answer to your question is no. I've programmed and tested this as written, I'd have to go back and check assumptions to see what happens if/when translation strings are removed or when a string foren
is defined but not the user's language. (It just so happens that many of our edge cases are in unrelated languages likenl
andja
and not "the default"en
so the case hasn't been tested.)In any case the general answer to your question is that we should proactively specify the formatter strings we want for the sentence orderings we want. Just because many languages happen to use the same ordering doesn't mean that they all "inherit" their ordering from English, and would complicate future changes ("huh, I'm going to modify the sentence ordering for Danish, I guess I'll just... uh... it's not here... hmmm...."). This formatter string is the most critical part of the language changes I'm making, sometimes I think it's worth being a little verbose so it's clear that of all strings this one probably shouldn't be null.
For some reason this review is of my commit only. In 13223101bb64e1d111754767d9c6826407b8c6f7 David updates these appropriately.
See MWMRoutingManager comment
Yes if line 507 is false, nextStreet will be an uninitialized string. We then pass std::string nextStreet through a number of functions and constructors until finally on line 54 and 78 of turns_tts_text.cpp we check
if notification.m_nextStreet.empty()
to determine if we should pronounce the street name or not.The extra space is where "In 500 feet" would go if it was going to be pronounced, like
{then} {distance} {direction}
but since this set of instructions does not have a distance it ends up beingthen (null) enter the roundabout
. Rather than add more string manipulation logic to the TTS engine that will not affect output but only increase complexity and computation, I adjusted the test to validate that this is indeed an acceptable TTS output. I could also add whitespace collapsing logic to the tests, but it seemed superfluous.@ -149,2 +152,4 @@
}
UNIT_TEST(EndsInAcronymOrNumTest)
{
Is this a blocker, do you want me to refactor the tests before merge?
Can you review this file in its entirety and let me know the preferred indenting style? This style in this file was last modified by Vladimir Byko-Ianko and Olga Khlopkova before me, I'm just trying to make the minimal changes necessary to accomplish the task without refactoring or prettifying beyond the scope of what I create. This specific line is a little less indented than others because otherwise the line width would be >100 chars
@ -50,0 +60,4 @@
* @arg std::string & name - semicolon-delimited string output for the TTS engine
*/
void FormatFullRoadName(RouteSegment::RoadNameInfo & road, std::string & name)
{
Unneeded, fixed
@ -50,0 +60,4 @@
* @arg std::string & name - semicolon-delimited string output for the TTS engine
*/
void FormatFullRoadName(RouteSegment::RoadNameInfo & road, std::string & name)
{
Perhaps in the distant future, but for now Google TTS does a surprisingly good job of saying
take exit s"yezzhaytena@
which I have no idea what it means, but if I was traveling in the former USSR at least that would match the street signs I'm seeing more closely than "take exit move out @" which is what Google Translate would otherwise do and is total nonsense.What we do for this is provide the
take_exit_number
string which replaces large parts of the instructions so you can say "turn right onto Main Street" and also "take exit 123" in the same master formatter string. (I was afraid I'd have to make really complex formatters until I began implementing this with real languages' requirements.) So the full thing will be translated:Через четыреста метров
Съезд на
172: Priest Drive
(or whatever the exit/street/highway names/numbers are) as in here: https://zyphlar.github.io/organicmaps-locale-viewer/?lang=ruIn any case
onto_exit_number
has been mostly replaced bytake_exit_number
and no languages appear to need further strings in and among that (maybe in the distant future if we needtake_exit_number_off_highway
andtake_exit_number_leaving_roundabout
but it's too early to attempt that) so I will remove this.@ -50,0 +60,4 @@
* @arg std::string & name - semicolon-delimited string output for the TTS engine
*/
void FormatFullRoadName(RouteSegment::RoadNameInfo & road, std::string & name)
{
It seems so; that change is made and tested.
@matheusgomesms can you review the PR as it currently looks and tell me if the current changes are correct? organicmaps/organicmaps#3130/files/00cb82086690e58e6edde753ed166cbe5675f537#diff-631491c869fee64c6cc86150edcfb7f52b23d306aee68d05816f8a141727e022
The only additional change I plan to make tonight is the change you requested on September 8:
Then it should be explicitly mentioned in the comment above.
It's better not to guess, but check/test, and make it explicit in comments/documentation. Where the fallback is ok, copy-paste is not needed.
If you modify the code, then try to follow the convention. Variables were aligned by
(
or with 4 spaces from the start of the string, in your case it's something else.Great, thanks! I got confused with many comments :P
I checked the Portuguese community, and it seems that the best approach here is (I guess already implemented on lines 572-573):
pt = Apanhe a saída
pt-BR = Pegue a saída
@biodranik I've removed extra spaces (some of which existed before this PR) by adding more logic to the TTS engine
@biodranik I've confirmed that fallback is active when
en
is specified. For this string I've deduplicated and specified that all languages inherit fromen
by default, and for[onto]
I've specified that an empty string is required for some languages in order to prevent nonsense. (generate_localizations would create strings like "一キロ先 Broadway Avenue onto 右折です" otherwise, where Japanese requires no "onto" or equivalent in this position, especially not in English)Per other comment, I've tested and explicitly commented.
Tested and commented
Unfortunately, I don't have time to go deeper into this Hungarian strings transformation code. I'd like to see some tests for turns_tts_text.cpp module like before->after transformation to understand the purpose of these manipulations ..
UPD: Sorry, the tests are already present :) But some of them are failing
nit: announceStreets var name is better here and everywhere below, IMO. No need in these long sentences :)
Also setAnnounceStreets(), shouldAnnounceStreets() functions.
Will have an infinite loop if from is a substring of to.
ReplaceAll("xxyyzz", "yy", "xyyz");
More natural code is like this:
nit:
UniString const & str
myString is in UTF8, so pushing the last byte only will break UTF8 consistency (if it is not an ASCII).
If the purpose is to replace the last codepoint, I'd prefer to rewrite this function.
Slight change to PT-PT, after feedback from Portuguese community
@ -4,2 +6,4 @@
"make_a_right_turn_street":"NULL",
"make_a_sharp_right_turn":"Kanyarodjon élesen jobbra.",
"make_a_sharp_right_turn_street":"NULL",
"enter_the_roundabout":"Hajtson be a körforgalomba.",
This should be "Hajtson ki a" (or az)
It should be in the strings regeneration commit.
https://en.cppreference.com/w/cpp/regex/regex_replace with a const static allocated regex can be a better solution.
Replacing many occurrences leads to many reallocations.
Here and everywhere.
What is going on here? Looks like an unnecessary overhead. Can you explain the algorithm?
The memory at runtime is allocated 1) for the vector and 2) for each constant string each time this function is run. Please treat OM as an embedded environment where each overhead matters.
Use constexpr std::array and std::string_view instead.
Or even better, if these values are used for the search, please create a regex instead. It will work faster.
@biodranik if the last two characters in an acronym or number (i.e. we won't say ABC or 123 as if they were words, we will spell it out like ay bee see or one hundred twenty three) then in Hungarian we start by looking at the last two characters. If the last two characters are 10, 40, 50, 70, 90 then we have a "-re" ending because of how it's pronounced. If they're 20, 30, 60, 80 then they'll have a "-ra" ending.
A phrase ending in "-hundred" is a special case, so if the last three letters are "100" then that has a "-ra" ending.
If none of the above are true, then we can simply look at the last character in the string for the appropriate suffix. If the last character is one of AÁHIÍKOÓUŰ0368 then it gets a "-re" ending. All other cases will get a "-ra" ending however we can't simply stop there because if there is some unknown character like whitespace or punctuation we have to keep looking further backwards into the string until we find a match or we run off the end of the word (
" "
) which is a suspicious condition (what Hungarian word or acronym doesn't have a single character in the Hungarian alphabet?)I suppose it's possible to make this with some pretty crazy regex, just remember this isn't about finding any occurrence of a string or the last occurrence or even the last character, this is about starting at the end of a word or acronym and counting backwards until we find a matching ending word-sound so that we know whether 123 or 110 or 100 or 1000 or ABC or ABA end with "high" or "low" vowel sounds.
And the same process but different is repeated for regular words, since like many languages the way you say a letter or number's name is different from how you say it when it's in the middle of a word. "Exit 100A" is not the same thing as "Exit Hundred Aardvark" especially when the question is "what are the beginning and ending vowel sounds when spoken"
Finally recall that we can't just
for(int i=strlen;i>0;i--) if (frontNames.contains(myString[i])) return 1;
because each "character" is possibly a multi-byte character, and we need to sometimes look at multiple characters at the same time. So that's why there's substring overhead, grabbing the last 1/2/3 unicode glyphs, not just the last 1/2/3 bytes.Undid string_utils change, using regex_replace
This thing is called vowel harmony, and it's awfully complex. Usually translators solve it by rephrasing the sentence instead of writing long regexes. E.g. you can say
Forduljon ide: M4 autópálya
instead ofForduljon az M4 Autópályára
.https://en.wikipedia.org/wiki/Vowel_harmony#Hungarian
Also that's the recommendation of the Hungarian translators of fsf, best practices in Hungarian: https://fsfhu.gitlab.io/forditas-hogyan/#a-s-hez
@infeeeee I'm aware, but we've done a really good job of making this sound natural in Hungarian per multiple conversations with multiple native speakers so I'm inclined to not just delete it all because of a few for loops
Fwiw, those recommendations are for translators dealing with systems where natural translation isn't possible, so it lists some workarounds.
@aronkvh OSMAnd uses the same methods for reading the street names, (to use the same example it says
Forduljon az M4 autópálya felé
), so it's also good middle ground for those who don't want to deal with the complexity of agglutinative languages.Tested the current implementation, and it's working, I tried to find edge cases where it breaks, but couldn't find any, congrats!
@biodranik unfortunately even Google doesn't handle grammar only pronunciation. So we don't need to write out "one hundred twenty three" at least but everything else will be read literally like a naive student with a really good accent would. Consider that TTS won't correct you if you write "she was a idiot" instead of "she was an idiot," this is the same situation just moreso.
@infeeeee since when was OsmAnd announcing street names! Apparently I have to check it out again :P
@biodranik comments added
Az utcanevek felolvasása
Sorry, I know there was a discussion on this, but this sounds unnatural to me. It literally translates to "choose an exit", like I am choosing between some options. I'd say most native speakers would say "Zjedź zjazdem", which literally means "exit using the exit", but it actually makes sense and is not gibberish. And "zjedź … zjazdem" is already in Organic Maps as a translation.
@ -410,3 +479,3 @@
de = Nehmen Sie die Ausfahrt.
el = Εξοδος.
es = Salida.
es = Salga.
See comment below for more information. "Zjedź zjazdem" makes more sense here.
I've added some comments to the Polish translation. I don't know the weird obsession with "Wybierz zjazd", but most natives don't talk like that.
@ -410,3 +479,3 @@
de = Nehmen Sie die Ausfahrt.
el = Εξοδος.
es = Salida.
es = Salga.
@Acrobot line 490 now says
pl = Zjedź zjazdem.
per your upstream changes, when this PR is merged it will only be used for the definitive stand-alone command like "in 100 meters, exit." Line 570 above for [take_exit_number] will be used in the fuller case of "in 100 meters, take exit 123, Main Street, Queen's Expressway, London" as opposed to "in 100 meters, turn right onto Main Street, Queen's Expressway, London"@Acrobot @pm4rcin I'm happy to defer to whatever consensus exists, I just don't speak Polish so I have to judge based on your discussion. Check out https://zyphlar.github.io/organicmaps-locale-viewer/?lang=pl for some example sentences, you can click to edit and suggest changes. What seems best to both of you?
@zyphlar from my look it all seems good right now. I have checked your webpage and everything looks good now. So I'm patiently waiting for it to be merged because it is taking too long and it would be good starting point to implement more advanced instructions as discussed throughout whole PR. :)
Anuncia els noms dels carrers
Quan s'activa, el nom del carrer o sortida a prendre s'anunciarà en veu alta.
Muralla
I've no idea why these turned back to English but they should still say "Camino"
Cuando se active, se pronunciará en voz alta el nombre de la calle o salida para tomar.
"Agafeu" instead of "Agafa"
cap a
Thanks for this catch, for some reason es-MX sometimes doesn't generate right inside of WSL but works fine in native Linux @biodranik
@fitojb thank you for your review! Please check the latest updates.
nit: std::size_t -> size_t
@ -74,20 +74,53 @@ public:
struct RoadNameInfo
{
nit:
m_name(std::move(name))
here and every param belownit: no need braces here
@ -5,3 +8,4 @@
#include "base/string_utils.hpp"
#include <cstring>
#include <string>
Do not use global var like this. Just declare and use local
GetTtsText getTtsText;
for each test.@ -102,26 +107,24 @@ UNIT_TEST(GetDirectionTextIdTest)
UNIT_TEST(GetTtsTextTest)
{
Simple
R"(text here)"
should work fineThis is a flaky issue reproduced randomly. It would be great to find why it happens. Does it depend on some locale env var?
Please fix remaining comments.
Let's avoid introducing a new code style.
Can this value change to a non-boolean type in the future? Is it worth checking the type conversion dynamically to avoid crashes in the future after a type change?
Are these wrappers really needed? Can Config be called directly?
Let's make it a
CHECK_LESS(start, str.size(), ());
to immediately crash and fix the bug.Let's fail immediately with a CHECK instead of correcting the value.
Will our engine work if duplicate sub-language strings are removed if they are equal to the main language (e.g. es == es-MX, pt == pt-BR, etc.)? Other strings work fine. I would prefer to remove duplicate sub-language translations to keep it simpler.
@ -43,3 +51,4 @@
[make_a_right_turn]
en = Turn right.
ar = انعطف يميناً.
Can you please remind me what happens if there is no en and other translations? Will an empty string be returned for all languages except nl?
And if there is a
en
translation, then it will be returned for any language, right?@ -399,3 +468,3 @@
vi = Đi thẳng.
zh-Hans = 直行。
zh-Hans = 直行
zh-Hant = 直行。
Are you sure that no dot is needed here and below?
@ -410,3 +479,3 @@
de = Nehmen Sie die Ausfahrt.
el = Εξοδος.
es = Salida.
es = Salga.
I also see that other translations also missing a dot. I remember that the TTS engine changes pronunciation depending on the dot presence.
@ -508,3 +661,4 @@
pl = Dotarłeś do celu.
pt = Chegou ao seu destino.
pt-BR = Você chegou ao seu destino.
ro = Ați ajuns.
Dots were removed intentionally, right? If yes (and if it sounds better without a dot in Chinese), then please also remove dots in corresponding zh-Hant translations.
@rtsisyk has added
az
translation. It would be great to add it too. Can you please also check if some other translations were added? Does it make sense to auto-generate them (e.g. hi) so users will fix it later?Is there a new line delimiter?
@ -17,6 +17,7 @@ NSString * kSelectTTSLanguageSegueName = @"TTSLanguage";
enum class Section
{
VoiceInstructions,
StreetNames,
This is not serialized anywhere, right?
Using global vars in such a strange way doesn't look right. Is BuildCell called more than once?
@ -203,6 +226,7 @@ struct CamerasCellStrategy : BaseCellStategy
{
using base::Underlying;
ditto
Don't be afraid to make one line.
@ -17,6 +17,7 @@ NSString * kSelectTTSLanguageSegueName = @"TTSLanguage";
enum class Section
{
VoiceInstructions,
StreetNames,
@dvdmrtnz can you help me with these iPhone questions?
Make it a single line (better) or fix the indentation.
@ -55,0 +59,4 @@
/// \param turns contains information about the next turns starting from the closest one.
/// \param distanceToTurnMeters is distance to the next turn in meters.
/// \param turnNotifications is a parameter to fill it if it's necessary.
/// \param nextStreetInfo is the RoadNameInfo for the next street to turn on (optional)
Does it make sense to pass optional param as a const pointer and check it for nullptr?
@ -47,10 +48,63 @@ void GetTtsText::ForTestingSetLocaleWithJson(std::string const & jsonBuffer, std
m_getCurLang = platform::ForTestingGetTextByIdFactory(jsonBuffer, locale);
This method is always checked for
-1
, can the code be rewritten to usestrings::EndsWith
?Is it possible to also avoid 1) Unicode conversion and 2) memory allocation for substrings?
Let's create a separate turns_tts_hungarian.cpp file with all hungarian-related logic, it will make adding other languages easier and general code cleaner.
Won't this function fail on strings 1) without spaces and 2) starting with a space?
Won't this code crash on strings 1) without spaces and 2) starting with a space?
Can all characters be defined as
U'e'
char32_t code points, and all work/transformation/comparison be done using UniString, without unnecessary conversions?std::u32string_view
can also be used if necessary.Move it outside of the loop.
Maybe create a static function
RemoveLastDot()
to avoid duplication?Should dots be removed from translations instead?
Do spaces break ja TTS?
@ -74,20 +74,53 @@ public:
struct RoadNameInfo
{
@vng updated, please double check me
@biodranik just seems like an existing convention; will need to include Config into NavigationService.
We'll find out! Certain TTS-specific words like "onto" do need to be specified but I've removed all other duplication.
@ -43,3 +51,4 @@
[make_a_right_turn]
en = Turn right.
ar = انعطف يميناً.
Correct, adding an
en
here would require that all languages not having a unique string for "make a slight right turn onto $name" (different from english anyway) would need to be specified as blank here.@ -399,3 +468,3 @@
vi = Đi thẳng.
zh-Hans = 直行。
zh-Hans = 直行
zh-Hant = 直行。
It doesn't really matter for something like this because it's either always at the end of a sentence anyway or my logic removes terminal dots when it's in the middle of a sentence.
Before with a dot:
In 500 feet, Exit.
Exit.
Now with or without a dot:
In 500 feet, Exit onto Main Street
Exit onto Main Street
TTS is almost always smart enough to still sound authoritative when given a sentence fragment without terminal punctuation, so honestly I'd prefer having no periods at all but it really doesn't matter. I've re-added it here.
@ -410,3 +479,3 @@
de = Nehmen Sie die Ausfahrt.
el = Εξοδος.
es = Salida.
es = Salga.
It'd be interesting to hear if that was the case, I haven't noticed it in my tests.
@ -508,3 +661,4 @@
pl = Dotarłeś do celu.
pt = Chegou ao seu destino.
pt-BR = Você chegou ao seu destino.
ro = Ați ajuns.
I'll just re-add them. If it makes a difference for some people's TTS then we can keep them, my code handles them properly. This was just not-caring more than intention.
Yes all my added translations that haven't been human reviewed are automatically translated. I've added
az = Küçə adlarını elan edin
andAktiv edildikdə, dönmək üçün küçənin və ya çıxışın adı ucadan səsləndiriləcək.
hopefully that's acceptable for @rtsisyk ?When I started this PR some languages didn't have much support so that's the only reason any language would be missing. I'll add that too. The string directly above mine for example didn't have
hi
either, but does havesw
...@ -55,0 +59,4 @@
/// \param turns contains information about the next turns starting from the closest one.
/// \param distanceToTurnMeters is distance to the next turn in meters.
/// \param turnNotifications is a parameter to fill it if it's necessary.
/// \param nextStreetInfo is the RoadNameInfo for the next street to turn on (optional)
@biodranik I'm not the person to ask about C++ API/language design, I'll follow your suggestions :)
This code is trying to replicate the same structure of
MWMSettingsViewController.m
where a reference is stored to each cell to be able to distinguish between them in theswitchCell
callbackAt least it will hide the variable visibility outside of the mm file.
@ -55,0 +59,4 @@
/// \param turns contains information about the next turns starting from the closest one.
/// \param distanceToTurnMeters is distance to the next turn in meters.
/// \param turnNotifications is a parameter to fill it if it's necessary.
/// \param nextStreetInfo is the RoadNameInfo for the next street to turn on (optional)
A pure C++ way is to use std::optional, but it has a bit more overhead compared to passing nullptr.
I can't imagine a situation where we'd want to make "announce street names" much besides yes or no. And in any case if we did we'd need to go in and reimplement
Config.setAnnounceStreets(newVal)
as well asmTtsPrefStreetNames
and test that the change worked... seems fine as is to me but I'm not a professional Java programmer... other settings in this file are implemented the same way.@ -55,0 +59,4 @@
/// \param turns contains information about the next turns starting from the closest one.
/// \param distanceToTurnMeters is distance to the next turn in meters.
/// \param turnNotifications is a parameter to fill it if it's necessary.
/// \param nextStreetInfo is the RoadNameInfo for the next street to turn on (optional)
I handled this with simply an empty RoadNameInfo struct override function, which works and seems fine / efficient enough for my tastes. If you want me to redo it with nullptr or std::optional I can, but it seems like a stylistic choice.
Later we check
if (notification.m_nextStreetInfo.empty())
before using it.Do we have a linter for this kind of stylistic stuff? It's getting pretty nitty in here, at this rate this PR will be in progress until 2026...
It's not about breaking, simply that Japanese doesn't have spaces and stray spaces make the test suite weird/difficult/inconsistent.
It's possible however it's not really important and if there is some TTS engine out there that prefers sentences to be ended with hard stops (or a translator gets a little too eager) I don't want that to sound weird here. I erred on the side of a minimal change instead of presumptively redoing lots of stuff.
Back in August, I suggested in Slack that this project investigate something like Fluent for representing translations more flexibly. I was hoping this project could learn from the challenges that OSRM Text Instructions faced. But there wasn’t much appetite for reconsidering the slots-based approach, since so much had already gone into it even by then.
A significant number of the code review comments since then have been individual suggestions from translators that a translator could’ve self-served in any translation management system. I’m confused because #4076 makes it seem like Organic Maps is already being translated via Weblate – why the different system? Apparently Weblate supports this INI format, Android strings, Apple strings, and Fluent.
@ -43,3 +51,4 @@
[make_a_right_turn]
en = Turn right.
ar = انعطف يميناً.
This is a great argument for using localizable format strings instead of the Lego-like “slots” that this PR has been designed around. The main downside to format strings is that you need a lot of slightly differing format strings to handle various scenarios, which is extra maintenance overhead and translation burden. But it seems like this PR gets a different form of overhead anyways.
@ -409,7 +478,7 @@
da = Afkørsel.
de = Nehmen Sie die Ausfahrt.
el = Εξοδος.
“Lối thoát” means “an escape”, and “vào” would introduce a street name, not an exit number.
Vietnamese avoids passive voice like the plague, but you wouldn’t know it from machine translation:
“Đường phố“ specifically means a city street, as opposed to a road.
Is there any way to tell which languages were machine-translated and which were human-translated? If not, errors will likely go undetected for longer.
@ -409,7 +478,7 @@
da = Afkørsel.
de = Nehmen Sie die Ausfahrt.
el = Εξοδος.
@1ec5 all new strings that haven't had a human reviewer in the comments here are machine generated. You can see the list of language teams at the bottom and who's proposed changes; any and all help is definitely appreciated.
@1ec5 thanks for the review Minh! We are using format strings in the sense of
Turn $1 onto $2
etc which is about as flexible as we can get without me majorly rearchitecting the entire app's i18n as my first (long delayed) PR.We're moving towards Weblate as fast as we can but the core issue is that the developer workflow is via these text files which are actually not a supported format. They look like INIs but most platforms don't support all strings (all languages) in one file: they expect one file per language. So the most direct fix is to just abandon the txt file format entirely and change the per-platform string files directly... but that's a big change for developers and core contributors used to doing it the other way for so long.
“Turn $1 onto $2” would be fine, but the current approach seems to be literally “$1 $2 $3 $4”, where $1 is “Turn” and $3 is “onto”. That’s no different than concatenating Lego strings together. Placeholders are for dynamic content, e.g., street names, not static strings. The challenge OSRMTI faced was that we had to choose between treating words like “right” or “fifth” as dynamic content (harder for some languages to integrate into the overall sentence without custom code) or inlining them as static content (resulting in a proliferation of format strings, because these words can’t be reused as easily). The current approach really leans into the former.
If you’re open to addressing this issue as part of this PR, I think it would be a matter of inlining the existing translations into the format string as a one-time operation. Then the issue of null slots in organicmaps/organicmaps#3130 would go away, because the translator can simply omit a word in the format string, confident that another language’s translation won’t take its place.
@1ec5 I hear you, but I think you're underestimating the complexity of the format string -- which does change, as some languages have different subject-verb-adjective-object ordering. We could certainly move some logic out of the locale strings and into the program, but honestly the most stable fix would be to just have some locale strings/files behave differently than others (maybe a marker or argument that disables the english fallback logic etc.)
Merging some of these strings into the formatter to remove the formatter would likely increase duplication not decrease: previous versions of this PR had something like
$1 $2 onto $3
until we realized that "onto" actually needs to itself be dynamic based on a few different factors.It's true that we can't very easily reuse any part of "take the fifth exit" across all languages due to counting issues, what constitutes an "exit," how to conjugate "take", etc, but that's not the biggest problem the app faces so I'm not super worried. It is a future goal to support arbitrary distances and countings, like "in
13
meters, turn right", which will put all of this to the test especially with that fun fun fun pluralization, and that's being discussed and maybe we do move to something like Fluent later -- cc @pastkFor now, this PR is about including the street name in the turn instructions, and I fear if we stray past the minimal changeset to accomplish that goal then it'll be 2026 before this feature gets released.
@ -409,7 +478,7 @@
da = Afkørsel.
de = Nehmen Sie die Ausfahrt.
el = Εξοδος.
Translation change request:
Dunno what happened about
dist_direction_onto_street
…@ -47,10 +48,63 @@ void GetTtsText::ForTestingSetLocaleWithJson(std::string const & jsonBuffer, std
m_getCurLang = platform::ForTestingGetTextByIdFactory(jsonBuffer, locale);
I've refactored a bit; we can't just do a pure EndsWith because we need to check backwards iteratively, but I did indeed refactor with EndsWith on a substring instead if the repetitious i; i-1; i-2; etc.
@1ec5 I replied elsewhere but it got attached to an unrelated comment... there are facilities in Github to see which language review team has checked a PR but the quickest way is to see which languages have been debated here in the comments. You're correct that a more formal language review process/team would be ideal, I haven't been around long enough to know what the status is.
I don’t think I am. “$1 $2 onto $3” can be translated as “Onto $3 by $2ing $1” if necessary. What you’d lose is the ability to tell that “onto” is the preposition that needs to be inclined differently depending on $3’s gender, so you’d have to have a different string per gender or some form of markup within the string itself. OSRMTI was able to accommodate most of these needs for most languages, except for some prominent gaps in German.
Right, my point is that redundant, unreusable strings aren’t necessarily such a problem that you need to avoid them so aggressively. It’s OK to make translators translate more strings if that gives them more control over the final output. OSRMTI struck a balance that translators found somewhat daunting at first, but it isn’t nearly as burdensome as in Valhalla, which does minimal substitution or concatenation. Sometimes OSRMTI contributors would complain about having to translate 300 strings, but once they got started on the first 20%, Transifex’s translation memory kicked in, speeding up the remaining 80%.
Fluent (and a work-in-progress industry standard, MessageFormat2) would somewhat address this tradeoff, so that translators would work with something streamlined and the redundant strings would only occur in the built product. Translatewiki.net supports an even more powerful templating language as part of its message format, but that’s less useful for mobile platforms where there’s no tooling to interpret that language.
It isn’t for me to insist on reworking the translation units as part of this PR. Going forward, there will be plenty of opportunities to experience the limitations of the current approach and evaluate alternatives (not that the alternatives are unalloyed goods either).
Any translation management service would have that functionality, including Weblate, so the answer could just be that a TMS will be adopted sooner or later.
The instant we get any more complex than this I fully agree; as you can see Hungarian was a whole thing by itself and I'm not sure Fluent helps much with vowel harmony especially for unknown variables (street names).
But like I said there is a desire to support dynamic numbers and that's a prime Fluent use case; we're already in the realm of silliness with 11 strings for 1-11 roundabout exits. I see the value, but I think it's a project for a future day
@ -47,10 +48,63 @@ void GetTtsText::ForTestingSetLocaleWithJson(std::string const & jsonBuffer, std
m_getCurLang = platform::ForTestingGetTextByIdFactory(jsonBuffer, locale);
We have tests for strings with spaces and they seem to pass
Done
@ -74,20 +74,53 @@ public:
struct RoadNameInfo
Why the order of member vars here was changed?
Now
std::u32string_view constexpr front {U"eéöőüű"};
can be used, it's easier to read and to support.Avoid unnecessary intermediate conversions:
@ -0,0 +136,4 @@
// scan for acronyms and numbers first (i.e. characters spoken differently than words)
// if the last word is an acronym/number like M5, check those instead
if (EndsInAcronymOrNum(myUniStr))
return CategorizeHungarianAcronymsAndNumbers(hungarianString);
Does it make sense to pass here the lower-cased unicode string and use the same U'char' approach in the function?
🇭🇺✅
@ -74,20 +74,53 @@ public:
struct RoadNameInfo
I don't think they were, or if they were it was prior to rebase
@biodranik do you have a linter for this kind of stuff? If we delay PRs for all these nits I'll have an average contribution rate of 1 PR per 3 years.
Commit your changes first. Then use
git clang-format <commit_hash>
There could be some issues, as we didn't finish polishing it here organicmaps/organicmaps#4755
@ -0,0 +136,4 @@
// scan for acronyms and numbers first (i.e. characters spoken differently than words)
// if the last word is an acronym/number like M5, check those instead
if (EndsInAcronymOrNum(myUniStr))
return CategorizeHungarianAcronymsAndNumbers(hungarianString);
No, acronyms are all uppercase strings like IBM or ST, it would defeat the point to pass a lowercase string into an acronym function. The lowercasing is for ease of detecting and matching all other non-acronym text.
@ -233,6 +233,10 @@
<string name="pref_map_3d_buildings_disabled_summary">Los edificios 3D se apagan en el modo de ahorro de energía</string>
<!-- Settings «Route» category: «Tts enabled» title -->
<string name="pref_tts_enable_title">Instrucciones de voz</string>
<!-- Settings «Route» category: «Tts announce street names» title -->
Please replace "Cuando está habilitado" with "Cuando se activa" (shorter).
Also, let's add Catalan:
<string name="pref_tts_street_names_title">Anuncia els noms dels carrers</string>
<string name="pref_tts_street_names_description">Quan s'activa, el nom del carrer o la sortida per girar es pronunciarà en veu alta.</string>
@ -233,6 +233,10 @@
<string name="pref_map_3d_buildings_disabled_summary">Los edificios 3D se apagan en el modo de ahorro de energía</string>
<!-- Settings «Route» category: «Tts enabled» title -->
<string name="pref_tts_enable_title">Instrucciones de voz</string>
<!-- Settings «Route» category: «Tts announce street names» title -->
@fitojb like this?
46f3219e0c
@biodranik ok this works with the commit prior to the one you want to edit and also does all commits since (so it mangles locale json files etc) but otherwise thanks this will be useful to reduce nits
@ -233,6 +233,10 @@
<string name="pref_map_3d_buildings_disabled_summary">Los edificios 3D se apagan en el modo de ahorro de energía</string>
<!-- Settings «Route» category: «Tts enabled» title -->
<string name="pref_tts_enable_title">Instrucciones de voz</string>
<!-- Settings «Route» category: «Tts announce street names» title -->
@zyphlar Thank you!
You can safely make these lines one-liners.
ASSERT_LESS_OR_EQUAL is enough.
Is it still used?
Can it be empty?
en =
?@ -252,2 +252,3 @@
std::vector<std::string> notifications;
self.rm.GenerateNotifications(notifications);
auto announceStreets = [NSUserDefaults.standardUserDefaults boolForKey:@"UserDefaultsNeedToEnableStreetNamesTTS"];
self.rm.GenerateNotifications(notifications, announceStreets);
kIsStreetNamesTTSEnabled ?
@ -191,2 +191,3 @@
std::vector<std::string> notifications;
GetFramework().GetRoutingManager().GenerateNotifications(notifications);
auto announceStreets = [NSUserDefaults.standardUserDefaults boolForKey:@"UserDefaultsNeedToEnableStreetNamesTTS"];
GetFramework().GetRoutingManager().GenerateNotifications(notifications, announceStreets);
@dvdmrtnz kIsStreetNamesTTSEnabled ?
@ -74,20 +74,53 @@ public:
struct RoadNameInfo
nit: align //
@ -47,10 +48,63 @@ void GetTtsText::ForTestingSetLocaleWithJson(std::string const & jsonBuffer, std
m_getCurLang = platform::ForTestingGetTextByIdFactory(jsonBuffer, locale);
}
/**
If outArr size is 0, then it's an (almost) infinite loop.
Are these some debug logs that should be removed?
nit: it's better to define strings closer to the place of their usage.
Can all these searches be replaced with
?
nit: we usually write const and constexpr after the variable type.
As Unicode symbols can take 2 or more
char
bytes, this code won't work properly.Can UniString be passed into this function, and u32string_view be used for back and front names?
isdigit accepts int, there should be a warning, right? static_cast can be helpful.
Maybe this Isdigit can be also used: https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/uchar_8h.html#a42b37828d86daa0fed18b381130ce1e6
@ -0,0 +146,4 @@
std::u32string_view constexpr indeterminate{U"ií"};
// find last vowel in last word
for (size_t i = myUniStr.size() - 1; i > 0; i--)
Is there a guarantee/check somewhere that the string is never empty?
Internal .cpp functions should not be defined in headers.
This is related to all functions below. Please remove from this header everything what is not used by external code. You can move function comments into the .cpp if necessary.
@ -0,0 +10,4 @@
namespace turns
{
namespace sound
{
@ -0,0 +64,4 @@
} // namespace sound
} // namespace turns
} // namespace routing
@biodranik this and other word-wrap changes were from the clang linter, do you want to change the linter or have me fight with it in some specified way?
@biodranik we just had a big long conversation about this and this was the solution: it could be empty, but then your locale file validator would complain, so explicitly setting it to the string
"NULL"
and checking for that is the only sane workaround I can imagine@biodranik on Jan 27 you told me
If we keep on making changes to the changes you told me to make, this PR will never be closed. I'm going to leave this as-is because CHECK_LESS_OR_EQUAL is used in two other places in this file but ASSERT_ is not used, so it makes more sense to continue the existing pattern.
No, I'll remove
No it actually works fine in this case because we're looking for substring matches against other small strings and we return on the first match... if a string ends in multi-byte characters, we'll just loop until we match one of the strings above. I have test cases for this, it currently works as intended
We have to take in and return a std::string because of the outside API but otherwise OK
@ -47,10 +48,63 @@ void GetTtsText::ForTestingSetLocaleWithJson(std::string const & jsonBuffer, std
m_getCurLang = platform::ForTestingGetTextByIdFactory(jsonBuffer, locale);
}
/**
This wasn't even caught by clang linting, if you want this level of formatting maybe tell the linter to do so
@ -47,10 +48,63 @@ void GetTtsText::ForTestingSetLocaleWithJson(std::string const & jsonBuffer, std
m_getCurLang = platform::ForTestingGetTextByIdFactory(jsonBuffer, locale);
}
/**
That's why this code won't execute unless outArr size is greater than 0 (which is a little silly because that means we're announcing a turn onto a road with no name, number, junction or destination, but ok)
Comparing partial byte sequences instead of characters is possible, and even may work in most cases. This is not the right way to work with characters. If the code was iterating using characters, not bytes, then it was safe.
@ -47,10 +48,63 @@ void GetTtsText::ForTestingSetLocaleWithJson(std::string const & jsonBuffer, std
m_getCurLang = platform::ForTestingGetTextByIdFactory(jsonBuffer, locale);
}
/**
If you use
intmax_t i = 0; i < static_cast<intmax_t>(outArr.size()) - 1; ++i
, then additional check for the array size is not needed.@biodranik tests pass with simply calling
.endsWith()
repeatedly so if we're lucky that's all that's neededThanks! We're almost there. Every time I see something new )
The linter should be fixed to leave
JNIEXPORT jobjectArray JNICALL
on the previous line. JNI function names can be very long.Manual undo is enough.
Documenting obvious code only complicates it )
Instead of writing long comments, renaming arguments to haystack and needle is simpler and cleaner.
@ -252,2 +252,3 @@
std::vector<std::string> notifications;
self.rm.GenerateNotifications(notifications);
auto announceStreets = [NSUserDefaults.standardUserDefaults boolForKey:@"UserDefaultsNeedToEnableStreetNamesTTS"];
self.rm.GenerateNotifications(notifications, announceStreets);
In the beginning of the file, above @interface, add
@ -191,2 +191,3 @@
std::vector<std::string> notifications;
GetFramework().GetRoutingManager().GenerateNotifications(notifications);
auto announceStreets = [NSUserDefaults.standardUserDefaults boolForKey:@"UserDefaultsNeedToEnableStreetNamesTTS"];
GetFramework().GetRoutingManager().GenerateNotifications(notifications, announceStreets);
@ -15,6 +15,7 @@ namespace
{
NSString * const kUserDefaultsTTSLanguageBcp47 = @"UserDefaultsTTSLanguageBcp47";
NSString * const kIsTTSEnabled = @"UserDefaultsNeedToEnableTTS";
NSString * const kIsStreetNamesTTSEnabled = @"UserDefaultsNeedToEnableStreetNamesTTS";
This should be moved up from the anonymous
namespace
.Remove
name.clear()
call in the beginning, remove name as a parameter, and remove it instead as a result of the function instead ofvoid
.@ -65,4 +119,3 @@
std::string const dirStr = GetTextById(GetDirectionTextId(notification));
if (dirStr.empty())
return {};
Strings are empty by default.
nit:
Using a regex to trim is an overkill. Trailing spaces can also be trimmed, right?
Will returning an empty string instead of artificial NULL simplify the code?
@ -46,3 +56,4 @@
};
/// Generates text message id about the distance of the notification. For example: In 300 meters.
std::string GetDistanceTextId(Notification const & notification);
/// Generates text message id for roundabouts.
Is the goal to remove the last unicode dot from the string? Then it's not the case. Functions below remove any last point in the sentence, including "test.string" => "teststring".
Use base::EatSuffix(".") instead.
Here and below {} are not needed for one-liners.
These comments better look right before the if.
@ -252,2 +252,3 @@
std::vector<std::string> notifications;
self.rm.GenerateNotifications(notifications);
auto announceStreets = [NSUserDefaults.standardUserDefaults boolForKey:@"UserDefaultsNeedToEnableStreetNamesTTS"];
self.rm.GenerateNotifications(notifications, announceStreets);
Please excuse me for my C language, but what is the purpose of adding const for integral types?
@ -252,2 +252,3 @@
std::vector<std::string> notifications;
self.rm.GenerateNotifications(notifications);
auto announceStreets = [NSUserDefaults.standardUserDefaults boolForKey:@"UserDefaultsNeedToEnableStreetNamesTTS"];
self.rm.GenerateNotifications(notifications, announceStreets);
It is a pointer type, not an integral type. The difference is that
const
marked variables can be placed in the read-only memory sections of the executable, which may lead to better performance (no guards when accessing), lesser code size, and more safety against accidental modifications.@biodranik No, we've gone back and forth on this about 5 times: if an i18n string is actually empty it will be detected by various i18n validators as missing and either fail validation or encourage people to supply it which is not necessary (and will be harmful/confusing) for most languages. But with increasing i18n complexity we do want translators to be able to provide i18n'd formatter string values as appropriate. So the default for some of these strings will be the keyword
"NULL"
so as to make it explicit that we want this string to be blank for languages without another string defined. This is my workaround to hacking into our i18n system as discussed in chat..Thanks to @zyphlar and to everyone else involved. The time has come to let it go!
It would be great to keep up-to-date issue with the necessary improvements.