Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pronunciation for names and destinations #3132

Merged
merged 176 commits into from
Oct 7, 2021
Merged

Pronunciation for names and destinations #3132

merged 176 commits into from
Oct 7, 2021

Conversation

gknisely
Copy link
Member

@gknisely gknisely commented Jun 8, 2021

Issue

Process the pronunciation for names and destinations in OSM. Store the pronunciation with the associated name or destination(sign).

Tasklist

  • Add tests
  • Update the changelog
  • If you made changes to the lua files, update the taginfo too.

@kevinkreiser
Copy link
Member

there was another pr that @danpaz started that was attempting to do this, the size of this PR is surprizing compared to what i remember that one was. do you guys mind filling out the description a bit when you are ready?

src/baldr/edgeinfo.cc Outdated Show resolved Hide resolved
@kevinkreiser
Copy link
Member

i know its probably not ready yet but, i thought i saw at different places in the code, different names being used for the same concept: phoneme, pronounciation, verbal would we be able to use just one of these for the sake of reducing confusion? or maybe these are different things and i didnt review it closely enough yet?

"ˈgɹænvɪl");

// Verify guide sign pronunciation instructions
gurka::assert::raw::expect_instructions_at_maneuver_index(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verify guide sign pronunciation instructions

"ɛm ˈʤʌŋkʃən");

// Verify junction name pronunciation instructions
gurka::assert::raw::expect_instructions_at_maneuver_index(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verify junction name pronunciation instructions

namespace valhalla {
namespace baldr {

class Pronunciation {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class to use with street names and signs through odin

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is completely trivial, could we just make it a struct with public members and no constructor?

namespace valhalla {
namespace odin {

class MarkupFormatter {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new class that will handle the formatting of our markup strings

@@ -20,7 +21,8 @@ class NarrativeBuilder {
public:
NarrativeBuilder(const Options& options,
const EnhancedTripLeg* trip_path,
const NarrativeDictionary& dictionary);
const NarrativeDictionary& dictionary,
const MarkupFormatter& markup_formatter);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pass markup formatter to the narrative builder

@@ -50,6 +63,7 @@ class Sign {
std::string text_;
bool is_route_number_;
uint32_t consecutive_count_;
boost::optional<baldr::Pronunciation> pronunciation_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional pronunciation for a sign

@@ -127,6 +127,24 @@ inline TripLeg_Node_Type GetTripLegNodeType(const baldr::NodeType node_type) {
" Unhandled NodeType: " + std::to_string(num));
}

inline Pronunciation_Alphabet
GetTripPronunciationAlphabet(const valhalla::baldr::PronunciationAlphabet pronunciation_alphabet) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transform to proto Pronunciation Alphabet

@gknisely gknisely mentioned this pull request Oct 7, 2021
Copy link
Member

@dgearhart dgearhart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - would appreciate @kevinkreiser 👀 for final review

@kevinkreiser
Copy link
Member

Yep I'll review later today!

CHANGELOG.md Outdated Show resolved Hide resolved
src/baldr/edgeinfo.cc Outdated Show resolved Hide resolved
Comment on lines +802 to +814
std::unordered_map<uint32_t, std::pair<uint8_t, std::string>>::iterator iter =
index_pronunciation_map.find(header.name_index_);

if (iter == index_pronunciation_map.end())
index_pronunciation_map.emplace(
std::make_pair(header.name_index_,
std::make_pair(header.phonetic_alphabet_,
std::string((text + pos), header.length_))));
else {
if (header.phonetic_alphabet_ > (iter->second).first) {
iter->second = std::make_pair(header.phonetic_alphabet_,
std::string((text + pos), header.length_));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use std::unordered_map<>::insert and if it already existed it will give you back the iterator where it already existed. that said, why do we want to overwrite it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. And yes; we use Jeita over Katakanna if it exists

Comment on lines +6 to +18
// Constructor
Pronunciation::Pronunciation(const valhalla::Pronunciation_Alphabet alphabet,
const std::string& value)
: alphabet_(alphabet), value_(value) {
}

valhalla::Pronunciation_Alphabet Pronunciation::alphabet() const {
return alphabet_;
}

const std::string& Pronunciation::value() const {
return value_;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why have this? if you make the header a struct we dont need to have this implementation at all. you can intialize it with the brace initializer and you can just use .value etc to get the information. it just seems not useful to me to have an impelmentation file which does nothing that you cant achieve with just the header and a simple struct. maybe this is just a style thing but i prefer less files to manage than more. if there is only one place this struct is referenced i might even just put it in that files header and not even make a new header, its already hard to manage all the source files imho

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinkreiser can we refactor this for the next release? so we have more time to test the refactor

tagged_values, types, added, dual_refs);
edge_info_offset =
graphtile.AddEdgeInfo(edge_pair.second, (*nodes[source]).graph_id,
(*nodes[target]).graph_id, w.way_id(), 1234, bike_network,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought we fixed this random elevation bug previously....

Comment on lines +1433 to +1454
std::vector<std::string> ipa_tokens, nt_sampa_tokens, katakana_tokens, jeita_tokens;
GetPronunciationTokens(osmdata, node.ref_pronunciation_ipa_index(),
node.ref_pronunciation_nt_sampa_index(),
node.ref_pronunciation_katakana_index(),
node.ref_pronunciation_jeita_index(), ipa_tokens, nt_sampa_tokens,
katakana_tokens, jeita_tokens, true);

bool add_ipa = (ipa_tokens.size() && n_refs.size() == ipa_tokens.size());
bool add_nt_sampa = (nt_sampa_tokens.size() && n_refs.size() == nt_sampa_tokens.size());
bool add_katakana = (katakana_tokens.size() && n_refs.size() == katakana_tokens.size());
bool add_jeita = (jeita_tokens.size() && n_refs.size() == jeita_tokens.size());

for (size_t i = 0; i < n_refs.size(); ++i) {
if (add_ipa || add_nt_sampa || add_katakana || add_jeita) {
uint32_t count = 0;
const size_t size = pronunciations.size();
BuildPronunciations(ipa_tokens, nt_sampa_tokens, katakana_tokens, jeita_tokens, i,
pronunciations, add_ipa, add_nt_sampa, add_katakana, add_jeita, count);
exit_list.emplace_back(Sign::Type::kExitNumber, false, false, (count != 0), size, count,
n_refs[i]);
} else
exit_list.emplace_back(Sign::Type::kExitNumber, false, false, false, 0, 0, n_refs[i]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

too bad this isnt in a lambda would be nice to have less duplication

@@ -311,7 +311,7 @@ void validate(

// Validate signs
if (ni->named_intersection()) {
if (tile->GetSigns(i, true).size() == 0) {
if (tile->ProcessSigns(i, true).size() == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: but i liked the previous name way better

names.insert(names.end(), tokens.begin(), tokens.end());

size_t key = names.size() - tokens.size();
AddPronunciations(pronunciations, name_offset_map,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i feel like inside AddPronunciations we could have done all 4 rather than call it 4 times?

Comment on lines -1191 to +1193
StreetNamesFactory::Create(trip_path_->GetCountryCode(node_index), prev_edge->GetNameList()));
StreetNamesFactory::Create(trip_path_->GetCountryCode(node_index), prev_edge->name()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one is not obvious to me

@@ -1357,7 +1377,7 @@ void ManeuversBuilder::FinalizeManeuver(Maneuver& maneuver, int node_index) {
if (!curr_edge->IsHighway() && !curr_edge->internal_intersection() &&
(curr_edge->name_size() > 1)) {
std::unique_ptr<StreetNames> curr_edge_names =
StreetNamesFactory::Create(trip_path_->GetCountryCode(node_index), curr_edge->GetNameList());
StreetNamesFactory::Create(trip_path_->GetCountryCode(node_index), curr_edge->name());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are more, was it a function name change or just over complicated for no reason or?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevinkreiser had to change the method since we need the pronunciation too - it just had a list of name, is route num pair before

Copy link
Member

@kevinkreiser kevinkreiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could streamline this a bit more but good for now

@gknisely
Copy link
Member Author

gknisely commented Oct 7, 2021

@dgearhart and I will have a follow up PR to clean things up. TY @kevinkreiser

@gknisely gknisely merged commit 3652aca into master Oct 7, 2021
@gknisely gknisely deleted the phonemes branch February 22, 2024 13:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants