-
Notifications
You must be signed in to change notification settings - Fork 698
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix stitching errors in gtfs ingestion #3943
Conversation
@@ -234,7 +234,7 @@ std::priority_queue<tileTransitInfo> select_transit_tiles(const boost::property_ | |||
void setup_stops(Transit& tile, | |||
const gtfs::Stop& tile_stop, | |||
GraphId& node_id, | |||
std::unordered_map<feedObject, GraphId>& stop_graphIds, | |||
std::unordered_map<feedObject, GraphId>& platform_node_ids, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when we write the stops (nodes) into a transit pbf tile we keep track of the gtfs ids assigned to those node which are platforms, thats what this map does. we need that because later we will ingest the part of the feed that references those platforms to connected them with routes/trips at some sort of schedule.
@@ -257,18 +257,29 @@ void setup_stops(Transit& tile, | |||
auto onestop_id = isGenerated ? tile_stop.stop_id + "_" + to_string(node_type) : tile_stop.stop_id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to simplify our model of transit we require that every station (even if it doesnt in real life) has:
- at least one in/egress
- exactl one station
- at least one platform
this makes handling connecting the data somewhat easiser and it models the highest level of complexity we want to support (even though bus stops dont need this in almost all circumstances). anyway... on the line above though you'll notice that what we do when the gtfs feed doesnt come with one of these things (either in/egress or platform, as it always has a station) we fake one so that our simplified model is fulfilled. when we fake one we have to give it a unique id so we take its parent stations id and just add the type onto it (assuming real ones one use the type as a name).
// other gtfs entities will refer to it. however when it is generated we must use its parents id | ||
// and not the generated one because the references in the feed have no idea that we are doing | ||
// the generation of new ideas for non-existant platforms | ||
platform_node_ids[{tile_stop.stop_id, feed_path}] = node_id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the problem with giving these generated platforms a new id is that that id is not known to any other entities in the gtfs feed. any schedule information (trips) that connection stations together will connect them by the stations id since from the gtfs feeds point of view there is no platform, valhalla is making that up. so later, when we want to look up with graphid we gave to a platform that we made up by its id, ie when we are iterating over the gtfs feeds trips we need to remember those graphids by their parent station id not their new made up generated id, because of course the feeds raw data doesnt know about that, its a valhalla convention
} | ||
node_id++; | ||
} | ||
|
||
/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from here down there are pretty much no real important changes, its all me re-naming the generic "stop_graphids" to the specific "platform_node_ids". i think this variable name makes it much more clear what information is going in there even though for the generated ones the gtfs ids will be the parent station (but it is the one to use to get that platforms graphid)
bool dest_is_generated = tile_info.tile_station_children.find({dest_stopId, currFeedPath}) != | ||
tile_info.tile_station_children.end(); | ||
auto dest_onestop_id = | ||
dest_is_generated ? dest_stopId | ||
: dest_stopId + "_" + to_string(NodeType::kMultiUseTransitPlatform); | ||
stop_pair->set_destination_onestop_id(dest_onestop_id); | ||
stop_pair->set_origin_graphid(kInvalidGraphId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the first time i started debugging this problem i thought i could help myself by initializing these to invalid but what i forgot was that this is protobuf 2 so later on it easily just calls has_origin_graphid
rather than checking for our sentinel value. same for destination below
.stop_lat = station_one_ll->second.second, .stop_lon = station_one_ll->second.first, | ||
.parent_station = stopOneID, .location_type = gtfs::StopLocationType::EntranceExit, | ||
.stop_timezone = "America/Toronto", .wheelchair_boarding = "1", | ||
.stop_id = stopOneID + "_rotating_door_eh", .stop_name = gtfs::Text("POINT NEMO"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
decided to use a custom id name rather than what valhalla would generate to increase hardness to input from real gtfs
.stop_lat = station_one_ll->second.second, .stop_lon = station_one_ll->second.first, | ||
.parent_station = stopOneID, .location_type = gtfs::StopLocationType::StopOrPlatform, | ||
.stop_timezone = "America/Toronto", .wheelchair_boarding = "1", | ||
.stop_id = stopOneID + "_ledge_to_the_train_bucko", .stop_name = gtfs::Text("POINT NEMO"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, just allowing it to have a non standard id
@@ -152,6 +152,15 @@ TEST(GtfsExample, WriteGtfs) { | |||
.stop_timezone = "America/Toronto", .wheelchair_boarding = "1", | |||
}; | |||
feed.add_stop(thirdStop); | |||
struct gtfs::Stop thirdStopPlatform { | |||
.stop_id = stopThreeID + "_walk_the_plank_pal", .stop_name = gtfs::Text("THIRD STOP"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this makes sure to test the case that we can find the graph id for platforms who have actual ids rather than being generated (like the second stations platform above)
stopOneID + "_ledge_to_the_train_bucko", | ||
stopTwoID, | ||
stopThreeID + "_walk_the_plank_pal", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have to update the trips information to reference the proper stops, you'll see that we reference the two non generated platforms by name but the middle one is generated and will use the parent station id to look up its platform when connecting during pbf generation
@@ -257,7 +269,7 @@ TEST(GtfsExample, WriteGtfs) { | |||
EXPECT_EQ(trips[0].trip_id, tripOneID); | |||
|
|||
const auto& stops = feed_reader.get_stops(); | |||
EXPECT_EQ(stops.size(), 5); | |||
EXPECT_EQ(stops.size(), 6); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a platform to the 3rd station above
EXPECT_TRUE(transit.stop_pairs(i).has_origin_graphid()); | ||
EXPECT_TRUE(transit.stop_pairs(i).has_destination_graphid()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to check for bad values, just check that they are actually set
std::string stop_pair_ids[3] = { | ||
stopOneID + "_ledge_to_the_train_bucko", | ||
stopTwoID + "_" + to_string(NodeType::kMultiUseTransitPlatform), | ||
stopThreeID + "_walk_the_plank_pal", | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
have to look for the correct names now that they are different
EXPECT_EQ(*stops.find(stopID), stopID); | ||
EXPECT_TRUE(stops.find(stopID) != stops.end()); | ||
} | ||
for (const auto& stop_pair_id : stop_pair_ids) { | ||
EXPECT_EQ(*stops.find(stop_pair_id), stop_pair_id); | ||
EXPECT_TRUE(stops.find(stop_pair_id) != stops.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was fairly non-sensical
@@ -385,7 +398,7 @@ TEST(GtfsExample, MakeTile) { | |||
|
|||
GraphReader reader(pt.get_child("mjolnir")); | |||
auto graphids = reader.GetTileSet(); | |||
int totalNodes = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bad name made it hard to understand
EXPECT_EQ(totalNodes, 9); | ||
EXPECT_EQ(transit_nodes, 9); | ||
EXPECT_EQ(uses[Use::kRoad], 6); | ||
// TODO: transit connect edges should only exist between egress and street nodes | ||
// EXPECT_EQ(uses[Use::kTransitConnection], 6); | ||
// NOTE: there are 4 for every station (3 of those) because we connect to both ends of the closest | ||
// edge to the station and the connections are bidirectional (as per usual) | ||
EXPECT_EQ(uses[Use::kTransitConnection], 12); | ||
EXPECT_EQ(uses[Use::kPlatformConnection], 6); | ||
EXPECT_EQ(uses[Use::kEgressConnection], 6); | ||
// TODO: it looks like convert transit skips putting any actual rail edges into level 3 | ||
// EXPECT_EQ(uses[Use::kRail], 4); | ||
// TODO: this is the only time in the graph that we dont have opposing directed edges (should fix) | ||
EXPECT_EQ(uses[Use::kRail], 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all of these pass now! that means good things!
After merging #3700 i wrote a huge pr description with the work left to do. i will convert that into an issue and we can burn down the tasks but today i bring you the first task already burned down. There were errors about not being able to find the origin/destination stops of some of the stop pairs (the connections between stations where you actually ride on public transit). this makes other things fail. thankfully the problem wasnt all that complex. i'll describe the fix in the diff, comments should as well
ok so the only sad thing about fixing the stitching error was that the error wasnt stitching at all. this means that in a separate pr we need to offset the graph a bit more so that a trip crosses tiles (forcing stitchign to have to happen). i can add this to the list of things we need to do