As a developer at Bing, I worked on a tool called queryprobe, which we used internally to provide insight into the Bing ranker and to help the team debug failed queries.
That experience taught me how important it is, when developing a search engine, to invest in internal tools that visualize and clarify what’s going on under the hood. Search engines are complex beasts and it’s not uncommon to spend hours trying to understand why a certain result was produced for a certain query.
Investing in internal tools makes the team more productive, and generally happier. My former PhD colleague Yaniv Bernstein (who co-authored a couple of research papers with me) recently wrote about his experiences working at IBM and Google. He explains that Google does a much better job of investing in internal tools, and that this makes a big difference to his productivity and sanity.
With that comment in mind, here’s an overview of a handful of tools we’ve developed over the last couple of years; each is designed to help us make sense of results produced by Rome2rio’s search algorithm.
Graph search visualization
We originally developed our graph search visualization to create a cool slide in our 2012 PhoCusWright presentation. Since then we’ve extended its functionality and turned it into a handy internal tool for explaining why Rome2rio did not display bus route X or flight path Y.
Search performance test
As we’ve said countless times before in this blog, search speed is critical to us. The Rome2rio routing engine combines a variety of stages to identify and display multi-modal search results. We use our performance measurement tool to visualize the time taken by each stage, for a variety of queries. This helps us ensure that search times for a stage don’t blow out when we make accuracy improvements to the algorithm.
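To make the idea of per-stage timing concrete, here is a minimal sketch in Python. It is not Rome2rio's actual tool: the `StageTimer` class and the stage names (`geocode`, `graph_search`, `pricing`) are hypothetical, chosen only to illustrate how time can be accumulated per stage across a batch of queries.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = self.clock()
        try:
            yield
        finally:
            # Record the elapsed time even if the stage raises.
            self.totals[name] += self.clock() - start
            self.counts[name] += 1

    def report(self):
        """Return (stage, mean seconds) pairs, slowest stage first."""
        means = {name: self.totals[name] / self.counts[name] for name in self.totals}
        return sorted(means.items(), key=lambda item: item[1], reverse=True)


def run_query(timer, query):
    # Stand-in work for each stage; a real engine would call its own code here.
    with timer.stage("geocode"):
        query.lower().split(" to ")
    with timer.stage("graph_search"):
        sum(range(10_000))
    with timer.stage("pricing"):
        sum(range(1_000))


if __name__ == "__main__":
    timer = StageTimer()
    for query in ["Melbourne to Sydney", "Rome to Rio"]:
        run_query(timer, query)
    for name, mean in timer.report():
        print(f"{name:>12}: {mean * 1000:.3f} ms")
```

Comparing the report before and after an algorithm change is one simple way to spot a stage whose time has blown out.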
Transport route visualization
Anyone can check out our train, bus and ferry route visualization by visiting www.rome2rio.com and clicking on the Transport button at the top-right corner of the map. This will switch to a monochrome map, with our own transport tiles rendered over the top. We find this incredibly useful for analysing our transport coverage and for checking whether our database has a particular train, bus or ferry.
Query heat map
This tool provides us with a heat map showing the destinations our users are searching for, and helps us understand which regions of the world matter most to them.
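One simple way to build such a heat map is to bin destination coordinates into grid cells and count the searches per cell. The sketch below assumes destinations are available as (latitude, longitude) pairs; the function name `bin_destinations` and the one-degree cell size are illustrative assumptions, not a description of our implementation.

```python
import math
from collections import Counter


def bin_destinations(destinations, cell_degrees=1.0):
    """Count searched destinations per grid cell.

    destinations: iterable of (lat, lon) pairs in decimal degrees.
    Returns a Counter keyed by (lat_index, lon_index) of the cell.
    """
    counts = Counter()
    for lat, lon in destinations:
        cell = (math.floor(lat / cell_degrees), math.floor(lon / cell_degrees))
        counts[cell] += 1
    return counts


if __name__ == "__main__":
    # Approximate coordinates, for illustration only.
    searches = [(-33.87, 151.21), (-33.50, 151.90), (41.90, 12.50)]
    for cell, count in bin_destinations(searches).most_common():
        print(cell, count)
```

The counts per cell can then be mapped to colours when rendering the map.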
Political region visualization
Rome2rio’s geocoder technology converts textual place names (such as “seattle”) to map co-ordinates for searching. One component of the geocoder is an awareness of political regions, which is required to identify that the city of Seattle is located in Washington state, in the USA. We have developed a visualizer to help check the correctness of this component.
Global taxi fares
We developed a repository of taxi fares when we launched our door-to-door indicative pricing feature this year. To assist with sanity checking the data, we developed a tool for visualizing taxi rates across the globe, with dark green representing countries with more expensive fares.
Landmass visualization
Rome2rio uses a database of connected landmasses (such as islands) to aid the routing engine. For example, the search engine will look for ferries and flights to reach a destination on another landmass, instead of considering surface routes. Our landmass visualization tool helps us make sense of our landmass detection logic and check its accuracy.
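The landmass check described above can be sketched as a simple lookup. This is an illustrative assumption rather than our routing code: the `LANDMASS_OF` table, the mode names and the `candidate_modes` function are all hypothetical, and a real engine would derive the landmass from coordinates rather than from place names.

```python
# Hypothetical mode groupings for illustration.
SURFACE_MODES = {"train", "bus", "car"}
CROSSING_MODES = {"ferry", "flight"}

# Hypothetical lookup from place name to landmass identifier.
LANDMASS_OF = {
    "Melbourne": "australia_mainland",
    "Sydney": "australia_mainland",
    "Hobart": "tasmania",
}


def candidate_modes(origin, destination, landmass_of=LANDMASS_OF):
    """Return the transport modes worth considering between two places."""
    if landmass_of[origin] == landmass_of[destination]:
        # Same landmass: surface routes are possible, alongside ferries and flights.
        return SURFACE_MODES | CROSSING_MODES
    # Different landmasses: skip surface routes, look for ferries and flights.
    return set(CROSSING_MODES)


if __name__ == "__main__":
    print(sorted(candidate_modes("Melbourne", "Sydney")))
    print(sorted(candidate_modes("Melbourne", "Hobart")))
```

If the lookup assigns a place to the wrong landmass, the engine would skip or wrongly attempt surface routes, which is why checking the detection logic visually is worthwhile.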
Transport agency fares
Our indicative pricing system also uses a repository of train, bus and ferry fares for thousands of operators. We have developed tools to easily browse the data to check its validity.
Missing transport visualization
We have developed a system that replays searches from our user query logs and identifies segments missing a train, bus or ferry route. This is a powerful tool for prioritizing expansion of our transport coverage.
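A rough sketch of the replay idea follows. It assumes, purely for illustration, that each replayed search yields a list of segments of the form (origin, destination, modes found), and that a segment counts as "missing" when none of the modes found is a train, bus or ferry. The real definition of a missing segment may differ, and the function and variable names are hypothetical.

```python
from collections import Counter

TRANSIT_MODES = {"train", "bus", "ferry"}


def missing_segments(replayed_results):
    """Rank segments with no train, bus or ferry option by how often they occur.

    replayed_results: iterable of searches, each a list of
    (origin, destination, modes) tuples, where modes is an iterable of strings.
    Returns a list of ((origin, destination), count), most frequent first.
    """
    counts = Counter()
    for segments in replayed_results:
        for origin, destination, modes in segments:
            if not TRANSIT_MODES & set(modes):
                counts[(origin, destination)] += 1
    return counts.most_common()


if __name__ == "__main__":
    # Placeholder place names; real input would come from the query logs.
    replayed = [
        [("A", "B", ["train"]), ("B", "C", ["car"])],
        [("B", "C", ["taxi"]), ("C", "D", ["bus", "car"])],
        [("D", "E", ["car"])],
    ]
    for segment, count in missing_segments(replayed):
        print(segment, count)
```

Ranking by frequency puts the gaps that affect the most user searches at the top of the list.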
Saved trip visualization
We use this tool to browse trips saved by our users, providing an overview of the types of multi-hop itineraries that are being created.
Internal debugging tools can make a significant difference to how the development team identifies and improves on the shortcomings of a complex technical system. They can also have a dramatic effect on management’s planning and prioritization by providing important insights into user behavior. At Rome2rio, we’re pretty happy with the set of tools we’ve developed so far.