Which Washington-area bus system was the first to offer its bus position data in an open standard? Would you believe it’s Ride On?

In part 2, we talked about how there are many different APIs (application programming interfaces, the way one computer system, like an app, gets data from another, like bus positions from a transit agency). Because there are so many APIs, many apps don’t include all of the region’s bus systems that have real-time positions and predictions.

Prince George’s The Bus, Fairfax City CUE and the DC Circulator are available using NextBus, Inc.’s API, which is one of the most common because many agencies contract with NextBus, Inc. WMATA also contracts with NextBus, Inc. but doesn’t use its API; WMATA built its own. ART has a different one entirely.

Since NextBus is the most common, some residents asked Montgomery County officials why Ride On is not part of NextBus, too. One was Evan Glass, who tweeted last May:

Why is MoCo's Ride On bus system not accessible on the #NextBus app when all other jurisdiction are? #14bus cc @hansriemer (@EvanMGlass)

Note that Glass was talking about the “NextBus DC” app, the one that died this past December and, people discovered, actually wasn’t from the same company as the one that provides bus prediction services to many transit agencies.

Councilmember Hans Riemer passed the question on to Ride On officials. Carolyn Biggins replied:

Recently, our staff met with a representative of NextBus to discuss products and costs. Although NextBus has not yet given Montgomery County a firm price quote, they offered a ballpark figure of approximately $55,000 per year for operating costs. This would cover a barebones system which would only have their mobile and desktop web site along with a suite of management tools. There are also undetermined setup fees, probably starting around $15,000 but possibly much higher. …

At this point the inclusion of NextBus into the Ride On Real Time customer information product line is actively open for discussion. Feedback from our customers and industry critics point us in various directions and toward various apps; and, interestingly, NextBus is not at the top of our customer’s request list.

Besides our Eastbanc/Nerds Ride On Real Time App (available for iPhone and android) which, by the way, includes integrated real time data from Metrobus, Metrorail and several Northern Virginia jurisdictions, our customers have asked us to integrate into the “DC Metro Transit App” and “OneBusAway.”

We have been working with developers for DC Metro Transit App who recently responded to us with a very encouraging post about our open data: “This seems well thought out and documented. It is also nice that you can get the data in both JSON or XML [the 2 most popular formats for getting data from APIs] in a restful service [basically, a way of making APIs easier for the app developer to use]. I’ll give it a try in the app and let you know if I have any questions. You guys are ahead of the curve compared to other agencies.”

As you mentioned in your recent e-letter, Open Data and public/private initiatives, such as 3rd party app development, is the wave of the future — to “disrupt and create.” 3rd party app development not only unleashes the initiative of the private sector but also provides varied choices for our citizens: the delivery of information in many different formats to suit different consumers with varied needs and tastes.

In developing our Ride On Real Time system, Transit Services has taken this approach, both through internal product development but also by providing its data in as many different formats as possible while trying to maintain fiscal responsibility. We will continue to work with NextBus and other vendors to try and provide Montgomery County citizens the very best in transit information and customer service.

(Notes in brackets added.)

Biggins is right. The solution to the problem of Ride On not being part of many existing apps is not to work with any particular vendor, but to provide open data in more formats.

It’s particularly good to hear this from Ride On, because at first they got it wrong: they contracted with a software developer just to build a website where people could track buses, with no way for 3rd party app developers (in other words, people who aren’t the agency or one of its contractors) to access the data.

Following prodding from Kurt Raschke, us, and others, Ride On started offering an API, and even fairly quickly improved it based on feedback from Raschke and other developers.

Why doesn’t everyone just use GTFS?

In the area of transit schedules, one standard has largely emerged as the most common, and one all transit agencies ought to offer: the General Transit Feed Specification, or GTFS. GTFS is basically a set of big files that contain every single stop location and all of the schedules for the transit system. You can download it, write code to analyze it, and then do whatever you want.
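As a minimal sketch of what "write code to analyze it" can look like, the Python below reads a GTFS stops.txt file into a dictionary. The column names (stop_id, stop_name, stop_lat, stop_lon) are standard GTFS fields; the sample rows and the load_stops helper are made up for illustration and don't come from any real agency's feed.

```python
import csv
import io

# A tiny, invented excerpt of a GTFS stops.txt file. The column names are
# standard GTFS fields; the IDs, names, and coordinates are not real.
SAMPLE_STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
1001,Example Ave & 1st St,39.0000,-77.0500
1002,Example Ave & 2nd St,39.0010,-77.0510
"""


def load_stops(stops_file):
    """Read a GTFS stops.txt file object into a dict keyed by stop_id."""
    stops = {}
    for row in csv.DictReader(stops_file):
        stops[row["stop_id"]] = {
            "name": row["stop_name"],
            "lat": float(row["stop_lat"]),
            "lon": float(row["stop_lon"]),
        }
    return stops


# In practice you would open the stops.txt inside the downloaded GTFS zip;
# here the sample text stands in for that file.
stops = load_stops(io.StringIO(SAMPLE_STOPS_TXT))
print(stops["1001"]["name"])
```

The other files in a GTFS download (routes, trips, stop times) are comma-separated text in the same way, so the same approach extends to them.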

There’s an analogue of GTFS for real-time buses, called GTFS-realtime. However, real-time is not the same as schedules. With schedules, you can download the whole thing once and it basically won’t change except every few months. With real-time bus tracking, the positions change every minute.

GTFS-realtime lets you download the entire set of bus positions as they constantly change. It’s a huge amount of data. For some applications, like if you’re making a live map showing buses, that’s what you want. For the typical smartphone app, where you just want one bus position at a time, it’s too much. That much data would overtax the user’s data plan and burden the phone trying to deal with it all.

Other APIs, like the NextBus and WMATA APIs, work differently. With those, an app sends only the specific question it wants answered, like asking for the next arrivals at a particular bus stop.
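To make "one specific question" concrete, here is a rough sketch of how an app might build such a request. The endpoint URL and the parameter names (agency, stop, route) are hypothetical and chosen only for illustration; they are not the actual NextBus or WMATA API.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only. Real APIs each define
# their own URLs and parameter names.
BASE_URL = "https://api.example.org/predictions"


def predictions_url(agency, stop_id, route=None):
    """Build a URL asking for the next arrivals at a single stop."""
    params = {"agency": agency, "stop": stop_id}
    if route is not None:
        # Optionally narrow the question to one route at that stop.
        params["route"] = route
    return BASE_URL + "?" + urlencode(params)


print(predictions_url("example-agency", "1001", route="14"))
```

The response to a request like this is small, which is why this style suits a phone app that cares about one stop at a time.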

Twitter, as an analogy, has both types of APIs. For most uses, you use a more transactional API. You ask Twitter for a list of recent tweets matching a hashtag, or ask it to post a specific tweet. But Twitter also offers a “firehose” API where certain users, who have to be approved ahead of time, can get the entire stream of all tweets, everywhere.

We need GTFS-realtime AND a transactional API

Ultimately, for transit, there needs to be both. If you’re building a smartphone app, it’s too hard to get the firehose of all bus positions, and easier to ask one simple question. But if you’re designing a real-time screen, it’s a burden to ask for each possible bus and bus stop every minute; you’d rather just get all the data at once.

WMATA’s API also goes through another service, called Mashery, which limits how many of these API questions you can ask in a set period of time. The intent is to keep someone from overwhelming WMATA’s systems and crashing them. But when Eric Fidler was building the real-time screen demos, he found that just by asking for a few bus lines at nearby bus stops every minute, his system quickly hit the limit.

Plus, since one server was running many screens at once, the more screens, the quicker you hit the limit. We kept asking WMATA to increase the limit, and they did, but for many applications these limits will quickly become untenable.
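To see how quickly the requests add up, here is a back-of-the-envelope sketch. It assumes one request per stop per poll, and the numbers (20 screens, 5 stops each, a limit of 60 requests per minute) are invented for illustration; they are not WMATA's or Mashery's actual limits.

```python
def requests_per_minute(screens, stops_per_screen, polls_per_minute=1):
    """Transactional requests one server makes per minute,
    assuming one request per stop per poll."""
    return screens * stops_per_screen * polls_per_minute


def screens_supported(limit_per_minute, stops_per_screen, polls_per_minute=1):
    """How many screens fit under a given per-minute request limit."""
    per_screen = requests_per_minute(1, stops_per_screen, polls_per_minute)
    return limit_per_minute // per_screen


# Hypothetical numbers, not real limits.
print(requests_per_minute(screens=20, stops_per_screen=5))          # 100
print(screens_supported(limit_per_minute=60, stops_per_screen=5))   # 12
```

With a full feed of all positions, the same server would make one fetch per poll no matter how many screens it drives.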

Every transit agency ought to provide a GTFS-realtime feed for those who need it. Ride On offers one, and ART’s vendor, Connexionz, now does too, making 2 area agencies that do. Others should join Ride On and ART. Often it will be the agency’s API contractor that provides the feed; agencies that pay NextBus for bus tracking services should require NextBus to offer GTFS-realtime.

What’s the common transactional API?

At the same time, we need a transactional API, ideally a common one. If everyone used the same API, it would be really easy for app developers to support all of the region’s (or the nation’s or world’s) bus systems.

Unfortunately, there is no consensus here, unlike with GTFS. Most APIs are nonstandard ones an agency’s IT staff or its contractor devised. New York uses the European standard SIRI, but had to make some changes of its own, and few US agencies use that. NextBus’s is pretty widespread since that company serves a lot of agencies.

What to do? There are a few solutions.

First of all, everyone could get together and try to coalesce around an existing standard. It doesn’t really matter which. It doesn’t have to be the best one. Most standards are pretty imperfect; we type on QWERTY keyboards, a layout often criticized as inefficient, yet every effort to replace it has failed. There’s a strong lock-in, but to some extent, it doesn’t really matter; we manage to type fine.

We could use SIRI; Europe does. Or NextBus could make their API a standard. Google did this with GTFS. Google initially created GTFS, but then they stopped controlling it and let the community of developers and agencies take control. They changed the “G” to stand for “General” instead of “Google.” Many standards in computing started out as one company’s property before the company transferred them to a national or international committee to shepherd.

If NextBus wanted to do this, they would probably want to give it a different trademark, so an agency offering the API wouldn’t be saying they offer “NextBus” (we’ve had enough problems with NextBus trademark confusion already). And they would need to let other agencies and developers make changes, through some process, without the company having control.

Another approach would be to not worry about this at all. It’s not all that hard to write some code to interact with multiple APIs, as long as they have a few features that you need to make them interoperable, like common identifiers. In the next part, we’ll talk about this.
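As a rough sketch of what that glue code can look like, the Python below converts two differently shaped responses into one common record. Both response shapes, the field names, and the agency names are hypothetical; real APIs each have their own formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrival:
    """One common record that every adapter produces."""
    agency: str
    route: str
    stop_id: str
    minutes: int


def from_api_a(agency, payload):
    # Hypothetical shape: {"stop": "1001", "preds": [{"route": "14", "min": 5}]}
    return [
        Arrival(agency, p["route"], payload["stop"], int(p["min"]))
        for p in payload["preds"]
    ]


def from_api_b(agency, payload):
    # Hypothetical shape: [{"RouteID": "46", "StopID": "2002", "Seconds": 420}]
    return [
        Arrival(agency, p["RouteID"], p["StopID"], p["Seconds"] // 60)
        for p in payload
    ]


# Invented sample responses from two imaginary agencies.
response_a = {"stop": "1001", "preds": [{"route": "14", "min": 9}]}
response_b = [{"RouteID": "46", "StopID": "2002", "Seconds": 420}]

arrivals = sorted(
    from_api_a("agency-a", response_a) + from_api_b("agency-b", response_b),
    key=lambda a: a.minutes,
)
for arrival in arrivals:
    print(arrival.agency, arrival.route, arrival.minutes)
```

Once every agency's data is in the same shape, the rest of the app doesn't need to know which API it came from.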

Some other company or entity could also set up an intermediary computer system that takes in all of the data on one end and lets app developers connect to it on the other. It would have to get the “firehose” style data from the agencies, and could then offer 5, 10, or 50 different styles of APIs to developers.
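A minimal sketch of the core of such an intermediary is below: it takes in a full snapshot of predictions and answers small questions about one stop. Actual GTFS-realtime feeds are encoded with Protocol Buffers; to keep the example self-contained, the snapshot here is a plain list of dictionaries, and the field names and the ArrivalIndex class are assumptions made for illustration.

```python
from collections import defaultdict


class ArrivalIndex:
    """Holds the latest full snapshot and answers per-stop questions."""

    def __init__(self):
        self._by_stop = defaultdict(list)

    def ingest(self, snapshot):
        """Replace the index with a fresh full snapshot of predictions."""
        by_stop = defaultdict(list)
        for record in snapshot:
            by_stop[(record["agency"], record["stop_id"])].append(record)
        self._by_stop = by_stop

    def next_arrivals(self, agency, stop_id, limit=3):
        """Answer one transactional question: soonest arrivals at a stop."""
        records = self._by_stop.get((agency, stop_id), [])
        return sorted(records, key=lambda r: r["minutes"])[:limit]


# Invented snapshot standing in for decoded "firehose" data.
snapshot = [
    {"agency": "agency-a", "stop_id": "1001", "route": "14", "minutes": 12},
    {"agency": "agency-a", "stop_id": "1001", "route": "15", "minutes": 4},
    {"agency": "agency-b", "stop_id": "2002", "route": "46", "minutes": 7},
]

index = ArrivalIndex()
index.ingest(snapshot)
print(index.next_arrivals("agency-a", "1001"))
```

Each additional API style would be another thin layer on top of the same index, which is why the intermediary only needs the full feed from each agency.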

What has to happen for that to be possible? For one, someone has to maintain it and pay for the bandwidth. An organization like COG, or a partnership of the DC, Maryland, and Virginia state DOTs, could do it. Or, to go national, a group like APTA or a federal agency could provide it. Or, perhaps some private entity would find it worthwhile, though the amount of revenue they could make is probably limited.

But for that to happen, the agencies have to offer the “firehose” of GTFS-realtime. For that reason, while there isn’t consensus around all of the APIs, our region’s transit agencies can and should take one step now, to offer GTFS-realtime, as Ride On and ART now do.