How do you run 3,200 train services every weekday and have them arrive on time at each of their 46,000 daily timetabled stops? And when trains do run late, how do you give hundreds of thousands of commuters information useful enough to work out how to get home as quickly and safely as possible?

Sydney Trains, Australia’s largest public transport system, has to get more than 1 million passenger journeys completed on time every weekday. That’s 1,000+ people per eight-carriage train, travelling on 1,548km of mainline track through more than 300 stations, 3,962 signals and over 968 bridges.

Even if we pride ourselves on our punctuality, there are those times when we just don’t get to where we want to be on time. We’re late. Maybe a work meeting ran too long, the checkout queue was horrendous, the children woke up late and didn’t want breakfast, or we just had too many places to be and not enough time to get there. And sometimes, we would have made it – if only that train, bus or ferry had turned up when it was supposed to.

For Sydney Trains, getting the timetable right, and having trains arrive on time, is more than just a nice-to-have. It’s essential. For every 1% improvement in timetabling efficiency, the organisation estimates it saves commuters $5 million in lost minutes spent waiting on a platform or arriving late at their destination.
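To put that estimate in perspective, here is a minimal Python sketch of the arithmetic. The $5 million per 1% figure comes from the estimate above; everything else is an assumption made for illustration. In particular, the function name `estimated_commuter_savings` is hypothetical, and the sketch assumes savings scale linearly with the size of the improvement, which the estimate does not claim. The period the figure covers is not stated, so the sketch leaves it unspecified too.

```python
# Figure quoted in the article: about $5 million saved per 1% efficiency gain.
SAVINGS_PER_PERCENT = 5_000_000


def estimated_commuter_savings(improvement_percent: float) -> float:
    """Estimate commuter savings in dollars for a given efficiency gain.

    Assumption (not from the article): savings scale linearly with the
    percentage improvement. Treat the result as a rough illustration only.
    """
    if improvement_percent < 0:
        raise ValueError("improvement_percent must be zero or positive")
    return improvement_percent * SAVINGS_PER_PERCENT


if __name__ == "__main__":
    # Hypothetical improvement levels, chosen only to show the scaling.
    for pct in (0.5, 1.0, 2.0):
        savings = estimated_commuter_savings(pct)
        print(f"{pct:.1f}% improvement -> ${savings:,.0f}")
```

Under the linear assumption, even a half-percent gain in timetabling efficiency would be worth millions of dollars to commuters, which is why small improvements matter at this scale.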

The Problem

The problem for Sydney Trains is that timetabling public transport is complex. You have to take into account the hundreds of thousands of commuters travelling each day, their interconnected journeys and how they get to the platform. Then there are the mechanics: the trains themselves, the track, the signals, level crossings, track mounts, circuits and more. There are workforce considerations, and unexpected factors like breakdowns. Millions of data points each day influence the likelihood of Sydney Trains services arriving and departing as scheduled.

The Opportunity

The UTS Data Science Institute and Sydney Trains saw the opportunity to use advanced machine learning techniques to pull hundreds of millions of data points together and learn from them. The data scientists built models that integrate track, system and rail network data to identify patterns and predict the impact of delays across the network. For the first time, the data inputs included footage from the 12,000+ CCTV cameras, allowing the inclusion of traveller behaviour.

The Impact

The robust timetabling evaluation model uses machine learning to predict the effect of delays in real time. Applying this intelligent timetable evaluation technology significantly reduces delay-caused losses and increases operational efficiency, helping the train operating system meet its performance metrics and recover from incidents. The model has also provided interesting insights into passenger behaviour and the impact of different delays. One example is a new understanding that platforms are likely to be much more crowded on hot days, particularly when the temperature is over 36 degrees. Another is that if trains are delayed by 30 seconds or more at Town Hall Station, there will be unrecoverable delays across the whole network.

 

Interested in learning more about how this problem was solved?

We've created a “deep dive”, self-paced, free online course that showcases how data was used to solve this problem.

Could your business be leveraging data & AI tools and techniques?

Work with us so we can understand your organisational needs and tailor the learning experience to support your business.

Looking to upskill or reskill?

Enrol in a UTS microcredential or short course.