Second Time in 24h whole Commuter Rail Shuts Down


Jamie2k9
04-11-2014, 19:04
Actually unbelievable that, in the space of 24h, the whole commuter network bar the Maynooth line was closed due to a signal fault. Only in Ireland would something like this happen. Once is acceptable, but more than once isn't.

Clearly some update was carried out without testing.

Kilocharlie
04-11-2014, 19:16
bar Heuston!

Mark Gleeson
05-11-2014, 11:30
Irish Rail, as always, is not providing any explanation. Based on what happened, this was a centralised problem. Possible causes:


Accidentally pressing the all stop plunger
Power failure at CTC building
Failure of the auto routing software. This relies on the train schedule and the train describer to automatically set the points and trigger level crossings based on the train type.
Train describer failure. This would result in a loss of knowledge of which train is which; in turn, the auto routing software would fail
Bad train schedule data. Two trains with the same ID in different places would cause issues
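
To illustrate that last point, here is a minimal sketch in Python of a check for the same train ID turning up in two places. The record layout (pairs of train ID and location), the sample IDs, and the function name `find_duplicate_ids` are all made up for illustration; nothing here is taken from the real CTC software.

```python
from collections import defaultdict

def find_duplicate_ids(records):
    """Return {train_id: sorted locations} for IDs seen at more than one location.

    `records` is an iterable of (train_id, location) pairs. This layout is
    hypothetical and not taken from any real train describer.
    """
    seen = defaultdict(set)
    for train_id, location in records:
        seen[train_id].add(location)
    # Keep only IDs that were reported from two or more distinct places.
    return {tid: sorted(locs) for tid, locs in seen.items() if len(locs) > 1}

# Hypothetical sample data: "A123" appears in two different places.
records = [
    ("A123", "Connolly"),
    ("P604", "Drogheda"),
    ("A123", "Bray"),
]
print(find_duplicate_ids(records))  # {'A123': ['Bray', 'Connolly']}
```

A check like this only flags the conflict; what the routing software should do with a conflicting ID is a separate question.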



In all the above cases, there are two backups:

Take local control of the emergency control rooms (ECP) and place them into automatic mode, and staff the backup panels in Connolly and Drogheda
Take manual control at the ECPs by turning the control switch. This requires a member of staff on the ground, operating it as a traditional signal cabin

Mark Gleeson
05-11-2014, 15:40
Word is it's a software bug; apparently it relates to a specific location.

Big question: is this something high level like the auto routing or train describer, or something murky in the SSI/CBI data?

Mark Gleeson
05-11-2014, 17:50
Still moving right now. We are at the point where it collapsed yesterday; so far so good.

Jamie2k9
20-11-2014, 18:48
They posted on Twitter:

We apologise for the recent recurring Dart/Commuter signal fault. Computer system issue, fault now identified, fix to be implemented tonight

James Howard
20-11-2014, 19:21
Well it had another little lie down this evening but I was lucky enough to get away early on a 29K headed for Sligo.

Hopefully that will be the end of this bout of madness.

Mark Gleeson
21-11-2014, 07:13
Since they finally found the bug...

The CTC system has two mainframe computers, active and standby. If the active fails, the standby takes over with no noticeable impact.

In the case of the 4 recent failures, both mainframes crashed at exactly the same time. Since both run the same software at the same time, this points to a software bug.
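
A rough sketch in Python of why an active/standby pair covers a hardware fault but not a software bug. Everything here is a toy model under stated assumptions: the `Machine` class, the `process` function and the "bad" input that triggers the crash are invented for illustration and say nothing about how the real CTC mainframes work.

```python
def process(event):
    """Stand-in for the control software; crashing on 'bad' is a made-up bug."""
    if event == "bad":
        raise RuntimeError("software bug triggered")
    return f"handled {event}"

class Machine:
    """Toy model of one mainframe (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def feed(self, event):
        # A crashed machine processes nothing further.
        if not self.alive:
            return None
        try:
            return process(event)
        except RuntimeError:
            self.alive = False
            return None

def run_pair(events, hardware_fault_at=None):
    """Feed every event to both machines; return (active_alive, standby_alive)."""
    active, standby = Machine("active"), Machine("standby")
    for i, event in enumerate(events):
        if i == hardware_fault_at:
            active.alive = False  # a hardware fault hits one machine only
        # Both machines run the same software on the same input.
        active.feed(event)
        standby.feed(event)
    return active.alive, standby.alive

print(run_pair(["a", "b", "c"], hardware_fault_at=1))  # (False, True)
print(run_pair(["a", "bad", "c"]))                      # (False, False)
```

In the first run the standby survives and could take over. In the second, the same input reaches the same code on both machines, so both go down together.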

As far as we know, all the crashes occurred around 17:40-17:50, which points to bad timetable data being loaded in.

Jamie2k9
21-11-2014, 19:12
So they have been trying to prevent this fault between 17:40 and 17:50 every single day up to last night?

Mark Gleeson
21-11-2014, 20:28
All we know is at a certain time between 17:40 and 17:50 a series of events occurs which if they occur in a certain sequence may cause a systems failure.

There are two machines so failure of hardware etc is covered.

It's a fairly complex IT system; it's a big mainframe. This is old-school computing, so not trivial to debug.