In 1872, a man named George Smith made an astonishing discovery in the archives of the British Museum: a Mesopotamian clay fragment, predating the Hebrew Old Testament by several centuries, bearing a cuneiform account of the Great Flood.
Legend has it that Smith was so excited by his find that he took off all his clothes and ran screaming through the Museum’s hallowed halls.
As a rule, GovTech’s data scientists prefer to remain fully clothed.
But they do experience a similar sense of joy when new insights emerge from amid the chaos of big data, said Mr Liu Feng-Yuan, Director of GovTech’s Government Digital Services (GDS) Data Science Division.
Mr Liu was speaking at TEDxNUS, an independently organised TED event held at the National University of Singapore on 18 March 2017. Themed ‘Hidden City’, the event featured talks and performances on a range of “uniquely Singaporean” issues.
Data in the driver’s seat
“I’m very passionate about data-driven policy-making, as opposed to policy-driven data-making,” said Mr Liu.
The latter, he noted, occurs when people generate data for the purpose of backing up policies they have already made.
Accomplishing the former, said Mr Liu, requires a multidisciplinary effort. His team at GovTech works to bridge the gap between policy-makers and information technology departments, both of which often work in silos.
The challenge lies in figuring out how to get people from diverse backgrounds to talk to one another and work as a cohesive team.
“You need computer scientists who can write code and build apps,” he said.
“But you also need social scientists who ask the right public policy questions. You need designers and data visualisers to present information in compelling ways.”
The ability to visualise information in different ways came in handy in 2016, when GovTech’s data scientists worked with other agencies to solve the mystery of the Circle Line train breakdowns.
Train of thought
The intermittent breakdowns, which began in August 2016, occurred when trains lost signal communication with base stations, and thus came to a halt for safety reasons.
After a particularly disruptive peak-hour breakdown in November, Mr Liu was asked to have his team take a stab at figuring out what was happening.
Despite receiving this SOS call early on a Saturday morning, he was able to round up several willing volunteers to work on the problem.
(Editor: They would give the shirts off their backs to help other agencies solve a data mystery.)
The team experimented with various ways of visualising the data, which consisted of the location and time of each breakdown, the train involved, and the direction in which the train was moving. But nothing jumped out as to what the cause could be.
A turning point came, he said, when the team tried out a visualisation called a Marey chart, commonly used to analyse transportation systems.
The chart plots the timing of each train’s stops against distance, represented as stations; traces of trains moving through the system thus appear as diagonal lines.
The team created their own version of a Marey chart, plotting the time and location of each train fault. They noticed a pattern emerging: faults tended to occur in sequence, in the same direction, and at the speed of a moving train.
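The "speed of a moving train" observation can be illustrated with a small calculation. The sketch below is not the team's actual analysis: the fault records, the station indices and the speed range are all invented for illustration. It orders faults by time and computes the slope between consecutive faults (stations per minute), which is the slope a line would have on a Marey-style chart.

```python
from typing import NamedTuple


class Fault(NamedTuple):
    minute: float  # minutes after start of service (hypothetical unit)
    station: int   # position along the line as a station index


# Hypothetical fault records, invented for illustration only.
FAULTS = [
    Fault(10.0, 2),
    Fault(12.5, 3),
    Fault(15.0, 4),
    Fault(17.5, 5),
    Fault(60.0, 1),  # an isolated fault, unrelated to the run above
]


def implied_speeds(faults):
    """Return stations-per-minute slopes between time-ordered consecutive faults."""
    ordered = sorted(faults)  # sorts by minute, then station
    speeds = []
    for earlier, later in zip(ordered, ordered[1:]):
        elapsed = later.minute - earlier.minute
        if elapsed == 0:
            continue  # simultaneous faults have no defined slope
        speeds.append((later.station - earlier.station) / elapsed)
    return speeds


def train_like_pairs(faults, min_speed=0.2, max_speed=0.6):
    """Count consecutive pairs whose slope falls in an assumed train-speed range."""
    return sum(
        1 for speed in implied_speeds(faults)
        if min_speed <= abs(speed) <= max_speed
    )


if __name__ == "__main__":
    print(implied_speeds(FAULTS))
    print(train_like_pairs(FAULTS))
```

In this made-up data, the first four faults advance one station every 2.5 minutes, so they would line up as a single diagonal on the chart, while the last fault falls well off that line.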
This led them to the hunch that the culprit could be a train passing by on the opposite track, leaving broken-down trains in its wake. “We thought we were looking at the boat, but we were actually looking at the waves caused by the boat,” explained Mr Liu.
By looking at which trains were in operation when faults occurred, they identified a single suspect: Train PV46. The very next day, transport authorities ran tests on the Circle Line that confirmed the team’s hypothesis: once PV46 was put into operation, train faults repeatedly occurred.
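One simple way to picture this narrowing-down step is to count how often each train was in service when a fault occurred. The sketch below is an assumption about how such a check could look, not a description of the team's method; the service log is invented, and every train ID other than PV46 is hypothetical.

```python
from collections import Counter

# Hypothetical log: the set of trains in service at the moment of each fault.
# All IDs except PV46 are invented for illustration.
IN_SERVICE_AT_FAULT = [
    {"PV01", "PV12", "PV46"},
    {"PV12", "PV30", "PV46"},
    {"PV01", "PV30", "PV46"},
    {"PV07", "PV46"},
]


def rank_suspects(in_service_sets):
    """Rank trains by the share of faults during which they were in service."""
    counts = Counter()
    for trains in in_service_sets:
        counts.update(trains)
    total = len(in_service_sets)
    shares = [(train, count / total) for train, count in counts.items()]
    # Highest share first; ties broken alphabetically for a stable order.
    return sorted(shares, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    for train, share in rank_suspects(IN_SERVICE_AT_FAULT):
        print(f"{train}: in service for {share:.0%} of faults")
```

A train that was running during every fault is only a suspect, not a proven cause, which is why the confirming tests on the line mattered.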
The rogue train was pulled out of service, and the breakdowns that had plagued the system and affected thousands of commuters for months stopped completely.
“This is a really cool example of how data can solve real-world problems,” concluded Mr Liu.
Getting the right people together
Beyond the Circle Line, Mr Liu’s team has worked on other urban mobility projects, such as the Beeline app, which lets users book seats on direct bus routes and crowdsource new routes.
GovTech also runs a Technology Associate Programme (TAP), which trains recent graduates in technical know-how and professional skills. One of the data scientists involved in the Circle Line analysis was part of this programme, and was only a few months out of university, said Mr Liu.
In other words, the passion to take off your shirt and dive deep into the data may matter more than work experience.
“If you've got the right team of people, it doesn't matter whether you're very experienced or not,” he added.
“You can do great things with data — it’s about empowering different groups of people and allowing them to do their best.”