Data Science making the decisive difference
Data science can be awesome — that much we can all agree on.
But sometimes, the insights we glean from big data never make it beyond a cool visualisation or an interesting blog post.
GovTech data scientist Dr Daniel Lim thinks we can do better. Speaking at the IE Alumni Weekend Singapore on 1 April 2017, Dr Lim outlined the Singapore government’s efforts to use data science to make a real-world impact on public policy and operations.
Organised by Spain-headquartered IE University, the alumni event was themed 4th Industrial Revolution: Creative Age, and featured talks and panel discussions on how big data and artificial intelligence would transform society.
Not just computer science
“We’re trying to enable evidence-based decision making in government,” said Dr Lim, describing GovTech’s work.
This could involve a variety of scenarios, he added.
If a government agency seeks insights into a policy question, data scientists can present their findings in a slide deck, for example.
A slide deck, however, would not help address an operations-related problem; in this case, data scientists might build dashboards or decision-making tools that can be used on a daily basis.
Finally, if the issue concerns digital services for citizens, data-driven app design and customisation of the service to the target audience become important.
All this requires a multidisciplinary team, said Dr Lim.
“Data science is not just about computer science. You need people who can understand the business use case, articulate problems to stakeholders, and sell the significance of your insights to everyone.”
“We need to deliver end-to-end solutioning, going all the way from problem definition to downstream implementation.”
Keeping an ear on the ground
One area that GovTech’s data scientists have been exploring is the use of machine learning to identify key topics in large corpuses of textual data, said Dr Lim.
Analysing more than 50,000 tributes sent in by members of the public after the death of Singapore’s founding Prime Minister Mr Lee Kuan Yew, for example, revealed distinct topic categories.
Tributes describing personal experiences of interacting with Mr Lee, for instance, were distinct from those that described people’s emotions and feelings after finding out about his death.
“This is an interesting use case that would make for a cool blog post, but if that is all we did then it is not good enough because there is no immediate impact on government,” said Dr Lim.
The GovTech team therefore collaborated with the Housing and Development Board (HDB) to apply the same methodology used on the Lee Kuan Yew tributes to one hundred thousand emails sent to HDB by the public.
The analysis revealed a prominent topic, key collection: people were writing in to HDB asking to change the pre-assigned time they were given to pick up their keys.
HDB has since acted on the insights from the analysis and created an online portal for key collection, where the public can choose a time that suits their schedule.
“The power of this analysis is that it allows you to quantify the problem, which helps convince management that it is something it needs to pay attention to.”
The data scientists then packaged their code into a front-end user interface, so that HDB staff could run the analysis on their own.
“If we want to democratise data science and push it to the retail level, we need to create front-end data science products so that staff who don’t know how to code are also able to use them in their work,” said Dr Lim.
“Today, such analysis can be done by an HDB communications or policy officer.”
“Emails are the largest single untapped source of information about what the public cares about. We receive millions of them, but we don’t look at them in a systematic way,” he added.
“But putting these data science tools into the hands of every officer enables us to increase our ability to understand what citizens care about.”
A healthy approach to try and try again
GovTech’s data scientists have also collaborated with SingHealth to apply machine learning to healthcare, using it to analyse electronic medical records and identify patients at high risk of hospital readmission.
Their algorithm, which was published in an academic journal, was about 80 percent accurate, said Dr Lim.
“But if all we did was create algorithms and publish academic papers, there is zero impact on the ground,” he said. “We need to think about how to apply algorithms to change how hospitals do things at the operational level.”
Data science can help predict outcomes, but it does not tell you how to solve the problem at hand, he said.
“Healthcare givers still need to know the most effective way of addressing the problem. Should doctors change the way they treat high-risk patients? Should hospitals assign social workers to such patients? Would it help to tell patients their risk scores, in the hope that this would inspire them to lead healthier lifestyles?”
In an effort to find out, the researchers are planning a randomised controlled trial to test out various possible interventions.
“Data analytics only carries you so far,” said Dr Lim.
“The idea is to try various ways of operationalising the algorithm. If a method works well, we roll it out; if it doesn’t, we fail fast and try something else.”