Tweet Analysis Helps Track the Spread of a Disease and Pinpoint Nerve Centres

February 19, 2013

Analysing the tweets that users post online allows researchers to predict which areas to avoid during an epidemic.


It’s a fact: Twitter is increasingly being used in medical contexts, despite growing concerns over its reliability. Until now, keyword analysis has made it possible to determine how fast certain epidemics, such as flu, were spreading. Now, however, geolocation data also enables the nerve centres of such epidemics to be pinpointed, which may help people avoid falling ill. Researchers at the University of Rochester in New York State decided to look at tweets posted by users who were ill, with the aim of developing an algorithm capable of mapping, in space and time, the places where illnesses are most virulent. The app they have developed, called GermTracker, compiled over the course of one month a list of the places in the New York City metropolitan area where people suffering from flu had been.

Easy-to-use app

The application ranks people according to their state of health using colour-coding from green to red, drawing on information contained in their tweets. Because the app uses geo-tagged tweets, the GPS data provides a means of placing these people on a map which anyone can consult. Scientists say the tool empowers health-focused New Yorkers, who can now locate flu-infested hot spots in real time on the map. “A New Yorker might check it before entering a subway station – then opt to take another route home,” explains Adam Sadilek, one of the researchers. GermTracker could also be used by government or local authorities, in parallel with other methods, to try to understand the rapid spread of an illness such as flu. The algorithm would be most useful if it could be deployed on a massive scale, but its ability to distinguish tweets indicating that people are actually ill from those that do not still needs refinement.
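
To make the colour-coding idea concrete, here is a minimal Python sketch. It is not GermTracker's code: the `GeoTweet` record, the `p_sick` score and both function names are assumptions made for illustration. It simply turns an estimated probability of illness into a colour between green and red and pairs it with the tweet's coordinates, ready to be drawn on a map.

```python
from dataclasses import dataclass


@dataclass
class GeoTweet:
    """Hypothetical record for one geo-tagged tweet."""
    lat: float
    lon: float
    p_sick: float  # assumed score in [0, 1]: 0 = healthy, 1 = ill


def health_colour(p_sick: float) -> str:
    """Map a score in [0, 1] to a hex colour running from green to red."""
    p = min(max(p_sick, 0.0), 1.0)  # clamp out-of-range scores
    red = round(255 * p)
    green = round(255 * (1 - p))
    return f"#{red:02x}{green:02x}00"


def to_markers(tweets):
    """Build one map marker (position plus colour) per tweet."""
    return [
        {"lat": t.lat, "lon": t.lon, "colour": health_colour(t.p_sick)}
        for t in tweets
    ]
```

A mapping library could then plot each marker; which one the researchers used is not stated in the article.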

A new tool under constant improvement

One of the team describes the process as being like teaching a new language to an infant. The algorithm has to learn to distinguish a tweet which says, for example, “I’m sick, I stayed in bed the whole day” from one that says “I’m sick of driving in all this traffic.” The algorithm is, however, being constantly updated: each time someone clicks on one of the coloured dots representing a tweeter, it accesses the tweet and assesses whether it corresponds to a ‘sick’ tweeter or not. Other areas of collaboration could emerge from the use of GermTracker. The original paper* states that “while it concentrates on ‘traditional’ infectious diseases such as flu, similar techniques can be applied to study mental health disorders, such as depression, that have strong contagion patterns as well.”
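
The difficulty of telling “I’m sick” from “I’m sick of” can be illustrated with a toy text classifier. The sketch below is an assumption-laden illustration, not the researchers' method: it swaps in a very small naive Bayes model over single words and word pairs, and the class name, the four training tweets and the `update` feedback step are all invented for the example.

```python
import math
import re
from collections import Counter


def features(text):
    """Lower-cased words plus adjacent word pairs, so 'sick of' is visible."""
    words = re.findall(r"[a-z']+", text.lower().replace("’", "'"))
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


class SickTweetClassifier:
    """Toy naive Bayes classifier with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.counts = {True: Counter(), False: Counter()}
        self.docs = {True: 0, False: 0}

    def update(self, text, is_sick):
        """Learn from one labelled tweet; could also absorb later feedback."""
        self.counts[is_sick].update(features(text))
        self.docs[is_sick] += 1

    def is_sick(self, text):
        vocab = len(set(self.counts[True]) | set(self.counts[False]))
        scores = {}
        for label in (True, False):
            total = sum(self.counts[label].values())
            # Smoothed log prior, then smoothed log likelihood per feature.
            score = math.log((self.docs[label] + 1) / (sum(self.docs.values()) + 2))
            for f in features(text):
                score += math.log((self.counts[label][f] + 1) / (total + vocab + 1))
            scores[label] = score
        return scores[True] > scores[False]


clf = SickTweetClassifier()
# Invented training examples; the first and third echo the article's two tweets.
clf.update("I'm sick, I stayed in bed the whole day", True)
clf.update("home with the flu and a fever", True)
clf.update("I'm sick of driving in all this traffic", False)
clf.update("sick of this weather", False)

print(clf.is_sick("sick in bed with a fever"))
print(clf.is_sick("so sick of traffic"))
```

With so few examples the model is fragile. Each further call to `update` with a newly labelled tweet shifts the word counts, which is one simple way a classifier of this kind could keep improving over time.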

*Modeling Spread of Disease from Social Interactions, authored by Adam Sadilek, Henry Kautz and Vincent Silenzio; University of Rochester.

© L’Atelier BNP Paribas