UW researchers estimate poverty and wealth from cell phone metadata

The northern and western provinces are divided into cells (the smallest administrative unit of the country), and each cell is shaded according to the average predicted wealth of all mobile subscribers in that cell. The southern province is overlaid with a Voronoi diagram that uses geographic identifiers in the call data to divide the region into several hundred thousand small partitions, each of which may be as small as a household or a microvillage. The darker the area, the greater the wealth. Photo: Joshua Blumenstock

In developing or war-ravaged countries where government censuses are few and far between, gathering data for public services or policymaking can be difficult, dangerous or near-impossible. Big data is, after all, mainly a First World opportunity.

But cell towers are easier to install than telephone land lines, even in such challenged areas, and mobile or cellular phones are widely used among the poor and wealthy alike.

Now, researchers with the University of Washington Information School and Computer Science and Engineering Department have devised a way to estimate the distribution of wealth and poverty in an area by studying metadata from calls and texts made on cell phones. Such metadata contains information about the time, location and nature of the “mobile phone events” but not their content. Their paper was published Nov. 27 in the journal Science.

“Quantitative, rigorous measurements are key to making important decisions about social welfare allocation and the distribution of humanitarian aid,” said lead author Joshua Blumenstock, assistant professor in the UW Information School, who is also an adjunct professor in computer science and engineering. “But in a lot of developing countries high-quality data doesn’t exist.

“What we show in this paper, and I think fairly clearly, is that phone data can be used to estimate wealth and poverty.”

The research was performed in Rwanda, an East African nation of about 11 million people. There in 2009, while still working on his dissertation, Blumenstock oversaw students from the Kigali Institute of Science and Technology as they conducted telephone interviews with 1,000 mobile phone owners chosen at random.

The questions were designed to learn where those individuals fell on the socioeconomic ladder and what the “signature” of wealth is in the metadata — that is, what cell phone habits are particular to those who are relatively wealthy.

“For those thousand people, we know roughly whether they’re rich or poor. That’s the ground truth that anchors the data to reality,” Blumenstock said.

The researchers then linked that information to metadata about mobile phone use provided by a Rwandan telephone company to determine the hallmarks of socioeconomic status in the data.

Simple patterns emerged — for instance that wealthier people tended to make more calls than poorer people. But that’s just one of thousands of bits of information that aid this process.

Other hints of wealth or poverty in metadata are:

The way people pre-pay for phone time; those buying $10 worth of time tend to be wealthier than those buying 50 cents of time.

The daily rhythm of calls made — those phoning during daytime business hours are systematically different from those who make irregular calls, perhaps because they are more likely to be “white-collar” workers.

The degree to which a person is more likely to make than receive phone calls. Since in Rwanda the caller pays for the call, poorer people tend to receive more calls than they make. This also reflects a phenomenon called “flashing,” where a poorer person calls a wealthier friend and quickly hangs up, thus sending the signal that they should call back.
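
To make the three hints above concrete, here is a minimal sketch of how such features might be computed per subscriber. The record layout, the subscriber IDs, the sample values and the 9-to-5 definition of business hours are all assumptions made for illustration; the article does not describe the actual data format or the features the researchers used.

```python
from collections import defaultdict

# Hypothetical call records: (caller_id, receiver_id, hour_of_day).
calls = [
    ("a", "b", 10),
    ("a", "c", 14),
    ("b", "a", 22),
    ("c", "a", 9),
    ("a", "b", 23),
]
# Hypothetical prepaid top-ups: (subscriber_id, amount_in_dollars).
topups = [("a", 10.0), ("a", 5.0), ("b", 0.5), ("c", 0.5), ("c", 1.5)]

BUSINESS_HOURS = range(9, 17)  # assumed definition: 09:00 through 16:59


def extract_features(calls, topups):
    """Build one small feature dictionary per subscriber."""
    made = defaultdict(int)
    received = defaultdict(int)
    daytime = defaultdict(int)
    for caller, receiver, hour in calls:
        made[caller] += 1
        received[receiver] += 1
        if hour in BUSINESS_HOURS:
            daytime[caller] += 1

    amounts_by_sub = defaultdict(list)
    for sub, amount in topups:
        amounts_by_sub[sub].append(amount)

    features = {}
    for sub in set(made) | set(received) | set(amounts_by_sub):
        total = made[sub] + received[sub]
        amounts = amounts_by_sub[sub]
        features[sub] = {
            # Share of all calls that the subscriber placed (caller pays).
            "outgoing_share": made[sub] / total if total else 0.0,
            # Share of placed calls that fall within business hours.
            "daytime_share": daytime[sub] / made[sub] if made[sub] else 0.0,
            # Average size of a prepaid purchase.
            "mean_topup": sum(amounts) / len(amounts) if amounts else 0.0,
        }
    return features


features = extract_features(calls, topups)
```

In this toy data, subscriber "a" places most of the calls and buys larger top-ups, while "b" mostly receives calls and buys 50 cents at a time.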

“In practice it’s not simple,” Blumenstock said. “We use supervised machine learning algorithms to sort through thousands of patterns to figure out what is most correlated with wealth and poverty. But once we know which mobile phone patterns are indicative of wealth, we can extrapolate to the country’s one and a half million cell phone users. We just see for each person thereafter what pattern they follow — the wealthy pattern or the poor pattern.”
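
The general idea of learning from the surveyed group and then extrapolating to everyone else can be sketched with a deliberately simple stand-in. The example below uses a nearest-centroid classifier, which is not the method described in the paper; the two features, the labels and all values are hypothetical.

```python
# Hypothetical surveyed subscribers: (feature vector, survey label).
# Features are (outgoing_share, mean_topup_dollars), chosen for illustration.
surveyed = [
    ((0.70, 8.0), "wealthy"),
    ((0.65, 10.0), "wealthy"),
    ((0.30, 0.5), "poor"),
    ((0.35, 1.0), "poor"),
]


def fit_centroids(labeled):
    """Average the feature vectors that share a survey label."""
    sums, counts = {}, {}
    for vector, label in labeled:
        acc = sums.setdefault(label, [0.0] * len(vector))
        for i, value in enumerate(vector):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(total / counts[label] for total in acc)
        for label, acc in sums.items()
    }


def predict(centroids, vector):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vector, centroid))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


centroids = fit_centroids(surveyed)

# Subscribers who were never surveyed (hypothetical values).
unsurveyed = {"u1": (0.68, 7.0), "u2": (0.25, 0.75)}
predictions = {sub: predict(centroids, vec) for sub, vec in unsurveyed.items()}
```

Because the features here are not rescaled, the top-up amount dominates the distance; a fuller version would standardize each feature first and would more likely predict a continuous wealth score than a two-way label.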

Blumenstock’s UW co-author, Gabriel Cadamuro, a graduate student in computer science and engineering, said the team tried not to bring expectations of which aspects of the metadata might be found useful for predicting wealth.

“Using the appropriate machine learning technique enabled us to determine which of these values were the most useful,” Cadamuro said, “and we noticed that in doing it this way that we picked up a lot we would have missed had we tried to go purely via our intuition.”
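
One simple way to let the data, rather than intuition, decide which values matter is to rank candidate features by how strongly each correlates with surveyed wealth. The sketch below does this with the Pearson correlation coefficient; it is an illustrative stand-in rather than the technique the team used, and the feature names and numbers are invented.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Hypothetical wealth index for five surveyed subscribers.
wealth_index = [1.0, 2.0, 3.0, 4.0, 5.0]

# Hypothetical candidate features for the same five subscribers.
candidate_features = {
    "calls_per_day": [2.0, 3.0, 5.0, 6.0, 9.0],
    "incoming_share": [0.8, 0.7, 0.5, 0.4, 0.2],
    "phone_id_last_digit": [3.0, 7.0, 5.0, 7.0, 3.0],
}

correlations = {
    name: pearson(values, wealth_index)
    for name, values in candidate_features.items()
}

# Strongest relationship first, regardless of sign.
ranked = sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)
```

Ranking by absolute value keeps negatively related features, such as a high share of incoming calls, alongside positively related ones, while a feature with no relationship to wealth falls to the bottom.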

That information is then overlaid onto area maps to create a visual representation of the geographic distribution of wealth, from the district level to that of households or microvillages.
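
The mapping step amounts to grouping individual predictions by geographic unit and averaging them. A minimal sketch, assuming each subscriber has already been assigned to a named cell and given a predicted wealth score between 0 and 1 (both hypothetical here):

```python
from collections import defaultdict

# Hypothetical rows: (cell_name, predicted_wealth_score).
predicted = [
    ("cell_A", 0.9),
    ("cell_A", 0.7),
    ("cell_B", 0.2),
    ("cell_B", 0.4),
    ("cell_B", 0.3),
    ("cell_C", 0.5),
]


def mean_wealth_by_cell(rows):
    """Average the predicted wealth of all subscribers in each cell."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cell, score in rows:
        totals[cell] += score
        counts[cell] += 1
    return {cell: totals[cell] / counts[cell] for cell in totals}


cell_means = mean_wealth_by_cell(predicted)

# Cells ordered from wealthiest to poorest, i.e. darkest shading first.
darkest_first = sorted(cell_means, key=cell_means.get, reverse=True)
```

The same grouping works at any resolution: replacing the cell name with a district name or a finer partition identifier changes only the key.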

Blumenstock emphasized that the research is conducted in a way that respects ethical standards and the privacy of the callers, as well as the competitive interests of the phone company providing the data.

Not all governments are able to conduct population censuses and household surveys, and some go decades between them. In Rwanda, household surveys occur every three to five years. Blumenstock said that, when checked against the government’s 2010 survey, the 2009 mobile phone metadata proved more effective at indicating wealth and poverty than the previous Rwandan government survey, conducted in 2007.

Blumenstock chosen for Gates Foundation’s “Grand Challenges Explorations”
The researcher has received a $100,000 grant for his research, “Billions of Transactions, Thousands of Photos: Combining Mobile Network Operator Data with Crowd-Sourced Photographs to Measure the Availability and Use of Digital Financial Services.” The work will take place in Ghana.

Blumenstock and colleagues suggest that governments might use this sort of survey process, which costs about $10,000, rather than spend millions on a formal countrywide census.

“We are saying, if you have nothing else and can’t survey the outer regions of the country, this creates an option to spend $10,000 and get interim estimates of what things look like, and to construct a higher-resolution estimate of the geographic distribution of wealth,” he said.

This early work is mostly “proof of concept” at this stage, Blumenstock said, but the researchers can envision many practical uses to come.

Cadamuro said, “We are hopeful that this broad approach to detecting signals means that the methodology would work even on different call networks from different countries.”

“What else could you measure that would be useful?” Blumenstock asked. “You could imagine using data from Twitter, Internet use, satellite and weather stations — all this data — to measure population vulnerability, or to make better policy,” he said.

“Maybe you could even detect with phone data whether people have been skipping meals — it doesn’t seem to me that far-fetched.”

The other co-author is Robert On, a graduate student at the University of California, Berkeley.

The research was funded by the NSF; the Institute for Money, Technology, and Financial Inclusion; and the Gates Foundation.

###

For more information, contact Blumenstock at 206-685-8746 or joshblum@uw.edu.

NSF grant #1025103
Institute for Money, Technology, and Financial Inclusion grant #2010-2366
Gates Foundation grant #OPP1106936
