Digital Frontiers, Breaking Barriers

Big Open Data & Bangladesh

In late January, news broke that big data sourced from fitness devices was unwittingly giving away the locations of secret US Army bases in places like Kandahar, Afghanistan. The data had been uploaded onto a heat map created by a fitness tracking company called Strava, presumably to show the global interconnectivity of active, health-conscious people. The map lit up overlapping running and cycling tracks all over the world--with most of the developed world covered by blazing hot trails that barely spared a spot.

In places like Afghanistan, most of the country remained black, with minuscule glimmers of light here and there. When zoomed in enough, the lights expanded to show running tracks. The military went berserk and Strava had to incorporate new privacy measures for its users to pander to military interests.

This entire blip was fascinating because it showed how useful big data is--even something as seemingly innocuous as running data can play an important role in transparency. And while it did not work out so well for the US military, if you zoom into Bangladesh on the Strava heat map, you find something positive--the penetration of phones with GPS and, potentially, location-collecting applications like Facebook.

Probably for the first time, we have a cohesive, publicly available map of how well connected Bangladesh is to the internet and smartphones. You see, Strava collects two types of data--the number of people running and the number of people cycling. In Bangladesh most of the data points showing up on the map are “the number of people cycling”--except these are not really people cycling. They are people riding another type of vehicle that has about the same speed range as a bicycle--our celebrated rickshaw. (Anyone who has worn a fitness tracker on a rickshaw knows this.) Strava overlooked this anomaly and accidentally gifted us one of the first public datasets on internet connectivity.

This is what we can find from the map: Bangladesh, while not blindingly lit up like the developed world, is not pitch-black like Afghanistan either. All major thoroughfares light up in bright yellow, even, surprisingly, in remote corners of the country like Hatiya and Shah Porir Dwip.

We had to wait to get all of this from a foreign athletics company, when our own telecom companies routinely collect this information but have never made it public for research purposes. Big data exists behind closed doors, but the idea that it could be analysed for public benefit by anyone, and not just by the company owning it, is a completely new concept that is yet to take hold in Bangladesh.

For example, from the Strava map you can also see the parts that are completely pitch-black, where smartphones are probably not used as much. These include the haor zones, the coastal Sundarbans and some parts of the Chittagong Hill Tracts. Seeing this, we can ask crucial questions like: why is coastal Hatiya so well connected to digital services when Charfassion, which is a few miles away, isn't?

In the last year or two the government itself has begun to grasp the concept of open data. We now have an open data portal with a motley collection, a Sustainable Development Goals Tracker, and many government websites that are well updated with documents. These laudable efforts definitely show a commitment to transparency, but they are not big data. Here's a good way to identify what is not big data: if it can be loaded into Microsoft Excel without the program dying in protest, it is not big data.

The road accident data on the government's open data portal, for example, is simply an amalgamation of the totals for the last few years. This file could have been useful if the government had published records of each individual incident. Each record should have included the approximate latitude and longitude of the accident location (which a police officer can easily obtain should they wish to, except they don't). This would then have helped local administrations plan the roads better--where to put a speed breaker, where to block a turn, and so on.
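To make the point concrete, here is a minimal sketch of what incident-level records would allow. It assumes a hypothetical record with a date, a latitude and a longitude (the `Incident` class, its field names and the `hotspot_cells` helper are illustrative inventions, not the portal's format), and simply counts incidents per grid cell to surface the spots where accidents cluster.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    """One hypothetical accident record; the field names are assumptions."""
    date: str         # ISO date string, e.g. "2017-03-14"
    latitude: float   # approximate, in decimal degrees
    longitude: float  # approximate, in decimal degrees


def hotspot_cells(incidents, cell_deg=0.01, top=3):
    """Count incidents per grid cell and return the busiest cells.

    Each cell is cell_deg degrees on a side (0.01 degrees of latitude is
    roughly 1.1 km). A cell is identified by integer (row, column) indices,
    and the result is a list of ((row, column), count) pairs.
    """
    counts = Counter(
        (int(i.latitude // cell_deg), int(i.longitude // cell_deg))
        for i in incidents
    )
    return counts.most_common(top)
```

A yearly total can always be recomputed from records like these, but the reverse is not true: once the locations are summed away, no one can tell which turn or stretch of road needs attention.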

In an atmosphere where obtaining data is an arduous process to begin with, think tanks have been doing this thankless job. This, however, is not enough--the main concept of open data is that it fosters democratic use of data, under the assumption that it will be used for the good of the public. In data deserts like our country, everyone, including the corporate sector, must really step up and release raw data.

 

The writer is a Reporter at the Star Weekend.

