Imagine a presentation that shows how millions of dollars of World Bank money are operating in one area of one Indian city alone. At the Open Data Camp in Bangalore, held over the weekend of 2-3 March, this visualisation of data from the World Bank project in Bommanahalli, Bangalore got everyone’s attention, in part because of the sheer monitoring opportunity the data afforded citizens.
Getting data, and getting it clean, was the theme of this two-day camp, which afforded a fascinating glimpse of the sweeping range of possibilities of data. Data is emerging as a whole new language and the World Wide Web makes it fabulously dynamic.
Telling stories with data
“A single set of data can tell so many stories and it’s all in the way you represent it. There is no dearth of tools and story-telling mechanisms to make facts and data visually powerful,” says S. Anand, Chief Data Scientist at Gramener, a data visualisation company that processes terabyte-scale data.
The opening talk, “Visualising Politics”, was given by Anand. Scraping data from CCEA (Cabinet Committee on Economic Affairs) decisions, CCI (Cabinet Committee on Infrastructure) decisions and Cabinet decisions put online by the Government of India (GoI), he showed how such data, when presented suitably, could provide a clear glimpse into many facets of political development and discourse in the country. The transformation of data from what is available on the website of the Press Information Bureau into what Anand projected in one of his slides is brilliant and signals only the starting point of a data revolution.
Anand’s visual depiction of government data could easily engage one for an hour or more: it packed in a great deal of information and yet was easy on the memory (for those who rely on visual memory). One slide captured how often certain “keywords” were used in Cabinet Committee meetings and decisions; used carefully, such counts could make it possible to assess the government’s priorities. For example, one sees a distinctly greater emphasis on decisions related to ‘states’ after 2009, as opposed to the focus on ‘central’ before 2009. Similarly, the dwindling incidence of ‘agreements’ in meetings and decisions suggests that the number of international agreements has declined dramatically since 2009.
Anand talks about Parliament decisions and ways to interpret them using data visualisation. Source: http://www.slideshare.net/gramener/visualising-politics
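The keyword comparison described above can be illustrated with a minimal Python sketch. It is not Anand’s or Gramener’s actual method; the sample records, the keyword list and the function name keyword_counts are hypothetical, and the matching is exact, so ‘agreements’ would not be counted as ‘agreement’.

```python
import re
from collections import Counter

# Hypothetical sample records (year, decision text); not real PIB data.
DECISIONS = [
    (2007, "Central assistance approved for central sector scheme; agreement signed with Japan"),
    (2008, "Agreement between India and Brazil on central excise cooperation"),
    (2010, "Grant to states for road projects; states to share costs"),
    (2011, "Special package for states affected by drought"),
]

# Illustrative keywords, matched as whole lower-case words.
KEYWORDS = ("central", "states", "agreement")


def keyword_counts(decisions, keywords, cutoff_year=2009):
    """Count keyword occurrences before cutoff_year and from cutoff_year onwards."""
    counts = {"before": Counter(), "from": Counter()}
    for year, text in decisions:
        period = "before" if year < cutoff_year else "from"
        # Split the text into lower-case words, dropping punctuation.
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in keywords:
                counts[period][word] += 1
    return counts


if __name__ == "__main__":
    result = keyword_counts(DECISIONS, KEYWORDS)
    for period, counter in result.items():
        print(period, dict(counter))
```

Plotting such counts per year, rather than for two periods, would give the kind of trend the slide conveyed; the two-period split is used here only to keep the example short.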
In another session, Alka Mishra, Senior Technical Director, National Informatics Centre (NIC), discussed the importance of open data in the government and the drive to enable this through “Data.Gov.In.” Under the National Data Sharing and Accessibility Policy (NDSAP, notified in March 2012), nominations were called for across all government bodies (state and centre) to prepare open datasets, defined under “Open Government Data” in the NDSAP guidelines as:
“A dataset is said to be open if anyone is free to use, reuse, and redistribute it – Open Data shall be machine readable and it should also be easily accessible.”
According to Alka, of the 80 government departments, 67 nominated themselves, but only 15 of the 67 have actually provided datasets that are available on the data.gov.in website. The datasets cover performance reports, statistics and various government schemes; 147 such sets are available in various formats.
In a panel discussion entitled “Why Open Data is Good for Government and Citizens Alike” that also included Meera K, Editor of Bangalore’s interactive newsmagazine, Citizen Matters, and Shekhar Krishnan, PhD Student, Science, Technology and Society, MIT, Alka emphasised the importance not just of open data, but of data that is ‘readable’ and ‘useable’ and conforms to international open data formats.
Many in the audience expressed concerns about formats like PDF and Excel that make usability and processing difficult. Converting PDF to CSV, which is an open data format, takes time. “It takes about 8 hours to process 60 PDFs with tabulated data. So the bulk of our work is in getting clean data while representation of the data takes much less time,” says Anand as he shares his day-to-day experience in dealing with data.
Panel discussion. Alka Mishra of NIC, Meera K, co-founder, Citizen Matters and Shekhar Krishnan, PhD Scholar from MIT. Pic by: Shamala Kittane
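Part of the clean-up work Anand describes, turning tabulated text into CSV, can be sketched in Python. This is an illustrative sketch under stated assumptions, not the process used at Gramener: it assumes the text has already been extracted from the PDF by some other tool and that columns are separated by two or more spaces. The function name text_table_to_csv and the sample rows are hypothetical.

```python
import csv
import io
import re


def text_table_to_csv(raw_text):
    """Turn whitespace-aligned table text into CSV.

    Assumes columns are separated by two or more spaces and that
    blank lines carry no data.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # A run of two or more whitespace characters marks a column boundary.
        writer.writerow(re.split(r"\s{2,}", line))
    return out.getvalue()


# Hypothetical sample of text pulled from a tabulated PDF page.
SAMPLE = """
Department        Datasets   Format
Ministry A        12         XLS
Ministry B, HQ    3          PDF
"""

if __name__ == "__main__":
    print(text_table_to_csv(SAMPLE), end="")
```

Real PDF tables rarely align this neatly, with merged cells, wrapped text and page headers, which is one reason the cleaning stage takes so much longer than the visualisation.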
“It is difficult to explain the requirement of ‘clean data’ and ‘open data formats’ to government officials,” Alka admits. She goes on to clarify that there is an effort to sustain the NDSAP by setting up NDSAP Cells in various government offices. “They ensure that government makes datasets available regularly and that the data is clean before it goes on to the data.gov.in website. The Cabinet Secretary holds a meeting every month to address the issue of slow response of government organisations to NDSAP and ways to encourage opening up of data across government offices,” shares Alka in the discussion.
“Since NDSAP is about proactive data aggregation, we cannot impose it on officials. It is unlike RTI where data is provided on demand by citizens. Communities coming together to demand open data from various government offices is the much needed catalyst to pace up this initiative of NDSAP,” Alka says in response to Meera’s suggestion to mandate availability of basic information like ‘project details’, ‘tender notifications’, ‘budgets and funds’ of all government departments at all levels (state, municipality, ward). “It is easier for us to push for this at the level of Centre,” she adds.
The effort involved in getting data does not dampen the spirits of data enthusiasts. According to some, it is exciting to know the shapes and forms that data can take if we know how to use it and if there is a strong reason to use it. For Deepa Gupta, Founding Director, Jhatkaa.org, the use of data and technology is time-sensitive. She is exploring ways to use technology to transfer data quickly among large numbers of people, transcending geographical boundaries, so as to make activism more effective. “There are masses of people rising up against similar issues and it is scattered in time and space. It needs to be collective and sustained, and solidarity is important. Passion drives us and we need to capture that as data and use it to keep us going,” she says.
Open Data Camp, Bangalore, March 01, 2013. Pic by: Shamala Kittane
What Deepa talks about is crowdsourcing of data, something that many organisations working in the social sector have benefitted from. “Ushahidi” was born of a distress situation in Nairobi when elections caused widespread violence and the government shut out the media. Bloggers became the de facto reporters of news; they went mobile and quickly put together a website that had all information regarding safe booths to vote at, areas to be avoided etc. Ushahidi’s crowdsourcing app today tracks power cuts in India and any of us could become a part of it to make it more accurate.
Crowdsourced data is best represented using maps. Arun Ganesh, who graduated from the National Institute of Design and has worked with the Wikimedia Foundation, is one individual now engaged in creating a free geodatabase of India using OpenStreetMap and building an online public transit information system for the country.
Open Data Technology
Involves anybody and everybody who is affected by power cuts. Live data that people can report, made available online using Twitter and other media such as mobile web, SMS and smartphone apps
A better understanding of demand for electricity; it also highlights where there is a need for investment in power production and supply. News reports about lost productivity, as well as potential ways to use cleaner alternative energy sources, are now also being made available. SRC: http://powercuts.in/page/index/2
Crowdsourcing and GPS, Wiki Maps (Supporting Free Libre and Open Source Software and geospatial software/data)
OpenStreetMap (Arun Ganesh / planemad)
Bus users can report mapped stages, unmapped routes and erroneous routes online or through their smartphones
Making maps and commute information easily available and easy to fix: http://busroutes.in/chennai/
All of these benefit the educated, middle and higher classes of society. “But what happens to the invisible sections of the society?” asks Siddharth Hande, Fellow, Hyderabad Urban Lab, in his rather insightful presentation titled “Examining data practices: The case of Cyberabad’s publicly accessible crime”.
Hyderabad police is making crime data available and open, but Hande recalls a recent incident to draw attention to the suppressed side of it. Hussain Sagar, perceived and projected as an important ‘consumption centre’, is barely acknowledged as also being a ‘work hub’ for street sweepers and sex workers. On the night of 5 January 2013, between 10 pm and 1 am, 10,000 participants in the Midnight March walked around the lake, claiming the right of women to walk the streets at night without fear. As Hande narrated, the police drove the workers at Hussain Sagar (kept invisible for various reasons) to a discreet location in the park, deliberately excluding them from the march.
Clearly, this raises serious questions with regard to the reliability of open crime data: ‘how and why certain data gets projected,’ ‘who wants this data to be visualised’ and so on. There are also questions on ‘data privacy’ with Hyderabad’s open crime data including the phone numbers of people. Hande’s presentation points to the clear need for data practices to be constantly examined and made more sensitive and standardised.
The 'description' section could contain sensitive information and is not suitable to be made available publicly. No rules are in place yet to regulate this section of the complaint. Source: Presentation made by Siddharth Hande
Bridging the divide
Sharing his farming experiences in Devarayanadurga, Dinesh T B from Servelots revealed that the organization, under the banner of ‘Janastu,’ is working on a project called “Re-narration Web.” Named “Alipi” (pronounced ‘aa’ ‘lipi’ in Kannada, meaning non-literate), the project hopes to bring the Internet to the ‘10 per cent of the 10 per cent.’ As of December 2011, 10 per cent of the total population in India had access to the Internet, and 90 per cent of this 10 per cent was urban; the remaining non-urban tenth amounts to about 1 per cent of the population.
“Everybody seeks out mediated service. Why would anybody want to read the budget data if someone is providing them a concise analysis of the same? The media (electronic and print) have made progress in this direction.” These observations led Dinesh and his team to see the need to extend this service to communities that can use the Internet, irrespective of whether they are literate or whether they can read or write English.
“Since the western world does not acknowledge the possibility of ‘having eyes but not being able to read’, we are on our own. Software to address this need has to be developed from scratch. We have the framework ready and proof of concept in place. And we will develop the software on an ‘on-demand’ basis for those who need it,” says Dinesh, even as he expresses pride in how India has long been a centre for information re-narration through forms such as paper puppetry, story-telling through dance and songs (baul music, Harikathe) etc.
Data from the World Bank
The World Bank currently releases data on project-related contracts that were reviewed by the Bank before they were awarded. World Bank funds are tracked and regular project updates are made available online. “We are constantly on the alert for any irregularities in use of funds and encourage the public to report back to us if they find any discrepancies in the use of funds,” says Ankur Nagar, World Bank Finances (Open Data), in the course of his presentation on “Open Financial Data: Following World Bank’s Money.”
Nagar’s slide had everybody’s attention, as noted at the beginning of this article. It showed us the World Bank money operating in Bangalore that runs into millions of US dollars for the area of Bommanahalli alone. “Citizens can thus actually monitor World Bank-funded projects and report to the Bank if they suspect corruption,” Nagar points out.
The World Bank has a hotline through which anybody who suspects fraud in a World Bank-funded project in their locality can call for an investigation by the Bank to assess project integrity. “We are seeing the emergence of data scientists, data journalists and the trend is catching up. But the potential impact of open data in the humanitarian sector needs to be questioned,” Nagar says.
The journey onwards
So, can all of these and projects similar to these bring about a ‘tipping point’ as far as presentation, use and application of open data in India is concerned?
“There is data out there and enough tools and technology available to quantify social issues but it is the lack of awareness that is a challenge,” observes Anand. Given the density and complexity of society in India, awareness might take a while to come about, but we need to get our data and facts right first. “There are many possibilities to make the Internet truly accessible to all, open to not just the educated but everybody for whom information could be life changing. We need to find them,” says Dinesh, on the future reach of open data.
The infectious enthusiasm has evidently touched all. In a first-of-its-kind event, the National Innovation Council and the Planning Commission are coming together to host a two-day Hackathon on 6-7 April, in which teams of data enthusiasts will delve into the 12th Five-Year Plan and compete with each other in creating visualisations, infographics, mobile/web applications or even short films based on it. The idea is to enable participants to have fun even as they take part in something as significant as evolving a vision plan for the nation. Clearly, India is changing and open data could have a bigger role to play than most would tend to think.