India must do more to see impact of open data
Web Foundation · November 16, 2015
Independent Research Economist Natasha Agarwal critically evaluates India’s progress on open government data. Her findings echo those of the three Web Foundation Open Data in Developing Country studies: challenges facing intermediary organisations in using open data in India, poor open data in the extractives sector, and non-existent open data in public sanitation in Chennai.
In 2012, India launched its national open government data (OGD) platform, data.gov.in. On the surface, it appears to be a great success: over 18,000 resources published, 4.7 million views and 1.8 million downloads. But are these figures translating into real change for citizens on the ground? Is this information being used to spot areas for concrete improvements in public service delivery?
Upon closer inspection, it becomes clear that the portal has only just begun its journey towards making a real impact for everyday Indians: critical datasets are unavailable on data.gov.in, and the datasets that are available are often outdated, duplicated, incomplete or inadequately referenced, and lack common terms to describe the data. Top-level metadata, such as the data collection methodology and a description of the variables, is also either missing or incomplete. These shortcomings make it difficult to compare and analyse datasets properly.
In a policy brief, I argue that these shortcomings happen for three reasons:
1. India’s policies on government open data are unclear and lack strong guidelines for individual departments and agencies to implement these policies.
When the Indian government decided to open up more government data through its new portal, it did so with the goal of creating greater public accountability and economic opportunities. But like any innovation, particularly across a government bureaucracy as large as India’s, this requires a clear plan and substantial efforts to change existing government practices.
The National Data Sharing and Accessibility Policy (NDSAP) was designed to fill this need and govern the release of India’s government data. However, in practice, it has not been implemented equally across all agencies and departments. Theoretically, all datasets of the various local and national ministries, departments, subordinate offices and autonomous bodies should be available on the data portal. But what we have seen in practice is that a lack of clarity in the NDSAP has meant varying interpretations between departments as to what qualifies for upload, resulting in inconsistent data sharing across government.
The government can fix this problem by (a) clarifying the threshold for what data must be shared on the portal, (b) simplifying the format used to share this information and (c) integrating the NDSAP with the government’s existing National e-Governance Plan, reducing the number of regulations that government agencies and departments must follow.
For example, the government could set a threshold for which datasets must be shared on the portal by requiring government departments to share any dataset that has been searched for by 100 citizens on the portal. Additionally, all data shared could be released in easy-to-use formats such as .xlsx, which can include not only the data itself but also the metadata associated with it, such as the original source and notes on how it was converted, to help researchers understand it fully. One example is the database maintained by Agarwal and Lodefalk on the number of e-tourist visas issued by the Government of India.
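The threshold rule suggested above could be expressed as a simple check. The following is only a minimal sketch of that idea, not a description of how data.gov.in works: the class name, field names and function names are all hypothetical, and it assumes the portal could count distinct users who searched for a dataset.

```python
from dataclasses import dataclass, field

# Threshold taken from the example in the text: 100 citizens.
SEARCH_THRESHOLD = 100


@dataclass
class DatasetRequest:
    """Hypothetical record of public demand for one unpublished dataset."""
    name: str
    department: str
    # IDs of distinct users who searched for this dataset (assumed to be trackable).
    searchers: set = field(default_factory=set)

    def record_search(self, user_id: str) -> None:
        # A set ignores repeat searches by the same user.
        self.searchers.add(user_id)

    def must_be_published(self, threshold: int = SEARCH_THRESHOLD) -> bool:
        # The dataset qualifies once enough distinct users have searched for it.
        return len(self.searchers) >= threshold


def datasets_due_for_release(requests, threshold=SEARCH_THRESHOLD):
    """Return the names of datasets whose demand has reached the threshold."""
    return [r.name for r in requests if r.must_be_published(threshold)]
```

Counting distinct users rather than raw searches is a design choice made here so that one person searching repeatedly does not trigger a release obligation on their own; the text itself does not specify how searches would be counted.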
2. Data providers and users are not aware of data.gov.in or how it could improve their work
Government agencies and departments do not always see the benefits of proactively sharing their data with the public. And people who could benefit from access to this data, such as journalists, researchers and civil society, are often not aware that it is available (Buteau et al. 2015). We can change this by building stronger relationships between government data providers and data users, and by making sure the end users of data are involved throughout the data collection and distribution process.
3. The physical infrastructure in place to support open government data is not enough
To truly power India’s open government data revolution, the government must invest in better computers, a variety of up-to-date statistical packages, stable Internet connections, the hosting of supercomputers for processing larger datasets and an engineering team to troubleshoot. These measures can help to cut down on bureaucracy and waiting times – essential if India wants to motivate departments and agencies to embrace this transition to openness. The time saved through these improvements would also enable the publication of more original datasets, moving India closer to the ultimate goals of the open government data movement.
India’s long-term commitment to OGD is commendable. However, it is not enough just to set up an open government data portal. The commitment must be sustained over time and involve the governmental and non-governmental organisations who can benefit most from the use of this data. In a survey of academics and researchers from a variety of disciplines, 98 percent of respondents said that open government data can have an impact on public policy. We must prioritise it to stimulate economic growth and secure India’s position as one of the world’s growing powers.
To learn more, check out these research publications:
Buteau, S., Larquemin, A. and Mukhopadhyay, J. P. (2015). Open data and applied socio-economic research in India: An overview.
Chattapadhyay, S. (2014). Opening Government Data through Mediation: Exploring the Roles, Practices and Strategies of Data Intermediary Organisations in India, World Wide Web Foundation, Washington DC.
Agarwal, N. and Lodefalk, M. (2015). Dataset on India’s e-tourist visa (formerly tourist visa on arrival) programme.
November 17, 2015
Interesting and logical. I hope the government looks into your suggestions carefully. Please keep me updated about the response from the government. Regards.
January 23, 2016
Indeed, the availability of correct, complete data is always a prerequisite for good, valuable research. But data collection, especially of a socio-economic nature, is always a challenge in a country like India, where issues such as language, literacy and people's lack of awareness of how the data benefits their welfare are the biggest challenges. Further, whatever is collected is of such low quality that it requires many iterations of data cleaning and scrutiny before final release, and this lag reduces its usability as well. One such example is the recently released Census 2011 data. Much needs to be done to make people more aware and to help them understand the metadata before they provide their responses.
September 28, 2016
Hi, as a data scientist, I love to explore data and find meaningful insights in it. I was looking for open datasets for India, so that I too can contribute my knowledge and experience to build information, identify areas of improvement, find trends, dig more meaning out of cluttered datasets and so on. I hope to see more open data.