The COMAD Data Challenge Contest, “DataView 2016”, was a first-of-its-kind analytics and app contest that asked participants for insights from data the government has put in the public domain, on everyday governance issues affecting society. The slides detail the contest and its winners.
DataView 2016 Analytics Competition for Public Health Using Indian Open Data
1. Biplav Srivastava, Debtanu Dutta, Hemant Mittal
March 12, 2016
DataView 2016
App Contest for Social Impact at COMAD 2016
2. Contest Highlights
• First contest focused on data analysis with open data
• Two-round contest: the first round asked for insights and the second
asked for apps (and insights)
• The first round began at 00:00:00 Indian Standard Time (IST) on
November 1, 2015 and ended at 23:59:59 IST on December 15, 2015.
• The second round began at 00:00:00 IST on
January 1, 2016 and ended at 23:59:59 IST on February 15, 2016.
• Evaluation by the organizing committee, with feedback taken
from data.gov.in and the COMAD organizers
• 6 teams participated in the first round; 3 carried on to the second
round
3. Insights Sought
1. What diseases are most prevalent in a given area (e.g., state, district, city, by
keyword)?
2. Which diseases have been better controlled than others in India? What
states have done better than others? Are there approaches which have
worked for controlling / reducing instances of diseases better than others?
3. How much money has been allocated to tackle specific diseases compared
to others? Which regions do better than others in controlling diseases relative
to money spent?
4. Is there a relationship between water-borne diseases and
water pollution?
4. Open Data Sets
Health
•H-DS-1: http://data.gov.in/catalog/number-cases-and-deaths-due-diseases , All India (from 2000 to 2011) and State-wise (2010 and
2011) number of cases and deaths due to specified diseases (Acute Diarrhoeal Diseases, Malaria, Acute Respiratory Infection,
Japanese Encephalitis, Viral Hepatitis).
•H-DS-2: http://data.gov.in/catalog/cases-and-deaths-due-kala-azar , Cases and Deaths due to the illness Kala-Azar in Bihar, West
Bengal and Country during the years 1996 till 2000.
•H-DS-3: https://data.gov.in/catalog/cases-and-deaths-due-japanese-encephalitis-and-dengue-dhf-during-tenth-plan, cases and deaths
due to Japanese Encephalitis and Dengue / DHF during Tenth Plan.
•H-DS-4: https://data.gov.in/catalog/water-quality-affected-habitations, Water Quality Affected Habitations
•H-DS-5: https://data.gov.in/catalog/hospital-directory-national-health-portal, Hospital Directory with Geo Code as on September 2015
Expenditure
•F-DS-1: https://data.gov.in/catalog/outlays-and-expenditure-aids-control-programme-during-ninth-plan, outlays and expenditure of
AIDS Control Programme during Ninth Plan.
•F-DS-2: https://data.gov.in/catalog/public-sector-outlaysexpenditure-during-eleventh-five-year-plan, public sector outlays and
expenditures during Eleventh Five Year Plan (2007-12) under various Heads of Development (Rs. Crore).
•F-DS-3: http://data.gov.in/catalog/outlays-department-health-agreed-planning-commission-during-tenth-plan , data related to 9th
Plan Allocation, 9th Plan Anticipated Expenditure, 10th Plan Allocation as Agreed by Planning Commission.
•F-DS-4: https://data.gov.in/catalog/percentage-share-household-expenditure-health-and-drugs-various-states-during-eleventh-five,
data related to percentage share of household expenditure on health and drugs in various states during Eleventh Five Year Plan.
•F-DS-5: https://data.gov.in/catalog/state-wise-plan-outlays-and-expenditure, table provides state-wise plan outlays and expenditure
during 2011-2012.
•F-DS-6: https://data.gov.in/catalog/outlay-tenth-plan-tenth-plan-sum-annual-outlay-and-tenth-plan-actual-expenditure-department,
data related to Outlay Tenth Plan, Tenth Plan (2002-07) sum of Annual Outlay and Tenth Plan (2002-07) Actual Expenditure for
Department of Health and Family Welfare.
Water Quality
•W-DS-1: https://data.gov.in/catalog/status-water-quality-india-2012,
http://data.gov.in/catalog/number-cases-and-deaths-due-diseases , status of Water Quality in India in 2012
5. Overall Winners
First Prize
iFuse: A Visual Data Fusion Approach, by Gunjan Sehgal, Kaushal Paneri, Aditeya
Pandey, and Garima Gupta, TCS Research
App: http://apps.web2labs.net/BDFusion/HomePage.html (Username: Comad/ Password: Comad/ Dataset : Comad)
Video: https://www.youtube.com/watch?v=eniSZKFpq_o
Second Prize (Joint)
Aniya Aggarwal, Mayur Saxena, Varun Parashar, Nishtha Madaan, IBM Research
App: http://vpronaldo.github.io/insights-demo/; Video: https://youtu.be/POsdsHPCHjA
Planning Disease Control amidst Water Pollution, K Kumar Ajella, Sravanthi Reddy
Akavaram, Sravan Daggupati and Prasad Aluru, ValSoft
App: http://dataview2016.valsoftech.com
Round 1 Results
iFuse: A Visual Data Fusion Approach, by Gunjan Sehgal, Kaushal Paneri, Aditeya Pandey, and Garima Gupta, TCS Research
Special mentions:
1. DataPeRceivers, Abhishek Dubey, Aishwarya Kaneri, Ajit Dhobale, Apurva Mulay, Karthik Prabhu
2. Planning Disease Control amidst Water Pollution, K Kumar Ajella, Sravanthi Reddy Akavaram, Sravan Daggupati and Prasad Aluru, ValSoft
6. Teams’ Assessment
• TCS team
• Strong focus on questions; gave summarized as well as detailed
answers; used innovative in-house tools
• Strong submissions in both rounds
• IBM team
• Detailed and exhaustive answers; used off-the-shelf tools
• Strong second round submission
• Valsoft team
• Focused on visualization but did not give specific answers; used
off-the-shelf tools
• Participated in both rounds
7. A Data Community’s Attempt to
Answer Public Health Questions
A snapshot from the winning entries. See explanations by each
team and their demos for details.
8. #1. What diseases are most prevalent in a given
area (e.g., state, district, city, by keyword)?
(TCS team) “We discovered that seven sister states and eastern states of
Jharkhand, Chattisgarh and Odisha have recorded higher per capita cases of
MALARIA when compared with other states/regions.”
(IBM team)
(Valsoft team)
• Verbose visualization
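The per capita comparison quoted by the TCS team can be sketched as below. This is a minimal illustration, not the team's method: the function names, the per-100,000 scaling, and all numbers are assumptions made up for the example, and real case counts would come from a dataset such as H-DS-1 joined with separate population figures.

```python
def cases_per_capita(cases_by_state, population_by_state, per=100_000):
    """Return reported cases per `per` residents for each state with a known population."""
    rates = {}
    for state, cases in cases_by_state.items():
        population = population_by_state.get(state)
        if not population:  # skip states whose population is missing or zero
            continue
        rates[state] = cases * per / population
    return rates


def rank_states(rates):
    """Return state names ordered from highest to lowest rate."""
    return sorted(rates, key=rates.get, reverse=True)


# Hypothetical inputs, for illustration only (not taken from the contest data).
cases = {"State A": 500, "State B": 1200, "State C": 300}
population = {"State A": 1_000_000, "State B": 12_000_000, "State C": 200_000}

rates = cases_per_capita(cases, population)
ranking = rank_states(rates)
```

In this made-up example the state with the most raw cases (State B) ranks last once population is taken into account, which is why per capita figures rather than raw counts are the natural basis for comparing states.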
9. #2. Which diseases have been better controlled than others in India? What
states have done better than others? Are there approaches which have
worked for controlling / reducing instances of diseases better than others?
(TCS team)
• Malaria has been better controlled in India as compared to other vector-borne diseases
• Chattisgarh, Odisha, Gujarat, Jharkhand and Tamil Nadu have done better in controlling Malaria as
they reported a higher survival ratio.
• Expenditure on vector-borne diseases has helped in curbing malaria whereas this is not the case for
Japanese Encephalitis.
(IBM team)
(Valsoft team)
• Verbose visualization
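The slides do not define the "survival ratio" used by the TCS team. One plausible reading, assumed here purely for illustration, is the share of reported cases that did not end in death. The function name and the numbers below are hypothetical.

```python
def survival_ratio(cases, deaths):
    """Fraction of reported cases that did not result in death (an assumed definition)."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    if not 0 <= deaths <= cases:
        raise ValueError("deaths must be between 0 and cases")
    return (cases - deaths) / cases


# Hypothetical figures, not taken from the contest datasets.
example_ratio = survival_ratio(cases=1000, deaths=25)
```

Under this definition a state can report many cases and still score well, so the ratio measures how well outcomes were managed rather than how well the disease was prevented.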
10. #3. How much money has been allocated to tackle specific diseases
compared to others? Which regions do better than others in
controlling diseases relative to money spent?
• (TCS team)
• In the 10th Five Year Plan, the government allocated the most money to vector-borne
diseases, followed by Tuberculosis.
• Gujarat, Kerala, Tamil Nadu and Rajasthan performed better than others relative
to money spent whereas A&N Islands and Mizoram were unable to do so.
• (IBM team)
• (Valsoft team)
• Verbose visualization
11. #4. Is there a relationship between water-borne
diseases and water pollution?
• (TCS team) We found a positive correlation between per
capita acute diarrhoeal cases and (1) average conductivity and (2)
average nitrate-N + nitrite-N
• (IBM team)
• Moderate {Oxygen, pH, Conductivity} => Moderate {Viral
hepatitis, Acute diarrhoea}, Low {Malaria, Japanese encephalitis}
• (Valsoft team)
• Verbose visualization
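A correlation of the kind reported by the TCS team can be checked with a Pearson coefficient computed over state-level pairs. The sketch below is an assumption about how such a check might look, not the team's implementation: the slides do not say which correlation measure was used, and the sample values are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical per-state values: average conductivity and per capita diarrhoeal cases.
conductivity = [1.0, 2.0, 3.0, 4.0]
diarrhoea_rate = [2.0, 4.0, 6.0, 8.0]

r = pearson(conductivity, diarrhoea_rate)
```

A coefficient near +1 indicates that the two quantities rise together across states. It does not by itself show that pollution causes the disease burden, which matches the data quality and sufficiency caveats the organizers raise.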
12. Lessons
• Data science can help make a start on answering public
health questions with open data
• Students did not participate, but such contests may become popular in
future; a good area for research and serious innovation
• Participating teams experienced challenges with data quality
and sufficiency
• Teams made assumptions, which are reflected in the results
• A lot of room for improvement
• Public health officials, NGOs, and the open data community should use the
results and encourage more contests and hackathons in this
area in future