ONA 2013: Data Journalism, No Excuses, with LA NACION Data
1. Data Journalism, no excuses!
Momi Peralta Ramos
@momiperalta
Florencia Coelho
@fcoel
LA NACION DATA
www.lanacion.com.ar/data
2. About LA NACION
• Based in Buenos Aires,
Argentina
• Sunday print circulation:
+ 360.000
• www.lanacion.com
unique visitors/month:
+ 11MM
• 9 magazine titles
• Impremedia (90%): leading US
Hispanic publishing company
3. LA NACION Data
It's LA NACION's initiative to develop data
journalism and contribute to opening data in
Argentina
6. Data.gov Portals?
In 2012 – Buenos Aires City and
Misiones Province opened data portals
2013 – National Data Portal!
7. A vision for several actions
LA NACION decided to challenge the status quo and
started opening data and developing data
journalism
@fcoel - @momiperalta
8. Why?
- Data is a new raw material for journalism
- Move public data closer to the people
- Activate demand of public data
- Discover new stories hidden in datasets
- Allow citizens' collaboration + innovate
- It is the future of journalism!
9. HOW...is this possible??
• EXCUSES:
– There is NO DATA or DATA is not credible
– We are not the US or the UK in terms of
transparency
– We DON’T have programmers in our newsroom
– We DON’T have skills in our newsroom to gather
or analyze datasets
– Will all this effort make sense? Will someone use
this data?
– We don’t… we don’t…
11. HOW...is this possible??
KILLING THIS SKEPTICISM, ONE BY ONE
12. 1. LEARN HERE!!
• Go to conferences or follow them online, and learn in the
sessions!! Become a member.
• ONA 2010 was our first inspiration for data journalism: yes, a
pre-conference workshop at ONA.
• Learn for free online with MOOCs, webinars and books
13. 2. EMBRACE HACKTIVISM
Big community of developers and NGOs willing
to help! Embrace, engage and promote
hacktivism!
15. 3. START CREATING DATASETS, START SMALL
…BE HUMBLE, BECOME A DATA BUILDER
16. 4. LEARN TO ASK FOR HELP – TEAMWORK!!
• With a little help from my friends…
DEVELOPER
JOURNALIST
The perfect team…
IMAGE from Scraperwiki
https://scraperwiki.com/
19. HIV/AIDS 30th Anniversary Vertical Timeline
Collaborative tools: Vertical Timeline with Google Spreadsheets.
Team: 1 member of LN Data + 1 member of an HIV/AIDS NGO and …
http://www.lanacion.com.ar/1583473-a-treinta-anos-del-descubrimiento-del-vihsida
20. … International Collaboration!!
WNYC Vertical Timeline’s Google Spreadsheet model online, ready to copy & paste.
“Absurdly illustrated” guide by @LisaWilliams
33. Transport Agency – Processing
CCP subsidy (Subsidio CCP) corresponding to March 2012: ccp_sistau_marzo12(6).xlsx
BUT: 1.200 rows (companies) x 21 columns
1.600 PDF files (cash and diesel subsidies for buses and trains) from 2003 to now (March 2013).
Extra challenge: after publication, files are updated (up to 10 times)
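One way to cope with files that change after publication is to fingerprint every downloaded file and compare fingerprints on the next visit. The sketch below is only an illustration under assumptions: the folder layout, the `*.pdf` pattern and the function names are hypothetical, not the workflow LA NACION actually used.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_changed(folder: Path, known: dict) -> list:
    """List PDF names in `folder` that are new or whose content changed.

    `known` maps file name -> digest from the previous run (hypothetical
    bookkeeping); it is updated in place so it can be saved and reused.
    """
    changed = []
    for path in sorted(folder.glob("*.pdf")):
        digest = file_digest(path)
        if known.get(path.name) != digest:
            changed.append(path.name)
            known[path.name] = digest
    return changed
```

Only the files returned by `find_changed` would need to be processed again, which matters when a file can be republished up to 10 times.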
34. Transport Secretary – Analysis
Monthly consolidation of 3 subsidies by company and geographic zone
Subsidies paid in 2010: USD 4.260.000 … per day
- Which companies received the most subsidies? Which ones had the greatest rise?
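The consolidation and the two questions above can be sketched in a few lines. Everything here is an assumption for illustration: the company names, periods and amounts are made up, and the row layout `(company, period, subsidy_type, amount)` is hypothetical rather than the real spreadsheet structure.

```python
from collections import defaultdict

# Made-up rows: (company, period, subsidy_type, amount). Not real data.
rows = [
    ("Company A", "2009", "cash", 100.0),
    ("Company A", "2009", "diesel", 50.0),
    ("Company A", "2010", "cash", 180.0),
    ("Company B", "2009", "cash", 300.0),
    ("Company B", "2010", "cash", 330.0),
]


def totals_by_company_period(rows):
    """Consolidate all subsidy types into one total per (company, period)."""
    totals = defaultdict(float)
    for company, period, _subsidy_type, amount in rows:
        totals[(company, period)] += amount
    return dict(totals)


def top_recipient(totals, period):
    """Company with the largest consolidated total in `period`."""
    in_period = {c: v for (c, p), v in totals.items() if p == period}
    return max(in_period, key=in_period.get)


def greatest_rise(totals, start, end):
    """Company with the largest relative increase from `start` to `end`."""
    rises = {}
    for (company, period), value in totals.items():
        if period == start and value > 0 and (company, end) in totals:
            rises[company] = totals[(company, end)] / value - 1
    return max(rises, key=rises.get)
```

A geographic zone could be added as one more element of the grouping key in the same way.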
42. Census Argentina 2001-2010
Our 2013 Knight-Mozilla OpenNews Fellow at LA NACION, Manuel Aristaran, was in
charge of this technological challenge together with our newsroom developer.
43. A census-thon (a marathon to download & normalize census variables)
• 2 days with Knight ICFJ fellow Sandra Crucianelli
45. SENATE EXPENSES 2004 – 2013
Google-GEN Data Journalism Award Winner 2013
Finalists: The Guardian, The Associated Press, BBC News, Center for Public
Integrity, The Financial Times, Global News, La Nación de Costa Rica, The Los
Angeles Times and Mother Jones
57. Senate Expenses – Team Video
http://www.youtube.com/watch?v=qEZ2xMwPMWo&feature=youtu.be
58. La Plata City Major Flooding (April 2013)
• Collaborative tools: Google Spreadsheets, Google Maps, Google Fusion Tables
• Other tools: Excel, Tableau Public.
59. Hypothesis: the government was hiding the real number of deaths to
diminish the impact of its own responsibilities
• We got 150 copies of handwritten death certificates in La Plata for April (1st-15th).
• We made a database model, typed each case's details into a spreadsheet, then sorted, filtered, analyzed…
60. • Visualizations for time & place helped us confirm that most deaths
happened between April 2nd and 4th (or were directly related) and many were
located over water streams running under the city and/or flooded blocks.
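Once each certificate is typed into a row, the time pattern can be checked with a simple count per day. This is a minimal sketch under assumptions: the rows below are invented placeholders, not the real records, and the `(case_id, date_of_death)` layout is hypothetical.

```python
from collections import Counter
from datetime import date

# Invented placeholder rows: (case_id, date_of_death). Not the real records.
cases = [
    (1, date(2013, 4, 2)),
    (2, date(2013, 4, 2)),
    (3, date(2013, 4, 3)),
    (4, date(2013, 4, 3)),
    (5, date(2013, 4, 3)),
    (6, date(2013, 4, 9)),
]


def deaths_per_day(cases):
    """Count how many cases fall on each date."""
    return Counter(day for _case_id, day in cases)


def share_in_window(cases, start, end):
    """Fraction of cases dated between `start` and `end`, inclusive."""
    inside = sum(1 for _case_id, day in cases if start <= day <= end)
    return inside / len(cases)
```

For example, `share_in_window(cases, date(2013, 4, 2), date(2013, 4, 4))` gives the fraction of typed cases that fall in the days of the flood.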
61. No Geocoding in GFT.
La Plata: “Where the streets have no name”
63. Then the developer combined 2 JPGs we got from different sources:
water streams (some running under the city) and flooded blocks
64. And we published this map, based on
Google Fusion Tables + the 2 combined, overlaid JPGs.
65. Impact. Starting from 51 deaths …
One day after publishing: a judge confirms 60 deaths due to the major flooding
45 days after: 78 deaths officially confirmed
66. Team explains how collaboration worked
http://youtu.be/a56fWexw8uo <- Meet the team. Journalist,
dataminer, programmer, designer, data producer and me, multitasker.
Goodie for later!
72. STEP BY STEP MANUALLY BUILDING DATASET
• Step 1: Data entry + 15.000 rows x 28 columns
• Step 2: Raw data checking (paper vs spreadsheet)
• Step 3: Normalizing
• Step 4: Front end field by field checking (news app vs paper)
• Step 5: Publish and…OPEN THE DATA!!
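Parts of the checking and normalizing steps can be automated before the manual review. The sketch below is an assumption about how that might look, not the team's actual procedure: `normalize_text` and `rows_with_wrong_width` are hypothetical helpers, and only the 28-column width comes from the slide.

```python
import unicodedata

EXPECTED_COLUMNS = 28  # from the slide: rows x 28 columns


def normalize_text(value: str) -> str:
    """Trim, collapse inner whitespace, strip accents and uppercase.

    Meant for building matching keys, not for display: stripping accents
    also turns "ñ" into "N".
    """
    collapsed = " ".join(value.split())
    decomposed = unicodedata.normalize("NFKD", collapsed)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.upper()


def rows_with_wrong_width(rows, expected=EXPECTED_COLUMNS):
    """Return 1-based numbers of rows whose column count is not `expected`."""
    return [i for i, row in enumerate(rows, start=1) if len(row) != expected]
```

Rows flagged by `rows_with_wrong_width` would go back to the paper source for a manual check, and the normalized keys help spot the same name typed in different ways.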
73. Open Assets Declarations
NGOs' testimony on the collaboration project
http://www.youtube.com/watch?v=OmsDdzTvp0E&feature=youtu.be
74. WE WORK WITH THE COMMUNITY
OPEN, OPEN, LEARN, SHARE, SHARE…
78. • Besides our daily efforts, we opened 21 datasets and made “Dataset cheat sheets”
to make them accessible and ready for analysis with data mining techniques or
data visualization at our first DATAFEST in Argentina last November.
• This DATAFEST was organized by LA NACION together with Universidad Austral's
Master's Degree in Data Mining and Universidad Austral's Communications
Faculty.