Presentation with the Gautrin team at the National Assembly in Quebec City
hack/reduce introduction
1.
2. What is hack/reduce?
• A Home for the Big Data Community
• 24/7 Access to Cluster Compute Power
• Regular Hackathons
3.
4. hack/reduce
2011
Montreal
Toronto
Boston
Ottawa
2012 hack/reduce
Boston’s Big Data Hackspace
5.
6.
7. Why should you care?
• Work with Millions and Billions of records
• Find patterns in Big Data sets
• Use data to detect, predict, forecast
• Extract new information from raw data
8. APIs Suck
In Big Data there are:
• no requests
• no predefined parameters
• no structured responses
You are free to intersect anything with anything.
You can analyse, mutate, group, split, reorder in any
way you can imagine.
9. What you can do today
• Access the hack/reduce GoGrid Cluster:
• 240 Cores
• 240GB of RAM
• 10TB of Disk
10. What you can do today
Use Hadoop to Explore big Open Data sets, like:
• 20 Years of the Federal Parliament Hansard
• Hourly Canadian Weather 1953 to 2001
• The 1881 Census. Details about 4.3M people
• One Summer of Bixi Station Status Updates
11.
12. What is Map/Reduce?
• Framework for distributed computing on
large data sets on clusters of computers
• MapReduce patented by Google
• Hadoop implementation is Googlesque
• Michael Stonebraker hates it
13. What is Map/Reduce?
• Map = function applied in parallel to every
item in the dataset
• Reduce = function applied in parallel to
groups of values emitted by Map function
14. What is Map/Reduce?
map(String docId, String document):
    for each word w in document:
        emit(w, 1);

reduce(String word, Iterator counts):
    int sum = 0;
    for each count in counts:
        sum += count;
    emit(word, sum);
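The word-count pseudocode can be tried on a laptop before going anywhere near a cluster. What follows is a minimal single-process sketch in Python, not Hadoop code: the names `map_fn`, `reduce_fn` and `run_word_count` and the two sample documents are made up for illustration, and the grouping step that Hadoop performs between map and reduce is simulated with a dictionary.

```python
from collections import defaultdict


def map_fn(doc_id, document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def reduce_fn(word, counts):
    """Reduce: sum all the counts that were emitted for one word."""
    return word, sum(counts)


def run_word_count(documents):
    """Run map, group by key, then reduce, all in a single process."""
    # Grouping step: collect every value emitted by map_fn under its key.
    groups = defaultdict(list)
    for doc_id, document in documents.items():
        for word, count in map_fn(doc_id, document):
            groups[word].append(count)
    # Each group is reduced independently of the others, which is what
    # lets a real cluster spread the reduce calls across machines.
    return dict(reduce_fn(word, counts) for word, counts in groups.items())


# Hypothetical sample input: document id -> document text.
documents = {"doc1": "big data big cluster", "doc2": "big ideas"}
result = run_word_count(documents)
print(result)
```

Because `map_fn` looks at one document at a time and `reduce_fn` looks at one word at a time, neither needs to see the whole data set. That independence is what allows the same two functions to scale out to many machines.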
We are Hopper. Hopper is using Big Data to solve travel planning.
Hopper's Montreal office was home to the inaugural hack/reduce event two years ago.
hack/reduce is a community. We held four events, in Montreal, Toronto, Boston and Ottawa, and more than 300 hackers participated. Now we're building a permanent hack/reduce community hackspace in Boston.
GoGrid is sponsoring the cluster
If you're interested in learning something different, come talk to us.