Data comes in all forms and shapes. Data also evolves as life and people adapt to new situations, and so should your database.
When working with data, traditional relational database systems come to mind because that is how most of us have been trained. However, data is rarely homogeneous, and your database should not force you into a certain schema if your data is not relational.
During this talk we analyse the composition of "documents" in the context of a document-based database, and cover the basic principles of Map-Reduce and its potential use in the context of computational statistics.
What happens, then, when your data no longer fits on one server? How easily can your favourite database expand and adapt to your growing requirements? What is your contingency plan if your server goes down?
We then go over some of the features that CouchDB, Riak and MongoDB provide you with, alongside some of David's personal opinions.
This is an intermediate talk. Listeners should have a working knowledge of Bayesian statistics, standard internet protocols such as HTTP, and a minimal understanding of programming languages such as JavaScript and Erlang, as some of the examples for those databases use those languages.
12.
id   name   age  address  phone
1    david  26   IE       353
2    divad  27   US       1
3    foo    42   IE       353
4    bar    31   CA       1
5    john   17   NZ       131
6    jack   128  DK       311
7    jill   21   IE       353
...  ...    ...  ...      ...
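As a rough illustration of how the rows above might look as documents, and of the map and reduce steps the talk refers to, here is a minimal Python sketch. It is not tied to CouchDB, Riak or MongoDB; the names `map_doc`, `reduce_pairs` and `mean_age_by_address` are hypothetical, and the choice of "mean age per address" as the statistic is an assumption made for the example.

```python
from collections import defaultdict

# The table rows rewritten as schema-free documents (plain dicts).
# A document may omit fields or carry extra ones; nothing enforces a schema.
documents = [
    {"id": 1, "name": "david", "age": 26, "address": "IE", "phone": "353"},
    {"id": 2, "name": "divad", "age": 27, "address": "US", "phone": "1"},
    {"id": 3, "name": "foo", "age": 42, "address": "IE", "phone": "353"},
    {"id": 4, "name": "bar", "age": 31, "address": "CA", "phone": "1"},
    {"id": 5, "name": "john", "age": 17, "address": "NZ", "phone": "131"},
    {"id": 6, "name": "jack", "age": 128, "address": "DK", "phone": "311"},
    {"id": 7, "name": "jill", "age": 21, "address": "IE", "phone": "353"},
]


def map_doc(doc):
    """Map step (hypothetical): emit (address, (age, 1)) for one document.

    Documents that lack either field are skipped rather than rejected.
    """
    if "address" in doc and "age" in doc:
        yield doc["address"], (doc["age"], 1)


def reduce_pairs(pairs):
    """Reduce step (hypothetical): combine (sum, count) pairs into one pair."""
    total = sum(s for s, _ in pairs)
    count = sum(c for _, c in pairs)
    return total, count


def mean_age_by_address(docs):
    """Group the mapped pairs by key, reduce each group, return the means."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    means = {}
    for key, pairs in grouped.items():
        total, count = reduce_pairs(pairs)
        means[key] = total / count
    return means


if __name__ == "__main__":
    for address, mean_age in sorted(mean_age_by_address(documents).items()):
        print(address, mean_age)
```

Emitting `(sum, count)` pairs instead of a finished mean keeps the reduce step associative, so partial results from several servers could be reduced again without changing the answer.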
Wednesday 27 March 13
76.
Mapper: Divide vectors into subgroups, calculate d(p,q) between vectors, find centroids, sum them up.
Reducer: Sum up the sums, get new centroids.
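The mapper and reducer outlined above read like one iteration of k-means clustering. The Python sketch below is one possible reading, not the speaker's implementation: it assumes d(p,q) is the Euclidean distance, that "subgroups" are chunks of vectors handed to separate mappers, and that each mapper returns per-centroid sums and counts. The function names `distance`, `mapper`, `reducer` and `kmeans_step` are hypothetical.

```python
import math


def distance(p, q):
    """d(p, q): Euclidean distance between two vectors (an assumption)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mapper(chunk, centroids):
    """Assign each vector in one chunk to its nearest centroid.

    Returns {centroid_index: (component_sums, count)} for this chunk only.
    """
    partial = {}
    for p in chunk:
        nearest = min(range(len(centroids)),
                      key=lambda i: distance(p, centroids[i]))
        sums, count = partial.get(nearest, ([0.0] * len(p), 0))
        partial[nearest] = ([s + x for s, x in zip(sums, p)], count + 1)
    return partial


def reducer(partials, centroids):
    """Sum up the per-chunk sums and divide by the counts to get new centroids.

    A centroid that attracted no vectors keeps its previous position.
    """
    totals = {}
    for partial in partials:
        for idx, (sums, count) in partial.items():
            acc_sums, acc_count = totals.get(idx, ([0.0] * len(sums), 0))
            totals[idx] = ([a + s for a, s in zip(acc_sums, sums)],
                           acc_count + count)
    new_centroids = []
    for idx, old in enumerate(centroids):
        if idx in totals:
            sums, count = totals[idx]
            new_centroids.append(tuple(s / count for s in sums))
        else:
            new_centroids.append(tuple(old))
    return new_centroids


def kmeans_step(chunks, centroids):
    """One k-means iteration: map every chunk, then reduce the partial sums."""
    return reducer([mapper(chunk, centroids) for chunk in chunks], centroids)


if __name__ == "__main__":
    chunks = [[(0.0, 0.0), (10.0, 10.0)], [(0.0, 2.0), (10.0, 12.0)]]
    print(kmeans_step(chunks, [(1.0, 1.0), (9.0, 9.0)]))
```

In practice `kmeans_step` would be repeated until the centroids stop moving; the sketch runs the mappers sequentially, whereas a real deployment would run them on separate nodes.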