The analysis of government data, data held by business, web data and social science survey data will support new research directions and findings. Big Data is one of David Willetts’ eight great technologies, and to secure the UK’s competitive advantage the Economic and Social Research Council (ESRC) has made new investments in Big Data, for example the Business Datasafe and Understanding Populations investments. In this session Professor David De Roure, of the Oxford e-Research Centre and advisor to the ESRC, will explain the benefits of using Big Data in social science and the ESRC’s Big Data strategy.
8. The Big Picture
[Figure: diagram with axes “More people” and “More machines”. Labels: Big Data, Big Compute; Conventional Computation; “Big Social”, Social Networks; e-infrastructure; online R&D; Big Data Production & Analytics; deeply about society]
9. RCUK and Big Data
▶ ‘Big data is a term for a collection of datasets so large
and complex that it is beyond the ability of typical
database software tools to capture, store, manage, and
analyse them. ‘Big’ is not defined as being larger than a
certain number of ‘bytes’ because as technology advances
over time, the size of datasets that qualify as big data will
also increase’ (RCUK)
▶ But why do we want it?
New forms of data enable us to
1. Answer existing research questions in new ways
2. Ask entirely new research questions
11. NERC Big Data
...as diverse as our science
• From micro- to macro-scale
• Many sources:
• Monitoring campaigns
• Field sites & sensors
• State-of-the-art laboratories
• Ships & aircraft
• Remote Sensing & EO
• Regulator networks
• Volunteers/citizen science
• Model output
• Long-term and unique!
12. 100 TB
Big data: time-based media including film, TV, CCTV footage -
retail data - geospatial data - email and social media - images
and associated metadata - performance data including raw
data of recordings, choreography, performance structure -
open government data - music - large-scale digital scans -
library, museum & gallery archives and metadata
13. Research benefits of new data
▶ Undertaking research on pressing policy-related issues
without the need for new data collection
• Food consumption, social background and obesity
• Energy consumption, housing type and climatic conditions
• Rural location, private/public transport alternatives and
incomes
• School attainment, higher education participation, subject
choices, student debt and later incomes
▶ New data such as social media enable us to ask big questions,
about big populations, and in real time – this is
transformative
25. Real life is and must be full of all kinds of social
constraint – the very processes from which society
arises. Computers can help if we use them to create
abstract social machines on the Web: processes in
which the people do the creative work and the
machine does the administration... The stage is set for
an evolutionary growth of new social engines. The
ability to create new forms of social process would be
given to the world at large, and development would be
rapid.
Berners-Lee, Weaving the Web, 1999 (pp. 172–175)
The Order of Social Machines
26. Some Social Machines
SOCIAM: The Theory and Practice of Social Machines is funded by the UK Engineering and Physical Sciences Research Council (EPSRC) under grant
number EP/J017728/1 and comprises the Universities of Southampton, Oxford and Edinburgh. See sociam.org
27. Edwards, P. N., et al. (2013) Knowledge Infrastructures: Intellectual Frameworks and Research
Challenges. Ann Arbor: Deep Blue. http://hdl.handle.net/2027.42/97552
29. Big data elephant versus sense-making network?
The challenge is to foster the co-constituted socio-technical
system on the right, i.e. a computationally enabled sense-making
network of expertise, data, models and narratives.
Iain Buchan
30. Join the W3C Community Group www.w3.org/community/rosc
Jun Zhao
www.researchobject.org
32. Take homes
▶ New forms of data enable us to answer old questions in
new ways and to answer entirely new questions
▶ There are multiple shifts occurring:
– Volumes of data
– Realtime analytics
– Computational infrastructure
– Dataflows vs datasets (and curation infrastructure)
– Correlation vs causation
– Increasing automation
– Machine-to-Machine in Internet of Things