Big and social media data open up new scenarios and opportunities for management research (such as using internal communication data to map knowledge networks inside firms, or using web data to study firm capabilities and strategies). This presentation, given at the British Academy of Management 2014 conference, proposes a typology of such scenarios, describes the skills required to exploit them, and considers implications for the education and training of management researchers.
The profile of the management (data) scientist: Potential scenarios and skills for B/SMD-based Management research
1. The profile of the management (data) scientist: Potential scenarios and skills for B/SMD-based Management research
Juan Mateos-Garcia, Nesta P&R
NEMODE PDW
BAM Conference 9-11 September, 2014
2. Organisational + personal context
• Nesta: The UK’s innovation foundation, with a mission to help people and organisations bring great ideas to life.
Figure III.3: Video game company incorporation across the UK (periods: 1980s or earlier, 1990s, 2000s, 2010s)
• Doing research on data skills for BIS data
capability strategy in partnership with RSS
and Creative Skillset
• Doing some ‘big’ data work myself
• I used to do management research
(CENTRIM).
Draw on all this to reflect on the
implications of big data for management
research, focusing on skills.
3. 1. Definitions
More online activity, digital processes, better hardware.
Larger volumes of data
More varieties of data
Generated at faster velocities
New applications: Data-driven (automated, personalised) products, processes and services. New formats for data communication
5. New opportunities for researchers
• Coverage: Large samples
• Revelation: Make the invisible
visible, reveal preferences, run
experiments.
• Granularity: High level of
resolution (temporal +
dimensional).
• Cheap! £££
6. 3. MOR examples
I looked at the abstracts of 103 papers in the last three issues of [1] AOMJ, [2] BJM, [3] Management Science. No ‘big data’ papers in [1] and [2]. 11 in MS (8 in a ‘Business Analytics’ special issue).
Authors | Data source | Topic
Aral + Walker | Facebook (Proprietary) | Use RCTs to study social influence. Large samples and high levels of granularity allow them to consider how social influence interacts with tie embeddedness and tie strength.
Bao + Datta | SEC (Open) | Use unsupervised learning to identify and quantify risk types in ~14,000 annual reports, benchmark them against other methods for classification, and develop an interactive platform to explore the findings.
Ghose + Han | App Store + Google Play (Open) | Scrape App Store and Google Play data to create a sales panel they use to estimate consumer demand and how it is affected by app features, including pricing model.
Tambe | LinkedIn (Proprietary) | Quantify business big data capabilities and measure inter-company recruitment networks to estimate inter-company skill investment spillovers.
7. Technical skills required, or the profile of the management data scientist
Get data: Web scraping/API programming skills
Run experiments: Experimental designs
Manage and process the data: Database management
Clean the data: ‘wrangling’ (and patience).
Initial visualisation: Exploratory data analysis
Dimension reduction: Cluster analysis, PCA.
Model selection, estimation, evaluation: Econometrics/statistics/machine learning
Display findings visually + interactively: Data visualisation
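The early stages of this list (get, clean, explore) can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the records, the field names and the helper functions `clean`, `decade` and `summarise` are hypothetical and do not come from the presentation, and a real project would obtain the raw data from a scraper or API rather than an inline string.

```python
import json
from collections import Counter

# Hypothetical raw records, standing in for the output of a scraper or API call.
# The firms and years are made up for illustration only.
RAW = json.dumps([
    {"name": " Alpha Games ", "incorporated": "1994"},
    {"name": "beta studios", "incorporated": "2003"},
    {"name": "Gamma Ltd", "incorporated": "n/a"},       # unusable year
    {"name": "alpha  games", "incorporated": "1994"},   # duplicate once cleaned
    {"name": "Delta Interactive", "incorporated": "2011"},
])


def clean(records):
    """Normalise names, parse years, and drop unusable rows and duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        name = " ".join(rec["name"].split()).lower()
        year_text = rec["incorporated"].strip()
        if not year_text.isdigit():
            continue  # skip rows without a usable year
        key = (name, int(year_text))
        if key in seen:
            continue  # skip exact duplicates after normalisation
        seen.add(key)
        cleaned.append({"name": name, "year": int(year_text)})
    return cleaned


def decade(year):
    """Label a year with its decade, e.g. 1994 -> '1990s'."""
    return f"{year // 10 * 10}s"


def summarise(cleaned):
    """Count records per decade as a first exploratory summary."""
    return Counter(decade(rec["year"]) for rec in cleaned)


records = clean(json.loads(RAW))
counts = summarise(records)
for label in sorted(counts):
    print(label, counts[label])
```

Most of the code deals with cleaning rather than analysis, which is typical of the ‘wrangling’ stage.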
Data Pipeline: Access data -> Model data -> Present findings
8. Challenges (not all technical)
Obtain proprietary data
Manage anonymity and ethical issues (including experimental research, cf. Facebook’s infamous RCT).
Ask the right questions: “The best dimension
reduction tool that there is.”
Be careful with biases: N = All? Rarely. It is
important to understand the (administrative and
organisational) processes that generated the data.
Deal with the false positives that are bound to happen with large samples and multiple tests.
Encourage consilience through reproducibility and by relating findings to wider bodies of knowledge.
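The false-positive challenge listed above can be made concrete with a small simulation. The sketch below is an illustration rather than anything shown in the presentation: it draws 1,000 p-values for tests in which the null hypothesis is true by construction, then compares naive thresholding with two standard corrections, Bonferroni and Benjamini-Hochberg. The function names, the seed and the number of tests are assumptions chosen for the example.

```python
import random


def bonferroni(p_values, alpha=0.05):
    """Return indices rejected when each test is held to alpha / m."""
    m = len(p_values)
    return [i for i, p in enumerate(p_values) if p <= alpha / m]


def benjamini_hochberg(p_values, alpha=0.05):
    """Return indices rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    # Find the largest rank k whose p-value satisfies p_(k) <= k * alpha / m.
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            cutoff = rank
    # Reject every hypothesis ranked at or below that cutoff.
    return sorted(order[:cutoff])


rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulate 1,000 tests where the null hypothesis is true in every case:
# p-values are then uniformly distributed on [0, 1].
null_p = [rng.random() for _ in range(1000)]

naive = [i for i, p in enumerate(null_p) if p <= 0.05]
print("naive rejections:", len(naive))
print("Bonferroni rejections:", len(bonferroni(null_p)))
print("Benjamini-Hochberg rejections:", len(benjamini_hochberg(null_p)))
```

With no real effects in the data, every naive rejection is a false positive, and roughly 5% of the tests are expected to pass the uncorrected threshold. Bonferroni controls the chance of any false positive, while Benjamini-Hochberg controls the expected share of false discoveries and is less conservative.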
Data Pipeline: Access data -> Model data -> Present findings
Requires theory and domain knowledge
9. Institutional solutions
• People with technical skills and domain
knowledge are rare -> Unicorns.
• Supply push + Demand pull to increase
MOR big data capabilities.
• Internal dialogue within the discipline
and with other disciplines (Computer
Science, Information Systems)
• Acknowledge big data limitations for looking at important issues (power, perceptions, structural change).
10. THANK YOU
Juan.mateos-garcia@nesta.org.uk
@JMateosGarcia