Watch talk ➟ http://bit.ly/1NKPpQh
Eduardo Ariño de la Rubia, VP of Product and Data Scientist in Residence at Domino Data Lab, talks about how to manage conflict in growing data science teams.
Data Science Popup Austin: Conflict in Growing Data Science Organizations
1. DATA
SCIENCE
POP UP
AUSTIN
Design of Conflict Management
Systems in Data Science
Eduardo Ariño de la Rubia
earino
VP of Product & Data Scientist in Residence,
Domino Data Lab
3. Oh The Conflicts You’ll
Face
Conflict in Growing
Data Science Organizations
4. A Quick Introduction
● Eduardo Ariño de la Rubia
● VP of Product & Data Scientist in Residence at Domino Data Lab
● Computer programmer for… too long
● HPC (PVM & MPI), ML since the mid 90s
● Husband, father, dog owner
● I share too much on twitter (@earino)
5. conflict
noun: conflict; plural noun: conflicts /känˌflikt/
an incompatibility between two or more opinions, principles, or interests.
7. Theories of Conflict
1. Individual Characteristics
2. Social Process
3. Social Structure
4. Formal Theories
8. Individual Characteristic Theories
These theories focus on understanding individual aggression,
and see such aggression as the source of conflict.
Conflict resolution focuses on containing or redirecting
aggressive tendencies.
Examples:
1. That data scientist is hard to work with
2. Steve hates it when Laura has a better answer
9. Social Process Theories
Social process theories treat conflict and conflict resolution as
processes which cannot be explained entirely in terms of either
individual behavior, or social structures.
Social process theorists may focus on such issues as patterns
of conflict escalation, the role of conflict in society, or the
relation between conflict and competition.
Examples:
1. PhDs in Physics are trained to be difficult
2. That department is just ornery
10. Social Structure Theories
These theories view the social organization as the main
source of conflict. Class divisions, racial or ethnic divisions
or sex divisions form the basis for social conflict.
Such theories recommend one of five basic approaches to
conflict resolution: avoidance, acceptance, gradual social
reform, nonviolent confrontation, or violent confrontation.
Examples:
1. The marketing department refuses to share their data
2. Sales is always making promises we can’t keep
11. Formal Theories
Formal theories attempt to explain conflict by use of logical or mathematical
models. Formal models are both powerful and flexible, but can be difficult to
understand and apply.
Examples:
1. We could have predicted the data lake would become a swamp, it’s a tragedy of the commons
2. We’re just going to go tit-for-tat until one of us defects
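The tit-for-tat example on this slide can be made concrete with a small iterated prisoner's dilemma. This is only a sketch of the kind of formal model the slide alludes to, not something shown in the talk: the payoff values, the function names (`tit_for_tat`, `always_defect`, `play`) and the round count are all illustrative assumptions.

```python
# Illustrative payoffs for a prisoner's dilemma (assumed values, not from the talk).
# Key: (my_move, their_move) -> my payoff. "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def tit_for_tat(opponent_history):
    """Cooperate on the first round, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Defect on every round, regardless of what the opponent did."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play both strategies against each other and return their total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy only sees the other side's past moves.
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


if __name__ == "__main__":
    print("tit-for-tat vs tit-for-tat:", play(tit_for_tat, tit_for_tat))
    print("tit-for-tat vs always-defect:", play(tit_for_tat, always_defect))
```

Under these assumed payoffs, two tit-for-tat players keep cooperating, while a single defection locks both sides into mutual defection for the rest of the game, which is one way to read "until one of us defects".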
13. Evolution of Data Science in an Organization
Company X has a problem.
Company X realizes that they’re losing out on business, their customers seem to
be one step ahead of them, and that they need to be smarter.
Company X realizes that the way to get ahead of this is by hiring someone who
will help them curate their data, gain insights, and come up with ways to enhance
their offering using data.
Company X is going to hire a data scientist.
15. Phase #1 - Single Data Scientist
● Probably the most productive way of running a data science department
Conflicts:
● Why does this data scientist get unfettered access to our data?
● This data scientist doesn’t understand DEPARTMENT_X and is
misrepresenting the state of things
● How do we know we can “trust” these models?
Conflict Types:
● Usually individual characteristics
16. Phase #2 - Get the Data Scientist Some Help
● The data scientist gets help from either a Junior Data Scientist (rare) or more
likely a Data Engineer
Conflict:
● Usually doesn’t actually speed up anything since now the data scientist takes
on larger problems and
● It turns out managing someone and coming up with good tasks for them to do
isn’t a trivial task (who knew?)
Conflict Types:
● Usually individual characteristics, sometimes social structure.
17. Phase #3 - A Data Scientist in Every Pot
● This is probably the most common place organizations stop.
● Every product team / department gets their own data scientist (LinkedIn)
Conflicts
● Why are we doing so much “redundant work”?
● Managing feedback loops and complexity
Read: Machine Learning: The High Interest Credit Card of Technical Debt
Conflict Types
● Lots of social process stuff: teams become tribes, etc.
18. A Short Pitch
The complexity of a DS department grows quite a bit beyond this point. You need
a series of conventions. It’s hard to keep experiments straight, there are wacky
feedback loops, the world gets hard. There are 3 principles you should follow:
1. Focus on interests
2. Build in feedback loops
3. Consultation before, feedback after
In short, either build tooling that supports these conflict resolution principles at
every turn, or use a platform that supports this.
20. Phase #4 - Why aren’t these Data Scientists under IT
● A relatively rare power play, but I have seen it happen in “pure tech”
businesses such as SaaS, apps, and games.
● Sometimes also “let’s just put Data Science under BI”
Conflicts
● Data science is not the same thing as product engineering / BI
● Engineering management is poorly calibrated for EDA, feature engineering,
amorphous poorly specified goals, etc…
● You will be forced to use Agile
Conflict Types
● Pretty much everything.
22. 10 Assumptions of Agile for Software (there’s more)
1. Teams stay together over time
2. People are specializing generalists
3. People are engaged and motivated
4. Teams deliver products
5. Projects come to teams
6. Teams are loosely coupled to the organization
7. Teams have minimal external dependencies
8. Fully engaged customers
9. Established architecture and processes
10. There are clearly understood goals and metrics
23. How do those assumptions stack up for DS?
1. Teams stay together over time
2. People are specializing generalists
3. People are engaged and motivated
4. Teams deliver products (well, maybe?)
5. Projects come to teams
6. Teams are loosely coupled to the organization
7. Teams have minimal external dependencies
8. Fully engaged customers
9. Established architecture and processes
10. There are clearly understood goals and metrics
25. Phase #5 - COE / Internal Consulting Model
● Probably the most successful model I have seen
● Data science reports to data science leadership, but individual data scientists
are deployed to project teams
● Very flexible
Conflicts
● Constant fight for resources (now you’re just another department)
● Challenging to invest time and effort to learn specifics of silos in the business
Conflict Types
● Social structure
26. Conclusion
1. Try to understand what is driving conflict in your organization
2. Apply 3 principles of dispute resolution
a. Interests first
b. Build in feedback loops
c. Consultation before, feedback after
3. Build / Use tooling which allows you to formalize conflict resolution processes,
so that each time is not an ad-hoc adventure
4. Be data informed, not data driven. Remember that sampling a data
generating process creates bias.
5. Figure out where you are in your organization’s development and where you
want to be.
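One possible reading of point 4 is that the way data gets collected can bias what you observe. The sketch below illustrates that reading with a made-up example: the data generating process (normally distributed satisfaction scores), the response rule (only customers scoring above 4 answer the survey) and the function name `simulate` are all hypothetical, chosen only to show the effect.

```python
import random


def simulate(n=100_000, seed=0):
    """Compare the mean of a full population with the mean of a selected sample."""
    rng = random.Random(seed)

    # Hypothetical data generating process: satisfaction scores ~ Normal(5, 2).
    population = [rng.gauss(5.0, 2.0) for _ in range(n)]

    # Hypothetical collection rule: only customers scoring above 4 respond,
    # so the observed data is a non-random slice of the process.
    observed = [score for score in population if score > 4.0]

    true_mean = sum(population) / len(population)
    observed_mean = sum(observed) / len(observed)
    return true_mean, observed_mean


if __name__ == "__main__":
    true_mean, observed_mean = simulate()
    print(f"mean of the full process:    {true_mean:.2f}")
    print(f"mean of the observed sample: {observed_mean:.2f}")
```

In this toy setup the observed mean sits noticeably above the true mean, even though every individual record is accurate. A team that is purely data driven would act on the inflated number; a data informed team would also ask how the sample came to exist.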