Allow multiple groups to access shared resources while guaranteeing each group a dedicated share of those resources
Spark makes building a proof of concept with a subset of data relatively easy.
Every connection in the previous slide can transmit sensitive data!
Input data transmitted via broadcast variables
Intermediate data exchanged during shuffles
Data in serialized tasks, files uploaded with the job
How to prevent other users from seeing this data?
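One possible starting point, sketched under assumptions: Spark exposes configuration properties for authenticating connections and encrypting data in transit and on local disk. The snippet below collects three of them (`spark.authenticate`, `spark.network.crypto.enabled`, `spark.io.encryption.enabled`) and renders them as `spark-submit` flags. The helper `to_submit_args` and the script name `my_job.py` are hypothetical, and which settings a given cluster actually needs depends on its deployment mode and Spark version, so treat this as an illustration rather than a complete hardening recipe.

```python
# Sketch only: Spark security properties relevant to the data paths listed above.
# Whether each one is sufficient depends on the cluster manager and Spark version.
SECURE_CONF = {
    # Require a shared secret before Spark processes accept connections.
    "spark.authenticate": "true",
    # Encrypt RPC and block-transfer traffic (tasks, broadcasts, shuffle fetches).
    # This setting requires spark.authenticate to be enabled.
    "spark.network.crypto.enabled": "true",
    # Encrypt temporary data Spark writes to local disk (shuffle files, spills).
    "spark.io.encryption.enabled": "true",
}


def to_submit_args(conf):
    """Turn a dict of properties into a flat list of spark-submit --conf flags."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # my_job.py is a placeholder for the application being submitted.
    print(" ".join(["spark-submit"] + to_submit_args(SECURE_CONF) + ["my_job.py"]))
```

The same properties could instead be placed in `spark-defaults.conf`. Files uploaded with the job may travel through a separate channel (for example a distributed file system), which these settings alone may not cover.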