Lilian Edwards is a professor who studies algorithms and their impact on society. She discusses how algorithms are increasingly governing many aspects of society through tasks like predictive profiling, targeted advertising, and content filtering. However, algorithms may not be as neutral and objective as commonly believed since their design and training can embed certain biases. She analyzes legal issues around algorithms and questions whether concepts like fairness can be imposed on proprietary algorithms. Transparency, accountability, and human oversight of algorithms are important topics to consider.
1. Lilian Edwards
Professor of E-Governance
University of Strathclyde
Lilian.edwards@strath.ac.uk
@lilianedwards
Pangloss: http://blogscript.blogspot.co.uk/
Slave to the Algorithm
March 2016
2. Slave to the algo-ri(y)thm
Algorithms are Big News: ATI, Royal Society, CCCAV
Historical, pre-digital notion – everything from knitting patterns to "a set of logical instructions intended to solve a problem" – often different answers depending on variables. (Braun)
3.
4. Machine learning algorithms
Eg machine vision: is that a table or a packing case?
Machine language analysis, eg recognising the number 2
Movements, eg backing up a truck
Typically use a training set of many thousand or million examples – inputs and outputs
A successful output is already defined (eg "yes, that was a 2")
Train the system to recognise inputs that produce the successful output
Apply to new data (see the sketch at the end of this slide)
Usually much preparation ("cleaning") of the data sets used – Gillespie; context of use
Not nearly as automated or neutral a process as it is usually presented
Arguably running the world – "algorithmic governance"
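A minimal sketch of the train-then-apply workflow above, using scikit-learn's bundled handwritten-digit images purely as an illustrative stand-in – the dataset, the logistic-regression model and every parameter here are my own choices, not anything described in the talk:

# Hypothetical illustration only: labelled examples in, a trained
# classifier out, then applied to new data it has never seen.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = load_digits()   # inputs (8x8 pixel images) and outputs (labels 0-9)
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# "A successful output is already defined": each label says,
# in effect, "yes, that was a 2".
model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)            # train on the labelled examples

# "Apply to new data": predict labels for images the model has never seen.
print("predicted:", model.predict(X_test[:5]))
print("actual:   ", y_test[:5])
print("accuracy:", round(model.score(X_test, y_test), 3))

Even in this toy version, most of the real effort sits outside these few lines: collecting, labelling and "cleaning" the examples in the first place.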
5. Algorithm + data = ?
Now the twin sibling to Big Data?
The "key logic governing the flows of information upon which our society depends" (Gillespie, 2013)
"Together data structures and algorithms are two halves of the ontology of the world according to a computer" (Manovich, 1999)
Considerable interest, from coders to business, sociologists, politicians, lawyers; "Governing Algorithms", MIT, 2013; Mayer-Schoenberger & Cukier, Big Data (2013); Morozov, To Save Everything, Click Here (2013); Pariser, The Filter Bubble (2011); Kitchin, The Programmable City
Law – Pasquale, Black Box Society (2015); Edwards
Has change happened because of better algorithms or more data? Esp. in the Internet industries, eg search.
6. Why are algorithms important to society, governance, innovation?
Predictive profiling of persons:
OBA/targeted ads on Google/social networks, etc;
price and service discrimination;
criminal/terrorist profiling -> pre-crime?
future health, obesity, Alzheimer's risks
Non-personal predictions of what is important/significant/popular/profitable:
eg "trending topics" on Twitter;
Google News; top Google results on search by keywords;
automated stock exchanges;
recommendations on Netflix/Amazon etc
Filtering online of unwanted content:
spam algorithms, Google Hell (anti-SEO)
Ideal? Twitter UK anti-women trolling cases, summer 2013: ACPO "They [Twitter] are ingenious people, it can't be beyond their wit to stop these crimes"
"Real world" as well as online effects: algorithms to instruct robots on how to behave adaptively when circumstances change from the original programming; driverless cars liability?
Almost hopelessly wide topic! See *Kohl (2013) 12 IJLIT 187.
8. Please please believe me..
Are algorithms "fair", "neutral", "objective"? Some key themes
Kitchin: algorithms alleged to be "technical, benign and commonsensical", "strictly rational concerns marrying the certainties of mathematics with the objectivity of technology"
Why is "automated" taken to imply "neutral"/"objective"? A game can after all be rigged..
Yet a tendency to regard algorithmic reality as real, eg:
first page of Google search;
Facebook Newsfeed emotional manipulation experiment, June 2014 (merely beta testing?)
curated newsfeeds favouring one or other of polarised political views – the "echo chamber" worry and the Trump/Brexit effect?
Not new: cf "computer says no", early credit scoring errors
Kohl: "the automation-neutrality platitude"
9.
10.
11. Why might algorithms not be neutral?
Revealed unintentional bias, eg Harvard medical admissions
BUT ALSO
Selection of data for the training set, and..
Selection of data as inputs:
How was the data to which the algorithm is applied selected and made "algorithm-ready"? ("translation")
The evaluation of relevance to a successful task (exclusion/inclusion; demotion/promotion; manual intervention)
Complexity and iterative dynamic change: "algorithmic systems are not standalone little boxes but massive networked ones with hundreds of hands"
Overwhelmingly – the black box effect: causality is not shown, merely statistical correlation (a toy sketch follows at the end of this slide)
Legal implications (and remedies):
discrimination in profile-based advertising (Latanya Sweeney);
ranking of legal vs illegal download sites on Google;
adult content ranking, eg not in Amazon top sellers;
unfair competition issues – see also re Google Search results; defamation and autocomplete cases.
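A toy sketch, using entirely invented data and not describing any real system, of the points above about training-set selection and correlation without causation: the same learning algorithm, trained on two differently collected samples of the same population, reaches different decisions about identical applicants.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical population: "ability" drives the true outcome,
# "postcode" group is irrelevant to it.
n = 10_000
ability = rng.normal(size=n)
postcode = rng.integers(0, 2, size=n)
outcome = (ability + rng.normal(scale=0.5, size=n) > 0).astype(int)
X = np.column_stack([ability, postcode])

# Training set A: drawn evenly from the whole population.
idx_a = rng.choice(n, 2_000, replace=False)

# Training set B: postcode group 1 only enters the data when the outcome
# was positive -- a skewed collection process, not a skewed world.
idx_b = rng.choice(np.where((postcode == 0) | (outcome == 1))[0], 2_000, replace=False)

model_a = LogisticRegression(max_iter=1000).fit(X[idx_a], outcome[idx_a])
model_b = LogisticRegression(max_iter=1000).fit(X[idx_b], outcome[idx_b])

# Two identical average-ability applicants, differing only in postcode:
applicants = np.array([[0.0, 0], [0.0, 1]])
print("model A:", model_a.predict_proba(applicants)[:, 1].round(2))
print("model B:", model_b.predict_proba(applicants)[:, 1].round(2))
# Model B favours postcode group 1 purely because of how its training data
# was selected: a statistical correlation presented as if it were a finding.

Nothing in model B's output signals that the difference comes from how the data was collected rather than from the applicants themselves – the black box problem in miniature.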
12. Enslaving the algorithm: (1) competition remedy
Repeated claims that Google manipulates search to demote competitors and promote its own products
Early US case law: Search King v Google (2003) – Google's rankings not challengeable, being "opinion", 1st Amendment protected!
However EU competition regulators, national & Commission, and the FTC in the USA have taken allegations more seriously, eg Foundem (UK), Ciao (EU), ejustice.fr (Fr) – proposed remedies April 2013 – architectural, labelling remedies – fairly minor.
Can a notion of fairness/neutrality/impartiality be reasonably imposed on Google's proprietary algorithm?
Is there any canonical form of Google's front page?
It's Google's game and they make the rules? But it can clearly make or break businesses due to market dominance.
Reliance and trust. Google: "Our users trust our objectivity and no short term goal could ever justify breaching that trust"
Could it ever be "neutral" to suit everyone?
13. Enslaving the algorithm: (2) libel
Algorithmic defamation! Eg the Bettina Wulff case, Germany 2012/3
Google's defences: "The search terms in Google Autocomplete reflect the actual search terms of all users" (the "crowdsourcing defence"). Also: automation; objectivity.
(If Google's rankings are only "opinion", are its autocomplete suggestions not even more so?)
But Wulff won: although Google only liable on notice and failure to remove
But French and UK courts disagree
"Crowdsourced" defence could inspire astroturfing – Morozov suggests competitors could hire Mechanical Turks to diss opponents..
Is there a social interest in making autocomplete too risky to keep turned on?
Is repressing questionable autocompletes a further version of the filter bubble?
14. Example
Try looking up "Tories are", "Labour are", "Lib Dems are"...
"Users who enter 'Labour are' are offered completed terms including '… finished', '… a joke', and '… right wing'. Similarly, entering 'Lib Dems are' offers up '… finished', '… pointless' and '… traitors'. But entering 'Conservatives are' or 'Tories are' offers no search suggestions at all."
Guardian, 2 Feb 2016
Shows a lack of transparency and accountability within the Google algorithm?
15. The algorithm as black box
Example: the Google search algorithm is not just PageRank (counting links – see the toy sketch at the end of this slide) but c. 200 other signals, changed regularly – c. 500-600 times/year – some clues given to the SEO industry
Why accepted as trade secret?
revenue depends on it – key market advantage?
Secrecy prevents rampant gaming/ “SEO”
? Disclosure might disrupt the useful claims of
automation, neutrality, objectivity
Do we have any rights to audit the algorithm? Should
we? Would it help any?
Would it be disastrous for Google to disclose given:
Value comes from the big data not the algorithm?
The algorithm is constantly changed?
Does Google KNOW what its algorithm is doing??
Could DP data subject access rights help??
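Before turning to data protection: a toy power-iteration version of the "counting links" idea behind PageRank mentioned at the top of this slide. The four-page link graph and damping factor are invented for illustration; Google's actual ranking combines this kind of signal with the hundreds of others noted above and is not public.

import numpy as np

# links[i] = pages that page i links to (a made-up graph)
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n, damping = 4, 0.85

rank = np.full(n, 1.0 / n)            # start with equal scores
for _ in range(50):                   # iterate until the scores settle
    new_rank = np.full(n, (1 - damping) / n)
    for page, outlinks in links.items():
        for target in outlinks:
            new_rank[target] += damping * rank[page] / len(outlinks)
    rank = new_rank

print({page: round(score, 3) for page, score in enumerate(rank)})
# Page 2, which attracts the most inbound links, ends up with the top score.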
16. Data Protection Directive
Art 12: "every data subject [has] the right to obtain from the controller.. knowledge of the logic involved in any automatic processing of data concerning him at least in the case of the automated decisions referred to in Article 15(1)"
Art 15(1): every person has the right "not to be subject to a decision which produces legal effects concerning him or significantly affects him and which is based solely on automated processing of data intended to evaluate certain personal aspects relating to him, such as his performance at work, creditworthiness, reliability, conduct, etc."
Rec 41: "any person must be able to exercise the right of access to data relating to him which are being processed, in order to verify in particular the accuracy of the data and the lawfulness of the processing" .. "this right must not adversely affect trade secrets or intellectual property and in particular the copyright protecting the software"
17. Draft DP Regulation (Jan 16)
New Art 15: Rights of access
Right to obtain, where personal data is being processed..
"(h) the existence of automated decision making including profiling [see Art 20] .. and at least in those cases, meaningful information about the logic involved, as well as the significance and envisaged consequences of such processing.."
*Rec 51: "This right should not adversely affect the rights and freedoms of others, including trade secrets or intellectual property… However, the result of these considerations should not be that all information is refused to the data subject…"
18. Final issues to consider
Transparency
Audit
Comprehension (“algorithmists”)
Control
Remedies?
Existing laws – discrimination, employment, health and safety, data protection, liability for algorithms (eg stock market?)
Better consents?
Access to data laws
Access to algorithm?
“Human in the loop”
Human rights infringement? Title to sue?
Second and third party audit – trust seals etc
Architecture control – privacy and equality by design?
cf Privacy by Design literature – PIAs etc
Editor's notes
(Eg, Amazon does not measure “sales rank” of adult books)
2. How far can humans interfere, both overtly and in devising/tweaking the algorithms, before not “automated”? What interventions are justified/neutral?
punish those not playing ball (eg Italian newspapers refusing to be spidered)
How difficult would it be for Google to police this, given they already filter autocomplete suggestions for copyright (since 2012)?