The presentation deals with ethical issues in several currently widely used machine learning (AI) technologies and algorithms. Each ML application is described in detail: its current state of the art, its specific challenges, and its ethical problems. Current solutions from academic and industrial perspectives are presented. The presentation draws on a mixture of academic and applied sources and aims to be of interest to both students and practitioners.
Ethical Issues in Machine Learning Algorithms. (Part 3)
1. Ethical Issues in
Machine Learning Algorithms
(Part 3)
IEEE Young Professionals Bulgaria,
Vladimir Kanchev, PhD
1
2. Introduction
2
Dr. Kim (2018, May 31). Human ethics for artificial intelligent
beings. An Ethics Scary Tale. Retrieved from
https://aistrategyblog.com/category/utilitarianism/
3. Contents
1. Advances in Data Science (DS) and
Machine learning (ML) fields.
2. Ethics and ethical issues.
3. Current legislation. GDPR.
4. ML data bias, algorithmic bias, and
interpretability issues.
5. Ongoing academic research problems.
3
4. Recent ML ethical issues
Fields of application:
bias in face recognition systems
gender-biased results and chatbot issues in NLP
credit score computation
user profiling and personalization
4
5. Bias in face recognition systems
5
https://bit.ly/2ygssbo
6. Face recognition
Def: a biometric software application capable of
uniquely identifying or verifying a person by
comparing and analyzing patterns based on the
person's facial contours.
https://www.techopedia.com/definition/32071/facial-recognition
6
7. Face recognition
developed, commercialized biometric technology;
can be found on mobile phones
widely used by law enforcement agencies in the USA
and China
non-contact, non-invasive technology
very high accuracy
depends on lighting; can be tricked by make-up
and glasses
7
9. Face recognition algorithms
Wang, Mei, and Weihong Deng. "Deep face recognition: A
survey." arXiv preprint arXiv:1804.06655 (2018).
9
10. Bias in face recognition systems
Bias appears in face recognition systems because
of the use of:
older algorithms
features related to facial features, such as color
racial-biased datasets
deep learning classifiers
10
11. Consequences
inefficiency of video surveillance systems in public
city areas
increased privacy concerns caused by video
surveillance systems
lower accuracy for African-American and Asian men
and women; innocent Black suspects come under
police scrutiny
a major lag in mass implementation and acceptance
of the technology
11
14. Bias cases
Use of machine learning to detect features of the
human face, associated with criminality*:
Some research on the problem in the past, now
abandoned (Cesare Lombroso).
Wu and Zhang (2016)* trained a few classifiers with
two classes – criminal faces (from ID photos) and
non-criminal faces (from their professional pages).
In effect, the authors constructed a smile detector**
– with over 90% accuracy.
Wu, Xiaolin, and Xi Zhang. "Automated inference on criminality using face
images." arXiv preprint arXiv:1611.04135 (2016): 4038-4052.
14
15. Bias cases
15
Wu, Xiaolin, and Xi Zhang. "Automated inference on criminality using face
images." arXiv preprint arXiv:1611.04135 (2016): 4038-4052.
https://bit.ly/2O07D8o
16. Detection of sexual orientation
by face recognition
16
Detection of whether people are gay or straight based
on their photos – 81% accuracy (men) and
74% (women); human judges: 61% and 54%, respectively.
The underlying theory claims that sexual orientation
results from exposure to certain hormones.
Gay men have narrower jaws, longer noses, and larger
foreheads than straight men, while gay women
have larger jaws and smaller foreheads.
Use of a sample of 35 thousand images from a US
dating site; no people of color, no transgender or bisexual people.
Use of a deep learning classifier.
Wang, Yilun, and Michal Kosinski. "Deep neural networks are more accurate
than humans at detecting sexual orientation from facial images." Journal of
Personality and Social Psychology 114.2 (2018).
17. Detection of sexual orientation
by face recognition
Wang, Yilun, and Michal Kosinski. "Deep neural networks are more accurate
than humans at detecting sexual orientation from facial images." Journal of
Personality and Social Psychology 114.2 (2018).
17
18. Dealing with bias
How bias can be prevented:
make training datasets more diverse
add extra face-detection steps and set more
sensitive classifier parameters
do not allow users to search for terms such as
gorilla, chimpanzee, or monkey (Google Photos service)
18
19. Recent ML ethical issues
Fields of application:
bias in face recognition systems
gender-biased results and chatbot issues in NLP
credit score computation
user profiling and personalization
19
21. Ethical issues in NLP
Natural language processing (NLP):
Def: is an AI branch that deals with analyzing,
understanding and generating the languages
that humans use naturally in order
to interface with computers using natural
human languages.
Challenge: Ambiguity of human language.
https://www.webopedia.com/TERM/N/NLP.html
21
22. Human language and NLP
But human language is also:
proxy for human behavior
a sign of membership in a certain group
always context-specific – is related to and
depends on a specific situation, time and place
22
23. Current tasks of NLP
automatic summarization
translation
named entity recognition
part-of-speech tagging
sentiment analysis
speech recognition
topic segmentation
question answering
23
24. Major approaches in NLP
Until the 1980s, hand-written rules; after that,
statistical machine learning came into use.
During the 2010s, DL neural networks and
representation learning; state-of-the-art results.
Now use of word embeddings to capture the
semantic properties of words; increased end-to-end
learning.
24
25. Word embeddings
Word embeddings (WE) is a model that maps
English words to high-dimensional vectors of numbers.
WE:
is trained on a large body of text (corpus) –
e.g., word2vec.
correlates semantic similarity with spatial proximity
– "Man is to Woman as Brother is to Sister".
uses cosine distance to calculate the similarity
between vectors.
is characterized by a social bias, shown by the
Word Embedding Association Test (WEAT).
25
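The cosine-similarity idea on this slide can be sketched in a few lines. The vectors below are tiny hand-made stand-ins for real trained embeddings (real word2vec vectors have hundreds of dimensions), chosen so the "Man is to Woman as Brother is to Sister" offset holds:

```python
import numpy as np

def cosine_similarity(u, v):
    # cosine of the angle between two embedding vectors
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# toy 3-d "embeddings" (illustrative only)
man     = np.array([0.9, 0.1, 0.2])
woman   = np.array([0.8, 0.3, 0.2])
brother = np.array([0.7, 0.1, 0.6])
sister  = np.array([0.6, 0.3, 0.6])

# "Man is to Woman as Brother is to Sister":
# the offset man - woman should align with brother - sister
print(cosine_similarity(man - woman, brother - sister))  # ~1.0
```

In a trained model, semantically parallel word pairs produce nearly parallel difference vectors, which is exactly what WEAT (next slide) exploits to measure bias.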
26. Word embeddings bias
WEAT says that:
Male names and pronouns were closer to words
about career, while female ones were closer to
concepts like homemaking and family.
Young people’s names were closer to pleasant
words, while old people’s names were closer
to unpleasant words.
Male names were closer to words about math and
science, while female names were closer to the arts.
https://bit.ly/2IJXMDP
26
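The WEAT statistic behind these findings can be sketched as follows. The effect-size formula follows the standard WEAT definition (difference of mean associations, normalized by the pooled standard deviation); the word sets and 2-d vectors are toy stand-ins constructed for illustration, not trained embeddings:

```python
import numpy as np

def cos(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    # s(w, A, B): mean similarity of w to attribute set A minus set B
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    # difference of mean associations of the two target sets X and Y,
    # normalized by the pooled standard deviation over X ∪ Y
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    return (np.mean(s_x) - np.mean(s_y)) / np.std(s_x + s_y, ddof=1)

# toy vectors chosen so "male" targets lean toward the career word
# and "female" targets toward the family word (illustration only)
male   = [np.array([1.0, 0.1]), np.array([0.9, 0.2])]
female = [np.array([0.1, 1.0]), np.array([0.2, 0.9])]
career = [np.array([1.0, 0.0])]
family = [np.array([0.0, 1.0])]

print(weat_effect_size(male, female, career, family))  # positive: bias found
```

A positive effect size means the first target set (male names) is more strongly associated with the first attribute set (career) – the pattern WEAT reported on real embeddings.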
27. Word embeddings bias
Examples:
Man: Woman as King: Queen
Man: Computer_Programmer as Woman:
Homemaker
Father: Doctor as Mother: Nurse
Word embeddings can reflect gender, ethnicity,
age, sexual orientation and other biases of texts used
to train the model.
Bolukbasi, Tolga, et al. "Man is to computer programmer as woman is to
homemaker? Debiasing word embeddings." Advances in Neural Information
Processing Systems. 2016.
27
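Analogies such as these are computed by vector arithmetic and a nearest-neighbor lookup. A minimal sketch, again with hand-made toy vectors standing in for a trained vocabulary:

```python
import numpy as np

def nearest(query, vocab):
    # return the word whose embedding has the highest cosine
    # similarity to the query vector
    def cos(u, v):
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return max(vocab, key=lambda w: cos(vocab[w], query))

# toy embeddings (illustrative; real models learn these from a corpus)
vocab = {
    "man":   np.array([1.0, 0.0, 0.2]),
    "woman": np.array([0.0, 1.0, 0.2]),
    "king":  np.array([1.0, 0.0, 0.9]),
    "queen": np.array([0.0, 1.0, 0.9]),
}

# the classic analogy: king - man + woman ≈ queen
result = nearest(vocab["king"] - vocab["man"] + vocab["woman"], vocab)
print(result)
```

The same arithmetic, applied to biased embeddings, is what yields "computer programmer − man + woman ≈ homemaker".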
28. Word embeddings bias
Bolukbasi, Tolga, et al. "Man is to computer programmer as woman is to
homemaker? debiasing word embeddings." Advances in Neural Information
Processing Systems. 2016
28
29. Word embeddings bias
Bolukbasi, Tolga, et al. "Man is to computer programmer as woman is to
homemaker? debiasing word embeddings." Advances in Neural Information
Processing Systems. 2016
29
30. Debiasing word embeddings
Algorithm for debiasing:
1. Learn word embeddings from a text corpus –
obtain vectors for words.
2. Identify bias direction.
Compute the differences between the vectors of gendered word pairs
– he and she, male and female – and average them.
3. Neutralize words that are not gender-specific –
e.g., doctor.
4. Equalize pairs such as girl–boy, grandfather–
grandmother.
Bolukbasi, Tolga, et al. "Man is to computer programmer as woman is to homemaker?
debiasing word embeddings." Advances in Neural Information Processing Systems.
2016.
30
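Steps 2 and 3 of the debiasing algorithm can be sketched with toy vectors. The numbers are illustrative (real embeddings are trained), and where the paper averages several definitional pairs, this sketch uses a single he–she pair:

```python
import numpy as np

def unit(v):
    return v / np.linalg.norm(v)

# step 2: identify the bias direction from a definitional pair
# (the paper averages several pairs: he-she, male-female, ...)
he  = unit(np.array([0.8, 0.3, 0.1]))
she = unit(np.array([0.3, 0.8, 0.1]))
g = unit(he - she)

# step 3: neutralize a word that should be gender-neutral
doctor = unit(np.array([0.7, 0.4, 0.6]))
doctor_debiased = doctor - np.dot(doctor, g) * g

# after neutralization, the projection onto the bias direction is zero
print(float(np.dot(doctor_debiased, g)))
```

Step 4 (equalizing) then adjusts legitimately gendered pairs such as girl–boy so they differ only along g and are equidistant from every neutralized word.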
31. Discussion
Blind application of word embeddings can amplify
gender biases present in the data.
Word embeddings reflect biases present in
society.
There are similar biases related to race, ethnicity, and
cultural groups.
The focus here is on word embeddings in the English
language.
Bolukbasi, Tolga, et al. "Man is to computer programmer as woman is to homemaker?
debiasing word embeddings." Advances in Neural Information Processing Systems.
2016.
31
33. Chatbots
Chatbot (conversational agent):
Def: is a computer program or an artificial intelligence which
conducts a conversation via auditory or textual
methods*.
Conversational user interface:
Def: A CI is a hybrid UI that interacts with users combining
chat, voice or any other natural language interface
with graphical UI elements like buttons, images,
menus, videos, etc.**
They are also related to Turing test.
*"What is a chatbot?" techtarget.com. Retrieved 30 January 2017
**https://bit.ly/2CMP8On
33
34. Types of chatbots
Basic bots – use pre-written keywords
ELIZA (1966), Alice (1995)
Text-based Assistant
Facebook M (2015), Google Allo (2016), Slack‘s slackbot
Voice Assistant
Google Assistant (2016), Apple Siri (2011), Google Now
(2012), Amazon Alexa (2014), Microsoft Cortana (2014)
A lot of specialized text-based assistants
customer support bots, news bots, entertainment bots, etc.
34
37. Current situation
Chatbots are good for repetitive, well-defined
(common questions & answers) tasks and scenarios.
They attract a lot of industry interest as a way to reduce
labor expenses.
They are integrated in websites, enterprise systems,
etc.
There are a lot of chatbot development platforms on
the market.
It is hard to reach production quality; often humans
are frustrated when dealing with chatbots.
37
39. Application of chatbots
Chatbots can replace FAQ sections.
They are used in customer service operations,
automatic emailing. Straightforward problems are
solved by chatbots, more complicated ones – by humans.
By using them, customer support agents improve
the shopping process and personalize it.
They have a better response rate than
human support agents.
39
40. Current DS approaches
Gathering data from users – sex, age, habits; thus
aiming to achieve personalization.
Use of large datasets and reinforcement learning.
Aiming to build a feeling of trust and empathy with
human users – use of sentiment analysis.
Extracting intent (the purpose) and entity (object,
context for intent) from user input.
Deciding on the next best action in a conversation
using DL (RNN, LSTM) and the input, training data, and
conversation history (Rasa chatbot).
40
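The intent/entity extraction mentioned above can be illustrated with a deliberately simple rule-based sketch. The intent names, patterns, and example utterances here are hypothetical; real assistants (e.g., Rasa) train statistical classifiers for this instead of hand-written regexes:

```python
import re

# hypothetical intent patterns for illustration only
INTENTS = {
    "book_flight": re.compile(r"\b(book|reserve)\b.*\bflight\b"),
    "check_weather": re.compile(r"\bweather\b"),
}
# hypothetical entity pattern: destination after the word "to"
CITY = re.compile(r"\bto (\w+)\b")

def parse(utterance):
    # return (intent, entity) extracted from a user utterance
    text = utterance.lower()
    intent = next((name for name, pat in INTENTS.items()
                   if pat.search(text)), None)
    m = CITY.search(text)
    return intent, (m.group(1) if m else None)

print(parse("Please book a flight to Paris"))   # ('book_flight', 'paris')
print(parse("What's the weather like today?"))  # ('check_weather', None)
```

The extracted (intent, entity) pair is then what the dialogue policy – e.g., an RNN/LSTM over the conversation history – consumes to pick the next action.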
42. Ethical issues
User information gathered for personalization of
chatbots – data privacy issues
Training chatbots with obscenities and extremist
view data - bias through interaction (MS chatbot
Tay)
Algorithmic NLP bias in chatbots
42
43. Solving ethical issues
Filtering political topics of conversation (as MS
chatbot Zo – the heir of chatbot Tay).
Training with data of diverse topics, encouraging a
diverse set of real users.
Building a diverse team of developers – of a
technical and non-technical background.
Applying a bias-tracking system for developers –
more control, followed by black-box testing of the
ML algorithm.
Providing more transparency of ML algorithms (as in
an open-source community).
https://bit.ly/2Hn6AKH
https://bit.ly/2v4L0XH
43
44. Recent ML ethical issues
Fields of application:
bias in face recognition systems
gender-biased results and chatbot issues in NLP
credit score computation
user profiling and personalization
44
46. Credit score
Credit score is a numeric expression measuring a
person's or company's creditworthiness.
Banks use it to make decisions on credit
applications.
It depends on credit history.
It indicates how dependable an individual or a
company is.
46
47. Scorecard algorithm
Scorecard:
Def: a standard, easy-to-understand credit-scoring
algorithm. A binary problem:
1st class – default – a customer fails to pay installments.
2nd class – a customer pays regular installments for
a given time period.
It consists of:
building and training a statistical or a ML model.
applying the chosen model to assign a score to
every credit application.
47
48. Scorecard algorithm
Use of ML algorithms such as logistic regression, random
forests, boosting, neural networks, and generalized
additive models
Use of the area under the curve (AUC) from ROC
analysis and Gini coefficients for model evaluation
The data should be comprehensive – allowing few
missing values, and including as many data points
as possible from the financial records of customers
and their payment history
48
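The AUC and Gini metrics above can be computed directly from model scores. A minimal sketch using the rank-based formulation of AUC (the probability that a random defaulter receives a higher risk score than a random non-defaulter); the scores below are made-up toy outputs, not a real model:

```python
import numpy as np

def auc(y_true, scores):
    # rank-based AUC: fraction of (defaulter, non-defaulter) pairs
    # where the defaulter's risk score is higher (ties count half)
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    pos = scores[y_true == 1]   # class 1: default
    neg = scores[y_true == 0]   # class 2: pays on time
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# toy risk scores from a hypothetical scorecard model
y = [1, 1, 1, 0, 0, 0]
s = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

a = auc(y, s)
print("AUC:", a, "Gini:", 2 * a - 1)  # Gini coefficient = 2*AUC - 1
```

AUC of 0.5 means random ranking and 1.0 perfect separation; the Gini coefficient rescales this to the 0–1 range commonly quoted in credit-risk reporting.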
52. Current DS issues
52
Customers with no credit history need to be assigned
to predefined groups.
Wide introduction of automated credit scoring aims
at more efficient markets and low-cost
financial services, but introduces algorithmic bias.
Incomplete data can negatively influence the
accuracy of the final results.
54. Ethical issues
protection of personal data - necessary for credit
score calculation
explainability and transparency of the used ML
algorithm
introduction of bias – danger of discrimination for
ethnic minorities by implicit correlation
lack of accuracy, objectivity, and accountability of
credit score computation
54
55. Solving ethical issues
use of interpretable ML algorithms/models
preparation of training data samples to avoid bias
protection of personal data against breaches
through anonymization
training all employees to work with ML algorithms
and know their biases
continuous human supervision of ML algorithms
auditability of AI algorithms
55
56. Recent ML ethical issues
Fields of application:
bias in face recognition systems
gender-biased results and chatbot issues in NLP
credit score computation
user profiling and personalization
56
58. User profiling
A user profile:
Def: is a set of information representing a user via user-related
rules, settings, needs, interests, behaviors, and
preferences*.
Personalization:
Def: a process to change the functionality, information content
or distinctiveness of a system to increase its personal
relevance to an individual**.
S. Henczel (2004). Creating user profiles to improve information quality,
Factiva, 28(3), p. 30.
J. Blom (2000). Personalization-a taxonomy, Conference on Human
Factors in Computing Systems, pp. 313-314.
58
59. User profiling methods
A user profile aims to provide a personalized
service – matching users' requirements, preferences,
and needs with the service delivery.
Approaches to retrieving information about the user:
Explicit method – information is provided
explicitly by the user – static profiling.
Implicit method – analyzes user‘s behavior
pattern to determine user‘s interest – dynamic user
profiling
Hybrid method – a combination of both methods.
59
60. User profiling methods
Content-Based Method – assumes the user
behaves the same way under the same
circumstances.
Vector-space model, Latent Semantic Indexing,
Learning Information Agents, Neural Network Agents …
Collaborative method - assumes that users who
belong to the same group behave similarly.
Memory-Based and Model-Based
Hybrid method – a combination of both methods.
60
61. Current challenges
Generation of an initial user profile for a new user
Continuous update of the profile information to
adapt to user‘s changing preferences, interests and
needs – data drift
Changing regulations to protect user‘s data – GDPR
legislation
61
62. Recommender systems
62
Aim to predict users' interests, recommend items, and
increase companies' sales and revenues.
Use item characteristics (keywords,
categories) and user information (preferences, profiles, etc.);
need a lot of data for training.
Use item-to-item and user-to-user
recommendations to train the RS.
Reduce the feature space by matrix factorization
(SVD) and DL; use injected randomness or
exploration–exploitation to avoid overfitting.
https://bit.ly/2GbUHbV
65. Content personalization
Def: delivering the right message to the right visitor
at the right time.
Main purposes:
to increase visitor engagement
to improve customer experience
to increase conversion rates
to increase customer acquisition
65
https://bit.ly/2XRlYaO
68. Ethical issues
privacy issues during user data gathering
underrepresentation of minorities, societal bias
construction of bubbles around users, political
debates within echo chambers
objectivity of search results (Google) is impaired
due to user profiling and corporate politics
68
69. Solving ethical issues
Transparency of personalization ML algorithms –
users should know how they work and have an
option to change them.
Ensuring interactivity - opportunity to provide
correction actions, when biases are spotted by
users.
Robustness of the ML system against manipulation
- against rumors and false information.
Fast reaction to ethically compromised input.
69
70. Discussion
an ongoing topic of research and public debate among
researchers, practitioners, and general users
a major obstacle to the introduction of many ML
systems
a lack of a standardized set of algorithms for solving
them or for debiasing; only general approaches exist
What do you think is the most important ethical
issue related to the mentioned (or other) ML
technologies?
70
Banksy "tagging robot" – a new street piece (Coney Island, NYC, 2013) on the wall of a former convenience store devastated by Hurricane Sandy. His residency in NYC was named 'Better Out Than In', during which he produced a series of popular graffiti.
The barcode is 13274125 – the DNA code for Homo sapiens.
Banksy is an anonymous British graffiti writer.
Now let's move on to specific ethical issues related to Data Science and Machine Learning algorithms. Some current cases will be shown with their corresponding ethical issues. A list of descriptions will be given for two main sources of ethical issues.
Face recognition replaces eyewitnesses, who are notoriously unreliable.
https://www.media.mit.edu/projects/gender-shades/overview/
MIT project
Cesare Lombroso was an Italian physician and psychiatrist.
His 1876 book Criminal Man argued some people were born criminals - it claimed they were ‘atavistic’, or throwbacks to a primitive stage of evolution. Lombroso believed ‘primitiveness’ could be read from the bodies and habits of such born criminals - for instance, facial features, body type, … .
Make training datasets more diverse with Asian faces; get faces of celebrities on the internet in order to build large training datasets of faces; predominantly white celebrities…
Different accuracy milestones for smile detection, according to race (Google paper*). Their system detects gender first, then race, and finally smiles; danger of gender and race profiling.
*Ryu, Hee Jung, Margaret Mitchell, and Hartwig Adam. "Improving Smiling Detection with Race and Gender Diversity." arXiv preprint arXiv:1712.00193 (2017).
Automatic summarization
Produce a readable summary of a chunk of text. Often used to provide summaries of text of a known type, such as articles in the financial section of a newspaper.
(Machine) translation
Automatically translate text from one human language to another.
Named entity recognition
Given a stream of text, determine which items in the text map to proper names, such as people or places, and what the type of each such name is (e.g. person, location, organization).
Part-of-speech tagging
Given a sentence, determine the part of speech for each word
Sentiment analysis
Extract subjective information usually from a set of documents, often using online reviews to determine "polarity" about specific objects.
Speech recognition
Given a sound clip of a person or people speaking, determine the textual representation of the speech.
Topic segmentation
Given a chunk of text, separate it into segments each of which is devoted to a topic, and identify the topic of the segment.
Question answering
Given a human-language question, determine its answer. Typical questions have a specific right answer (such as "What is the capital of Canada?"), but sometimes open-ended questions are also considered (such as "What is the meaning of life?").
ML algorithms – hidden Markov models, decision trees;
here, use of word alignment and language modeling;
DL algorithms – sequence-to-sequence transformation.
Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words. Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space. Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located in close proximity to one another in the space.
https://en.wikipedia.org/wiki/Word2vec
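The "linguistic contexts" word2vec reconstructs come from (center, context) pairs extracted with a sliding window. A minimal sketch of the pair-generation step in the skip-gram variant (the toy sentence and window size are illustrative; training the shallow network itself is omitted):

```python
# build skip-gram (center, context) training pairs with window size 2 -
# the raw input that word2vec's shallow two-layer network is trained on
corpus = "the quick brown fox jumps over the lazy dog".split()
window = 2

pairs = []
for i, center in enumerate(corpus):
    # every word within `window` positions of the center is a context word
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            pairs.append((center, corpus[j]))

print(pairs[:4])
```

Because words sharing many context words produce similar training pairs, their learned vectors end up close together, which is exactly the proximity property described above.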
What these results show is that the text is loaded with historical inequality. Embeddings measure the similarity between two words by how often they occur near one another. If most doctors historically have been male, for instance, then words like doctor would appear near male names more often, and would be associated with those names. The standard concern is that the machine might reproduce this inequality: for instance, a résumé-screening algorithm that naïvely used word embeddings to measure how "professional" or "career-oriented" a candidate was might unfairly discriminate against female candidates, simply on the basis of their names.
A projection of word embeddings. The x-axis is parallel to v_he − v_she; the y-axis measures the strength of the gender association.
Chatbots are algorithmic conversational agents which companies are coming up with to interact with their customers
Consider ATMs. When they emerged, many balked at the idea of feeding their money into a machine and preferred to interact with a bank teller. Now, ATMs are the norm, and, unlike tellers, they're available day and night to handle transactions.
Turing test
A criterion of intelligence – the ability of a computer program to communicate with a human judge in such a way that the human cannot distinguish it from a real human.
Basic bots
Inputs for basic chatbots are rather limited. The design of the interface is basic, allowing for basic commands and basic inputs.
Text-based Assistant
The other type of conversational interface is through typing. This is the one you usually experience when you interact with a chatbot. You simply type words to provide the input. Depending on the quality of your input, the chatbot provides you with an answer. The library of tools for building this type of chatbot is more extensive.
Alice – uses heuristic pattern-matching rules (some online versions use a hidden human) – unable to pass the Turing test.
Voice Assistant
While basic bots and text based assistants leverage images and video to convey their message, voice assistants have the difficulty of only relying on voice. While
voice is sufficient for some use cases like re-ordering a frequently purchased item, voice is not a good interface for examining a new product or picking an item
from a menu.
Criteria to evaluate chatbots:
notable skills and flaws, orientation (limitation) to a specific technology, level of humanity, number of supported languages, level of personalization.
Empathy
Empathy is the capability of understanding or feeling what another person is experiencing from within her frame of reference, i.e., the ability to place oneself in the other person’s position
The common objective behind machine learning and traditional statistical learning tools is to learn from data. Both approaches aim to investigate the underlying relationships by using a training dataset. Typically, statistical learning methods assume formal relationships between variables in the form of mathematical equations, while machine learning methods can learn from data without requiring any rules-based programming.
https://www.moodysanalytics.com/risk-perspectives-magazine/managing-disruption/spotlight/machine-learning-challenges-lessons-and-opportunities-in-credit-risk-modeling
Loans that are past due for more than 90 days can be classified as default as per the Basel II definition (Basel Committee on Banking Supervision, 2004)
Companies will have an incentive to alter customers' creditworthiness according to the stage of the economic cycle.
Introduction of bias through the use of alternative data – danger of discrimination against ethnic minorities via implicit correlation with other characteristics.
Use of historical data to build credit scores; lack of historical data for ML models and algorithms.
Implicit method
Here, the accuracy of the user profile depends on the amount of generated data through user-system interaction.
In implicit personalization, information about the user for user profiles is gathered implicitly (e.g. click streams, scrolling, saving). Therefore, the user is unaware of the information gathering process.
In explicit personalization, on the other hand, user profile information is gathered via direct involvement with the user (e.g. questionnaires, ratings and feedback forms). Here, the user is aware of the information gathering process.
In implicit personalization, the accuracy improves with the continuous use of the system by the user. In explicit personalization, on the other hand, accuracy of the personalized information is based on manually provided information that is updated by the user.
D. Kelly and J. Teevan (2003). Implicit feedback for inferring user preference: a bibliography, ACM Special Interest Group on Information Retrieval (SIGIR) forum, 37(2), pp. 18-28
Content-Based Method
User's current behaviour is predicted from his past behaviour. In this scheme, user profiles are represented similarly to queries, and the system selects the items that have high content correlation with the user profile. Content dependence is the main drawback of content-based filtering. Therefore, this method performs badly if the item's content is very limited and cannot be analysed easily by content-based filtering.
D. Godoy and A. Amandi (2005). User profiling in personal information agents: a survey, The Knowledge Engineering Review Journal, 20(4), pp. 329-361
Collaborative method
The collaborative method is based on the rating patterns of similar users. In this method, people with similar rating patterns – in other words, people with similar taste – are referred to as 'like-minded people' [2]. Unlike the content-based method, the collaborative method ignores the item's content and recommends items based only on similar users' item ratings.
Two main drawbacks: sparsity is the situation when there is a lack of available ratings, caused by an insufficient number of users or very few ratings per user. Moreover, the first-rater problem, also referred to as the cold-start problem, can be observed when a new user has a deficient number of ratings.
Memory-based and Model-based methods:
Memory-based and model-based techniques enable users to filter the received information according to the ratings, i.e., the feedback given by the like-minded users of the system. In these techniques, the user can be provided recommendations from categories not previously declared as interesting or relevant by the user, but which have received high ratings from users with similar tastes. In these techniques, the user's profile is a set of ratings that the user has given to a selection of items from the system database.
Hybrid (filtering) method
A hybrid method, also referred to as a hybrid filtering method, uses content-based and collaborative methods to combine the advantages and overcome the limitations of both. This method guarantees the immediate availability of a profile for each user. A system that employs the hybrid method provides a more accurate description of the user's interests and preferences, as it continuously monitors and retrieves user-related information through the user-system interaction [1]. Generally, the hybrid method assigns a new user a default profile with the use of the collaborative method, and further enhances the profile using the content-based method.
User privacy
Selling information to third parties for profit, privacy breaches, disclosure of personal information.