Development of Explainable NLP Models: "You show me the man and I will show you the rule"
SVITLANA GALESHCHUK
About me
• Data Scientist: 6 years
• In NLP: 3 years
• Fulbright Scholar in 2015-2016, USA
• Visiting Associate Prof. at University of Grenoble, France, 2017
• Data Scientist, Lecturer and Researcher at PSL/University of Paris Dauphine, France, since 2017
• Data Scientist at Starclay Consulting, France, since 2019
• Email: svitlana.galeshchuk@gmail.com
Nov 5, 2020, UA Online Data Science Marathon
I. What is NLP?
NLP: Natural Language "Processing" =
NLU: Natural Language "Understanding" (sentiment analysis, topic classification, entity detection) +
NLG: Natural Language "Generation" (textual summaries, etc.)
Word Embedding
NLP: Natural Language Processing
• 2001: Neural language models: word embedding
> converting words into vectors
Bengio, Y., Ducharme, R., & Vincent, P. A neural probabilistic language model. Advances in Neural Information Processing Systems 13, 932–938 (2001)
• 2013: Word2vec model: linguistic contextualisation of words
> predict a word from its context
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems, 3111-3119 (2013)
• 2018: Google's revolutionary BERT model
> Bidirectional Encoder Representations from Transformers
[1] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. Attention is all you need. Advances in Neural Information Processing Systems, 5998-6008 (2017)
[2] Devlin, J., et al. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
BERT
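The Word2vec idea of predicting a word from its context can be illustrated by the training pairs it is built on. This is a minimal sketch of the CBOW variant only (Word2vec also has a skip-gram variant that predicts the context from the word); the function name `cbow_pairs`, the window size and the example sentence are hypothetical and not taken from the talk.

```python
def cbow_pairs(tokens, window=2):
    """Return (context_words, target_word) pairs in the CBOW style.

    For each position, the context is made of up to `window` words on
    each side of the target word.
    """
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        context = left + right
        if context:  # skip the degenerate one-word sentence
            pairs.append((context, target))
    return pairs


sentence = "you show me the man".split()
pairs = cbow_pairs(sentence, window=1)
for context, target in pairs:
    print(context, "->", target)
```

A real Word2vec model would then train a shallow network on such pairs so that words appearing in similar contexts end up with similar vectors.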
To retain:
• Text is a sequence of words;
• Words are discrete values, hence the curse of dimensionality;
• Embedding (converting words into vectors) is the way to use text in ML;
• The autoregressive nature of natural language often leads ML practitioners to use LSTMs in NLP tasks;
• BERT, a major breakthrough since 2018, is difficult to put into production; it works well for texts of up to 512 tokens.
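To make the contrast between discrete words and embeddings concrete, here is a minimal sketch, not taken from the talk. The vocabulary, the 4-dimensional vectors and all helper names are hypothetical, and the random vectors are only stand-ins for what a trained model such as Word2vec would learn.

```python
import math
import random


def build_vocab(tokens):
    """Map each distinct token to an integer id, in order of first appearance."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def one_hot(index, size):
    """Discrete encoding: one dimension per vocabulary word."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def make_embeddings(vocab_size, dim, seed=0):
    """Dense lookup table; random here, learned in a real model."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


tokens = "you show me the man".split()
vocab = build_vocab(tokens)

# One-hot vectors grow with the vocabulary and are all mutually orthogonal.
print(cosine(one_hot(vocab["you"], len(vocab)), one_hot(vocab["man"], len(vocab))))

# Embeddings keep a fixed, small dimension whatever the vocabulary size.
table = make_embeddings(len(vocab), dim=4)
vectors = [table[vocab[t]] for t in tokens]
print(len(vectors), len(vectors[0]))
```

With one-hot vectors every pair of distinct words is equally dissimilar, whereas a trained embedding table can place related words close to each other.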
II. Explainable AI
Explainable AI Methods
LIME
LIME Intuition
Ribeiro, M. T., Singh, S., & Guestrin, C. "Why should I trust you?" Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135-1144 (2016)
LIME Disadvantages
• LIME delivers only local explanations
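The LIME intuition (perturb one input, query the black box, fit a proximity-weighted linear surrogate) can be sketched as below, which also shows why the result is only a local explanation of that one input. This is a simplified illustration and not the `lime` or `eli5` implementation: the distance measure, the kernel width, the toy black-box `toy_model` and the example sentence are all assumptions made for the sketch.

```python
import numpy as np


def lime_text_sketch(words, predict_proba, n_samples=500, kernel_width=0.75, seed=0):
    """Fit a weighted linear surrogate around a single text instance."""
    rng = np.random.default_rng(seed)
    d = len(words)
    # Each row is a perturbed copy of the text: 1 keeps the word, 0 removes it.
    masks = rng.integers(0, 2, size=(n_samples, d))
    masks[0] = 1  # always include the unperturbed instance
    texts = [" ".join(w for w, keep in zip(words, m) if keep) for m in masks]
    y = np.array([predict_proba(t) for t in texts])
    # Proximity weight: perturbations that remove fewer words count more.
    distance = 1.0 - masks.mean(axis=1)
    weights = np.exp(-(distance ** 2) / kernel_width ** 2)
    # Weighted least squares with an intercept column.
    X = np.hstack([masks, np.ones((n_samples, 1))])
    sw = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(X * sw, y * sw[:, 0], rcond=None)
    return dict(zip(words, coef[:-1]))


def toy_model(text):
    """Hypothetical black box returning a positive-class probability."""
    toks = text.split()
    return 0.2 + 0.7 * ("great" in toks) + 0.05 * ("really" in toks)


words = ["the", "service", "was", "really", "great"]
explanation = lime_text_sketch(words, toy_model)
for word, weight in sorted(explanation.items(), key=lambda kv: -abs(kv[1])):
    print(f"{word:>8s} {weight:+.3f}")
```

The returned weights describe the model's behaviour around this sentence only; a different sentence needs a new surrogate.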
Used Data
Medical records (14,000 patients):
• Comments on the patient's clinical history, lifestyle and symptoms (features)
• Reason for hospitalisation (features)
• Principal diagnosis (target)
Use Case: Hospital Data
Stroke (863 patients):
« Women who have had a hysterectomy, especially before the age of 35, have a higher risk of having a stroke. About 70,000 hysterectomies are performed each year in France »
Ingelsson, E., Lundholm, C., Johansson, A. L., & Altman, D. Hysterectomy and risk of cardiovascular disease: a population-based cohort study. European heart journal, 32(6), 745-750. (2011)
Laughlin-Tommaso, S. K., Khan, Z., Weaver, A. L., Smith, C. Y., Rocca, W. A., & Stewart, E. A. Cardiovascular and metabolic morbidity after hysterectomy with ovarian conservation: a cohort study. Menopause (New York, NY), 25(5), 483. (2018)
Low back pain (1040 observations):
« When the lipoma is located in the thoracic region, it can be responsible for chronic back pain and sometimes headaches »
Coumbaras, M., A. Duval, P. Le Hir, N. Jomaah, L. Arrivé, and J. M. Tubiana. "Fibrolipome du filum terminal." J Radiol 84. 721-7222 (2003)
SHAP
SHAP: both local and global explainability
« the Shapley value: It is the average of the marginal contributions across all permutations »
« What Shapley does is quantifying the contribution that each player brings to the game. What SHAP does is quantifying the contribution that each feature brings to the prediction made by the model »
Lundberg, S. M., & Lee, S.-I. "A unified approach to interpreting model predictions." Advances in Neural Information Processing Systems. 2017.
SHAP Local Results
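The definition quoted above, the average of marginal contributions across all permutations, can be computed directly for a small game. The sketch below is an exact brute-force calculation on a hypothetical three-player game (`toy_value`); it is not how the `shap` library computes its values, which relies on faster approximations.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical payoff: 'a' and 'b' contribute alone and interact; 'c' adds nothing."""
    payoff = 0.0
    if "a" in coalition:
        payoff += 10.0
    if "b" in coalition:
        payoff += 20.0
    if "a" in coalition and "b" in coalition:
        payoff += 30.0
    return payoff


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)  # {'a': 25.0, 'b': 35.0, 'c': 0.0}
```

The interaction bonus of 30 is split equally between "a" and "b", the player that never changes the payoff gets zero, and the values sum to the payoff of the full coalition. In SHAP the players are features and the payoff is the model prediction.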
SHAP Disadvantages
• LIME and SHAP can be fooled into generating innocuous explanations which do not reflect the underlying biases (Slack et al., 2020);
• SHAP takes a long time to compute. For large datasets, it is computationally expensive to use the entire dataset and we have to rely on approximations (e.g., subsampling the data). This has implications for the accuracy of the explanation;
• The original SHAP implementation has visualization issues when the text contains more than 20 words.
Slack, Dylan, et al. "Fooling LIME and SHAP: Adversarial attacks on post hoc explanation methods." Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society. 2020.
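The trade-off between cost and accuracy can be seen on the Shapley computation itself: enumerating all orderings of n features costs n! evaluations, so practical methods sample. The sketch below approximates Shapley values from random permutations; note that this is a different approximation from the data subsampling mentioned above, and the function name, the eight-player game and the sample size are hypothetical.

```python
import random


def sampled_shapley(players, value, n_permutations=200, seed=0):
    """Monte Carlo estimate of Shapley values from random orderings."""
    rng = random.Random(seed)
    players = list(players)
    totals = {p: 0.0 for p in players}
    for _ in range(n_permutations):
        order = players[:]
        rng.shuffle(order)
        coalition = frozenset()
        previous = value(coalition)
        for p in order:
            coalition = coalition | {p}
            current = value(coalition)
            totals[p] += current - previous
            previous = current
    return {p: total / n_permutations for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical symmetric game: the payoff is the squared coalition size."""
    return float(len(coalition) ** 2)


players = list("abcdefgh")  # 8! = 40320 orderings if enumerated exactly
estimate = sampled_shapley(players, toy_value, n_permutations=200)
for p in players:
    print(p, round(estimate[p], 2))  # the exact value is 8.0 for every player
```

The estimates always sum to the full payoff, but each individual value carries sampling noise that shrinks only as more permutations are drawn, which is the accuracy cost of approximating.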
INTEGRATED GRADIENTS
Sundararajan, M., Taly, A., & Yan, Q. (2017). Axiomatic attribution for deep networks. arXiv preprint arXiv:1703.01365.
Mudrakarta, Pramod Kaushik, et al. "Did the model understand the question?" arXiv preprint arXiv:1805.05492 (2018):
« As the input varies along the straight line path between the baseline and the input at hand, the prediction moves along a trajectory from uncertainty to certainty (the final prediction probability). At each point on this trajectory, one can use the gradient with respect to the input features to attribute the change in the prediction probability back to the input features. IG aggregates these gradients along the trajectory using a path integral »
• applicable to all differentiable models;
• easy to implement;
• computationally scalable to massive deep networks;
• much faster than a naive Shapley-value-based method
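The path integral described in the quotation can be approximated with a Riemann sum over points on the straight line from the baseline to the input. The sketch below does this for a hypothetical three-feature logistic model with an analytic gradient; the weights, the all-zero baseline and the number of steps are assumptions chosen for illustration, and a real network would use automatic differentiation (for example through Captum).

```python
import math

WEIGHTS = [2.0, -1.0, 0.5]  # hypothetical model parameters
BIAS = -0.5


def model(x):
    """Toy differentiable model: logistic regression probability."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x):
    """Analytic gradient of the probability with respect to each input."""
    p = model(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def integrated_gradients(x, baseline, steps=200):
    """Midpoint Riemann approximation of the IG path integral."""
    diffs = [xi - bi for xi, bi in zip(x, baseline)]
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * d for bi, d in zip(baseline, diffs)]
        for i, g in enumerate(gradient(point)):
            avg_grad[i] += g / steps
    # Attribution = (input - baseline) * average gradient along the path.
    return [d * g for d, g in zip(diffs, avg_grad)]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline)
print(attributions)
print(sum(attributions), model(x) - model(baseline))
```

The two printed totals agree closely: the attributions sum to the change in prediction between baseline and input, which is the completeness axiom from Sundararajan et al.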
IG Disadvantages
• Does not work with non-differentiable model types (random forest, etc.);
• Some counterintuitive explanations
DeepLIFT
DeepLIFT proceeds in a backward fashion. Each unit is assigned an attribution that represents the relative effect of the unit activated at the original network input x compared to the activation at some reference input. Reference values for all hidden units are determined by running a forward pass through the network, using the baseline as input, and recording the activation of each unit.
Pros: very fast
Cons: requires picking the baseline inputs
Gabriel Tseng
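The difference-from-reference idea can be shown on the smallest possible network, a single ReLU unit, using the rescale rule from the DeepLIFT paper. This is a hand-written sketch with hypothetical weights and reference input, not the Captum implementation; a full network would apply the same rule layer by layer in one backward pass.

```python
def relu(z):
    return max(0.0, z)


def deeplift_single_relu(x, reference, weights, bias):
    """Rescale-rule contributions for the unit y = relu(w . x + b)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    z_ref = sum(w * ri for w, ri in zip(weights, reference)) + bias
    delta_z = z - z_ref
    delta_out = relu(z) - relu(z_ref)
    if abs(delta_z) > 1e-12:
        # Multiplier: how much of the pre-activation change survives the ReLU.
        multiplier = delta_out / delta_z
    else:
        # Fall back to the gradient when input and reference coincide.
        multiplier = 1.0 if z > 0 else 0.0
    return [multiplier * w * (xi - ri) for w, xi, ri in zip(weights, x, reference)]


weights = [3.0, -1.0]   # hypothetical parameters
bias = -0.5
x = [1.0, 2.0]
reference = [0.0, 0.0]  # the baseline input that has to be chosen

contributions = deeplift_single_relu(x, reference, weights, bias)
print(contributions)  # [1.5, -1.0]
```

The contributions sum to 0.5, the difference between the output at the input and at the reference. The unit is inactive at the reference, so a plain gradient taken there would be zero, while DeepLIFT still assigns non-zero contributions; the result does depend on the chosen reference.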
Literature
Integrated Gradients: Mukund Sundararajan, Ankur Taly, Qiqi Yan, Axiomatic Attribution for Deep Networks, 2017
DeepLIFT: Avanti Shrikumar, Peyton Greenside, Anshul Kundaje, Learning Important Features Through Propagating Activation Differences, 2017
SHAP values: Scott M. Lundberg, Su-In Lee, A Unified Approach to Interpreting Model Predictions, 2017
LIME: Ribeiro, M. T., Singh, S., & Guestrin, C., "Why should I trust you?" Explaining the predictions of any classifier, 2016
To retain:
• Explaining the outputs of black-box models is an important step towards bridging the gap between the model and its end user;
• Explainable AI methods may deliver global and/or local interpretations;
• Most of the current approaches are based on cooperative game theory;
• Interpretations are usually validated by field experts. Kullback-Leibler divergence is sometimes used to assess the interpretations;
• Python implementations:
SHAP: https://github.com/slundberg/shap
LIME: https://eli5.readthedocs.io/en/latest/overview.html
IG, DeepLIFT: https://captum.ai/
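The slides do not say how Kullback-Leibler divergence is applied to interpretations. One possible reading, assumed here, is to normalize the absolute attribution scores of each word into a probability distribution and compare it with a reference distribution such as importance weights assigned by a field expert. The scores and helper names below are hypothetical.

```python
import math


def normalize(scores, eps=1e-12):
    """Turn attribution scores into a probability distribution over words."""
    magnitudes = [abs(s) + eps for s in scores]  # eps avoids zero probabilities
    total = sum(magnitudes)
    return [m / total for m in magnitudes]


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same words."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


expert = normalize([0.7, 0.2, 0.1])    # hypothetical expert importance
method_a = normalize([0.6, 0.3, 0.1])  # attributions close to the expert
method_b = normalize([0.1, 0.2, 0.7])  # attributions far from the expert

print(kl_divergence(expert, method_a))
print(kl_divergence(expert, method_b))
```

A lower divergence means the explanation distributes importance more like the reference does; it is zero only when the two distributions coincide.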
Keynote by Prof. Wurzer at Nordex about IP-design
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptxCulture Uniformity or Diversity IN SOCIOLOGY.pptx
Culture Uniformity or Diversity IN SOCIOLOGY.pptx
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)
 

Svitlana Galeshchuk: Development of Explainable NLP Models: "You show me the man and I will show you the rule"

  • 2. About me: Data Scientist for 6 years; in NLP for 3 years; Fulbright Scholar, USA, 2015-2016; Visiting Associate Prof. at University of Grenoble, France, 2017; Data Scientist, Lecturer and Researcher at PSL/University of Paris Dauphine, France, since 2017; Data Scientist at Starclay Consulting, France, since 2019; Email: svitlana.galeshchuk@gmail.com. Nov 5, 2020, UA Online Data Science Marathon
  • 3. I. What is NLP? NLP (Natural Language "Processing") = NLU (Natural Language "Understanding": sentiment analysis, topic classification, entity detection) + NLG (Natural Language "Generation": textual summaries, etc.)
  • 4. Word Embedding. NLP: Natural Language Processing. 2001: neural language models and word embedding > converting the words into vectors. Bengio, Y., Ducharme, R., & Vincent, P. A neural probabilistic language model. Advances in Neural Information Processing Systems 13, 932-938 (2001). 2013: the Word2vec model, linguistic contextualisation of words > predict the word based on the context. Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems, 3111-3119 (2013)
  • 5. BERT. 2018: Google's revolutionary BERT model > Bidirectional Encoder Representations from Transformers. [1] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. Attention is all you need. Advances in Neural Information Processing Systems, 5998-6008 (2017). [2] Devlin, Jacob, et al. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
  • 6. To retain: text is a set of words; words are discrete values, hence the curse of dimensionality; embedding (converting words into vectors) is the way to use text in ML; the autoregressive nature of natural language often leads ML practitioners to use LSTMs in NLP tasks; BERT, a major breakthrough (2018, building on the 2017 Transformer), is difficult to put into production and suits texts of up to 512 tokens.
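The contrast between discrete word identities and dense vectors can be illustrated with a minimal sketch. The vocabulary, the embedding size and the random table below are assumptions made for illustration only; in practice the table is learned (e.g. by Word2vec or BERT), not drawn at random.

```python
import random

# Hypothetical toy vocabulary; a real one has tens of thousands of entries.
VOCAB = ["patient", "stroke", "pain", "lipoma"]
WORD_TO_ID = {word: idx for idx, word in enumerate(VOCAB)}


def one_hot(word):
    """Discrete representation: one dimension per vocabulary word."""
    vec = [0.0] * len(VOCAB)
    vec[WORD_TO_ID[word]] = 1.0
    return vec


# Dense representation: a lookup table with a small, fixed dimension.
# Random values stand in for learned weights (assumption for the sketch).
EMBED_DIM = 3
_rng = random.Random(0)
EMBEDDING_TABLE = [
    [_rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for _ in VOCAB
]


def embed(word):
    """Return the dense vector stored for a known word."""
    return EMBEDDING_TABLE[WORD_TO_ID[word]]


def embed_text(text):
    """Embed every known word of a whitespace-tokenised text."""
    return [embed(tok) for tok in text.lower().split() if tok in WORD_TO_ID]
```

The one-hot vector grows with the vocabulary, while the embedding keeps the same width however many words are added, which is the point of the "curse of dimensionality" remark.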
  • 9. LIME: LIME Intuition. Ribeiro, M. T., Singh, S., & Guestrin, C. "Why should I trust you?" Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135-1144 (2016)
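The LIME intuition from the cited paper can be sketched roughly as follows: remove random subsets of words, query the black-box model on each perturbed text, and fit a proximity-weighted linear surrogate whose coefficients serve as word importances. This is a simplified sketch, not the `lime` or ELI5 implementation; `black_box`, the kernel choice and all parameter values are hypothetical.

```python
import numpy as np


def black_box(tokens):
    """Hypothetical classifier: probability of 'stroke' given the words."""
    score = -1.5
    if "hysterectomy" in tokens:
        score += 2.0
    if "headache" in tokens:
        score += 1.0
    return 1.0 / (1.0 + np.exp(-score))


def lime_text(tokens, predict, n_samples=500, kernel_width=0.75, seed=0):
    """Fit a locally weighted linear surrogate over word-presence masks.

    Assumes the tokens are unique, since they are used as dictionary keys.
    """
    rng = np.random.default_rng(seed)
    masks = rng.integers(0, 2, size=(n_samples, len(tokens)))
    masks[0] = 1  # keep the unperturbed text among the samples
    preds = np.array(
        [predict([t for t, keep in zip(tokens, row) if keep]) for row in masks]
    )
    # Proximity: the more words removed, the lower the sample weight.
    distance = 1.0 - masks.mean(axis=1)
    weights = np.exp(-(distance ** 2) / kernel_width ** 2)
    # Weighted least squares with an intercept column.
    design = np.hstack([np.ones((n_samples, 1)), masks])
    root_w = np.sqrt(weights)[:, None]
    coef, *_ = np.linalg.lstsq(design * root_w, preds * root_w[:, 0], rcond=None)
    return dict(zip(tokens, coef[1:]))


explanation = lime_text(
    ["patient", "had", "hysterectomy", "and", "headache"], black_box
)
```

In this toy setting the surrogate assigns the largest coefficient to "hysterectomy" and near-zero coefficients to the filler words, which is the kind of local explanation LIME is meant to give.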
  • 11. Use Case: Hospital Data. Used data: medical records (14,000 patients); comments on the patient's clinical history, lifestyle and symptoms (features); reason for hospitalisation (feature); principal diagnosis (target)
  • 12. Stroke (863 patients): «Women who have had a hysterectomy, especially before the age of 35, have a higher risk of having a stroke. About 70,000 hysterectomies are performed each year in France». Ingelsson, E., Lundholm, C., Johansson, A. L., & Altman, D. Hysterectomy and risk of cardiovascular disease: a population-based cohort study. European Heart Journal, 32(6), 745-750 (2011). Laughlin-Tommaso, S. K., Khan, Z., Weaver, A. L., Smith, C. Y., Rocca, W. A., & Stewart, E. A. Cardiovascular and metabolic morbidity after hysterectomy with ovarian conservation: a cohort study. Menopause (New York, NY), 25(5), 483 (2018)
  • 13. Low back pain (1040 observations): «When the lipoma is located in the thoracic region, it can be responsible for chronic back pain and sometimes headaches». Coumbaras, M., Duval, A., Le Hir, P., Jomaah, N., Arrivé, L., & Tubiana, J. M. "Fibrolipome du filum terminal." J Radiol 84, 721-7222 (2003)
  • 14. SHAP: both local and global explainability. «the Shapley value: It is the average of the marginal contributions across all permutations» «What Shapley does is quantifying the contribution that each player brings to the game. What SHAP does is quantifying the contribution that each feature brings to the prediction made by the model» Lundberg, Scott M., and Su-In Lee. "A unified approach to interpreting model predictions." Advances in Neural Information Processing Systems (2017). Figure: SHAP local results
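The definition quoted on this slide (the average of the marginal contributions across all permutations) can be computed exactly for a handful of features. The sketch below does that by brute force; `model_output` and its scores are hypothetical, and the SHAP library relies on approximations rather than this exhaustive enumeration, which is only feasible for very few features.

```python
from itertools import permutations


def shapley_values(players, value):
    """Average each player's marginal contribution over all orderings."""
    totals = {player: 0.0 for player in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for player in order:
            extended = coalition | {player}
            totals[player] += value(extended) - value(coalition)
            coalition = extended
    return {player: total / len(orderings) for player, total in totals.items()}


def model_output(present_words):
    """Hypothetical model score given the set of words kept in the text."""
    score = 0.0
    if "hysterectomy" in present_words:
        score += 0.4
    if "headache" in present_words:
        score += 0.2
    if "hysterectomy" in present_words and "headache" in present_words:
        score += 0.1  # interaction term, split evenly by the Shapley value
    return score


words = ["hysterectomy", "headache", "and"]
phi = shapley_values(words, model_output)
```

The contributions sum to the difference between the score with all words and the score with none, which is the property that lets SHAP decompose a single prediction into per-feature parts.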
  • 15. SHAP Disadvantages: adversarial classifiers can fool LIME and SHAP into generating innocuous explanations which do not reflect the underlying biases (Slack, Dylan, et al. "Fooling LIME and SHAP: Adversarial attacks on post hoc explanation methods." Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 2020); it takes a long time to compute: for large datasets it is computationally expensive to use the entire dataset, so we have to rely on approximations (e.g., subsampling the data), which has implications for the accuracy of the explanation; the original SHAP implementation has issues with visualization when the text contains more than 20 words.
  • 16. INTEGRATED GRADIENTS. Sundararajan, M., Taly, A., & Yan, Q. Axiomatic attribution for deep networks. arXiv preprint arXiv:1703.01365 (2017). Mudrakarta, Pramod Kaushik, et al. "Did the model understand the question?" arXiv preprint arXiv:1805.05492 (2018): «As the input varies along the straight line path between the baseline and the input at hand, the prediction moves along a trajectory from uncertainty to certainty (the final prediction probability). At each point on this trajectory, one can use the gradient with respect to the input features to attribute the change in the prediction probability back to the input features. IG aggregates these gradients along the trajectory using a path integral» Properties: apt for all differentiable models; easy to implement; computationally scalable to massive deep networks; much faster than a naive Shapley-value-based method
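The path integral described in the quote can be approximated with a Riemann sum of gradients along the straight line from the baseline to the input. The sketch below does this for a hypothetical logistic model with hand-picked weights and an analytic gradient; for real networks a library such as Captum obtains the gradients by automatic differentiation.

```python
import math

# Hypothetical differentiable model: logistic regression on three features.
WEIGHTS = [2.0, -1.0, 0.5]
BIAS = -0.5


def model(x):
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def gradient(x):
    """Analytic gradient of the sigmoid output with respect to the inputs."""
    p = model(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def integrated_gradients(x, baseline, steps=200):
    """Midpoint Riemann approximation of the IG path integral."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point)):
            avg_grad[i] += g / steps
    # Scale the averaged gradient by the distance travelled per feature.
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


x_input = [1.0, 1.0, 1.0]
x_baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(x_input, x_baseline)
```

The attributions sum, up to the discretisation error, to `model(x_input) - model(x_baseline)`, the completeness axiom from the Sundararajan et al. paper. The choice of an all-zero baseline is itself an assumption and affects the result.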
  • 17. IG Disadvantages: does not work with non-differentiable model types (random forest, etc.); some explanations are counterintuitive
  • 18. DeepLIFT: DeepLIFT proceeds in a backward fashion. Each unit is assigned an attribution that represents the relative effect of the unit activated at the original network input x compared to the activation at some reference input. Reference values for all hidden units are determined by running a forward pass through the network, using the baseline as input, and recording the activation of each unit. Pros: very fast. Cons: picking the baseline inputs. (Gabriel Tseng)
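The backward pass described on this slide can be sketched on a tiny network with one ReLU hidden layer. The sketch assumes the Rescale rule from the DeepLIFT paper (the multiplier of a nonlinearity is its change in output divided by its change in input, both relative to the reference); the weights, input and baseline are hypothetical.

```python
def relu(v):
    return max(0.0, v)


# Hypothetical network: 2 inputs -> 2 ReLU hidden units -> 1 linear output.
W1 = [[1.0, -2.0], [0.5, 1.0]]  # one row per hidden unit
B1 = [0.0, -1.0]
W2 = [1.0, -1.5]
B2 = 0.2


def hidden_pre(x):
    """Pre-activations of the hidden units."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, B1)]


def forward(x):
    return sum(w * relu(z) for w, z in zip(W2, hidden_pre(x))) + B2


def deeplift_rescale(x, baseline):
    """Attribute forward(x) - forward(baseline) to the input features."""
    z, z_ref = hidden_pre(x), hidden_pre(baseline)  # reference forward pass
    multipliers = []
    for zj, zrj in zip(z, z_ref):
        if abs(zj - zrj) > 1e-12:
            multipliers.append((relu(zj) - relu(zrj)) / (zj - zrj))
        else:
            multipliers.append(1.0 if zj > 0 else 0.0)  # fall back to gradient
    contributions = []
    for i in range(len(x)):
        # Chain the multipliers backwards: output <- hidden unit <- input i.
        m_i = sum(W2[j] * multipliers[j] * W1[j][i] for j in range(len(W2)))
        contributions.append((x[i] - baseline[i]) * m_i)
    return contributions


x_in = [2.0, 0.5]
x_ref = [0.0, 0.0]
contribs = deeplift_rescale(x_in, x_ref)
```

Only one reference forward pass and one backward sweep are needed, which is why the method is fast; the contributions sum to the difference between the output at the input and at the baseline, so the result depends directly on the baseline chosen.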
  • 19. Literature: Integrated Gradients: Mukund Sundararajan, Ankur Taly, Qiqi Yan, Axiomatic Attribution for Deep Networks, 2017; DeepLIFT: Avanti Shrikumar, Peyton Greenside, Anshul Kundaje, Learning Important Features Through Propagating Activation Differences, 2017; SHAP values: Scott M. Lundberg, Su-In Lee, A Unified Approach to Interpreting Model Predictions, 2017; LIME: Ribeiro, M. T., Singh, S., & Guestrin, C., "Why should I trust you?" Explaining the predictions of any classifier, 2016
  • 20. To retain: explaining the outputs of black-box models is an important step towards building a bridge between the model and its end user; explainable AI methods may deliver global and/or local interpretations; most of the current approaches are based on cooperative game theory; validation of interpretations is usually provided by field experts, and Kullback-Leibler divergence is sometimes used to assess the interpretations. Python implementations: SHAP (https://github.com/slundberg/shap), LIME (https://eli5.readthedocs.io/en/latest/overview.html), IG and DeepLIFT (https://captum.ai/)
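The slides do not say how Kullback-Leibler divergence is applied to interpretations. One plausible reading, assumed here, is to normalise the absolute attribution scores of two sources (for instance two explanation methods, or a method and an expert annotation) into distributions over the same words and compare them. All scores below are hypothetical.

```python
import math


def to_distribution(scores):
    """Turn attribution scores into a distribution over words."""
    magnitudes = [abs(s) for s in scores]
    total = sum(magnitudes)
    return [m / total for m in magnitudes]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; eps guards against zero entries in q."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


# Hypothetical attributions for the same three words from two methods.
method_a = to_distribution([0.45, 0.25, 0.05])
method_b = to_distribution([0.10, 0.30, 0.35])

agreement_with_self = kl_divergence(method_a, method_a)
disagreement = kl_divergence(method_a, method_b)
```

A value near zero means the two sources rank and weight the words similarly; larger values flag explanations that deserve a closer look by a field expert. The measure is asymmetric, so the order of the arguments matters.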