TOWARDS ADVANCED DATA RETRIEVAL
FROM LEARNING OBJECTS REPOSITORIES
Valentina Paunovic
Belgrade Metropolitan University
Slobodan Jovanovic
Belgrade Metropolitan University
This work was supported by Ministry of Education, Science and Technology
(Project III44006).
What problem do we solve?
• The popularity of personalized distance-based learning demands effective creation of learning materials.
• Reusability enables effective creation of learning materials.
• Search enables reusability.
Textual search
• Learning object (LO) search:
– Content type: text, image, video, ...
– Metadata type: textual
• Effective textual search in a large learning object repository (LOR) is important
Our system - contributions
• Search engine
– Steiner-trees approach
– Algorithm for graph representation of LOR.

• Query language
– Extension based on formal logic.
– Algorithm for parsing extended language.
Steiner trees search
• Traditional search (for example, in text processing applications)
• Alternative: Steiner trees
Steiner trees approach
• Query
– word1, word2, word3

• Possible interpretation
– Find all objects such that each object contains all
words from query
– Issue: what if there is no such object?

• Alternative interpretation
– Find all groups of related objects such that each
group contains all words from the query
Example – possible alternatives
Ranking
• Smaller number of LO:
– Stronger relationships among terms from query
– Conclusion: advantage in rankings
– Example: the best solutions consist of only one LO

• Groups that contain more similar LOs (from the
same area or subject):
– Stronger relationships among terms from query
– Conclusion: advantage in rankings
– Example: the best solutions are groups of LOs from the
same area
Main advantages
• Situation: there is no object which satisfies all
terms from query
– Traditional search – no results
– Steiner trees search – returns results

• Possible to detect implicit relationships among
learning objects
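The group interpretation can be illustrated with a brute-force sketch. This is not the DBPF-k algorithm named later in the slides; it simply enumerates every group of objects, so it is only usable on tiny graphs. The names `terms_of`, `edges`, and `search`, and the convention that an edge cost is a dissimilarity (lower means more related), are assumptions made for this illustration.

```python
from itertools import combinations

def covers(group, terms_of, query):
    """True if the objects in `group` together contain every query term."""
    found = set().union(*(terms_of[n] for n in group))
    return query <= found

def tree_cost(group, edges):
    """Cost of a minimum spanning tree over `group` (Prim), or None if disconnected."""
    group = list(group)
    in_tree = {group[0]}
    cost = 0.0
    while len(in_tree) < len(group):
        best = None
        for (u, v), w in edges.items():
            crosses = (u in in_tree) != (v in in_tree)
            if crosses and u in group and v in group:
                if best is None or w < best[0]:
                    best = (w, v if u in in_tree else u)
        if best is None:
            return None  # the group is not connected
        cost += best[0]
        in_tree.add(best[1])
    return cost

def search(terms_of, edges, query, k=3):
    """Return up to k connected groups covering the query, cheapest first."""
    results = []
    nodes = sorted(terms_of)
    for size in range(1, len(nodes) + 1):
        for group in combinations(nodes, size):
            if not covers(group, terms_of, query):
                continue
            cost = tree_cost(group, edges)
            if cost is not None:
                results.append((cost, group))
    results.sort(key=lambda r: (r[0], len(r[1])))
    return results[:k]

# Hypothetical repository: no single object contains both query terms.
terms_of = {"lo1": {"exponential"}, "lo2": {"function"}, "lo3": {"logarithm"}}
edges = {("lo1", "lo2"): 0.2, ("lo2", "lo3"): 0.5, ("lo1", "lo3"): 0.9}
top = search(terms_of, edges, {"exponential", "function"})
```

A traditional search would return nothing for this query, while the group search ranks the pair `lo1`, `lo2` first. A single object that contains all terms would have cost 0 and therefore rank ahead of any group, matching the ranking rule above.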
Vector space model from text mining
• How to determine which LO are related?

• LO is represented as an m-dimensional TF-IDF vector:
$$r(d) = (\mathit{tfidf}_1, \mathit{tfidf}_2, \ldots, \mathit{tfidf}_m)$$
• Each component is calculated as $\mathit{tfidf}_i = \mathit{tf}_i \cdot \mathit{idf}_i$
• Term frequency:
$$\mathit{tf}_i = \sum_j h_j \, n(i, j)$$
– $n(i, j)$: number of occurrences of the i-th term in the j-th slot of LO d
– $h_j$: weight associated with the j-th slot.
Vector space model II
• Weights:
– Highest weight: terms from the metadata title, keywords, and
description.
– Medium weight: terms from the content (if there is
textual content).
– Low weight: terms from the rest of the searchable
metadata.

• Inverse document frequency reduces the
impact of common words:
$$\mathit{idf}_i = \log \frac{|LOR|}{|\{d \in LOR : w_i \in d\}|}$$
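A minimal sketch of the slot-weighted TF-IDF computation follows. The slides only say that slots have high, medium, or low weight, so the numeric values in `SLOT_WEIGHTS`, the slot names, and the whitespace tokenizer are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical slot weights h_j; the source gives no numeric values.
SLOT_WEIGHTS = {
    "title": 3.0, "keywords": 3.0, "description": 3.0,  # highest impact
    "content": 2.0,                                      # medium impact
    "other_metadata": 1.0,                               # low impact
}

def term_frequency(lo):
    """tf_i = sum_j h_j * n(i, j) for one learning object given as {slot: text}."""
    tf = Counter()
    for slot, text in lo.items():
        weight = SLOT_WEIGHTS.get(slot, 1.0)
        for term in text.lower().split():
            tf[term] += weight
    return tf

def tfidf_vectors(repository):
    """Return one sparse {term: tfidf} vector per learning object."""
    tfs = [term_frequency(lo) for lo in repository]
    # Number of learning objects that contain each term.
    doc_freq = Counter(term for tf in tfs for term in tf)
    n = len(repository)
    idf = {term: math.log(n / df) for term, df in doc_freq.items()}
    return [{term: w * idf[term] for term, w in tf.items()} for tf in tfs]

# Hypothetical two-object repository.
repo = [
    {"title": "exponential function", "content": "function"},
    {"title": "logarithmic function"},
]
vectors = tfidf_vectors(repo)
```

In this example "function" appears in every object, so its idf is log(1) = 0 and it contributes nothing to either vector, which is the intended damping of common words.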
LO similarity measure
• Now we can introduce a similarity measure
• One possibility: cosine similarity
$$\mathrm{sim}(d_1, d_2) = \frac{r(d_1) \cdot r(d_2)}{\|r(d_1)\| \, \|r(d_2)\|}$$
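Cosine similarity over sparse vectors can be sketched as below. Representing each vector as a `{term: weight}` dictionary and returning 0 for an all-zero vector are assumptions of this sketch, not something the slides specify.

```python
import math

def cosine_similarity(r1, r2):
    """Cosine of the angle between two sparse vectors given as {term: weight}."""
    dot = sum(w * r2.get(term, 0.0) for term, w in r1.items())
    norm1 = math.sqrt(sum(w * w for w in r1.values()))
    norm2 = math.sqrt(sum(w * w for w in r2.values()))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # assumption: an empty vector is similar to nothing
    return dot / (norm1 * norm2)
```

Two objects with proportional vectors get similarity 1, and objects that share no terms get 0.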
Search algorithm
• Issue: finding the top-k minimum-cost Steiner
trees (MCST-k) is NP-complete
• DBPF-k, developed for keyword search on databases:
– Has a polynomial-time solution
– The first returned result is optimal
– The remaining (k-1) solutions are approximate

• The efficiency of the DBPF-k algorithm depends on
graph sparseness.
Graph representation of LOR
• Steiner-trees search requires sparse graph
• Graph representation of LOR:
– Nodes: LO
– Weighted edges: defined by similarity measure
between any two nodes

• Issue: dense graph, number of edges:
$O((\text{number of LO})^2)$
• Result: Slow search
Graph sparsification - rules
• No node should be removed from the graph.
• Low-similarity edges should be removed from the graph.
• Edge removal should not violate graph connectivity.
• The targeted number of edges is specified by parameter T.
The graph obtained by sparsification should have fewer
than T edges, unless that violates the connectivity constraint.
• No priority among edges of equal weight.
• If two learning objects are related through the
metadata relation, the edge between them should be preserved
in the graph regardless of the similarity between the two
learning objects.
Sparsification
• Complexity of the algorithm is $O(|E| \log |E|)$, with $|E| = O((\text{number of LO})^2)$
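One way to satisfy the sparsification rules within the stated bound is sketched here. It is not necessarily the authors' algorithm: it keeps the required metadata edges, then a maximum spanning forest (for connectivity), then the strongest remaining edges while the total stays below T. The function name, the `required` parameter, and the edge representation are assumptions.

```python
def sparsify(nodes, edges, target, required=frozenset()):
    """Keep a connected, sparse subset of edges.

    edges: {(u, v): similarity}; required: edge keys that must be preserved.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(u, v):
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
        return True

    kept = {}
    for edge in required:  # metadata relations are always preserved
        kept[edge] = edges[edge]
        union(*edge)

    # Sorting dominates the running time: O(|E| log |E|).
    ranked = sorted(edges.items(), key=lambda item: item[1], reverse=True)

    leftovers = []
    for edge, sim in ranked:  # pass 1: strongest edges that join components
        if edge in kept:
            continue
        if union(*edge):
            kept[edge] = sim
        else:
            leftovers.append((edge, sim))

    for edge, sim in leftovers:  # pass 2: fill up while staying below target
        if len(kept) + 1 >= target:
            break
        kept[edge] = sim
    return kept
```

The result can exceed T only when the required edges plus the spanning edges already do, which matches the exception allowed by the connectivity rule. Python's sort is stable, so edges of equal weight keep their input order and none is given special priority.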
Query language
• Example query: exponential function
• Issue 1: What if there is a term exp instead of
exponential?
– Possible solution: dictionary of synonyms + dictionary of
acronyms and abbreviations
– Problem: Can be complicated to implement

• Issue 2: Find all exponential or logarithmic
functions
– Possible solution: submit two different queries
– Problem: Can be inconvenient for a user
Query language - extension
1. Operator and, marked by reserved word %AND.
2. Operator or, marked by reserved word %OR.
• Both operators have the same precedence.
• Expressions are evaluated from left to right.
• If there is no operator between two terms, an %AND
operation is implicitly assumed. For example, "math
function" is evaluated as "math %AND function".
• The associativity rule is preserved from formal logic
Query language
• How do we evaluate a complex expression like
(a %OR b) %AND ((c %OR d) %AND e)?
• We cannot submit such a query directly to the search
algorithm
• We need a query parsing algorithm
Query language - terminology
• Term (t): a word used in a query
• Simple query (Q): a set of terms:
$Q = \{t_1, t_2, \ldots, t_{|Q|}\}$
• Expression (E): a set of simple queries:
$E = \{Q_1, Q_2, \ldots, Q_{|E|}\}$
• Operation $\wedge$ corresponds to operator %AND:
$$E_1 \wedge E_2 = \{Q_i \cup Q_j \mid Q_i \in E_1, Q_j \in E_2\}$$
• Operation $\vee$ corresponds to operator %OR:
$$E_1 \vee E_2 = E_1 \cup E_2$$
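The two operations map directly onto set operations. In this sketch a simple query is a `frozenset` of terms and an expression is a `set` of simple queries; the helper names `term`, `and_op`, and `or_op` are made up for illustration.

```python
def term(word):
    """Expression containing one simple query with one term."""
    return {frozenset({word})}

def and_op(e1, e2):
    """%AND: merge every simple query of e1 with every simple query of e2."""
    return {q1 | q2 for q1 in e1 for q2 in e2}

def or_op(e1, e2):
    """%OR: the simple queries of both expressions."""
    return e1 | e2

# (a %OR b) %AND ((c %OR d) %AND e)
expr = and_op(
    or_op(term("a"), term("b")),
    and_op(or_op(term("c"), term("d")), term("e")),
)
```

For this example `expr` holds four simple queries: {a, c, e}, {a, d, e}, {b, c, e}, and {b, d, e}. Each of them contains only terms, so each can be passed to the search algorithm on its own.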
Parsing algorithm
initialize S as empty stack of expressions;
initialize empty set of search results R;
foreach token w of query
    switch(w):
        case "(", "%AND", "%OR":
            push w to S;
        case ")":
            E <- evaluateTopExpression(S);
            push E to S;
        default:
            if (previous token is term)
                push "%AND" to S;
            Q = {w};
            E = {Q};
            push E to S;
    end switch;
E <- evaluateTopExpression(S);
foreach simple query Q from E
    result = DBPF-k(Q);
    add result to R;

evaluateTopExpression(S)
{
    initialize SH as empty stack;
    while (S not empty)
        wh <- pop from S;
        if (wh = "(")
            break;
        push wh to SH;
    while (true)
        first <- pop from SH;
        if (SH is empty) return first;
        operator <- pop from SH;
        second <- pop from SH;
        switch(operator)
            case "%AND":
                result = first ^ second;
            case "%OR":
                result = first v second;
        end switch;
        push result to SH;
}
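The parsing algorithm can be sketched in Python as follows. The sketch stops at the set of simple queries and does not call DBPF-k, which is not reproduced here. Two details are assumptions beyond the pseudocode: the regular-expression tokenizer, and inserting the implicit %AND also after a closing parenthesis and before an opening one. The query is assumed to be non-empty and well formed.

```python
import re

AND, OR = "%AND", "%OR"

def tokenize(query):
    """Split into parentheses and whitespace-separated words (assumed format)."""
    return re.findall(r"\(|\)|[^\s()]+", query)

def evaluate_top_expression(stack):
    """Evaluate everything above the nearest "(" (or the whole stack)."""
    helper = []
    while stack:
        item = stack.pop()
        if item == "(":
            break
        helper.append(item)  # reversing restores left-to-right order
    while True:
        first = helper.pop()
        if not helper:
            return first
        operator = helper.pop()
        second = helper.pop()
        if operator == AND:
            result = {q1 | q2 for q1 in first for q2 in second}
        else:  # OR
            result = first | second
        helper.append(result)

def parse(query):
    """Return the expression as a set of simple queries (frozensets of terms)."""
    stack = []
    previous_was_operand = False
    for token in tokenize(query):
        if token in (AND, OR):
            stack.append(token)
            previous_was_operand = False
        elif token == "(":
            if previous_was_operand:
                stack.append(AND)  # implicit %AND (assumption, see lead-in)
            stack.append(token)
            previous_was_operand = False
        elif token == ")":
            stack.append(evaluate_top_expression(stack))
            previous_was_operand = True
        else:
            if previous_was_operand:
                stack.append(AND)  # implicit %AND between adjacent terms
            stack.append({frozenset({token})})
            previous_was_operand = True
    return evaluate_top_expression(stack)

simple_queries = parse("(a %OR b) %AND ((c %OR d) %AND e)")
```

Because both operators share one precedence level and are applied left to right, `a %OR b %AND c` is read as `(a %OR b) %AND c`. Each simple query in the result would then be submitted to the search algorithm separately and the results collected.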
Architecture of search system
Conclusion
• Proposed architectural solution for advanced
search through repositories of learning objects
• Search based on finding top-k min-cost Steiner
trees
• Proposed algorithm for sparse weighted graph
representation of a LO repository
• Proposed extension of query language based
on formal logic and designed an algorithm for
parsing it
