Companies like Amazon, Netflix, and YouTube have popularized "More Like This" recommendations. Today, almost every online shop and content site implements something similar.
But is it possible to build a scalable content-based recommendation system using open-source software that is easy to maintain, simple to tune, and straightforward to deploy?
I will present how to use the "More Like This" feature of Apache SOLR. Although built as a search tool, Apache SOLR can also serve as a recommendation system, because both tasks rely on computing relevance.
However, the "More Like This" functionality of SOLR uses only text fields. I will show you how to overcome this limitation and take full advantage of SOLR's capabilities.
I will also present how an inverted index works, the TF/IDF scoring formula, and how to measure the performance of a recommendation system, all with a step-by-step example.
6. “Movie Store” Use Case
When
● A user views the details of a movie
Then
● The application recommends “similar” movies
7. Example
Target Movie
● The Lord of the Rings: The Fellowship of the Ring
Recommendations
1) The Lord of the Rings: The Return of the King
2) The Lord of the Rings: The Two Towers
3) The Lord of the Rings
4) Lord of War
5) The Lord Protector
8. What Does “Similar” Mean?
Target Movie
● “The Lord of the Rings: The Fellowship of the Ring”
Action / Adventure / Drama
8.8 on IMDB
Recommended (Similar) Movies
● The same words in the title
● The same movie genre
● The same words in the description
● A similar IMDB vote
9. Questions for our Recommendation System
● Do all the words have the same importance?
● Do all the fields have the same importance?
● How does the engine differentiate between results?
13. Movie Fields -> with Types
● imdb_title_id -> string
● original_title -> “analyzed” text
● description -> “analyzed” text
● genre -> array of strings
● avg_vote -> number
14. String vs “Analyzed” Text Field Types
● Field Type: String
● Example: “Comedy” (field: genre)
Indexed: “Comedy”
● Field Type: “Analyzed” Text
● Example: “The Lord of the Rings: The Fellowship of the Ring” (field: original_title)
Indexed (lowercased and without stopwords):
○ “lord”
○ “rings”
○ “fellowship”
○ “ring”
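The effect of an “analyzed” text field can be sketched in a few lines of Python. This is only an illustration: the `STOPWORDS` set and the `analyze` function are hypothetical stand-ins, and a real SOLR field type is configured in the schema with its own tokenizer and filters.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "of", "the", "to"}


def analyze(text: str) -> list[str]:
    """Lowercase the text, split it into words, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [token for token in tokens if token not in STOPWORDS]


print(analyze("The Lord of the Rings: The Fellowship of the Ring"))
# ['lord', 'rings', 'fellowship', 'ring']
```

A string field such as genre would skip this step entirely and index the value as a single, unchanged token.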
15. “The Lord of the Rings: The Fellowship of the Ring”
● Movie Id (imdb_title_id): tt0120737
● Original Title
“The Lord of the Rings: The Fellowship of the Ring”
● Description
“A meek Hobbit from the Shire and eight companions set out on a journey to destroy the powerful One Ring and save Middle-earth from the Dark Lord Sauron.”
● Genre
“Action, Adventure, Drama”
● IMDB vote (avg_vote): 8.8
17. “More Like This” Feature in SOLR
More Like This
● Given a movie id => list “similar” movies
● Uses the “Search” functionality
20. “Search” Example 2
Query
original_title: “Lord” AND original_title: “Rings”
Results (4)
1) "The Lord of the Rings"
2) "The Lord of the Rings: The Fellowship of the Ring"
3) "The Lord of the Rings: The Return of the King"
4) "The Lord of the Rings: The Two Towers"
Execution time: 21 ms
21. How Does the Search original_title: “Lord” AND original_title: “Rings” Work?
● Looks up in the original_title index all the movies that contain the words “lord” AND “rings” (lowercased!)
● Computes a search score based on Boosting, Term Frequency (TF) and Inverse Document Frequency (IDF)
● Displays the results in descending order of score
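22. The TF / IDF Scoring Formula
score[movie] = ∑(boost(field[j]) * tf(word[i]) * idf(word[i]))
where:
boost(field[j]) = custom weight given to the field j
tf(word[i]) = countTermFreq / (countTermFreq + 1.2 * (1 - 0.75 + 0.75 * fieldLength / avgFieldLength))
idf(word[i]) = log(1 + (countDocs - countDocumentFreq + 0.5) / (countDocumentFreq + 0.5))
word[i] = every word in the field, excluding stop words (in our case)
countTermFreq = number of times the word occurs in the field
countDocumentFreq = number of movies in the corpus whose field contains the word
countDocs = total number of movies in the corpus
fieldLength = count of words in the field, excluding stop words (in our case)
avgFieldLength = average length of the field over the corpus
1.2 and 0.75 = the BM25 parameters k1 and b at their default values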
23. Debug the Scoring Formula
score[movie] = ∑(boost(field[j]) * tf(word[i]) * idf(word[i]))
original_title = “The Lord of the Rings”
genre = “Animation, Adventure, Fantasy”
description = “The Fellowship of the Ring embark ...”
score = 1 * tf(“lord”) * idf(“lord”) +
1 * tf(“rings”) * idf(“rings”) +
1 * tf(“Animation”) * idf(“Animation”) + ...
24. Debug the TF / IDF Formula for the QUERY = original_title:Lord AND original_title:Rings
Original title | CTF (Field) Lord | CTF (Field) Rings | CDF (Corpus) Lord | CDF (Corpus) Rings | Field Length | Score
The Lord of the Rings | 1 | 1 | 26 | 10 | 2 | 8.29
The Lord of the Rings: The Fellowship of the Ring | 1 | 1 | 26 | 10 | 4 | 6.06
The Lord of the Rings: The Return of the King | 1 | 1 | 26 | 10 | 4 | 6.06
The Lord of the Rings: The Two Towers | 1 | 1 | 26 | 10 | 4 | 6.06
CTF = countTermFreq (occurrences of the word in the field), CDF = countDocumentFreq (movies in the corpus whose field contains the word), countDocs = total number of movies in the corpus
tf(word[i]) = countTermFreq / (countTermFreq + 1.2 * (1 - 0.75 + 0.75 * fieldLength / avgFieldLength))
idf(word[i]) = log(1 + (countDocs - countDocumentFreq + 0.5) / (countDocumentFreq + 0.5))
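The table can be reproduced approximately with a small Python sketch of the BM25-style scoring that Lucene uses by default. Here tf depends on the word count in the field and on the field length, while idf is computed from the total number of movies and the number of movies whose title contains the word (the CDF column). The corpus size of 85855 comes from a later slide. The average title length is not given anywhere in the slides, so `AVG_FIELD_LENGTH = 2.365` is purely an assumption, chosen so that the sketch lands near the reported scores. All function and constant names are illustrative.

```python
import math

K1 = 1.2   # BM25 term-frequency saturation parameter (the 1.2 in the formula)
B = 0.75   # BM25 length-normalization parameter (the 0.75 in the formula)


def tf(term_freq: int, field_length: int, avg_field_length: float) -> float:
    """Term-frequency part: grows with term_freq, shrinks for long fields."""
    norm = 1 - B + B * field_length / avg_field_length
    return term_freq / (term_freq + K1 * norm)


def idf(doc_count: int, doc_freq: int) -> float:
    """Inverse document frequency: rare words get a higher weight."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def score(field_length: int, avg_field_length: float, doc_count: int,
          doc_freqs: dict[str, int], boost: float = 1.0) -> float:
    """Sum boost * tf * idf over query words that each occur once in the title."""
    return sum(boost * tf(1, field_length, avg_field_length) * idf(doc_count, df)
               for df in doc_freqs.values())


DOC_COUNT = 85855                       # corpus size quoted in the slides
DOC_FREQS = {"lord": 26, "rings": 10}   # CDF column of the table
AVG_FIELD_LENGTH = 2.365                # ASSUMPTION: not given in the slides

short_title = score(2, AVG_FIELD_LENGTH, DOC_COUNT, DOC_FREQS)
long_title = score(4, AVG_FIELD_LENGTH, DOC_COUNT, DOC_FREQS)
print(round(short_title, 2), round(long_title, 2))
```

Under these assumptions the two-word title scores close to 8.29 and the four-word titles close to 6.06: with identical word counts, the shorter field wins because of the length normalization inside tf.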
26. Inverted Index (original_title)
Id (imdb_title_id) | Title (original_title)
tt0120737 | The Lord of the Rings: The Fellowship of the Ring
tt0167260 | The Lord of the Rings: The Return of the King
tt0167261 | The Lord of the Rings: The Two Towers
tt0077869 | The Lord of the Rings
Word | Ids (imdb_title_id)
lord | tt0120737, tt0167260, tt0167261, tt0077869
rings | tt0120737, tt0167260, tt0167261, tt0077869
ring | tt0120737
fellowship | tt0120737
return | tt0167260
king | tt0167260
towers | tt0167261
two | tt0167261
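A minimal Python sketch of the same inverted index, and of an AND query resolved by intersecting posting sets, might look like this. The stop-word list and the function names are assumptions made for illustration; Lucene's actual index structures are far more elaborate.

```python
from collections import defaultdict

STOPWORDS = {"of", "the"}  # hypothetical minimal stop-word list for this sketch

TITLES = {
    "tt0120737": "The Lord of the Rings: The Fellowship of the Ring",
    "tt0167260": "The Lord of the Rings: The Return of the King",
    "tt0167261": "The Lord of the Rings: The Two Towers",
    "tt0077869": "The Lord of the Rings",
}


def build_inverted_index(titles: dict[str, str]) -> dict[str, set[str]]:
    """Map every non-stop word to the set of movie ids whose title contains it."""
    index: dict[str, set[str]] = defaultdict(set)
    for movie_id, title in titles.items():
        for word in title.lower().replace(":", " ").split():
            if word not in STOPWORDS:
                index[word].add(movie_id)
    return index


def search_all(index: dict[str, set[str]], *words: str) -> set[str]:
    """AND query: ids of the movies that contain every given word."""
    postings = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*postings) if postings else set()


index = build_inverted_index(TITLES)
print(sorted(search_all(index, "Lord", "Rings")))   # all four movie ids
print(sorted(search_all(index, "Lord", "King")))    # ['tt0167260']
```

The lookup never scans the titles themselves: each query word is resolved to its id set directly, which is why such searches stay fast on large corpora.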
28. “More Like This” Example
Query
● q = imdb_title_id:tt0120737 (“The Lord of the Rings: The Fellowship of the Ring”)
● Other parameters:
mlt = true
mlt.fl = original_title, description, genre, avg_vote
mlt.mintf = 1
mlt.count = 5
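As a sketch, the request above could be assembled in Python as follows. The host, port, and the core name `movies` are assumptions (the slides do not name them), and `build_mlt_url` is a hypothetical helper; only the `q` and `mlt.*` parameters come from the slide.

```python
from urllib.parse import urlencode

# ASSUMPTIONS: a local SOLR instance and a core named "movies".
SOLR_SELECT_URL = "http://localhost:8983/solr/movies/select"


def build_mlt_url(movie_id: str, count: int = 5) -> str:
    """Build a select URL that asks the MoreLikeThis component for similar movies."""
    params = {
        "q": f"imdb_title_id:{movie_id}",
        "mlt": "true",
        "mlt.fl": "original_title,description,genre,avg_vote",
        "mlt.mintf": 1,
        "mlt.count": count,
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


print(build_mlt_url("tt0120737"))
```

The resulting URL can be fetched with any HTTP client; `urlencode` takes care of escaping the colon and the commas in the parameter values.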
30. Results
Results (“The Lord of the Rings: The Fellowship of the Ring”)
● Execution Time: <100 ms
● Total Results: 62387
31. Results for “The Lord of the Rings: The Fellowship of the Ring” (Action, Adventure, Drama - 8.8)
Score | Title | Year | Genre | Vote
24.49 | The Lord of the Rings | 1978 | Animation / Adventure / Fantasy | 6.2
14.78 | The Ring Thing | 2004 | Adventure / Comedy | 3.5
13.11 | The Dork of the Rings | 2006 | Adventure / Comedy / Fantasy | 3.2
12.65 | The Lord of the Rings: The Return of the King | 2003 | Action / Adventure / Drama | 8.9
11.23 | The Lord Protector | 1996 | Action / Adventure / Fantasy | 4.2
37. Results for “The Lord of the Rings: The Fellowship of the Ring” (Action, Adventure, Drama - 8.8)
Score | Title | Year | Genre | Vote
1132 | The Lord of the Rings | 1978 | Animation / Adventure / Fantasy | 6.2
894 | The Lord of the Rings: The Return of the King | 2003 | Action / Adventure / Drama | 8.9
881 | The Lord of the Rings: The Two Towers | 2002 | Action / Adventure / Drama | 8.7
667 | Rings | 2017 | Drama / Horror / Mystery | 4.5
661 | The Dork of the Rings | 2006 | Adventure / Comedy / Fantasy | 3.2
40. Numeric Fields Ignored in MLT
Issue
● Only text fields are used in MLT queries
Solution
● Rewrite the whole query as a search query and also include the numeric fields
42. “More Like This” Steps
1) Extract the “interesting terms” from the target movie
2) Add the per-field boosts (as given in the query) to every interesting term
3) Perform a Search with those words and boosts
43. “More Like This” Step 1
1) Extract the “interesting terms” from the target movie (from the field list in the query): take all the words from all the fields and compute their relevance. Keep the top 25.
Ex: the word “ring” is very relevant for the movie “The Lord of the Rings: The Fellowship of the Ring”:
- 2 occurrences: once in “original_title” and once in “description”
- in the whole corpus of 85855 movies:
- 35 times in the field “original_title” and
- 282 times in the field “description”
2) Add the per-field boosts (as given in the query) to every interesting term
3) Perform a Search with those words and boosts
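Step 1 can be sketched in Python as a ranking of (field, word) pairs by a plain tf * idf relevance. This is an assumption-laden illustration, not SOLR's exact formula: the function name is hypothetical, the counts for “ring” are the ones quoted above (treated here as document frequencies), and the counts for “journey” are invented.

```python
import math

Term = tuple[str, str]  # (field, word)


def interesting_terms(term_freqs: dict[Term, int], doc_freqs: dict[Term, int],
                      doc_count: int, max_terms: int = 25) -> list[tuple[Term, float]]:
    """Rank the target movie's terms by tf * idf and keep the best max_terms."""
    scored = [
        # Terms found in few movies are more characteristic of this movie.
        (term, freq * math.log(doc_count / (1 + doc_freqs[term])))
        for term, freq in term_freqs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:max_terms]


DOC_COUNT = 85855  # corpus size quoted in the slides

term_freqs = {
    ("original_title", "ring"): 1,
    ("description", "ring"): 1,
    ("description", "journey"): 1,
}
doc_freqs = {
    ("original_title", "ring"): 35,
    ("description", "ring"): 282,
    ("description", "journey"): 1500,  # invented value, for illustration only
}

for term, relevance in interesting_terms(term_freqs, doc_freqs, DOC_COUNT, max_terms=2):
    print(term, round(relevance, 2))
```

With these inputs, “ring” in the title ranks first because it is the rarest term, followed by “ring” in the description; the more common “journey” is cut off by `max_terms=2`.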
45. “More Like This” Step 2
1) Extract the “interesting terms” from the target movie (from the field list in the query)
2) Add the per-field boosts (as given in the query) to every interesting term:
avg_vote^40 genre^30 original_title^20 description
3) Perform a Search with those words and boosts
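A possible way to attach the per-field boosts to the interesting terms is sketched below in Python. The helper names and the sample terms are hypothetical. The numeric `avg_vote` boost is left out here on purpose, since MLT only handles text fields.

```python
# Field boosts from the slide; description has no explicit boost, so it defaults to 1.
FIELD_BOOSTS = {"genre": 30, "original_title": 20, "description": 1}


def boosted_clause(field: str, term: str, boosts: dict[str, int]) -> str:
    """Render one field:"term"^boost clause in Lucene query syntax."""
    boost = boosts.get(field, 1)
    clause = f'{field}:"{term}"'
    return f"{clause}^{boost}" if boost != 1 else clause


def boosted_query(terms: list[tuple[str, str]], boosts: dict[str, int]) -> str:
    """Join the clauses with OR so that any matching term contributes to the score."""
    return " OR ".join(boosted_clause(field, term, boosts) for field, term in terms)


# Hypothetical interesting terms for the target movie.
terms = [("original_title", "ring"), ("genre", "Adventure"), ("description", "hobbit")]
print(boosted_query(terms, FIELD_BOOSTS))
# original_title:"ring"^20 OR genre:"Adventure"^30 OR description:"hobbit"
```

Joining with OR means a candidate movie does not need to match every term; each match simply adds its boosted contribution to the score.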
47. “More Like This” Step 3
1) Extract the “interesting terms” from the target movie (from the field list in the query)
2) Add the per-field boosts (as given in the query) to every interesting term
3) Perform a Search with those words and boosts
48. Results for “The Lord of the Rings: The Fellowship of the Ring” (Action, Adventure, Drama - 8.8)
Score | Title | Year | Genre | Vote
1132 | The Lord of the Rings | 1978 | Animation / Adventure / Fantasy | 6.2
894 | The Lord of the Rings: The Return of the King | 2003 | Action / Adventure / Drama | 8.9
881 | The Lord of the Rings: The Two Towers | 2002 | Action / Adventure / Drama | 8.7
667 | Rings | 2017 | Drama / Horror / Mystery | 4.5
661 | The Dork of the Rings | 2006 | Adventure / Comedy / Fantasy | 3.2
49. Add Numeric Fields to “More Like This”
1) SOLR Request 1: perform an MLT query and get the “interesting terms”
2) Add the boosts
3) Add the numeric fields with their boosts
4) SOLR Request 2: perform a Search with the numeric fields and the “interesting terms”, each with its respective boost
50. Example of Numeric Field Syntax
Target movie: avg_vote = 8.8
=> a similar movie would have:
avg_vote:[8.8 - 1.5 TO 8.8 + 1.5]
=> add boosting factor:
avg_vote:[7.3 TO 10.3]^40
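The range clause can be generated from the target movie's value with a short helper. This is a sketch in Python; `range_clause` is a hypothetical name, and the tolerance of 1.5 and the boost of 40 are the values used in the slide.

```python
def range_clause(field: str, value: float, tolerance: float, boost: int) -> str:
    """Build a boosted range clause centred on the target movie's value."""
    low = value - tolerance
    high = value + tolerance
    # Format with one decimal to avoid floating-point noise such as 7.300000000000001.
    return f"{field}:[{low:.1f} TO {high:.1f}]^{boost}"


print(range_clause("avg_vote", 8.8, 1.5, 40))
# avg_vote:[7.3 TO 10.3]^40
```

The clause can then be combined with the boosted text terms in the second request, so that movies with a similar vote gain extra score.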
53. Final Results for “The Lord of the Rings: The Fellowship of the Ring” (Action, Adventure, Drama - 8.8)
Score | Title | Year | Genre | Vote
249 | The Lord of the Rings: The Return of the King | 2003 | Action / Adventure / Drama | 8.9
246 | The Lord of the Rings: The Two Towers | 2002 | Action / Adventure / Drama | 8.7
222 | The Lord of the Rings | 1978 | Animation / Adventure / Fantasy | 6.2
161 | Lord of War | 2005 | Action / Crime / Drama | 7.6
157 | The Lord Protector | 1996 | Action / Adventure / Fantasy | 4.2