Here are the steps to evaluate the performance attributes of search engines using the LSP model:
1. Develop an attribute tree to identify the key attributes for evaluation. The tree shown above identifies attributes such as search input, filtering, etc.
2. Define an elementary criterion function for each lowest-level attribute to measure it quantitatively. For example, for keyword searching, define criteria such as the number of relevant results and the time to fetch results.
3. Measure the attributes for the different search engines and obtain elementary preferences on the defined scale.
4. Aggregate the elementary preferences using appropriate logical operators, based on user needs, to obtain preferences for the higher-level attributes.
5. Aggregate all preferences in the same way to obtain an overall preference score for each search engine.
The document outlines the requirements for evaluating different search engines using the Logic Score Preference (LSP) model. It identifies user profiles, tools used, and defines a hierarchical attribute tree to measure qualities like search functionality, usability, performance, and reliability. Elementary criteria are established to quantitatively measure attributes and calculate preferences that can be aggregated to determine the overall quality of each search engine.
2. Table of Contents:
(1) Introduction……………………………………………………………………………………………………………………………………………….. 3
(2) Goal of the project……………………………………………………………………………………………………………………………………….4
(3) Project requirement…………………………………………………………………………………………………………………………………….4
User Profile
Tools used
(4) Steps for design quality evaluation………………………………………………………………………………………………………………..6
(5) Identify performance attribute………………………………………………………………………………………………………………………7
(6) Elementary criteria……………………………………………………………………………………………………………………………………….10
(7) Aggregation of preference………………………………………………………………………………………………………………………….…28
(8) Competitive system…………………………………………………………………………………………………………………………………….…34
(9) Result…………………………………………………………………………………………………………………………………………………………....38
(1) EVALUATION REPORT FOR THE SearchEngine PROJECT [SearchEngine.txt] ……………………………………………..38
(2) DETAILED EVALUATION RESULTS FOR THE SearchEngine PROJECT [SearchEngine.lst]……………………………….44
(3) Result of SearchEngines [SearchEngine.res]……………………………………………………………………………………………...70
(4) SUMMARY OF RESULTS FOR THE SearchEngine PROJECT [SearchEngine.sum]…………………………………………..72
3. 1. Introduction
Software quality analysis measures the properties of a piece of software or of its
specifications. Direct measurement of software quality is quite difficult because the
underlying quality factors are hard to measure. To resolve this measurement problem,
there is a model that measures the quality of software in terms of its attributes,
specifications, and characteristics. This model is known as LSP (Logic Score Preference).
When a client gives the developer a specification of the software, the client expects
good-quality software in return; hence, to judge the quality of the software, we can use
this LSP model.
This model validates the following software quality attributes:
(1) Functionality
Suitability
Accuracy
Security
Interoperability
Compliance
(2) Usability
Understandability
Learnability
Operability
(3) Performance
Processing time
Throughput
Resource consumption
(4) Maintainability
(5) Portability
(6) Reusability
In LSP, the features are decomposed into the aggregation blocks above, and this
decomposition continues within each block until all the lowest-level features are directly
measurable, yielding a tree of decomposed features. For each feature, an elementary
criterion is defined. LSP calculates an elementary preference for each criterion and then
aggregates them all to calculate a final global preference, which expresses the quality of
the software. By calculating the global preference for different systems, we can analyze
and compare the systems' quality.
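The flow just described can be sketched in a few lines. This is an illustrative sketch, not the LSPcalc implementation; the attribute names, weights, exponent, and preference values are invented for the example:

```python
def power_mean(prefs, weights, r):
    """Weighted power mean of elementary preferences (each in 0..1).

    r < 1 models simultaneity (and-like), r = 1 is the neutral
    arithmetic mean, r > 1 models replaceability (or-like).
    Assumes r != 0 and weights summing to 1."""
    return sum(w * p ** r for w, p in zip(weights, prefs)) ** (1.0 / r)

# Elementary preferences for two hypothetical lowest-level attributes.
e_keyword, e_statement = 0.9, 0.7

# Aggregate them into a "Text Searching" preference; the harmonic mean
# (r = -1) requires both inputs to be reasonably satisfied.
text_searching = power_mean([e_keyword, e_statement], [0.5, 0.5], r=-1)
```

Repeating such aggregations up the tree produces the global preference that ranks the systems.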
4. 2. Goal of the Project
The very first tool for searching the internet was Archie, in 1990, and the era of search
engines began there. Effective search engines such as W2Catalog and Aliweb were introduced
in 1993. After that, search engines improved by adding complexity and functionality, and
nowadays we have efficient, high-quality search engines such as Google, Yahoo Search, bing,
Ask, etc. Hence, the goal of this project is to analyze the functionality and complexity of
these search engines and to find out how well they satisfy user requirements.
Before the evaluation, the evaluator must know the user requirements: who the user is and
what the user expects from the system. Next, the evaluator must have expert knowledge of
the system, so that the software attributes can be sorted out well. For the further stages,
the evaluator should select system components that can be compared between systems.
Hence, as evaluators, we choose different search engines to compare their system quality,
choose different attributes for the search-engine system, and finally assign a preference
scale to each attribute for every search engine (Google, Yahoo Search, bing, Ask). After
going through the whole LSP process, we find the global preference of each search engine
and analyze its quality.
In this project we focus on building a comprehensive evaluation model for five search
engines: Google, Yahoo Search, bing, Ask, and altavista. This model aggregates all the
features that reflect the functionality and usability of the search engines and generates a
compound indicator of overall quality. It therefore reflects a measurement of user
satisfaction with all aspects of the search engines.
5. 3. Project Requirements
Here we discuss the initial requirements for evaluating this model.
3.1 User Profile
To evaluate the Logic Score Preference model for different search engines, we need the
profile of the user who uses these search engines, along with the user's purpose in using a
search engine.
User: Student
Purpose:
Search study related documentations
Research papers
Presentation slides [pdf, ppt, etc. files]
Search books on web
Search images, videos and audio related to projects of study.
Search geographical location of universities or any other places
3.2 Tools Used
LSPcalc128
ANSY
Google Chrome 4.1
Internet Explorer 7.0
Secure Shell Client
6. 4. STEPS FOR DESIGN QUALITY EVALUATION
Step 1:
Develop a hierarchical model of quality characteristics and attributes (A1 ... An) so that,
with their help, we can compare the systems. This step defines and specifies the quality
characteristics and attributes, grouping them into a model. With each quantifiable
attribute Ai, the model associates a variable Wi, the weight of that attribute in the
system, which can take a real value.
- So, for the search engine I have developed the attribute tree [4.1]
Step 2:
Define a criterion function for each attribute, and apply the attribute measurement to each
search engine.
Elementary evaluation criteria specify how to measure quantifiable attributes. The result is
an elementary preference, which can be interpreted as a degree of satisfied requirement. For
each attribute, it is necessary to establish an acceptable range of values and define a
function, called the elementary criterion, which maps the measured value into the numerical
preference domain.
- So, for the search engine I have developed the elementary preferences and attribute
measurement values in section [5.1]
Step 3:
- Evaluate elementary preferences.
- Logically aggregate the preferences.
- So, for the search engine I have developed the logic aggregation of preferences using a
pictorial diagram.
Aggregators are chosen based on user needs, which are expressed as the relationship among
inputs: simultaneity, replaceability, or neutrality. Some inputs must be satisfied
simultaneously; finally we aggregate the preferences and compute the overall suitability of
the system.
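The effect of the aggregator choice can be sketched with a weighted power mean, one common family of logic aggregators (the exponents and input values here are illustrative assumptions, not the ones used in this project):

```python
def power_mean(prefs, weights, r):
    """Weighted power mean of preferences in 0..1; assumes r != 0
    and weights summing to 1."""
    return sum(w * p ** r for w, p in zip(weights, prefs)) ** (1.0 / r)

# One strong and one weak input preference.
inputs, weights = [0.95, 0.10], [0.5, 0.5]

simultaneity   = power_mean(inputs, weights, r=-5)  # and-like: both inputs mandatory
neutrality     = power_mean(inputs, weights, r=1)   # plain weighted average
replaceability = power_mean(inputs, weights, r=5)   # or-like: one good input suffices
```

With the same inputs, the and-like aggregator stays close to the weak input, while the or-like one stays close to the strong input; that is how the relationship among inputs is encoded.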
Step 4:
- Evaluate competitive systems.
- Here, we have five different search engines: Google, Yahoo Search, bing, Ask, and
altavista. We carry out a competitive feature analysis for each of these search engines.
Step 5:
- Rank the systems, select the best one, and analyze it.
- Finally, using LSPcalc, we rank each search engine according to its global preference and
select the best among them.
- How to use the LSPcalc128 tool?
Ans: Create the .cri and .dat files:
Create a directory in the projects folder named after your project
Create two files, projectname.cri and projectname.dat
By default, LSPcalc128 looks in the projects folder for all the projects
Then open any client shell to access LSPcalc128
Run the executable and follow the instructions on the tool's screen
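The project setup described in the steps above can be sketched as follows (the projects/ layout and file names are taken from those steps; your LSPcalc128 installation may organize things differently):

```python
# Sketch of preparing an LSPcalc128 project directory as described above.
from pathlib import Path

# Project directory inside the projects folder (assumed layout).
project = Path("projects") / "SearchEngine"
project.mkdir(parents=True, exist_ok=True)

# LSPcalc128 expects a criteria (.cri) and a data (.dat) file per project.
(project / "SearchEngine.cri").touch()
(project / "SearchEngine.dat").touch()
```

After this, running the LSPcalc128 executable and following its on-screen prompts is what the steps above prescribe.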
7. 5. Identifying performance attributes
5.1: System attribute tree
Attribute tree
1. Functionality
1.1 Searching Input
1.1.1 Text Searching
1.1.1.1 Keyword Searching
1.1.1.1.1 Search with one keyword
1.1.1.1.2 Search with group of keywords
1.1.1.2 Statement Searching
1.1.1.2.1 Search exact statement
1.1.1.2.2 Search part of statement
1.1.1.2.3 Case sensitive Search
1.1.1.2.4 Ignore stop words (the,a,an,..etc)
1.1.1.3 Search with different language
1.1.1.4 Numeric Expression Search
1.1.1.4.1 Search numeric expression in documents
1.1.1.4.2 Compute numeric expression
1.1.1.5 Relational Searching
1.1.1.5.1 Abbreviation Search
1.1.1.5.2 Synonyms Search
1.1.1.5.3 Stemming Search
1.1.1.5.4 Misspell Corrected Search
1.1.1.6 Operator Searching
1.1.1.6.1 Including Search (using ‘+’ operator)
1.1.1.6.2 Excluding Search (using ‘-‘ operator)
1.1.1.6.3 Combinational Search (using ‘*’ operator)
1.1.1.6.4 Optional Search (using ‘OR’ operator)
1.1.2 Multimedia Search
1.1.2.1 Image Searching
1.1.2.1.1 Filename search
1.1.2.1.2 Image link search
1.1.2.1.3 Adjacent text search
1.1.2.2 Video Searching
1.1.2.2.1 Popularity based Search
1.1.2.2.2 Content based Search
1.1.2.2.3 Content Controlled Search
1.1.2.3 Audio Searching
1.2 Searching filter
1.2.1 Security filter
1.2.1.1 Pages with Spam, doorway
1.2.1.2 Duplicate Content
1.2.2 Citation filter
1.2.2.1 Adult content filter
1.2.2.2 Casino content filter
1.2.3 Domain filter
1.2.3.1 Pages related to same content
1.2.3.2 Pages related to same website
1.2.3.3 Linked page
1.2.4 Extended filter
1.2.4.1 RSS support pages
1.2.4.2 Usage rights
1.2.4.3 Numeric range filter
1.2.4.4 field to search keyword [title,text,URL,Link]
1.2.5 File specific filter [pdf, word, excel sheet,..]
1.2.6 Broken link filter
1.2.7 Time filter
1.2.7.1 Page created time
1.2.7.2 Recent update time
1.2.8 Location filter
1.2.8.1 Location of searching user [i.e. weather forecast]
1.2.8.2 Location of country
1.3 Specific activity Search
1.3.1 Weather
1.3.2 Blog
1.3.3 Movie time
1.3.4 Sport score
1.3.5 Stock price
1.3.6 Literature [books,..]
1.3.7 Maps
2. Usability
2.1 Interface usability
2.1.1 Interface visibility
2.1.2 Operability
2.1.3 Customization
2.1.3.1 Customization of page size
2.1.3.2 Customization of page rank
2.2 Result Evaluation
2.2.1 Result visibility
2.2.2 Accessibility of results
2.2.3 Availability of cached result
2.3 User guide
2.3.1 Online help
2.3.2 Manual user-guide
2.3.3 FAQ
2.3.4 Related tutorial material
3. Performance
3.1 Loading time
3.1.1 Load page time
3.1.2 Automatic search suggestion time
3.1.3 Result evaluation
3.1.3.1 Time to evaluate best result
3.1.3.2 Time to evaluate top N result
3.2 Resource Consumption
4. Reliability
4.1 User satisfaction
4.1.1 Popular pages results
4.1.2 High rank pages results
4.1.3 Coverage of user need
4.2 Confusion matrix
4.2.1 Accuracy
4.2.2 Precision
4.2.3 Recall
4.2.4 Specificity
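For bookkeeping during evaluation, the attribute tree can be encoded as plain nested data. A minimal sketch, covering only a small fragment of the tree above (attribute names as in section 5.1):

```python
# Fragment of the attribute tree as nested dicts; the leaves are the
# directly measurable attributes that receive elementary criteria.
tree = {
    "1.1 Searching Input": {
        "1.1.1 Text Searching": {
            "1.1.1.1 Keyword Searching": [
                "Search with one keyword",
                "Search with group of keywords",
            ],
            "1.1.1.2 Statement Searching": [
                "Search exact statement",
                "Search part of statement",
                "Case sensitive Search",
                "Ignore stop words",
            ],
        },
    },
}

def leaves(node):
    """Yield the lowest-level (directly measurable) attributes."""
    if isinstance(node, dict):
        for child in node.values():
            yield from leaves(child)
    else:
        yield from node

print(len(list(leaves(tree))))  # 6 leaf attributes in this fragment
```

Walking the full tree this way gives the list of attributes that need elementary criteria in section 6.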
10. 6. Elementary Criteria
Point-additive discrete absolute criteria. Each entry gives the attribute, its elementary criterion, and its weight; the bracketed note lists the measured breakpoints of the criterion chart over the 0-100% preference scale.

[6.1] 1.1.1 Text Searching

Keyword Searching
- Search with one keyword (weight 50): The search engine must search text with a single keyword, which may relate to any issue: a term from any field of study (computer science, mathematics, mechanical engineering, etc.), any dictionary word, or any non-traditional word. The engine should therefore be able to cover N words. Scoring: efficiency for dictionary words - 5, efficiency for non-dictionary words - 5. [Criterion chart: level of efficiency, breakpoints 10/5/1]
- Search with group of keywords (weight 50): The engine should correlate a group of keywords and find combinations of them on the web. Scoring: finds all words of the group - 8, finds some words of the group - 2. [Criterion chart: rank of accuracy, breakpoints 10/5/1]

Statement Searching
- Search exact statement (weight 40): REL = 100 * S / Smax; efficiency at finding the exact expression the user wants. [Criterion chart: level of efficiency, 10/5/1]
- Search part of statement (weight 60): REL = 100 * S / Smax; efficiency at finding all parts of a divided expression on the web. [Criterion chart: level of efficiency, 10/5/1]
- Case sensitive Search (weight 20): scored 0 if the search engine is case sensitive, 1 otherwise. [Criterion chart: yes = 1, no = 0]
- Ignore stop words (the, a, an, etc.) (weight 80): REL = 100 * Ignore / Ignoremax; efficiency at removing stop words for an efficient search. [Criterion chart: level of efficiency, 10/5/1]
- Search with different language (weight 20): number of languages the search engine supports. [Criterion chart: number of languages, 1/25/51]

Numeric Expression Search
- Search numeric expression in documents (weight 80): REL = 100 * S / Smax; efficiency at finding numeric expressions, e.g. the population of a country. [Criterion chart: level of efficiency, 10/5/1]
- Compute numeric expression (weight 20): number of mathematical operations the engine can compute (+, -, *, /, sin x, tan x, sqrt, etc.). [Criterion chart: number of expressions, 4/15/30]

Relational Searching
- Abbreviation Search (weight 20): REL = 100 * AS / ASmax; efficiency at resolving abbreviations on the web as the user wants. [Criterion chart: level of efficiency, 10/5/1]
- Synonyms Search (weight 30): REL = 100 * SS / SSmax; efficiency at finding synonyms of the search word. [Criterion chart: level of efficiency, 10/5/1]
- Stemming Search (weight 20): REL = 100 * SS / SSmax; efficiency of searching with the stemming technique. [Criterion chart: level of efficiency, 10/5/1]
- Misspell Corrected Search (weight 30): REL = 100 * MS / MSmax; efficiency at correcting misspelled words. [Criterion chart: level of efficiency, 10/5/1]

Operator Searching
- Including Search, '+' operator (weight 55): REL = 100 * IS / ISmax; efficiency at searching multiple expressions joined with '+'. [Criterion chart: level of efficiency, 10/5/1]
- Excluding Search, '-' operator (weight 10): REL = 100 * ES / ESmax; efficiency of the exclusion facility, so the user can search a specific part of a whole expression. [Criterion chart: level of efficiency, 10/5/1]
- Combinational Search, '*' operator (weight 15): REL = 100 * CS / CSmax; efficiency at combinational search expressions. [Criterion chart: level of efficiency, 10/5/1]
- Optional Search, 'OR' operator (weight 20): REL = 100 * OS / OSmax; efficiency of the optional-search facility. [Criterion chart: level of efficiency, 10/5/1]

Image Searching
- Filename search (weight 20): REL = 100 * FNS / FNSmax; finding an image by its file name. [Criterion chart: level of efficiency, 10/5/1]
- Image link search (weight 30): REL = 100 * ILS / ILSmax; finding an image through links to it. [Criterion chart: level of efficiency, 10/5/1]
- Adjacent text search (weight 50): REL = 100 * ATS / ATSmax; finding an image through the text adjacent to it. [Criterion chart: level of efficiency, 10/5/1]

Video Searching
- Popularity based Search (weight 50): REL = 100 * VPS / VPSmax; finding a video by its popularity. [Criterion chart: level of efficiency, 10/5/1]
- Content based Search (weight 50): REL = 100 * VCS / VCSmax; finding a video through content related to it. [Criterion chart: level of efficiency, 10/5/1]
- Content Controlled Search (weight 25): REL = 100 * VCCS / VCCSmax; finding a video by user-defined criteria (funny, drama, action, etc.). [Criterion chart: level of efficiency, 10/5/1]

Audio Searching (weight 30): REL = 100 * AudioS / AudioSmax; finding audio matching the user's request. [Criterion chart: level of efficiency, 10/5/1]
[6.2] 1.2 Searching Filter

Security filter
- Pages with spam, doorway (weight 55): REL = 100 * SSS / SSSmax; efficiency of the spam filter at blocking websites that would download spam or doorway pages onto the user's system. [Criterion chart: level of efficiency, 10/5/1]
- Duplicate content (weight 45): REL = 100 * DCS / DCSmax; efficiency at excluding duplicated content on fraudulent websites, respecting copyright rules. [Criterion chart: level of efficiency, 10/5/1]

Citation filter
- Adult content filter (weight 90): REL = 100 * ACS / ACSmax; efficiency at blocking adult content on the web. [Criterion chart: level of efficiency, 10/5/1]
- Casino content filter (weight 10): REL = 100 * CCS / CCSmax; efficiency at blocking casino and gambling content. [Criterion chart: level of efficiency, 10/5/1]

Domain filter
- Pages related to same content (weight 45): REL = 100 * DPS / DPSmax; finding pages with the same kind of content. [Criterion chart: level of efficiency, 10/5/1]
- Pages related to same website (weight 35): REL = 100 * DWS / DWSmax; finding pages belonging to the same website domain. [Criterion chart: level of efficiency, 10/5/1]
- Linked page (weight 20): REL = 100 * DLS / DLSmax; finding pages linked to each other. [Criterion chart: level of efficiency, 10/5/1]

Extended filter
- RSS support pages (weight 35): REL = 100 * RSS / RSSmax; finding pages with RSS support. [Criterion chart: level of efficiency, 10/5/1]
- Usage rights (weight 30): REL = 100 * US / USmax; giving access to users who hold usage rights for the target page. [Criterion chart: level of efficiency, 10/5/1]
- Numeric range filter (weight 35): REL = 100 * NS / NSmax; searching with a numeric-range sorting facility. [Criterion chart: level of efficiency, 10/5/1]
- Field to search keyword [title, text, URL, link] (weight 55): searching within a specific field: page URL, title, specific page text, or link of the page. [Criterion chart: number of fields, 1/2/4]

File specific filter [pdf, word, excel sheet, ...] (weight 35): ability to find different kinds of files on the web (.ppt, .doc, .pdf, etc.). [Criterion chart: number of types, 1/15/30]

Broken link filter (weight 15): REL = 100 * BLS / BLSmax; preventing results that point to broken links or pages that no longer exist. [Criterion chart: level of efficiency, 10/5/1]

Time filter
- Page created time (weight 20): REL = 100 * PTS / PTSmax; searching pages by their creation year or time. [Criterion chart: level of efficiency, 10/5/1]
- Recent update time (weight 80): REL = 100 * RUS / RUSmax; searching pages by most recent update. [Criterion chart: level of efficiency, 10/5/1]

Location filter
- Location of searching user [e.g. weather forecast] (weight 30): REL = 100 * LS / LSmax; finding pages related to the user's geographic location. [Criterion chart: level of efficiency, 10/5/1]
- Location of country (weight 70): REL = 100 * LCS / LCSmax; finding pages related to the country to which the user belongs. [Criterion chart: level of efficiency, 10/5/1]
[6.3] 1.3 Specific Activity Search
- Weather (weight 8): REL = 100 * WS / WSmax; efficiency of weather results. [Criterion chart: level of efficiency, 10/5/1]
- Blog (weight 15): REL = 100 * BS / BSmax; finding the blogs the user wants. [Criterion chart: level of efficiency, 10/5/1]
- Movie time (weight 8): REL = 100 * MS / MSmax; finding the cinema showtimes the user wants. [Criterion chart: level of efficiency, 10/5/1]
- Sport score (weight 20): REL = 100 * SSS / SSSmax; finding the latest scores for different sports. [Criterion chart: level of efficiency, 10/5/1]
- Stock price (weight 14): REL = 100 * SPS / SPSmax; finding the latest stock prices. [Criterion chart: level of efficiency, 10/5/1]
- Literature [books, ...] (weight 15): finding books and literature related to different fields. Scoring: author - 3 points, content - 4 points, publisher - 3 points. [Criterion chart: level of efficiency, 1/5/10]
- Maps (weight 20): map scoring points: public transportation route tracing - 2, driving directions - 2.5, walking route - 0.5, save map - 0.5, write map description - 0.5, review of place - 1, search place by place properties - 1, printable map - 1, maps for mobile or handheld devices - 1. [Criterion chart: level of efficiency, 1/5/10]
21. [6.4] 2. Usability

Interface usability
- Interface visibility (weight 40): scoring of visible interface elements: textbox to search - 3, target search button - 3, selectable search options (image, video) - 1, no redundant components - 1, easy and clear layout - 2. [Criterion chart: level of visibility, 1/5/10]
- Operability (weight 60): scoring: number of input components - 4, number of clicks to search - 3, simple searching criteria - 3. [Criterion chart: level of user operability, 1/5/10]

Customization
- Customization of page size (weight 20): facility to customize the page to the user's needs: number of results per page - 8, searching customization - 1.5, page-theme selection - 0.5. [Criterion chart: level of customization, 1/5/10]
- Customization of page rank (weight 80): REL = 100 * CPR / CPRmax; efficiency at weighting the user's taste (using the user's visits to pages) to raise the page-rank count of those pages and show them among the top results. [Criterion chart: level of page-rank algorithm, 10/5/1]

Result Evaluation
- Result visibility (weight 20): top N results on the page - 3, easy to access - 5, results without redundant data on the page - 2. [Criterion chart: number of results per page, 1/5/10]
- Accessibility of results (weight 80): number of clicks - 4, brief info about each result on the result page - 3, results ordered according to ranking - 3. [Criterion chart: level of accessibility, 1/5/10]
- Availability of cached result (weight 20): REL = 100 * CR / CRmax; efficiency at storing and reusing cached results for the next result evaluation. [Criterion chart: level of efficiency, 10/5/1]

User guide
- Online help (weight 20): REL = 100 * ONH / ONHmax; availability of online help to guide users. [Criterion chart: level of satisfactory material, 10/5/1]
- Manual user-guide (weight 30): REL = 100 * UG / UGmax; availability of a manual user guide. [Criterion chart: level of guide quality, 10/5/1]
- FAQ (weight 20): number of frequently asked questions the search engine provides to resolve user doubts. [Criterion chart: number of questions, 5/25/45]
- Related tutorial material (weight 30): REL = 100 * TM / TMmax; availability of related tutorial material to guide users. [Criterion chart: rank of material quality, 10/5/1]
24. [6.5] 3. Performance

Loading time
- Load page time (weight 70): time taken by the search-engine page to load on the user's system, in seconds. [Criterion chart: seconds, 12/8/4]
- Automatic search suggestion time (weight 30): time taken by the suggestion search to assist the user, in seconds. [Criterion chart: seconds, 3/1/0.5]

Result evaluation
- Time to evaluate best result (weight 80): time to produce the result for a target search, in seconds. [Criterion chart: seconds, 18/11/4]
- Time to evaluate top N results (weight 20): time taken to produce the top N results, in seconds. [Criterion chart: seconds, 16/8/0.9]

Resource Consumption (weight 30): memory occupied by the HTML source page of the search engine, in kilobytes. [Criterion chart: page size in kB, 520/20]
25. [6.6] 4. Reliability

User satisfaction
- Popular pages results (weight 30): reliability scoring: popular pages - 4, informative pages - 3, trustworthy pages - 1, ability to reach the specific target page of a website domain rather than its home page - 2. [Criterion chart: level of reliability, 10/5/1]
- High rank pages results (weight 20): efficiency at computing page rank and ordering results by ranking. [Criterion chart: level of reliability, 10/5/1]
- Coverage of user need (weight 50): efficiency at satisfying and covering the user's need. [Criterion chart: level of user satisfaction, 10/5/1]

Confusion matrix for results
- Accuracy (weight 20): accuracy of the results. [Criterion chart: % accuracy, 100/50/1]
- Precision (weight 30): precision of the evaluated results for a target search. [Criterion chart: % precision, 100/50/1]
- Recall (weight 20): recall of the evaluated results for a target search. [Criterion chart: % recall, 100/50/1]
- Specificity (weight 30): specificity of the evaluated results for a target search. [Criterion chart: % specificity, 100/50/1]
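The breakpoint charts above can be read as piecewise-linear elementary criteria. A minimal sketch, assuming (my reading of the chart) that the load-page-time criterion in [6.5] maps 12 s to 0% preference, 8 s to 50%, and 4 s to 100%:

```python
def elementary_criterion(breakpoints):
    """Build a piecewise-linear criterion from (measured_value, preference_%)
    pairs; measurements outside the range clamp to the end preferences."""
    pts = sorted(breakpoints)  # sort by measured value
    def crit(x):
        if x <= pts[0][0]:
            return pts[0][1]
        if x >= pts[-1][0]:
            return pts[-1][1]
        for (x0, p0), (x1, p1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                # linear interpolation between adjacent breakpoints
                return p0 + (p1 - p0) * (x - x0) / (x1 - x0)
    return crit

# Load-page-time criterion: fewer seconds -> higher preference (assumed reading).
load_time = elementary_criterion([(4, 100), (8, 50), (12, 0)])
```

A measured load time of 6 s would then score 75% preference under this assumed mapping.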
27. 7. Ranking and Aggregation of the Preferences
The system requirement tree has 73 performance variables, as listed above. We build a weighted
aggregation graph over these performance variables using logic aggregators.
7.1 Functionality
7.1.1 Text Searching:
[Fig. 1: Functionality: Text Searching]
37. 9.1 Results:
9.1.1 EVALUATION REPORT FOR THE SearchEngine PROJECT [SearchEngine.txt]
This report presents the evaluation results for the following 5 competitive
systems:
1. Google
2. Yahoo
3. bing
4. Ask
5. altavista
The evaluation is based on 71 elementary criteria grouped in the following 4
major groups:
1. Functionality
2. Usability
3. Performance
4. Reliability
This summary includes two parts: (1) System Comparison and Ranking, and
(2) Survey of Individual Systems. Detailed numerical results can be found in
the report entitled "Detailed Evaluation Results of the SearchEngine Project".
(1) System Comparison and Ranking
---------------------------------
The global preference of a system indicates the global percentage of satisfied
requirements. Therefore, the best system has the highest global preference. The
ranking of competitive systems is based on decreasing global preferences, as
follows:
1. 94.89% Google
2. 89.78% bing
3. 88.18% Yahoo
4. 78.42% altavista
5. 77.35% Ask
Therefore, the best system is Google.
This system satisfies 94.89% of the requirements specified by evaluation criteria.
The absolute value of global preference depends both on the quality of each system
and the level of demand imposed by the evaluation criterion function. So, low global
preferences may sometimes reflect too demanding criteria. The relative ranking of
competitive systems is based on normalized preferences so that the best system has
the normalized global preference of 100%. Following is the ranking according to
normalized preferences:
1. 100.00% Google
2. 94.62% bing
3. 92.93% Yahoo
4. 82.65% altavista
5. 81.51% Ask
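The normalized ranking can be reproduced from the global preferences above; the values agree with the report up to rounding in the last digit:

```python
# Global preferences from the report (percent of satisfied requirements).
global_pref = {"Google": 94.89, "bing": 89.78, "Yahoo": 88.18,
               "altavista": 78.42, "Ask": 77.35}

# Normalize so the best system scores 100%.
best = max(global_pref.values())
normalized = {name: round(100.0 * g / best, 2) for name, g in global_pref.items()}
```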
The relative differences between systems can be interpreted as follows:
System Google dominates system bing in 66.20% of inputs
System Google dominates system Yahoo in 72.54% of inputs
System Google dominates system altavista in 81.69% of inputs
System Google dominates system Ask in 83.80% of inputs
System bing dominates system Yahoo in 59.86% of inputs
The reasons for a specific value of global preference can be explained
by investigating the quality of all major components of the evaluated
systems. Following is the survey of preferences of 4 major system
components: Functionality, Usability, Performance, and Reliability.
Systems      Functionality  Usability  Performance  Reliability
-----------------------------------------------------------------------
Google           93.79        96.95       95.82        93.87
Yahoo            89.21        88.40       90.87        85.05
bing             90.12        88.86       97.22        86.18
Ask              82.26        85.55       63.49        73.33
altavista        81.20        88.10       96.62        60.97
-----------------------------------------------------------------------
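In the LSP method, component preferences such as these are combined with aggregation operators whose andness/orness reflects user needs; the basic operator is the weighted power mean. The weights and exponents used in this evaluation are not listed here, so the sketch below is illustrative only (equal weights and a mildly conjunctive exponent are assumptions, not the project's actual parameters):

```python
def wpm(prefs, weights, r):
    """Weighted power mean, the basic LSP aggregation operator.
    r > 1 leans toward disjunction (orness), r < 1 toward
    conjunction (andness); r -> 0 gives the geometric mean."""
    assert abs(sum(weights) - 1.0) < 1e-9
    if abs(r) < 1e-12:  # geometric-mean limit of the power mean
        prod = 1.0
        for p, w in zip(prefs, weights):
            prod *= p ** w
        return prod
    return sum(w * p ** r for p, w in zip(prefs, weights)) ** (1.0 / r)

# Illustrative only: Google's four major-component preferences (as
# fractions) aggregated with equal weights and a conjunctive exponent.
components = [93.79, 96.95, 95.82, 93.87]  # Funct., Usab., Perf., Rel.
score = wpm([p / 100 for p in components], [0.25] * 4, r=-0.72)
```

A more conjunctive exponent (more negative r) pulls the aggregate toward the weakest component, which is how LSP models mandatory requirements.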
(2) Survey of Individual Systems
--------------------------------
This survey highlights the strongest and the weakest components of
all evaluated systems. In particular, the survey includes lists of
the weakest components that are primary candidates for improvement.
This is an analysis of relative performance; for high-quality
systems, the weakest component can still satisfy a substantial
percentage of the user's requirements. Therefore, improvements are not
equally urgent for all systems. They are primarily needed for systems
having a relatively low global preference.
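The per-system weak-component lists that follow can be derived mechanically: a component is weak when its elementary preference E[%] falls below the system's global preference, and the list is sorted weakest first. A minimal sketch, using a few of Google's entries as sample data (the "999" entry is hypothetical, added only to show a component above the threshold):

```python
# Sample elementary preferences E[%] for Google (subset of the full table).
elementary = {
    "233": ("FAQ", 62.50),
    "1271": ("Page created time", 66.67),
    "126": ("Broken link filter", 66.67),
    "423": ("Recall", 87.88),
    "424": ("Specificity", 92.93),
    "999": ("Hypothetical component above threshold", 97.00),
}

GLOBAL = 94.89  # Google's global preference

def weak_components(elem, global_pref):
    """Components rated below the global preference, weakest first."""
    weak = [(cid, name, e) for cid, (name, e) in elem.items() if e < global_pref]
    return sorted(weak, key=lambda t: t[2])

for cid, name, e in weak_components(elementary, GLOBAL):
    print(f"{cid:>5} {e:6.2f} {name}")
```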
Google
This system satisfies 94.89% of user's requirements. The best subsystem
of Google is Usability.
The best subsystem satisfies 96.95% of specified requirements.
The weakest subsystem of Google is Functionality.
The weakest subsystem satisfies 93.79% of specified requirements.
Weak components of this system are components that are rated
below the global preference. These are components that primarily
need improvement. Following is the sorted list of weak components,
starting with the weakest component:
ID X E[%] Elementary criterion
-------------------------------------------------------------
233 30.00 62.50 FAQ
1271 4.00 66.67 Page created time
126 4.00 66.67 Broken link filter
11162 3.00 77.78 Excluding Search (using ‘-‘ operator)
1222 3.00 77.78 Casino content filter
133 3.00 77.78 Movie time
1231 3.00 77.78 Pages related to same content
3132 3.00 86.09 Time to evaluate top N result
423 88.00 87.88 Recall
32 76.00 88.80 Resource Consumption
11163 2.00 88.89 Combinational Search (using ‘*’ operator)
1233 2.00 88.89 Linked page
1241 2.00 88.89 RSS support pages
11211 2.00 88.89 Filename search
11212 2.00 88.89 Image link search
1272 2.00 88.89 Recent update time
1281 2.00 88.89 Location of searching user [i.e. weather forecast]
1282 2.00 88.89 Location of country
11223 2.00 88.89 Content Controlled Search
134 2.00 88.89 Sport score
135 2.00 88.89 Stock price
136 9.00 88.89 Literature [books,..]
2132 2.00 88.89 Customization of page rank
223 2.00 88.89 Availability of cached result
232 2.00 88.89 Manual user-guide
1123 2.00 88.89 Audio Searching
1212 2.00 88.89 Duplicate Content
1221 2.00 88.89 Adult content filter
413 2.00 88.89 Coverage of user need
11141 2.00 88.89 Search numeric expression in documents
421 90.00 89.90 Accuracy
1113 46.00 90.38 Search with different languages
424 93.00 92.93 Specificity
Yahoo
This system satisfies 88.18% of user's requirements. The best subsystem
of Yahoo is Performance.
The best subsystem satisfies 90.87% of specified requirements.
The weakest subsystem of Yahoo is Reliability.
The weakest subsystem satisfies 85.05% of specified requirements.
Weak components of this system are components that are rated
below the global preference. These are components that primarily
need improvement. Following is the sorted list of weak components,
starting with the weakest component:
ID X E[%] Elementary criterion
-------------------------------------------------------------
11142 4.00 0.00 Compute numeric expression
233 28.00 57.50 FAQ
1222 4.00 66.67 Casino content filter
421 75.00 74.75 Accuracy
1244 3.00 75.00 field to search keyword [title,text,URL,Link]
32 142.00 75.60 Resource Consumption
1231 3.00 77.78 Pages related to same content
1241 3.00 77.78 RSS support pages
1243 3.00 77.78 Numeric range filter
11163 3.00 77.78 Combinational Search (using ‘*’ operator)
126 3.00 77.78 Broken link filter
1271 3.00 77.78 Page created time
1281 3.00 77.78 Location of searching user [i.e. weather forecast]
137 8.00 77.78 Maps
221 8.00 77.78 Result visibility
11164 3.00 77.78 Optional Search (using ‘OR’ operator)
1212 3.00 77.78 Duplicate Content
413 3.00 77.78 Coverage of user need
11162 3.00 77.78 Excluding Search (using ‘-‘ operator)
423 78.00 77.78 Recall
422 80.00 79.80 Precision
1113 41.00 80.77 Search with different languages
424 85.00 84.85 Specificity
3132 3.00 86.09 Time to evaluate top N result
125 26.00 86.67 File specific filter [pdf, word, excel sheet,..]
bing
This system satisfies 89.78% of user's requirements. The best subsystem
of bing is Performance.
The best subsystem satisfies 97.22% of specified requirements.
The weakest subsystem of bing is Reliability.
The weakest subsystem satisfies 86.18% of specified requirements.
Weak components of this system are components that are rated
below the global preference. These are components that primarily
need improvement. Following is the sorted list of weak components,
starting with the weakest component:
ID X E[%] Elementary criterion
-------------------------------------------------------------
233 30.00 62.50 FAQ
11142 20.00 66.67 Compute numeric expression
11163 3.00 77.78 Combinational Search (using ‘*’ operator)
1212 3.00 77.78 Duplicate Content
1222 3.00 77.78 Casino content filter
1243 3.00 77.78 Numeric range filter
126 3.00 77.78 Broken link filter
1271 3.00 77.78 Page created time
11162 3.00 77.78 Excluding Search (using ‘-‘ operator)
413 3.00 77.78 Coverage of user need
423 82.00 81.82 Recall
1113 42.00 82.69 Search with different languages
422 85.00 84.85 Precision
421 85.00 84.85 Accuracy
3132 3.00 86.09 Time to evaluate top N result
424 88.00 87.88 Specificity
11111 2.00 88.89 Search with one keyword
1221 2.00 88.89 Adult content filter
11141 2.00 88.89 Search numeric expression in documents
1231 2.00 88.89 Pages related to same content
1233 2.00 88.89 Linked page
1241 2.00 88.89 RSS support pages
11112 2.00 88.89 Search with group of keywords
11153 2.00 88.89 Stemming Search
11161 2.00 88.89 Including Search (using ‘+’ operator)
1272 2.00 88.89 Recent update time
1281 2.00 88.89 Location of searching user [i.e. weather forecast]
1282 2.00 88.89 Location of country
131 2.00 88.89 Weather
132 2.00 88.89 Blog
133 2.00 88.89 Movie time
135 2.00 88.89 Stock price
136 9.00 88.89 Literature [books,..]
137 9.00 88.89 Maps
211 9.00 88.89 Interface visibility
2132 2.00 88.89 Customization of page rank
221 9.00 88.89 Result visibility
222 9.00 88.89 Accessibility of results
223 2.00 88.89 Availability of cached result
232 2.00 88.89 Manual user-guide
11121 2.00 88.89 Search exact statement
11122 2.00 88.89 Search parts of the statement
412 2.00 88.89 High rank pages results
11211 2.00 88.89 Filename search
11212 2.00 88.89 Image link search
11213 2.00 88.89 Adjacent text search
11223 2.00 88.89 Content Controlled Search
1123 2.00 88.89 Audio Searching
Ask
This system satisfies 77.35% of user's requirements. The best subsystem
of Ask is Usability.
The best subsystem satisfies 85.55% of specified requirements.
The weakest subsystem of Ask is Performance.
The weakest subsystem satisfies 63.49% of specified requirements.
Weak components of this system are components that are rated
below the global preference. These are components that primarily
need improvement. Following is the sorted list of weak components,
starting with the weakest component:
ID X E[%] Elementary criterion
-------------------------------------------------------------
11142 6.00 9.09 Compute numeric expression
1113 6.00 10.42 Search with different languages
32 360.00 32.00 Resource Consumption
233 27.00 55.00 FAQ
1271 4.00 66.67 Page created time
1281 4.00 66.67 Location of searching user [i.e. weather forecast]
1222 4.00 66.67 Casino content filter
126 4.00 66.67 Broken link filter
413 4.00 66.67 Coverage of user need
137 7.00 66.67 Maps
421 70.00 69.70 Accuracy
altavista
This system satisfies 78.42% of user's requirements. The best subsystem
of altavista is Performance.
The best subsystem satisfies 96.62% of specified requirements.
The weakest subsystem of altavista is Reliability.
The weakest subsystem satisfies 60.97% of specified requirements.
Weak components of this system are components that are rated
below the global preference. These are components that primarily
need improvement. Following is the sorted list of weak components,
starting with the weakest component:
ID X E[%] Elementary criterion
-------------------------------------------------------------
11142 4.00 0.00 Compute numeric expression
1113 2.00 2.08 Search with different languages
412 7.00 33.33 High rank pages results
411 6.00 44.44 Popular pages results
233 28.00 57.50 FAQ
1222 4.00 66.67 Casino content filter
1281 4.00 66.67 Location of searching user [i.e. weather forecast]
1282 4.00 66.67 Location of country
137 7.00 66.67 Maps
421 75.00 74.75 Accuracy
1244 3.00 75.00 field to search keyword [title,text,URL,Link]
125 23.00 76.67 File specific filter [pdf, word, excel sheet,..]
1243 3.00 77.78 Numeric range filter
11122 3.00 77.78 Search parts of the statement
11111 3.00 77.78 Search with one keyword
126 3.00 77.78 Broken link filter
1271 3.00 77.78 Page created time
11112 3.00 77.78 Search with group of keywords
11162 3.00 77.78 Excluding Search (using ‘-‘ operator)
132 3.00 77.78 Blog
133 3.00 77.78 Movie time
135 3.00 77.78 Stock price
136 8.00 77.78 Literature [books,..]
11163 3.00 77.78 Combinational Search (using ‘*’ operator)
2132 3.00 77.78 Customization of page rank
221 8.00 77.78 Result visibility
11211 3.00 77.78 Filename search
11121 3.00 77.78 Search exact statement
1231 3.00 77.78 Pages related to same content
413 3.00 77.78 Coverage of user need
1241 3.00 77.78 RSS support pages
423 78.00 77.78 Recall
9.2 DETAILED EVALUATION RESULTS FOR THE SearchEngine PROJECT
[SearchEngine.lst]
Competitive System(s):
1. Google
2. Yahoo
3. bing
4. Ask
5. altavista
PERFORMANCE VARIABLES:
11111. Search with one keyword
11112. Search with group of keywords
11121. Search exact statement
11122. Search parts of the statement
11123. Ignore stop words
11124. Case sensitive Search
1113. Search with different languages
11141. Search numeric expression in documents
11142. Compute numeric expression
11151. Abbreviation Search
11152. Synonyms Search
11153. Stemming Search
11154. Misspell Corrected Search
11161. Including Search (using ‘+’ operator)
11162. Excluding Search (using ‘-‘ operator)
11163. Combinational Search (using ‘*’ operator)
11164. Optional Search (using ‘OR’ operator)
11211. Filename search
11212. Image link search
11213. Adjacent text search
11221. Popularity based Search
11222. Content based Search
11223. Content Controlled Search
1123. Audio Searching
1211. Pages with Spam, doorway
1212. Duplicate Content
1221. Adult content filter
1222. Casino content filter
1231. Pages related to same content
1232. Pages related to same website
1233. Linked page
1241. RSS support pages
1242. Usage rights
1243. Numeric range filter
1244. field to search keyword [title,text,URL,Link]
125. File specific filter [pdf, word, excel sheet,..]
126. Broken link filter
1271. Page created time
1272. Recent update time
1281. Location of searching user [i.e. weather forecast]
1282. Location of country
131. Weather
132. Blog
133. Movie time