This talk summarizes the work I have been doing over the last 13 years on modeling user behavior in Web1.0 and Web2.0 systems.
Talk given at a workshop on Cognitive Modeling in Utrecht, Netherlands on March 20, 2010.
Using Information Scent to Model Users in Web1.0 and Web2.0
1. Modeling of Web Users from Web1.0 to Web2.0 Ed H. Chi, Principal Scientist and Area Manager Augmented Social Cognition Area Palo Alto Research Center Image from: http://www.flickr.com/photos/ourcommon/480538715/ 2010-03-20 Utrecht CogModeling
6. Analogy to Optimal Foraging: information is to information foragers what energy is to food foragers.
9. Spreading activation, illustrated with chunks i = bread and linked chunks j = butter, sandwich, flour:

A_i = B_i + Σ_j W_j · S_ji

Activation of chunk i = base-level activation of chunk i + activation spread from linked chunks j.

B_i = log( Pr(i) / Pr(not i) )        (log likelihood of i occurring)
S_ji = log( Pr(j|i) / Pr(j|not i) )   (log likelihood of i occurring with j)

Base-level activation reflects the log likelihood of events in the world; strength of spread reflects the log likelihood of event co-occurrence.
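These log-odds formulas can be sketched numerically. A minimal sketch; all probabilities and weights below are hypothetical, chosen only to illustrate the bread/butter example:

```python
import math

def base_level(p_i: float) -> float:
    """B_i = log(Pr(i) / Pr(not i)): log odds of chunk i occurring."""
    return math.log(p_i / (1.0 - p_i))

def strength(p_j_given_i: float, p_j_given_not_i: float) -> float:
    """S_ji = log(Pr(j|i) / Pr(j|not i)): log likelihood ratio of co-occurrence."""
    return math.log(p_j_given_i / p_j_given_not_i)

def activation(b_i: float, links: list) -> float:
    """A_i = B_i + sum over j of W_j * S_ji, with links = [(W_j, S_ji), ...]."""
    return b_i + sum(w * s for w, s in links)

# Hypothetical numbers: "bread" occurs 20% of the time;
# "butter" is six times likelier in contexts containing "bread".
b = base_level(0.2)
s_butter = strength(0.6, 0.1)
a = activation(b, [(0.5, s_butter)])   # spreading raises activation above B_i
```

The positive log likelihood ratio for "butter" raises the activation of "bread" above its base level, which is exactly the associative boost the slide describes.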
11. WUFIS: Web User Flow by Information Scent. Inputs: a user information goal and a web site (web pages, content, links). A web user flow simulation produces predicted paths.
12. InfoScent: how does it work? Start users at a page with some goal, flow users through the network, and examine the resulting user patterns. Scent values act as transition probabilities.
13. InfoScent Simulation. (1) From the query, compute the relevant documents (R = relevant documents). (2) Combine R with the topology matrix T into a weight matrix, then normalize to probabilities to obtain the scent matrix. (3) With the scent matrix, perform spreading activation.
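The normalize-then-flow steps above can be sketched in a few lines. A minimal sketch assuming a tiny hypothetical three-page site whose scent weights (topology weighted by relevance to the query) are made up for illustration:

```python
def normalize_rows(weights):
    """Turn a scent-weighted adjacency matrix into transition probabilities."""
    probs = []
    for row in weights:
        total = sum(row)
        probs.append([w / total if total else 0.0 for w in row])
    return probs

def spread(activation, probs, steps):
    """Flow user activation through the link network for a number of steps."""
    for _ in range(steps):
        nxt = [0.0] * len(activation)
        for i, a in enumerate(activation):
            for j, p in enumerate(probs[i]):
                nxt[j] += a * p
        activation = nxt
    return activation

# Hypothetical site: page 0 links to pages 1 and 2; scent favors page 2.
scent_weights = [[0.0, 1.0, 3.0],
                 [0.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0]]
users = spread([1.0, 0.0, 0.0], normalize_rows(scent_weights), 1)
# users -> [0.0, 0.25, 0.75]: most simulated users follow the high-scent link
```

Starting all users on page 0 and flowing one step splits them 25/75 across the two links, which is the "scent values as transition probabilities" idea from the previous slide.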
16. Bloodhound Project. Input: starting point www.xerox.com; task: look for "high end copiers". Output: usability metrics.
24. IUNIS: Inferring User Need by Info Scent. The counterpart to WUFIS: from observed paths through the web site (web pages, content, links), the web user flow simulation infers the user's information goal.
29. Web page with highlighted link anchors. Partial information goal: "remote diagnostic technology"; remainder of information goal: "speed >= 75". Example anchors on the page: 62 copies/min., 92 copies/min.
31. Results of user study (times capped at five minutes): 10 of 12 subjects preferred ScentTrails to both searching and browsing.
33. ScentHighlight. The user first types search keywords (e.g., "anthrax symptoms"); the system conceptually highlights any relevant passages and keywords to draw the user's attention.
43. Using Information Theory to Model Social Tagging [Ed H. Chi, Todd Mytkowicz, ACM Hypertext 2008]. Channel model: users encode documents (concepts, topics) into tags T_1 ... T_n; readers decode the tags back to documents, subject to noise.
50. TagSearch: use semantic analysis to reduce noise (http://mrtaggy.com). Example semantic similarity graph of related tags: guide, web, howto, tips, help, tools, tip, tricks, tutorial, tutorials, reference.
52. Understanding a new area: characterization, models, prototypes, evaluations.
53. MrTaggy.com: social search browser with social bookmarks. Joint work with Rowan Nairn and Lawrence Lee. Kammerer, Y., Nairn, R., Pirolli, P., and Chi, E. H. 2009. Signpost from the masses: learning effects in an exploratory social tag search browser. In Proceedings of the 27th International Conference on Human Factors in Computing Systems (Boston, MA, USA, April 4-9, 2009). CHI '09. ACM, New York, NY, 625-634.
67. TagSearch: exploratory focus. Three kinds of search:
- Navigational (28%): you know what you want and where it is; existing search engines are OK.
- Transactional (13%): you know what you want to do; existing search engines are OK.
- Informational (59%): you roughly know what you want but don't know how to find it; difficult for existing search engines, and therefore the opportunity.
Editor's notes
Title: Modeling of Web Users from Web1.0 to Web2.0. Abstract: In this talk, I will provide a perspective on how information scent techniques have let us characterize and model individual web surfers in the Web1.0 world, and how we used those techniques to build applications and systems. Then I will present some ideas on how we might bridge these ideas to the Web2.0 world by modeling groups of users of Web2.0 systems.
Example: media news is fresh. With the right interest, users have a high probability of following that piece of information. Hunters' strategies maximize the benefit per unit cost of pursuing the prey; information gatherers do exactly the same thing.
Statistically, a correlation coefficient above 0.8 is generally considered strong, between 0.5 and 0.8 moderate, and below 0.5 weak. Twelve of the 32 tasks correlated strongly, and seventeen correlated moderately.
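These thresholds can be sketched as a small helper alongside a plain Pearson correlation; the sample task times below are hypothetical, purely for illustration:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def strength_label(r):
    """Classify |r| using the thresholds from the notes above."""
    r = abs(r)
    if r > 0.8:
        return "strong"
    if r >= 0.5:
        return "moderate"
    return "weak"

# Hypothetical predicted vs. observed task times for four tasks:
r = pearson_r([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.9])
# strength_label(r) -> "strong"
```

One judgment call in the sketch: the notes leave the exact boundary at 0.8 ambiguous, so the helper counts 0.8 itself as moderate.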
Using our technology, once you tell the web site your special requirements, each virtual aisle of the site is pre-highlighted according to your request, making it easier for you to shop.
In the enterprise, these have become the standard set of Web 2.0 tools in practice. They have several benefits: they can be set up by end users without needing IT, and they have familiar UIs from their consumer versions. In terms of knowledge sharing, an important advantage these tools have over traditional KM systems is that knowledge can be captured and archived through the act of communication, without requiring extra work by users. These tools will become increasingly important in the office as younger people enter the workforce and expect to be able to use them.
There are really two facets of tagging. The first is encoding: you encounter a document, read or skim it, and have to generate a few words that describe it. The second is decoding, or retrieval: you find a new document that has several tags attached to it, and you read those tags and the document; the tags may give you an idea of what the document is about. I will come back to this distinction later.
Vocabulary saturation! The data show a marked increase in the entropy of the tag distribution H(T) up until week 75 (mid-2005), at which point the entropy measure hits a plateau. Since the total number of tags keeps increasing, tag entropy can only stay constant in the plateau if the tag probability distribution becomes less uniform. This suggests that users are having a hard time coming up with "unique" tags; that is, a user is more likely to add a tag to del.icio.us that is already popular in the system than to add a tag that is relatively obscure.
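The tag entropy H(T) in this analysis is ordinary Shannon entropy over tag frequencies. A minimal sketch; the tag counts are hypothetical, standing in for a skewed popular-tags-plus-long-tail distribution:

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy, in bits, of a frequency distribution."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

# Hypothetical tag usage: a few popular tags dominate a long tail.
tags = ["web"] * 50 + ["ajax"] * 30 + ["css"] * 15 + ["xslt"] * 5
h_t = entropy(Counter(tags).values())
```

A uniform distribution over these four tags would give 2 bits; the skew toward popular tags pulls h_t below that, which is the "less uniform distribution" effect described above.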
Perhaps the most telling data of all is the entropy of documents conditional on tags, H(D|T), which is increasing rapidly (see Figure 4). This means that, even after knowing the value of the tags completely, the entropy of the document is still increasing. Conditional entropy asks the question: "Given that I know a set of tags, how much uncertainty remains about the document set I was referencing with those tags?" This measure gives us a method for analyzing how useful a set of tags is at describing a document set. The fact that this curve is strictly increasing suggests that the specificity of any given tag is decreasing. That is to say, as a navigation aid, tags are becoming harder and harder to use. We are moving closer and closer to the proverbial "needle in a haystack," where any single tag references too many documents to be considered useful.
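H(D|T) can be estimated directly from (tag, document) observation pairs. A minimal sketch; the "early" and "late" bookmark samples are hypothetical, contrived to show a tag losing specificity:

```python
import math
from collections import Counter

def conditional_entropy(pairs):
    """Estimate H(D|T), in bits, from (tag, document) observation pairs."""
    tag_counts = Counter(t for t, _ in pairs)
    pair_counts = Counter(pairs)
    n = len(pairs)
    h = 0.0
    for (t, d), c in pair_counts.items():
        p_td = c / n                      # joint probability Pr(t, d)
        p_d_given_t = c / tag_counts[t]   # conditional probability Pr(d | t)
        h -= p_td * math.log2(p_d_given_t)
    return h

# Hypothetical: early on, "ajax" always meant one document;
# later, the same tag points at four different documents.
early = [("ajax", "doc1"), ("ajax", "doc1")]
late = [("ajax", "doc1"), ("ajax", "doc2"),
        ("ajax", "doc3"), ("ajax", "doc4")]
h_early = conditional_entropy(early)   # 0 bits: the tag pins down the doc
h_late = conditional_entropy(late)     # 2 bits: four equally likely docs
```

The rise from 0 to 2 bits is exactly the "needle in a haystack" trend: knowing the tag tells you less and less about which document was meant.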
Figure 6 shows the number of tags per bookmark over time. The trend is clearly increasing, complementing the increase in navigation difficulty.
We introduce a technique for creating novel, textually-enhanced thumbnails of web pages. These thumbnails combine the advantages of image thumbnails and text summaries to provide consistent performance on a variety of tasks. We conducted a study in which participants used three different types of summaries (enhanced thumbnails, plain thumbnails, and text summaries) to search web pages to find several different types of information. Participants took an average of 83 seconds to find the answer to a question. They were approximately 30 seconds faster with enhanced thumbnails than with text summaries, and 19 seconds faster with enhanced thumbnails than with plain thumbnails. Further, performance with enhanced thumbnails was much more consistent than with text summaries or plain thumbnails. In the images shown on this slide, the top row contains plain (scale-reduced) thumbnails of web pages. The bottom row contains thumbnails that have been enhanced in the following way: (1) the fonts in H1 and H2 tags have been modified so that they are readable in the thumbnails; (2) transparent, highlighted callouts have been included for keywords from the search query (appropriate highlighted colors were chosen based on visual attention models); and (3) the contrast level in the thumbnail has been reduced so that the callouts are more prominent and readable.
Informational search, with its ambiguity in the query, is where social search has the most power.