User Guide
       Updated April 3, 2011
Living Document for Sweeper v0.3
     http://swiftly.org/userguide
Table of Contents
   I. Introduction
   II. Using this Living Document
   III. About the Sweeper Application
        Suggested Uses
                As a FeedReader
                For Passive Data-Processing
                For Active Content Filtering
                For Real-time Social Media Curation
                As a Vertical Content Dashboard
        Terminology
   IV. Explaining the Sweeper UI
        Analytic Dashboard
        Main Content Window
        Admin Panel
        View Tabs
        Filter Panel
        Refresh Staging Area
        Rating Panel
        Content Items
   V. Overview of Plugins
        Duplicate Content Filter
        Google Language Services
        Geo-Location (Yahoo)
        Tagging
        Ushahidi Push
        Tag Clustering
        Annotations*
        Quiver/Bookmarking*
   VI. Adding Sources
        Email (IMAP)
        Email (Gmail)
        FrontlineSMS
        News & Blog Search
        RSS/ATOM
        Flickr
        SMS Gateways
        Twitter
I. Introduction

Thanks for using the Sweeper application! Sweeper is meant to be fairly intuitive, but we’re well aware that
it can be a little overwhelming at first to get started and to know what’s possible. In this guide we
will walk you through using Sweeper and a handful of the native plugins. This is not a guide for installing
it (for that look here); rather, this guide will walk you through use of the Sweeper software and the various
plugins for it. If you are a developer seeking information on how to develop plugins, parsers or other
modules for Sweeper and other SwiftRiver applications, click here.
II. Using this Living Document

Because Sweeper is an open-source product whose code and feature-set change quite frequently, this
user guide is a living document that serves only as a snapshot of what’s possible at the time it was last
updated. We invite you to revisit this link often. If you decide to print it, just be aware that as soon as it’s
transferred from bits to pulp, it has essentially become outdated.

Likewise, any copy of this document that is distributed in PDF, DOC or FLV form is likely outdated.
To ensure you have the latest version, visit http://swiftly.org/userguide/
III. About the Sweeper Application

Sweeper is an application that focuses on the aggregation, curation and filtering of real-time content.
It assumes the user knows exactly what sources they are tracking but needs an application to help
them prioritize their attention. By way of comparison, Sweeper is sort of like an open-source version of
TweetDeck or, to use a Google analogy, Google Reader. The user defines a number of sources to track
and Sweeper offers a number of ways of filtering and viewing that collected content.



Suggested Uses

What can Sweeper be used for? A number of things, but here are a few ideas...

        As a FeedReader
        Sweeper was designed for collecting large amounts of disparate real-time data and sweeping
        through it quickly and efficiently, while also doing things to that content. So there is an emphasis
        on speed and summation of large datasets, allowing the user to decide upon where to spend his
        or her time to delve deeper.

        As mentioned in the examples above, one might consider using Sweeper as a substitute for
        a traditional feed-reader. However, unlike most feed-readers there are no restrictions on the
        type of data that can be aggregated, and there are smart triggers applied to data going out, ex. if I
        perform this function, content is affected in this way. This functionality can be useful for setting up
        really advanced conditional tasking, which we’ll cover later.


        For Passive Data-Processing
        Sweeper can also be configured to be a passive filter for data, meaning you can set it to
        aggregate content, then automatically perform certain tasks around that. ex. Aggregate all tweets
        from #hashtag tagged in the state of Maine and send only that data to another platform.

        When used in this way, Sweeper essentially becomes a smart cron tool equipped with
        geo-tagging, natural language processing and other powerful contextual features.
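
The passive example above (tweets from a hashtag, geo-tagged in Maine, forwarded elsewhere) could be sketched roughly as follows. This is only an illustration in Python, not Sweeper's own code: the `ContentItem` class, its `region` field and the `passive_filter` and `forward` functions are all hypothetical names, and we assume a geo-location plugin has already filled in the region.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """Hypothetical stand-in for a content item; not Sweeper's real model."""
    text: str
    channel: str
    tags: set = field(default_factory=set)
    region: str = ""  # assumed to be filled in earlier by a geo-location plugin


def passive_filter(items, hashtag, region):
    """Keep only items that contain the hashtag as a word and match the region."""
    wanted = hashtag.lower()
    return [
        item for item in items
        if wanted in item.text.lower().split() and item.region == region
    ]


def forward(items, send):
    """Pass each selected item to another platform via a caller-supplied function."""
    for item in items:
        send(item)
    return len(items)
```

Here `send` might be anything that accepts one item, for example a function that posts to another service. The hashtag match is a simple whole-word comparison, so a hashtag followed directly by punctuation would be missed in this sketch.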


        For Active Content Filtering
        Users are also provided a number of utilities for quickly searching through content. Clicking on
        a selection of tags allows the user to see only content containing those tags. The cluster panel
        allows content to be clustered around other content in various channels that are similar. The user
        can also sort by assigned scores (which can represent the favor they might have for some types
        of content over others) in any variation between 1 and 100, ex. show me only the content with a
        score of 40 or above; or only content between 20 and 60.


        For Real-time Social Media Curation
        Sweeper can be used for real-time media curation across channels (Blogs, News, RSS/ATOM,
        Twitter, SMS, Email) and across over 50 languages. For a journalist attempting to collect data
        that’s rapidly unfolding across social media, this can save potentially unprecedented amounts of
        time. Rather than opening 50 different windows for different apps, the Sweeper application can be
        used to mine and add context to disparate content, completely at the user’s whim. Perhaps even
        more interestingly, all this aggregated data can be annotated, mapped, shared or exported in a
        number of ways after it’s been structured as the user sees fit.


        As a Vertical Content Dashboard
        Perhaps you have a need to know what’s going on across various industries at all times. You
        could enter the feeds of several well-known bloggers, the @twitternames of thought leaders in
        that industry, a public-facing email address you control like sports@mynewsite.com, and a
        public-facing shortcode (ex. 6060). That might just be your sports page. But when you replicate that
        experience multiple times across Entertainment, World News, Food, Lifestyle etc. you end up with
        an equally rich, immersive real-time data-mining tool across all those interests.




Terminology

Before we continue, it will help if you have a basic understanding of the terminology we use to discuss the
application.

Sweeper (capital ‘S’) - the name of a SwiftRiver application for aggregating and processing feeds of
content
sweeper (lowercase ‘s’) - generally, one who performs the function of sweeping through feeds of
content. However, in the Sweeper application the user role of sweeper is assigned to users who can edit
tags and process content but who don’t have administrative rights to the application.
sweep - to process data
channel - the distribution type used to deliver content. Twitter, Email, RSS/ATOM, SMS are all channels.
source - the place (or person) from which content originates. A person’s @twittername, email address,
blog or web url, or phone number would all be considered sources. Several sources may be collected to
reference a single identity, ex. this blog, this url, this phone number all belong to the same person
content item - a single item of content collected from a feed, regardless of the channel it came in on or
the source it came from
tag - a layer of taxonomy applied to all content
lat/lon - geospatial coordinates; short for latitude and longitude
veracity - more accurately the subjective favor the user (or users) has for content. The baseline of favor
expressed for certain types of content is used as a building block for a score applied to content. This
score is then used both for prioritizing sources and for recommending other content the user or users may
favor.
cluster - a collection of content items deemed to be statistically similar based on tags
editors - editors don’t have full administrative rights to the application but they can perform tasks that
sweepers can’t.
turbine - another word for plugins for SwiftRiver applications
impulse turbine - plugins that pre-process content (before the application receives it). Impulse Turbine
plugins affect how data is structured as part of the Swift object module.
reactor turbine - plugins that process content based on human interaction or assigned logic (after
the application has received it). Reactor Turbine plugins can be used to take structured data and do
something with it.
parsers - on the application architecture level parsers are modules that can be written to create new
sources
trusted source - applies a default score of 100 to a source, allowing the user to vote against a high score
as the default. ex. you have my trust now but could lose it over time
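
To make a few of these terms concrete, here is a minimal Python sketch of how a source and a content item might relate. It is purely illustrative: the class and field names are our own assumptions, not Sweeper's actual data model, and treating an unrated source as having no score yet is also an assumption. The only value taken from the definitions above is the default score of 100 for a trusted source.

```python
from dataclasses import dataclass, field
from typing import Optional

TRUSTED_DEFAULT_SCORE = 100  # per the "trusted source" definition above


@dataclass
class Source:
    """Hypothetical source: an @twittername, email address, URL or phone number."""
    identifier: str
    channel: str                 # e.g. "Twitter", "Email", "RSS/ATOM", "SMS"
    trusted: bool = False
    score: Optional[int] = None  # assumption: None means "not yet rated"

    def __post_init__(self):
        # A trusted source starts at the top and can only lose favor from there.
        if self.trusted and self.score is None:
            self.score = TRUSTED_DEFAULT_SCORE


@dataclass
class ContentItem:
    """Hypothetical content item: one piece of content from one source."""
    text: str
    source: Source
    tags: set = field(default_factory=set)
```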
IV. Explaining the Sweeper UI




So now that we’ve got the basics, we can walk you through the Sweeper user interface, its basic features
and functions. At first look the application can be a little intimidating so hopefully this guide takes the
edge off (like a martini!).
Analytic Dashboard

This dashboard offers a quick survey of the content being collected by Sweeper. Where is data mostly
being collected from? How much content in total? How much from each channel? The charts are
dynamic and update with each use of the application.
Main Content Window

Below you see the main content display window. This is where aggregated content can be viewed.
Admin Panel




This area contains five tabs: Login, Impulse Turbines, Reactor Turbines, Sources, Add User

       Login - as you might expect, this area allows users to login to the application
       Impulse Turbine - for enabling or disabling impulse turbine plugins
       Reactor Turbine - for enabling or disabling reactor turbine plugins
       Sources - this is the area where one can add sources to aggregate into Sweeper
       Users - area for adding users and assigning their administrative rights
View Tabs




This area contains several tabs for altering the view of the main content window. The titles are fairly
self-explanatory: Dashboard, New content, Accurate, Inaccurate, Crosstalk, Irrelevant

        Dashboard - contains a collection of charts plotting various aspects of the content being
        collected
        New content - for viewing new content as it’s being collected
        Accurate - shows all content voted up
        Inaccurate - shows all content voted down
        Crosstalk - shows content that is completely off-topic
        Irrelevant - shows content that is on-topic but not relevant to the user’s specific needs
Filter Panel




Filters for changing the view of the main content window.

        Veracity Slider - allows the user to set a range of anything between 1 and 100 to view content by
        assigned score
        Channels - view only the content that came in on a particular channel
        Tags - view only the content containing a selection of tags
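
The three filters above can be thought of as conditions that are all applied together. The Python sketch below shows one way this might work. It is an illustration rather than Sweeper's implementation: the dictionary keys (`score`, `channel`, `tags`) are hypothetical, and we assume that selecting several tags means an item must carry all of them.

```python
def filter_items(items, min_score=1, max_score=100, channel=None, tags=None):
    """Return the items that pass the score range, channel and tag filters.

    Each item is a dict with hypothetical keys "score", "channel" and "tags".
    """
    wanted_tags = set(tags or [])
    visible = []
    for item in items:
        if not (min_score <= item["score"] <= max_score):
            continue  # outside the veracity slider range
        if channel is not None and item["channel"] != channel:
            continue  # came in on a different channel
        if not wanted_tags.issubset(item["tags"]):
            continue  # missing at least one selected tag (assumed "all of" semantics)
        visible.append(item)
    return visible
```

For example, `filter_items(items, min_score=20, max_score=60)` would correspond to setting the slider to show only content scored between 20 and 60.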



Refresh Staging Area




Reveals how much content has been aggregated since the main content window was last refreshed.
Rating Panel




The upper left part of the Rating Panel is for quickly determining information about content. Is this
a ‘trusted’ source or has it been rated as trusted by the people within your bounded (or unbounded) group
of users?

The upper right quadrant shows a score that represents the favor the user or their community has for the
associated source.

In the lower quadrant we have four buttons. Here is what they essentially do:

        Green (Up) - expresses favor for a content item while positively affecting its source’s score so
        that in the future content from the same source will be prioritized.
        Red (Down) - expresses disapproval for a content item while negatively affecting its source’s
        score so that in the future content from the same source will be deprioritized.
        Crosstalk - expresses that this content is not relevant because it’s essentially been collected by
        mistake and that it’s not useful. Removes it from the main view without negatively affecting the
        source score.
        Irrelevant - expresses that this content is not germane to the task the user is trying to perform
        and more importantly, is somehow damaging or distracting. Removes the content from the main
        view with negatively affecting the source score.

It’s important to note that these votes, whether up or down, are not the only things being factored into the
scoring of content. We also factor in a number of things like the tag profile of content, the ratings of the
individual users rating this individual, and other factors. For an in-depth explanation see the RiverID
System Guide.
Content Items




Content items are divided into three sub-sections: the Header, the Body and the Footer.

In the Header you’ll find an icon denoting what channel this content came in on: Twitter, Email, SMS, or
RSS/Atom. Clicking this icon will reveal more:




A pop-up display reveals information about the source and the content itself:

        Source - the source of the content (a Twitter @name, email address, url or phone number)
        Channel - the channel the content came in on (Twitter, Email, SMS, or RSS/Atom)
        Source Score - the trust score associated with this source
        Link - hyperlink to the original content
In the Body you’ll find a portion of the message (from Twitter and SMS) or headline/subject (Articles,
Blogs, Email)




In the Footer you’ll find tags which add a layer of taxonomy to the content. You can quickly find other
content like this particular content item by clicking on the tags themselves. Users can also add their own
tags*, edit tags* or delete tags to help the system improve**.

* Adding tags and editing tags is not possible in v0.3.0 of the Sweeper UI. However, a slight modification of the code exposes this
feature and makes it available.

** There is an active learning element of our Tagging API that allows the system to learn from user feedback that will be available
soon. You can read more about this in the section on Impulse Reactor Plugins.
V. Overview of Plugins

There are a few plugins that ship with Sweeper and that are either enabled by default or commonly used.
There are way too many to list here so in this section we’ll explain what a few of the available plugins are
and what they are used for.

You can always find more plugins for Swiftly applications at http://plugins.swiftly.org



Duplicate Content Filter




When activated, this plugin passes all content through the Duplication Filter API in the Swift Web Service
stack, effectively removing all duplicate content (like retweets) from a feed.
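
The Duplication Filter API itself is not documented in this guide, so the following Python sketch only illustrates the general idea of removing retweet-style duplicates. The normalization rules (lowercasing, stripping leading "RT @user:" prefixes, collapsing whitespace) and the function names are our own assumptions, not the behavior of the actual service.

```python
import re


def normalize(text):
    """Reduce a message to a comparable form (hypothetical rules)."""
    text = text.lower()
    # Drop one or more leading "rt @user:" prefixes.
    text = re.sub(r"^(rt\s+@\w+:?\s*)+", "", text)
    # Collapse runs of whitespace so spacing differences don't matter.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def drop_duplicates(messages):
    """Keep the first occurrence of each message, comparing normalized text."""
    seen = set()
    unique = []
    for message in messages:
        key = normalize(message)
        if key not in seen:
            seen.add(key)
            unique.append(message)
    return unique
```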



Google Language Services




When activated, this plugin passes all content through the Google Translate API. Google Translate will
automatically detect what language the content is in, translate it and send it back. This allows you to
aggregate content in multiple languages but only see the resulting translated, English content! This is a
huge time saver when doing international research.




But how do you know what content has been translated? When activated, additional info in the content
item’s header will let the user know what has been translated, and from what language. See the example
above.

If you expect large amounts of data you may want to opt for the Google Enterprise Language Service
plugin instead. With this plugin the amount of content that can be translated is increased significantly.
It requires an API key from Google. If you need help getting Enterprise level access, contact us at
support@swiftly.org



Geo-Location (Yahoo)




When activated, this plugin passes all content through the Yahoo Placemaker API where we try to detect
a location where the content is likely to have originated from. We then apply lat/lon coordinates to the
content that are then stored as part of the content meta info. When passed to other systems, this lat/lon
info can be used for geo-spatial reference.

To use this service, you’ll need to acquire a Yahoo Placemaker API key from Yahoo. If you need help
getting Enterprise level access, contact us at support@swiftly.org



Tagging




When activated, all content passing through Sweeper will be tagged by our natural language processing
API. Essentially this service tries to extract what it thinks are the active keywords being used, and uses
them to help the user automatically sort content.

Tags are very important to SwiftRiver and we take a dual taxonomic and folksonomic approach in our
applications. That is, although these tags are machine generated, they can be edited and improved
upon by humans, which in turn helps to teach the algorithm how to tag content better.



Ushahidi Push




For users of Ushahidi or Crowdmap. This will take any content voted up in the Ratings panel and
automatically plot it on a designated Ushahidi deployment map as an approved report. This is a
significant time saver for large groups who want to use Sweeper to curate data, but use Ushahidi or
Crowdmap to visualize it.
Users will need to enter an API key for an Ushahidi deployment that they have administrative rights to.
ex. http://xxx.xxx.xx.xxx/ushahidi/

There are many variants of this plugin. One is called Ushahidi Passive Push and essentially it turns
Sweeper into a cron suite where content is automatically aggregated, structured, and passed along to
Ushahidi...mostly without any human operators!



Tag Clustering




When activated, this plugin allows the user to view content similar to any particular content item. The
clustering is done by using a statistical profile of the associated Tags for proximity matching. This gives
the user more control over alternative recommendation methods, because it can factor in the user’s own
tagging methods. For instance, if I use unique identifiers or words unique to my organization, they too can
be used as part of the proximity matching algorithm!
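
The guide does not say which statistical measure the plugin uses, so as a stand-in the Python sketch below uses Jaccard similarity (shared tags divided by all tags) to show what proximity matching on tags can look like. The function names, the `tags` key and the 0.5 threshold are all assumptions made for illustration.

```python
def jaccard(tags_a, tags_b):
    """Similarity between two tag sets: size of overlap over size of union."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # two untagged items are not treated as similar
    return len(a & b) / len(a | b)


def similar_items(target, items, threshold=0.5):
    """Return items whose tags are close to the target's, most similar first."""
    scored = [
        (jaccard(target["tags"], other["tags"]), other)
        for other in items
        if other is not target
    ]
    matches = [pair for pair in scored if pair[0] >= threshold]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [other for _, other in matches]
```

Because the comparison works on whatever tags are attached, tags added by users (such as identifiers unique to an organization) would take part in the matching in the same way as machine-generated ones.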



Annotations*
Annotations offers the ability to annotate any content item. This can be used to leave individual notes for
reference, or to collaboratively converse around content with your team.



Quiver/Bookmarking*
Quiver is a bookmarklet that allows the ability to quickly collect content from around the web and post it
to your Sweeper deployment (effectively adding them to your quiver). This can be useful for individually
collecting research, or if you have teams of contributors actively recommending content for you to then
apply all our contextual APIs to.

* These features will ship with the forthcoming release of Sweeper.
VI. Adding Sources




To begin using Sweeper at all, one must begin aggregating from predefined sources. Essentially this
is where you inform the system what you want to track. Sweeper currently only accepts inputs that are
updated streams of data - feeds - in XML/ATOM/RSS or JSON format.

To get any content we don’t currently accept into Sweeper, all one would need to do is write a parser, a
few lines of code that tell the application how to structure data coming from that particular feed.
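
To give a feel for what "a few lines of code" might mean, here is a small Python sketch that turns an RSS 2.0 feed into a list of structured content items. It is not the Sweeper parser interface: the function name and the dictionary keys are hypothetical, and a real parser would need to follow the conventions described in the developer documentation.

```python
import xml.etree.ElementTree as ET


def parse_rss(xml_text, channel="RSS/ATOM"):
    """Turn RSS 2.0 text into a list of content-item dicts (hypothetical shape)."""
    root = ET.fromstring(xml_text)
    # The feed's own link stands in for the source of every item in it.
    source = root.findtext("channel/link", default="")
    items = []
    for node in root.iter("item"):
        items.append({
            "channel": channel,
            "source": source,
            "title": node.findtext("title", default=""),
            "link": node.findtext("link", default=""),
        })
    return items
```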

The types of content natively supported are IMAP, Gmail, FrontlineSMS, GoogleNews, any RSS or
Atom feed, Flickr, other SMS gateways and Twitter.



Email (IMAP)




Sweeper will accept the IMAP details of any email account and begin pulling in content allowing you to
aggregate, translate, tag and cluster your email.



Email (Gmail)
Sweeper supports aggregating email from any Gmail account, pulling in content and allowing you to
aggregate, translate, tag and cluster your email. Although Gmail also supports IMAP, the native Gmail
aggregation is recommended.



FrontlineSMS

In combination with FrontlineSMS, Sweeper can become a powerful SMS curation service that
aggregates real-time content (SMS) even if there is no internet connection! There are two ways of
integrating FrontlineSMS with Sweeper: Remote and Local.

The remote option is for users who have access to some type of network, whether via the Internet or just a LAN. Simply
enter the details of the FrontlineSMS deployment you want to pull data from. You will need to use this
in combination with the FrontlineFetch go-between servlet, which can be downloaded from
http://plugins.swiftly.org/?p=51.

The local option requires that the Sweeper deployment and FrontlineSMS be installed on the same machine
or server. This allows the Sweeper application to pull directly from the FSMS database and will work even
if there is no Internet connection.



News & Blog Search




This source module allows you to set up a keyword search, returning real-time search results from
Google News, Posterous, Blogger and Wordpress.com. The results will appear in the main content view,
translated if necessary.



RSS/ATOM
Self-explanatory: simply enter the URL of a feed in the RSS, ATOM 1.0 or ATOM 2.0 service and
Sweeper will begin aggregating that content.



Flickr




This service allows the user to aggregate content from the photo-sharing service Flickr.




The options are fairly simple. Tag Search will return results aggregated from Flickr based on a search
using a specific keyword ex. cats, dogs, Eiffel Tower. Tag Search with Location will only return geo-
tagged results, great when used in combination with a mapping platform like Crowdmap. Follow User is
for only returning the results from a specific user account.



SMS Gateways
We’ve included a generic SMS gateway aggregator. It’s set up to read from the HTTP posts commonly
used by services that don’t have APIs. However, it’s there largely to fork and modify - a head start on
integrating your own SMS service.
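
As a starting point for that kind of modification, the Python sketch below shows how a form-encoded HTTP post from an SMS service might be turned into a content item. The field names `from` and `message` are assumptions, as is the shape of the resulting dictionary; each SMS service posts its own set of fields, so these would need to be adapted.

```python
from urllib.parse import parse_qs


def sms_from_post(body):
    """Build a content-item dict from a form-encoded POST body.

    The "from" and "message" field names are hypothetical placeholders.
    """
    fields = parse_qs(body)

    def first(name):
        # parse_qs returns a list per field; take the first value or "".
        return fields.get(name, [""])[0]

    return {
        "channel": "SMS",
        "source": first("from"),
        "text": first("message"),
    }
```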




Twitter

Culling content from Twitter is easy. There are two options: Search and Follow User.




With Search, the user enters the name for a search (the name that has relevance to you) followed by the
term(s) that they would like to search. These can be common words or hashtags. ex. ‘My Twitter Search’
and ‘#searchword’. There is no limit to the number of search queries one can have; however, the return of
results is limited by your individual access to the Twitter search API. If you’d like to increase this access,
contact Twitter to get white-listed or contact support@swiftly.org.

Note on Sources and Search: When using a Twitter search please note that the search itself is not a
source. In the Swiftly eco-system, content producers are sources. This means that we will identify all the
individual content producers and help you keep track of them. This allows one to monitor conversations
around keywords that might lead them to great content producers.
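
The distinction between a search and its sources can be pictured with a short Python sketch: the search returns tweets, and the content producers behind those tweets are what get tracked as sources. The `user` and `text` keys and the function name are hypothetical and do not reflect the actual Twitter API response or Sweeper's internals.

```python
from collections import defaultdict


def sources_from_search(results):
    """Group search results by the content producer who wrote them."""
    by_producer = defaultdict(list)
    for tweet in results:
        by_producer[tweet["user"]].append(tweet["text"])
    return dict(by_producer)
```

Each key in the returned dictionary is one producer, so a single search can surface many sources.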




With Follow User, the user can enter a unique name for the Twitter handle they want to follow along with
the actual @name on Twitter. For example ‘Bob Smith, Rwanda’ alongside ‘@bobsmith’. This is helpful
because it perhaps allows you to leave notes about who you may be following for yourself, or your team
members.

More Related Content

Similar to Sweeper User Guide v0.3

WEB 2.0
WEB 2.0WEB 2.0
WEB 2.0ARJUN
 
NOW! Get the internet to work for you!
NOW! Get the internet to work for you!NOW! Get the internet to work for you!
NOW! Get the internet to work for you!Philip Hannah
 
Day 2-presentation
Day 2-presentationDay 2-presentation
Day 2-presentationDeb Forsten
 
Add Module Doing Business Over The Internet
Add Module Doing Business Over The InternetAdd Module Doing Business Over The Internet
Add Module Doing Business Over The Internetguest7b126e
 
Detailed study of aggregator for updates
Detailed study of aggregator for updatesDetailed study of aggregator for updates
Detailed study of aggregator for updateseSAT Journals
 
RSS and Social Bookmarking
RSS and Social BookmarkingRSS and Social Bookmarking
RSS and Social BookmarkingNGRF
 
Detailed study of aggregator for updates
Detailed study of aggregator for updatesDetailed study of aggregator for updates
Detailed study of aggregator for updateseSAT Publishing House
 
Twitter word frequency count using hadoop components 150331221753
Twitter word frequency count using hadoop components 150331221753Twitter word frequency count using hadoop components 150331221753
Twitter word frequency count using hadoop components 150331221753pradip patel
 
Twitter word frequency count using hadoop components 150331221753
Twitter word frequency count using hadoop components 150331221753Twitter word frequency count using hadoop components 150331221753
Twitter word frequency count using hadoop components 150331221753pradip patel
 
IRJET- Opinion Mining on Pulwama Attack
IRJET-  	  Opinion Mining on Pulwama AttackIRJET-  	  Opinion Mining on Pulwama Attack
IRJET- Opinion Mining on Pulwama AttackIRJET Journal
 
library management system
library management systemlibrary management system
library management systemaniket chauhan
 
Rss technology -a_tool_to_expedite_up-to-date_information_for_library_users -...
Rss technology -a_tool_to_expedite_up-to-date_information_for_library_users -...Rss technology -a_tool_to_expedite_up-to-date_information_for_library_users -...
Rss technology -a_tool_to_expedite_up-to-date_information_for_library_users -...Anil Mishra
 
Strapi Meetup whitepaper
Strapi Meetup whitepaperStrapi Meetup whitepaper
Strapi Meetup whitepaperStrapi
 
Are you missing out on the RSS revolution?
Are you missing out on the RSS revolution?Are you missing out on the RSS revolution?
Are you missing out on the RSS revolution?Mike Richwalsky
 

Similar to Sweeper User Guide v0.3 (20)

WEB 2.0
WEB 2.0WEB 2.0
WEB 2.0
 
NOW! Get the internet to work for you!
NOW! Get the internet to work for you!NOW! Get the internet to work for you!
NOW! Get the internet to work for you!
 
Day 2-presentation
Day 2-presentationDay 2-presentation
Day 2-presentation
 
Add Module Doing Business Over The Internet
Add Module Doing Business Over The InternetAdd Module Doing Business Over The Internet
Add Module Doing Business Over The Internet
 
Detailed study of aggregator for updates
Detailed study of aggregator for updatesDetailed study of aggregator for updates
Detailed study of aggregator for updates
 
RSS and Social Bookmarking
RSS and Social BookmarkingRSS and Social Bookmarking
RSS and Social Bookmarking
 
Detailed study of aggregator for updates
Detailed study of aggregator for updatesDetailed study of aggregator for updates
Detailed study of aggregator for updates
 
Tabloid
TabloidTabloid
Tabloid
 
Documentary watch
Documentary watchDocumentary watch
Documentary watch
 
Twitter word frequency count using hadoop components 150331221753
Twitter word frequency count using hadoop components 150331221753Twitter word frequency count using hadoop components 150331221753
Twitter word frequency count using hadoop components 150331221753
 
Twitter word frequency count using hadoop components 150331221753
Twitter word frequency count using hadoop components 150331221753Twitter word frequency count using hadoop components 150331221753
Twitter word frequency count using hadoop components 150331221753
 
IRJET- Opinion Mining on Pulwama Attack
IRJET-  	  Opinion Mining on Pulwama AttackIRJET-  	  Opinion Mining on Pulwama Attack
IRJET- Opinion Mining on Pulwama Attack
 
library management system
library management systemlibrary management system
library management system
 
Rss technology -a_tool_to_expedite_up-to-date_information_for_library_users -...
Rss technology -a_tool_to_expedite_up-to-date_information_for_library_users -...Rss technology -a_tool_to_expedite_up-to-date_information_for_library_users -...
Rss technology -a_tool_to_expedite_up-to-date_information_for_library_users -...
 
Strapi Meetup whitepaper
Strapi Meetup whitepaperStrapi Meetup whitepaper
Strapi Meetup whitepaper
 
Are you missing out on the RSS revolution?
Are you missing out on the RSS revolution?Are you missing out on the RSS revolution?
Are you missing out on the RSS revolution?
 
Documentary watch on the web
Documentary watch on the webDocumentary watch on the web
Documentary watch on the web
 
Eu stud
Eu studEu stud
Eu stud
 
Eu Stud
Eu StudEu Stud
Eu Stud
 
Rss Feeds
Rss FeedsRss Feeds
Rss Feeds
 

More from Ushahidi

Data Science for Social Good and Ushahidi - Final Presentation
Data Science for Social Good and Ushahidi - Final PresentationData Science for Social Good and Ushahidi - Final Presentation
Data Science for Social Good and Ushahidi - Final PresentationUshahidi
 
Corruption mapping (april 2013, part 2)
Corruption mapping (april 2013, part 2)Corruption mapping (april 2013, part 2)
Corruption mapping (april 2013, part 2)Ushahidi
 
Anti-Corruption Mapping (April 2013, part 1)
Anti-Corruption Mapping (April 2013, part 1)Anti-Corruption Mapping (April 2013, part 1)
Anti-Corruption Mapping (April 2013, part 1)Ushahidi
 
Ushahdi 3.0 Design Framework
Ushahdi 3.0 Design Framework Ushahdi 3.0 Design Framework
Ushahdi 3.0 Design Framework Ushahidi
 
Around the Globe Corruption Mapping (part 2)
Around the Globe Corruption Mapping (part 2)Around the Globe Corruption Mapping (part 2)
Around the Globe Corruption Mapping (part 2)Ushahidi
 
Around the Globe Corruption Mapping (part 1)
Around the Globe Corruption Mapping (part 1)Around the Globe Corruption Mapping (part 1)
Around the Globe Corruption Mapping (part 1)Ushahidi
 
Ushahidi Toolbox - Real-time Evaluation
Ushahidi Toolbox - Real-time EvaluationUshahidi Toolbox - Real-time Evaluation
Ushahidi Toolbox - Real-time EvaluationUshahidi
 
Ushahidi Toolbox - Implementation
Ushahidi Toolbox - ImplementationUshahidi Toolbox - Implementation
Ushahidi Toolbox - ImplementationUshahidi
 
Ushahidi Toolbox - Assessment
Ushahidi Toolbox - AssessmentUshahidi Toolbox - Assessment
Ushahidi Toolbox - AssessmentUshahidi
 
Kenya Ushahidi Evaluation: Unsung Peace Heros/Building Bridges
Kenya Ushahidi Evaluation: Unsung Peace Heros/Building BridgesKenya Ushahidi Evaluation: Unsung Peace Heros/Building Bridges
Kenya Ushahidi Evaluation: Unsung Peace Heros/Building BridgesUshahidi
 
Kenya Ushahidi Evaluation: Uchaguzi
Kenya Ushahidi Evaluation: UchaguziKenya Ushahidi Evaluation: Uchaguzi
Kenya Ushahidi Evaluation: UchaguziUshahidi
 
Kenya Ushahidi Evaluation: Blog Series
Kenya Ushahidi Evaluation: Blog SeriesKenya Ushahidi Evaluation: Blog Series
Kenya Ushahidi Evaluation: Blog SeriesUshahidi
 
Ushahidi esri juliana
Ushahidi esri julianaUshahidi esri juliana
Ushahidi esri julianaUshahidi
 
Ushahidi personas scenarios
Ushahidi personas scenariosUshahidi personas scenarios
Ushahidi personas scenariosUshahidi
 
Citizen pollution mapping made easy
Citizen pollution mapping made easy Citizen pollution mapping made easy
Citizen pollution mapping made easy Ushahidi
 
Map it, Change it
Map it, Change itMap it, Change it
Map it, Change itUshahidi
 
Map it, Make it, Hack it
Map it, Make it, Hack itMap it, Make it, Hack it
Map it, Make it, Hack itUshahidi
 
What if Citizens Mapped Health?
What if Citizens Mapped Health?What if Citizens Mapped Health?
What if Citizens Mapped Health?Ushahidi
 
Re-imagining Citizen Engagement
Re-imagining Citizen EngagementRe-imagining Citizen Engagement
Re-imagining Citizen EngagementUshahidi
 

Sweeper User Guide v0.3

  • 1. User Guide Updated April 3, 2011 Living Document for Sweeper v0.3 http://swiftly.org/userguide
  • 2. Table of Contents Table of Contents I. Introduction II. Using this Living Document III. About the Sweeper Application Suggested Uses As a FeedReader For Passive Data-Processing For Active Content Filtering For Real-time Social Media Curation As a Vertical Content Dashboard Terminology IV. Explaining the Sweeper UI Analytic Dashboard Main Content Window Admin Panel View Tabs Filter Panel Refresh Staging Area Rating Panel Content Items V. Overview of Plugins Duplicate Content Filter Google Language Services Geo-Location (Yahoo) Tagging Ushahidi Push Tag Clustering Annotations* Quiver/Bookmarking* VI. Adding Sources Email (IMAP) Email (Gmail) FrontlineSMS News & Blog Search RSS/ATOM Flickr SMS Gateways Twitter
  • 3. I. Introduction Thanks for using the Sweeper application! Sweeper is meant to be fairly intuitive, but we’re well aware that it can be a little overwhelming at first to get started and to know what’s possible. In this guide we will walk you through using Sweeper and a handful of the native plugins. This is not a guide for installing it (for that look here); rather, this guide will walk you through use of the Sweeper software and the various plugins for it. If you are a developer seeking information on how to develop plugins, parsers or other modules for Sweeper and other SwiftRiver applications, click here.
  • 4. II. Using this Living Document Because Sweeper is an open-source product whose code and feature set change quite frequently, this user guide is a living document that serves only as a snapshot of what’s possible at the time it was last updated. We invite you to revisit this link often. If you decide to print it, just be aware that as soon as it’s transferred from bits to pulp, it’s essentially outdated. Likewise, any copy of this document distributed in PDF, DOC or FLV form is likely outdated too. The latest version can always be found at http://swiftly.org/userguide/
  • 5. III. About the Sweeper Application Sweeper is an application that focuses on the aggregation, curation and filtering of real-time content. It assumes the user knows exactly what sources they are tracking but needs an application to help them prioritize their attention. Here is a comparison: Sweeper is sort of like an open source version of TweetDeck or, to use a Google analogy, Google Reader. The user defines a number of sources to track and Sweeper offers a number of ways of filtering and viewing the collected content. Suggested Uses What can Sweeper be used for? A number of things, but here are a few ideas... As a FeedReader Sweeper was designed for collecting large amounts of disparate real-time data and sweeping through it quickly and efficiently, while also doing things to that content. So there is an emphasis on speed and summation of large datasets, allowing the user to decide where to spend his or her time delving deeper. As mentioned in the examples above, one might consider using Sweeper as a substitute for a traditional feed-reader. However, unlike most feed-readers there are no restrictions on the type of data that can be aggregated, and there are smart triggers applied to data going out. ex. If I perform this function, content is affected in this way. This functionality can be useful for setting up really advanced conditional tasking, which we’ll cover later. For Passive Data-Processing Sweeper can also be configured to be a passive filter for data, meaning you can set it to aggregate content, then automatically perform certain tasks around that. ex. Aggregate all tweets from #hashtag tagged in the state of Maine and send only that data to another platform. When used in this way, Sweeper essentially becomes a smart cron tool equipped with geo-tagging, natural language processing and other powerful contextual features. For Active Content Filtering Users are also provided a number of utilities for quickly searching through content. 
Clicking on a selection of tags allows the user to see only the content carrying those tags. The cluster panel allows content to be clustered around similar content from various channels. The user can also sort by assigned scores (which can represent the favor they might have for some types of content over others) in any range between 1 and 100. ex. show me only the content with a score of 40 or above; or only content between 20 and 60. For Real-time Social Media Curation Sweeper can be used for real-time media curation across channels (Blogs, News, RSS/ATOM, Twitter, SMS, Email) and across over 50 languages. For a journalist attempting to collect data
  • 6. that’s rapidly unfolding across social media, this can save a great deal of time. Rather than opening 50 different windows for different apps, the Sweeper application can be used to mine and add context to disparate content, completely at the user’s whim. Perhaps even more interestingly, all this aggregated data can be annotated, mapped, shared or exported in a number of ways after it’s been structured as the user sees fit. As a Vertical Content Dashboard Perhaps you have a need to know what’s going on across various industries at all times. You could enter the feeds of several well-known bloggers, the @twitternames of thought leaders in that industry, a public-facing email address you control like sports@mynewsite.com, and a public-facing shortcode (ex. 6060). That might just be your sports page. But when you replicate that experience multiple times across Entertainment, World News, Food, Lifestyle etc. you end up with an equally rich, immersive real-time data-mining tool across all those interests. Terminology Before we continue, it will help if you have a basic understanding of the terminology we use to discuss the application. Sweeper (capital ‘S’) - the name of a SwiftRiver application for aggregating and processing feeds of content sweeper (lowercase ‘s’) - generally, one who performs the function of sweeping through feeds of content. However, in the Sweeper application the user role of sweeper is assigned to users who can edit tags and process content but who don’t have administrative rights to the application. sweep - to process data channel - the distribution type used to deliver content. Twitter, Email, RSS/ATOM, SMS are all channels. source - the place (or person) from which content originates. A person’s @twittername, email address, blog or web url, or phone number would all be considered sources. Several sources may be collected to reference a single identity ex. 
this blog, this url, this phone number all belong to the same person content item - a single item of content collected from a feed, regardless of the channel it came in on or the source it came from tag - a layer of taxonomy applied to all content lat/lon - geospatial coordinates; short for latitude and longitude veracity - more accurately, the subjective favor the user (or users) has for content. The baseline of favor expressed for certain types of content is used as a building block for a score applied to content. This score is then used both for prioritizing sources and for recommending other content the user or users may favor. cluster - a collection of content items deemed to be statistically similar based on tags editors - editors don’t have full administrative rights to the application but they can perform tasks that sweepers can’t. turbine - another word for plugins for SwiftRiver applications impulse turbine - plugins that pre-process content (before the application receives it). Impulse Turbine plugins affect how data is structured as part of the Swift object module. reactor turbine - plugins that process content based on human interaction or assigned logic (after the application has received it). Reactor Turbine plugins can be used to take structured data and do
  • 7. something with it. parsers - at the application architecture level, parsers are modules that can be written to create new sources trusted source - applies a default score of 100 to a source, so the user votes down from a high score by default. ex. you have my trust now but could lose it over time
  • 8. IV. Explaining the Sweeper UI So now that we’ve got the basics we can walk you through the Sweeper user interface, its basic features and functions. At first look the application can be a little intimidating so hopefully this guide takes the edge off (like a martini!).
  • 9. Analytic Dashboard This dashboard offers a quick survey of the content being collected by Sweeper. Where is data mostly being collected from? How much content in total? How much from each channel? The charts are dynamic and update with each use of the application.
  • 10. Main Content Window Below you see the main content display window. This is where aggregated content can be viewed.
  • 11. Admin Panel This area contains five tabs: Login, Impulse Turbines, Reactor Turbines, Sources, Add User. Login - as you might expect, this area allows users to log in to the application Impulse Turbine - for enabling or disabling impulse turbine plugins Reactor Turbine - for enabling or disabling reactor turbine plugins Sources - this is the area where one can add sources to aggregate into Sweeper Users - area for adding users and assigning their administrative rights
  • 12. View Tabs This area contains several tabs for altering the view of the main content window. The titles are fairly self-explanatory: Dashboard, New content, Accurate, Inaccurate, Crosstalk, Irrelevant. Dashboard - contains a collection of charts plotting various aspects of the content being collected New content - for viewing new content as it’s being collected Accurate - shows all content voted up Inaccurate - shows all content voted down Crosstalk - shows content that is completely off-topic Irrelevant - shows content that is on-topic but not relevant to the user’s specific needs
  • 13. Filter Panel Filters for changing the view of the main content window. Veracity Slider - allows the user to set a range of anything between 1 and 100 to view content by assigned score Channels - view only the content that came in on a particular channel Tags - view only the content containing a selection of tags Refresh Staging Area Reveals how much content has been aggregated since the main content window was last refreshed.
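The three filters in the Filter Panel can be thought of as one combined predicate over content items. The sketch below is only an illustration of that idea, not Sweeper's actual code: the `ContentItem` class, the `filter_items` function and its parameter names are hypothetical, and it assumes that a tag selection means an item must carry every selected tag.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """Hypothetical stand-in for a Sweeper content item."""
    text: str
    channel: str          # e.g. "Twitter", "Email", "SMS", "RSS/ATOM"
    score: int            # score between 1 and 100
    tags: set = field(default_factory=set)


def filter_items(items, min_score=1, max_score=100, channel=None, tags=None):
    """Return items inside the score range, on the given channel (if any),
    that carry every tag in `tags` (assumption: all selected tags must match)."""
    wanted = set(tags or [])
    return [
        item for item in items
        if min_score <= item.score <= max_score
        and (channel is None or item.channel == channel)
        and wanted <= item.tags
    ]
```

For example, `filter_items(items, min_score=40)` mirrors "show me only the content with a score of 40 or above", and `filter_items(items, 20, 60)` mirrors "only content between 20 and 60".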
  • 14. Rating Panel The upper left part of the Rating Panel is for quickly determining information about content. Is this a ‘trusted’ source, or has it been rated as trusted by the people within your bounded (or unbounded) group of users? The upper right quadrant shows a score that represents the favor the user or their community has for the associated source. In the lower quadrant we have four buttons. Here is what they essentially do: Green (Up) - expresses favor for a content item while positively affecting its source’s score so that in the future content from the same source will be prioritized. Red (Down) - expresses disapproval for a content item while negatively affecting its source’s score so that in the future content from the same source will be deprioritized. Crosstalk - expresses that this content is not relevant because it’s essentially been collected by mistake and is not useful. Removes it from the main view without negatively affecting the source score. Irrelevant - expresses that this content is not germane to the task the user is trying to perform and, more importantly, is somehow damaging or distracting. Removes the content from the main view while negatively affecting the source score. It’s important to note that these votes, whether up or down, are not the only things being factored into the scoring of content. We also factor in a number of things like the tag profile of content, the ratings of the individual users rating this individual, and other factors. For an in-depth explanation see the RiverID System Guide.
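To make the effect of the Up, Down and Crosstalk buttons on a source score concrete, here is a deliberately simplified sketch. It is not the real scoring system, which also weighs tag profiles and rater reputation. The function name `apply_vote`, the fixed step of 5 points and the clamping to the 1 to 100 range are all assumptions made for illustration.

```python
def apply_vote(score, vote, step=5):
    """Return a new source score after one vote.

    Assumptions (not from the Sweeper code base): each vote moves the
    score by a fixed `step`, and scores are clamped to the range 1..100.
    """
    if vote == "up":
        score += step          # favor: prioritize this source in future
    elif vote == "down":
        score -= step          # disapproval: deprioritize this source
    elif vote != "crosstalk":  # crosstalk leaves the score untouched
        raise ValueError(f"unknown vote: {vote!r}")
    return max(1, min(100, score))
```

Under these assumptions a trusted source starts at 100, so an up vote leaves it at 100 while repeated down votes gradually erode it.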
  • 15. Content Items Content items are divided into three sub-sections: the Header, the Body and the Footer. In the Header you’ll find an icon denoting what channel this content came in on: Twitter, Email, SMS, or RSS/Atom. Clicking this icon will reveal more: A pop-up display reveals information about the source and the content itself: Source - the source of the content (a Twitter @name, email address, url or phone number) Channel - the channel the content came in on (Twitter, Email, SMS, or RSS/Atom) Source Score - the trust score associated with this source Link - hyperlink to the original content
  • 16. In the Body you’ll find a portion of the message (from Twitter and SMS) or the headline/subject (Articles, Blogs, Email). In the Footer you’ll find tags which add a layer of taxonomy to the content. You can quickly find other content like this particular content item by clicking on the tags themselves. Users can also add their own tags*, edit tags* or delete tags to help the system improve**. * Adding tags and editing tags is not possible in v0.3.0 of the Sweeper UI. However, a slight modification of the code exposes this feature and makes it available. ** There is an active learning element of our Tagging API, available soon, that allows the system to learn from user feedback. You can read more about this in the section on Impulse Reactor Plugins.
  • 17. V. Overview of Plugins There are a few plugins that ship with Sweeper and that are either enabled by default or commonly used. There are way too many to list here, so in this section we’ll explain what a few of the available plugins are and what they are used for. You can always find more plugins for Swiftly applications at http://plugins.swiftly.org Duplicate Content Filter When activated, this plugin passes all content through the Duplication Filter API in the Swift Web Service stack, effectively removing all duplicate content (like retweets) from a feed. Google Language Services When activated, this plugin passes all content through the Google Translate API. Google Translate will automatically detect what language the content is in, translate it and send it back. This allows you to aggregate content in multiple languages but only see the resulting translated, English content! This is a huge time saver when doing international research. But how do you know what content has been translated? When activated, additional info in the content item’s header will let the user know what has been translated, and from what language. See the example above. If you expect large amounts of data you may want to opt for the Google Enterprise Language Service
  • 18. plugin instead. With this plugin the amount of content that can be translated is increased significantly. It requires an API key from Google. If you need help getting Enterprise level access, contact us at support@swiftly.org Geo-Location (Yahoo) When activated, this plugin passes all content through the Yahoo Placemaker API where we try to detect a location where the content is likely to have originated. We then apply lat/lon coordinates to the content, which are stored as part of the content meta info. When passed to other systems, this lat/lon info can be used for geo-spatial reference. To use this service, you’ll need to acquire a Yahoo Placemaker API key from Yahoo. If you need help getting Enterprise level access, contact us at support@swiftly.org Tagging When activated, all content passing through Sweeper will be tagged by our natural language processing API. Essentially this service tries to extract what it thinks are the active keywords being used, and uses them to help the user automatically sort content. Tags are very important to SwiftRiver and we take a dual taxonomic and folksonomic approach in our applications. This means that although these tags are machine generated, they can be edited and improved upon by humans, which in turn helps to teach the algorithm how to tag content better. Ushahidi Push For users of Ushahidi or Crowdmap. This will take any content voted up in the Ratings panel and automatically plot it on a designated Ushahidi deployment map as an approved report. This is a significant time saver for large groups who want to use Sweeper to curate data, but use Ushahidi or Crowdmap to visualize it.
  • 19. Users will need to enter an API key for an Ushahidi deployment that they have administrative rights to. ex. http://xxx.xxx.xx.xxx/ushahidi/ There are many variants of this plugin. One is called Ushahidi Passive Push and essentially it turns Sweeper into a cron suite where content is automatically aggregated, structured, and passed along to Ushahidi...mostly without any human operators! Tag Clustering When activated, this plugin allows the user to view content similar to any particular content item. The clustering is done by using a statistical profile of the associated Tags for proximity matching. This gives the user more control than alternative recommendation methods, because it can factor in the user’s own tagging methods. For instance if I use unique identifiers or words unique to my organization, they too can be used as part of the proximity matching algorithm! Annotations* Annotations offers the ability to annotate any content item. This can be used to leave individual notes for reference, or to collaboratively converse around content with your team. Quiver/Bookmarking* Quiver is a bookmarklet that lets you quickly collect content from around the web and post it to your Sweeper deployment (effectively adding it to your quiver). This can be useful for individually collecting research, or if you have teams of contributors actively recommending content for you to then apply all our contextual APIs to. * These features will ship with the forthcoming release of Sweeper.
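The guide does not say which statistic the Tag Clustering plugin uses for proximity matching. As a rough illustration of the general idea only, the sketch below swaps in Jaccard similarity (shared tags divided by all tags across both items). The function names `jaccard` and `similar_items` and the 0.5 threshold are hypothetical.

```python
def jaccard(tags_a, tags_b):
    """Jaccard similarity of two tag collections: |A & B| / |A | B|."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # two untagged items are treated as unrelated
    return len(a & b) / len(a | b)


def similar_items(target_tags, items, threshold=0.5):
    """Return ids of items whose tags are close to `target_tags`.

    `items` maps an item id to its tags. Results are ordered from most
    to least similar; items below `threshold` are dropped.
    """
    scored = [(jaccard(target_tags, tags), item_id)
              for item_id, tags in items.items()]
    return [item_id
            for similarity, item_id in sorted(scored, reverse=True)
            if similarity >= threshold]
```

Because the comparison works on whatever tags are present, organization-specific tags added by users take part in the matching just like machine-generated ones.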
  • 20. VI. Adding Sources To begin using Sweeper at all, one must begin aggregating from predefined sources. Essentially this is where you inform the system what you want to track. Sweeper currently only accepts inputs that are updated streams of data - feeds - in XML/ATOM/RSS or JSON format. To get any content we don’t currently accept into Sweeper, all one would need to do is write a parser, a few lines of code that tell the application how to structure data coming from that particular feed. The types of content natively supported are IMAP, Gmail, FrontlineSMS, GoogleNews, any RSS or Atom feed, Flickr, other SMS gateways and Twitter. Email (IMAP) Sweeper will accept the IMAP details of any email account and begin pulling in content allowing you to aggregate, translate, tag and cluster your email. Email (Gmail)
  • 21. Sweeper supports aggregating email from any Gmail account, pulling in content and allowing you to aggregate, translate, tag and cluster your email. Although Gmail also supports IMAP, the native Gmail aggregation is recommended. FrontlineSMS In combination with FrontlineSMS, Sweeper can become a powerful SMS curation service that aggregates real-time content (SMS) even if there is no internet connection! There are two ways of integrating FrontlineSMS with Sweeper: Remote and Local.
  • 22. The remote option is for users who have access to some type of network, whether via the Internet or just a LAN. Simply enter the details of the FrontlineSMS deployment you want to pull data from. You will need to use this in combination with the FrontlineFetch go-between servlet, which can be downloaded from http://plugins.swiftly.org/?p=51.
  • 23. The local option requires that the Sweeper deployment and FrontlineSMS be installed on the same machine or server. This allows the Sweeper application to pull directly from the FrontlineSMS database and will work even if there is no Internet. News & Blog Search This source module allows you to set up a keyword search, returning real-time search results from Google News, Posterous, Blogger and Wordpress.com. The results will appear in the main content view, translated if necessary. RSS/ATOM
  • 24. Self-explanatory: simply enter the URL of a feed in the RSS, ATOM 1.0 or ATOM 2.0 service and Sweeper will begin aggregating that content. Flickr This service allows the user to aggregate content from the photo-sharing service Flickr. The options are fairly simple. Tag Search will return results aggregated from Flickr based on a search using a specific keyword ex. cats, dogs, Eiffel Tower. Tag Search with Location will only return geo-tagged results, great when used in combination with a mapping platform like Crowdmap. Follow User is for returning results only from a specific user account. SMS Gateways
  • 25. We’ve included a generic SMS gateway aggregator. It’s set up to read from the HTTP posts commonly used by services that don’t have APIs. However, it’s there largely to fork and modify - a head start on integrating your own SMS service. Twitter Culling content from Twitter is easy. There are two options: Search and Follow User. With Search, the user enters a name for the search (a name that has relevance to you) followed by the term(s) that they would like to search. These can be common words or hashtags. ex. ‘My Twitter Search’
  • 26. and ‘#searchword’. There is no limit to the number of search queries one can have; however, the return of results is limited by your individual access to the Twitter search API. If you’d like to increase this access, contact Twitter to get white-listed or contact support@swiftly.org. Note on Sources and Search: When using a Twitter search please note that the search itself is not a source. In the Swiftly eco-system, content producers are sources. This means that we will identify all the individual content producers and help you keep track of them. This allows one to monitor conversations around keywords that might lead to great content producers. With Follow User, the user can enter a unique name for the Twitter handle they want to follow along with the actual @name on Twitter. For example ‘Bob Smith, Rwanda’ alongside ‘@bobsmith’. This is helpful because it allows you to leave notes, for yourself or your team members, about who you are following.
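Section VI notes that content Sweeper does not yet accept can be brought in by writing a parser: a few lines of code that tell the application how to structure data from a feed. The sketch below illustrates that idea in general terms using Python's standard library. It does not use Sweeper's actual parser interface, and the function name `parse_rss` and the dictionary keys are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_rss(xml_text, channel="RSS/ATOM"):
    """Turn an RSS 2.0 document into a list of content-item dicts.

    The field names below are illustrative, not Sweeper's data model.
    The feed's own <link> is used as the source of every item.
    """
    root = ET.fromstring(xml_text)
    source = root.findtext("channel/link", default="")
    items = []
    for node in root.iter("item"):
        items.append({
            "channel": channel,
            "source": source,
            "title": node.findtext("title", default=""),
            "link": node.findtext("link", default=""),
        })
    return items
```

Each returned dictionary corresponds loosely to a content item: it records the channel it came in on, the source it came from, and a link back to the original content.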