IBM InfoSphere Streams
Technical Overview




Jerome Chailloux
Europe IOT - Sr. Technical Field Specialist - Big Data, Linux Advocate
jerome.chailloux@fr.ibm.com



   February 21, 2013                                                     © 2013 IBM Corporation
IBM InfoSphere Streams v3.0
A platform for real-time analytics on BIG data
 Volume
     – Terabytes per second
     – Petabytes per day
 Variety
     – All kinds of data: sensor, video, audio, text, and relational data sources
     – All kinds of analytics
 Velocity
     – Millions of events per second
     – Insights in microseconds
 Agility
     – Dynamically responsive, just-in-time decisions
     – Rapid application development
Streams Analyzes All Kinds of Data
 Mining in Microseconds (included with Streams)
 Acoustic (IBM Research / Open Source)
 Simple & Advanced Text (included with Streams) ***New
 Advanced Mathematical Models (IBM Research)
 Statistics (included with Streams)
 Predictive (included with Streams) ***New
 Geospatial (included with Streams) ***New
 Image & Video (Open Source)
Categories of Problems Solved by Streams
 Applications that require on-the-fly processing, filtering and analysis
  of streaming data
    – Sensors: environmental, industrial, surveillance video, GPS, …
    – “Data exhaust”: network/system/web server/app server log files
    – High-rate transaction data: financial transactions, call detail records

 Criteria: two or more of the following
    – Messages are processed in isolation or in limited data windows
    – Sources include non-traditional data (spatial, imagery, text, …)
    – Sources vary in connection methods, data rates, and processing requirements,
      presenting integration challenges
    – Data rates/volumes require the resources of multiple processing nodes
    – Analysis and response are needed with sub-millisecond latency
    – Data rates and volumes are too great for store-and-mine approaches




The Big Data Ecosystem: Interoperability is Key
 Streaming Data (non-traditional / non-relational data feeds)
     → Streams: RTAP, analytics on data in motion
 Internet-Scale Data Sets (non-traditional / non-relational and traditional / relational data sources)
     → BigInsights: analytics on data at rest
 Traditional Warehouse (traditional / relational data sources)
     → Data Warehouse: analytics on structured data
Streaming Analytics in Action
 Stock Market: impact of weather on securities prices; analyze market data at ultra-low latencies
 Natural Systems: wildfire management, water management
 Law Enforcement, Defense & Cyber Security: real-time multimodal surveillance, situational awareness, cyber security detection
 Transportation: intelligent traffic management
 Fraud Prevention: detecting multi-party fraud, real-time fraud prevention
 Manufacturing: process control for microchip fabrication
 e-Science: space weather prediction, detection of transient events, synchrotron atomic research
 Health & Life Sciences: neonatal ICU monitoring, epidemic early warning system, remote healthcare monitoring
 Telephony: CDR processing, social analysis, churn prediction, geomapping
 Other: Smart Grid, text analysis ("Who's talking to whom?"), ERP for commodities, FPGA acceleration
Use Case: Law Enforcement and Security
 Video surveillance, wire taps,
  communications, call records, etc.

 Millions of messages per second
  with low density of critical data

 Identify patterns and relationships
  among vast information sources




    “The US Government has been working with IBM Research since 2003 on a
    radical new approach to data analysis that enables high speed, scalable and
    complex analytics of heterogeneous data streams in motion. The project has
    been so successful that US Government will deploy additional installations to
    enable other agencies to achieve greater success in various future projects”
    – US Government
Predictive Analytics in a Neonatal ICU
 Real-time analytics and correlations
  on physiological data streams
     – Blood pressure, temperature, EKG, blood oxygen saturation, etc.
 Early detection of the onset of
  potentially life-threatening
  conditions
    – Up to 24 hours earlier than current
      medical practices
    – Early intervention leads to lower patient
      morbidity and better long term outcomes
 Technology also enables physicians
  to verify new clinical hypotheses




Smarter, Faster, Cheaper CDR Processing
 6 billion CDRs per day; deduplication over 15 days; processing latency reduced from 12 hours to a few seconds
 6 machines (using ½ of processor capacity)
 InfoSphere Streams xDR Hub
 Key requirements: price/performance and scaling
Surveillance and Physical Security: TerraEchos (Business Partner)

 Use scenario
   – State-of-the-art covert surveillance system
     based on Streams platform

   – Acoustic signals from buried fiber optic cables
     are monitored, analyzed and reported in real time
     for necessary action

   – Currently designed to scale up to 1600 streams
     of raw binary data

 Requirement

   – Real-time processing of multi-modal signals
     (acoustics, video, etc.)

   – Easy to expand, dynamic

   – 3.5M data elements per second

 Winner 2010 IBM CTO Innovation Award
Streams for Real-Time Geomapping
 Multiple GPS data sources
     – 350–400K probe points/second per source
     – Map each probe point to the nearest poly-line (real-time location profile)
     – 200 million to 1 billion poly-lines
     – Two-level grid decomposition based search (hierarchical mapping)
 14 blade servers
     – 2× dual-core Xeon 5160, 16 GB RAM
     – 4 for data preparation, 10 for mapping
 Performance: 941,000 probes/sec for 1 billion poly-lines
Use Cases: Video Processing (Contour Detection)
 [Figures: original picture and its contour-detected result]
How Streams Works
 Continuous ingestion
 Continuous analysis
How Streams Works
 Continuous ingestion, continuous analysis
 Infrastructure provides services for
     – Scheduling analytics across hardware hosts
     – Establishing streaming connectivity
 Typical processing steps: filter/sample, transform, annotate, correlate, classify
 Achieve scale
     – By partitioning applications into software components
     – By distributing across stream-connected hardware hosts
 Where appropriate, elements can be fused together for lower communication latency
Scalable Stream Processing
 Streams programming model: construct a graph
     – Mathematical concept
        • Not a line, bar, or pie chart!
        • Also called a network
        • Familiar: for example, a tree structure is a graph
     – Consisting of operators and the streams that connect them
        • The vertices (or nodes) and edges of the mathematical graph
        • A directed graph: the edges have a direction (arrows)
 Streams runtime model: distributed processes
     – A single operator or multiple operators form a Processing Element (PE)
     – Compiler and runtime services make it easy to deploy PEs
        • On one machine
        • Across multiple hosts in a cluster when scaled-up processing is required
     – All links and data transport are handled by runtime services
        • Automatically
        • With manual placement directives where required
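The graph model above can be written down directly in SPL. A minimal sketch, assuming a CSV file of stock quotes; the file names and the Quotes/Expensive stream names are hypothetical, while FileSource, Filter, and FileSink are Standard Toolkit operators:

composite Main {
  graph
    // Source vertex: ingest comma-separated quote records
    stream<rstring symbol, float64 price> Quotes = FileSource() {
      param file   : "quotes.csv";
            format : csv;
    }
    // Intermediate vertex: keep only high-priced quotes
    stream<rstring symbol, float64 price> Expensive = Filter(Quotes) {
      param filter : price > 100.0;
    }
    // Sink vertex: write the surviving tuples back out
    () as Writer = FileSink(Expensive) {
      param file   : "expensive.csv";
            format : csv;
    }
}

Referring to Quotes by name in the Filter invocation is all it takes to draw the edge between the two vertices.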
From Operators to Running Jobs
 Streams application graph
     – A directed, possibly cyclic, graph
     – A collection of operators connected by streams
 Each complete application is a potentially deployable job
 Jobs are deployed to a Streams runtime environment, known as a Streams Instance (or simply, an instance)
 An instance can include a single processing node (hardware) or multiple processing nodes
InfoSphere Streams Objects: Runtime View
 Instance
     – Runtime instantiation of InfoSphere Streams executing across one or more hosts
     – Collection of components and services
 Processing Element (PE)
     – Fundamental execution unit that is run by the Streams instance
     – Can encapsulate a single operator or many "fused" operators
 Job
     – A deployed Streams application executing in an instance
     – Consists of one or more PEs
Competition: Complex-Event Processing (CEP)
 Both are described as "real time", "ultra-low" latency, event stream processing; the approaches differ:
 Streams
     – Analytics on continuous data streams
     – Simple to extremely complex analytics
     – Scales for computational intensity
     – Supports a whole range of relational and non-relational data types
 CEP
     – Analysis on discrete business events
     – Rules-based (if-then-else) with correlation across event types
     – Only structured data types are supported
     – Modest data rates
IBM InfoSphere Streams 3.0
 Comprehensive tooling
     – Eclipse IDE
     – Web console
     – Drag & drop editor
     – Instance graph
     – Streams visualization
     – Streams debugger
 Scale-out architecture
     – Clustered runtime for near-limitless capacity
     – RHEL v5.3 and above; CentOS v6.0 and above
     – x86 & Power multicore hardware
     – InfiniBand and Ethernet support
 Sophisticated analytics with toolkits & accelerators
     – Big Data, CEP, Database, Data Explorer (Big Data), DataStage, Finance, Geospatial, Internet, Messaging, Mining, SPSS, Standard, Text, TimeSeries toolkits
     – Telco & Social Media accelerators
     – Front Office 3.0
What is Streams Processing Language?
 Designed for stream computing
     – Define a streaming-data flow graph
     – Rich set of data types to define tuple attributes
 Declarative
     – Operator invocations name the input and output streams
     – Referring to streams by name is enough to connect the graph
 Procedural support
     – Full-featured imperative language
     – Custom logic in operator invocations
     – Expressions in attribute assignments and parameter definitions
 Extensible
     –   User-defined data types
     –   Custom functions written in SPL or a native language (C++ or Java)
     –   Custom operators written in SPL
     –   User-defined operators written in a native language (C++ or Java)




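As a sketch of how these features combine (stream and function names are hypothetical), a user-defined SPL function can be called from an attribute assignment inside a declarative operator invocation:

// Extensibility: a custom function written in SPL itself
rstring priceBand(float64 price) {
  return price > 100.0 ? "high" : "low";
}

composite Tagger {
  graph
    // Workload generator from the Standard Toolkit
    stream<rstring symbol, float64 price> Quotes = Beacon() {
      param iterations : 10u;
      output Quotes : symbol = "IBM", price = 95.0 + (float64)IterationCount();
    }
    // Declarative invocation; the expression in the output
    // clause is the procedural part
    stream<rstring symbol, float64 price, rstring band> Tagged = Functor(Quotes) {
      output Tagged : band = priceBand(price);
    }
}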
Streams Standard Toolkit: Relational Operators
 Relational operators
     – Functor: perform tuple-level manipulations
     – Filter: remove some tuples from a stream
     – Aggregate: group and summarize incoming tuples
     – Sort: impose an order on incoming tuples in a stream
     – Join: correlate two streams
     – Punctor: insert window punctuation markers into a stream
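For example, Aggregate groups tuples by a key over a window. A sketch, assuming an upstream Quotes stream with symbol and price attributes:

// Per-symbol average price over a tumbling window of 10 tuples
stream<rstring symbol, float64 avgPrice> Averages = Aggregate(Quotes) {
  window Quotes : tumbling, count(10), partitioned;
  param  partitionBy : symbol;
  output Averages : avgPrice = Average(price);
}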
Streams Standard Toolkit: Adapter Operators
 Adapter operators
     – FileSource: read data from files in formats such as csv, line, or binary
     – FileSink: write data to files in formats such as csv, line, or binary
     – DirectoryScan: detect files to be read as they appear in a given directory
     – TCPSource: read data from TCP sockets in various formats
     – TCPSink: write data to TCP sockets in various formats
     – UDPSource: read data from UDP sockets in various formats
     – UDPSink: write data to UDP sockets in various formats
     – Export: make a stream available to other jobs in the same instance
     – Import: connect to streams exported by other jobs
     – MetricsSink: create displayable metrics from numeric expressions
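Export and Import connect independently submitted jobs at runtime. A sketch, with a hypothetical dataKind property:

// Job A: publish a stream with a subscription property
() as Publish = Export(Quotes) {
  param properties : { dataKind = "quotes" };
}

// Job B: subscribe to any stream in the instance whose
// properties satisfy the predicate
stream<rstring symbol, float64 price> Imported = Import() {
  param subscription : dataKind == "quotes";
}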
Streams Standard Toolkit: Utility Operators
 Workload generation and custom logic
     – Beacon: emit generated values; timing and number configurable
     – Custom: apply arbitrary SPL logic to produce tuples
 Coordination and timing
     – Throttle: make a stream flow at a specified rate
     – Delay: time-shift an entire stream relative to other streams
     – Gate: wait for acknowledgement from a downstream operator
     – Switch: block or allow the flow of tuples based on control input
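Beacon and Throttle pair naturally for load testing. A sketch (the stream names and the rate value are arbitrary):

// Generate a numbered test workload as fast as possible...
stream<int32 n> Burst = Beacon() {
  output Burst : n = (int32)IterationCount();
}
// ...then replay it downstream at a steady 100 tuples/second
stream<int32 n> Steady = Throttle(Burst) {
  param rate : 100.0;
}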
Streams Standard Toolkit: Utility Operators (cont'd)
 Parallel flows
     – Barrier: synchronize tuples from sequence-correlated streams
     – Pair: group tuples from multiple streams of the same type
     – Split: forward tuples to output streams based on a predicate
     – ThreadedSplit: distribute tuples over output streams by availability
     – Union: construct an output tuple from each input tuple
     – DeDuplicate: suppress duplicate tuples seen within a given time period
 Miscellaneous
     – DynamicFilter: filter tuples based on criteria that can change while it runs
     – JavaOp: general-purpose operator for encapsulating Java code
     – Parse: parse blob data for use with user-defined adapters
     – Format: format blob data for use with user-defined adapters
     – Compress: compress blob data
     – Decompress: decompress blob data
     – CharacterTransform: convert blob data from one encoding to another
XML Support: Built Into SPL
 XML type
     – Validated for syntax or schema
 XMLParse operator
     – With XPath expressions and functions to parse and manipulate XML data
     – Convert XML to tuples
 XQuery functions
     – Use XML data on the fly
 Adapters support XML format
     – Standard Toolkit
     – Database Toolkit
        • Supports DB2 pureXML

Sample XML catalog:

<catalog>
 <book price="30.99">
  <title>This is a boring title</title>
  <author>John Smith</author>
  <author>Howard Hughes</author>
  <reference quality="-1">
   <book>The first reference</book>
  </reference>
  <reference quality="100">
   <book>Another Book</book>
  </reference>
 </book>
</catalog>

Example: extract information about books from the XML catalog:

stream<BookInfo> X = XMLParse(XML) {
  param trigger : "/catalog/book" ;
        parsing : permissive;   // log and ignore errors
  output X : title      = XPath("title/text()"),
             authors    = XPathList("author/text()"),
             price      = (decimal32) XPath("@price"),
             references = XPathList("reference",
                            {quality = (int32) XPath("@quality"),
                             book    = XPath("book/text()") });
}
Streams Extensibility: Toolkits

 Like packages, plugins, addons, extenders, etc.
     – Reusable sets of operators, types, and functions
     – What can be included in a toolkit?
        •   Primitive and composite operators
        •   User-defined types
        •   Native and SPL functions
        •   Sample applications, utilities
        •   Tools, documentation, data, etc.
     – Versioning is supported
     – Define dependencies on other versioned assets (toolkits, Streams itself)
 Base for creating cross-domain and domain-specific applications

 Developed by IBM, partners, end-customers
     – Complete APIs and tools available
     – Same power as the “built-in” Standard Toolkit




Integration: the Internet and Database Toolkits
 Integration with traditional sources and consumers
 Internet Toolkit
     – InetSource: periodically retrieve data from HTTP, FTP, RSS, and files
 Database Toolkit
     – Supports ODBC databases: DB2, Informix, Oracle, solidDB, MySQL, SQL Server, Netezza
     – ODBCAppend: insert rows into an SQL database table
     – ODBCEnrich: extend streaming data based on lookups in database tables
     – ODBCRun: perform SQL queries with parameters from input tuples
     – ODBCSource: read data from an SQL database
     – SolidDBEnrich: perform table lookups in an in-memory database
     – DB2SplitDB: split a stream by DB2 partition key
     – DB2PartitionedAppend: write data to a table in a specified DB2 partition
     – NetezzaLoad: perform high-speed loads into a Netezza database
     – NetezzaPrepareLoad: convert tuples to delimited strings for Netezza loads
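The ODBC operators take their connection details from an external connection specification document rather than inline credentials. A sketch of ODBCAppend; the document path and the connection/access names below are assumptions for illustration:

// Append each incoming tuple as a row in a relational table;
// "etc/connections.xml" describes the data source and table
// (hypothetical names)
() as ToWarehouse = ODBCAppend(Results) {
  param connectionDocument : "etc/connections.xml";
        connection         : "WarehouseDB";
        access             : "ResultsTable";
}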
Integration: the Big Data Toolkit
 Integration with IBM's Big Data Platform
 Data Explorer
     – DataExplorerPush: insert records into a Data Explorer index
 BigInsights: Hadoop Distributed File System (HDFS)
     – HDFSDirectoryScan: like DirectoryScan, only for HDFS
     – HDFSFileSource: like FileSource, only for HDFS
     – HDFSFileSink: like FileSink, only for HDFS
     – HDFSSplit: write batches of data in parallel to HDFS
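Since the HDFS adapters mirror their Standard Toolkit counterparts, usage should look familiar. A sketch assuming HDFSFileSink accepts a file parameter like FileSink does (parameter names are assumptions, not verified against this toolkit's reference):

// Hypothetical: land streaming results in HDFS so BigInsights
// can analyze them at rest
() as ToHadoop = HDFSFileSink(Results) {
  param file   : "/user/streams/results.csv";  // assumed parameter name
        format : csv;
}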
Integration: the DataStage Integration Toolkit
 Streams real-time analytics
  and DataStage information
  integration
     – Perform deeper analysis on the
       data as part of the information
       integration flow
     – Get more timely results and
       offload some analytics load
       from the warehouse
 Operators and tooling
     – Adapters to exchange data
       between Streams and
       DataStage
        • DSSource
        • DSSink
     – Tooling to generate
       integration code
     – DataStage provides
       Streams connectors
Integration: the Messaging Toolkit
 Integrate with IBM WebSphere MQ
     – Create a stream from a subscription to an MQ topic or queue
     – Publish a stream to an MQ topic or queue
 Operators
     – XMSSource: read data from an MQ queue or topic
     – XMSSink: send messages to applications that use WebSphere MQ
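Like the database adapters, the XMS operators are configured from a connection specification document. A sketch of XMSSource; the document path and the connection/access names are assumptions for illustration:

// Turn messages arriving on a WebSphere MQ queue into tuples
stream<rstring payload> FromMQ = XMSSource() {
  param connectionDocument : "etc/connections.xml";  // hypothetical
        connection         : "mqConn";
        access             : "ordersQueue";
}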
Integration and Analytics: the Text Toolkit

 Derive structured information from unstructured text
     – Apply extractors
        •   Programs that encode rules for extracting information from text
        •   Written in AQL (Annotation Query Language)
        •   Developed against a static repository of text data
        •   AQL files can be combined into modules (directories)
        •   Modules can be compiled into TAM files
        •   NOTE: Old-style AOG files not supported by BigInsights 2.0 and this toolkit
                Can be used in Streams 3.0 with the Deprecated Text Toolkit
 Streams Studio plugins for developing AQL queries
     – Same tooling as in BigInsights
 Operator and utility
     – TextExtract: run AQL queries (module or TAM) over text documents
     – createtypes script: create stream type definitions that match the output views of the extractors
 Plays a major role in accelerators
     – SDA: Social Data Analytics
     – MDA: Machine Data Analytics


Analytics: The Complex Event Processing Toolkit
 MatchRegex                    Use patterns to detect composite events
     – In streams of simple events (tuples)
     – Easy-to-use regex-style pattern match of user-defined predicates
 Integration in Streams allows CEP-style processing with high
  performance and rich analytics
 Example: detect “M-shape” patterns in stock prices (double-top formations):

   stream<MatchT> Matches = MatchRegex(Quotes) {
     param
       pattern     : ". rise+ drop+ rise+ drop* deep";
       partitionBy : symbol;
       predicates  : {
         rise = price >  First(price) && price >= Last(price),
         drop = price >= First(price) && price <  Last(price),
         deep = price <  First(price) && price <  Last(price) };
     output
       Matches : symbol = symbol, seqNum = First(seqNum),
                 count = Count(), maxPrice = Max(price);
   }

   (The pattern matches a series of rising peaks and troughs, ending in a
   deep drop below the price that started the match.)
32                                                                                   © 2013 IBM Corporation
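The predicate-plus-regex idea above can be approximated in ordinary Python: label each incoming tuple with a predicate character, then run a standard regex over the resulting label string. This is only a loose illustration of the technique (the function names are invented, and `first` is fixed to the first price rather than the first tuple of each match attempt), not the SPL operator:

```python
import re

def label(first, prev, price):
    """Map one price move to a predicate label, mirroring rise/drop/deep."""
    if price > first and price >= prev:
        return "r"  # rise
    if price >= first and price < prev:
        return "d"  # drop
    if price < first and price < prev:
        return "D"  # deep: below the price at the start
    return "."      # anything else

def find_m_shape(prices):
    """Detect the double-top pattern: rise+ drop+ rise+ drop* deep."""
    first = prices[0]
    labels = "".join(
        label(first, prev, cur) for prev, cur in zip(prices, prices[1:])
    )
    return re.search(r"r+d+r+d*D", labels) is not None

print(find_m_shape([10, 12, 14, 13, 15, 16, 14, 9]))   # two peaks, deep drop
print(find_m_shape([10, 11, 12, 13, 14, 15, 16, 17]))  # monotone rise
```

The first series (two peaks, then a drop below the start) matches; the monotone rise does not.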
Analytics: The Geospatial Toolkit
 High-performance analysis and processing of geospatial data
 Enables Location Based Services (LBS)
     – Smarter Transportation: monitor traffic speed and density
     – Geofencing: detect when objects enter or leave a specified area
 Geospatial data types
     – e.g. Point, LineString, Polygon
 Geospatial functions
     – e.g. Distance, Map point to LineString, IsContained, etc.
33                                                                       © 2013 IBM Corporation
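The geofencing use case above ultimately reduces to a point-in-polygon test. A minimal ray-casting sketch in Python (an illustration of the idea only, not the toolkit's IsContained function):

```python
def is_contained(point, polygon):
    """Ray-casting point-in-polygon test: cast a ray to the right of the
    point and count edge crossings (an odd count means inside)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Edge straddles the horizontal line through the point,
        # and the crossing lies to the right of the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside

# A square geofence from (0,0) to (4,4):
fence = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(is_contained((2, 2), fence))  # True: inside the fence
print(is_contained((5, 2), fence))  # False: outside
```

An object's GPS fixes would be streamed through such a test; a change in the result signals entering or leaving the fenced area.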
Analytics: The TimeSeries Toolkit
 Apply Digital Signal Processing techniques
     – Find patterns and anomalies in real time
     – Predict future values in real time
 A rich set of functionality for working with time series data
     – Generation
        • Synthesize specific waveforms (e.g., sawtooth, sine)
     – Preprocessing
        • Preparation and conditioning (ReSample, TSWindowing)
     – Analysis
        • Statistics, correlations, decomposition, and transformation
     – Modeling
        • Prediction, regression, and tracking (e.g., Holt-Winters, GAMLearner)
34                                                                              © 2013 IBM Corporation
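The kind of real-time prediction the Modeling bullet refers to can be sketched with Holt's double exponential smoothing (level plus trend; the full Holt-Winters method adds a seasonal component). A Python illustration, not the toolkit's operator:

```python
def holt_forecast(series, alpha=0.5, beta=0.5, steps=1):
    """Holt's double exponential smoothing: maintain a smoothed level and
    trend over the series, then extrapolate the trend to forecast."""
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        last_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
    return [level + (k + 1) * trend for k in range(steps)]

# A perfectly linear series is extrapolated exactly:
print(holt_forecast([1, 2, 3, 4, 5], steps=2))  # [6.0, 7.0]
```

In a streaming setting the level and trend would be updated incrementally as each tuple arrives, rather than recomputed over the whole series.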
Analytics: The Mining Toolkit
 Enables scoring of real-time data in a Streams application
     – Scoring is performed against a predefined model (in PMML file)
     – Supports a variety of model types and scoring algorithms
 Predictive Model Markup Language (PMML)
     – Standard for statistical and data mining models
     – Produced by analytics tools: SPSS, SAS, etc.
     – XML representation defined by http://www.dmg.org/
 Scoring operators
     – Classification: assign tuple to a class and report confidence
     – Clustering: assign tuple to a cluster and compute score
     – Regression: calculate predicted value and standard deviation
     – Associations: identify the applicable rule and report consequent
       (rule head), support, and confidence
 Supports dynamic replacement of the PMML model used
  by an operator
35                                                                         © 2013 IBM Corporation
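The scoring idea — load a predefined model from a PMML file, then apply it to each incoming tuple — can be sketched with a much-simplified PMML regression fragment. Real PMML carries a namespace, a DataDictionary, and a MiningSchema, all trimmed here; the field names and values are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Much-simplified PMML RegressionModel (illustrative, not spec-complete):
PMML = """
<PMML version="4.1">
  <RegressionModel functionName="regression">
    <RegressionTable intercept="3.0">
      <NumericPredictor name="speed" coefficient="2.0"/>
      <NumericPredictor name="load"  coefficient="-1.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

def load_model(pmml_text):
    """Extract the intercept and coefficients from the RegressionTable."""
    table = ET.fromstring(pmml_text).find(".//RegressionTable")
    intercept = float(table.get("intercept"))
    coeffs = {p.get("name"): float(p.get("coefficient"))
              for p in table.findall("NumericPredictor")}
    return intercept, coeffs

def score(tuple_fields, model):
    """Score one tuple: intercept + sum of coefficient * field value."""
    intercept, coeffs = model
    return intercept + sum(c * tuple_fields[name] for name, c in coeffs.items())

model = load_model(PMML)
print(score({"speed": 10.0, "load": 4.0}, model))  # 3 + 2*10 - 1*4 = 19.0
```

Dynamic model replacement then amounts to re-running `load_model` on a new PMML file while tuples keep flowing.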
Analytics: Streams + SPSS Real Time Scoring Service
 Included with SPSS, not with Streams
   [Architecture diagram] Inside IBM InfoSphere Streams, an SPSS Modeler
   scoring stream runs in memory behind three operators: S (SPSS Scoring),
   R (SPSS Repository), and P (SPSS Publish). IBM SPSS Modeler Solution
   Publisher publishes the scoring stream to the file system; a change
   notification then triggers a model refresh from the IBM SPSS
   Collaboration & Deployment Services repository.
36                                                                                                      © 2013 IBM Corporation
Analytics: The Financial Services Toolkit
 Adapters
     – Financial Information Exchange (FIX)
     – WebSphere Front Office for Financial Markets (WFO)
     – WebSphere MQ Low-Latency Messaging (LLM) Adapters
 Types and Functions
     – OptionType (put, call), TxType (buy, sell), Trade, Quote, OptionQuote, Order
     – Coefficient of Correlation
     – “The Greeks” (put/call values, delta, theta, etc.)
 Operators
     – Based on QuantLib financial analytics open source package.
     – Compute theoretical value of an option (European, American)
 Example applications
     – Equities Trading
     – Options Trading
37                                                                         © 2013 IBM Corporation
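For the European case, the "theoretical value of an option" computation (backed by QuantLib in the toolkit) is the Black-Scholes formula, sketched here in Python purely for illustration:

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(spot, strike, rate, vol, t):
    """Theoretical value of a European call under Black-Scholes."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol * vol) * t) / (vol * sqrt(t))
    d2 = d1 - vol * sqrt(t)
    return spot * norm_cdf(d1) - strike * exp(-rate * t) * norm_cdf(d2)

# At-the-money call: S=100, K=100, r=5%, sigma=20%, one year to expiry:
print(round(black_scholes_call(100, 100, 0.05, 0.2, 1.0), 2))  # 10.45
```

"The Greeks" mentioned above are the partial derivatives of this value with respect to spot (delta), time (theta), and so on; American-style options require numerical methods rather than a closed form.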
RESOURCE LINKS
Where to get Streams

 Software available via
     – Passport Advantage site
     – PartnerWorld
     – Academic Initiative
     – IBM Internal
     – 90-day trial code
        • IBM Trials and Demos portal
        • ibm.com -> Support & downloads -> Download -> Trials and demos
39                                         © 2013 IBM Corporation
More Streams links

 Streams in the Information Management Acceleration Zone
 Streams in the Media Library (recorded sessions)
     – Search on “streams”
 The Streams Sales Kit
     – Software Sellers Workplace (the old eXtreme Leverage)
     – PartnerWorld
 InfoSphere Streams Wiki on developerWorks
     – Discussion forum, Streams Exchange
40                                          © 2013 IBM Corporation
41   © 2013 IBM Corporation
InfoSphere streams_technical_overview_infospherusergroup

  • 1. IBM InfoSphere Streams Technical Overview Jerome Chailloux Europe IOT - Sr. Technical Field Specialist - Big Data, Linux Advocate jerome.chailloux@fr.ibm.com February 21, 2013 © 2013 IBM Corporation
  • 2. IBM InfoSphere Streams v3.0 A platform for real-time analytics on BIG data  Volume Just-in-time decisions – Terabytes per second – Petabytes per day Powerful  Variety Analytics – All kinds of data – All kinds of analytics Millions of Microsecond events per Latency  Velocity second – Insights in microseconds  Agility Sensor, video, audio, text, – Dynamically responsive and relational data sources – Rapid application development 2 © 2013 IBM Corporation
  • 3. Streams Analyzes All Kinds of Data Mining in Microseconds (included with Streams) Acoustic (IBM Research) (Open Source) ***New Text Simple & Advanced Text Advanced (listen, verb), (included with Streams) Mathematical (radio, noun) Models (IBM Research) ***New Statistics Predictive (included with ∑ R(s , a ) population t t (included with Streams) Streams) ***New Image & Video Geospatial (Open Source) (included with Streams) 3 © 2013 IBM Corporation
  • 4. Categories of Problems Solved by Streams  Applications that require on-the-fly processing, filtering and analysis of streaming data – Sensors: environmental, industrial, surveillance video, GPS, … – “Data exhaust”: network/system/web server/app server log files – High-rate transaction data: financial transactions, call detail records  Criteria: two or more of the following – Messages are processed in isolation or in limited data windows – Sources include non-traditional data (spatial, imagery, text, …) – Sources vary in connection methods, data rates, and processing requirements, presenting integration challenges – Data rates/volumes require the resources of multiple processing nodes – Analysis and response are needed with sub-millisecond latency – Data rates and volumes are too great for store-and-mine approaches 4 © 2013 IBM Corporation
  • 5. The Big Data Ecosystem: Interoperability is Key. Streams – RTAP: analytics on data in motion, fed by streaming data feeds. BigInsights – analytics on data at rest, over Internet-scale data sets from non-traditional/non-relational and traditional/relational data sources. Data Warehouse – analytics on structured data from traditional/relational data sources 5 © 2013 IBM Corporation
  • 6. Streaming Analytics in Action. Stock Market – impact of weather on securities prices; analyze market data at ultra-low latencies. Natural Systems – wildfire management; water management. Law Enforcement, Defense & Cyber Security – real-time multimodal surveillance; situational awareness; cyber security detection. Transportation – intelligent traffic management. Fraud Prevention – detecting multi-party fraud; real-time fraud prevention. Manufacturing – process control for microchip fabrication. e-Science – space weather prediction; detection of transient events; synchrotron atomic research. Health & Life Sciences – neonatal ICU monitoring; epidemic early warning; remote healthcare monitoring. Telephony – CDR processing; social analysis (who’s talking to whom?); churn prediction; geomapping. Other – smart grid; text analysis; ERP for commodities; FPGA acceleration
  • 7. Use Case: Law Enforcement and Security  Video surveillance, wire taps, communications, call records, etc.  Millions of messages per second with low density of critical data  Identify patterns and relationships among vast information sources “The US Government has been working with IBM Research since 2003 on a radical new approach to data analysis that enables high speed, scalable and complex analytics of heterogeneous data streams in motion. The project has been so successful that the US Government will deploy additional installations to enable other agencies to achieve greater success in various future projects” – US Government 7 © 2013 IBM Corporation
  • 8. Predictive Analytics in a Neonatal ICU  Real-time analytics and correlations on physiological data streams – Blood pressure, Temperature, EKG, Blood oxygen saturation etc.,  Early detection of the onset of potentially life-threatening conditions – Up to 24 hours earlier than current medical practices – Early intervention leads to lower patient morbidity and better long term outcomes  Technology also enables physicians to verify new clinical hypotheses 8 © 2013 IBM Corporation
  • 9. Smarter, Faster, Cheaper CDR Processing: 6 billion CDRs per day, deduplication over a 15-day window, processing latency cut from 12 hours to a few seconds, on 6 machines (using ½ of processor capacity) running the InfoSphere Streams xDR Hub. Key requirements: price/performance and scaling
  • 10. Surveillance and Physical Security: TerraEchos (Business Partner)  Use scenario – State-of-the-art covert surveillance system based on the Streams platform – Acoustic signals from buried fiber optic cables are monitored, analyzed and reported in real time for necessary action – Currently designed to scale up to 1600 streams of raw binary data  Requirement – Real-time processing of multi-modal signals (acoustics, video, etc.) – Easy to expand, dynamic – 3.5M data elements per second  Winner 2010 IBM CTO Innovation Award
  • 11. Streams for Real-Time Geomapping. Multiple GPS data sources deliver 350–400K probe points/second per source; hierarchical mapping assigns each probe point to the nearest poly-line among 200 million – 1 billion poly-lines, using a 2-level grid-decomposition-based search, to build a real-time location profile. Hardware: 14 blade servers (2x dual-core Xeon 5160, 16 GB RAM) – 4 for data prep, 10 for mapping. Performance: 941,000 probes/sec against 1 billion poly-lines 11 © 2013 IBM Corporation
  • 12. Use Cases: Video Processing (Contour Detection) Original Picture Contour Detection 12 © 2013 IBM Corporation
  • 13. How Streams Works  Continuous ingestion  Continuous analysis 13 © 2013 IBM Corporation
  • 14. How Streams Works  Continuous ingestion  Continuous analysis. The infrastructure provides services for scheduling analytics across hardware hosts and establishing streaming connectivity. Typical processing stages: filter/sample, transform, annotate, correlate, classify. Scale is achieved by partitioning applications into software components and distributing them across stream-connected hardware hosts; where appropriate, elements can be fused together for lower communication latency 14 © 2013 IBM Corporation
  • 15. Scalable Stream Processing  Streams programming model: construct a graph – A mathematical concept (not a line, bar, or pie chart!), also called a network; familiar: for example, a tree structure is a graph – Consisting of operators and the streams that connect them: the vertices (or nodes) and edges of the mathematical graph – A directed graph: the edges have a direction (arrows)  Streams runtime model: distributed processes – Single or multiple operators form a Processing Element (PE) – Compiler and runtime services make it easy to deploy PEs on one machine, or across multiple hosts in a cluster when scaled-up processing is required – All links and data transport are handled by runtime services, automatically, with manual placement directives where required 15 © 2013 IBM Corporation
  • 16. From Operators to Running Jobs  Streams application graph: – A directed, possibly cyclic, graph – A collection of operators – Connected by streams  Each complete application is a potentially deployable job  Jobs are deployed to a Streams runtime environment, known as a Streams Instance (or simply, an instance)  An instance can include a single processing node (hardware) or multiple processing nodes 16 © 2013 IBM Corporation
  • 17. InfoSphere Streams Objects: Runtime View  Instance – Runtime instantiation of InfoSphere Streams executing across one or more hosts – Collection of components and services  Processing Element (PE) – Fundamental execution unit that is run by the Streams instance – Can encapsulate a single operator or many “fused” operators  Job – A deployed Streams application executing in an instance – Consists of one or more PEs 17 © 2013 IBM Corporation
  • 18. Competition: Complex-Event Processing (CEP). Streams:  Analytics on continuous real-time data streams  Simple to extremely complex analytics  Scale for computational intensity  Supports a whole range of relational and non-relational data types. CEP:  Analysis on discrete business events  “Ultra-low” latency  Rules-based (if-then-else) with correlation across event types  Only structured data types are supported  Modest data rates 18 © 2013 IBM Corporation
  • 19. IBM InfoSphere Streams 3.0. Scale-out architecture: clustered runtime for near-limitless capacity; RHEL v5.3 and above; CentOS v6.0 and above; x86 & Power multicore hardware; InfiniBand support; Ethernet support. Sophisticated analytics with toolkits & accelerators: Big Data, CEP, Database, Data Explorer (Big Data), DataStage, Finance, Geospatial, Internet, Messaging, Mining, SPSS, Standard, Text, and TimeSeries toolkits; Telco & Social Media accelerators. Comprehensive tooling (Front Office 3.0): Eclipse IDE; web console; drag & drop editor; instance graph; Streams visualization; Streams debugger 19 © 2013 IBM Corporation
  • 20. What is Streams Processing Language?  Designed for stream computing – Define a streaming-data flow graph – Rich set of data types to define tuple attributes  Declarative – Operator invocations name the input and output streams – Referring to streams by name is enough to connect the graph  Procedural support – Full-featured imperative language – Custom logic in operator invocations – Expressions in attribute assignments and parameter definitions  Extensible – User-defined data types – Custom functions written in SPL or a native language (C++ or Java) – Custom operators written in SPL – User-defined operators written in a native language (C++ or Java) 20 © 2013 IBM Corporation
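To make the declarative style concrete, here is a minimal SPL sketch of a complete application graph; it is illustrative only (not from the deck), using Standard Toolkit operators named later in this presentation, with parameter details that follow the general shape of SPL rather than a tested program:

```spl
// Minimal SPL application sketch: source -> transform -> sink.
composite Main {
    graph
        // Generate a test stream of increasing identifiers
        stream<uint64 id> Numbers = Beacon() {
            param iterations : 1000u;
            output Numbers : id = IterationCount();
        }
        // Derive a new attribute on each tuple
        stream<uint64 id, uint64 squared> Squares = Functor(Numbers) {
            output Squares : squared = id * id;
        }
        // Write results to a CSV file
        () as Writer = FileSink(Squares) {
            param file   : "squares.csv";
                  format : csv;
        }
}
```

Note how simply naming `Numbers` as the input of `Functor` and `Squares` as the input of `FileSink` is enough to connect the graph.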
  • 21. Streams Standard Toolkit: Relational Operators  Relational operators – Functor Perform tuple-level manipulations – Filter Remove some tuples from a stream – Aggregate Group and summarize incoming tuples – Sort Impose an order on incoming tuples in a stream – Join Correlate two streams – Punctor Insert window punctuation markers into a stream 21 © 2013 IBM Corporation
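As a sketch of how the relational operators compose, the fragment below filters a stream and computes a windowed aggregate; it assumes a hypothetical upstream stream `Quotes` with `symbol` and `price` attributes, and the window/parameter syntax is illustrative of SPL rather than copied from the deck:

```spl
// Keep only high-priced quotes
stream<rstring symbol, float64 price> BigTrades = Filter(Quotes) {
    param filter : price > 100.0;
}
// Average price per symbol over tumbling 50-tuple windows
stream<rstring symbol, float64 avgPrice> Averages = Aggregate(BigTrades) {
    window BigTrades : tumbling, count(50), partitioned;
    param  partitionBy : symbol;
    output Averages    : avgPrice = Average(price);
}
```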
  • 22. Streams Standard Toolkit: Adapter Operators  Adapter operators – FileSource Read data from files in formats such as csv, line, or binary – FileSink Write data to files in formats such as csv, line, or binary – DirectoryScan Detect files to be read as they appear in a given directory – TCPSource Read data from TCP sockets in various formats – TCPSink Write data to TCP sockets in various formats – UDPSource Read data from UDP sockets in various formats – UDPSink Write data to UDP sockets in various formats – Export Make a stream available to other jobs in the same instance – Import Connect to streams exported by other jobs – MetricsSink Create displayable metrics from numeric expressions 22 © 2013 IBM Corporation
  • 23. Streams Standard Toolkit: Utility Operators  Workload generation and custom logic – Beacon Emit generated values; timing and number configurable – Custom Apply arbitrary SPL logic to produce tuples  Coordination and timing – Throttle Make a stream flow at a specified rate – Delay Time-shift an entire stream relative to other streams – Gate Wait for acknowledgement from downstream operator – Switch Block or allow the flow of tuples based on control input 23 © 2013 IBM Corporation
  • 24. Streams Standard Toolkit: Utility Operators (cont’d)  Parallel flows – Barrier Synchronize tuples from sequence-correlated streams – Pair Group tuples from multiple streams of same type – Split Forward tuples to output streams based on a predicate – ThreadedSplit Distribute tuples over output streams by availability – Union Construct an output tuple from each input tuple – DeDuplicate Suppress duplicate tuples seen within a given time period  Miscellaneous – DynamicFilter Filter tuples based on criteria that can change while it runs – JavaOp General-purpose operator for encapsulating Java code – Parse Parse blob data for use with user-defined adapters – Format Format blob data for use with user-defined adapters – Compress Compress blob data – Decompress Decompress blob data – CharacterTransform Convert blob data from one encoding to another 24 © 2013 IBM Corporation
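A common use of the utility operators is replaying recorded data at a controlled rate for testing; the sketch below (illustrative, with assumed file name and rate) pairs a `FileSource` with `Throttle`:

```spl
// Read a log file line by line
stream<rstring line> Raw = FileSource() {
    param file   : "input.log";
          format : line;
}
// Pace the replay at roughly 100 tuples per second
stream<Raw> Paced = Throttle(Raw) {
    param rate : 100.0;
}
```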
  • 25. XML Support: Built Into SPL  XML type – Validated for syntax or schema  XMLParse operator – With XPath expressions and functions to parse and manipulate XML data – Convert XML to tuples  XQuery functions – Use XML data on the fly  Adapters support XML format – Standard Toolkit – Database Toolkit • Supports DB2 pureXML. Example: extract information about books from an XML catalog such as <catalog> <book price="30.99"> <title>This is a boring title</title> <author>John Smith</author> <author>Howard Hughes</author> <reference quality="-1"> <book>The first reference</book> </reference> <reference quality="100"> <book>Another Book</book> </reference> </book> </catalog> — using: stream<BookInfo> X = XMLParse(XML) { param trigger : "/catalog/book"; parsing : permissive; // log and ignore errors output X : title = XPath("title/text()"), authors = XPathList("author/text()"), price = (decimal32) XPath("@price"), references = XPathList("reference", {quality = (int32) XPath("@quality"), book = XPath("book/text()")}); } 25 © 2013 IBM Corporation
  • 26. Streams Extensibility: Toolkits  Like packages, plugins, addons, extenders, etc. – Reusable sets of operators, types, and functions – What can be included in a toolkit? • Primitive and composite operators • User-defined types • Native and SPL functions • Sample applications, utilities • Tools, documentation, data, etc. – Versioning is supported – Define dependencies on other versioned assets (toolkits, Streams itself)  Base for creating cross-domain and domain-specific applications  Developed by IBM, partners, end-customers – Complete APIs and tools available – Same power as the “built-in” Standard Toolkit 26 26 © 2013 IBM Corporation
  • 27. Integration: the Internet and Database Toolkits  Integration with traditional sources and consumers  Internet Toolkit – InetSource Periodically retrieves data from HTTP, FTP, RSS, and files  Database Toolkit • ODBC databases: DB2, Informix, Oracle, solidDB, MySQL, SQL Server, Netezza – ODBCAppend Insert rows into an SQL database table – ODBCEnrich Extend streaming data based on lookups in database tables – ODBCRun Perform SQL queries with parameters from input tuples – ODBCSource Read data from an SQL database – SolidDBEnrich Perform table lookups in an in-memory database – DB2SplitDB Split a stream by DB2 partition key – DB2PartitionedAppend Write data to a table in a specified DB2 partition – NetezzaLoad Perform high-speed loads into a Netezza database – NetezzaPrepareLoad Convert tuples to delimited strings for Netezza loads 27 © 2013 IBM Corporation
  • 28. Integration: the Big Data Toolkit  Integration with IBM’s Big Data Platform  Data Explorer – DataExplorerPush Insert records into a Data Explorer index  BigInsights: Hadoop Distributed File System – HDFSDirectoryScan Like DirectoryScan, only for HDFS – HDFSFileSource Like FileSource, only for HDFS – HDFSFileSink Like FileSink, only for HDFS – HDFSSplit Write batches of data in parallel to HDFS 28 © 2013 IBM Corporation
  • 29. Integration: the DataStage Integration Toolkit  Streams real-time analytics and DataStage information integration – Perform deeper analysis on the data as part of the information integration flow – Get more timely results and offload some analytics load from the warehouse  Operators and tooling – Adapters to exchange data between Streams and DataStage • DSSource • DSSink – Tooling to generate integration code – DataStage provides Streams connectors 29 © 2013 IBM Corporation
  • 30. Integration: the Messaging Toolkit  Integrate with IBM WebSphere MQ – Create a stream from a subscription to an MQ topic or queue – Publish a stream to an MQ series topic or queue  Operators – XMSSource Read data from an MQ queue or topic – XMSSink Send messages to applications that use WebSphere MQ 30 © 2013 IBM Corporation
  • 31. Integration and Analytics: the Text Toolkit  Derive structured information from unstructured text – Apply extractors • Programs that encode rules for extracting information from text • Written in AQL (Annotation Query Language) • Developed against a static repository of text data • AQL files can be combined into modules (directories) • Modules can be compiled into TAM files • NOTE: Old-style AOG files not supported by BigInsights 2.0 and this toolkit  Can be used in Streams 3.0 with the Deprecated Text Toolkit  Streams Studio plugins for developing AQL queries – Same tooling as in BigInsights  Operator and utility – TextExtract Run AQL queries (module or TAM) over text documents – createtypes script • Create stream type definitions that match the output views of the extractors.  Plays a major role in accelerators – SDA: Social Data Analytics – MDA: Machine Data Analytics 31 © 2013 IBM Corporation
  • 32. Analytics: The Complex Event Processing Toolkit  MatchRegex Use patterns to detect composite events – In streams of simple events (tuples) – Easy-to-use regex-style pattern match of user-defined predicates  Integration in Streams allows CEP-style processing with high performance and rich analytics. Example: detect “M-shape” (double-top) patterns in stock prices – a series of rising peaks and troughs ending in a deep drop below the start of the match: stream<MatchT> Matches = MatchRegex(Quotes) { param pattern : ". rise+ drop+ rise+ drop* deep"; partitionBy : symbol; predicates : { rise = price > First(price) && price >= Last(price), drop = price >= First(price) && price < Last(price), deep = price < First(price) && price < Last(price) }; output Matches : symbol = symbol, seqNum = First(seqNum), count = Count(), maxPrice = Max(price); } 32 © 2013 IBM Corporation
  • 33. Analytics: The Geospatial Toolkit  High-performance analysis and processing of geospatial data  Enables Location Based Services (LBS) – Smarter Transportation: monitor traffic speed and density – Geofencing: detect when objects enter or leave a specified area  Geospatial data types – e.g. Point, LineString, Polygon  Geospatial functions – e.g. Distance, Map point to LineString, IsContained, etc. 33 © 2013 IBM Corporation
  • 34. Analytics: The TimeSeries Toolkit  Apply Digital Signal Processing techniques – Find patterns and anomalies in real time – Predict future values in real time  A rich set of functionality for working with time series data – Generation • Synthesize specific wave forms (e.g., sawtooth, sine) – Preprocessing • Preparation and conditioning (ReSample, TSWindowing) – Analysis • Statistics, correlations, decomposition and transformation – Modeling • Prediction, regression and tracking (e.g. Holt-Winters, GAMLearner) 34 © 2013 IBM Corporation
  • 35. Analytics: the Mining Toolkit  Enables scoring of real-time data in a Streams application – Scoring is performed against a predefined model (in a PMML file) – Supports a variety of model types and scoring algorithms  Predictive Model Markup Language (PMML) – Standard for statistical and data mining models – Produced by analytics tools: SPSS, SAS, etc. – XML representation defined by http://www.dmg.org/  Scoring operators – Classification Assign tuple to a class and report confidence – Clustering Assign tuple to a cluster and compute score – Regression Calculate predicted value and standard deviation – Associations Identify the applicable rule and report consequent (rule head), support, and confidence  Supports dynamic replacement of the PMML model used by an operator 35 © 2013 IBM Corporation
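The shape of a scoring invocation might look like the hypothetical sketch below; the parameter name and output attributes are invented for illustration and are not the Mining Toolkit's actual signature:

```spl
// Hypothetical PMML scoring sketch: classify incoming customer tuples
// against a pre-built model. "model", "predictedClass", and "confidence"
// are illustrative names, not the toolkit's documented API.
stream<rstring customerId, rstring predictedClass, float64 confidence>
    Scored = Classification(Customers) {
    param model : "churn_model.pmml";  // assumed model file
}
```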
  • 36. Analytics: Streams + SPSS Real-Time Scoring Service  Included with SPSS, not with Streams  Operators: SPSS Scoring operator, SPSS Repository operator, SPSS Publish operator  The scoring stream runs in-memory inside InfoSphere Streams; models are published through IBM SPSS Modeler Solution Publisher, with model refresh driven by file-system change notifications from the IBM SPSS Collaboration & Deployment Services repository 36 © 2013 IBM Corporation
  • 37. Analytics: the Financial Services Toolkit  Adapters – Financial Information Exchange (FIX) – WebSphere Front Office for Financial Markets (WFO) – WebSphere MQ Low-Latency Messaging (LLM) Adapters  Types and Functions – OptionType (put, call), TxType (buy, sell), Trade, Quote, OptionQuote, Order – Coefficient of Correlation – “The Greeks” (put/call values, delta, theta, etc.)  Operators – Based on QuantLib financial analytics open source package. – Compute theoretical value of an option (European, American)  Example applications – Equities Trading – Options Trading 37 © 2013 IBM Corporation
  • 39. Where to get Streams  Software available via – Passport Advantage site – PartnerWorld – Academic Initiative – IBM Internal – 90-day trial code • IBM Trials and Demos portal • ibm.com -> Support & downloads -> Download -> Trials and demos 39 © 2013 IBM Corporation
  • 40. More Streams links  Streams in the Information Management Acceleration Zone  Streams in the Media Library (recorded sessions) – Search on “streams”  The Streams Sales Kit – Software Sellers Workplace (the old eXtreme Leverage) – PartnerWorld  InfoSphere Streams Wiki on developerWorks – Discussion forum, Streams Exchange 40 © 2013 IBM Corporation
  • 41. 41 © 2013 IBM Corporation

Editor’s notes

  1. Streams key messages: Velocity – low latency – microsecond response times Variety – all kinds of data, all kinds of analytics Volume – millions of events per second Agility – respond quickly to changes; fast development of Streaming apps
  2. That one sentence is the shortest statement I can think of that captures the essence of Streams Platform : Streams is not a solution or application, nor is it a limited-purpose tool. Instead, it is a platform. This means that it does not do anything useful when you first install it. It needs to be programmed before it can do anything. In that sense, it is much like a DBMS, or an operating system. It comes with the tools, language, and building blocks that let you build programs for it, and with a runtime environment that lets you run those programs. The bad news is that you have to program it (you can’t just pick a few options from a menu); the good news is that once you learn how to do that, it can do anything you can imagine (or, more accurately, anything you can find algorithms and code for). Real-time : The Streams programs you create do their processing and analysis in as close to real-time as it is possible to get on a standard IT platform. In this case, real-time means very low latency, where latency is the delay from the time a packet of data arrives to the time the result is available. A key factor here is that Streams does everything in memory; it has no concept of mass storage (disk). Analytics : Because Streams is fast, scalable, and programmable, the kinds of analysis you can apply ranges from the simple to the extremely sophisticated. You are not limited to simple averages or if-then-else rules. BIG data : Actually, make that infinite data. For purposes of program and algorithm design, streaming data has no beginning and no end and is therefore by definition infinite in volume. In practical terms, this means that Streams can process any kind of data feed, including those that would be much too slow or expensive to capture and store in their entirety. 
We’ve added Agility to Big Data’s three V’s (volume, variety, velocity): a few features of the programming language and runtime make it easy to break an application into modules, add new modules, and modify existing ones without interrupting the overall solution.
  3. The blue items are included in the Streams product (Mining in Microseconds, Statistics, Text Analysis) The red items have been built for Streams (Acoustic, Image & Video, Geospatial, Predictive, and Advanced Mathematics) and have been used in various projects and engagements, but they have not been made a part of the product. Contact development if you think any of them could help you in an opportunity. More later in this presentation on the toolkits available in the Streams product. A few notes: The “simple” in “Simple & Advanced Text” refers to basic functions for regular expression matching (and replacement) in string values; these are built into the Standard Toolkit, meaning they’re basically part of the language. “Advanced” refers to real text analysis using the same System T code used in BigInsights. This is new in Fix Pack 3 (November 2011). Statistics included with Streams range from simple aggregate functions (average, sum, etc.) to the more advanced metrics included in the Financial Services Toolkit. For true time series analysis, we have an advanced toolkit in development; this is covered under Advanced Mathematical Models in this slide.
  4. Streams targets applications where processing goals are best achieved by on-the-fly mining and processing of streaming data. That is, where ingest volume and rate exceed storage capacity Requires data filtering, aggregating, and other processing to reduce the volume so it can be handled by a data storage and management system at the back end. Even if the goal is not analysis on the fly, Streams is needed simply to prepare the data for analysis later. Example: extremely noisy and voluminous data feeds in radio astronomy ingest rate plus processing requirements exceed ability of single application on a single processor to process the data InfoSphere Streams is specifically designed with parallelism and deployment across computing clusters in mind. result is needed in (near) real-time Latency refers to the delay, or elapsed time, between the appearance of a data item at the input and the availability of the result at the output Analysis of stored data is not a viable option when delays measured in milliseconds add up to lives or money lost. Data must be processed in memory, as it comes in, before it can be committed to persistent storage. [continued on next page]
  5. The integration from the previous slide is depicted here again, in a slightly different and simpler way. The key point is that the Big Data approach, and in particular Streams/RTAP (Real-Time Analytic Processing), is (almost) always in addition to, not instead of, traditional warehousing and IT. All three approaches are not necessarily present in every project, but there are always bidirectional connections between them.
  6. Key Points Representative and impressive use cases A way to augment and improve an existing use case and traditional technology - algorithmic trading, risk, fraud detection, security New opportunities enabled by the software - Galway Bay, Neonatal intensive care, Radio astronomy, traffic management Virtually every field needs this: Streams is suitable to build solutions for a variety of domains Stock market – Impact of weather on commodity pricing or stock prices – demo built to show impact of hurricane on oil assets in the Gulf of Mexico, Sample Equity and Options trading apps included with Streams Natural systems – A university with access to US Weather satellites and unmanned aerial vehicles wants to build a real-time wildfire analysis and monitoring system that can detect smoke from wildfires, and task satellites and UAV’s to monitor fires and direct firefighters. Several organizations are looking to better understand watersheds and oceans to predict weather patters and manage fishing stocks Transportation -- Real time fleet management is one application Another is integrating vehicle traffic information with subway, train and taxi information to improve transit operation. Pilot done for Stockholm with KTH University Manufacturing – IBM Burlington used Streams in a pilot to automate analysis of chip testing to improve yield Health &amp; Life sciences – by correlating information across neonatal monitors, detection of systemic infection can be discovered 6 to 24 hours earlier than experienced nurses. Now being extended for brain trauma in Neurologic ICU Telephony – call detail record processing transform telephone records in ASN.1 format to standard ASCII and puts unto databases for billing. It also summarizes to a dashboard. Other apps set up a social network of who is talking to Whom, and who is likely to leave your company. Geomapping allows a telco to provide location based services. 
e-Science – Space Weather prediction can help alleviate damage from solar flares to satellite and electric grid. Detection of transient events and imaging remote universe from an array of radio telescopes – a worldwide $2 billion USD multiyear effort. Synchrotron research enables re-use of data across many research organizations. Fraud prevention – Detecting multi-party fraud and finding fraud faster can result in less fraud Law Enforcement – monitoring cameras to detect faces and then do facial recognition can be used to find criminals; combining many data types can improve analysis and provide better situational awareness Other – smart grid to talk to many meters, text analytics to understand content, Who’s talking to Whom – voice analysis to build up social network, Weather summarization to optimize commodities purchase and FGPA acceleration to improve number crunching (mathematical analysis) in Streams)
  7. This is about all we can say about the original use case that led to the government-IBM partnership that produced Distillery / System S / Streams. Government continues to be one of two major industry focus areas, along with telco.
  8. This is still very much a research project. ICU = Intensive Care Unit. In this case (neonatal), for premature babies. UOIT = University of Ontario Institute of Technology (Toronto) Important observations: The instrumentation is there: dozens of sensors for each infant. But the instruments work in isolation and only produce visual signals, auditory signals in case of anomalies, and sometimes a paper record on demand. Streams is a highly usable platform for finally capturing all that data and combining it in useful ways, in real time. Since this kind of combined analysis has not been done before, Streams first needs to enable the research effort of figuring out which patterns are relevant and predictive, before any of this can be standardized and put into an approved product and practice. Both of these factors are present in many other organizations and industries.
  9. In collaboration with KTH university in Stockholm, we built a system to take GPS data from taxis and trucks and, in real time, derive information about traffic speed and density for each road segment, as well as travel time between specific locations. It turns out that processing GPS data is not just a matter of gathering up the reported locations. You have to filter out bad data and “snap” the location to the nearest road segment that goes in the right direction—because there is uncertainty in the location and vehicles are assumed to be on roads, not in the middle of a building or a field. This requires an interesting combination of keeping static map data (the road segments) in memory and dividing up the map into tiles to take advantage of multiple processing nodes. There are some performance numbers in the slide, but the important point is that we were able to use the Streams platform to build and incorporate fairly sophisticated spatial operators and achieve the desired performance by dividing up the area in to regions and distributing the load—an example of real-time map-reduce (the algorithmic principle, not the Hadoop operating system).
10. This slide is here as a placeholder: it is often fun and useful to show the recorded demo of this video processing app. This was developed by the researcher in the picture on Streams 1.2, using user-defined operators built around the OpenCV open-source image processing library. It is not part of any customer engagement (although the library is). The reason this is useful is that for those who are unfamiliar with the concept of streaming data, it is easy to grasp that a video is nothing but a stream of still images (we call it “streaming video” on the web, for good reason). So by showing how each image (video frame) is processed through a series of steps—from color-to-greyscale conversion through noise filtering and thresholding (to create a one-bit, black-and-white image) to the edge-detection step that creates the contours—we can get across Streams’s fundamental computing paradigm: that of discrete packets of data flowing from operator to operator in a graph consisting of such operators connected by streams. If you play the video (make it loop while you talk), notice that Streams does all that to each frame, 24 or 30 of them per second, and when the processed and original frame come back together, no delay is noticeable. You can get it here: http://cattail.boulder.ibm.com/cattail/#view=gilesjam@us.ibm.com/files/842A08609BCA3DDA8AB3367D093F23B6
  11. This slide and the following one are old but serve their purpose well. As with the video processing demo, the aim is for people to get familiar with the concept of streaming data processing as the flow of discrete packets of data from one operator to the next in a flow graph; each operator does something with the data that comes in and produces new data as a result. Seeing those little blocks move from bubble to bubble in the animated slide, relentlessly and never ending, conveys the point rather well, so let the audience stare at this for a while, especially once you reveal what goes on behind the grey panel. At this point, however, all you’re showing is that a Streams application (here shown as a black box—or rather, a grey one) continuously ingests data from multiple sources (often many different types of data arriving at very different rates), and continuously sends analysis results and processed data to disk files, databases, real-time displays, or any other computational or IT processes (depicted as the proverbial cloud).
12. Click-by-click (3 and 4 are timed—no click needed): The overall application is composed of many individual operators (or, maybe, Processing Elements consisting of one or more operators each); those are the bubbles in this graph. The data packets (sometimes called messages; we call them tuples) generated by one bubble travel to the next downstream bubble, and so on. Sometimes the output from one bubble becomes input to multiple bubbles; conversely, sometimes multiple output streams merge as input to one bubble. But the tuples keep flowing. Having all these separate processes working together gives us flexibility and scalability. If we need more power than a single computer can provide, we can simply distribute the bubbles over multiple boxes (hosts). Some (compute-intensive) bubbles get their very own box; other bubbles want to be in the same box, because they must exchange a lot of data and going across the network would slow that down. The Streams runtime services can help distribute the load. In the picture, the streams flowing within a host (interprocess communication) turn red, indicating that they are faster than the blue ones that go across the network between hosts. Finally, we label some of the boxes with some of the processing stages typically found in many Streams solutions: filter and sample to reduce the amount of data flowing through the system; transform, changing the data format, computing new quantities, and dropping unneeded attributes; correlate to join or otherwise combine data from multiple sources; classify based on models generated from historical or training data; annotate to enrich the data by looking up static information based on keys present in the data.
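For programmers, the bubbles-and-tuples picture maps naturally onto a chain of generators. The sketch below is only an analogy in plain Python (one process, no distribution, invented stage names that echo the typical processing steps); it is not how Streams itself is implemented.

```python
def source(values):
    """Ingest: emit raw readings as tuples (dicts stand in for tuples)."""
    for v in values:
        yield {"reading": v}

def filter_op(stream, threshold):
    """Filter and sample: drop tuples below a threshold to reduce volume."""
    for t in stream:
        if t["reading"] >= threshold:
            yield t

def transform_op(stream):
    """Transform: compute a new attribute, drop what is no longer needed."""
    for t in stream:
        yield {"scaled": t["reading"] * 10}

def sink(stream):
    """Collect results (a stand-in for a file, database, or live display)."""
    return [t["scaled"] for t in stream]

# Wire the flow graph: source -> filter -> transform -> sink.
result = sink(transform_op(filter_op(source([1, 5, 2, 7]), threshold=3)))
```

Each generator only sees tuples as they arrive and immediately passes its output downstream, which is the essence of the flow-graph paradigm the slide animates.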
13. The programming model is to define a data flow graph that defines the connections among data sources (inputs), operators, and data sinks (outputs). Operators, and the streams that connect them, are the building blocks of the logical program. The deployment model is to group, or “fuse”, operators into units called Processing Elements, or PEs. PEs are separate executables that form the building blocks of the distributed application. A PE contains one or more operators. A program consisting of many operators may be fused into a single PE; at the other extreme, each operator may run in its own PE. Operators fused into a single PE are tightly coupled; data transfer is local and fast. Communication between PEs uses the network protocol and is therefore slower; but these more loosely coupled components can be flexibly distributed over available processing nodes and are more easily reusable. At the physical level, a Streams job (a running application) consists of multiple intercommunicating PEs. This lets you place resource-hungry analytics on appropriately sized nodes. Reuse of generic components is a side benefit, but choosing the optimum level of component granularity and performance is currently more art than science. Streams provides the infrastructure to support the decomposition of applications into scalable chunks (PEs), and the deployment and operation of these PEs across stream-connected processing nodes. Currently, the product only supports nodes with x86 architecture, running a Linux (Red Hat) operating system.
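The fused-versus-distributed trade-off can be made concrete with a toy sketch: within a PE, one operator hands a tuple to the next as a plain function call; between PEs, the tuple must be serialized and sent across a transport. The two operators and the JSON "wire" below are invented stand-ins purely for illustration; Streams uses its own transport, not JSON.

```python
import json

def op_a(t):
    """First operator: double the x attribute."""
    return {"x": t["x"] * 2}

def op_b(t):
    """Second operator: derive y from x."""
    return {"y": t["x"] + 1}

def fused(t):
    """Operators fused into one PE: a plain in-process function call, no copy."""
    return op_b(op_a(t))

def across_pes(t):
    """Operators in separate PEs: the tuple is serialized, 'sent', and
    deserialized -- flexible placement, but slower than a local call."""
    wire = json.dumps(op_a(t))      # stand-in for the network hop
    return op_b(json.loads(wire))
```

Both paths compute the same result; the difference is purely in transfer cost and placement flexibility, which is exactly the fusion decision described above.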
14. A stream processing application is modeled as a graph (also known as a network): a set of vertices (or nodes) connected by edges. In Streams, the vertices are called operators and the edges are called, well, streams. In mathematics, there are different types of graph; ours is directed, which means that the edges have a direction from one vertex to another but not back; they are arrows. Also, our graph may be cyclic, which means that loops are allowed. There are restrictions on this (only certain types of backward connections are allowed) but this is good enough for now. The term job simply refers to a running application. It’s looking at it from the runtime execution perspective rather than the programming perspective.
15. Another view of the same thing, to get a clear model in everyone’s mind. The Streams Instance is the runtime environment that makes it possible to run Streams applications: analogous to an application server or a database server. The instance is a collection of Linux components and services, managing one or more hosts for running applications. We’ve already seen the Processing Element as well: the fundamental unit of execution, akin to a shared library. Each PE encapsulates one or more operators; at deployment time, each PE is loaded by its own executable program, the PE Container. When a Streams application runs in the Instance, it is called a Job. The job, in turn, manifests itself as a collection of one or more PEs. When you monitor an instance (using one of a variety of tools, including a command-line program, a browser-based management console, and an interactive view of the flow graph in Streams Studio), you can get information about jobs and PEs, submit and cancel jobs, terminate and restart PEs, view PE-specific logs, and so on.
16. There is a separate deck with competitive information about Complex Event Processing and vendors in that space. This is just one summary slide. CEP as a product category has been known in the industry a bit longer than stream processing. To this day, there are no obvious competitors that can approach Streams on the true stream-processing end, so we look here at CEP. Both use very similar terms to describe what they do. “Real-time” or near-real-time refers to latency but also throughput, and to the in-memory nature of processing. The question is, how quick do you have to be to call yourself “real time”? Likewise, how low is “ultra-low”? CEP literature also talks about stream processing, but they call it “event stream processing”. In practice this can cover whatever you want, but in concept it is different: Streams processes streams of low-level, discrete data packets, whereas CEP deals with higher-level, “complex” events made up of a number of business process steps. How we look at the data coming in is fundamentally different (even if it’s the same data). Our streams are continuous flows of discrete data packets; theirs are separate, individual events. That colors how we then deal with the data in our environment. (Note that “continuous” here is not used in the mathematical sense of a function without steps, holes, or spikes, but in the everyday-language sense of “always flowing, never ceasing”. What arrives on these flows are discrete packets of data, which suggests the opposite of a different meaning of continuous.) The other distinctions made here are of degree, in the areas of complexity of analysis, performance, and data variety.
17. There is a lot of new stuff in Streams 3.0. Aside from numerous improvements under the covers, the ones that stand out are:
Drag & drop editing: this opens up a whole new way of introducing Streams and giving prospective customers and partners a hands-on experience without getting lost in the details of the SPL syntax.
Data visualization in the Streams Console: there are few easy-to-use tools that can graphically display dynamically changing data, and while this capability does not have the sophistication of Cognos RTM and other dashboarding tools, this is perfect for development, demos, and even POCs.
Several new analytics and integration toolkits that greatly enhance the spectrum of problem domains and industry areas that can be served.
18. Streams Processing Language is unique to IBM InfoSphere Streams. This is what gives Streams the power and flexibility that other systems lack. Of course, the downside is that there is yet another language to learn. But in practice, this is not the burden it may seem, because the greatest step in the learning curve is getting comfortable with a new computing paradigm, the distributed nature of typical deployments, and so on. Once you’ve taken that step, the specific syntax of the language is a minor addition—especially since the IDE helps complete your stream and operator declarations with all the brackets and braces in the right place.
Type-generic operators: the built-in operators work on any combination of the data types supported by Streams.
Declarative operators: to configure the operators, you declare them, give them names, define input and output stream(s), and assign parameters.
User-defined operators: these can take different forms, including one that makes them behave exactly like all the built-in operators (type-generic).
Procedural constructs: additional language elements that control repetition, loops, and other aspects of the streams connecting the operators.
The following slides cover all of this in greater detail.
  19. Reference slide. The Relational operators Aggregate, Sort, and Join are named after analogous operations in relational databases, but in stream processing there is a twist: they can only operate on windows of tuples, never the “entire” stream. Functor is a general-purpose attribute transformation and filtering operator with many of the capabilities of the SQL SELECT statement. Punctor is unique to Streams: it exists solely to add window markers to the stream, thus separating groups of tuples into disjoint windows that can be used by downstream operators for their window-based operations. Note that Import and Export are not “real” operators in that they don’t generate any code. Instead, they add metadata for the Streams Application Manager service to use in making stream connections between applications. For convenience, they are made to look syntactically like regular operators. And since they’re not about communicating with the outside world, they’re not really “adapters” either.
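The window-bound twist can be shown with a tiny sketch: a streaming aggregate can only ever summarize the tuples currently in its window, never the unbounded stream. This is a plain-Python analogy using a count-based sliding window, not the SPL Aggregate operator itself.

```python
from collections import deque

def windowed_sum(stream, size):
    """Emit, for each arriving value, the sum over a sliding window of the
    last `size` tuples -- the streaming analogue of a relational aggregate,
    which cannot wait for the 'entire' (unbounded) stream."""
    window = deque(maxlen=size)   # old tuples fall out automatically
    for value in stream:
        window.append(value)
        yield sum(window)
```

With a window of two, the input 1, 2, 3, 4 yields running sums over at most the two most recent tuples, one result per arrival.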
  20. Reference slide. Note that Import and Export are not “real” operators in that they don’t generate any code. Instead, they add metadata for the Streams Application Manager service to use in making stream connections between applications. For convenience, they are made to look syntactically like regular operators. And since they’re not about communicating with the outside world, they’re not really “adapters” either.
  21. Reference slide. Many of these operators only make sense in a streaming-data system. The Custom operator is like a blank slate. It can perform any kind of custom logic in response to the arrival of a tuple on one of its input ports, including writing to the console, sending out tuples and punctuation markers, and calling system functions, by using the simple but powerful SPL expression language. The Gate operator can help even out tuple flow and avoid overwhelming a downstream operator by only sending tuples on after receiving acknowledgment of previous tuples. The Switch operator can temporarily block a stream altogether, for example while a downstream operator is initializing or refreshing lookup information. Operators that are new in 3.0 are highlighted in blue .
  22. Reference slide. Some parallel-flow operators are for performance optimization and load balancing (e.g., ThreadedSplit); others support High Availability scenarios (DeDuplicate). Yet others support parallel processing by applying independent algorithmic steps in parallel to the same data (Barrier, Union). DynamicFilter is a very powerful operator for maintaining a list of filter criteria that may change over time, such as key words for a simple text-based filter. Parse, Format, Compress, and Decompress expose some of the capabilities that are built in to the standard adapters for files and TCP and UDP sockets; this allows you to apply the same features to data coming from or going to other types of connectors. Operators that are new in 3.0 are highlighted in blue .
23. XML processing is still somewhat complicated, but it is now possible to read and write XML data and do something intelligent with it using built-in capabilities. The standard toolkit adapters can use the txt format parameter to read and write XML literals (double-quoted strings containing complete documents, including line breaks). They can also be read as strings (format: line) or blobs (format: block) and later converted to tuples or the xml data type. The xml data type validates syntax when a value is assigned; if a schema is specified (xml<"Schema-URI">), it also checks validity against that schema. XML data can be parsed and turned into structured-tuple data by the XMLParse operator, supported by XPath expressions. Or it can be left as xml-type values and used on the fly in XQuery functions.
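As a rough analogy for what an XMLParse-style operator does, the Python standard library can turn elements selected by an XPath-like expression into flat tuples. The trades document and its attributes below are invented purely for illustration.

```python
import xml.etree.ElementTree as ET

doc = "<trades><trade sym='IBM' price='205.3'/><trade sym='MSFT' price='27.1'/></trades>"

def xml_to_tuples(xml_text):
    """Yield one flat tuple-like dict per element matched by the trigger
    expression -- conceptually what XMLParse does with its XPath support."""
    root = ET.fromstring(xml_text)          # syntax is validated on parse
    for trade in root.findall("./trade"):   # XPath-like trigger expression
        yield {"sym": trade.get("sym"), "price": float(trade.get("price"))}

tuples = list(xml_to_tuples(doc))
```

Each matched element becomes a structured tuple with typed attributes, ready to flow to downstream operators.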
  24. Streams is extensible at heart; extensibility is not an afterthought. New operators are developed all the time, by IBM itself as well as by users. With call-level APIs and code-generation facilities available to all users, user-developed toolkits have all the same power and access to system capabilities as IBM-developed ones. The only limitation is that it is not possible to create new “built-in” types. For example, the xml data type introduced in 3.0 had to be added to the system by IBM; it could not have been added by a third party.
  25. InetSource is good for REST interfaces (HTTP GET and POST requests) as well as the other types listed. For files, the difference with FileSource is that it periodically reads, either just the new data since the last read, or the entire file again. More operators will probably be added to the Internet toolkit in future, such as for using javascript-based data visualization, http sources that require authentication and other capabilities not currently supported, etc. Check the Streams Exchange on developerWorks for an early view of such operators. Regarding the supported databases in the ODBC* operators, there are some nuances. Any database that has a driver for UnixODBC (2.3 or later) can be supported. For the specific databases listed, there are restrictions: on RHEL 6, Oracle is not supported, and on IBM Power7 systems running RHEL 6, solidDB, Netezza, and Oracle are not supported. SolidDBEnrich is there in addition to ODBCEnrich, because it is much more efficient to use the proprietary SolidDB API than to go through ODBC and full SQL. There are restrictions, but for stream enrichment you usually don’t need all the flexibility of SQL. Additional target-specific operators have been and will continue to be added to this toolkit; operators for high-speed writing to partitioned DB2 databases and to Netezza already exist.
  26. IBM InfoSphere Data Explorer can index and consume results from Streams. The DataExplorerPush operator uses the BigSearch API. The Hadoop Distributed File System requires a separate set of source and sink adapters, because it is a proprietary file system that is not POSIX-compatible. The HDFS* operators all have counterparts of the same name in the Standard Toolkit, and operate in similar ways. HDFSSplit is there to improve throughput by allowing you to parallelize the HDFS I/O operations: in iterative fashion, send a specific amount of data to each of several HDFSSink operators, followed by a window punctuation marker to indicate that it is time to write that batch. This fits the parallel and batch-oriented nature of Hadoop. Note that there is no need for a similar set of adapters for BigInsights when configured with GPFS, because GPFS is POSIX-compliant and the standard-toolkit adapters work just fine. The HDFS* operators support Hadoop versions 0.20.2 and 1.0.0 (on Power, 0.20 only). They use both Hadoop and JNI (Java Native Interface).
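The batching idea behind HDFSSplit can be sketched as round-robin distribution with an explicit marker after each batch. This is a conceptual Python sketch; the function name, batch sizing, and marker object are illustrative, not the toolkit's API.

```python
PUNCT = object()  # stand-in for a window punctuation marker

def split_batches(stream, batch_size, n_sinks):
    """Distribute fixed-size batches of tuples round-robin over n parallel
    sinks, appending a punctuation marker after each batch so a sink knows
    it is time to write that batch out -- the idea behind HDFSSplit feeding
    several HDFSSink operators in parallel."""
    outputs = [[] for _ in range(n_sinks)]
    sink, count = 0, 0
    for t in stream:
        outputs[sink].append(t)
        count += 1
        if count == batch_size:
            outputs[sink].append(PUNCT)    # "write this batch now"
            sink = (sink + 1) % n_sinks    # next batch goes to the next sink
            count = 0
    return outputs
```

Because each sink receives whole batches terminated by a marker, the parallel writers match the batch-oriented nature of Hadoop I/O.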
  27. The IBM InfoSphere DataStage Integration Toolkit is supported by DataStage version 9.1. Streams can enhance a DataStage ETL job by performing deeper real-time analytic processing (RTAP) on the data as it is being loaded. In a slightly different scenario, DataStage can perform additional enrichment and transformation on the output of a Streams RTAP application, before storing the end results in a warehouse.
  28. NOTE: This toolkit is not supported on POWER7 systems. The XMSSource and XMSSink operators use the standard XMS architecture and APIs (CreateProducer, CreateConsumer, CreateConnection).
  29. The Text Toolkit does not require Hadoop or BigInsights, but in Text Analytics it is almost inevitable that the analytical model, as expressed in the AQL program, is developed and tested against a repository of text documents, usually in something like BigInsights. An example of text analytics is sentiment analysis based on a Twitter feed (the documents are tweets, in that case).
30. The pattern parameter contains a string with a regular expression. The regular expression works on the granularity of input tuples. For example, rise+ matches one or more tuples that satisfy the rise predicate. The partitionBy parameter specifies one or more attributes of the input tuple as a key, and the pattern applies to tuples for each key separately and independently. The predicates parameter defines the symbols used in the regular expression—in this example, rise, drop, and deep. Each predicate is just a boolean expression written in SPL. Finally, the output clause assigns attributes of the output tuple when a match is detected. Both the predicates and the output assignments use aggregations (First, Last, Count, Max). Aggregations are computed from all the tuples in the match so far.
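The mechanics can be illustrated in miniature: map each input tuple to a symbol using the predicates, then run an ordinary regular expression over the resulting symbol string. This plain-Python sketch (with invented rise/drop predicates over a price attribute) is an analogy for the operator, not its implementation.

```python
import re

# Each predicate maps a pair of consecutive tuples to a one-letter symbol.
predicates = {
    "r": lambda prev, cur: cur > prev,   # "rise"
    "d": lambda prev, cur: cur < prev,   # "drop"
}

def symbolize(prices):
    """Translate a tuple stream into a symbol string, one symbol per tuple."""
    prev = prices[0]
    out = []
    for cur in prices[1:]:
        for sym, pred in predicates.items():
            if pred(prev, cur):
                out.append(sym)
                break
        else:
            out.append(".")  # no predicate matched
        prev = cur
    return "".join(out)

# Pattern "r+d": one or more rises followed by a drop, like rise+ drop in SPL.
symbols = symbolize([10, 11, 12, 11])
match = re.search(r"r+d", symbols)
```

Here two rises followed by a drop produce the symbol string that the pattern matches; in the real operator, output assignments would then aggregate over the matched tuples.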
  31. Note that this is a toolkit that provides no operators; only types and functions. Geospatial and spatiotemporal (including the time dimension) computing form a rich domain that requires specific expertise, such as knowledge of cartography (maps and map projections), geospatial geometry (shapes and locations and their representations), set theory (spatiotemporal relationships), interoperability standards and conventions, as well as the specifics of the application (location intelligence and location-based services for businesses, security and surveillance, geographic information systems, traffic patterns and route finding, etc.). This toolkit provides the beginnings of a set of capabilities that lets Streams provide the real-time component for emerging location-based applications.
32. A time series is a sequence of numerical data that represents the value of an object or multiple objects over time. For example, you can have time series that contain: the monthly electricity usage that is collected from smart meters; daily stock prices and trading volumes; ECG recordings; seismographs; or network performance records. Time series have a temporal order, and this temporal order is the foundation of all time series analysis algorithms. A time series can be regular (sampled at regular intervals) or irregular (sampled at arbitrary times, for example when triggered by events). In addition, a time series can be univariate (tracking a single, scalar variable) or vector (containing a collection of scalar variables); it can even be expanding, if the number of variables increases over time. This toolkit contains 25 operators; too many to list them all here.
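One common preprocessing step for such data is turning an irregular series into a regular one, for example by carrying the last observed value forward at each tick. The sketch below uses an invented grid and a simple sample-and-hold scheme; the toolkit's own operators may use different resampling or interpolation methods.

```python
def regularize(events, interval, end):
    """Sample an irregular time series [(timestamp, value), ...] onto a
    regular grid from 0 to `end`, carrying the last observed value forward.
    Ticks before the first observation get None."""
    out = []
    i, last = 0, None
    t = 0
    while t <= end:
        # Consume every event that has occurred up to this tick.
        while i < len(events) and events[i][0] <= t:
            last = events[i][1]
            i += 1
        out.append((t, last))
        t += interval
    return out
```

Events at times 1 and 5 sampled every 2 ticks give a regular series whose values hold steady between observations.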
  33. The Streams Data Mining Toolkit adds several operators that can analyze and score streaming data according to PMML-standard models developed separately—for example, with SPSS—based on historical data. The PMML support and scoring code is ported directly from the IBM InfoSphere Warehouse, ensuring consistency of results. Currently, the Streams Mining operators have no built-in provision for learning and letting the model evolve; they simply apply the model to each incoming tuple in turn, with no reference to any data that passed through before. They can, however, respond to model changes. As the warehouse-based analysis results in an updated model, simply overwrite the model file; the operator detects the update, reads in the new model, and applies it from then on to newly arriving data.
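The overwrite-and-reload behavior can be sketched with a file-modification-time check. The "model" below is just a numeric threshold in a text file, an invented stand-in for a PMML model, purely to show the hot-swap idea; the product's detection mechanism and model format are not shown here.

```python
import os
import tempfile

class ModelScorer:
    """Score values against a model file, reloading the model whenever the
    file changes -- a sketch of refreshing a scoring model by overwriting
    its file, without restarting the operator."""

    def __init__(self, path):
        self.path = path
        self.mtime = None
        self.model = None

    def _maybe_reload(self):
        mtime = os.path.getmtime(self.path)
        if mtime != self.mtime:              # the model file was overwritten
            with open(self.path) as f:
                self.model = float(f.read()) # toy "model": a threshold
            self.mtime = mtime

    def score(self, value):
        self._maybe_reload()                 # pick up a refreshed model, if any
        return value > self.model

# Tiny demonstration: write a "model", then score two incoming tuples.
with tempfile.NamedTemporaryFile("w", suffix=".mdl", delete=False) as f:
    f.write("5.0")
scorer = ModelScorer(f.name)
results = [scorer.score(6.0), scorer.score(4.0)]
os.unlink(f.name)
```

Each incoming tuple is scored against whatever model is current at that moment, so newly arriving data automatically uses an updated model once the file is replaced.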
34. The IBM SPSS Analytics Toolkit for InfoSphere Streams enables the use of SPSS Modeler predictive analytics and SPSS Collaboration and Deployment Services model management features in Streams applications. The Analytics Toolkit consists of three Streams operators that implement Modeler analytics in Streams applications. The SPSS Scoring operator enables scoring against predictive models designed in SPSS Modeler in the flow of Streams applications. It works by passing an in-memory request to SPSS Modeler Solution Publisher, which conducts scoring via a Modeler scoring stream and returns results in memory back to the SPSS Scoring operator. The SPSS Repository operator, meanwhile, enables the use of SPSS Collaboration and Deployment Services model management functionality. Collaboration and Deployment Services is used to manage the lifecycle of SPSS predictive analytics assets. It provides automated evaluation of model effectiveness and automated refreshing of models when warranted. The SPSS Repository operator monitors the Collaboration and Deployment Services repository. When it receives notification that a scoring model is refreshed, it retrieves the updated model from the repository, prepares it, and passes it to the SPSS Publish operator. The SPSS Publish operator in turn generates the executable images required to update the scoring model used in the Streams application. Upon publication of the new images, the new scoring model is automatically swapped into place without disruption to the processing underway in the Streams application.
35. The Financial Information eXchange (FIX) protocol is a messaging standard to support real-time electronic exchange of securities (options and stocks) transactions. The FIX protocol is documented at http://www.fixprotocol.org/ WebSphere Front Office for Financial Markets contains feed handlers for a large number of financial-market feeds. For a detailed description of the WebSphere Front Office suite, and how to configure and use feed handlers and clients, refer to the WFO documentation at http://publib.boulder.ibm.com/infocenter/imds/v3r0/index.jsp. IBM WebSphere MQ Low Latency Messaging is a high-throughput, low-latency messaging transport. It is optimized for the very high-volume, low-latency requirements typical of financial markets firms and other industries where speed of data delivery is paramount. In addition to the mentioned adapters, there are some simulation operators, which generate simulated feeds—continuous streams of cyclically repeating data: EquityMarketTradefeed, EquityMarketQuoteFeed, and OrderExecutor. These operators support the included sample applications and are not intended for production use.
36. URLs: Passport Advantage: http://www.ibm.com/software/howtobuy/passportadvantage/index.html PartnerWorld: https://www.ibm.com/partnerworld/mem/valuepack/ Academic Initiative: https://www.ibm.com/developerworks/university/software/get_software.html XL (internal) Software Downloads: http://w3.ibm.com/software/xl/download/ticket.do Trials and Demos: http://www14.software.ibm.com/webapp/download/search.jsp?pn=InfoSphere+Streams NOTE: some of these links don’t open properly from PowerPoint. Copy and paste into your browser to get around such problems. Passport Advantage notes: Passport Advantage covers nearly all IBM Software from Software Group (SWG), including WebSphere, IM, Lotus, Tivoli, and Rational. The more you buy (including maintenance), the greater the discounts. The discounts apply to all the software you buy (including maintenance). Maintenance, which includes telephone support for usage and defects, is included in new license prices. Maintenance is 20% of the new license price.
37. Everyone should have been introduced to the Information Management Acceleration Zone (IMAZ). Streams is under Big Data and Warehousing. The Sales Kit is a bit stale, so favor the IMAZ. URLs: IM Acceleration Zone: https://w3-connections.ibm.com/wikis/home?lang=en_US#/wiki/Info%20Mgmt%20Client%20Technical%20Professional%20Resources%20Wiki/page/Streams Media Library: http://w3.tap.ibm.com/medialibrary/search?displayMediaSet=0&qt=infosphere%20streams&narrow_filter=all&m_oB=3&m_oD=1 Software Sellers Workplace: http://w3-103.ibm.com/software/xl/portal/content?synKey=Q836166N45949G23#overview PartnerWorld: https://www.ibm.com/partnerworld/wps/servlet/mem/ContentHandler/Q836166N45949G23/lc=en_US developerWorks: http://www.ibm.com/developerworks/wikis/display/streams/Home The Streams Exchange is a place for Streams application developers to share ideas, source code, design patterns, experience reports, rules of thumb, best practices, etc.