12 big data company products investigation: Datameer, Qubole, Alpine, Pentaho, SiSense, 1010data, Palantir, Platfora, Data Torrent, Continuuity, Nuevora
4. +
Datameer
Products:
Technologies
http://www.slideshare.net/ydn/hug-meetup-may-2012-the-changing-big-data-landscape
It’s a full lifecycle product for Big Data: data integration, analytics and visualization
Open ecosystem: allows end users to publish and reuse analytics templates as applications
http://www.datameer.com/learn/videos-smart-analytics.html
Business Model:
Product and Service License
Consulting on industry solutions
Reference:
Full lifecycle and Open ecosystem
Basic, but can solve 80% of problems
Great user experience.
55 types of data sources
Statistical analytics on spreadsheets
Mining: clustering, decision tree, association and recommendation
22 widgets in infographics using drag and drop
5. +
Qubole: founded by a former Facebook analytics lead
Infrastructure service: Big Data Service for data professionals.
100% Managed Hadoop Cluster in the cloud (auto-scaling, high performance,
self-maintenance)
Built-in Data Connectors to Apps and Data Sources (AppNexus, RedShift,
mongoDB and more)
24/7 Customer Support (via chat, phone or e-mail) from a data infrastructure SWAT
team
All these services are hosted in the Amazon AWS cloud
API and Web UI to:
Launch a new Hadoop cluster
Run Hadoop Job, Hive Query, Pig Job, Workflow, etc.
Monitor and schedule the analytics tasks.
Business Model http://www.qubole.com/pricing/
$199/Month: Startup: 3 users, 550 QCUH, 170 Invocations
$999/Month: Premium: 10 users, 4300 QCUH, 1050 Invocations
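To compare the two tiers, the unit cost of each can be worked out from the figures listed above. The following is a minimal sketch: the dictionary layout and the helper name `cost_per_qcuh` are chosen for illustration only, and the numbers are copied from the pricing lines above, not from any Qubole API.

```python
# Plan figures copied from the pricing list above.
# Keys and structure are illustrative, not a Qubole data format.
plans = {
    "Startup": {"price_usd": 199, "users": 3, "qcuh": 550},
    "Premium": {"price_usd": 999, "users": 10, "qcuh": 4300},
}


def cost_per_qcuh(plan):
    """Monthly price divided by the QCUH included in the plan."""
    return plan["price_usd"] / plan["qcuh"]


def cost_per_user(plan):
    """Monthly price divided by the number of users in the plan."""
    return plan["price_usd"] / plan["users"]


for name, plan in plans.items():
    print(
        f"{name}: ${cost_per_qcuh(plan):.3f} per QCUH, "
        f"${cost_per_user(plan):.2f} per user"
    )
```

On these figures the Premium tier is cheaper per QCUH but more expensive per user, so the better choice depends on whether compute or seats is the limiting factor.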
Customers: Pinterest, Decide, MediaMath, etc.
http://www.qubole.com/qubole-data-service/
6. +
Alpine
World’s First Collaborative, Code-Free, Advanced Analytics Solution
for Big Data
Web GUI to create an analytics workflow (like SPSS Modeler), with built-in
full statistical functionality including Time-Series Analysis, Classification,
Regression, Decision Trees and more. Complete customization with native
support for Hadoop and Relational Databases
Data sources: Greenplum Database, Hadoop Cluster (Pivotal,
Greenplum, Cloudera, Apache, and MapR), HAWQ
Easy visualization of data: Frequency, Histogram, Heatmap, Time Series and
Boxplot
Easy collaboration with others to finish an analytics task using workspaces.
Business Model: online SaaS service
http://alpinenow.com/
7. +
Pentaho
Features:
Web-based data access wizard for business users
Powerful data integration and federation for IT and developers
Access to any data type from Excel to big data sources such as Hadoop,
NoSQL and high performance analytic databases
http://pentahobigdata.com/
http://demo.pentaho.com/pentaho/Home?locale=en_US http://www.pentaho.com/evaluation-path#big-data
Technical Highlights:
• Visual designer for MapReduce jobs to reduce development cycles.
• Data preparation, modeling and exploration of unstructured data sets.
• Cluster support, enabling distributed processing of jobs across multiple nodes.
• Unique in-Hadoop execution for extremely fast performance.
http://www.youtube.com/embed/vOMOFPMnXgk?width=720&height=480&iframe=true&autoplay=1&fs=1
8. +
SiSense
Standard BI software:
Interactive Dashboard and Reports
Manage and Mash Up Data
Integration with other applications
http://www.sisense.com/documentation
9. +
1010data
A cloud-based analytical platform that allows businesses to glean new
insights from unlimited amounts and varieties of data, from the
businesses themselves and from other companies.
Business Model
Data consumers (business executives, analysts
and application developers)
Data providers
Third-party integrators
Independent software vendors (ISVs)
The 1010data analytics platform is an integrated stack that includes:
proprietary database management system (DBMS), data integration tools,
and unique user interface (the Trillion-Row Spreadsheet)
Not a Hadoop based solution
http://1010data.com/ http://1010data.com/demo/
10. +
Palantir
Palantir stacks its analysis on a data platform: data integration,
search and discovery, knowledge management and collaboration.
Palantir Gotham and Palantir Metropolis are the core technologies that
power any Palantir solution.
Gotham: Get a clear picture of all your data, structured or unstructured. Pivot
analysis seamlessly between the semantic, geospatial, and temporal views.
Collaborate at the colleague, team, and organizational level.
Metropolis: a quantitative analysis platform that provides a suite of analytical
tools enabling complex, multi-study research demands.
Industry Solutions: Government, Health and Commercial
http://www.palantir.com/solutions/
They focus on data integration and on technologies for processing
different types of data effectively, not on automating analytics
to replace the work of data scientists.
https://www.palantir.com/ http://www.palantir.com/platforms http://www.palantir.com/library/
11. +
Platfora
Product: Simplifies how users work with Hadoop by simplifying the data
collection and analytics process with tools and a UI.
Demo: http://www.platfora.com/customers/disney/
12. +
Data Torrent
A real-time stream processing engine on Hadoop/YARN, designed
to enable highly scalable, massively distributed real-time
computations with minimal overhead.
Architecture: https://www.datatorrent.com/docs/guides/RealTimeStreamingPlatform.html
Dev Process
Execution Logic
Hadoop Grid
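The idea of a streaming engine, where tuples flow one at a time through a chain of operators, can be sketched in a few lines. This is a single-process illustration only: it does not use the Data Torrent API, and the operator names (`source`, `parse`, `windowed_sum`), the `user,amount` event format and the window size are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def source(events: Iterable[str]) -> Iterator[str]:
    """Input operator: emits raw events one at a time."""
    for event in events:
        yield event


def parse(stream: Iterator[str]) -> Iterator[Tuple[str, int]]:
    """Stateless operator: turns 'user,amount' into a typed tuple."""
    for line in stream:
        user, amount = line.split(",")
        yield user, int(amount)


def windowed_sum(
    stream: Iterator[Tuple[str, int]], window_size: int
) -> Iterator[Dict[str, int]]:
    """Stateful operator: emits per-key totals every `window_size` tuples."""
    totals: Counter = Counter()
    seen = 0
    for key, value in stream:
        totals[key] += value
        seen += 1
        if seen == window_size:
            yield dict(totals)  # copy before resetting the window state
            totals.clear()
            seen = 0
    if seen:
        yield dict(totals)  # flush the final, partially filled window


# Hypothetical input events, invented for the example.
events = ["alice,5", "bob,3", "alice,2", "bob,7", "carol,1"]

# Operators are chained into a pipeline; each tuple flows through as it arrives.
results = list(windowed_sum(parse(source(events)), window_size=2))
print(results)
```

In a distributed engine each operator would run in its own container on the Hadoop grid and could be partitioned across nodes; here the generators only show the dataflow shape.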
14. +
Nuevora
Focusing on Marketing and Customer Behavior Analytics.
Process
Scenarios:
Customer Acquisition Growth
Customer acquisition is a broad term that is used to identify the processes and
procedures used to locate, qualify and ultimately secure the business of new
customers.
Customer Retention Growth
The true health of a business closely relates to how well it can acquire and retain
customers. An organization’s ability to provide value drives its customer retention,
which impacts the success of the entire business.
Customer Value Growth
The growth of a business is fueled by maximizing the wallet share and life-time
profitability of its customer base … whether you call it “customer relationship
management” or just good business.
Inside Technologies:
Hadoop: Data processing
R: predictive modeling
Tableau: data visualization