NoSQL Landscape and a
Solution to Polyglot
Persistence
© Impetus Technologies
Agenda
• Big Data Problems
• Transition from RDBMS to NoSQL
• NoSQL Landscape
• Challenges in transition
• Tools for NoSQL
• Kundera – an open source polyglot solution
Recorded version available at http://bit.ly/1hfz4Tn
BIG Data Problem
Why not RDBMS?
Scalability
• Data volume in zettabytes, yottabytes
• Horizontal scaling would be expensive
Data format
• Data format can be static or dynamic
• Relational / non-relational
High availability
• Data locality
• No single point of failure
Non-RDBMS way
• Scale up vs. scale out
• Static schema vs. dynamic schema
• Centralized vs. decentralized
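To make "scale out" concrete, here is a minimal sketch of hash-based key routing, where capacity grows by adding nodes rather than by buying a bigger machine. The class names (`ShardRouter`, `ShardRouterDemo`) and node labels are hypothetical, and plain modulo hashing is an assumption chosen for brevity, not what any particular datastore does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration: routes keys to nodes by hashing the key. */
final class ShardRouter {
    private final List<String> nodes;

    ShardRouter(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    /** Picks the node that owns a key; the same key always maps to the same node. */
    String nodeFor(String key) {
        // floorMod keeps the bucket non-negative even for negative hash codes.
        int bucket = Math.floorMod(key.hashCode(), nodes.size());
        return nodes.get(bucket);
    }
}

final class ShardRouterDemo {
    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(List.of("node-a", "node-b", "node-c"));
        for (String key : List.of("user-1", "user-2", "user-3")) {
            System.out.println(key + " -> " + router.nodeFor(key));
        }
    }
}
```

Modulo routing remaps most keys whenever the node count changes, which is why distributed stores typically prefer consistent hashing; the sketch only shows the decentralized routing idea.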
Introduction to NoSQL
“An approach to storing and retrieving data with horizontal scaling, simple design and high availability”
• Data-format-driven processing
• Distributed, with no single point of failure (SPOF)
• Thinking outside the SQL box
NoSQL: A Pragmatic Solution?
With NoSQL, data can be consistent, highly available and free of any SPOF!
But not 100%!
CAP Theorem
• Consistency
• Availability
• Partition tolerance
All three at once: N/A
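The trade-off can be sketched with a toy two-replica store. This is an illustrative assumption, not a model of any real database: `TwoReplicaStore` and its `Mode` enum are hypothetical names. During a simulated partition, the CP mode refuses writes to stay consistent, while the AP mode accepts them and lets the replicas diverge.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-replica store used only to illustrate the CAP trade-off. */
final class TwoReplicaStore {
    enum Mode { CP, AP }

    private final Mode mode;
    private final Map<String, String> replicaA = new HashMap<>();
    private final Map<String, String> replicaB = new HashMap<>();
    private boolean partitioned;

    TwoReplicaStore(Mode mode) {
        this.mode = mode;
    }

    void setPartitioned(boolean partitioned) {
        this.partitioned = partitioned;
    }

    /** Writes through replica A; returns false when the write is refused. */
    boolean write(String key, String value) {
        if (!partitioned) {
            replicaA.put(key, value);
            replicaB.put(key, value);
            return true;
        }
        if (mode == Mode.CP) {
            return false; // stay consistent by giving up availability
        }
        replicaA.put(key, value); // stay available; replica B is now stale
        return true;
    }

    String readFromA(String key) { return replicaA.get(key); }

    String readFromB(String key) { return replicaB.get(key); }
}

final class CapDemo {
    public static void main(String[] args) {
        TwoReplicaStore ap = new TwoReplicaStore(TwoReplicaStore.Mode.AP);
        ap.write("k", "v1");
        ap.setPartitioned(true);
        ap.write("k", "v2");
        System.out.println("A sees " + ap.readFromA("k") + ", B sees " + ap.readFromB("k"));
    }
}
```

A real AP system would also reconcile the replicas once the partition heals; that step is left out here.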
Thinking NoSQL?
Large data:
• Size
• Format
• Velocity
• Filtering
Size
High data growth! Is scalability an issue?
Traditional RDBMS-based solutions will not work!
Velocity
Near-real-time / Big Data analytics
Parallel processing and a ready-for-read design are required
Traditional RDBMS solutions are not fast enough to meet the SLAs!
Filtering
Filtering use cases: fraud detection, risk management analysis
A traditional RDBMS may work at small scale, but not with large data!
Format
Non-relational data formats
Different kinds of data sets: graph-based, key-value-based access
Traditional databases are limited to static tables!
[Figure label: logs]
NoSQL Landscape
Transition to NoSQL
• Landscape understanding
• Datastore selection
• API exploration
• Implementation
Selecting a NoSQL Datastore
• Graph: Neo4j, Titan, Objectivity, OrientDB, VertexDB
• Columnar: Cassandra, HBase, Hypertable, BigTable
• Key-value: Oracle NoSQL (KV), Redis, CouchDB, Riak
• Document: MongoDB, Couchbase
High Level APIs
• Kundera
• Kundera
• Kundera
• Hector
• Easy Cassandra
• DataStax Java driver
• Astyanax
• Morphia
• DataNucleus
• Jongo
• Spring Data
• Spring Data
• Neo4j
• Hibernate OGM
• DataNucleus
• HBase API
• Spring Data
• Kundera
Hybrid Design
• Cassandra, HBase
• RDBMS
• Redis
• MongoDB, Couchbase
• Neo4j, Titan
• Hadoop, Spark
Bumpy Ride!
• Unlearn and learn new APIs!
• Index-based retrieval over multiple NoSQL data stores
• Atomic operations
• The NoSQL world is still evolving; you may need to explore several data stores
• Migration of existing production applications
• and many more…
One Stop Solution
Master key, possible?
Let’s explore!
Polyglot Way
• Migrating existing solutions
• Guarantee atomicity
• Switch databases
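As a rough sketch of the polyglot idea, the router below sends each entity type to whichever store is registered for it, so switching databases means changing a registration rather than the calling code. Every name here (`KeyValueStore`, `InMemoryStore`, `PolyglotRouter`, `DemoUser`, `DemoLog`) is hypothetical, and the in-memory maps merely stand in for real database clients.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical minimal store abstraction; a real one would wrap a database driver. */
interface KeyValueStore {
    void put(String id, Object entity);
    Object get(String id);
}

/** In-memory stand-in for a real datastore, labelled so the routing is visible. */
final class InMemoryStore implements KeyValueStore {
    private final String label;
    private final Map<String, Object> rows = new HashMap<>();

    InMemoryStore(String label) { this.label = label; }

    @Override public void put(String id, Object entity) { rows.put(id, entity); }

    @Override public Object get(String id) { return rows.get(id); }

    String label() { return label; }

    int size() { return rows.size(); }
}

/** Placeholder entity types for the demo. */
final class DemoUser { }
final class DemoLog { }

/** Routes each entity type to the store registered for it. */
final class PolyglotRouter {
    private final Map<Class<?>, KeyValueStore> storesByType = new HashMap<>();

    void register(Class<?> type, KeyValueStore store) {
        storesByType.put(type, store);
    }

    void persist(String id, Object entity) {
        KeyValueStore store = storesByType.get(entity.getClass());
        if (store == null) {
            throw new IllegalStateException("no store registered for " + entity.getClass().getName());
        }
        store.put(id, entity);
    }
}

final class PolyglotDemo {
    public static void main(String[] args) {
        InMemoryStore relational = new InMemoryStore("rdbms");
        InMemoryStore columnar = new InMemoryStore("cassandra");
        PolyglotRouter router = new PolyglotRouter();
        router.register(DemoUser.class, relational);
        router.register(DemoLog.class, columnar);
        router.persist("u1", new DemoUser());
        router.persist("l1", new DemoLog());
        System.out.println(relational.label() + " rows: " + relational.size());
        System.out.println(columnar.label() + " rows: " + columnar.size());
    }
}
```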
High Level Polyglot API
• Spring Data
• Kundera
• Spring Data
• Kundera
• Spring Data
• Kundera
• Spring Data
• Kundera
Let’s implement it the JPA way!
Kundera to the Rescue!!
• Supports 8 data stores: Cassandra, HBase, MongoDB, Redis, Neo4j, Oracle NoSQL, CouchDB and any RDBMS
• CRUD / strong query support
• Object relationships handling
• Datastore-optimized persistence and query approach
• Interceptors / events / caching
• Connection pool / fallback
• (Lucene) indexing
• Flexibility
Getting Started
User Logs Sample App
    // Imports shared by both entities (each class lives in its own file).
    import java.util.Date;
    import java.util.Set;
    import javax.persistence.*;
    import com.impetus.kundera.index.Index;
    import com.impetus.kundera.index.IndexCollection;

User entity

    @Entity
    @Table(name = "user")
    @IndexCollection(columns = { @Index(name = "emailId") })
    public class User {
        @Id
        @Column(name = "user_id")
        private String userId;

        @Column(name = "first_name")
        private String firstName;

        @Column(name = "last_name")
        private String lastName;

        @Column(name = "emailId")
        private String emailId;

        // One user owns many log entries, joined on user_id.
        @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.LAZY)
        @JoinColumn(name = "user_id")
        private Set<UserLogs> logs;

        @Embedded
        private PersonalDetail personalDetail;

        public User() {
            // Default constructor.
        }

        // Setters and getters
    }

UserLogs entity

    @Entity
    @Table(name = "logs")
    @Index(columns = { "body", "created_at" }, index = true)
    public class UserLogs {
        @Id
        @Column(name = "log_id")
        private String logId;

        @Column(name = "body")
        private String body;

        @Column(name = "created_at")
        @Temporal(TemporalType.DATE)
        private Date createdDate;

        public UserLogs() {
            // Default constructor.
        }

        // Setters and getters
    }
User Logs Sample App
Configuration: persistence.xml

    <!-- Persistence unit for Cassandra persistence -->
    <persistence-unit name="logCassandra">
        <provider>com.impetus.kundera.KunderaPersistence</provider>
        <class>com.impetus.kvapps.entities.UserLogs</class>
        <exclude-unlisted-classes>true</exclude-unlisted-classes>
        <properties>
            <property name="kundera.nodes" value="localhost" />
            <property name="kundera.port" value="9160" />
            <property name="kundera.keyspace" value="userstore" />
            <property name="kundera.dialect" value="cassandra" />
            <property name="kundera.client.lookup.class"
                value="com.impetus.client.cassandra.thrift.ThriftClientFactory" />
            <property name="kundera.ddl.auto.prepare" value="create" />
            <property name="index.home.dir" value="lucene" />
        </properties>
    </persistence-unit>

    <!-- Persistence unit for MySQL persistence -->
    <persistence-unit name="logRdbms">
        <provider>com.impetus.kundera.KunderaPersistence</provider>
        <class>com.impetus.kvapps.entities.User</class>
        <exclude-unlisted-classes>true</exclude-unlisted-classes>
        <properties>
            <property name="kundera.client.lookup.class"
                value="com.impetus.client.rdbms.RDBMSClientFactory" />
            <property name="hibernate.hbm2ddl.auto" value="create" />
            <property name="hibernate.show_sql" value="false" />
            <property name="hibernate.format_sql" value="false" />
            <property name="hibernate.dialect"
                value="org.hibernate.dialect.MySQL5Dialect" />
            <property name="hibernate.connection.driver_class"
                value="com.mysql.jdbc.Driver" />
            <property name="hibernate.connection.url"
                value="jdbc:mysql://localhost:3306/userstore" />
            <property name="hibernate.connection.username" value="root" />
            <property name="hibernate.connection.password" value="root" />
        </properties>
    </persistence-unit>
Switching Data Stores
Configuration: persistence.xml

    <!-- Persistence unit for Cassandra persistence -->
    <persistence-unit name="logCassandra">
        <provider>com.impetus.kundera.KunderaPersistence</provider>
        <class>com.impetus.kvapps.entities.UserLogs</class>
        <exclude-unlisted-classes>true</exclude-unlisted-classes>
        <properties>
            <property name="kundera.nodes" value="localhost" />
            <property name="kundera.port" value="9160" />
            <property name="kundera.keyspace" value="userstore" />
            <property name="kundera.dialect" value="cassandra" />
            <property name="kundera.client.lookup.class"
                value="com.impetus.client.cassandra.thrift.ThriftClientFactory" />
            <property name="kundera.ddl.auto.prepare" value="create" />
            <property name="index.home.dir" value="lucene" />
        </properties>
    </persistence-unit>

    <!-- Persistence unit for MongoDB persistence -->
    <persistence-unit name="logMongo">
        <provider>com.impetus.kundera.KunderaPersistence</provider>
        <class>com.impetus.kvapps.entities.User</class>
        <exclude-unlisted-classes>true</exclude-unlisted-classes>
        <properties>
            <property name="kundera.nodes" value="localhost" />
            <property name="kundera.port" value="27017" />
            <property name="kundera.keyspace" value="userstore" />
            <property name="kundera.dialect" value="mongodb" />
            <property name="kundera.client.lookup.class"
                value="com.impetus.client.mongodb.MongoDBClientFactory" />
            <property name="kundera.ddl.auto.prepare" value="create" />
        </properties>
    </persistence-unit>

Persist data

    // Create one entity manager factory spanning both persistence units.
    // "properties" is a map of optional overrides defined elsewhere in the application.
    EntityManagerFactory emf = Persistence.createEntityManagerFactory("logCassandra,logMongo", properties);
    EntityManager em = emf.createEntityManager();
    // …
    em.persist(user);
Performance & Benchmarks
Technical Challenges Addressed!
• Distributed indexing over multiple NoSQL databases, e.g. Solr, Elasticsearch
  • Plug in the Kundera-powered ES or Lucene indexer
  • Build your own library and simply plug it in
• Unlearn and learn new APIs!
  • Based on the widely used JPA 2.0 specification
• Atomicity guarantee and transaction management
  • Built-in support for JPA/JTA transactions and batch operations
• The NoSQL world is evolving; plan to switch databases?
  • Since it’s a JPA-powered solution, reuse the same code with almost no changes
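One common way to approximate atomicity across stores that share no transaction manager is a compensating rollback: record how to undo each write, and replay the undo steps in reverse if anything fails. The sketch below illustrates that general idea only; it is not a description of how Kundera implements transactions. `CompensatingUnitOfWork` is a hypothetical name, and plain maps stand in for the two datastores.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical unit of work: applies writes to several stores and can undo them. */
final class CompensatingUnitOfWork {
    private final Deque<Runnable> undoLog = new ArrayDeque<>();

    /** Writes a value and records how to restore the previous state. */
    void put(Map<String, String> store, String key, String value) {
        boolean existed = store.containsKey(key);
        String previous = store.get(key);
        store.put(key, value);
        undoLog.push(() -> {
            if (existed) {
                store.put(key, previous);
            } else {
                store.remove(key);
            }
        });
    }

    /** Undoes the recorded writes, most recent first. */
    void rollback() {
        while (!undoLog.isEmpty()) {
            undoLog.pop().run();
        }
    }

    /** Discards the undo log once every write has succeeded. */
    void commit() {
        undoLog.clear();
    }
}

final class CompensationDemo {
    public static void main(String[] args) {
        Map<String, String> users = new HashMap<>();
        Map<String, String> logs = new HashMap<>();
        users.put("u1", "old");

        CompensatingUnitOfWork work = new CompensatingUnitOfWork();
        work.put(users, "u1", "new");
        work.put(logs, "l1", "login");
        work.rollback(); // simulate a failure after both writes

        System.out.println("users=" + users + ", logs=" + logs);
    }
}
```

Compensation restores state after the fact but gives no isolation: other readers can observe the intermediate writes before the rollback runs.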
Q&A
Big Data Solutions and Services partner for Enterprises
bigdata@impetus.com
Thank You!
• Meet us at
• Hadoop Summit, San Jose
• CIO Big Data Summit, Texas
• Strata Conference + Hadoop World, New York
• Gartner Symposium, Orlando
• Try / Recommend Kundera
• https://github.com/impetus-opensource/Kundera
• @impetustech


Editor's notes

  1. TITLE: Real-time Streaming Analytics – Business Value, Use Cases and Architectural Considerations. Speaker: Anand Venugopal, Sr. Director of Business Development. Abstract: As IT and line-of-business executives begin to operationalize Hadoop and MPP-based batch big data analytics, it's time to begin to understand and prepare for the next wave of innovation in data processing: analytics over real-time streaming data. This session will provide an overview and discussion of the business value, use cases and architectural considerations of integrating real-time streaming analytics into your enterprise Big Data roadmap.