Lightning Fast Analytics with Hive LLAP and Druid

Cox Communications, one of the largest network providers in the U.S., is primarily focused on ensuring network security and providing better service to customers, including:
• Real-time monitoring of IP security traffic to identify and alert on unusual network activity across interfaces within an organization
• Equipping the security team to determine the source and destination of traffic, the class of service, and the causes of congestion from NetFlow data

Challenges:
Network security data includes fine-grained streaming data. The major challenge lies in having a unified platform to perform data cleansing, transformation, analytics, and reporting on these huge streaming datasets. As network traffic grows, the associated data grows exponentially, so a scalable framework is needed to handle these datasets and derive useful information from them. Alongside data processing, data retrieval also plays a major role in better analysis. Until now, data processing was done in daily batches using manually run Python scripts and custom data structures specific to individual use cases. A more generic, unified framework was needed to provide an automated, real-time, end-to-end solution that delivers high-performing, more granular business results.

Solution:
Automating this process offers opportunities on several fronts, notably consistency, repeatability, and modernization of OLAP analytics on an enterprise big data platform. Reports can be generated more easily and quickly with the underlying OLAP engine.
• A modern big data platform provides the tools and infrastructure needed to land, cleanse, process, and enrich real-time streaming data using ecosystem components such as Spark, Kafka, and Hive
• Faster OLAP analytics using the Hive LLAP and Druid integration
• Simpler and faster reporting using Superset

All of the necessary components are available under one roof on the Hortonworks Hadoop platform.
An end-to-end solution on the big data platform produced faster, repeatable results with sub-second query response times.

Value added by this solution:
• Delivers ultra-fast SQL analytics that the security engineering team can consume from the BI tool to get faster business results
• Lets business users explore and visualize real-time streaming datasets, integrate various data sources, and build dashboards for different slices
• Runs BI queries in milliseconds over a 1 TB dataset
• Provides a highly granular permission model on security datasets that allows intricate access rules
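
Millisecond responses over a large dataset in an OLAP store such as Druid typically depend on pre-aggregating rows at ingestion time, which Druid calls rollup. The following is a minimal plain-Python sketch of that idea under stated assumptions, not the pipeline used in the talk: the record fields (`ts`, `src_ip`, `dst_ip`, `bytes`), the one-minute granularity, and the sample values are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone


def truncate_to_minute(epoch_seconds: int) -> str:
    """Bucket a Unix timestamp to the start of its minute (UTC, ISO 8601)."""
    bucket = epoch_seconds - (epoch_seconds % 60)
    return datetime.fromtimestamp(bucket, tz=timezone.utc).isoformat()


def rollup(flows):
    """Pre-aggregate flow records by (minute, src_ip, dst_ip).

    Each output row keeps a flow count and a byte total, so a later query
    scans one row per group instead of every raw flow record.
    """
    totals = defaultdict(lambda: {"flow_count": 0, "total_bytes": 0})
    for flow in flows:
        key = (truncate_to_minute(flow["ts"]), flow["src_ip"], flow["dst_ip"])
        totals[key]["flow_count"] += 1
        totals[key]["total_bytes"] += flow["bytes"]
    return dict(totals)


# Hypothetical sample records: the first two fall in the same minute.
sample_flows = [
    {"ts": 1500000000, "src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "bytes": 1200},
    {"ts": 1500000030, "src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "bytes": 800},
    {"ts": 1500000065, "src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "bytes": 500},
]
rolled = rollup(sample_flows)
for key, metrics in sorted(rolled.items()):
    print(key, metrics)
```

Three raw records collapse into two rolled-up rows here. The coarser the time bucket and the fewer the dimensions, the larger the reduction, at the cost of losing per-flow detail.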

Lightning Fast Analytics with Hive LLAP and Druid

  1. Network Traffic Analytics with Hive LLAP and Druid
  2. Network Traffic Analytics
     Context: Near-real-time monitoring of IP security traffic to identify and alert on unusual network activity across interfaces within an organization.
     Solution & Benefits: The framework provides a unified platform for data storage and data enrichment by integrating network traffic with multiple sources (such as geolocation and threat IPs). Hadoop performs complex pre-computations and aggregations to report and alert on IP threats.
     • (1) Data ingestion, integration, and aggregation of network security traffic in Hadoop in near real time
     • (2) Deliver ultra-fast SQL analytics that the security engineering team can consume from the BI tool to get faster business results
     • (3) Opportunity for business users to explore and visualize streaming datasets, integrate various data sources, and build dashboards for different slices
     • (4) Capability to run BI queries in milliseconds over a 1 TB dataset
     • (5) Highly granular permission model on security datasets that allows intricate access rules
  3. Network Traffic Analytics: multi-step workflow to perform faster analytics on network traffic data
     Data load and preparation:
     • NetFlow routers to Kafka
     • Enrich Kafka messages with multiple data feeds
     • Store a single flat table for OLAP analysis
     Data analytics:
     • Analyze network traffic
     • Identify threat IPs
     • User anomaly detection
     Visualization and publishing:
     • Create data visuals for streaming datasets
     • Provide near-real-time network traffic monitoring dashboards and alerting
  4. Current Challenges
     • Fragile process that relies on manual steps
     • No unified platform to perform data cleansing, transformation, analytics, and reporting
     • Inadequate storage to accommodate data growth
     • Data analysis done using custom data structures that are specific to use cases
     • No ability to monitor and alert on streaming datasets
     • No ability to enrich data and aggregate it at different levels
  5. Enterprise Solution
     • A modern big data platform provides the tools and infrastructure needed to land, cleanse, and process real-time data streams
     • Spark Streaming to enrich the data with feeds from multiple source systems
     • Faster OLAP analytics using the Hive LLAP and Druid integration
     • Capability to run BI queries in milliseconds over a 1 TB dataset
     • Modernization of the application stack
     • Explore and visualize streaming datasets and build dashboards for different slices
  6. NetFlow Architecture (diagram labels: Consumers, Threat IP Sources, Data Storage and Processing)
  7. Q & A
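
The enrichment step in the workflow above (enriching each NetFlow message with geolocation and threat-IP feeds before storing one flat table) can be sketched as follows. This is a minimal plain-Python illustration of the join logic only; the talk describes doing this with Spark Streaming over Kafka messages. The lookup tables, field names, country assignments, and sample addresses (taken from the reserved documentation ranges) are all hypothetical.

```python
import ipaddress

# Hypothetical reference feeds; a real deployment would load these from
# a geolocation database and a threat-intelligence source.
GEO_BY_NETWORK = {
    ipaddress.ip_network("203.0.113.0/24"): "US",
    ipaddress.ip_network("198.51.100.0/24"): "DE",
}
THREAT_IPS = {"198.51.100.7"}


def lookup_country(ip: str) -> str:
    """Return the country for the first network containing the address."""
    addr = ipaddress.ip_address(ip)
    for network, country in GEO_BY_NETWORK.items():
        if addr in network:
            return country
    return "unknown"


def enrich(flow: dict) -> dict:
    """Flatten one flow record with geolocation and threat attributes."""
    return {
        **flow,
        "src_country": lookup_country(flow["src_ip"]),
        "dst_country": lookup_country(flow["dst_ip"]),
        "is_threat": flow["src_ip"] in THREAT_IPS or flow["dst_ip"] in THREAT_IPS,
    }


# Hypothetical flow record standing in for one decoded Kafka message.
flow = {"src_ip": "203.0.113.5", "dst_ip": "198.51.100.7", "bytes": 512}
enriched = enrich(flow)
print(enriched)
```

Writing the looked-up attributes into the record itself produces a single flat row, so the OLAP layer can filter and group on them without joins at query time.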
