Big Data in The Cloud: Architecting a Better Platform
1. AWS Government, Education, and Nonprofit Symposium
Washington, DC | June 25-26, 2015
Welcome!
Big Data in The Cloud:
Architecting a Better Platform
Brian Kinlaw, Principal Solution Architect, CSC
2. AWS Government, Education, and Nonprofit Symposium
Today’s Presenters
Brian Kinlaw
Principal Solution Architect
CSC Emerging Business Group
Leads the initiation, development and execution of Big Data,
Analytics, Social Media, Mobile, Cloud, Cyber Security, and
Internet of Things (IoT) solutions for the Office of the CTO.
3. AWS Government, Education, and Nonprofit Symposium
Agenda
I. CSC BDPaaS Overview
II. CSC Approach
III. BDPaaS Architecture
IV. BDPaaS Security
V. Questions & Answers
4. AWS Government, Education, and Nonprofit Symposium
Rapidly Evolving Analytics Landscape
BIG DATA 1.0 (EDW/BI)
KEY CHARACTERISTICS
• Relatively Small, Structured Data Sets
• Proprietary RDBMS
• Internally Sourced / Small Teams
• Reactive Reporting Mechanisms
REPRESENTATIVE TECHNOLOGIES: IBM DB2, Oracle DB, IBM Cognos, SAP Business Objects, Oracle BI, Informatica
POTENTIAL BUSINESS ROI: Low-Medium
CUSTOMER SKILLS/TALENT: Bulk of Talent Today
BIG DATA 2.0 (ANALYTIC APPLIANCES)
KEY CHARACTERISTICS
• Introduction of Unstructured Data Sources
• New In-Memory Analytic Capabilities
• “Data Scientists” Emerge
• Ad-hoc Reporting Becoming Pervasive
REPRESENTATIVE TECHNOLOGIES: IBM Netezza, HP Vertica, Oracle Exadata & Exalytics, Teradata, Pivotal Greenplum
POTENTIAL BUSINESS ROI: Medium
CUSTOMER SKILLS/TALENT: Talent Investments Required
BIG DATA 3.0 (OPEN SOURCE / NEXT GEN)
KEY CHARACTERISTICS
• Seamless Blend of Traditional Analytics and Big Data
• Heavily Open Sourced
• Reporting Becomes Predictive & Influences Business Process Change
REPRESENTATIVE TECHNOLOGIES: Cloudera Hadoop, Hortonworks Hadoop, Spark, Storm, Kafka, Tableau, Pentaho
POTENTIAL BUSINESS ROI: Very High
CUSTOMER SKILLS/TALENT: High Demand Talent
The Market is Here Today
Yet Challenges Remain…
DETERMINING VALUE | SECURITY & COMPLIANCE | SKILLS & CAPABILITIES
32% | 30% | 65%
5. AWS Government, Education, and Nonprofit Symposium
CSC BIG DATA & ANALYTICS: WE ARE UNIQUELY
POSITIONED TO ADD VALUE
Technology Expertise: Working with Hadoop since its Creation
Faster Time to Value: Deliver a Big Data Platform in 30 Days
Enterprise Security: Data, Application, Platform Security and Compliance
SHAPE TRANSFORM
MANAGEMENT
AS A SERVICE
DIFFERENTIATION: OUR UNIQUE STRENGTHS
FIVE CORE OFFERINGS
• Big Data Strategy
• Big Data Platform Innovation
• Big Data Platform aaS
• Big Data Analytic Insights
• Analytics aaS
STRATEGY ANALYTICS PLATFORMS
INDUSTRY ACCELERATORS
Product Innovation: Optimize product mix & feature set to improve revenue by 25-30%
Customer Intelligence: Identify innovative new revenue channels – up to 2x revenue increase
Smart Operations: Improve operating margins ~60% thru efficiency and quality improvements
Risk Insights: Reduce fraudulent activity by up to 75%, avoid millions in cost & exposure
Revenue
Enhancers
Profit
Enhancers
6. AWS Government, Education, and Nonprofit Symposium
Client Value Achieved
• Prioritized Roadmap of Initiatives to Achieve Growth Vision within 2-3 years: BU Growth from $200M to $1B Through Analytic Insights
Client Value Achieved
• 331% ROI
• Payback Period of 2.1 Months
• 2% Yield Improvement = $300M
Client Value Achieved
• Reduced time to onboard customers by 80%
• Improved visibility on service levels
• Increased customer satisfaction
Client Value Achieved
• BSL Met Strategic Objective (ITaaS)
• Reduced Costs by 20%
• Improved Analytic Cycle Time by 50%
Client Value Achieved
• Access to Information in Minutes versus Weeks
• Speed: Solution Deployed within Days
• Access to Key Next Gen Talent
Client Value Achieved
• Speed to Market: 30 Days to Platform, 60 Days to Full Working Mobile Telematics Application
• Flexible Deployment Options
Achieving Real Business Value With Our Clients
• Healthcare: Integrated data for ~100M people from 40 member companies
• Wholesale: Maximized diamond company profitability through BI and analytics
• Transportation: Railway punctuality improved from 92% to a world-leading 96%
• Government: Reduced tax evasion and litigation through DW and predictive modeling
• Insurance: 16% increase in claims fraud investigations for significant ROI in 6 months
• Retail/CPG: Performance optimization and analytical insights into POS and sales trends
• Printing: $10M reduction in annual operating expenses
• Travel & Leisure: Customer intelligence lifetime value model driving marketing and customer service
• Natural Resources: Use of sensor data for real-time management of mining and mfg. ops and maintenance
• Banking: Comprehensive global view of exposure in near real time
Global Insurance Company
7. AWS Government, Education, and Nonprofit Symposium
Risk to Traditional Data Model: the status quo
RISK: Structuring all data at the point of ingestion (Schema on Write vs Schema on Read)
RESULT:
• Significant upfront expense ( and $$) for planning
• Significant expense ( and $$) to adapt to changes/needs of the business
RISK: Data silos
RESULT:
• Disparate information streams
• Reduced ability to obtain requirements from entire business
• Does not allow for holistic decisions to be made
• No golden source of truth
RISK: Proprietary/custom data warehousing/infrastructure
RESULT:
• Expensive
• Non standard to environment
RISK: Scale
RESULT:
• Not economically feasible
• Not technically possible
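The schema-on-write versus schema-on-read risk on this slide can be illustrated with a minimal Python sketch. The event fields (`device`, `temp_c`, `vibration_hz`) and all function names are hypothetical and chosen only for illustration; this is not CSC's implementation. It shows how a schema fixed at ingestion silently drops a field the business later needs, while raw storage keeps it available for a later query.

```python
import json

# Hypothetical raw events; the second carries a field the original
# schema never planned for.
RAW_EVENTS = [
    '{"device": "pump-1", "temp_c": 71.5}',
    '{"device": "pump-2", "temp_c": 64.0, "vibration_hz": 12.3}',
]

# Schema on write: the column list is fixed at ingestion time.
WRITE_SCHEMA = ("device", "temp_c")


def ingest_schema_on_write(lines):
    """Structure every record at ingestion; fields outside the schema are dropped."""
    return [tuple(json.loads(line)[col] for col in WRITE_SCHEMA) for line in lines]


def ingest_schema_on_read(lines):
    """Store raw records untouched; no structure is imposed yet."""
    return list(lines)


def query_on_read(store, field):
    """Apply a schema only at query time, for records that have the field."""
    records = (json.loads(line) for line in store)
    return [rec[field] for rec in records if field in rec]


table = ingest_schema_on_write(RAW_EVENTS)
lake = ingest_schema_on_read(RAW_EVENTS)
```

With the schema-on-write table, answering a new question about `vibration_hz` would require changing the schema and re-ingesting; with the raw store, `query_on_read(lake, "vibration_hz")` works without any upfront planning.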
8. AWS Government, Education, and Nonprofit Symposium
Risk of Transforming to a Big Data Business
RISK: Numerous different technologies
RESULT:
• Hard to select the best tool without specific experience with these technologies
RISK: Lack of Big Data specific expertise
RESULT:
• Unreasonable expectations without having done it before
• R&D in Big Data is lost or as time permits
• Scope creep is common
• Learning as you go
RISK: Immature Big Data Technologies
RESULT:
• Compliance risk
• Security Risk
• Complex deployments
• Complex integrations between technologies
• High operational costs
RISK: Large CapEx expenditure
RESULT:
• Buying upfront growth
• More complex to scale
Big Data & Analytic systems should be a tool that gives companies better information and insights, not a roadblock
9. AWS Government, Education, and Nonprofit Symposium
Operational Big Data Risks
1. Implementation
• Complexity
• Integration
• Speed
2. Operation
• Robust & Scalable
• Monitoring & Automated Alerts
3. Data Science
• Business Relevance
• Feedback loop
4. Talent
• The right talent at the right time
5. Infrastructure
• Upfront - CapEx investment
• Iterative Flexibility
• Matching Hardware to Software
10. AWS Government, Education, and Nonprofit Symposium
A New Mitigation Strategy: Big Data Platform-as-a-Service
Operation
• Managed to your SLA needs
• Global delivery teams and support
• Integrated testing
Implementation
• DevOps infrastructure-as-code deployment
• Pre-defined orchestration scripts
• Flexible deployment locations
Talent
• Data engineers
• Solution Architects
• ETL expertise
• Support Team
• R&D Team
• BI/Viz/Reporting expertise
Data Science
• Subject matter expertise as needed
• Global Data Science team
• Applying analysis at the right point
Infrastructure
• as-a-Service model
• Pay-as-you-go structure
• Pre-configured hardware designs
11. AWS Government, Education, and Nonprofit Symposium
Agenda
I. CSC BDPaaS Overview
II. CSC Approach
III. BDPaaS Architecture
IV. BDPaaS Security
V. Questions & Answers
12. AWS Government, Education, and Nonprofit Symposium
The Analytics Journey
Process Improvement via Applied Intelligence
Axes: Complexity, Business Value (Low Impact to High Impact)
• Descriptive Analytics I: What happened? Reporting (Query, Reporting, and Search Tools)
• Diagnostic Analytics: Why did it happen? Analysis (OLAP and Visualization Tools)
• Descriptive Analytics II: What’s happening now? Monitoring (Dashboards and Scorecards)
• Predictive Analytics: What might happen? Predictive Analysis (Big Data)
• Prescriptive Analytics: How can we make it happen? Recommendations, Risk Avoidance
Operations Triggers
13. AWS Government, Education, and Nonprofit Symposium
Customer Engagement Framework
Stages: SHAPE, TRANSFORM, MANAGE
Phases: Discovery, Technical Design, Platform Rollout, Iterative App Development
MAJOR ACTIVITIES
Solution
• Interview Key Business Stakeholders
• Interview Key Technical Stakeholders
• Define Objectives & Challenges
• Define Target Use Case
• Identify Data Sources
• Define Business Benefits
• Define Architecture
• Develop High-Level Approach & Costs
• Agree to Project Plan/Rollout
• Standup / Connect Environment
• Design Data Flows
• Architecture Validation
• Build Data Flows
• Historical Data
• Real-Time Data Flow
• Iterate
• Identify data sources for target use case
• Develop high level tech approach and costs
• Define high level benefits
• Develop initial case for action
• Develop go forward plan
• Develop Data Model
• Technical architecture & integration design
• Stand up environment
• Dashboard design workshops
• Data mapping
• Build dashboard
• Configure application
• Data load
• Run solution iterations
• Analytical modeling
• 2-4 hour Design Thinking Workshop
• Review current state metrics
• Review business pain points & opportunities
• Review application & infrastructure environment
• Define target use case
14. AWS Government, Education, and Nonprofit Symposium
InsightLab: Rapid Analytics Development
Agile Scientific Approach to Measurable Business Improvement
8-12 Week Sprint
• Business Discovery
• Data Exploration & Transformation
• Data Modeling & Algorithm Development
• Data Visualization & Reporting
• Insight Operationalization
• Change Management
• Use Case Prioritization & Roadmap
• Data Inventory Identification & Coordination
Inputs
Outputs
InsightLab
15. AWS Government, Education, and Nonprofit Symposium
How to Build a Business Outcome for anything
1. Define the desired insights by stage
Business Insights: Descriptive (1.0), Diagnostics (2.0), Predictive (3.0), Prescriptive (4.0)
2. Define data needed & format for analysis
Ingestion / Munging: Discovery, Integration, Normalization, Dimensionality Reduction, Feature Extraction, Transformation & Enrichment, Data Fusion
3. Define the types of Analysis
Data Science: Decision Trees, Regression Analysis, Classification, Clustering, Anomaly Detection, Natural Language Processing (NLP), Correlation Analysis
4. Define consumption and interaction
Visualization: Time Series, Spatial Charts, Mapping, Histogram, Graphs, Line Charts, Scatter Plots, Decision Trees, Data Exploration
5. Define the right tools for the task at hand
Tools & Technologies: R / Python / Java / Javascript; Tableau / Pentaho / Qlik; Cognos / BobJ / OBIEE; SAS / SPSS / MatLab / Rapid Miner; Relational DB; Columnar DB; Graph DB; Hadoop; In-Memory / Streaming
16. AWS Government, Education, and Nonprofit Symposium
Case Study
HGST, a Western Digital company, develops innovative, advanced hard disk drives, enterprise-class solid state drives, and external storage solutions and services. CSC improved customer support and product quality.
Challenge
• Decrease warranty inquiry response times
• Increase operational efficiency
• Enable the business to extract new insights
Solution
• Conducted 5-week big data strategy assessment
• Established cloud-based big data platform
• Built the apps and analytics to capitalize on the data
Results
• Over 10,000 queries/day
• 30+ data connections
• 1,000+ TB of data
• Response times of 2-3 months now done with a single query
• Improved customer satisfaction
• Reduced churn
• Reduced support costs
• New product management capabilities, fixes
• Better supply chain coordination
• Increased security
• New data and analytics products
• Increased cross-sales and up-sales
• Increased renewals
• Better license compliance
17. AWS Government, Education, and Nonprofit Symposium
Case Study
Network Rail manages most of the rail infrastructure across Great Britain and is responsible for the control and maintenance of over 2,500 railway stations, 20,000 miles of track, and 40,000 bridges and tunnels. CSC provides a data and analytics hub for massive amounts of imagery and analog track monitoring data.
CHALLENGE
• Network Rail needed a platform that could not only store, but also analyze petabytes of data over the long-term:
– Track imagery and video data captured via drones and cameras
– Vibration data captured via maintenance trains
– Other forms of large file size analog data crossed with operational, structured data sets
• Network Rail wanted to implement the solution quickly, and ramp up data volumes at a fast pace
• Goal of leveraging combined services to assist with loading data, managing the underlying infrastructure, and working with and analyzing the data
SOLUTION
• CSC designed and configured the solution, built and deployed it in the cloud, and developed ETL flows to import massive amounts of bulk data on an ongoing basis
– Core platform (BDPaaS) leveraging Hortonworks Data Platform, including Hive with Tez
• CSC’s platform integrated with ESRI ArcGIS for Big Data geolocation analysis features including geotagging and geo tiles
• CSC managed the infrastructure, platform components, and data flows, in addition to providing continued support/consultation services to the client
RESULTS
• Network Rail is generating insights on how to prioritize in near real-time the improvement and maintenance of the massive railway track and infrastructure footprint
– Advanced analytics of analog data, including geolocation capabilities
– Ability to handle the scale required by the massive amount of data under management and data growth
– Complete transformation of a business unit’s analytics capability on track for success in less than 12 months
Architecture diagram components: Image Files, Analog Data, Geo Info, AWS S3 Object Storage, YARN, HDFS, Hive, Hue, Hadoop-ArcGIS Connector, ESRI ArcGIS, PostgreSQL, PostGIS, ArcGIS, Geocortex
18. AWS Government, Education, and Nonprofit Symposium
Case Study
This Food & Hospitality Retailer has a footprint of over 650 regional hotels, 2,800 coffee shops, and a number of restaurant chains. CSC provides the infrastructure, data platform, and analytics that uncover revenue opportunities in customer web interactions.
CHALLENGE
• The client wanted to quickly evaluate the use of big data and the value that it brings as it relates to identifying new business opportunities
• Ease of use was a key need in making insights and reporting more accessible to analysts… and increasing the speed with which they could analyze
• Time to market was a key factor in the decision to implement a comprehensive big data platform. The client realized:
– A bare platform would not be easy to manage
– Their staff does not possess the skills to operate a bare platform
– They needed to focus on the big data applications, rather than the platform
SOLUTION
• CSC designed and configured the solution, built and deployed it in the cloud, and developed ETL flows to transport web activity data within 90 days:
– Core platform (BDPaaS) leveraging Hortonworks Data Platform, including Hive with Tez
– Aggregating various different data sources to create one massive web log data set
– Adding data science algorithms to clean up data for better insights
– Providing Pentaho Business Analytics as a comprehensive reporting and dashboard suite for insight presentation
• CSC managed the infrastructure, platform components, and data flows, in addition to providing continued support/consultation services to the client
RESULTS
• The client is generating insights on how customers interact with their website, and improving their services for happier customers and more streamlined business:
– Faster path to ROI with both tech and services
– Creating a real-time customer insights dashboard and set of reports
– Ability to prove the value of big data internally through the mining of data and generation of insights and reports for various teams
– Scalability to more data sources and use cases, including plans for mobile application analytics and operational metrics, as well as operational business analytics combining internal and external data sources
Architecture diagram components: Food & Hospitality Retailer, Logs, Pentaho Data Integration (PDI), Distcp, YARN, HDFS, Hive, Hue, PostgreSQL (onboard), Pentaho Business Analytics
19. AWS Government, Education, and Nonprofit Symposium
Agenda
I. CSC BDPaaS Overview
II. CSC Approach
III. BDPaaS Architecture
IV. BDPaaS Security
V. Questions & Answers
20. AWS Government, Education, and Nonprofit Symposium
Big Data Platform Enables Insights in 30 Days
CSC Big Data Platform as a Service
• Cloud-Enabled, Scalable, Distributed
• Powerful Integration: Any Data Source, Real-Time to Batch
• World-Class Managed Operations and Expert Services
• Most Trusted Security Capabilities
Applications: APP 1, APP 2, APP 3
Workloads: REAL TIME, BATCH, AD-HOC
Flexible Deployment Options: Public Cloud, Virtual Private Cloud, Dedicated Cluster, Enterprise Private Cloud
Agile Application Development Environment that is Scalable, Sustaining, Self Healing
21. AWS Government, Education, and Nonprofit Symposium
Big Data Platform as a Service
Workloads: REAL-TIME, BATCH, AD HOC
Applications: ETL, Data Transformation, Business Intelligence, Data Mining, Advanced Analytics, Geolocation
Data engines
• INTERACTIVE: Hive w/ Tez, Impala
• RELATIONAL: PostgreSQL
• DOCUMENT: Elasticsearch, MongoDB
• GRAPH: TitanDB
• STREAM: Storm / Kafka
• COLUMNAR: HBase, Accumulo, DataStax
• HDFS, YARN, MapReduce, Spark
CSC Command and Control: Deployment Center, Operations Center, Support Center, Application Center, Knowledge Center
Enterprise Grade Security: Access Control, Compliance Support, Perimeter Security, Activity Monitoring, Audit Logging, Encryption, Malware Protection, Hardened OS
Flexible Deployment Options: Amazon Web Services, CSC Hybrid Cloud Services, CSC WebScale Dedicated Hardware
22. AWS Government, Education, and Nonprofit Symposium
Big Data PaaS – Standard Reference Architecture
Inputs
• Events: HTTP(S) / TCP / UDP
• Files: Direct Upload / FTP / FTPS / SFTP
• Streams
• Queries
Platform components
• Web Listener
• File Store / Landing Zone
• Kafka Queue
• Storm
• HBase or Accumulo
• Hadoop: HDFS, MapReduce, Hive, Spark, Tez or Impala
• DataStax / TitanDB
• Elasticsearch or MongoDB
• Queries, Jobs
Command & Control
• Git: Versioning Control
• FreeIPA + LDAP: ID Access & Management
• Splunk: Monitoring & Log File Analysis
• Jenkins: Continuous Integration
• Puppet: Infrastructure as Code
• Agility Server: IT Policy & Governance
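One way to read the reference architecture is as a pipeline: events arrive at a listener, wait in a queue, are processed as a stream, and land in both a batch store and a search index. The sketch below simulates that shape in plain Python. The ordering of the stages is an assumption inferred from the component names on the slide, and the in-memory structures are stand-ins only; no Kafka, Storm, HDFS, or Elasticsearch API is used.

```python
from collections import defaultdict, deque

# In-memory stand-ins (assumptions for illustration, not real clients).
queue = deque()                     # plays the role of the Kafka queue
landing_zone = []                   # plays the role of HDFS for batch / ad hoc work
search_index = defaultdict(list)    # plays the role of Elasticsearch or MongoDB


def web_listener(event):
    """Accept one incoming event and enqueue it for stream processing."""
    queue.append(event)


def stream_processor():
    """Drain the queue, normalize each event, and fan out to both stores."""
    while queue:
        event = dict(queue.popleft())
        # Enrichment step: default and upper-case the severity level.
        event["level"] = event.get("level", "info").upper()
        landing_zone.append(event)
        search_index[event["level"]].append(event["msg"])


for incoming in [{"msg": "disk ok"}, {"msg": "disk full", "level": "error"}]:
    web_listener(incoming)
stream_processor()
```

The queue between listener and processor is the design point worth noticing: it lets ingestion and processing run at different rates, which is what allows the same platform to serve real-time and batch workloads.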
23. AWS Government, Education, and Nonprofit Symposium
Command & Control
• $100M+ R&D investment
• 8+ years of R&D
• 25+ distinguished big data engineers
• 125+ related technology engineers (cloud, cybersecurity, etc.)
• Core committers to all major Big Data open source projects
Puppet
• Fully Automated Deployment
• Pre-built service orchestration scripts
• Pre-built integration connector scripts
• Comprehensive Configuration Management
Jenkins
• Automated, pre-built platform integration tests
• Framework for app-level integration testing
Splunk
• Detailed Log Monitoring & Troubleshooting
• Complete activity monitoring & audit trail
• Comprehensive system monitoring and alerting suite
FreeIPA + LDAP
• User Account and Permissions Management
• LDAP Integration
Git
• Platform and Application Version Control
• DevOps Push-Pull Application Code Delivery
Agility Server
• IT Policy & Governance Engine
• Hybrid Cloud Workload Interoperability
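The idea behind infrastructure-as-code deployment is that operators declare the desired cluster state and a tool works out the changes needed to reach it. The sketch below shows that convergence logic in plain Python. It is an illustration of the concept only: the node names, roles, and both functions are hypothetical, and this is neither Puppet's language nor CSC's orchestration scripts.

```python
def plan_changes(desired, actual):
    """Return the actions that move `actual` to `desired` (both map node -> role)."""
    actions = []
    for node, role in sorted(desired.items()):
        if node not in actual:
            actions.append(("add", node, role))
        elif actual[node] != role:
            actions.append(("reconfigure", node, role))
    # Anything running that is no longer declared gets removed.
    for node in sorted(actual.keys() - desired.keys()):
        actions.append(("remove", node, actual[node]))
    return actions


def apply_changes(actual, actions):
    """Apply planned actions to a copy of the current state and return it."""
    new_state = dict(actual)
    for verb, node, role in actions:
        if verb == "remove":
            del new_state[node]
        else:
            new_state[node] = role
    return new_state


# Hypothetical cluster: n2 has the wrong role, n3 is missing, n4 is surplus.
desired = {"n1": "kafka", "n2": "storm", "n3": "hdfs"}
actual = {"n1": "kafka", "n2": "hive", "n4": "hdfs"}

actions = plan_changes(desired, actual)
converged = apply_changes(actual, actions)
```

Running the planner a second time against `converged` yields no actions. That idempotence is what makes repeated, automated deployment runs safe.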
24. AWS Government, Education, and Nonprofit Symposium
LOCAL: VM/Sandbox or “local node” environment or “direct-dev” on BDPaaS
DEV / DR: Sample Data, Partial/Full Flows, or DR Replication
PRODUCTION: Production Data Flows
Push to Production: Maintenance Window
DEV / DR and PRODUCTION each run: Storm, Kafka, HDFS, Hive, Impala, Elasticsearch, C&C
25. AWS Government, Education, and Nonprofit Symposium
PRODUCTION: Production Data Flows
DEV / DR: Sample Data, Partial/Full Flows, or DR Replication
DEV / DR and PRODUCTION each run: Storm, Kafka, HDFS, Hive, Impala, Elasticsearch, C&C
In each environment:
• ADD OR REMOVE NODES
• RECONFIGURE NODES
• RECONFIGURE OVERALL CLUSTER
• ADD OR REMOVE CLUSTERS
• SCALE UP OR SCALE DOWN CPU, RAM, DISK
• ADD OR REMOVE ENVIRONMENTS
26. AWS Government, Education, and Nonprofit Symposium
Architecture diagram components:
• Analytics clients and interfaces: Tableau, ODBC, Kibana, API, RevolutionR, SAS, Bulk Export, RHadoop/ScaleR
• Platform: Storm, Kafka, HDFS, Hive, Impala, Elasticsearch, Hue, Command & Control
• DR: BDR, Kfk Replicate
• Data sources: Teradata, Oracle RDBMS, Twitter, Logs, Video Files, IBM MQ
• Connectors: Sqoop, Kfk-Hdp Bulk Writer, Kfk-ES Record Writer, Teradata Connector for Hadoop, HTTP (GNIP), HTTP, Custom Connector
• AWS services: EBS Volumes, VPN, Amazon S3, Amazon IAM, Amazon Storage Gateway, Direct Connect, Amazon CloudFront, Amazon CloudFormation, AMI Service, Glacier, Amazon RDS, Ephemeral Local Drives
• AWS instance types: D2, R3, I2, C4, C3, M3
27. AWS Government, Education, and Nonprofit Symposium
Why Public Cloud
• Higher Resource Efficiency for Increased Savings
• Significantly Greater Workload and Resource Flexibility
• More compatible with software-defined-everything approach
• Shared Services (image service, identity management,
object storage, block storage, telemetry, etc.)
• High Scale Cost Efficiency
• Hybrid Cloud Compatibility
Amazon Cloud Management
28. AWS Government, Education, and Nonprofit Symposium
Agenda
I. CSC BDPaaS Overview
II. CSC Approach
III. BDPaaS Architecture
IV. BDPaaS Security
V. Questions & Answers
29. AWS Government, Education, and Nonprofit Symposium
Guarded with Enterprise Grade Security
Change Management, Physical Security, Backups, Disaster Recovery, and more…
Disk & Network Encryption
• Data at Rest Disk Encryption
• Encrypted Node-to-Node in Flight Communication
• S3 Encryption & EBS Volume Encryption -- AWS
• Secure transmission from encrypted customer facility to BDPaaS deployment
Std Operating Environment
• Deployment of a complete platform stack
• Virtualization ready & Anti-Virus ready
• Vulnerability Scanning, Penetration Testing and Security Patches
Malicious Code Protection
• CSC Endpoint Security (TrendMicro) -- Hardware security
• ClamAV -- Virtual Machine security
• Tripwire -- File Integrity monitoring
Compliance Support
• Audit support -- HIPAA, PCI, FISMA, ITAR etc
• Documentation support
• Compliance oversight
• Security enforcement / issue resolution*
Activity Monitoring
• Splunk -- activity monitoring and detailed system logging
• Cloudera Manager and Ambari -- Hadoop configuration information
• Puppet -- all non-Hadoop component configuration information
Access Control
• FreeIPA -- centralized user management and policy controls
• LDAP/AD integration
• Kerberos option
• Apache Knox option
• Apache Sentry option
Perimeter Security
• Secure VPN connections
• Isolated subnets
• Secure port management and fine-grained port monitoring
• IP whitelisting & blacklisting
Audit Logging
• ArcSight SIEM -- Security Event Management
• Managed audit operations personnel
• ArcSight via connector -- Splunk
Ensuring Data, Application, Platform Security, and Meeting Regulatory Requirements
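The perimeter control "IP whitelisting & blacklisting" can be sketched with Python's standard `ipaddress` module. The network ranges below are hypothetical placeholders, and the rule that a deny entry overrides an allow entry is an assumption of this sketch, not a documented BDPaaS policy.

```python
import ipaddress

# Hypothetical ranges for illustration only.
ALLOW = [ipaddress.ip_network("10.20.0.0/16")]   # e.g. a customer VPN range
DENY = [ipaddress.ip_network("10.20.99.0/24")]   # e.g. a quarantined subnet


def is_permitted(addr):
    """Permit an address only if it is allow-listed and not deny-listed."""
    ip = ipaddress.ip_address(addr)
    # Deny entries take precedence over allow entries (assumed policy).
    if any(ip in net for net in DENY):
        return False
    return any(ip in net for net in ALLOW)
```

Addresses outside every allow-listed range are rejected by default, so `is_permitted("192.0.2.1")` is false even though no deny rule mentions it.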
30. AWS Government, Education, and Nonprofit Symposium
Agenda
I. CSC BDPaaS Overview
II. CSC Approach
III. BDPaaS Architecture
IV. BDPaaS Security
V. Questions & Answers
31. AWS Government, Education, and Nonprofit Symposium
Questions and Answers:
CSC Website
http://www.csc.com/big_data/offerings/82345/105621-csc_big_data_platform_as_a_service_powered_by_infochimps
TheSource
https://thesource.csc.com/Pages/Offerings/CSC-Big-Data-Platform-as-a-Service.aspx
32. AWS Government, Education, and Nonprofit Symposium
Thank You.
This presentation will be loaded to SlideShare the week following the Symposium.
http://www.slideshare.net/AmazonWebServices
Editor's Notes
Big Data in The Cloud: Architecting a Better Platform
Big data technologies like Hadoop help tame the deluge of data by funneling machine and sensor data, busting organizational data silos, and connecting the Internet of Things. However, Hadoop isn't always the right tool for the job, and the public cloud is a very different operational model for architecting and managing a big data solution. In this session, we will review the architecture and technologies behind CSC's Big Data Platform as a Service on Amazon Web Services, how to achieve an agile approach to analytic app development, and how to ensure maximum security even in a public cloud environment.
How many of you have worked with Hadoop?
How many of you have worked with CSC?
This is our opportunity to educate you on our platform and our capabilities.
A recent Gartner survey highlights that 87% of enterprises believe Big Data & Analytics will redefine the competitive landscape of their industries within the next three years. With that kind of business pressure, we’re seeing more and more organizations focused on Analytics out of competitive necessity…and the opportunity for CSC is enormous.
Today, our customers are finding themselves in various states of maturity…
Big Data 1.0: If we go back a few years, we lived in a very different world…a world characterized by traditional relational databases and business intelligence suites…a world of highly structured data sets, enabled through the use of very costly hardware and proprietary software.
But look at what’s happened….
Teradata, a name synonymous with EDW, has lost over $6B in market cap over the last 2 years.
Oracle has seen decreased license revenue for the traditional RDBMS over the last 6 consecutive quarters.
Why has this occurred??
We’ve seen a massive influx of data…new unstructured data sets, and the economics simply no longer work…
As data volumes increase, so do the requirements for compute capability to process them, and you need more and more storage…all driving costs up while IT budgets are shrinking or remaining flat. …a new model is needed…
Big Data 2.0: So how did the traditional IT vendors initially attempt to solve these compute/resource related challenges? With analytic appliances…yet with mixed results.
Let’s look at SAP….they generate $22B in revenue each year…only about 3% or $600M comes from the sale of SAP HANA. They’re now looking at ways to bundle these capabilities directly into their software suite.
These solutions definitely add value…in very specific ways…but at a cost premium and requiring specialized skills…so adoption rates have been mixed at best.
Big Data 3.0: These dynamics have led many companies to adopt open source technologies. We’re not seeing wholesale replacements of traditional data storage and analytic technologies….no one is willing to throw away their current investments. Instead the immediate future is about finding ways to operate effectively and seamlessly in this new hybrid world that merges the legacy, highly structured data sets of the past with the real-time demands of the digital business economy.
Here’s the really good news for systems integrators like CSC…more and more organizations are realizing that they cannot succeed on their own. They need our help… They’re struggling with how to quantify the value of an engagement, they’re struggling with keeping up with the skills necessary to be successful, they’re struggling with how to overcome challenges around data security.
…We have an open invitation to bring more value.
- This should be common knowledge to everyone, but it’s important to start here: the crux of the problem we’re solving
- This allows clients to incrementally evolve their business as quickly as they need/want to
- Result of two surveys we’ve conducted the last couple years with several thousand CIOs from all over the world
CIOs cannot afford to be laggards in this space. Across a mix of Enterprise, Large and Mid Sized companies, 53.4% had Big Data Solutions in operation with 23.8% in the planning stage, according to an EMA & 9 Sight Consulting survey.
Many organisations attempt to design, build and operate their own Big Data infrastructure. Research by SandHill Group found that organizations face three significant challenges when they try to implement their own Hadoop / Big Data initiatives:
knowledge and experience (65.2%)
skills availability (52.6%)
development effort: the amount of technology development and engineering required (40.7%)
These challenges delay business value and create very real barriers:
Complex Data: The explosion of data has created large, complex data sets that traditional tools can’t handle such as multiple, difficult-to-leverage heterogeneous data sources
Robust and Scalable service: managing NoSQL and open source advanced analytics technologies to provide a stable, secure, robust and scalable production service at an acceptable cost is challenging. Skill and experience are required to ensure POC and initial analytic applications can scale and perform as needs change.
Speed of Stand up: In order for use-case oriented projects to have a high impact, fast timelines must be met and costs must be aligned with projected returns.
The technologies involved are novel requiring very different skills to traditional BI and analytics tools:
Skills Shortage: The required blend of technical and business capabilities is hard to find. The challenges of implementing, integrating and operating these new technologies are distracting organizations from their primary interest, which is extracting new business insights to improve organizational performance.
Skills Retention: SandHill Group found “it is often a time-consuming process to cultivate the capabilities internally. Frequently, redirecting existing IT staff proves to be a skills and cultural mismatch.” Even if re-training and skills development actions are successful, retaining these skills in a hot jobs market is tough.
Gartner Analyst Carol Rozwell at the Gartner Business Intelligence & Analytics Summit 2014 said: “Through 2017, compensation for professionals with Big Data and related analytic skill sets will remain 20-30 percent higher than for other business skills.”
Sources:
Operationalizing the Buzz: Big Data 2013, An Enterprise Management Associates (EMA) and 9 Sight Consulting Research Report, November 2013
Do you Hadoop? A Survey of Big data Practitioners, Bradley Graham, M Rangaswami, SandHill Group, October 2013
CSC Analytics Services Enable a Maturity Roadmap for Using Insights to Make Decisions and Take Actions on All Levels of the Organization
Many clients find they are focusing most of their energy and resources on getting the technical foundation right, or on keeping their information management estate up to date to keep up with user demands. This limits the ability to move into newer areas such as incorporating more varied types of data or more sophisticated analytics.
As competitive pressures rise, the need to use more sophisticated information insights becomes a necessity for the organization. However, with no let-up in daily information needs, the inability to update the information infrastructure and to free up resources to learn new skills and techniques becomes unsustainable.
Opening Phrase - There are only a select number of “fundamental things” you can do with data, and from a computer’s standpoint it doesn’t care what that data “actually is” because it’s really just ones and zeros to a computer.
CSC BDPaaS is designed to enable clients to acquire and apply insights through batch, fine-grained, interactive, and real-time streaming analytics through a fully integrated and managed big data environment, delivered as-a-Service and deployed in under 30 days.
There is no “one-size-fits-all” and no Swiss Army knife – that’s why we made a platform designed to be flexible and able to do everything. You just need to add on the appropriate accelerators and packages. So in a way, we made the Swiss Army knife.
Delivered and priced in an as-a-Service model
Utilizing advanced web scale technologies
Enables application developers to quickly develop, test and deploy
Supports any combination of ad-hoc, batch and real-time analytics
Offers enterprise class scaling and performance
Other Security & compliance components – Activity monitoring, Alert Management, Backups
What does this mean to the client?
Accelerates client’s time to market
Mitigates Risk of Big Data Technology
Provides Extremely cost competitive TCO
Enables flexibility
Our IP within Puppet brings everything together to create an agile, secure, and enterprise ready Big Data Analytics System
Extremely fast deployment
Enterprise class infrastructure design (built into our Puppet scripts) with best-in-class technologies
Extremely fast integrations
Robust security suite
Globally distributed and standardized managed services to keep costs low
aaS model
Open Source – no vendor lock-in
Vulnerability Scanning, Penetration Testing, and Security Patches
OS and application vulnerability scanning and penetration testing for every major and minor release published by R&D
Source code scanning for vulnerabilities
Security patches applied as required as part of the managed service
Change Management
ITIL-compliant process for managing change in the system
All changes are tracked
All significant changes must go through a Change Authorization Board review process
Physical Security & Managed Services
Employee Background checks
Encrypted laptops
Secure key and password management
Locked down physical facilities with active monitoring
Continuous, ongoing training for security, best practices, and compliance-readiness activities