http://sg.com.mx/sgce/2013/sessions/el-contexto-la-integraci%C3%B3n-masiva-datos
IT executives know with certainty that the most important business information is hidden in billions of security events. The ability to integrate data into a clear picture of the current situation is essential to how clandestine attacks are detected today. Depending on how it is collected, managed, and analyzed, security data can be a great asset or an enormous headache.
Addressing the challenges of so-called “legacy SIEM” solutions and combining them with security intelligence methodologies can take your organization to the next level when internal and external attacks occur, while staying compliant through reporting and management and delivering exceptional value and return on investment. Learn how to respond to the demands of Big Data by integrating global threat intelligence (GTI).
4. In 2015… Greater demand on data centers
• More than 1 billion additional netizens (1B) [1]
• More than 15 billion connected devices (15B) [2]
• More than 1 zettabyte of Internet traffic (1,000 exabytes) [3]
1. IDC “Server Workloads Forecast” 2009.
2. IDC “The Internet Reaches Late Adolescence” Dec 2009, extrapolation by Intel for 2015; ECG “Worldwide Device Estimates Year 2020 - Intel One Smart Network Work” forecast.
3. Source: http://www.cisco.com/assets/cdc_content_elements/networking_solutions/service_provider/visual_networking_ip_traffic_chart.html extrapolated to 2015
5. A data explosion
2.7 ZB of data in 2012; 15 billion connected devices in 2015
• About 24 petabytes of data processed by Google* per day in 2011
• 4 billion pieces of content shared on Facebook* every day (July 2011)
• 250 million tweets per day in October 2011
• 5.5 million (legitimate) emails per second in 2011
Chart: Worldwide Enterprise Storage Consumption Capacity Shipped by Model, 2006–2015 (PB)
Source: IDC, 2011 Worldwide Enterprise Storage Systems 2011–2015 Forecast Update.
6. More data…
In 2020, the volume of information will be 35.2 zettabytes
By 2020, the volume of digital information will reach 35.2 zettabytes (1 ZB equals 1 trillion GB), up from 1.8 ZB in 2010. This exponential growth in data makes Big Data the driving force of the information age, according to estimates from Sogeti, a Capgemini Group company.
For its part, the consulting firm Gartner states that companies able to obtain more valuable information, process it, and manage it will achieve financial results 20% better than their competitors.
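As a rough check on the growth figures quoted on this slide, the short Python sketch below computes the constant yearly growth rate implied by going from 1.8 ZB (2010) to 35.2 ZB (2020), which comes out at about 35% per year. It also confirms the unit conversion, assuming decimal (SI) units. The function name `cagr` is only illustrative.

```python
# Compound annual growth rate implied by the figures on this slide.
start_zb = 1.8     # digital information volume in 2010 (ZB)
end_zb = 35.2      # projected volume in 2020 (ZB)
years = 2020 - 2010


def cagr(start, end, periods):
    """Return the constant per-period growth rate that takes start to end."""
    return (end / start) ** (1.0 / periods) - 1.0


growth = cagr(start_zb, end_zb, years)
print(f"Implied annual growth: {growth:.1%}")

# Assuming SI units: 1 ZB = 10**21 bytes and 1 GB = 10**9 bytes,
# so 1 ZB = 10**12 GB (1 trillion GB).
gb_per_zb = 10**21 // 10**9
print(f"1 ZB = {gb_per_zb:,} GB")
```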
7. One case
The New York Times used 100 Amazon EC2 instances and Hadoop to process 4 TB of TIFF image data and produce 11 million PDFs in 24 hours, at a cost of US$240.
http://en.wikipedia.org/wiki/Apache_Hadoop
8. Another case
The Hadoop clusters at Yahoo! span 40,000 servers and store 40 petabytes of data; the largest cluster has 4,000 servers.
http://www.aosabook.org/en/hdfs.html
9. Just one more case
In 2010 Facebook stated that it had the largest Hadoop cluster in the world, with 21 PB. In 2011 it announced that the cluster had grown to 30 PB, and by mid-2012 it had reached 100 PB. On November 8, 2012, the company announced that its data warehouse was growing by roughly half a PB per day.
http://en.wikipedia.org/wiki/Apache_Hadoop
10. Big Data
A term applied to data sets that exceed the ability of commonly used software to capture, manage, and process them within a reasonable time. Big data sizes are a constantly moving target; as of 2012 they ranged from a dozen terabytes to several petabytes of data in a single data set.
The challenges include capture, processing, storage, intelligence sharing, analysis, and visualization.
Benefits for healthcare, finance, telcos, energy, traffic, marketing, manufacturing, security… Who will ask the right question?
11. The four Vs
• Volume. When the term big data is used, data volume typically ranges from multiple terabytes
to petabytes. This certainly fits the enterprise security model as it is not uncommon for
large organizations to collect tens of terabytes of security data on a monthly basis.
• Velocity. This term is used with respect to real-time data analysis requirements. In
cybersecurity, velocity can refer to the need for immediate anomaly or incident
detection. Real-time data analysis is critical here to minimize damages associated with a
cybersecurity attack.
• Variety. Big data can be made up of multiple data types and feeds including structured
and unstructured data. From a security perspective, data variety could include log files,
network flows, IP packet capture, external threat/vulnerability intelligence, click streams,
network/physical access, and social networking activity, etc. It is not unusual for
enterprises to collect hundreds of different types of data feeds for security analysis.
• Veracity. Big data must be trustworthy and accurate. From a security perspective, this
means trusting the confidentiality, integrity, and availability of data sources like log files
and external data feeds.
12. Thousands of Events
The Big Security Data Challenge
[Diagram labels: Billions of events; Consolidate logs; Correlate events. Sources: Perimeter, APTs, Cloud, Data, Insider]
13. The Security Dilemma
MONITORING TECHNIQUES MUST ADVANCE
VISIBILITY
INSTRUMENTATION
Instrumentation and data collection are still critical, but applying filters derived
from intelligence is the path to achieving better security.
14. Big Data vs. Big Security Data
Understanding Security Data As Big Data
BIG DATA
Datasets whose size and variety are beyond the ability of typical database software to capture, store, manage and analyze.
BIG SECURITY DATA
• Size of security data doubling annually
• Advanced threats demand collecting more data
• Legacy data management approaches failing
• SIEM use shifting from compliance to security
• How do I gather security context?
• How do I manage big security information?
• How do I make security information management work?
Security Big Data is about matching security intelligence with the right collected data.
15. Gartner says…
• The amount of data analyzed by enterprise
information security organizations will double every
year through 2016.
• By 2016, 40% of enterprises will actively analyze at
least 10 terabytes of data for information security
intelligence, up from less than 3% in 2011.
• By 2016, 40% of Type A enterprises will create and
staff a security analytics role, up from less than 1%
in 2011.
16. Goal…
One of the primary drivers of security
analytics will be the need to identify when
an advanced targeted attack has bypassed
traditional preventative security controls
and has penetrated the organization.
17. Needle in a Datastack
• Organizations are storing approximately 11-15 terabytes of security data a week.
• The ability to detect data breaches within minutes is critical in preventing data loss, yet
only 35 percent of firms stated that they have the ability to do this.
• In fact, more than a fifth (22 percent) said they would need a day to identify a breach,
and five percent said this process would take up to a week. On average, organizations
reported that it takes 10 hours for a security breach to be recognized.
• Nearly three quarters (73 percent) of respondents claimed they can assess their
security status in real time, and they expressed confidence in their ability to
identify in real time insider threats (74 percent), perimeter threats (78
percent), zero-day malware (72 percent) and compliance controls (80 percent).
However, of the 58 percent of organizations that said they had suffered a security
breach in the last year, just a quarter (24 percent) had recognized it within minutes. In
addition, when it came to actually finding the source of the breach, only 14 percent
could do so in minutes, while 33 percent said it took a day and 16 percent said a week.
The study, conducted by research firm Vanson Bourne, interviewed 500 senior IT decision makers in January 2013, including 200 in the USA and 100 each in the UK, Germany and Australia.
18. Useful data… from Verizon 2012
• “84% of security incidents (successful intrusions) were reflected in the logs”
• “Only 8% of the security incidents detected by companies were found by mining their logs”
20. SIEM is Still Evolving …Beyond Logs
• What else happened at this time? Near this time? What is the time zone?
• What is this service? What other messages did it produce? What other systems does it run on?
• What is the host's IP address? Other names? Location on the network/datacenter?
• Who is the admin? Is this system vulnerable to exploits?
• What does this number mean? Is this documented somewhere?
• Who is this user? What is the user's access level? What is the user's real name, department, location? What other events from this user?
• What is this port? Is this a normal port for this service? What else is this service being used for?
• DNS name, Windows name, other names? Whois info? Organization owner? Where does the IP originate from (geolocation info)? What else happened on this host? Which other hosts did this IP communicate with?
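The questions on this slide amount to enriching a raw log event with context about the host, user, and port. A minimal Python sketch of that idea follows. The lookup tables `ASSETS`, `USERS`, and `PORTS`, the field names, and the `enrich` function are all hypothetical; a real deployment would query DNS, an asset inventory, a directory service, and similar sources instead of in-memory dictionaries.

```python
# Hypothetical context sources keyed by host IP, user name, and port number.
ASSETS = {"10.0.0.5": {"hostname": "db01", "admin": "ops-team", "location": "DC-1"}}
USERS = {"jdoe": {"real_name": "John Doe", "department": "Finance", "access_level": "user"}}
PORTS = {22: "ssh", 443: "https", 3306: "mysql"}


def enrich(event: dict) -> dict:
    """Return a copy of the event with host, user, and service context attached."""
    enriched = dict(event)  # leave the raw event untouched
    enriched["host_context"] = ASSETS.get(event.get("host_ip"), {})
    enriched["user_context"] = USERS.get(event.get("user"), {})
    enriched["service"] = PORTS.get(event.get("port"), "unknown")
    return enriched


raw = {"timestamp": "2013-07-09T10:15:00Z", "host_ip": "10.0.0.5",
       "user": "jdoe", "port": 3306}
print(enrich(raw))
```

Unknown hosts, users, or ports yield empty context rather than an error, so an analyst can still see the event and notice what is missing.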
21. SEM + SIM = SIEM
SIEM is the Evolution and Integration of Two Distinct Technologies
Security Event Management (SEM)
― Primarily focused on Collecting and Aggregating Security Events
Security Information Management (SIM)
― Primarily focused on the Enrichment, Normalization, and Correlation of Security Events
Security Information & Event Management (SIEM) is a Set of Technologies for:
• Log Data Collection
• Correlation
• Aggregation
• Normalization
• Retention
• Analysis and Workflow
Three Major Factors Driving the Majority of SIEM Implementations
• Real-Time Threat Visibility
• Security Operational Efficiency
• Compliance and/or Log Management Requirements
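To make normalization and correlation concrete, here is a small Python sketch under stated assumptions. The two input formats (a syslog-style sshd line and a Windows-style record), the common schema, and the threshold rule are invented for illustration and do not describe any particular SIEM product. The Windows event IDs used are 4625 (failed logon) and 4624 (successful logon).

```python
from collections import defaultdict


def normalize_sshd(line: str) -> dict:
    # Assumed format: "<epoch> sshd <OUTCOME> user=<name> src=<ip>"
    ts, _, outcome, user, src = line.split()
    return {"ts": int(ts), "source": "sshd", "outcome": outcome.lower(),
            "user": user.split("=", 1)[1], "src_ip": src.split("=", 1)[1]}


def normalize_windows(record: dict) -> dict:
    # Assumed record keys: Time, EventID, Account, IpAddress
    outcome = "failed" if record["EventID"] == 4625 else "success"
    return {"ts": record["Time"], "source": "windows", "outcome": outcome,
            "user": record["Account"], "src_ip": record["IpAddress"]}


def correlate_failed_logins(events, threshold=3, window=60):
    """Return source IPs with at least `threshold` failures within `window` seconds."""
    failures = defaultdict(list)
    for ev in sorted(events, key=lambda e: e["ts"]):
        if ev["outcome"] == "failed":
            failures[ev["src_ip"]].append(ev["ts"])
    flagged = set()
    for ip, stamps in failures.items():
        # Slide a window of `threshold` consecutive failures over the timestamps.
        for i in range(len(stamps) - threshold + 1):
            if stamps[i + threshold - 1] - stamps[i] <= window:
                flagged.add(ip)
                break
    return flagged


events = [
    normalize_sshd("1000 sshd FAILED user=root src=203.0.113.7"),
    normalize_sshd("1010 sshd FAILED user=admin src=203.0.113.7"),
    normalize_windows({"Time": 1020, "EventID": 4625, "Account": "admin",
                       "IpAddress": "203.0.113.7"}),
    normalize_windows({"Time": 1030, "EventID": 4624, "Account": "jdoe",
                       "IpAddress": "198.51.100.2"}),
]
print(correlate_failed_logins(events))
```

Because both sources are mapped to the same fields first, the rule can count failures from one address across sshd and Windows together, which neither log shows on its own.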
22. The State of SIEM
Legacy SIEM Reality:
• Antiquated Architectures Force Choices Between Time-to-Data and Intelligence
• Events Alone Do Not Provide Enough Context to Combat Today’s Threats
• Complex Usability and Implementation Have Caused Costs To Skyrocket
VS
SIEM Promise:
• Turns Security Data Into Actionable Information
• Provides an Intelligent Investigation Platform
• Supports Management and Demonstration of Compliance
23. Shifting from Compliance to Security
Source: InformationWeek 2012 Security Information and Event Management Vendor Evaluation Survey of 322 business technology
professionals, April 2012
25. Global Threat Intelligence and SIEM
McAfee Labs IP Reputation Updates
IP REPUTATION CHECK: GOOD / SUSPECT / BAD
Risk levels: Medium Risk, High Risk
Activity shown: Botnet/DDoS, Mail/Spam Sending, Web Access, Malware Hosting, Network Probing, Presence of Malware, DNS Hosting Activity, Intrusion Attacks
AUTOMATIC IDENTIFICATION
AUTOMATIC RISK ANALYSIS VIA ADVANCED CORRELATION ENGINE
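The IP reputation check on this slide can be illustrated with a short Python sketch. It is not McAfee GTI's actual API or scoring model: the `FEED` table, the 0 to 100 score scale, the thresholds, and the function names are all assumptions made for the example. Only the standard-library `ipaddress` module is used.

```python
import ipaddress

# Hypothetical reputation feed: network -> (risk score 0-100, activity category).
FEED = {
    ipaddress.ip_network("203.0.113.0/24"): (90, "botnet"),
    ipaddress.ip_network("198.51.100.0/24"): (55, "spam"),
}


def reputation(ip: str):
    """Look up an address in the feed; unknown addresses get score 0."""
    addr = ipaddress.ip_address(ip)
    for network, (score, category) in FEED.items():
        if addr in network:
            return score, category
    return 0, None


def verdict(score: int, suspect_at: int = 40, bad_at: int = 80) -> str:
    """Map a score to GOOD / SUSPECT / BAD using assumed thresholds."""
    if score >= bad_at:
        return "BAD"
    if score >= suspect_at:
        return "SUSPECT"
    return "GOOD"


def check_event(event: dict) -> dict:
    """Attach a reputation verdict for the event's destination address."""
    score, category = reputation(event["dst_ip"])
    return {**event, "reputation": verdict(score), "category": category}


for ip in ("203.0.113.9", "198.51.100.1", "192.0.2.1"):
    print(check_event({"dst_ip": ip}))
```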
26. GTI with SIEM Delivers Even Greater Value
Sorting Through a Sea of Events…
• 200M events
• 18,000 alerts and logs
• Dozens of endpoints
• Handful of users
• Specific files breached (if any)
• Optimized response (RESPOND)
Have I Been Communicating With Bad Actors?
Which Communication Was Not Blocked?
What Specific Servers/Endpoints/Devices Were Breached?
Which User Accounts Were Compromised?
What Occurred With Those Accounts?
How Should I Respond?
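The first few questions on this slide can be read as successive filters over the event stream. The Python sketch below shows that narrowing on toy data. The event fields, the `BAD_IPS` set (standing in for the output of a reputation check), and the `triage` function are hypothetical.

```python
# Hypothetical set of addresses already judged malicious by a reputation check.
BAD_IPS = {"203.0.113.9"}

events = [
    {"host": "ws-01", "user": "alice", "dst_ip": "203.0.113.9", "action": "blocked"},
    {"host": "ws-02", "user": "bob", "dst_ip": "203.0.113.9", "action": "allowed"},
    {"host": "ws-03", "user": "carol", "dst_ip": "192.0.2.10", "action": "allowed"},
]


def triage(events, bad_ips):
    """Narrow a list of events down to the hosts and accounts worth investigating."""
    # Have I been communicating with bad actors?
    contacted = [e for e in events if e["dst_ip"] in bad_ips]
    # Which communication was not blocked?
    not_blocked = [e for e in contacted if e["action"] != "blocked"]
    return {
        "events_with_bad_actors": len(contacted),
        "not_blocked": len(not_blocked),
        # Which endpoints and user accounts were involved?
        "hosts_to_investigate": sorted({e["host"] for e in not_blocked}),
        "accounts_to_review": sorted({e["user"] for e in not_blocked}),
    }


print(triage(events, BAD_IPS))
```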
32. McAfee ESM
McAfee Starts at the Core
July 9, 2013
McAfee DB
• Real-time, complex analysis
• Indexing purpose-built for SIEM
• Massive context feeds with enrichment
• Historical retrieval and analytics
• Integrated log and event management
• No DBA required
SMART FAST
Scale, Analytical flexibility, Performance
36. Conclusions…
• Turn on and use your logs
• Log management first, then a SIEM
• There are no “silver bullets”
• Thinking wins over technology
• Less is more
• Windows Event Logs
• Syslogs
• DNS
• App logs
• Context awareness (geolocation, users, VM, asset management, etc.)
• Use cases, use cases, use cases!
• Big Data architectures
• High speed (I/O): hours to see a report, or minutes for a view?
• Security feeds (reputation systems)
• Interconnected security
• An IP with a bad reputation is automatically blocked by the IPS.
• A host that had contact with a malicious IP is analyzed from the SIEM.
37. “If you’re in a fight, you need to know that while it’s happening, not after the fact”