Harvey Nash USA Webinar: The Big Opportunity of Big Data
1. #HNBigData
The Big
Opportunities
of Big Data
October 27, 2011
2. #HNBigData
Anna Frazzetto
SVP Technology Solutions, Harvey
Nash
20+ years in IT
Offshoring & outsourcing road warrior
Member: HDI Strategic Advisory Board
Published author & industry presenter
3. #HNBigData
About Harvey Nash
Unique portfolio of services
– Executive Search
– IT Professional Services
– IT Staffing
Strong track record, founded more than 20
years ago
– 13 years on the London Stock Exchange
– 2009 revenues: over $750 million
Global footprint: USA, Europe, Asia
– Over 5,000 people working across 39 offices worldwide
– 4,800 resources in Vietnam
– Serving leading global enterprises, small and mid-size
businesses, governments and global institutions
5. #HNBigData
CIO Survey 2011
The 2011 Harvey Nash Survey
explored innovation as it relates
to the CIO role
– The CIO is also seen as the dominant
executive in the organization for
driving innovation
– Mobile technology and private cloud
services are the most widely
adopted technologies driving
current innovation with 22% of
global CIOs already using them to
exploit core business activities
6. #HNBigData
Innovation CIOs
A CIO is more likely an innovator if he/she:
Reports to the CEO . . . . . . . . . . . . . . . . . . . . . . . . . . 25%
Has realistic aspirations to become a CEO . . . . . . . . 26%
Has had budget increases in the last year . . . . . . . . 18%
Is a member of the operational board . . . . . . . . . . . 22%
Is focused on competitive advantage vs. savings . . 29%
Finds his/her job very fulfilling . . . . . . . . . . . . . . . . . 24%
Invests heavily in team training . . . . . . . . . . . . . . . . 36%
The growing dynamics and influences of the cloud, mobility and "Big Data" are changing the nature of IT's role in the new global enterprise. They are also changing the nature of (and requirements for) the "New IT Leader". All CIOs now stand at a crossroads, and the path taken will strongly influence both the potential success of our respective enterprises and the lasting personal brands of each of us. What personal takeaways must we all focus on and adopt concerning “The New Enablers” in order to sustain a positive legacy?
First there’s me, Anna Frazzetto (Anna to cover her own bio) .
Anna to review Harvey Nash—who we are….
Anna to review Harvey Nash clients…
To introduce Doug’s topic of Big Data, I want to share with you the insights from the Harvey Nash CIO Survey that consistently led to exciting cloud computing and innovation debates with our CIO Panels nationwide. Every survey discussion we held began with the topic of innovation, which was an important focus of the CIO Survey this year. We looked at how important innovation was to IT leaders and their businesses today.
The survey showed that CIOs are front and center in innovation efforts. Survey results revealed that the CIO is seen as the dominant executive in the organization for driving innovation and that mobile technology and private cloud services are the most widely adopted technologies driving current innovation, with 22 percent of global CIOs already using them to exploit core business activities.
The survey’s intensive look at innovation helped us identify some of the common traits of CIOs who are top innovators. What are they?
Here is what we learned. A CIO is more likely an innovator if he/she:
Reports to the CEO . . . . . . . . . . . . . . . . . . . . . . . . . . 25%
Has realistic aspirations to become a CEO . . . . . . . . 26%
Has had budget increases in the last year . . . . . . . . 18%
Is a member of the operational board . . . . . . . . . . . 22%
Is focused on competitive advantage vs. savings . . 29%
Finds his/her job very fulfilling . . . . . . . . . . . . . . . . . 24%
Invests heavily in team training . . . . . . . . . . . . . . . . 36%
Harvey Nash was even able to create categories of CIOs based on the survey results: Innovation CIOs and Utility CIOs. Innovation CIOs tend to focus the majority of their work on driving business strategy and competitive advantage, whereas Utility CIOs focus more on operational excellence, cost savings and increasing efficiency. So let’s take this poll. Which does your business have?
Would you describe the CIO of your business as an Innovation CIO or a Utility CIO?
– Innovation CIO
– Utility CIO
(Anna to explain results of poll as they come in)
And while Innovation CIOs and Utility CIOs may approach their roles differently, both groups today are keenly aware of the need to innovate and the vital leadership role they as CIOs must play in embracing new, innovative technologies in order to push the business forward. For example, the survey found that:
– Most CIOs believe that to achieve business objectives set by the CEO, technology innovation must be at the heart of their strategy.
– 74% of respondents believe that if they don’t innovate and embrace new technology, their companies will lose market share.
– Only one-third of CIOs (34 percent) think they have achieved anything close to their innovation potential.
So what led all of this innovation talk toward the cloud and big data debates? As you can see from this word cloud, cloud computing was mentioned the most by CIOs throughout the survey as a technology that would increase competitive advantage and support innovation efforts. At our CIO Forum events, cloud was always the number one topic of conversation, and I can say that at our last New York City CIO Forum it was almost exclusively the topic of discussion. Innovation initiatives and the cloud computing push are tightly linked today.
Clearly CIOs want to embrace their role in leading business innovation, and for some that has meant embracing the cloud. For others cloud computing is not the path they’ll be taking, yet the challenge of rapidly expanding data remains. Where will the data go? How can a business manage and leverage it for competitive advantage? Like the ever-expanding exabytes of data businesses around the world produce each day, the challenge of big data is a perplexing one that has CIOs both excited and anxious about their increasingly prominent role in business innovation.
It is with those two issues in mind, the growing role of CIOs in strategic business innovation and the growing challenge of big data, that we invited Doug Harr here to share his expertise. A CIO himself who has watched the role evolve while working on the bleeding edge of big data solutions, Doug offers us a unique look at how IT can provide smart answers to a complex and sweeping business challenge. With that said, I welcome Doug Harr, CIO of Splunk, and hand this Webinar over to him.
I am the CIO at Splunk, having had some success in this same role for the last 10 years at other software companies such as Ingres and Portal.
I started my career at Hewlett Packard after graduating from Cal Poly SLO.
In the 90’s I ran a consulting division for the midmarket that implemented Oracle Financials applications all over the western states at companies such as Symantec, Activision, and others.
We were the first partner for Oracle Business Online in 1998, when the vision there was to run a company’s financials just like a utility.
While this largely became a hosting business, others such as Salesforce, SuccessFactors, Netsuite and Workday followed.
I’ve been passionate about putting business applications in the cloud now for most of the last 10 years, to the point where we run all of our business apps in the cloud.
In my spare time, I am an avid music collector and concert goer.
I started my career before the term CIO existed, and have always marveled at how frequently we practitioners talk about just what a CIO is. How is the CIO role intended to be different than the “head of IT”? What is the CIO supposed to be doing? I believe our new CFO will sit and ponder that same question: what IS Doug doing?
CIO can and has meant many things over time besides “Chief Information Officer”.
I argue that today, the most important “I” in CIO is intelligence.
In fact I would propose that there are three main objectives for the new Cloud-market economy CIO…
– MANAGE SERVICES/TECHNOLOGY INVESTMENTS
– PROTECT INFORMATION ASSETS
– IMPROVE BUSINESS OUTCOMES
You need to bring more insight into your services, operations, customers, partners and ecosystem than your customers and lines of business can alone.
Harness the power of transactional business intelligence, along with mountains of machine data, to bring operational intelligence.
I don’t want to sound too “new age”, but several key transformations are needed to focus CIOs on a new delivery model.
We need to go from reporting on historical business transactions to ensuring customer experience in real time.
We need to go from answering “how did the business do last month?” to “how well does the business run?” across our sales channels, our customer support and our web presence.
There are three macro forces shaping the IT environment: big data, cloud, and Consumerization of IT. IT is positioned to deliver significant value to our customers so that you can benefit from these strong secular trends.
Cloud. Cloud is no longer a technology for someone else, or just for the early adopters. Growth in Cloud spend is outpacing on-premise spend (give the stats). Companies are realizing the efficiencies of scale by developing private clouds, by pursuing SaaS-based services, by pursuing development projects in the Cloud, and through a growing deployment of enterprise apps in the public Cloud.
Big Data! (If I read one more useless article about Big Data…) IT is perfectly positioned to enable core technology in the age of Big Data. Big Data is driving a resurgence in the B.I. market, as everyone is now asking “how do I analyze all that data to get value out of it?” A few examples:
– Telcos are analyzing CDRs for fraud analysis, churn analytics, rate plan and even tower/route optimization.
– Web commerce companies are using customer site traffic info to analyze product trends and marketing effectiveness.
– Software companies like Salesforce analyze customer usage patterns on Chatter so that they can build a better collaboration engine.
Consumerization of IT. If you had asked me five years ago whether there would be a secular shift in IT towards Consumerization, I would have said no. We were coming through five years of standardization, which was driving consolidation of the enterprise software space. Consolidation was impacting everything from applications to middleware to the data and storage level, and we were on a path towards the 5 giant companies that we know and love.
Mobile is a major factor in the Consumerization of IT. Just look at the xx M iPads that have been sold in the past year. Go to any board meeting, and all the (old) board members have their iPhones and iPads, and IT is forced to support them, along with RIMM and Android. And the iPad success is driving a renaissance of Macs in the corporate world.
This trend of mobile + tablets is putting more pressure on IT to support multiple platforms with instant access and high security.
The forces of consolidation are still very much in play at the data center, and server virtualization is fairly mature at this point. At the same time that scale is forcing consolidation, virtualization, and standardization at the data center, we are seeing explosion and fragmentation at the end user level. Now those same IT professionals have to deal with multiple devices, operating systems, and security issues for a fragmenting end user environment. One way that IT is approaching this new complexity is to abstract the desktop and/or mobile device through Desktop virtualization.
These are big forces altering the IT landscape: the Cloud coming of age; Big Data, which is happening but in its early stages, with machine data at the heart of it; and Consumerization of IT, driven by Mobile usage and now creating a new Virtualization market.
The amount of data in our world is exploding. There are more applications, devices, servers, machines. There are more layers of complexity: virtualization, cloud computing. 5 bn mobile phones in 2010. Anything from 65%-120% CAGR (depending on the report you read).
Couple of interesting comments from the analyst community: <read>
Most of the data being generated is machine data: the fastest growing, most complicated and most valuable segment of big data.
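To put the quoted 65%-120% CAGR range in perspective, here is a minimal arithmetic sketch. The five-year horizon and the function name `growth_multiple` are assumptions chosen for illustration; they do not come from any of the analyst reports mentioned.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Return how many times larger a quantity is after compounding.

    cagr is a fraction (0.65 means 65% per year); years is the horizon.
    """
    return (1.0 + cagr) ** years


# Assumed horizon of five years, purely for illustration.
for rate in (0.65, 1.20):
    multiple = growth_multiple(rate, 5)
    print(f"{rate:.0%} CAGR over 5 years -> {multiple:.1f}x the starting volume")
```

Under those assumptions, the low end of the range compounds to roughly 12 times the starting volume and the high end to roughly 52 times, which is why the choice of report matters so much.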
Machine data holds critical operational insight into user behavior, security risks, capacity consumption, peak usage times, fraudulent activity, customer experience and much more. Making use of machine data can provide significant benefits to nearly every enterprise. Here are a few examples:
– Transaction monitoring for online businesses providing 24x7 operations
– Web activity and web asset usage data to improve understanding of customers, capacity implications, and tracking of digital assets
– Service level monitoring information from managed service providers to help fulfill internal agreements with the business
– Call and event detail records to uncover keys to more profitable services for communications service providers
– Mobile data to better understand customer location and behaviors
– Monitoring social media networks to spot trends and analyze sentiment
In itself, the sheer volume of data is a global phenomenon. Data is growing exponentially. But the ‘bigness’ of the data is only part of the problem, and perhaps the least interesting part.
It’s not just the volume that is growing. More interestingly, the diversity and dynamic nature of the data are growing at an even faster rate, across cloud, mobile, logistics, manufacturing, power/utilities, healthcare and more. Over the past few decades the number of machines generating data has grown even faster than the data itself.
Traditional tools weren’t designed for this diversity and volume at a growing scale, at least not without extensive design, consulting and ongoing maintenance work. With enough time and money you can make anything work, but out of the box, handling both volume and diversity is challenging.
Volume and diversity are interesting, but the real trick is: how long does it take to get from when I think of a question to when I get an answer? And I don’t mean how long it takes to run a report; I mean from when I think of something totally new. Over the years, as data volume and diversity grow, the time the business allows for answering a question is dropping as fast as the data is growing.
It’s about more than just the BIGness: it’s the variety, the volume and the fact that the business needs answers faster.
Data is being created more rapidly than ever before. In terms of volume, Eric Schmidt puts it eloquently: “Between the birth of the world and 2003, 5 exabytes of data had been created. Now we create 5 exabytes every 2 days.” The majority of this is machine data.
Over time, IT has developed as “silos” of systems, focused on specific technologies, functions, departments, groups of systems and people, etc. As a consequence, IT ends up being managed as silos, with narrow, focused tools that provide a limited view of what’s really going on. What’s more, all IT systems in these silos generate data. This machine-generated IT data or “exhaust” contains a categorical record of behavior: behavior of customers, user transactions, networks, servers, applications, and more. This data helps diagnose and fix issues, but is also a source of critical intelligence for the business.
Even purported “Single Panes of Glass”, like SIEMs, Application Performance Management, Event Correlation and Analysis systems and Data Warehouses, don’t provide the complete picture, because they aren’t designed for the full scope of this data.
Today’s IT management tools, security solutions and even business intelligence systems are NOT designed to leverage the full scope of machine-generated IT data: data which is non-standard, unstructured, high volume and generated every millisecond of every day.
The common theme with logs is that they lack standards. Let’s look at logs generated by custom applications. Applications run in complex environments made up of diverse technologies adopted over time: client/server, revamped legacy, packaged applications, Java EE and .NET, SOA-based, etc.
Most homegrown and packaged applications write local logfiles, often via logging services built into middleware: J2EE application servers like WebLogic, WebSphere and JBoss, .NET, PHP and more. These files are critical for day-to-day debugging of production applications by developers and application support. They're also often the best way to report on business and user activity and detect fraud scenarios, since they have all the details of transactions. When developers put timing information into their log events, the logs can also be used to monitor and report on application performance.
However, log messages are unpredictable and highly variable even when coming from a single source. Custom applications can run new code, which results in entirely new log events at any time. Application log events often span multiple lines, an issue for most traditional log management solutions.
There is little to no visibility of applications running within virtualized environments or the cloud, and access restrictions on production systems limit the ability of Tier 1 help desk staff or developers to diagnose and fix problems.
This is a significant challenge for most traditional management tools, which rely on an upfront schema and on building custom parsers for known vendor formats, instead of supporting the dynamic nature of application logs. Splunk started first with this custom application logging challenge, designing an approach which indexes all IT data, regardless of format or location. Splunk doesn’t rely on traditional RDBMS technology, so there are no schemas which break as soon as new log formats are introduced. And unlike the majority of other log vendors, Splunk can also handle multi-line logs.
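To make the multi-line problem concrete, here is a minimal sketch of one common way to regroup log lines into events: treat any line that begins with a timestamp as the start of a new event and append every other line (a stack trace, for example) to the event before it. This is an illustrative heuristic under stated assumptions, not a description of how Splunk itself breaks events; the pattern `EVENT_START`, the function `group_events` and the sample log lines are all hypothetical.

```python
import re
from typing import Iterable, List

# Assumption for this sketch: every new event starts with a
# "YYYY-MM-DD HH:MM:SS" style timestamp. Real logs vary widely.
EVENT_START = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}")


def group_events(lines: Iterable[str]) -> List[str]:
    """Merge continuation lines into the event that precedes them."""
    events: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if EVENT_START.match(line) or not events:
            # A timestamped line (or the very first line) opens a new event.
            events.append(line)
        else:
            # Anything else is treated as a continuation of the last event.
            events[-1] += "\n" + line
    return events


# Hypothetical application log with a three-line error event.
sample = [
    "2011-10-27 10:15:02 ERROR OrderService failed to place order",
    "java.lang.IllegalStateException: inventory unavailable",
    "    at com.example.OrderService.place(OrderService.java:42)",
    "2011-10-27 10:15:03 INFO OrderService retry scheduled",
]

events = group_events(sample)
print(f"{len(sample)} lines grouped into {len(events)} events")
```

A line-at-a-time tool would see four unrelated records here; grouping first keeps the exception and its stack frame attached to the error that produced them.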
The lack of standards transcends all log sources, from custom applications to logs from packaged applications, servers and network devices.
There is literally an A-Z list of vendors, each generating a multitude of formats, even for the same device.
Timestamp formats are all over the map, entries are structurally different, and the data itself is different!
Traditional approaches to log management require custom parsers to translate this plethora of formats, types and sources, because they have been built and designed for a more static world of IT. Splunk indexes all your IT data, and getting data into Splunk is easy. You can:
– Stream data to Splunk over any TCP or UDP network port
– Read live files on any mounted file system
– Batch upload files to a spool directory on your Splunk server
– Run Splunk as a lightweight forwarder to collect many different sources of data and forward them over a single, secure network port to one or more Splunk servers
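The "timestamp formats all over the map" point can be illustrated with a small sketch that tries a list of known formats and converts whatever matches to UTC. The three formats, the `KNOWN_FORMATS` tuple and the `normalize` function are assumptions picked for illustration, and treating timestamps without an offset as UTC is a simplification. Every new source means another entry in the list, which is the maintenance burden the passage describes.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical sample of formats seen across sources; real environments
# have many more, which is exactly the maintenance problem.
KNOWN_FORMATS = (
    "%d/%b/%Y:%H:%M:%S %z",  # web access log style: 27/Oct/2011:10:15:02 -0700
    "%Y-%m-%dT%H:%M:%S%z",   # ISO 8601 with offset: 2011-10-27T17:15:02+0000
    "%Y-%m-%d %H:%M:%S,%f",  # application log style: 2011-10-27 17:15:02,000
)


def normalize(ts: str) -> Optional[datetime]:
    """Return the timestamp as an aware UTC datetime, or None if unknown."""
    for fmt in KNOWN_FORMATS:
        try:
            parsed = datetime.strptime(ts, fmt)
        except ValueError:
            continue  # Not this format; try the next one.
        if parsed.tzinfo is None:
            # Simplifying assumption: timestamps without an offset are UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    return None


for text in (
    "27/Oct/2011:10:15:02 -0700",
    "2011-10-27T17:15:02+0000",
    "2011-10-27 17:15:02,000",
    "yesterday around noon",
):
    print(text, "->", normalize(text))
```

All three recognized samples describe the same instant once normalized, while the last string falls through every parser and comes back as `None`.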
Logs contain the record of activity for IT components, such as applications, servers and network devices. They can tell you about your IT infrastructure and the behavior of your users and would-be attackers.
Log data is vital to maintain uptime, ensure system security and meet compliance mandates. Every environment has its unique footprint of IT data that can be leveraged.
Examples of critical activities that use logs:
– Monitoring for issues before customers’ services are impacted
– Troubleshooting IT problems
– Conducting security forensics investigations
– Reporting on compliance mandates
Logs also contain useful data for higher level analytics and reporting:
– Track patterns of usage for a particular service or user
– Provide service assurance and revenue assurance
– Trace complex patterns to enable detection of more sophisticated events
– Enable SLA tracking and reporting
From operational management to customer usage stats, as IT infrastructure becomes more dynamic, getting visibility into and making better use of your logs becomes indispensable.
Dashboards let you extend the power of your data to wherever it’s needed, by role and on an authenticated basis. With Splunk you can create custom dashboards in minutes with the dashboard editor and make more sense of the huge volumes of data at your disposal. Combine pre-defined searches, charts, alerts and reports into a powerful dashboard. Or create mashups with other Web-based Apps, such as Tivoli, SAP, security consoles and more. Now your management, security analysts, auditors, developers and sysadmins are all empowered to get the visibility, information and intelligence they need.
So what are the differences in the technologies?
We know about relational databases; a whole industry has been built up around these. They are the most widely used type of database today, used to store structured enterprise data, such as financial records, employee records, manufacturing, logistical information, etc. They are by design structured with rigid schemas. Schema changes can lead to broken functionality, so they introduce lengthy delays and risk when making changes.
Multidimensional databases are designed for analyzing large groups of records. The term OLAP (On-Line Analytical Processing) has become almost synonymous with multidimensional database: OLAP tools enable users to analyze different dimensions of multidimensional data. For example, it provides match computation of dense data, trend analysis, etc. Great for data mining and monthly reporting, but not for real-time events.
While most companies don’t realize it, machine data is the fastest growing, most complex and yet most valuable segment of big data. All websites, communications, networking and complex IT infrastructures generate massive streams of data every second of every day, in an array of unpredictable formats that are difficult to process and analyze by traditional methods or in a timely manner. Splunk focuses on analyzing large volumes of machine-generated data in underlying applications and systems, which includes application and system logs, network traffic, sensor data, click streams and other loosely structured information sources.
We collect data from any source, giving you real-time visibility and intelligence into what’s happening across your IT infrastructure – whether it’s physical, virtual or in the cloud. Splunk gives you one place to search, report on, analyze and visualize all this data.
Let’s not forget Splunk is an integrated, end-to-end solution. It collects and organizes your machine data into a NoSQL data fabric that can be searched, browsed, navigated, analyzed and visualized. Simply point Splunk at your data and start using it immediately.
Easy to deploy, easy to use. One person can download and implement Splunk in hours, rather than having a team of people take months or even years to deploy a solution. You can connect to your data in a few clicks and create powerful dashboards with a few more.
Never miss a thing. Search and analyze live streaming data and terabytes of historically indexed data from one place. Splunk automatically monitors your data for trends and specific patterns of activity or behavior, then immediately notifies the people who need to know.
Designed for novices and experts. Powerful search, drilldown and reporting capabilities meet the needs of novice users and expert analysts alike. Easy-to-create dashboards put critical insights from your machine data into the hands of the people who need them.
Splunk scales to support the data velocity and volumes of the largest and fastest growing companies in the world. These companies Splunk massive amounts of new data a day and perform historical searches over upwards of 1PB of data to support a myriad of use cases ranging from real-time error detection to business analytics.
Since June 2006, more than 3,000 users in over 70 countries have purchased the enterprise license of Splunk (as of Q3 2010). This includes almost half of the Fortune 100. Enterprises, service providers and government agencies in 78 countries use Splunk to improve service levels, reduce IT operations costs, mitigate security risks and drive new levels of operational visibility.
As they gain new visibility into their real-time and historical machine data, Splunk’s customers are finding answers and solving the most challenging issues facing IT and the business.
Salesforce.com has been named not only one of the fastest growing companies by both Fortune and Forbes, but also one of the top 100 companies to work for. Founded in 1999, Salesforce.com has succeeded by being first to define and establish new products and services, forcing competitors to play catch-up. Beginning with its on-demand CRM solution, Salesforce.com has evolved into one of the top providers of cloud computing services for enterprises of all sizes. Splunk is helping Salesforce.com dramatically reduce troubleshooting time, free developers to focus on building new features, deliver insight into product and feature usability, and reduce the cost of monitoring infrastructure. Splunk is used all across Salesforce.com, but of particular note is the success of monitoring the Chatter campaign, which led directly to a range of Splunk-powered dashboards that provide executives with product metrics and key performance indicators for all products. Dashboards are used to determine, for instance, which product features are adopted most readily, where additional customer training may be required, and to determine accurate counts of unique users.
NPR, the award-winning multimedia news organization reaching 26.8 million listeners per week, uses Splunk to provide visibility into and analysis of its digital asset infrastructure.
NPR lets users consume content online, but had no way to measure the popularity of programs (when were different assets streamed, how many concurrently, etc.), reconcile royalty payments from a digital rights perspective, measure abandonment rates, etc.
After initially implementing Splunk as a radically faster way to troubleshoot issues, the Splunk power user realized Splunk gave her incredible visibility beyond their web analytics system. Sondra from NPR implemented Splunk in a few days to get the visibility their business needed, compared to 6 months of effort with the current market-leading web analytics system.
Today, Cricket Wireless is the mobile phone carrier that serves more than 5.4 million customers across the U.S. Cricket delivers leading-edge mobile phones, nationwide, high-quality coverage, and all the latest features such as mobile web, downloadable games, popular ringtones, wallpapers and more.
For Cricket, Splunk = unparalleled visibility. “Splunk significantly decreases downtime; we’re getting to resolution 40-50% faster.”
Further details:
Feeding Splunk:
– F5 Middleware (jBoss, Tibco); custom scripts for JMX metrics
– Front end apps
– Oracle logs
– Web metrics
– Firewalls
– VMware
– Windows and Linux Servers
– 150 LW forwarders
Cricket felt they were “chasing ghosts”:
– It was impossible to provide end-to-end visibility
– Difficult to troubleshoot apps
– Zero real-time visibility
– Grepping through 100s of logs to find errors
– No red flags and suddenly our services are down
For Cricket, Splunk = Visibility:
– Can now track transactions across the infrastructure
– Can now trend on errors
– Can now solve problems before customers see errors or feel pain
– Can now give individual teams specific permissions to view specific data with specific views and dashboards
Becoming a next-generation digital enterprise means generating a greater percentage of enterprise revenue via information and Internet technologies. This contrasts with the first wave of the digital revolution, which measured how digital an enterprise was based on its Web presence. By the new definition, most enterprises have much work to do before they can become fully digitized (see figure below). They need to become digital from the front office to the back office, which gives them the opportunity to reimagine IT as the center of the next digital revolution.