SUM TWO is making 'serious investments' in big data, cloud, mobility!

“Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze.” Another definition puts it the following way: “Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn't fit the strictures of your database architectures.” These correspond to the 3 Vs of big data: volume, velocity and variety.

Apache Hadoop is 100% open source, and pioneered a fundamentally new way of storing and processing data. Instead of relying on expensive, proprietary hardware and different systems to store and process data, Hadoop enables distributed parallel processing of huge amounts of data across inexpensive, industry-standard servers that both store and process the data, and can scale without limits. With Hadoop, no data is too big. And in today’s hyper-connected world where more and more data is being created every day, Hadoop’s breakthrough advantages mean that businesses and organizations can now find value in data that was recently considered useless.

Hadoop’s cost advantages over legacy systems redefine the economics of data. Legacy systems, while fine for certain workloads, simply were not engineered with the needs of Big Data in mind and are far too expensive to be used for general purposes with today's largest data sets.

One of the cost advantages of Hadoop is that, because it relies on an internally redundant data structure and is deployed on industry-standard servers rather than expensive specialized data storage systems, you can afford to store data that it was not previously viable to keep. And we all know that once data is on tape, it’s essentially the same as if it had been deleted - accessible only in extreme circumstances.

Make Big Data the Lifeblood of Your Enterprise
With data growing so rapidly and the rise of unstructured data accounting for 90% of the data today, the time has come for enterprises to re-evaluate their approach to data storage, management and analytics. Legacy systems will remain necessary for specific high-value, low-volume workloads, and will complement the use of Hadoop, optimizing the data management structure in your organization by putting the right Big Data workloads in the right systems. The cost-effectiveness, scalability and streamlined architectures of Hadoop will make the technology more and more attractive. In fact, the need for Hadoop is no longer a question.
1. BIGDATA
Decisions! Delivered !!
SARAVANAN . M
SALES MANAGER
21ST NOV 2013
Index:
1. What is BIGDATA?
2. BIGDATA Analytics
3. Characteristics of BIGDATA
4. Attributes of BIGDATA
5. Examples of BIGDATA
6. Size of BIGDATA
7. BIGDATA landscape
8. Industries using BIGDATA
9. Technologies Used
10. HADOOP
11. When should we go for HADOOP
12. Advantages of BIGDATA
13. Risks of BIGDATA
2. WHAT IS BIGDATA ?
Bigdata is a term that describes large volumes of high-velocity, complex and
variable data that require advanced techniques and technologies to enable the
capture, storage, distribution, management and analysis of information.
Bigdata is data that exceeds the processing capacity of conventional
database systems.
The data is too big, moves too fast, or doesn’t fit the structures of your
database architecture.
To gain value from this data, you must choose an alternative way to process
it.
3. BIGDATA ANALYTICS :
Bigdata analytics is the process of examining and interrogating big data
assets to derive insights of value for decision making.
4. CHARACTERISTICS OF BIGDATA
The word “big” in bigdata is not just about volume. It is about the
3 V's.
They are:
Volume
Velocity
Variety.
5. ATTRIBUTES OF BIGDATA :
Volume – the huge amount of digital data created by all sources:
companies, individuals and devices. (What constitutes “big” varies by
perspective and will certainly change over time.)
Velocity – the speed at which data is created, which in turn drives interest
in real-time analytics and automated decision-making.
Variety – comes from the increasing types of data: some structured, as in
databases; much of it unstructured text or video; and some semi-structured,
like social media data, location-based data, and log-file data.
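The three kinds of data named under Variety can be illustrated with a small sketch. The sample values below (customer row, login event, review text) are made up for illustration and do not come from the slides; the point is only that each kind needs a different way of reading it.

```python
import csv
import io
import json

# Structured: fixed columns, as in a database table (hypothetical sample row).
structured = "customer_id,amount\n1001,250.00\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing but flexible fields, as in a log file
# or a social media record (hypothetical sample event).
semi_structured = '{"user": "alice", "event": "login", "location": {"lat": 13.08, "lon": 80.27}}'
record = json.loads(semi_structured)

# Unstructured: free text with no schema; without further analysis
# all we can do is split it into tokens.
unstructured = "Great service, will order again!"
tokens = unstructured.lower().split()

print(rows[0]["amount"])           # value looked up by column name
print(record["location"]["lat"])   # value looked up by nested key
print(len(tokens))                 # only a token count is available
```

Structured data can be queried by column, semi-structured data by key, while unstructured data needs extra processing before it yields anything comparable.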
6. EXAMPLES OF BIGDATA
Sensor networks
Social networks
Internet search index
Astronomy
Internet text and documents
Large scale e-commerce
Weblogs and video archives
Medical records and call detail records, etc.
7. SIZE OF BIGDATA?
Google :
24PB data processed daily.
Facebook:
750 million users
12TB daily content
2.7 billion “likes” and “comments”.
Twitter:
340 million daily tweets
1.6 billion search queries
7TB added daily.
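To get a feel for these daily figures, they can be converted to per-second rates. This is a rough sketch that assumes decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and an even spread of load over the day, neither of which the slides state; the function and variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Assumption: decimal (SI) units, not binary ones.
TB = 10**12
PB = 10**15

def per_second(amount_per_day):
    """Convert a daily total to an average per-second rate."""
    return amount_per_day / SECONDS_PER_DAY

google_bytes_per_s = per_second(24 * PB)     # 24PB processed daily
facebook_bytes_per_s = per_second(12 * TB)   # 12TB daily content
twitter_bytes_per_s = per_second(7 * TB)     # 7TB added daily
tweets_per_s = per_second(340_000_000)       # 340 million daily tweets

print(f"Google:   {google_bytes_per_s / 10**9:.1f} GB/s")
print(f"Facebook: {facebook_bytes_per_s / 10**6:.1f} MB/s")
print(f"Twitter:  {twitter_bytes_per_s / 10**6:.1f} MB/s, {tweets_per_s:.0f} tweets/s")
```

Under these assumptions the Google figure works out to roughly 278 GB every second, and the Twitter figure to almost 4,000 tweets every second.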
9. INDUSTRIES THAT ARE USING BIGDATA:
Banking
  Risk & Fraud management
  Customer Analytics
Telecommunications
  Call detail record processing
  Customer profile
Health care
  Medical Record text analytics
  Genomic Analysis
Digital Media
  Real-time ad targeting
  Website analysis
Government
  Abuse & Fraud management
  Customer Analytics
10. TECHNOLOGY:
Bigdata is driven mainly by Open Source initiatives such as:
Apache™ HADOOP Project
Apache™ CASSANDRA Project
Apache™ HBASE Project
Apache™ HIVE Project
Apache™ SOLR Project
11. HADOOP :
What is Hadoop?
A flexible infrastructure for large-scale computation and data processing on a network of commodity hardware.
Hadoop is written mainly in Java.
Hadoop is open source and is distributed under the Apache license.
Hadoop is not:
A file system or a database.
A replacement for existing data warehouse systems, nor for all programming logic.
An On-Line Transaction Processing (OLTP) system.
12. WHEN SHOULD WE GO FOR HADOOP?
When the data is too large for conventional systems
When the processes are independent
For online analytical processing (OLAP)
For better scalability
For unstructured data
Also for parallelism
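Why independent processes and parallelism matter can be shown with a small map-and-reduce word count. This is plain Python on one machine, not Hadoop's API: the chunks stand in for blocks of a large file, and a thread pool stands in for a cluster of servers. All names here are illustrative.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def map_chunk(chunk):
    """Map step: count words in one chunk, independently of all other chunks."""
    return Counter(chunk.lower().split())

def merge_counts(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right

def word_count(chunks, workers=4):
    # Each chunk needs nothing from the others, so the map step can run
    # in parallel; only the final merge needs all partial results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(merge_counts, partials, Counter())

# Hypothetical sample data standing in for blocks of a much larger file.
chunks = ["big data big", "data moves fast", "big insights"]
totals = word_count(chunks)
print(totals.most_common(2))
```

Because no chunk depends on another, adding more workers (or, in Hadoop's case, more servers) lets more chunks be processed at the same time. Work where each step depends on the previous one does not split this way.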
13. -- BACK TO BIGDATA --
ADVANTAGES:
Large and fast-growing market
Leveraging bigdata for insights can enhance productivity and
competitiveness for companies
Harnessing bigdata will enable businesses to improve market intelligence
Latest trend for IT professionals in the area of data analytics
14. RISKS OF BIGDATA:
Organizations can be overwhelmed by the data
  Need the right people solving the right problems
Technological considerations
  Open source
  Scalability and performance issues
Many sources of bigdata raise privacy concerns
  Self regulation
  Legal regulation