As eCommerce Sites Harvest Big Data, They Mature the
Value from Transactional Benefits to Managing Multiple
Data Sets Across Cloud Models
Transcript of a Briefings Direct discussion on how HP Vertica helps a big-data consultancy in its
relationship to a variety of enterprises.
Listen to the podcast. Find it on iTunes. Get the mobile app for iOS or Android.
Sponsor: HP Enterprise
Dana Gardner: Hello, and welcome to the next edition of the HP Discover Podcast Series. I'm
Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator
for this ongoing sponsored discussion on IT innovation and how it’s making an
impact on people’s lives.
Once again, we're focusing on how companies are adapting to the new style of IT
to improve IT performance and deliver better user experiences, as well as better
business results.
Our next innovation user interview highlights how a consultant is helping big organizations
better manage their big data and provide the insights that they need to thrive in the fast-paced
Digital eCommerce Environment.
Become a member of myVertica today
Register now
Gain access to the free HP Vertica Community Edition
With that, please join me in welcoming our guest. We are here with Jimmy Mohsin. He is the
Principal Software Architect at Norjimm LLC, a consultancy based in Princeton, New Jersey.
Welcome, Jimmy.
Jimmy Mohsin: Thank you, Dana. How are you?
Gardner: We've been hearing an awful lot about some extraordinary situations where the fast-paced
environment and data volumes that users are dealing with have left them with a need for a
much better architecture.
Tell me what you are seeing in the marketplace. How desperate are people to find the right
architecture now that big data is upon them?
Mohsin: There's a lot of interest in trying to deal with large data volumes, and not only large
volumes, but also data that changes rapidly. Many companies have very large
datasets, some in terabytes, some in petabytes, and then they're getting live feeds.
The data is there and it’s changing rapidly. The traditional databases sometimes can’t handle that
problem, especially if you're using that database as a warehouse and you're reporting against it.
Basically, we have kind of a moving-target situation. With Vertica, what we've seen is the ability
to solve that problem in at least some of the cases that I've come across, and I can talk about
specific use cases in that regard.
Input/output issues
Gardner: Before we get into a specific use case, I'm interested particularly in some of these
input/output issues. People are trying to decide how to move the data around. They're toying with
cloud. They're trying to bring data from more types of traditional repositories. And, as you say,
they're facing new types of data problems with streaming and real-time feeds.
How do you see them beginning this process when they have to handle so many variables? Is it
something that’s an IT architecture, or enterprise architecture, or data architecture? Who's
responsible for this, given that it’s now a rather holistic problem?
Mohsin: In my present project, we ran into that. The problem is that many companies don't even
have a well-defined data-architecture team. Some of them do. You'll find a lot of
companies with an enterprise-architect role and you'll have some companies
with a haphazard definition of an architectural group.
Net-net, at least at this point, unless companies are more structured, it becomes
a management issue in the sense that someone at the leadership level needs to
know who has what domain knowledge and then form the appropriate team to
skin this cat.
I know of a recent situation where we had to build a team of four people, and only one was an
architect. But we built a virtual team of four people who were able to assemble and collate all the
repositories that spanned 15 years and four different technology flavors, and then come up with
an approach that resulted in a single repository in Vertica.
So there are no easy answers yet, because organizations just aren't uniformly structured.
Gardner: Well, I imagine they'll be adapting, just like we all are, to the new realities. In the
meantime, tell me about a specific use case that demonstrates the intensity of scale and velocity,
and how at least one architecture has been deployed to manage that?
Mohsin: One of my present projects deals with one of the world's largest retailers. It's
eCommerce, online selling. One of the things they do, in addition to their transactions of buying
and selling, is email campaign management. That means staying in touch with the customer on
the basis of their purchases, their interests, and their profiles.
One of the things we do is see what a certain customer’s buying preferences have been over the
past 90 days. Knowing that and the customer’s profile, we can try to predict what
their buying patterns will be. So we send them a very tailored message in that
regard. In this project, we're dealing with about 150 to 160 million emails a day.
So this is definitely big data.
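As a rough illustration of the 90-day look-back described here, a minimal Python sketch follows. The record layout (customer ID, product category, purchase date), the function name, and the sample data are all assumptions made for illustration; they are not details of the project discussed.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta


def recent_preferences(purchases, today, window_days=90):
    """Return each customer's most-purchased category within the window.

    `purchases` is an iterable of (customer_id, category, purchase_date)
    tuples. This layout is hypothetical.
    """
    cutoff = today - timedelta(days=window_days)
    counts = defaultdict(Counter)
    for customer_id, category, purchased_on in purchases:
        # Keep only purchases that fall inside the look-back window.
        if cutoff <= purchased_on <= today:
            counts[customer_id][category] += 1
    # most_common(1) yields [(category, count)]; keep just the category.
    return {cid: c.most_common(1)[0][0] for cid, c in counts.items()}


# Hypothetical sample data.
purchases = [
    ("c1", "shoes", date(2015, 5, 1)),
    ("c1", "shoes", date(2015, 5, 20)),
    ("c1", "books", date(2015, 4, 2)),
    ("c2", "toys", date(2014, 12, 1)),  # older than 90 days, ignored
    ("c2", "books", date(2015, 6, 1)),
]
prefs = recent_preferences(purchases, today=date(2015, 6, 15))
```

A real campaign system would combine a signal like this with the customer's profile before tailoring a message; the sketch covers only the windowed aggregation step.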
Here we have online information coming into one warehouse as to what's
happening in the world of buying and selling. Then, behind the scenes, while that
information is being sent to the warehouse, we're trying to do these email campaigns.
This is where the problem becomes fairly complicated. We tried traditional relational database
management systems (RDBMS), and they kind of worked, but we ran into a slew of speed and
performance issues. That's where the big-data world was really beneficial. We were able to
address that problem in about a seven-month project that we ran.
Gardner: And this was using Vertica?
Large organization
Mohsin: We did an evaluation. We looked at a few databases, and the corporate choice was
Vertica. We saw that there are a whole bunch of big-data vendors. The issue is that many of the
vendors don't have any large organizations behind them, and Vertica does. The company
management felt that this was a new big database, but HP was behind it, and the fact that they
also use HP hardware helped a lot.
They chose Vertica. The team I was managing did a proof of concept (POC) and we were able to
demonstrate that Vertica would be able to handle the reporting that is tied to the email campaign
management. We ran a 90-day POC, and the results were so positive that there was an interest in
going live. We went live about another 90 days after that.
Gardner: I understand that Vertica is quite versatile. I've heard of a number of ways in which it's
used technically. But this email campaign problem almost sounds like a transactional issue, a
complex event processing issue, or a transfer agent scaling issue. How do big data, Vertica, and
analytics come to bear on this particular problem?
Mohsin: It's exactly what you say it is. As we are reporting and pushing out the campaigns, new
information is coming in every half hour, sometimes even more frequently. There's a live feed
that's updating the warehouse. While the warehouse is being updated, we want to report against it
in real time and keep our campaigns going.
The key point is that we can't really stop any of these processes. The customers who are
managing the campaigns want to see information very frequently. We can’t even predict when
they would want their information. At the same time, the transactional systems are sending us
live feeds.
The problem we ran into with the traditional RDBMS is that the reporting didn't function when
the live feeds were underway. We couldn't run our backend email campaign reports when new
data was coming in.
One of the benefits Vertica has, due to its basic architecture and its columnar design, is that it's
better positioned to do that. This is what we were able to demonstrate in the live POC, because
nobody was going to take our word for it.
The end user said, "Take a few of our largest clients. Take some of our clients that have a lot of
transactions. Prove that the reports will work for those clients." That's what we did in 30 days.
Then, we extended it, and then in 90 days, we demonstrated the whole thing end to end.
Following that was the go-live.
Gardner: You had to solve that problem of the live feeds, the rapidity of information. Rather
than going to a stop, batch-process, analyze, repeat cycle, you've gained a solution to your problem.
But at the same time, it seems like you're getting data into an environment where you can
analyze it and perhaps extract other forms of analysis, in addition to solving your email,
eCommerce trajectory issues. It seems to me that you're now going to have the opportunity to
add a new dimension of analysis to what's going on and perhaps refine these transactions more
towards a customer inference benefit.
More than a database
Mohsin: One of the things internally that I like to say is that Vertica isn't just a big database,
it’s more than just a database. It's really a platform, because you have Distributed R and you are
publishing other tools. When we adopted it and went live with this technology, we first solved
the feeds and speeds problem, but now we're very much positioned to use some of the
capabilities that exist in Vertica.
Distributed R is one of them and inference analysis is another, so that we can
build intelligent reports. To date, we've been building those outside the RDBMS. RDBMS has no
role in that. With Vertica, I call it more of a data platform. So we definitely will go there, but that
would be our second phase.
As the system starts to function and deliver on the key use cases, the next stage would be to build
more sophisticated reports. We definitely have the requirements and now we have the ability to
deliver.
Gardner: Perhaps you could add visualization capabilities to that. You could make a data pool
available to more of the constituents within this organization so that they could innovate and do
experiments. That’s very powerful stuff indeed.
Is there anything else you can tell us for other organizations that might be facing similar issues
around real-time feeds and the need to analyze and react, now that you have been through this on
this particular project? Are there any lessons learned for others?
If you're facing transactional issues and you haven't thought about a big-data platform as part of
that solution, what do you offer to them in terms of maybe lighting a light bulb in their mind
about looking for alternatives to traditional middleware?
Mohsin: Like so many people try to do, we tried to see if anyone else had done this. One of the
issues in big data, at least today, is that you can’t find a whole slew of clients who have already
gone live and who are in production.
There are lots of people in development, and some are live, but in our space, we couldn't find
anyone who was live. We solved that issue via a quick-hit POC. The big lesson there was that we
scoped the POC right. We didn’t want to do too much and we didn’t want to do too little. So that
was a good lesson learned.
The other big thing is the data-migration question. Maybe, to some extent, this problem will
never be solved. It's not so easy to pull data out of legacy database systems. Very few of them
will give you good tools to migrate away from them. They all want you to stay. So we had to
write our own tooling. We scoured the market for it, but we couldn’t find too many options out
there.
Understand your data
So a huge lesson learned was, if you really want to do this, if you want to move to big data, get
a handle on understanding your data. Make sure you have the domain experts in-house. Make
sure you have the tooling in place, however rudimentary it might be, to be able to pull the data
out of your existing database. Once you have it in the file system, Vertica can take it in minutes.
That’s not the problem. The problem is getting it out.
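The "pull the data out" step can be pictured as a small export script. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for a legacy database, and the table name, columns, and `export_table` helper are hypothetical. It is not the tooling the team wrote.

```python
import csv
import os
import sqlite3
import tempfile


def export_table(conn, table, out_path, chunk_size=10_000):
    """Dump one table to a CSV file, fetching rows in chunks.

    `table` must be a trusted identifier; it is interpolated into the SQL.
    """
    cur = conn.cursor()
    cur.execute(f'SELECT * FROM "{table}"')
    header = [col[0] for col in cur.description]
    rows_written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        while True:
            # fetchmany keeps memory bounded for large tables.
            chunk = cur.fetchmany(chunk_size)
            if not chunk:
                break
            writer.writerows(chunk)
            rows_written += len(chunk)
    return rows_written


# Stand-in "legacy" database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "c1", 19.99), (2, "c2", 5.0), (3, "c1", 42.5)],
)
out_path = os.path.join(tempfile.mkdtemp(), "orders.csv")
exported = export_table(conn, "orders", out_path, chunk_size=2)
```

Once a file like this sits on the file system, it could be handed to a bulk loader such as Vertica's COPY statement. The harder part, as noted above, is usually the extraction from the legacy system, where data types, encodings, and volume need domain knowledge.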
We continue to grapple with that and we have made product enhancement recommendations. But
in fairness to Vertica, this is really not something that Vertica can do much about, because this is
more in the legacy database space.
Gardner: I've heard quite a few people say that, given the velocity with which they are seeing
people move to the cloud, that obviously isn't part of their problem, as the data is already in the
cloud. It's in the standardized architecture that that cloud is built around. If there is a
platform-as-a-service (PaaS) capability, then getting at the data isn't so much of a problem. Or am I not
reading that correctly?
Mohsin: No, you're reading that correctly. The problem we have is that a lot of companies are
still not in the cloud. There is still a lingering fear of the cloud. People will tell you that the cloud
is not secure. If you have customer information, if you have personalized data, many
organizations don't want to put it in the cloud.
Slowly, they are moving in that direction. If we were all there, I would completely agree with
you, but since we still have so many on-premise deployments, we're still in a hybrid mode --
some is on-prem, some is in the cloud.
Gardner: I just bring it up because it gives yet another reason to seriously consider cloud. It’s a
benefit that is actually quite powerful -- the data access and ability to do joins and bring datasets
together because they're all in the same cloud.
Mohsin: I fundamentally agree with you. I fundamentally believe in the cloud and that it really
should be the way to go. Going through our very recent go-live, there is no way we could have
the same elasticity in an on-prem deployment that we can have in a cloud. I can pick up the
phone, call a cloud provider, and have another machine the next day. I can't do that if it’s
on-premise.
Again, a simple question of moving all the assets into the cloud, at least in some organizations,
will take several months, if not years.
Gardner: Very good. I'm afraid we will have to leave it there. We have been discussing how a
specific enterprise in the eCommerce space has solved some unique problems using big data and,
in particular, the HP Vertica platform.
That sets the stage for a wider use of big data for transactional problems and live-feed issues. It
also shows why moving to the cloud has some potential benefits for speed, velocity, and dexterity
when it comes to data across multiple data sources and implementations.
So with that, a big thank you to our guest. We've been joined by Jimmy Mohsin, Principal
Software Architect at Norjimm LLC, a consultancy based in Princeton, New Jersey. Thanks,
Jimmy.
Mohsin: Thanks, Dana. Have a great day.
Gardner: And a big thank you to our audience as well, for joining us for this special new style of
IT discussion.
I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of
HP sponsored discussions. Thanks again for listening, and come back next time.
Transcript of a Briefings Direct discussion on how HP Vertica helps a big-data consultancy in its
relationship to a variety of enterprises. Copyright Interarbor Solutions, LLC, 2005-2015. All
rights reserved.
You may also be interested in:
 • Full 360 takes big data analysis cloud services to new business heights
 • HP hyper-converged appliance delivers speedy VDI and apps deployment and a direct onramp to hybrid cloud
 • Enterprises opting for converged infrastructure as stepping stone to hybrid cloud
 • How big data technologies Hadoop and Vertica drive business results at Snagajob
 • Zynga builds big data innovation culture by making analytics open to all developers
 • How big data powers GameStop to gain retail advantage and deep insights into its markets
 • Data-driven apps performance monitoring spurs broad business benefits for Swiss insurer and Turkish mobile carrier
 • How Malaysia’s Bank Simpanan Nasional implemented a sweeping enterprise content management system
 • Redcentric Uses Advanced Configuration Database to Focus Massive Merger Across Multiple Networks
 • HP at Discover delivers the industry's first open, hybrid, ecosystem-wide cloud architecture
 • How Tableau Software and Big Data Come Together: Strong Visualization Embedded on an Agile Analytics Engine
 • Big Data Helps Conservation International Proactively Respond to Species Threat in Tropical Forests
 • How Globe Testing helps startups make the leap to cloud- and mobile-first development
 • GoodData analytics developers on what they look for in a big data platform

ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
Unlocking the Power of ChatGPT and AI in Testing - A Real-World Look, present...
 

As eCommerce Sites Harvest Big Data, They Mature the Value from Transactional Benefits to Managing Multiple Data Sets Across Cloud Models

  • 1. As eCommerce Sites Harvest Big Data, They Mature the Value from Transactional Benefits to Managing Multiple Data Sets Across Cloud Models
Transcript of a Briefings Direct discussion on how HP Vertica helps a big-data consultancy in its relationship to a variety of enterprises.
Listen to the podcast. Find it on iTunes. Get the mobile app for iOS or Android.
Sponsor: HP Enterprise
Dana Gardner: Hello, and welcome to the next edition of the HP Discover Podcast Series. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing sponsored discussion on IT innovation and how it’s making an impact on people’s lives. Once again, we're focusing on how companies are adapting to the new style of IT to improve IT performance and deliver better user experiences, as well as better business results. Our next innovation user interview highlights how a consultant is helping big organizations better manage their big data and provide the insights that they need to thrive in the fast-paced digital eCommerce environment. With that, please join me in welcoming our guest. We are here with Jimmy Mohsin. He is the Principal Software Architect at Norjimm LLC, a consultancy based in Princeton, New Jersey. Welcome, Jimmy.
Jimmy Mohsin: Thank you, Dana. How are you?
Gardner: We've been hearing an awful lot about some extraordinary situations where the fast-paced environment and data volumes that users are dealing with have left them with a need for a much better architecture. Tell me what you are seeing in the marketplace. How desperate are people to find the right architecture now that big data is upon them?
Mohsin: There's a lot of interest in trying to deal with large data volumes, and not only large data volumes, but also data that changes rapidly.
Now, there are many companies that have very large datasets, some in terabytes, some in petabytes, and then they're getting live feeds.
  • 2. The data is there and it’s changing rapidly. The traditional databases sometimes can’t handle that problem, especially if you're using that database as a warehouse and you're reporting against it. Basically, we have kind of a moving-target situation. With Vertica, what we've seen is the ability to solve that problem in at least some of the cases that I've come across, and I can talk about specific use cases in that regard.
Input/output issues
Gardner: Before we get into a specific use case, I'm interested particularly in some of these input/output issues. People are trying to decide how to move the data around. They're toying with cloud. They're trying to bring data from more types of traditional repositories. And, as you say, they're facing new types of data problems with streaming and real-time feeds. How do you see them beginning this process when they have to handle so many variables? Is it something that’s an IT architecture, or enterprise architecture, or data architecture? Who's responsible for this, given that it’s now a rather holistic problem?
Mohsin: In my present project, we ran into that. The problem is that many companies don't even have a well-defined data-architecture team. Some of them do. You'll find a lot of companies with an enterprise-architect role, and you'll have some companies with a haphazard definition of an architectural group. Net-net, at least at this point, unless companies are more structured, it becomes a management issue in the sense that someone at the leadership level needs to know who has what domain knowledge and then form the appropriate team to skin this cat. I know of a recent situation where we had to build a team of four people, and only one was an architect. But we built a virtual team of four people who were able to assemble and collate all the repositories that spanned 15 years and four different technology flavors, and then come up with an approach that resulted in a single repository in Vertica.
So there are no easy answers yet, because organizations just aren't uniformly structured.
Gardner: Well, I imagine they'll be adapting, just like we all are, to the new realities. In the meantime, tell me about a specific use case that demonstrates the intensity of scale and velocity, and how at least one architecture has been deployed to manage that.
Mohsin: One of my present projects deals with one of the world's largest retailers. It's eCommerce, online selling. One of the things they do, in addition to their transactions of buying and selling, is email campaign management. That means staying in touch with the customer on the basis of their purchases, their interests, and their profiles.
  • 3. One of the things we do is see what a certain customer’s buying preferences have been over the past 90 days. Knowing that and the customer’s profile, we can try to predict what their buying patterns will be. So we send them a very tailored message in that regard. In this project, we're dealing with about 150 to 160 million emails a day. So this is definitely big data. Here we have online information coming into one warehouse as to what's happening in the world of buying and selling. Then, behind the scenes, while that information is being sent to the warehouse, we're trying to do these email campaigns. This is where the problem becomes fairly complicated. We tried traditional relational database management systems (RDBMS), and they kind of worked, but we ran into a slew of speed and performance issues. That's where the big-data world was really beneficial. We were able to address that problem in about a seven-month project that we ran.
Gardner: And this was using Vertica?
Large organization
Mohsin: We did an evaluation. We looked at a few databases, and the corporate choice was Vertica. We saw that there are a whole bunch of big-data vendors. The issue is that many of the vendors don't have any large organizations behind them, and Vertica does. The company management felt that this was a new big database, but HP was behind it, and the fact that they also use HP hardware helped a lot. They chose Vertica. The team I was managing did a proof of concept (POC), and we were able to demonstrate that Vertica would be able to handle the reporting that is tied to the email campaign management. We ran a 90-day POC, and the results were so positive that there was an interest in going live. We went live in about another 90 days, following the 90-day POC.
Gardner: I understand that Vertica is quite versatile. I've heard of a number of ways in which it's used technically.
But this email campaign problem almost sounds like a transactional issue, a complex event processing issue, or a transfer agent scaling issue. How do big data, Vertica, and analytics come to bear on this particular problem?
Mohsin: It's exactly what you say it is. As we are reporting and pushing out the campaigns, new information is coming in every half hour, sometimes even more frequently. There's a live feed that's updating the warehouse. While the warehouse is being updated, we want to report against it in real time and keep our campaigns going.
  • 4. The key point is that we can't really stop any of these processes. The customers who are managing the campaigns want to see information very frequently. We can’t even predict when they would want their information. At the same time, the transactional systems are sending us live feeds. The problem we ran into with the traditional RDBMS is that the reporting didn't function when the live feeds were underway. We couldn't run our backend email campaign reports when new data was coming in. One of the benefits Vertica has, due to its basic architecture and its columnar design, is that it's better positioned to do that. This is what we were able to demonstrate in the live POC, and nobody was going to take our word for it. The end user said, "Take a few of our largest clients. Take some of our clients that have a lot of transactions. Prove that the reports will work for those clients." That's what we did in 30 days. Then, we extended it, and then in 90 days, we demonstrated the whole thing end to end. Following that was the go-live.
Gardner: You had to solve that problem of the live feeds, the rapidity of information. Rather than going to a stop, batch process, analyze, repeat, you've gained a solution to your problem. But at the same time, it seems like you're getting data into an environment where you can analyze it and perhaps extract other forms of analysis, in addition to solving your email, eCommerce trajectory issues. It seems to me that you're now going to have the opportunity to add a new dimension of analysis to what's going on and perhaps we find these transactions more towards a customer inference benefit.
More than a database
Mohsin: One of the things internally that I like to say is that Vertica isn't just a big database; it’s more than just a database. It's really a platform, because you have Distributed R, and you are publishing other tools.
When we adopted it and went live with this technology, we first solved the feeds and speeds problem, but now we're very much positioned to use some of the capabilities that exist in Vertica. We had Distributed R being one of them, Inference Analysis being another one, so that we can build intelligent reports. To date, we've been building those outside the RDBMS. RDBMS has no role in that. With Vertica, I call it more of a data platform. So we definitely will go there, but that would be our second phase. As the system starts to function and deliver on the key use cases, the next stage would be to build more sophisticated reports. We definitely have the requirements and now we have the ability to deliver.
  • 5. Gardner: Perhaps you could add visualization capabilities to that. You could make a data pool available to more of the constituents within this organization so that they could innovate and do experiments. That’s very powerful stuff indeed. Is there anything else you can tell us for other organizations that might be facing similar issues around real-time feeds and the need to analyze and react, now that you have been through this on this particular project? Are there any lessons learned for others? If you're facing transactional issues and you haven't thought about a big-data platform as part of that solution, what do you offer to them in terms of maybe lighting a light bulb in their mind about looking for alternatives to traditional middleware?
Mohsin: Like so many people try to do, we tried to see if anyone else had done this. One of the issues in big data, at least today, is that you can’t find a whole slew of clients who have already gone live and who are in production. There are lots of people in development, and some are live, but in our space, we couldn't find anyone who was live. We solved that issue via a quick-hit POC. The big lesson there was that we scoped the POC right. We didn’t want to do too much and we didn’t want to do too little. So that was a good lesson learned. The other big thing is the data-migration question. Maybe, to some extent, this problem will never be solved. It's not so easy to pull data out of legacy database systems. Very few of them will give you good tools to migrate away from them. They all want you to stay. So we had to write our own tooling. We scoured the market for it, but we couldn’t find too many options out there.
Understand your data
So a huge lesson learned was, if you really want to do this, if you want to move to big data, get a handle on understanding your data. Make sure you have the domain experts in-house.
Make sure you have the tooling in place, however rudimentary it might be, to be able to pull the data out of your existing database. Once you have it in the file system, Vertica can take it in minutes. That’s not the problem. The problem is getting it out. We continue to grapple with that and we have made product enhancement recommendations. But in fairness to Vertica, this is really not something that Vertica can do much about, because this is more in the legacy database space.
Gardner: I've heard quite a few people say that, given the velocity with which they are seeing people move to the cloud, that obviously isn't part of their problem, as the data is already in the cloud. It's in the standardized architecture that the cloud is built around. If there is a platform-as-a-service (PaaS) capability, then getting at the data isn't so much of a problem, or am I not reading that correctly?
  • 6. Mohsin: No, you're reading that correctly. The problem we have is that a lot of companies are still not in the cloud. There is still a lingering fear of the cloud. People will tell you that the cloud is not secure. If you have customer information, if you have personalized data, many organizations don't want to put it in the cloud. Slowly, they are moving in that direction. If we were all there, I would completely agree with you, but since we still have so many on-premise deployments, we're still in a hybrid mode -- some is on-prem, some is in the cloud.
Gardner: I just bring it up because it gives yet another reason to seriously consider cloud. It’s a benefit that is actually quite powerful -- the data access and ability to do joins and bring datasets together because they're all in the same cloud.
Mohsin: I fundamentally agree with you. I fundamentally believe in the cloud and that it really should be the way to go. Going through our very recent go-live, there is no way we could have the same elasticity in an on-prem deployment that we can have in a cloud. I can pick up the phone, call a cloud provider, and have another machine the next day. I can't do that if it’s on-premise. Again, a simple question of moving all the assets into the cloud, at least in some organizations, will take several months, if not years.
Gardner: Very good. I'm afraid we will have to leave it there. We have been discussing how a specific enterprise in the eCommerce space has solved some unique problems using big data and, in particular, the HP Vertica platform. That sets the stage for a wider use of big data for transactional problems and live-feed issues. It's also why moving to cloud also has some potential benefits for speed, velocity, and dexterity when it comes to data across multiple data sources and implementations. So with that, a big thank you to our guest.
We've been joined by Jimmy Mohsin, Principal Software Architect at Norjimm LLC, a consultancy based in Princeton, New Jersey. Thanks, Jimmy.
Mohsin: Thanks, Dana. Have a great day.
Gardner: And a big thank you to our audience as well, for joining us for this special new style of IT discussion. I'm Dana Gardner, Principal Analyst at Interarbor Solutions, your host for this ongoing series of HP-sponsored discussions. Thanks again for listening, and come back next time.
  • 7. Copyright Interarbor Solutions, LLC, 2005-2015. All rights reserved.
You may also be interested in:
• Full 360 takes big data analysis cloud services to new business heights
• HP hyper-converged appliance delivers speedy VDI and apps deployment and a direct onramp to hybrid cloud
• Enterprises opting for converged infrastructure as stepping stone to hybrid cloud
• How big data technologies Hadoop and Vertica drive business results at Snagajob
• Zynga builds big data innovation culture by making analytics open to all developers
• How big data powers GameStop to gain retail advantage and deep insights into its markets
• Data-driven apps performance monitoring spurs broad business benefits for Swiss insurer and Turkish mobile carrier
• How Malaysia’s Bank Simpanan Nasional implemented a sweeping enterprise content management system
• Redcentric Uses Advanced Configuration Database to Focus Massive Merger Across Multiple Networks
• HP at Discover delivers the industry's first open, hybrid, ecosystem-wide cloud architecture
• How Tableau Software and Big Data Come Together: Strong Visualization Embedded on an Agile Analytics Engine
• Big Data Helps Conservation International Proactively Respond to Species Threat in Tropical Forests
• How Globe Testing helps startups make the leap to cloud- and mobile-first development
• GoodData analytics developers on what they look for in a big data platform