Analyse your SEO Data with R and Kibana
June 10th, 2016
Vincent Terrasi
--
SEO Director - Groupe M6Web
CuisineAZ, PasseportSanté, MeteoCity, …
--
Join the OVH adventure in July 2016
Blog : data-seo.com
Agenda
Mission : Build a Real-Time Log Analysis Tool
1. Using Screaming Frog to crawl a website
2. Using R for SEO Analysis
3. Using PaasLogs to centralize logs
4. Using Kibana to build fancy dashboards
5. Test !
“The world is full of obvious things which nobody by any chance ever observes.”
Sherlock Holmes Quote
Real-Time Log Analysis Tool 4
Crawler
• Screaming Frog
• Google Analytics
• R
Logs
• IIS Logs
• Apache Logs
• Nginx Logs
Using Screaming Frog
Screaming Frog : Export Data 6
Add your URL and click the Start button.
When the crawl is finished, click the Export button and save the XLSX file.
Screaming Frog : Data ! 7
"Address"
"Content"
"Status Code"
"Status"
"Title 1"
"Title 1 Length"
"Title 1 Pixel Width"
"Title 2"
"Title 2 Length"
"Title 2 Pixel Width"
"Meta Description 1"
"Meta Description 1 Length"
"Meta Description 1 Pixel Width"
"Meta Keyword 1"
"Meta Keywords 1 Length"
"H1-1"
"H1-1 length"
"H2-1"
"H2-1 length"
"H2-2"
"H2-2 length"
"Meta Robots 1"
"Meta Refresh 1"
"Canonical Link Element 1"
"Size"
"Word Count"
"Level"
"Inlinks"
"Outlinks"
"External Outlinks"
"Hash"
"Response Time"
"Last Modified"
"Redirect URI"
"GA Sessions"
"GA % New Sessions"
"GA New Users"
"GA Bounce Rate"
"GA Page Views Per Session"
"GA Avg Session Duration"
"GA Page Value"
"GA Goal Conversion Rate All"
"GA Goal Completions All"
"GA Goal Value All"
"Clicks"
"Impressions"
"CTR"
"Position"
"H1-2"
"H1-2 length"
Using R
Why R ?
Scriptable
Big Community
Mac / PC / Unix
Open Source
7500 packages
Documentation
WheRe ? How ?
https://www.cran.r-project.org/
Rgui, RStudio
Using R : Step 1
Export All Urls
"request";"section";"active";"speed";"compliant";"depth";"inlinks"
Packages :
stringr
ggplot2
dplyr
readxl
R Examples
• Crawl via Screaming Frog
• Classify URLs by :
   • Section
   • Load Time
   • Number of Inlinks
• Detect Active Pages
   • Min 1 visit per month
• Detect Compliant Pages
   • Canonical Not Equal
   • Meta No-index
   • Bad HTTP Status Code
• Detect Duplicate Meta
R : read files 13
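# read_excel() comes from the readxl package
library(readxl)
# Read xlsx file (skip = 1 skips the first row of the export)
urls <- read_excel("internal_html_blog.xlsx",
                   sheet = 1,
                   col_names = TRUE,
                   skip = 1)
# Read csv file (check.names = FALSE keeps names such as "Status Code" unchanged)
urls <- read.csv2("internal_html_blog.csv", sep = ";", header = TRUE, check.names = FALSE)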
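The next slides work on a data frame called urls_select, which the deck never defines. A minimal sketch, assuming it is simply the subset of crawl columns needed by the analysis. The toy urls data frame below stands in for the Screaming Frog export, and its values and the wanted list are invented for illustration.

```r
# Toy stand-in for the Screaming Frog export (values are invented).
# check.names = FALSE keeps column names such as "Status Code" intact.
urls <- data.frame(
  "Address" = c("http://example.com/", "http://example.com/agenda/parutions/a"),
  "Status Code" = c(200, 301),
  "Status" = c("OK", "Moved Permanently"),
  "GA Sessions" = c(12, 0),
  "Hash" = c("abc", "def"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Assumption: urls_select keeps only the columns used by the analysis.
wanted <- c("Address", "Status Code", "Status", "GA Sessions")
urls_select <- urls[, intersect(wanted, colnames(urls)), drop = FALSE]

print(colnames(urls_select))
```

Using intersect() means a column missing from the export is silently skipped instead of raising an error, which may or may not be what you want.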
Detect Active Pages 14
#default
urls_select$Active <- FALSE
urls_select$Active[ which(urls_select$`GA Sessions` > 0) ] <- TRUE
#factor
urls_select$Active <- as.factor(urls_select$Active)
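The same rule can be written as one vectorized expression. This is a sketch on an invented vector, not the author's code; it makes explicit that a URL with no Google Analytics data (NA) counts as inactive.

```r
# Sessions per URL; NA means no Google Analytics data for that URL.
ga_sessions <- c(12, 0, NA, 3)

# A comparison with NA yields NA, so guard it explicitly:
# a URL is active only when a session count exists and is above zero.
active <- !is.na(ga_sessions) & ga_sessions > 0

print(active)
```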
Classify URLs by Section 15
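# conf.csv lists one URL pattern per line
schemas <- read.csv("conf.csv", header = FALSE, col.names = "schema", stringsAsFactors = FALSE)
urls_select$Cat <- "no match"
# stri_detect_fixed() comes from the stringi package
library(stringi)
# Loop over the patterns (the rows of the schema column), not over the columns
for (j in seq_along(schemas$schema))
{
  urls_select$Cat[ which(stri_detect_fixed(urls_select$Address, schemas$schema[j])) ] <- schemas$schema[j]
}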
conf.csv :
/agenda/sorties-cinema/
/agenda/parutions/
/agenda/evenements/
/agenda/programme-tv/
/encyclopedie/
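To try the section rule without a crawl file, here is a small base R sketch. The patterns come from the slide; the URLs and the function name classify_section are hypothetical.

```r
# Patterns as they would be listed in conf.csv (taken from the slide).
patterns <- c("/agenda/sorties-cinema/", "/agenda/parutions/", "/encyclopedie/")

# Hypothetical URLs for illustration.
address <- c(
  "http://example.com/agenda/parutions/livre-1",
  "http://example.com/encyclopedie/chat",
  "http://example.com/contact"
)

classify_section <- function(address, patterns) {
  section <- rep("no match", length(address))
  for (p in patterns) {
    # fixed = TRUE treats the pattern as a literal substring, not a regex
    section[grepl(p, address, fixed = TRUE)] <- p
  }
  section
}

section <- classify_section(address, patterns)
print(section)
```

When a URL contains several patterns, the last matching pattern wins, so the order of the lines in the configuration file matters.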
Classify URLs By Load Time 16
urls_select$Speed <- NA
urls_select$Speed[ which(urls_select$`Response Time` < 0.501 ) ] <- "Fast"
urls_select$Speed[ which(urls_select$`Response Time` >= 0.501
                         & urls_select$`Response Time` < 1.001) ] <- "Medium"
urls_select$Speed[ which(urls_select$`Response Time` >= 1.001
                         & urls_select$`Response Time` < 2.001) ] <- "Slow"
urls_select$Speed[ which(urls_select$`Response Time` >= 2.001) ] <- "Slowest"
urls_select$Speed <- as.factor(urls_select$Speed)
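Base R's cut() can express the same thresholds in a single call. This is an alternative sketch on invented response times, not the code from the slide.

```r
response_time <- c(0.2, 0.501, 0.9, 1.5, 2.001, 4)

# right = FALSE makes each interval closed on the left, [a, b),
# which mirrors the ">= lower & < upper" tests on the slide.
speed <- cut(
  response_time,
  breaks = c(-Inf, 0.501, 1.001, 2.001, Inf),
  labels = c("Fast", "Medium", "Slow", "Slowest"),
  right = FALSE
)

print(as.character(speed))
```

cut() returns a factor directly, and a missing response time stays NA, as in the slide's version.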
Classify URLs By Number of Inlinks 17
urls_select$`Group Inlinks` <- "URLs with No Follow Inlinks"
urls_select$`Group Inlinks`[ which(urls_select$`Inlinks` < 1 ) ] <- "URLs with No Follow Inlinks"
urls_select$`Group Inlinks`[ which(urls_select$`Inlinks` == 1 ) ] <- "URLs with 1 Follow Inlink"
urls_select$`Group Inlinks`[ which(urls_select$`Inlinks` > 1
                                   & urls_select$`Inlinks` < 6) ] <- "URLs with 2 to 5 Follow Inlinks"
# Label reads "6 to 10" to match the >= 6 and < 11 bounds
urls_select$`Group Inlinks`[ which(urls_select$`Inlinks` >= 6
                                   & urls_select$`Inlinks` < 11 ) ] <- "URLs with 6 to 10 Follow Inlinks"
urls_select$`Group Inlinks`[ which(urls_select$`Inlinks` >= 11) ] <- "URLs with more than 10 Follow Inlinks"
urls_select$`Group Inlinks` <- as.factor(urls_select$`Group Inlinks`)
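One possible refinement, shown as a sketch on invented inlink counts: build the groups from a lookup of lower bounds and fix the factor level order, so chart legends follow the bucket order instead of alphabetical order. The fourth label is written "6 to 10" here because that bucket covers 6 through 10.

```r
inlinks <- c(0, 1, 2, 5, 6, 10, 11, 250)

labels <- c(
  "URLs with No Follow Inlinks",
  "URLs with 1 Follow Inlink",
  "URLs with 2 to 5 Follow Inlinks",
  "URLs with 6 to 10 Follow Inlinks",
  "URLs with more than 10 Follow Inlinks"
)

# findInterval() returns how many lower bounds each value has reached:
# 0 -> below 1, 1 -> [1, 2), 2 -> [2, 6), 3 -> [6, 11), 4 -> 11 and above.
bucket <- findInterval(inlinks, c(1, 2, 6, 11))

# levels = labels keeps the groups in bucket order.
group_inlinks <- factor(labels[bucket + 1], levels = labels)

print(table(group_inlinks))
```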
Detect Compliant Pages 18
# A page is compliant unless one of these holds:
# - Bad HTTP status code (status code is not 200, or status is not "OK")
# - Canonical not equal to the page address
# - Meta robots contains noindex
urls_select$Compliant <- TRUE
urls_select$Compliant[ which(urls_select$`Status Code` != 200
                             | urls_select$`Canonical Link Element 1` != urls_select$Address
                             | urls_select$Status != "OK"
                             | grepl("noindex", urls_select$`Meta Robots 1`)
                       ) ] <- FALSE
urls_select$Compliant <- as.factor(urls_select$Compliant)
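To see how each rule behaves, including on missing values, here is a base R sketch on four invented pages. The short column names (address, status, canonical, robots) are hypothetical stand-ins for the Screaming Frog columns.

```r
pages <- data.frame(
  address   = c("http://example.com/a", "http://example.com/b",
                "http://example.com/c", "http://example.com/d"),
  status    = c(200, 301, 200, 200),
  canonical = c("http://example.com/a", "http://example.com/b",
                "http://example.com/a", "http://example.com/d"),
  robots    = c("index, follow", NA, "index, follow", "noindex, follow"),
  stringsAsFactors = FALSE
)

# Each test flags one reason for non-compliance; %in% TRUE maps NA to FALSE
# so that a missing value never flags a page by accident.
bad_status    <- (pages$status != 200) %in% TRUE
bad_canonical <- (pages$canonical != pages$address) %in% TRUE
noindex       <- grepl("noindex", pages$robots, fixed = TRUE)

pages$compliant <- !(bad_status | bad_canonical | noindex)
print(pages[, c("address", "compliant")])
```

Keeping one logical vector per rule also makes it easy to report why a page is not compliant.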
Detect Duplicate Meta 19
urls_select$`Status Title` <- 'Unique'
urls_select$`Status Title`[ which(urls_select$`Title 1 Length` == 0) ] <- "No Set"
urls_select$`Status Description` <- 'Unique'
urls_select$`Status Description`[ which(urls_select$`Meta Description 1 Length` == 0) ] <- "No Set"
urls_select$`Status H1` <- 'Unique'
urls_select$`Status H1`[ which(urls_select$`H1-1 length` == 0) ] <- "No Set"
urls_select$`Status Title`[ which(duplicated(urls_select$`Title 1`)) ] <- 'Duplicate'
urls_select$`Status Description`[ which(duplicated(urls_select$`Meta Description 1`)) ] <- 'Duplicate'
urls_select$`Status H1`[ which(duplicated(urls_select$`H1-1`)) ] <- 'Duplicate'
urls_select$`Status Title` <- as.factor(urls_select$`Status Title`)
urls_select$`Status Description` <- as.factor(urls_select$`Status Description`)
urls_select$`Status H1` <- as.factor(urls_select$`Status H1`)
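Two details of duplicated() are worth knowing: it does not flag the first copy of a repeated value, and several empty titles count as duplicates of each other. A sketch on invented titles that flags every copy and applies "No Set" last:

```r
title <- c("Home", "Recipes", "Recipes", "", "", "Contact")

# duplicated() alone skips the first occurrence; combining it with a
# scan from the end flags every member of a duplicate group.
is_dup <- duplicated(title) | duplicated(title, fromLast = TRUE)

status_title <- rep("Unique", length(title))
status_title[is_dup] <- "Duplicate"
# Apply "No Set" last so empty titles are not reported as duplicates.
status_title[is.na(title) | nchar(title) == 0] <- "No Set"

print(status_title)
```

Whether the first copy should count as a duplicate is a reporting choice; this variant is an assumption, not the behaviour of the slide's code.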
Generate CSV 20
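# select(), mutate() and %>% come from the dplyr package
library(dplyr)
urls_light <- select(urls_select, Address, Cat, Active, Speed, Compliant, Level, Inlinks) %>%
  mutate(Address = gsub("http://moniste.fr", "", Address, fixed = TRUE))
colnames(urls_light) <- c("request","section","active","speed","compliant","depth","inlinks")
# write.csv2() takes the data frame first, then the file path
write.csv2(urls_light, "file.csv", row.names = FALSE)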
Package dplyr : select and mutate
Edit colnames
Use write.csv2
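A quick way to check the export format is to write a small data frame to a temporary file and read it back. The rows below are invented; only the column names come from the slides.

```r
urls_light <- data.frame(
  request   = c("/agenda/parutions/a", "/encyclopedie/chat"),
  section   = c("/agenda/parutions/", "/encyclopedie/"),
  active    = c(TRUE, FALSE),
  speed     = c("Fast", "Slow"),
  compliant = c(TRUE, TRUE),
  depth     = c(2L, 3L),
  inlinks   = c(14L, 1L),
  stringsAsFactors = FALSE
)

path <- tempfile(fileext = ".csv")
# write.csv2() uses ";" as field separator and "," as decimal mark.
write.csv2(urls_light, path, row.names = FALSE)

# The header line should list the seven column names separated by ";".
first_line <- readLines(path, n = 1)
print(first_line)

# Read the file back to confirm the round trip.
check <- read.csv2(path, stringsAsFactors = FALSE)
print(check)
```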
R : ggplot2 command 21
DATA
Create the ggplot object and populate it with data (always a data frame)
ggplot( mydata, aes( x=section,y=count, fill=active ))
LAYERS
Add layer(s)
+ geom_point()
FACET
Used for conditioning on variable(s)
+ facet_grid(~rescode)
ggplot2 : Geometry 22
R Chart : Active Pages 23
urls_level_active <- group_by(urls_select,Level,Active) %>%
summarise(count = n()) %>%
filter(Level<12)
Geometry Aesthetic
p <- ggplot(urls_level_active, aes(x=Level, y=count, fill=Active) ) +
geom_bar(stat = "identity", position = "stack") +
scale_fill_manual(values=c("#e5e500", "#4DBD33")) +
labs(x = "Depth", y ="Crawled URLs")
#display
print(p)
# save in file
ggsave(file = "chart.png")
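The counting step that feeds this chart can be checked without dplyr. A base R sketch on invented data, using aggregate() in place of group_by() and summarise():

```r
urls_select <- data.frame(
  Level  = c(1, 2, 2, 2, 3, 14),
  Active = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)
)

# Count URLs per (Level, Active) pair by summing a column of ones.
counts <- aggregate(
  list(count = rep(1L, nrow(urls_select))),
  by = list(Level = urls_select$Level, Active = urls_select$Active),
  FUN = sum
)

# Keep depths below 12, as on the slide.
counts <- counts[counts$Level < 12, ]
print(counts)
```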
R Chart : GA Sessions 24
urls_cat_gasessions <- aggregate( urls_select$`GA Sessions`,
by=list(Cat=urls_select$Cat, urls_select$Compliant), FUN=sum, na.rm=TRUE)
colnames(urls_cat_gasessions) <- c("Category","Compliant","GA Sessions")
p <- ggplot(urls_cat_gasessions, aes(x=Category, y=`GA Sessions`,
fill=Compliant))+
geom_bar(stat = "identity", position = "stack") +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
labs(x = "Section", y ="Sessions") +
scale_fill_manual(values=c("#e5e500","#4DBD33"))
#display
print(p)
# save in file
ggsave(file = "chart.png")
R Chart : Compliant 25
urls_cat_compliant_statuscode <- group_by(urls_select,Cat,
Compliant,`Status Code`) %>%
summarise(count = n()) %>%
filter(grepl(200,`Status Code`) | grepl(301,`Status Code`))
p <- ggplot(urls_cat_compliant_statuscode, aes(x=Cat, y=count,
fill= Compliant ) ) +
geom_bar(stat = "identity", position = "stack") +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
facet_grid(`Status Code` ~ .) +
labs(x = "Section", y ="Crawled URLs") +
scale_fill_manual(values=c("#e5e500","#4DBD33"))
R : SEO Cheat Sheet 26
Package dplyr
select() lets you rapidly zoom in on a useful subset of columns using operations that usually only work on numeric variable positions
mutate() adds new columns to a data frame or replaces existing ones
filter() selects a subset of rows in a data frame
Package ggplot2
aes - geom
ggsave()
Package readxl
read_excel()
Base R (utils)
read.csv2()
write.csv2()
ELK
Architecture 28
Hard to monitor and optimize host server performance
Architecture 29
Using PaasLogs
PaasLogs 31
PaasLogs 32
164 nodes in the Elasticsearch cluster
180 connected machines
Between 100,000 and 300,000 logs processed per second
12 billion logs pass through every day
211 billion documents stored
8 clicks and 3 copy/paste to use it !
PaasLogs: Step 1 33
PaasLogs : Step 2 34
PaasLogs 35
PaasLogs : Streams 36
Streams are the recipients of your logs. When you send a log with the
right stream token, it automatically arrives in your stream in Graylog,
an awesome piece of software.
PaasLogs : Dashboards 37
The Dashboard is the global view of your logs. It is an efficient
way to exploit your logs and to view global information such as metrics and
trends in your data without being overwhelmed by log details.
PaasLogs : Aliases 38
Aliases let you access your data directly from Kibana or
through an Elasticsearch query
DON’T FORGET TO ENABLE KIBANA INDICES AND WRITE YOUR USER PASSWORD
PaasLogs : Inputs 39
The Inputs will allow you to ask OVH to host your own dedicated collector
like Logstash or Flowgger.
PaasLogs : Network Configuration 40
PaasLogs : Plugins Logstash 41
OVHCOMMONAPACHELOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion_num:float})?|%{DATA:rawrequest})" %{NUMBER:response_int:int} (?:%{NUMBER:bytes_int:int}|-)
OVHCOMBINEDAPACHELOG %{OVHCOMMONAPACHELOG} "%{NOTSPACE:referrer}" %{QS:agent}
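To see which fields these patterns extract, here is an R sketch that parses one invented access-log line. The regular expression is a simplified stand-in for the grok pattern, not a translation of it: for instance, it drops the quotes around the user agent and does not handle the raw-request fallback.

```r
# Invented log line in Apache combined format.
line <- '66.249.66.1 - - [10/Jun/2016:10:15:32 +0200] "GET /agenda/parutions/ HTTP/1.1" 200 5123 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'

# One capture group per field named in the grok pattern.
pattern <- paste0(
  '^(\\S+) (\\S+) (\\S+) \\[([^]]+)\\] ',  # clientip ident auth [timestamp]
  '"(\\S+) (\\S+)(?: HTTP/([0-9.]+))?" ',   # "verb request HTTP/version"
  '(\\d{3}) (\\d+|-) ',                     # response bytes
  '"(\\S+)" "(.*)"$'                        # "referrer" "agent"
)

# regexec() returns the full match followed by the capture groups.
m <- regmatches(line, regexec(pattern, line, perl = TRUE))[[1]]

fields <- setNames(
  m[-1],
  c("clientip", "ident", "auth", "timestamp", "verb", "request",
    "httpversion_num", "response_int", "bytes_int", "referrer", "agent")
)

print(fields[c("clientip", "verb", "request", "response_int")])
```

The request field is the one later used to join log events with the crawl data.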
PaasLogs : Config Logstash 42
if [type] == "apache" {
  grok {
    match => [ "message", "%{OVHCOMBINEDAPACHELOG}" ]
    patterns_dir => "/opt/logstash/patterns"
  }
}
if [type] == "csv_infos" {
  csv {
    columns => ["request", "section", "active", "speed",
                "compliant", "depth", "inlinks"]
    separator => ";"
  }
}
How to send Logs to PaasLogs ? 43
Use Filebeat 44
Filebeat : Install 45
Install filebeat
curl -L -O https://download.elastic.co/beats/filebeat/filebeat_1.2.1_amd64.deb
sudo dpkg -i filebeat_1.2.1_amd64.deb
https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-installation.html
Filebeat : Edit filebeat.yml 46
filebeat:
  prospectors:
    -
      paths:
        - /home/ubuntu/lib/apache2/log/access.log
      input_type: log
      fields_under_root: true
      document_type: apache
    -
      paths:
        - /home/ubuntu/workspace/csv/crawled-urls-filebeat-*.csv
      input_type: csv
      fields_under_root: true
      document_type: csv_infos
output:
  logstash:
    hosts: ["c002-5717e1b5d2ee5e00095cea38.in.laas.runabove.com:5044"]
    worker: 1
    tls:
      certificate_authorities: ["/home/ubuntu/workspace/certificat/key.crt"]
Filebeat : Start 47
Copy / Paste Key.crt
-----BEGIN CERTIFICATE-----
MIIDozCCAougAwIBAgIJALxR4fTZlzQMMA0GCSqGSIb3DQEBCwUAMGgxCzAJBgNVBAYTAkZSMQ
8wDQYDVQQIDAZGcmFuY2UxDjAMBgNVBAcMBVBhcmlzMQwwCgYDVQQKDANPVkgxCzAJBgNVB
AYTAkZSMR0wGwYDVQQDDBRpbi5sYWFzLnJ1bmFib3ZlLmNvbTAeFw0xNjAzMTAxNTEzMDNaFw0
xNzAzMTAxNTEzMDNaMGgxCzAJBgNVBAYTAkZSMQ8wDQYDVQQIDAZGcmFuY2UxDjAMBgNVBA
cMBVBhcmlzMQwwCgYDVQQKDANPVkgx
-----END CERTIFICATE-----
Start Filebeat
sudo /etc/init.d/filebeat start
sudo /etc/init.d/filebeat stop
How to combine multiple sources ? 48
Paaslogs : Plugins ES 49
Description : Copies fields from previous log events in Elasticsearch to current events
if [type] == "apache" {
  elasticsearch {
    hosts => "laas.runabove.com"
    index => "logsDataSEO" # alias
    ssl => true
    query => 'type:csv_infos AND request: "%{[request]}"'
    fields => [["speed","speed"],["compliant","compliant"],
               ["section","section"],["active","active"],
               ["depth","depth"],["inlinks","inlinks"]]
  }
}
# TIP : fields => [[src,dest],[src,dest]]
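Conceptually, this lookup behaves like a left join between log events and crawl data on the request field. The R sketch below only illustrates that idea on invented rows; it does not reproduce how the Logstash plugin works internally.

```r
# Parsed access-log events (hypothetical values).
log_events <- data.frame(
  request  = c("/agenda/parutions/a", "/encyclopedie/chat", "/unknown"),
  response = c(200, 200, 404),
  stringsAsFactors = FALSE
)

# Crawl data exported from R (hypothetical values).
crawl <- data.frame(
  request   = c("/agenda/parutions/a", "/encyclopedie/chat"),
  section   = c("/agenda/parutions/", "/encyclopedie/"),
  compliant = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every log event, even when the crawl has no matching URL;
# the crawl columns are then NA for that event.
enriched <- merge(log_events, crawl, by = "request", all.x = TRUE)
print(enriched)
```

A log event for a URL that was never crawled keeps empty crawl fields, which is itself a useful signal (for example, orphan pages visited by a bot).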
Using Kibana
Kibana : Install 51
Download Kibana 4.1
• Download and unzip Kibana 4
• Extract your archive
• Open config/kibana.yml in an editor
• Set the elasticsearch.url to point at your Elasticsearch instance
• Run ./bin/kibana (or bin\kibana.bat on Windows)
• Point your browser at http://yourhost.com:5601
Kibana : Edit Kibana.yml 52
Update kibana.yml
server.port: 8080
server.host: "0.0.0.0"
elasticsearch.url: "https://laas.runabove.com:9200"
elasticsearch.preserveHost: true
kibana.index: "ra-logs-33078"
kibana.defaultAppId: "discover"
elasticsearch.username: "ra-logs-33078"
elasticsearch.password: "rHftest6APlolNcc6"
Kibana : Line Chart 53
Number of active pages crawled by Google over a period of time
Kibana : Vertical Bar Chart 54
Kibana : Pie Chart 55
How to compare two periods ? 56
Kibana : Use Date Range 57
Final Architecture
[Diagram] Log sources: IIS, Apache, Nginx, HAProxy (soft real time or old logs) -> Filebeat -> PaasLogs -> Kibana
Test yourself 59
Use Screaming Frog Spider Tool
www.screamingfrog.co.uk
Learn R
www.datacamp.com
www.data-seo.com
www.moise-le-geek.fr/push-your-hands-in-the-r-introduction/
Test PaasLogs
www.runabove.com
Install Kibana
www.elastic.co/downloads/kibana
TODO List 60
- Create a GitHub Repository with all source code
- Add Plugin Logstash to do a reverse DNS lookup
- Schedule A Crawl By Command Line
- Upload Screaming Frog File to web server
Thank you
Keep in touch
June 10th, 2016
@vincentterrasi Vincent Terrasi

More Related Content

What's hot

Materialized Column: An Efficient Way to Optimize Queries on Nested Columns
Materialized Column: An Efficient Way to Optimize Queries on Nested ColumnsMaterialized Column: An Efficient Way to Optimize Queries on Nested Columns
Materialized Column: An Efficient Way to Optimize Queries on Nested ColumnsDatabricks
 
Presto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation EnginesPresto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation EnginesDatabricks
 
An Introduction to Higher Order Functions in Spark SQL with Herman van Hovell
An Introduction to Higher Order Functions in Spark SQL with Herman van HovellAn Introduction to Higher Order Functions in Spark SQL with Herman van Hovell
An Introduction to Higher Order Functions in Spark SQL with Herman van HovellDatabricks
 
Algorithms Lecture 3: Analysis of Algorithms II
Algorithms Lecture 3: Analysis of Algorithms IIAlgorithms Lecture 3: Analysis of Algorithms II
Algorithms Lecture 3: Analysis of Algorithms IIMohamed Loey
 
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScyllaDB
 
State of the Trino Project
State of the Trino ProjectState of the Trino Project
State of the Trino ProjectMartin Traverso
 
Web analytics at scale with Druid at naver.com
Web analytics at scale with Druid at naver.comWeb analytics at scale with Druid at naver.com
Web analytics at scale with Druid at naver.comJungsu Heo
 
Solving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache ArrowSolving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache ArrowWes McKinney
 
Algorithms Lecture 5: Sorting Algorithms II
Algorithms Lecture 5: Sorting Algorithms IIAlgorithms Lecture 5: Sorting Algorithms II
Algorithms Lecture 5: Sorting Algorithms IIMohamed Loey
 
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang Spark Summit
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guideRyan Blue
 
On Improving Broadcast Joins in Apache Spark SQL
On Improving Broadcast Joins in Apache Spark SQLOn Improving Broadcast Joins in Apache Spark SQL
On Improving Broadcast Joins in Apache Spark SQLDatabricks
 
Advanced Flink Training - Design patterns for streaming applications
Advanced Flink Training - Design patterns for streaming applicationsAdvanced Flink Training - Design patterns for streaming applications
Advanced Flink Training - Design patterns for streaming applicationsAljoscha Krettek
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekVenkata Naga Ravi
 
Scalable crawling with Kafka, scrapy and spark - November 2021
Scalable crawling with Kafka, scrapy and spark - November 2021Scalable crawling with Kafka, scrapy and spark - November 2021
Scalable crawling with Kafka, scrapy and spark - November 2021Max Lapan
 
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017AWS Chicago
 
Hyperspace for Delta Lake
Hyperspace for Delta LakeHyperspace for Delta Lake
Hyperspace for Delta LakeDatabricks
 
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...HostedbyConfluent
 
Dongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of FlinkDongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of FlinkFlink Forward
 

What's hot (20)

Materialized Column: An Efficient Way to Optimize Queries on Nested Columns
Materialized Column: An Efficient Way to Optimize Queries on Nested ColumnsMaterialized Column: An Efficient Way to Optimize Queries on Nested Columns
Materialized Column: An Efficient Way to Optimize Queries on Nested Columns
 
Presto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation EnginesPresto on Apache Spark: A Tale of Two Computation Engines
Presto on Apache Spark: A Tale of Two Computation Engines
 
R-Shiny Cheat sheet
R-Shiny Cheat sheetR-Shiny Cheat sheet
R-Shiny Cheat sheet
 
An Introduction to Higher Order Functions in Spark SQL with Herman van Hovell
An Introduction to Higher Order Functions in Spark SQL with Herman van HovellAn Introduction to Higher Order Functions in Spark SQL with Herman van Hovell
An Introduction to Higher Order Functions in Spark SQL with Herman van Hovell
 
Algorithms Lecture 3: Analysis of Algorithms II
Algorithms Lecture 3: Analysis of Algorithms IIAlgorithms Lecture 3: Analysis of Algorithms II
Algorithms Lecture 3: Analysis of Algorithms II
 
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
 
State of the Trino Project
State of the Trino ProjectState of the Trino Project
State of the Trino Project
 
Web analytics at scale with Druid at naver.com
Web analytics at scale with Druid at naver.comWeb analytics at scale with Druid at naver.com
Web analytics at scale with Druid at naver.com
 
Solving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache ArrowSolving Enterprise Data Challenges with Apache Arrow
Solving Enterprise Data Challenges with Apache Arrow
 
Algorithms Lecture 5: Sorting Algorithms II
Algorithms Lecture 5: Sorting Algorithms IIAlgorithms Lecture 5: Sorting Algorithms II
Algorithms Lecture 5: Sorting Algorithms II
 
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
On Improving Broadcast Joins in Apache Spark SQL
On Improving Broadcast Joins in Apache Spark SQLOn Improving Broadcast Joins in Apache Spark SQL
On Improving Broadcast Joins in Apache Spark SQL
 
Advanced Flink Training - Design patterns for streaming applications
Advanced Flink Training - Design patterns for streaming applicationsAdvanced Flink Training - Design patterns for streaming applications
Advanced Flink Training - Design patterns for streaming applications
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeek
 
Scalable crawling with Kafka, scrapy and spark - November 2021
Scalable crawling with Kafka, scrapy and spark - November 2021Scalable crawling with Kafka, scrapy and spark - November 2021
Scalable crawling with Kafka, scrapy and spark - November 2021
 
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
 
Hyperspace for Delta Lake
Hyperspace for Delta LakeHyperspace for Delta Lake
Hyperspace for Delta Lake
 
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
 
Dongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of FlinkDongwon Kim – A Comparative Performance Evaluation of Flink
Dongwon Kim – A Comparative Performance Evaluation of Flink
 

Viewers also liked

SEO et redaction web: les bonnes pratiques
SEO et redaction web: les bonnes pratiquesSEO et redaction web: les bonnes pratiques
SEO et redaction web: les bonnes pratiquesTom Maccario
 
Référencement d'un site web
Référencement d'un site webRéférencement d'un site web
Référencement d'un site webHanae Guenouni
 
Network Analysis for SEO
Network Analysis for SEONetwork Analysis for SEO
Network Analysis for SEOcharlottebourne
 
Synodiance > Fondamentaux du SEO appliqués au E-Commerce - Conférence Salon E...
Synodiance > Fondamentaux du SEO appliqués au E-Commerce - Conférence Salon E...Synodiance > Fondamentaux du SEO appliqués au E-Commerce - Conférence Salon E...
Synodiance > Fondamentaux du SEO appliqués au E-Commerce - Conférence Salon E...Search Foresight
 
Revolutionazing Search Advertising with ElasticSearch at Swoop
Revolutionazing Search Advertising with ElasticSearch at SwoopRevolutionazing Search Advertising with ElasticSearch at Swoop
Revolutionazing Search Advertising with ElasticSearch at SwoopSimeon Simeonov
 
Offre de référencement (Audit, SEO et forfait) - Gini Concept Design
Offre de référencement (Audit, SEO et forfait) - Gini Concept DesignOffre de référencement (Audit, SEO et forfait) - Gini Concept Design
Offre de référencement (Audit, SEO et forfait) - Gini Concept DesignGini Concept Design
 
Presentation meetup ml bd
Presentation meetup ml bdPresentation meetup ml bd
Presentation meetup ml bdantoine vastel
 
Queduweb 2016 ndd expiré - sébastien moity - ls project
Queduweb 2016   ndd expiré - sébastien moity - ls projectQueduweb 2016   ndd expiré - sébastien moity - ls project
Queduweb 2016 ndd expiré - sébastien moity - ls projectSébastien Moity
 
Chocolade
ChocoladeChocolade
Chocoladepellynl
 
Onderzoek presentatie 3
Onderzoek presentatie 3Onderzoek presentatie 3
Onderzoek presentatie 3maartenblom
 
Chocolade, vanille en kaneel
Chocolade, vanille en kaneelChocolade, vanille en kaneel
Chocolade, vanille en kaneeljoostdevos
 
Spreekbeurt over katjes
Spreekbeurt over katjesSpreekbeurt over katjes
Spreekbeurt over katjesan100000
 
Evolutions et nouveaux outils SEO
Evolutions et nouveaux outils SEOEvolutions et nouveaux outils SEO
Evolutions et nouveaux outils SEODimitri Brunel
 
Cheat sheets for data scientists
Cheat sheets for data scientistsCheat sheets for data scientists
Cheat sheets for data scientistsAjay Ohri
 
Marketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success RatesMarketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success RatesRevolution Analytics
 

Viewers also liked (20)

Meetup Data-science OVH
Meetup Data-science OVHMeetup Data-science OVH
Meetup Data-science OVH
 
SEO et redaction web: les bonnes pratiques
SEO et redaction web: les bonnes pratiquesSEO et redaction web: les bonnes pratiques
SEO et redaction web: les bonnes pratiques
 
Référencement d'un site web
Référencement d'un site webRéférencement d'un site web
Référencement d'un site web
 
Network Analysis for SEO
Network Analysis for SEONetwork Analysis for SEO
Network Analysis for SEO
 
Synodiance > Fondamentaux du SEO appliqués au E-Commerce - Conférence Salon E...
Synodiance > Fondamentaux du SEO appliqués au E-Commerce - Conférence Salon E...Synodiance > Fondamentaux du SEO appliqués au E-Commerce - Conférence Salon E...
Synodiance > Fondamentaux du SEO appliqués au E-Commerce - Conférence Salon E...
 
Revolutionazing Search Advertising with ElasticSearch at Swoop
Revolutionazing Search Advertising with ElasticSearch at SwoopRevolutionazing Search Advertising with ElasticSearch at Swoop
Revolutionazing Search Advertising with ElasticSearch at Swoop
 
Offre de référencement (Audit, SEO et forfait) - Gini Concept Design
Offre de référencement (Audit, SEO et forfait) - Gini Concept DesignOffre de référencement (Audit, SEO et forfait) - Gini Concept Design
Offre de référencement (Audit, SEO et forfait) - Gini Concept Design
 
Presentation meetup ml bd
Presentation meetup ml bdPresentation meetup ml bd
Presentation meetup ml bd
 
Queduweb 2016 ndd expiré - sébastien moity - ls project
Queduweb 2016   ndd expiré - sébastien moity - ls projectQueduweb 2016   ndd expiré - sébastien moity - ls project
Queduweb 2016 ndd expiré - sébastien moity - ls project
 
Chocolade
ChocoladeChocolade
Chocolade
 
Chocolade
ChocoladeChocolade
Chocolade
 
Onderzoek presentatie 3
Onderzoek presentatie 3Onderzoek presentatie 3
Onderzoek presentatie 3
 
Chocolade, vanille en kaneel
Chocolade, vanille en kaneelChocolade, vanille en kaneel
Chocolade, vanille en kaneel
 
Kat af
Kat afKat af
Kat af
 
Kat af helemaal
Kat af helemaalKat af helemaal
Kat af helemaal
 
Spreekbeurt over katjes
Spreekbeurt over katjesSpreekbeurt over katjes
Spreekbeurt over katjes
 
Evolutions et nouveaux outils SEO
Evolutions et nouveaux outils SEOEvolutions et nouveaux outils SEO
Evolutions et nouveaux outils SEO
 
SEO local : SMX Paris 2016
SEO local : SMX Paris 2016SEO local : SMX Paris 2016
SEO local : SMX Paris 2016
 
Cheat sheets for data scientists
Cheat sheets for data scientistsCheat sheets for data scientists
Cheat sheets for data scientists
 
Marketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success RatesMarketing Analytics with R Lifting Campaign Success Rates
Marketing Analytics with R Lifting Campaign Success Rates
 

Similar to Analyse your SEO Data with R and Kibana

E2 appspresso hands on lab
E2 appspresso hands on labE2 appspresso hands on lab
E2 appspresso hands on labNAVER D2
 
E3 appspresso hands on lab
E3 appspresso hands on labE3 appspresso hands on lab
E3 appspresso hands on labNAVER D2
 
Introduction to Elasticsearch
Introduction to ElasticsearchIntroduction to Elasticsearch
Introduction to ElasticsearchSperasoft
 
Import web resources using R Studio
Import web resources using R StudioImport web resources using R Studio
Import web resources using R StudioRupak Roy
 
Attack monitoring using ElasticSearch Logstash and Kibana
Attack monitoring using ElasticSearch Logstash and KibanaAttack monitoring using ElasticSearch Logstash and Kibana
Attack monitoring using ElasticSearch Logstash and KibanaPrajal Kulkarni
 
Logstash for SEO: come monitorare i Log del Web Server in realtime
Logstash for SEO: come monitorare i Log del Web Server in realtimeLogstash for SEO: come monitorare i Log del Web Server in realtime
Logstash for SEO: come monitorare i Log del Web Server in realtimeAndrea Cardinale
 
Spark Machine Learning: Adding Your Own Algorithms and Tools with Holden Kara...
Spark Machine Learning: Adding Your Own Algorithms and Tools with Holden Kara...Spark Machine Learning: Adding Your Own Algorithms and Tools with Holden Kara...
Spark Machine Learning: Adding Your Own Algorithms and Tools with Holden Kara...Databricks
 
Windows Server 2008 (PowerShell Scripting Uygulamaları)
Windows Server 2008 (PowerShell Scripting Uygulamaları)Windows Server 2008 (PowerShell Scripting Uygulamaları)
Windows Server 2008 (PowerShell Scripting Uygulamaları)ÇözümPARK
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBantoinegirbal
 
2011 Mongo FR - MongoDB introduction
2011 Mongo FR - MongoDB introduction2011 Mongo FR - MongoDB introduction
2011 Mongo FR - MongoDB introductionantoinegirbal
 
DevOps Fest 2019. Сергей Марченко. Terraform: a novel about modules, provider...
DevOps Fest 2019. Сергей Марченко. Terraform: a novel about modules, provider...DevOps Fest 2019. Сергей Марченко. Terraform: a novel about modules, provider...
DevOps Fest 2019. Сергей Марченко. Terraform: a novel about modules, provider...DevOps_Fest
 
Ajax Performance Tuning and Best Practices
Ajax Performance Tuning and Best PracticesAjax Performance Tuning and Best Practices
Ajax Performance Tuning and Best PracticesDoris Chen
 
Black Hat: XML Out-Of-Band Data Retrieval
Black Hat: XML Out-Of-Band Data RetrievalBlack Hat: XML Out-Of-Band Data Retrieval
Black Hat: XML Out-Of-Band Data Retrievalqqlan
 
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.GeeksLab Odessa
 
FleetDB A Schema-Free Database in Clojure
FleetDB A Schema-Free Database in ClojureFleetDB A Schema-Free Database in Clojure
FleetDB A Schema-Free Database in Clojureelliando dias
 
AWS October Webinar Series - Introducing Amazon Elasticsearch Service
AWS October Webinar Series - Introducing Amazon Elasticsearch ServiceAWS October Webinar Series - Introducing Amazon Elasticsearch Service
AWS October Webinar Series - Introducing Amazon Elasticsearch ServiceAmazon Web Services
 
FleetDB: A Schema-Free Database in Clojure
FleetDB: A Schema-Free Database in ClojureFleetDB: A Schema-Free Database in Clojure
FleetDB: A Schema-Free Database in ClojureMark McGranaghan
 

Analyse your SEO Data with R and Kibana

  • 1. Analyse your SEO Data with R and Kibana June 10th, 2016 Vincent Terrasi
  • 2. Vincent Terrasi -- SEO Director - Groupe M6Web CuisineAZ, PasseportSanté, MeteoCity, … -- Join the OVH adventure in July 2016 Blog : data-seo.com
  • 3. Agenda. Mission : build a real-time log analysis tool
      1. Using Screaming Frog to crawl a website
      2. Using R for SEO analysis
      3. Using PaasLogs to centralize logs
      4. Using Kibana to build fancy dashboards
      5. Test !
      “The world is full of obvious things which nobody by any chance ever observes.” (Sherlock Holmes)
  • 4. Real-Time Log Analysis Tool
      Crawler : Screaming Frog, Google Analytics, R
      Logs : IIS logs, Apache logs, Nginx logs
  • 6. Screaming Frog : Export Data
      Add your URL and click the Start button.
      When the crawl is finished, click the Export button and save the XLSX file.
  • 7. Screaming Frog : Data ! "Address" "Content" "Status Code" "Status" "Title 1" "Title 1 Length" "Title 1 Pixel Width" "Title 2" "Title 2 Length" "Title 2 Pixel Width" "Meta Description 1" "Meta Description 1 Length" "Meta Description 1 Pixel Width" "Meta Keyword 1" "Meta Keywords 1 Length" "H1-1" "H1-1 length" "H2-1" "H2-1 length" "H2-2" "H2-2 length" "Meta Robots 1" "Meta Refresh 1" "Canonical Link Element 1" "Size" "Word Count" "Level" "Inlinks" "Outlinks" "External Outlinks" "Hash" "Response Time" "Last Modified" "Redirect URI" "GA Sessions" "GA % New Sessions" "GA New Users" "GA Bounce Rate" "GA Page Views Per Sesssion" "GA Avg Session Duration" "GA Page Value" "GA Goal Conversion Rate All" "GA Goal Completions All" "GA Goal Value All" "Clicks" "Impressions" "CTR" "Position" "H1-2" "H1-2 length"
  • 9. Why R ? Scriptable ; big community ; runs on Mac / PC / Unix ; open source ; 7,500 packages ; documentation
  • 10. WheRe ? How ? https://www.cran.r-project.org/ Editors : Rgui, RStudio
  • 11. Using R : Step 1 Export All URLs
      CSV header : "request";"section";"active";"speed";"compliant";"depth";"inlinks"
      Packages : stringr, ggplot2, dplyr, readxl
  • 12. R Examples
      Crawl via Screaming Frog
      Classify URLs by : section, load time, number of inlinks
      Detect active pages : min. 1 visit per month
      Detect compliant pages : canonical not equal, meta noindex, bad HTTP status code
      Detect duplicate meta
  • 13. R : read files
      # Read xlsx file (read_excel() comes from the readxl package)
      library(readxl)
      urls <- read_excel("internal_html_blog.xlsx", sheet = 1, col_names = TRUE, skip = 1)
      # Read csv file ; check.names = FALSE keeps column names such as `GA Sessions` unchanged
      urls <- read.csv2("internal_html_blog.csv", sep = ";", header = TRUE, check.names = FALSE)
  • 14. Detect Active Pages
      # default
      urls_select$Active <- FALSE
      urls_select$Active[ which(urls_select$`GA Sessions` > 0) ] <- TRUE
      # factor
      urls_select$Active <- as.factor(urls_select$Active)
  • 15. Classify URLs by Section
      library(stringi)  # provides stri_detect_fixed()
      # conf.csv holds one URL pattern per line
      schemas <- read.csv("conf.csv", header = FALSE, col.names = "schema", stringsAsFactors = FALSE)
      urls_select$Cat <- "no match"
      # loop over the patterns in schemas$schema, not over the columns of the data frame
      for (j in seq_along(schemas$schema)) {
        urls_select$Cat[ which(stri_detect_fixed(urls_select$Address, schemas$schema[j])) ] <- schemas$schema[j]
      }
      conf.csv :
        /agenda/sorties-cinema/
        /agenda/parutions/
        /agenda/evenements/
        /agenda/programme-tv/
        /encyclopedie/
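The section classification above can also be tried without any package. The sketch below is only an illustration in base R: the sample addresses, the `patterns` vector and the helper name `classify_section` are hypothetical and not part of the original slides. As in the loop on the slide, a later pattern overrides an earlier one when both match.

```r
# Hypothetical sample data; the real input is the Screaming Frog export.
urls_select <- data.frame(
  Address = c("http://example.com/agenda/parutions/livre-1",
              "http://example.com/encyclopedie/chat",
              "http://example.com/contact"),
  stringsAsFactors = FALSE
)

# Patterns as they would appear in conf.csv, one per line.
patterns <- c("/agenda/parutions/", "/encyclopedie/")

# Hypothetical helper: returns the last matching pattern, or "no match".
classify_section <- function(address, patterns) {
  section <- rep("no match", length(address))
  for (p in patterns) {
    # fixed = TRUE treats the pattern as a literal string, not a regex
    hit <- grepl(p, address, fixed = TRUE)
    section[hit] <- p
  }
  section
}

urls_select$Cat <- classify_section(urls_select$Address, patterns)
print(urls_select)
```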
  • 16. Classify URLs By Load Time
      urls_select$Speed <- NA
      urls_select$Speed[ which(urls_select$`Response Time` < 0.501) ] <- "Fast"
      urls_select$Speed[ which(urls_select$`Response Time` >= 0.501 & urls_select$`Response Time` < 1.001) ] <- "Medium"
      urls_select$Speed[ which(urls_select$`Response Time` >= 1.001 & urls_select$`Response Time` < 2.001) ] <- "Slow"
      urls_select$Speed[ which(urls_select$`Response Time` >= 2.001) ] <- "Slowest"
      urls_select$Speed <- as.factor(urls_select$Speed)
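The four threshold tests above can be collapsed into one call to base R's `cut()`. This is a minimal sketch, assuming the same thresholds as the slide; the helper name `classify_speed` and the sample response times are made up for illustration. One difference from the slide: `cut()` orders the factor levels by threshold rather than alphabetically.

```r
# Hypothetical helper: bins response times (in seconds) into speed classes.
classify_speed <- function(response_time) {
  cut(response_time,
      breaks = c(-Inf, 0.501, 1.001, 2.001, Inf),
      labels = c("Fast", "Medium", "Slow", "Slowest"),
      right = FALSE)  # intervals are [lower, upper), matching the >= / < tests
}

# Made-up sample values, including a boundary value and a missing one.
response_time <- c(0.2, 0.501, 1.5, 3, NA)
speed <- classify_speed(response_time)
print(speed)
```

Missing response times stay `NA`, as they do with the `which()` version on the slide.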
  • 17. Classify URLs By Number of Inlinks
      urls_select$`Group Inlinks` <- "URLs with No Follow Inlinks"
      urls_select$`Group Inlinks`[ which(urls_select$`Inlinks` < 1) ] <- "URLs with No Follow Inlinks"
      urls_select$`Group Inlinks`[ which(urls_select$`Inlinks` == 1) ] <- "URLs with 1 Follow Inlink"
      urls_select$`Group Inlinks`[ which(urls_select$`Inlinks` > 1 & urls_select$`Inlinks` < 6) ] <- "URLs with 2 to 5 Follow Inlinks"
      urls_select$`Group Inlinks`[ which(urls_select$`Inlinks` >= 6 & urls_select$`Inlinks` < 11) ] <- "URLs with 6 to 10 Follow Inlinks"
      urls_select$`Group Inlinks`[ which(urls_select$`Inlinks` >= 11) ] <- "URLs with more than 10 Follow Inlinks"
      urls_select$`Group Inlinks` <- as.factor(urls_select$`Group Inlinks`)
  • 18. Detect Compliant Pages
      # A page is not compliant if any of these holds :
      #   - bad HTTP status code
      #   - canonical not equal to the address
      #   - meta noindex
      urls_select$Compliant <- TRUE
      urls_select$Compliant[ which(urls_select$`Status Code` != 200
                                   | urls_select$`Canonical Link Element 1` != urls_select$Address
                                   | urls_select$Status != "OK"
                                   | grepl("noindex", urls_select$`Meta Robots 1`)) ] <- FALSE
      urls_select$Compliant <- as.factor(urls_select$Compliant)
  • 19. Detect Duplicate Meta
      urls_select$`Status Title` <- 'Unique'
      urls_select$`Status Title`[ which(urls_select$`Title 1 Length` == 0) ] <- "No Set"
      urls_select$`Status Description` <- 'Unique'
      urls_select$`Status Description`[ which(urls_select$`Meta Description 1 Length` == 0) ] <- "No Set"
      urls_select$`Status H1` <- 'Unique'
      # the export names this column `H1-1 length` (lowercase "l")
      urls_select$`Status H1`[ which(urls_select$`H1-1 length` == 0) ] <- "No Set"
      urls_select$`Status Title`[ which(duplicated(urls_select$`Title 1`)) ] <- 'Duplicate'
      urls_select$`Status Description`[ which(duplicated(urls_select$`Meta Description 1`)) ] <- 'Duplicate'
      urls_select$`Status H1`[ which(duplicated(urls_select$`H1-1`)) ] <- 'Duplicate'
      urls_select$`Status Title` <- as.factor(urls_select$`Status Title`)
      urls_select$`Status Description` <- as.factor(urls_select$`Status Description`)
      urls_select$`Status H1` <- as.factor(urls_select$`Status H1`)
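One detail of the code above is worth knowing: `duplicated()` leaves the first copy of a repeated value unflagged, and an empty value that occurs twice ends up labelled 'Duplicate' rather than "No Set". The sketch below shows a possible variant that flags every copy and keeps empty values as "No Set". It is an alternative to the slide's logic, not the author's method; the helper name `flag_meta` and the sample titles are hypothetical.

```r
# Hypothetical helper: labels each value as Unique, Duplicate or No Set.
flag_meta <- function(value) {
  status <- rep("Unique", length(value))
  empty <- is.na(value) | value == ""
  # duplicated() alone skips the first copy; scanning from both ends catches every copy
  dup <- duplicated(value) | duplicated(value, fromLast = TRUE)
  status[dup & !empty] <- "Duplicate"
  status[empty] <- "No Set"
  factor(status, levels = c("Unique", "Duplicate", "No Set"))
}

# Made-up titles: "Home" appears twice, one title is empty, one is missing.
titles <- c("Home", "Recipes", "Home", "", NA, "About")
status <- flag_meta(titles)
print(data.frame(title = titles, status = status))
```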
  • 20. Generate CSV
      # Package dplyr : select() keeps the useful columns, mutate() strips the host from Address
      urls_light <- select(urls_select, Address, Cat, Active, Speed, Compliant, Level, Inlinks) %>%
        mutate(Address = gsub("http://moniste.fr", "", Address))
      # Edit colnames
      colnames(urls_light) <- c("request","section","active","speed","compliant","depth","inlinks")
      # Use write.csv2 : the data frame comes first, then the output file name
      write.csv2(urls_light, "file.csv", row.names = FALSE)
  • 21. R : ggplot2 command
      DATA : create the ggplot object and populate it with data (always a data frame)
        ggplot( mydata, aes( x=section, y=count, fill=active ))
      LAYERS : add layer(s)
        + geom_point()
      FACET : used for conditioning on variable(s)
        + facet_grid(~rescode)
  • 23. R Chart : Active Pages
      urls_level_active <- group_by(urls_select, Level, Active) %>%
        summarise(count = n()) %>%
        filter(Level < 12)
      # aes() sets the aesthetic mapping, geom_bar() the geometry
      p <- ggplot(urls_level_active, aes(x=Level, y=count, fill=Active) ) +
        geom_bar(stat = "identity", position = "stack") +
        scale_fill_manual(values=c("#e5e500", "#4DBD33")) +
        labs(x = "Depth", y ="Crawled URLs")
      # display
      print(p)
      # save in file
      ggsave(file = "chart.png")
  • 24. R Chart : GA Sessions
      urls_cat_gasessions <- aggregate( urls_select$`GA Sessions`,
                                        by=list(Cat=urls_select$Cat, urls_select$Compliant),
                                        FUN=sum, na.rm=TRUE)
      colnames(urls_cat_gasessions) <- c("Category","Compliant","GA Sessions")
      p <- ggplot(urls_cat_gasessions, aes(x=Category, y=`GA Sessions`, fill=Compliant)) +
        geom_bar(stat = "identity", position = "stack") +
        theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
        labs(x = "Section", y ="Sessions") +
        scale_fill_manual(values=c("#e5e500","#4DBD33"))
      # display
      print(p)
      # save in file
      ggsave(file = "chart.png")
  • 25. R Chart : Compliant
      urls_cat_compliant_statuscode <- group_by(urls_select, Cat, Compliant, `Status Code`) %>%
        summarise(count = n()) %>%
        # keep only the 200 and 301 responses
        filter(`Status Code` %in% c(200, 301))
      p <- ggplot(urls_cat_compliant_statuscode, aes(x=Cat, y=count, fill=Compliant) ) +
        geom_bar(stat = "identity", position = "stack") +
        theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
        facet_grid(`Status Code` ~ .) +
        labs(x = "Section", y ="Crawled URLs") +
        scale_fill_manual(values=c("#e5e500","#4DBD33"))
  • 26. R : SEO Cheat Sheet
      Package dplyr
        select() lets you rapidly zoom in on a useful subset of columns, using operations that usually only work on numeric variable positions
        mutate() adds new columns to a data frame or replaces existing ones
        filter() selects a subset of rows in a data frame
      Package ggplot2
        aes(), geom_*(), ggsave()
      Package readxl
        read_excel()
      Base R
        read.csv2(), write.csv2()
  • 27. ELK
  • 28. Architecture Hard to monitor and optimize host server performance
  • 32. PaasLogs
      164 nodes in the Elasticsearch cluster
      180 connected machines
      Between 100,000 and 300,000 logs processed per second
      12 billion logs pass through every day
      211 billion documents stored
      8 clicks and 3 copy/pastes to use it !
  • 36. PaasLogs : Streams The Streams are the recipients of your logs. When you send a log with the right stream token, it automatically arrives in your stream in Graylog, an awesome piece of software.
  • 37. PaasLogs : Dashboards The Dashboard is the global view of your logs. It is an efficient way to exploit your logs and to view global information such as metrics and trends about your data without being overwhelmed by log details.
  • 38. PaasLogs : Aliases The Aliases let you access your data directly from Kibana or through an Elasticsearch query. DON’T FORGET TO ENABLE KIBANA INDICES AND WRITE YOUR USER PASSWORD
  • 39. PaasLogs : Inputs The Inputs let you ask OVH to host your own dedicated collector, such as Logstash or Flowgger.
  • 40. PaasLogs : Network Configuration
  • 41. PaasLogs : Plugins Logstash
      OVHCOMMONAPACHELOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion_num:float})?|%{DATA:rawrequest})" %{NUMBER:response_int:int} (?:%{NUMBER:bytes_int:int}|-)
      OVHCOMBINEDAPACHELOG %{OVHCOMMONAPACHELOG} "%{NOTSPACE:referrer}" %{QS:agent}
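To see which fields such a pattern extracts, an Apache combined log line can be parsed offline in base R. The sketch below only approximates the grok pattern with a plain regular expression; the helper name `parse_combined_log`, the simplified field names and the sample log line are assumptions made for illustration, not part of the PaasLogs setup.

```r
# Hypothetical helper: splits one Apache combined log line into named fields.
parse_combined_log <- function(line) {
  pattern <- paste0(
    '^(\\S+) (\\S+) (\\S+) \\[([^]]+)\\] ',   # clientip ident auth [timestamp]
    '"(\\S+) (\\S+)(?: HTTP/([0-9.]+))?" ',   # "verb request HTTP/version"
    '(\\d{3}) (\\d+|-) ',                     # response code and bytes
    '"([^"]*)" "([^"]*)"$'                    # "referrer" "agent"
  )
  m <- regmatches(line, regexec(pattern, line, perl = TRUE))[[1]]
  if (length(m) == 0) return(NULL)            # the line does not match
  fields <- c("clientip", "ident", "auth", "timestamp", "verb", "request",
              "httpversion", "response", "bytes", "referrer", "agent")
  setNames(as.list(m[-1]), fields)
}

# Made-up log line for the example.
line <- paste0('66.249.66.1 - - [10/Jun/2016:10:00:00 +0200] ',
               '"GET /agenda/parutions/ HTTP/1.1" 200 5120 "-" ',
               '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"')
entry <- parse_combined_log(line)
print(entry[c("verb", "request", "response")])
```

The `request` field is the one that later links a log line to the crawl CSV, which uses the same column name.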
  • 42. PaasLogs : Config Logstash
      if [type] == "apache" {
        grok {
          match => [ "message", "%{OVHCOMBINEDAPACHELOG}" ]
          patterns_dir => "/opt/logstash/patterns"
        }
      }
      if [type] == "csv_infos" {
        csv {
          columns => ["request", "section", "active", "speed", "compliant", "depth", "inlinks"]
          separator => ";"
        }
      }
  • 43. How to send Logs to PaasLogs ?
  • 45. Filebeat : Install
      curl -L -O https://download.elastic.co/beats/filebeat/filebeat_1.2.1_amd64.deb
      sudo dpkg -i filebeat_1.2.1_amd64.deb
      Documentation : https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-installation.html
  • 46. Filebeat : Edit filebeat.yml
      filebeat:
        prospectors:
          -
            paths:
              - /home/ubuntu/lib/apache2/log/access.log
            input_type: log
            fields_under_root: true
            document_type: apache
          -
            paths:
              - /home/ubuntu/workspace/csv/crawled-urls-filebeat-*.csv
            input_type: csv
            fields_under_root: true
            document_type: csv_infos
      output:
        logstash:
          hosts: ["c002-5717e1b5d2ee5e00095cea38.in.laas.runabove.com:5044"]
          worker: 1
          tls:
            certificate_authorities: ["/home/ubuntu/workspace/certificat/key.crt"]
  • 47. Filebeat : Start
      Copy / paste the certificate into key.crt :
        -----BEGIN CERTIFICATE-----
        MIIDozCCAougAwIBAgIJALxR4fTZlzQMMA0GCSqGSIb3DQEBCwUAMGgxCzAJBgNVBAYTAkZSMQ
        8wDQYDVQQIDAZGcmFuY2UxDjAMBgNVBAcMBVBhcmlzMQwwCgYDVQQKDANPVkgxCzAJBgNVB
        AYTAkZSMR0wGwYDVQQDDBRpbi5sYWFzLnJ1bmFib3ZlLmNvbTAeFw0xNjAzMTAxNTEzMDNaFw0
        xNzAzMTAxNTEzMDNaMGgxCzAJBgNVBAYTAkZSMQ8wDQYDVQQIDAZGcmFuY2UxDjAMBgNVBA
        cMBVBhcmlzMQwwCgYDVQQKDANPVkgx
        -----END CERTIFICATE-----
      Start Filebeat :
        sudo /etc/init.d/filebeat start
      Stop Filebeat :
        sudo /etc/init.d/filebeat stop
  • 48. How to combine multiple sources ?
  • 49. PaasLogs : Plugins ES
      Description : copies fields from previous log events in Elasticsearch to current events
      if [type] == "apache" {
        elasticsearch {
          hosts => "laas.runabove.com"
          index => "logsDataSEO" # alias
          ssl => true
          query => 'type:csv_infos AND request: "%{[request]}"'
          fields => [["speed","speed"],["compliant","compliant"],
                     ["section","section"],["active","active"],
                     ["depth","depth"],["inlinks","inlinks"]]
        }
      }
      # TIP : fields => [[src,dest],[src,dest]]
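The enrichment step above amounts to a lookup: each Apache log event picks up the crawl fields of the CSV row with the same `request`. The same idea can be tried offline in R with `merge()`. This is only an analogy to what the plugin does at ingestion time; the two small data frames are made up for the example.

```r
# Made-up crawl data, shaped like the CSV generated earlier in the deck.
crawl <- data.frame(
  request = c("/agenda/parutions/", "/encyclopedie/"),
  section = c("/agenda/parutions/", "/encyclopedie/"),
  active  = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Made-up log hits; "/unknown/" was never crawled.
hits <- data.frame(
  request  = c("/agenda/parutions/", "/agenda/parutions/", "/unknown/"),
  response = c(200, 200, 404),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every log hit, even when no crawl row matches (left join)
enriched <- merge(hits, crawl, by = "request", all.x = TRUE)
print(enriched)
```

Hits on URLs that were never crawled keep `NA` in the crawl columns, which makes them easy to spot.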
  • 51. Kibana : Install
      Download Kibana 4.1
      • Download and unzip Kibana 4
      • Extract your archive
      • Open config/kibana.yml in an editor
      • Set the elasticsearch.url to point at your Elasticsearch instance
      • Run ./bin/kibana (or bin\kibana.bat on Windows)
      • Point your browser at http://yourhost.com:5601
  • 52. Kibana : Edit kibana.yml
      server.port: 8080
      server.host: "0.0.0.0"
      elasticsearch.url: "https://laas.runabove.com:9200"
      elasticsearch.preserveHost: true
      kibana.index: "ra-logs-33078"
      kibana.defaultAppId: "discover"
      elasticsearch.username: "ra-logs-33078"
      elasticsearch.password: "rHftest6APlolNcc6"
  • 53. Kibana : Line Chart Number of active pages crawled by Google over a period of time
  • 54. Kibana : Vertical Bar Chart
  • 55. Kibana : Pie Chart
  • 56. How to compare two periods ?
  • 57. Kibana : Use Date Range
  • 58. Final Architecture (diagram) : logs from IIS, Apache, Nginx and HAProxy (soft real time or old logs) → Filebeat → PaasLogs → Kibana
  • 59. Test yourself
      Use Screaming Frog Spider Tool : www.screamingfrog.co.uk
      Learn R : www.datacamp.com, www.data-seo.com, www.moise-le-geek.fr/push-your-hands-in-the-r-introduction/
      Test PaasLogs : www.runabove.com
      Install Kibana : www.elastic.co/downloads/kibana
  • 60. TODO List
      - Create a GitHub repository with all source code
      - Add a Logstash plugin to do reverse DNS lookups
      - Schedule a crawl from the command line
      - Upload the Screaming Frog file to a web server
  • 61. Thank you Keep in touch June 10th, 2016 @vincentterrasi Vincent Terrasi