3. @dsottimano www.smxl.it #SMXL19
Working knowledge of HTML, CSS and JavaScript
Interpreting data from tools like SEMRush, Ahrefs, Screaming Frog, etc.
Strong grasp of Microsoft Outlook, Excel, PowerPoint, and Word
Yep. Always. It's 2019, come on.
13.
How do you parse the URL path here?
https://www.lastampa.it/sport/calcio/2019/10/26/news/pareggia-anche-l-inter-fallito-il-sorpasso-sulla-juve-1.37793174
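Outside the spreadsheet, the same question can be answered with the standard WHATWG URL API — a minimal sketch, assuming Node rather than the Sheets formulas the talk uses:

```javascript
// Parse the La Stampa URL from the slide with Node's built-in URL class
// and split its path into segments.
const article =
  "https://www.lastampa.it/sport/calcio/2019/10/26/news/" +
  "pareggia-anche-l-inter-fallito-il-sorpasso-sulla-juve-1.37793174";

const url = new URL(article);
// Split the path on "/" and drop the empty leading segment.
const segments = url.pathname.split("/").filter(Boolean);

console.log(url.hostname); // → "www.lastampa.it"
console.log(segments[0]);  // → "sport" (top-level section)
console.log(segments[1]);  // → "calcio" (sub-section)
```

The date and slug fall out of the same array (`segments[2]` is the year, the last segment is the article slug), which is what makes path parsing useful for bulk URL analysis.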
37.
I need to scrape Google search results
To:
perform competitive analysis
check if a page is indexed
check page ranking
45.
You’ll need an API key first.
https://proxycrawl.com
ProxyCrawl is a great API-based crawler with several options. It’s free for 1,000 requests a month.
Note: I do not work for ProxyCrawl and do not receive any compensation from them.
48.
Then we can scrape using IMPORTXML like this:
=IMPORTXML("https://api.proxycrawl.com/?token=123&url=https://lastampa.it", "//h1")
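What the formula does is build a ProxyCrawl request URL and let IMPORTXML extract `//h1` from the returned HTML. A small sketch of the URL-building step — the token is a placeholder and the helper name is mine, not ProxyCrawl's:

```javascript
// Build the ProxyCrawl request URL the IMPORTXML formula sends.
// "123" stands in for a real API token.
function proxycrawlUrl(token, targetUrl) {
  // Encoding the target URL keeps any query string it carries from
  // being read as extra parameters of the ProxyCrawl endpoint.
  return (
    "https://api.proxycrawl.com/?token=" + token +
    "&url=" + encodeURIComponent(targetUrl)
  );
}

console.log(proxycrawlUrl("123", "https://lastampa.it"));
// → "https://api.proxycrawl.com/?token=123&url=https%3A%2F%2Flastampa.it"
```

The slide's formula passes the target URL unencoded, which works for a bare domain; encoding is the safer habit once target URLs have their own parameters.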
50.
But that isn’t smart.
If we store the page, we can make mistakes in our code without paying for extra requests.
Luckily, proxycrawl.com makes this very easy.
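The "store the page" idea is just caching: fetch once, reuse the stored copy while you debug. A minimal sketch with a stub fetcher — in the talk the stored copy lives on ProxyCrawl's side, not in a Map:

```javascript
// Wrap any fetcher in a cache so repeated calls for the same URL never
// hit the paid API twice.
function cachedFetcher(fetchFn) {
  const cache = new Map();
  return function (url) {
    if (!cache.has(url)) {
      cache.set(url, fetchFn(url)); // only the first call pays
    }
    return cache.get(url);
  };
}

// Stub fetcher that counts how many real requests were made.
let requests = 0;
const fetchPage = cachedFetcher(function (url) {
  requests += 1;
  return "<html>page for " + url + "</html>";
});

fetchPage("https://lastampa.it");
fetchPage("https://lastampa.it"); // served from cache
console.log(requests); // → 1
```

Every XPath mistake you make against the cached copy is a request you didn't pay for.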
56.
Using the =GOOGLE_SEARCH() function, we can run a site:thetrainline.com Milan to Turin by train query to find the next most relevant page.
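The trick is the site: operator, which restricts results to one domain so the top hit is that domain's most relevant page for the phrase. A sketch of the query being built (the helper name is mine; =GOOGLE_SEARCH() comes from the talk's Apps Script code):

```javascript
// Build a domain-restricted Google query for a target phrase.
function siteQuery(domain, phrase) {
  return "site:" + domain + " " + phrase;
}

console.log(siteQuery("thetrainline.com", "Milan to Turin by train"));
// → "site:thetrainline.com Milan to Turin by train"
```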
62.
Step 1 - Use Wayback Machine to save pages
To save a page in the archive, simply prepend https://web.archive.org/save/ to the URL.
We’re going to automate this.
Spreadsheet: bit.do/smx-milan
Code: bit.do/smxl-milan-code
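The save trick is pure string concatenation: requesting the prefixed URL triggers a capture. A sketch of the URL-building half — the scheduled fetching lives in the talk's Apps Script code, not here:

```javascript
// Build the Wayback Machine save URL by prepending the capture endpoint.
function waybackSaveUrl(url) {
  return "https://web.archive.org/save/" + url;
}

console.log(waybackSaveUrl("https://www.smxl.it"));
// → "https://web.archive.org/save/https://www.smxl.it"
```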
63.
Step 2 - Set up automated captures
Look for “WAYBACK_SAVE” in the code and change the URLs.
Spreadsheet: bit.do/smx-milan
Code: bit.do/smxl-milan-code
64.
Step 2.1 - Add email
Change the variable emailAddress to your email address if you want email updates.
Example: var emailAddress = "info@example.com";
72.
Step 1 - Training data
3 sites’ organic keyword data from Semrush.com:
Nytimes.com = informational
Yelp.com = local
Amazon.com = transactional
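Each exemplar site stands in for one search intent, so every keyword exported for that site becomes one labelled training row. A sketch of that labelling step — the example keywords are made up; the real rows come from the Semrush export:

```javascript
// Map each exemplar domain to the intent it represents.
const intentBySite = {
  "nytimes.com": "informational",
  "yelp.com": "local",
  "amazon.com": "transactional",
};

// Turn one site's keyword export into labelled training rows.
function labelKeywords(site, keywords) {
  return keywords.map(function (kw) {
    return { keyword: kw, intent: intentBySite[site] };
  });
}

const rows = labelKeywords("yelp.com", ["pizza near me", "best barber milan"]);
console.log(rows[0]); // → { keyword: "pizza near me", intent: "local" }
```

The assumption worth stating out loud: the classifier is only as good as the claim that each site's keyword set is dominated by a single intent.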