Channel: GSA Email Spider — GSA SEO Forum

[GSA Email Spider] How to exclude useless directories?

When scraping emails, a site will usually have something like:

domain.com/contacts or domain.com/people/contact: one specific directory that holds the emails, while pretty much everything else is useless. I don't need to scrape domain.com/shop or domain.com/products, but I waste dozens of hours on them.

On large sites, those directories contain thousands of useless pages. Would it be possible to do something like this:

- If a lot of emails are found in one specific directory, ignore the other directories. For example, with depth set to level 3, the crawl would go to level 3 in that directory only.
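
To make the idea concrete, here is a rough sketch of that heuristic in Python. It is not a GSA Email Spider feature or setting, just a standalone illustration. The function names (`top_directory`, `pick_email_directories`, `should_crawl`) and the thresholds `min_emails` and `min_share` are made up for this example, and it assumes you already have, for each site, a small sample of crawled pages with the number of emails found on each. URLs are assumed to include the scheme (https://...).

```python
from collections import Counter
from urllib.parse import urlparse


def top_directory(url: str) -> str:
    """Return the first path segment of a URL, e.g. '/contacts' ('/' for the root)."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return "/" + segments[0] if segments else "/"


def pick_email_directories(page_hits, min_emails=3, min_share=0.5):
    """Pick the top-level directories that hold most of the emails.

    page_hits: iterable of (url, emails_found_on_that_page) from a shallow
    sample crawl of ONE site. Both thresholds are arbitrary assumptions.
    """
    counts = Counter()
    for url, found in page_hits:
        counts[top_directory(url)] += found
    total = sum(counts.values())
    if total == 0:
        return set()  # nothing found yet, so do not restrict the crawl
    return {
        directory
        for directory, found in counts.items()
        if found >= min_emails and found / total >= min_share
    }


def should_crawl(url: str, allowed_dirs) -> bool:
    """Crawl everything until a directory has been picked, then only that one."""
    return not allowed_dirs or top_directory(url) in allowed_dirs


# Hypothetical sample data for one site
sample = [
    ("https://domain.com/contacts/team", 5),
    ("https://domain.com/contacts/sales", 4),
    ("https://domain.com/shop/item1", 0),
    ("https://domain.com/products/a", 1),
]
allowed = pick_email_directories(sample)
```

With this sample, only `/contacts` passes both thresholds, so a deeper crawl could skip `/shop` and `/products` by calling `should_crawl` before fetching each URL. Because it runs per site without manual input, the same check could be repeated across a large URL list.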

In other words, is there a smarter, more efficient way to detect which directories contain contact details and ignore all the rest?

Thanks!

Edit: I normally have 1,000+ URLs, so I can't look up each site manually and enter the directory with the contacts. I would like to auto-detect it somehow.

