Posted by : Unknown
Friday, 24 May 2013
In the Goblin
- Use the Goblin's "Auto-Select" Judges
- If you need proxies for scraping SERPs, check the "Perform Google Verification" checkbox.
- Choose only Elite or Anonymous proxies.
- Activate your Blacklists to get rid of CoDeeN & PlanetLab proxies.
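The two filtering rules above (keep only elite or anonymous proxies, drop anything on a blacklist) can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Goblin does it internally: the `Proxy` record, the `keep_proxy` helper and the blacklist format (CIDR ranges) are all assumptions, and the addresses are reserved documentation ranges standing in for real CoDeeN/PlanetLab hosts.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Proxy:
    host: str        # IPv4 address of the proxy
    port: int
    anonymity: str   # "elite", "anonymous" or "transparent" (assumed labels)


def keep_proxy(proxy, blacklist):
    """Return True if the proxy is elite/anonymous and not inside a blacklisted network."""
    if proxy.anonymity not in {"elite", "anonymous"}:
        return False
    address = ipaddress.ip_address(proxy.host)
    return not any(address in network for network in blacklist)


# Hypothetical blacklist: a documentation range stands in for real blacklisted ranges.
BLACKLIST = [ipaddress.ip_network("192.0.2.0/24")]

candidates = [
    Proxy("192.0.2.10", 3128, "elite"),          # inside the blacklisted range
    Proxy("198.51.100.7", 8080, "transparent"),  # wrong anonymity level
    Proxy("203.0.113.5", 80, "anonymous"),       # passes both rules
]

kept = [p for p in candidates if keep_proxy(p, BLACKLIST)]
print(kept)
```

Only the last candidate survives: the first is blacklisted and the second is transparent.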
Scraper Settings
These settings control how the Goblin downloads/gathers new proxies
- Read Timeout - Maximum time, in seconds, before the Goblin disconnects from a proxy source after connecting.
- Connect Timeout - Maximum time to try to connect to a proxy source. If this time is exceeded, the Goblin skips that source and moves on to the next.
- Max Simultaneous Connections - This is the number of proxy sources the Goblin will try to contact simultaneously.
- If you are running XP, patching your TCPIP.SYS will allow you to use more simultaneous connections. Simply put, the more simultaneous connections your computer can handle, the faster the Goblin will run.
- Bypass Built-In Url Sources - Checking this will mean the Goblin will not use the internal proxy sources to find proxies. Instead it will try to scrape proxies from the
section.
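To make the difference between the three settings concrete, here is a rough Python sketch of a downloader with a separate connect timeout, read timeout and connection limit. It is an assumption-laden illustration, not the Goblin's code: `fetch_source`, `fetch_all` and the default values are made up for the example, it only speaks plain HTTP, and the read timeout here applies to each individual read rather than to the whole download.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

CONNECT_TIMEOUT = 10   # seconds allowed to establish the connection (assumed value)
READ_TIMEOUT = 30      # seconds allowed for each read after connecting (assumed value)
MAX_CONNECTIONS = 25   # proxy sources contacted at the same time (assumed value)


def fetch_source(host, port=80, path="/"):
    """Download one proxy source over plain HTTP; return None if it times out or fails."""
    try:
        # The connect timeout only covers establishing the TCP connection.
        with socket.create_connection((host, port), timeout=CONNECT_TIMEOUT) as sock:
            # From here on, the read timeout applies to every send/recv call.
            sock.settimeout(READ_TIMEOUT)
            request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
            sock.sendall(request.encode("ascii"))
            chunks = []
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
            return b"".join(chunks)
    except OSError:
        # Covers timeouts, refused connections and DNS failures: skip this source.
        return None


def fetch_all(hosts, fetch=fetch_source, max_connections=MAX_CONNECTIONS):
    """Contact up to max_connections sources at once; keep only the ones that answered."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        results = pool.map(fetch, hosts)
        return {host: body for host, body in zip(hosts, results) if body is not None}
```

A slow or dead source costs at most the connect timeout before it is skipped, and raising the connection limit lets more sources be tried in parallel, which is why the number your machine can handle matters so much for speed.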
You can easily get better proxies, faster, by applying the following settings:
-----------------------------------------------------------------------
Step #1:
Bypass the built-in URLs and use only the Premium lists.
This list is much smaller, about 6000. But the proxies have been
pre-filtered by me and are updated every 6 hours, so you'll be able to get at least a few
hundred live elite proxies from that list.
Step #2.
Enable preservation mode.
This way only your first run will be slow. The subsequent runs will just
filter the last run's results so it'll be much faster.
Step #3.
Reduce your max connections to 25. (Judge)
-----------------------------------------------------------------------
Just by applying the above settings you can cut down the scraping time by hours. If you've got any tips of your own, do share them here.
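The reason preservation mode helps can be shown with a small Python sketch. This is a guess at the idea based on the description above (later runs only re-filter the last run's results), not the Goblin's actual logic; `run_check` and the `is_alive` callback are hypothetical names, and the addresses are placeholders.

```python
def run_check(full_list, previous_live, is_alive):
    """Recheck last run's survivors if there are any; otherwise check the full list."""
    candidates = previous_live if previous_live else full_list
    return [proxy for proxy in candidates if is_alive(proxy)]


full_list = ["203.0.113.1:80", "203.0.113.2:8080", "203.0.113.3:3128"]
alive_now = {"203.0.113.1:80", "203.0.113.3:3128"}   # stand-in for real proxy tests

# First run: nothing preserved yet, so every proxy in the list is tested (slow).
first_run = run_check(full_list, [], lambda p: p in alive_now)

# One proxy dies between runs.
alive_now.discard("203.0.113.3:3128")

# Second run: only the two survivors from the first run are tested (fast).
second_run = run_check(full_list, first_run, lambda p: p in alive_now)

print(first_run)
print(second_run)
```

The second run touches two proxies instead of three here. With a list of thousands where only a few hundred are live, the saving is much bigger.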
In ScrapeBox
- Use a low value for the Max Connections for the Proxy Harvester
Menu >> Settings >> Adjust Maximum Connections >> Proxy Harvester : Set to max 5
- Change the proxy harvester's timeout settings to a high value. This might make SB run slightly slower, but it will make sure all working proxies are used.
Menu >> Settings >> Adjust Timeout Settings >> Proxy Harvester Timeout : Set to above 50 seconds
- Use the New Proxy Harvester.
- And finally, when testing the proxies, skip the Google test if you're only going to use the proxies for posting and not scraping.
Using the above settings, you should see a marked improvement in the number of valid proxies. :) If you've got some tips & tricks of your own, do share them.
Important:
At least 95% of the proxies the Goblin scrapes WILL PASS in ScrapeBox. If your results are low and you get many "failed" messages in ScrapeBox, IT IS NOT NORMAL! Contact our support team and we'll help you tune your results.