Geocoding large lists of locations is a frequent subject on the forums and in my inbox. I think there is yet another “Google API Key Over Query Limit” post in the main forums right now, in fact. I’ll touch briefly on what that message means and then delve into some new settings in SLP4 to address some comments from the Beta Test Group.
First let’s start by addressing the most common misconception about the geocoding process. Your free Google Maps API key DOES NOTHING to help with the Over Query Limit (OQL) issue. The API Key only has an impact if you are using the paid Enterprise License API Key from Google. If you did not pay $17,000 per year (or more) for your API Key then you can ignore that setting.
Second, why are you getting Over Query Limit when you only have 400 locations and Google says they allow up to 2500 requests per day? The most likely culprit is that the server where you are hosting your Store Locator Plus installation on WordPress is NOT assigned a dedicated IP address. Google’s primary (but not only) method of tracking API requests to geocode addresses is based on your IP address. If you did not specifically request and pay for a static IP address that is ONLY USED BY YOUR SITE then you are most likely sharing an IP address with hundreds of other websites. Cloud hosting, shared virtual servers, virtual private servers, and a myriad of other “pretend to be dedicated” hosting solutions are all using IP sharing. Static IP addresses dedicated to a single site are rare these days and with the nearly-depleted IPv4 address space this issue is not getting better. The short answer: between you and your 578 other “nearby friends”, all 2500 requests have been used up on your server for the rolling 24-hour day that Google is tracking.
Want to see this for yourself? Load up 400 locations, realize there is an issue and delete them all, re-load the list. You’ve just used up 800 units of your 2500 allocation. It only takes a few sites doing this, or your own re-loading of the list a few times, to use up your Google allocation. I know, I did it with a 1200 item list that I loaded twice and was locked out for 48 hours on my test server WITH a dedicated IP address.
SLP4 Over Query Limit Improvements
First a few new settings that vastly improve the “hit ratio” when loading a large CSV file in Pro Pack. In SLP4 there are a few changes that have been discussed in prior posts. First of all, the geocoding process is smarter about what to do once the first “Over Query Limit” response has come back. Google does not differentiate between “you are flooding us” (too many requests per second) and “you are done for the day” when sending back messages. Thus SLP4 does things a bit differently based on Google API V3 best practices:
1) Start by sending no more than one request every 1/10th of a second. (SLP3 was always waiting 0.5 seconds on EVERY request, so SLP4 can be much faster).
2) If an OQL message comes back from Google start by waiting 2 seconds until the next request. This is Google’s suggested wait time for the first OQL message. After 2 seconds re-try the same address up to N times.
What is “N times”? In SLP4 the same address will be tried up to 3 times by default (default for SLP3 was do not retry). You can set this to any number from none to 10.
3) If the same address gets another OQL response, bump the wait time by 1 full second then try again. Keep doing this up to the “N times” limit.
4) If that address reaches N times, go to the next address and start with however many seconds delay you reached.
5) As soon as an address does NOT get the OQL message, reset the system so addresses start going at 1/10th of a second intervals and the next OQL response starts over at a 2-second starting wait period.
This is the basic algorithm that will prevail until you’ve gone through all of the locations on the list. Yes, I know it can use more refinement and I have some ideas like adding an “after n OQL addresses in a row, stop trying to geocode and just load the list” or “stop trying to geocode and stop loading the list” or “abort the list” option. I didn’t have enough time to add this into the product if SLP4 was ever going to ship before 2014 rolls around.
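The five steps above can be sketched roughly as follows. This is a minimal Python illustration and not the actual SLP4 code: the function name `geocode_all`, its parameters, and the injected `geocode` callable are all made up for the example. The only thing borrowed from Google is the `OVER_QUERY_LIMIT` status string. The `max_retry_delay` cap is my stand-in for the Maximum Retry Delay setting.

```python
import time
from typing import Callable, Iterable, List, Optional, Tuple

Coords = Tuple[float, float]
OQL = "OVER_QUERY_LIMIT"  # status Google sends back when it is throttling you


def geocode_all(
    addresses: Iterable[str],
    geocode: Callable[[str], Tuple[str, Optional[Coords]]],  # hypothetical lookup
    retries: int = 3,               # "N times": extra attempts per address
    max_retry_delay: float = 5.0,   # assumed cap on the OQL wait, in seconds
    base_interval: float = 0.1,     # normal pace: one request per 1/10th second
    first_oql_wait: float = 2.0,    # wait after the first OQL response
    oql_step: float = 1.0,          # added after every further OQL response
    sleep: Callable[[float], None] = time.sleep,
) -> List[Optional[Coords]]:
    """Return one (lat, lng) pair, or None, per address."""
    results: List[Optional[Coords]] = []
    oql_delay = 0.0  # 0 means "not currently being throttled"
    for address in addresses:
        coords: Optional[Coords] = None
        attempts_left = retries + 1  # the first try plus the retries
        while attempts_left > 0:
            # Steps 1 and 4: normal pace, or whatever delay we had reached.
            sleep(oql_delay if oql_delay > 0 else base_interval)
            status, coords = geocode(address)
            attempts_left -= 1
            if status != OQL:
                oql_delay = 0.0  # step 5: reset to the fast pace
                break
            coords = None
            # Step 2 (first OQL) or step 3 (repeat OQL), limited by the cap.
            wanted = first_oql_wait if oql_delay == 0 else oql_delay + oql_step
            oql_delay = min(wanted, max_retry_delay)
        results.append(coords)
    return results
```

Passing `sleep` in as a parameter is only there so the pacing can be checked without actually waiting; in real use you would leave it as `time.sleep`.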
I’ve touched on the Google API Key setting (useless unless you paid a $17k Google license fee) and the Geocoding Retries setting. What is this new “Maximum Retry Delay” setting? If you look back at item 3 above you will note that the retry delay goes up by one full second every time a geocoding request fails. This is “remembered” across all attempts. In other words if you have 10 locations that failed geocoding and have retries set to 10, you could potentially be waiting 100 seconds BETWEEN ATTEMPTS. That is nearly 2 minutes between each location lookup, or up to 20 minutes per location as you get toward the tail end of your list. Thus the “maximum retry delay” setting. In reality this should stay something closer to 5 seconds and in a future release I will likely make this a drop-down and force users to pick a number from 1 to 10 seconds, but again time was not my friend when it came to adding features like this to the admin settings.
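To put numbers on that, here is a small back-of-the-envelope calculation in Python. The function `retry_delay` is purely illustrative (it is not part of SLP4); it assumes the 2-second first wait and the 1-second bump described above, and treats 10 failed locations at 10 retries each as roughly 100 OQL responses in a row.

```python
from typing import Optional


def retry_delay(
    oql_count: int,
    max_retry_delay: Optional[float] = None,  # None means "no cap"
    first_wait: float = 2.0,
    step: float = 1.0,
) -> float:
    """Seconds to wait after `oql_count` consecutive OQL responses."""
    if oql_count <= 0:
        return 0.0
    delay = first_wait + step * (oql_count - 1)
    if max_retry_delay is None:
        return delay
    return min(delay, max_retry_delay)


uncapped = retry_delay(10 * 10)                     # no maximum set
capped = retry_delay(10 * 10, max_retry_delay=5.0)  # capped at 5 seconds
print(uncapped, capped)
```

Without a cap the wait keeps climbing with every failure; with the cap it flattens out at whatever maximum you picked.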
With SLP4 the new 1/10th-of-a-second throttle between each new location will mean loading up hundreds of locations should be far faster than SLP3. When you do hit that throttle, SLP4 is smarter about what to do, waiting the recommended 2 seconds instead of another simple half-second delay between requests. With SLP4, I’ve loaded lists of 1400 locations in about 10 minutes that used to take 30+ minutes with SLP3. Turns out when you hit the first OQL message from Google they treat you nicer if you don’t ask again within a half-second.
The 1/10th of a second initial rate was based on extensive research on how fast you can hit the Google servers before they tell you “slow down there you’re coming at us too fast”. The initial 2-second and follow-on 1-second delays are also based on Google recommendations, as of today at least, for throttle rates when you do reach the Over Query Limit warning. It has made a BIG difference in my test cases and should help most customers with larger location lists.
Mitigating The Problem
How can you get locations geocoded when your site is on a busy shared server? If you have 2500+ locations? If you’ve had to reload locations more than once? Here are some tips and SLP4 tools that will help:
1) Pro Pack import allows for the latitude/longitude to be part of the data set. Anything with a lat/long will NOT be processed through the geocoding system.
2) Pro Pack v4 has an export feature that will get all of your location data INCLUDING the lat/long back out to a CSV file. Instead of loading in raw data, try getting any existing lat/long data into the spreadsheet you are about to load.
3) Pro Pack v4 has an option to turn OFF the geocoding when loading your CSV file. This will allow you to self-throttle your locations. Load thousands of locations into the Store Locator Plus interface, then use the “Show Uncoded” filter to see just those locations that need geocoding. Set your page length to something like 100 or 500 locations, click Check All to select those 100 or 500 locations and choose “Geocode” from bulk actions. This will send a 100 or 500 location list to Google for processing rather than geocoding 500 or so locations from a 7000 location file that is waiting 5-seconds between locations when you’ve hit the Google limit for the day.
4) Use a third party service like Texas A&M’s geocoding service, which is provided at no cost. They even have a bulk CSV processor you can use and you can buy “longer list processing” for far less than a Google license. Load up your CSV import for Store Locator Plus with their lat/long information. And yes, I have thought of ways to add a “use Texas A&M when Google hits OQL” or other options for a future Pro Pack release.
I know that loading thousands of locations into Store Locator Plus and getting them all geocoded can be a challenge. I’ve been in your shoes and continually work toward improving the process while addressing the hundreds of other feature requests and bug fixes that come in every month. Hopefully you’ll find SLP4 is a step in the right direction.
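If you like to prepare your spreadsheet with a script before importing, the first three tips boil down to: keep any coordinates you already have, and work through the rest in small batches. Here is a rough Python sketch of that idea. The column names `latitude` and `longitude` and the helper names are placeholders I made up for the example, so swap in whatever headers your own CSV uses.

```python
import csv
import io
from typing import Dict, Iterator, List, Tuple

Row = Dict[str, str]


def split_by_coordinates(
    rows: List[Row],
    lat_col: str = "latitude",   # placeholder header name
    lng_col: str = "longitude",  # placeholder header name
) -> Tuple[List[Row], List[Row]]:
    """Separate rows that already have a lat/long from rows that need geocoding."""
    coded: List[Row] = []
    uncoded: List[Row] = []
    for row in rows:
        lat = (row.get(lat_col) or "").strip()
        lng = (row.get(lng_col) or "").strip()
        (coded if lat and lng else uncoded).append(row)
    return coded, uncoded


def batches(rows: List[Row], size: int = 100) -> Iterator[List[Row]]:
    """Yield the rows in chunks of at most `size`, e.g. 100 or 500 at a time."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# Made-up sample data, only to show the calls.
SAMPLE_CSV = """name,address,latitude,longitude
Store A,123 Main St,32.7765,-79.9311
Store B,456 Oak Ave,,
Store C,789 Pine Rd,,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
coded, uncoded = split_by_coordinates(rows)
print(len(coded), "already have coordinates;", len(uncoded), "still need geocoding")
for batch in batches(uncoded, size=2):
    print([row["name"] for row in batch])
```

Rows in the first group can be imported as-is since they skip geocoding entirely; the second group is what you would feed to Google (or another service) a batch at a time.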