This weekend I have been processing a large 200,000 location data file for a Store Locator Plus customer. This is one of the larger files I have processed on my test system and it is the first file over 60,000 locations I’ve processed since Store Locator Plus 4.2 and WordPress 4.x were released. This large file processing and the geocoding it requires are taxing several systems in the Store Locator Plus hierarchy. WordPress, the Google geocoding API calls, and the locator are all showing their weak spots with this volume of data processing. They can all handle it to some degree, but maximizing efficiency is the key.
The temporary solution to most of the issues is to increase memory and process limits. These are some of the key findings, as posted on the CSV Import documentation pages for Store Locator Plus:
Check your php.ini post_max_size setting if doing a direct file import versus a cron URL based import. post_max_size is typically set to 8M (8 MiB) on most servers. This is usually enough for around 25,000 locations, but it depends on how long your descriptions are and how many data fields you have filled out. SLP 4.2.41 will warn you if you try to upload a file larger than your post_max_size limit.
Check your php.ini memory_limit setting and make sure it is large enough to handle the WordPress overhead plus twice the size of your CSV file. The WordPress database interface and the CSV file processing will consume lots of memory. The more plugins, widgets, and advanced theme features you have, the more memory WordPress will use and the more PHP memory will leak over time. A setting of 256M is enough for approximately 15,000 locations.
Check your wp-config WP_MEMORY_LIMIT. You may need to add this define to wp-config.php: define('WP_MEMORY_LIMIT', '256M'). The value needs to be equal to or less than the php.ini memory_limit. It is the WordPress-specific memory limit and works alongside the php.ini memory_limit.
Check your wp-config WP_MAX_MEMORY_LIMIT. You may need to add this define to wp-config.php: define('WP_MAX_MEMORY_LIMIT', '256M'). This is the memory limit for the WordPress admin interface and works like WP_MEMORY_LIMIT for admin pages.
Set Duplicates Handling to Add especially if you know you do not have duplicate locations in your data. SLP 4.2.41 further improves the performance when using ‘add’ mode by eliminating extra data reads from the database.
Set Server-To-Server speed to Fast under the General Settings tab unless you are on a shared host or see a large number of uncoded (not-yet-geocoded) locations after an import.
Set the PHP Time Limit to 0 (unlimited) under the General Settings tab. For hosting providers that allow your web apps to change this, the unlimited value will let the import run to completion.
Keep in mind Google limits you to 2500 latitude/longitude (geocoding) lookups per 24 hours per server IP address. If you are on a shared host you share that limit with all other sites on that host.
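The php.ini values in the checklist above can be verified from the command line before you start a big import. Here is a minimal sketch; the awk parse assumes simple `key = value` lines, and the sample file below stands in for your real /etc/php.ini:

```shell
# Fabricated stand-in for /etc/php.ini -- point awk at the real file instead.
cat > /tmp/php.ini.sample <<'EOF'
post_max_size = 8M
memory_limit = 256M
EOF
# Print the two settings that matter most for large CSV imports.
awk -F' *= *' '/^(post_max_size|memory_limit)/ { print $1 "=" $2 }' /tmp/php.ini.sample
```

If post_max_size comes back at the 8M default, bump it before attempting a direct file import of anything near the 25,000 location mark.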
However, even with all of these settings tweaked to fairly high values for my VirtualBox development system running on a MacBook Pro Retina host, the 4GB of RAM allocated to WordPress still is not enough. The system eventually runs out of memory when the file gets close to the 45,000 location mark. Luckily the “skip duplicate addresses” option allows the process to continue. The “out of memory” error still rears its ugly head in the wpdb WordPress database engine and is a problem for handling larger files.
Enter Xdebug and memory profiling. Somewhere buried in the Store Locator Plus code, WordPress code, PHP MySQL interface, or PHP core engine there is a memory leak. With a complex application environment finding the leak is going to be a monumental task. It may not be something I can fix, but if I can mitigate the memory usage when processing large files that will help enterprise-class sites use Store Locator Plus with confidence.
Getting Xdebug On CentOS 7
If you follow my blog posts on development you will know that I run a self-contained WordPress development environment. The system uses Vagrant to fire up a VirtualBox guest that runs CentOS 7 with GUI tools along with a full WordPress install including my plugin code. This gives me a 2GB “box file” that I can ship around and have my full self-contained development environment on any system capable of running VirtualBox. Here is how I get Xdebug connected to my local Apache server running WordPress.
Install xdebug via yum:
# sudo yum install php-pecl-xdebug.x86_64
Turn on xdebug in the php.ini file
# find / -name xdebug.so
/usr/lib64/php/modules/xdebug.so
# sudo vim /etc/php.ini
zend_extension="/usr/lib64/php/modules/xdebug.so"
Check if xdebug is installed:
# php --version
PHP 5.4.16 ...
... with Xdebug v2.2.7 ...
Enable some xdebug features by editing php.ini again.
# sudo vim /etc/php.ini
xdebug.default_enable=1           ; turn on xdebug any time a PHP page loads on this local server
xdebug.idekey="PHPSTORM"          ; in case I turn on the automated listener for built-in PhpStorm debugging/tracing
xdebug.profiler_enable=1          ; turn on the profiler, which creates cachegrind files for stack trace/CPU execution analysis
xdebug.profiler_enable_trigger=1  ; enable a cookie "hook" so third-party browser plugins can toggle the profiler with a bookmark link
xdebug.profiler_output_dir="/var/www/xdebug"  ; make sure this directory is writable by apache and readable by your local user
xdebug.auto_trace=1               ; when any page loads, enable the trace output for capturing memory data
xdebug.show_mem_delta=1           ; this is what tells trace to record the memory consumption change on each function call
xdebug.trace_output_dir="/var/www/xdebug"     ; same idea as the profiler output, this is where trace txt files go
Restart the web server to get the php.ini settings in effect:
# sudo service httpd restart
At this point I can now open any WordPress page including the admin pages. Shortly after the page has rendered the web server will finish the processing through xdebug and a trace* file will appear in /var/www/xdebug. I can now see the stack trace of the functions that were called within WordPress with the memory consumption at each call. This is the start of tracking down which processes are eating up RAM while loading a large CSV file without adding thousands of debugging output lines in the web app.
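Once trace files start appearing, a short pipeline can surface the hungriest calls without reading the whole file. This is a sketch: the sample trace below is fabricated, and the parsing assumes the xdebug 2.x human-readable trace layout where show_mem_delta prints positive deltas as +N:

```shell
# Fabricated mini-trace in the xdebug 2.x human-readable style (with memory deltas).
cat > /tmp/trace.sample.xt <<'EOF'
TRACE START [2015-03-01 12:00:00]
    0.0005     345344            -> {main}() /var/www/index.php:0
    0.0102     512000 +166656      -> wpdb->query() /var/www/wp-includes/wp-db.php:1200
    0.0150     520192   +8192        -> wp_cache_set() /var/www/wp-includes/cache.php:110
    0.0200     508192  -12000        -> wp_cache_get() /var/www/wp-includes/cache.php:95
TRACE END   [2015-03-01 12:00:01]
EOF
# Pull every positive memory delta, largest first; the biggest offenders float to the top.
grep -oE '\+[0-9]+' /tmp/trace.sample.xt | tr -d '+' | sort -rn | head -5
```

Pair the top deltas with a grep for the matching function names and you have a short list of suspects to investigate.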
Be warned: if you are tracing large repetitive processes your trace file can be many GiB in size, so make sure you have the disk space to run a full trace.
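Pruning stale traces between runs keeps the output directory from filling the disk. A minimal sketch, using a scratch directory so it is safe to run as-is; point TRACE_DIR at your real trace_output_dir instead:

```shell
# Scratch directory standing in for the real xdebug.trace_output_dir.
TRACE_DIR=/tmp/xdebug-demo
mkdir -p "$TRACE_DIR"
touch "$TRACE_DIR/trace.1.xt" "$TRACE_DIR/trace.2.xt"
# Delete traces older than one day, then report how many files remain.
find "$TRACE_DIR" -name 'trace*' -mtime +1 -delete
ls "$TRACE_DIR" | wc -l
```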