Posted on

Boosting WordPress Site Performance: Upgrade PHP

As with every single WordCamp I’ve attended, there is something new to be learned no matter how much of a veteran you are. My 5th WordCamp, at WordCamp US 2015, was no different. There are a lot of things I will be adding to my system admin and development tool belts after the past 48 hours in Philadelphia.

Today’s update, just deployed on the Store Locator Plus website: upgrading PHP.

It turns out that many web hosting packages and server images, including the Amazon Linux image, run VERY OLD versions of PHP. I knew that. What I didn’t know was the PERFORMANCE GAIN from upgrading even a minor version of PHP. PHP 5.6 is about 25% faster than PHP 5.3, which is the version this site was running until midnight.

WP Performance on PHP. Source: http://talks.php.net/fluent15#/wpbench

The upgrade process? A few dozen command-line commands, testing the site, and restoring the server name configurations from the Apache config file, which the upgrade process had auto-saved for me. EASY.

What about PHP 7? That is 2-3x faster. Not 2-3%, but 100-200%. WOW! As soon as Amazon releases the install packages for their RHEL-derivative OS, it will be time to upgrade.


If you are not sure which version your web server is running (it can be different from the command-line version on your server), you can find that info in the Store Locator Plus info tab.

SLP PHP Info
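If you prefer to check from code rather than the plugin screen, a tiny throw-away script will report the version the web server itself is running. This is just a sketch; the file name and where you put it are up to you:

<?php
// version-check.php: a throw-away sketch that reports which PHP build the
// web server itself is running; it can differ from the CLI `php -v` output.
echo 'PHP version: ' . PHP_VERSION . "\n";        // e.g. 5.6.16
echo 'Server API: ' . php_sapi_name() . "\n";     // apache2handler, fpm-fcgi, cli, ...

if (version_compare(PHP_VERSION, '5.6.0', '<')) {
    echo "This build predates PHP 5.6, time to upgrade.\n";
}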

The take-away? If you are not running PHP 5.6, the latest release of PHP prior to PHP 7, get on it. One of the main components of your WordPress stack will run a lot faster and have the latest bug fixes and security patches.

Posted on

PHP Switch Vs If Performance

When writing a complex application such as Store Locator Plus, you will often find multiple logic branches to handle a given task. There are two oft-used methods for processing the logic: the if-else construct and the switch statement. Since I am always looking to optimize the Store Locator Plus codebase for performance (some sites do have hundreds of thousands of locations, after all), it was time to look into the performance versus readability of those two options.

The general consensus, though I’ve not taken the time to run performance tests with the WordPress stack myself, is that “you should use whatever makes your code easier to read and more easily maintained”. For me that means using switch statements. I find the construct much easier to extend without causing inadvertent side effects, something I’ve learned in 20-plus years of working on code teams and in long-running projects like Store Locator Plus.

From a pure performance standpoint, an if-else can be marginally faster when performing fewer than five logic comparisons.

PHP If Else Statement
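For reference, a simple if-else chain looks like this. This is only an illustrative sketch with made-up names, not the actual Store Locator Plus code:

// Example input for the sketch below.
$action = 'update';

// A short if-else chain: branches are tested top-down,
// so the earliest matches are the cheapest.
if ($action === 'add') {
    $handler = 'add_location';
} elseif ($action === 'update') {
    $handler = 'update_location';
} elseif ($action === 'delete') {
    $handler = 'delete_location';
} else {
    $handler = 'show_locations';
}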

Switch statements will often be faster at or near five logic comparisons, as code optimization within C, likely carried forward in the PHP pseudo-compiler, will often turn the 5+ logic branches of a switch statement into a hash table. Hash tables tend to be faster, with all branches of the code having equal access time. Statistically speaking, a large number of iterations will favor the equal-access-time model over the “first is fastest” model of an if-else.

PHP Switch Statement
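And the equivalent logic as a switch, again an illustrative sketch rather than the plugin’s actual code:

// Same dispatch as a switch: every case has roughly equal access cost,
// and adding a branch is a single new case with no nesting changes.
$action = 'update';
switch ($action) {
    case 'add':
        $handler = 'add_location';
        break;
    case 'update':
        $handler = 'update_location';
        break;
    case 'delete':
        $handler = 'delete_location';
        break;
    default:
        $handler = 'show_locations';
}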

Possibly faster and always easier to extend and read, switch will be my “weapon of choice” whenever I have more than a simple two-state if/else logic branch to test.

Posted on

Store Locator Plus Improves High Volume Site Processing

Store Locator Plus 4.2.41 focuses on speeding up location import processing for comma-separated files of 50,000 locations or more.

CSV Import in Add Mode

Some of the updates include reduced execution time when setting the CSV import duplicates handling to ‘add’ mode. Add mode is useful when importing to a “clean” list of locations. This mode eliminates extra processing when you know you are not updating existing locations during a Pro Pack CSV import.

Improved Import Error Reporting

The CSV import processing system has been updated to provide better reporting when imports of location data via Pro Pack or category data via Tagalong fail. Various file processing issues, as well as memory restriction errors, are now reported with details, including hints on how to manage the issue.

Memory Improvements

Memory consumption is reduced on per-process system calls. This reduction can be significant on systems running large location lists, saving up to 1KB per location processing request.

Store Locator Plus Change Log

Posted on

WordPress Performance Tuning

MySQL InnoDB Banner

The Charleston Software Associates website has been getting slower every week. For the past month I’ve been spending an hour or two trying to figure out what was going on. I’ve used several tools to analyze the site, but it seems no matter what I do the site gets marginally faster at best. I still receive daily alerts of “your site took more than 15 seconds to respond” from the Windows Azure monitoring service that pings my hosted Linux virtual machines and alerts me to issues.

Last night I decided to switch some of my tables from MyISAM to InnoDB. The theory: the MySQL InnoDB engine does row-level locking, and since the bulk of my site is order transactions and forum posts, the insert and update operations are at least as frequent as the read-only operations. Row-level locking should be better.

Wow, did I SCREW THAT ONE UP.

Actually I only partly screwed it up.

I made the change around 11 PM last evening and everything looked OK. No notable performance gains, but I wanted to see what happened over 24 hours with the “your site is slow as hell” notices I get every day. By noon today it was clear I had moved in the wrong direction. I was now getting “your site sucks” warnings from my monitoring service every half-hour. Ouch!

Fixing InnoDB

It turns out you should not convert a table from MyISAM to InnoDB without first “checking under the hood”. While it is easy to convert with a simple MySQL ALTER TABLE <x> ENGINE = InnoDB command (first make sure the table does not have full-text indexes with SHOW INDEXES IN <x>), that may not be prudent.
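If you want to script the conversion, here is a rough sketch using mysqli. The connection details and table name are placeholders, and it skips any table carrying a full-text index:

<?php
// Rough sketch: convert one table from MyISAM to InnoDB only if it carries
// no FULLTEXT indexes. Credentials and the table name are placeholders.
$db    = new mysqli('localhost', 'dbuser', 'dbpass', 'wordpress');
$table = 'wp_comments';   // hypothetical example table

// Check for FULLTEXT indexes first; InnoDB of this era cannot carry them.
$has_fulltext = false;
$indexes = $db->query("SHOW INDEXES IN {$table}");
while ($row = $indexes->fetch_assoc()) {
    if ($row['Index_type'] === 'FULLTEXT') {
        $has_fulltext = true;
    }
}

if ($has_fulltext) {
    echo "{$table} has FULLTEXT indexes; leaving it on MyISAM.\n";
} else {
    $db->query("ALTER TABLE {$table} ENGINE = InnoDB");
    echo "{$table} converted to InnoDB.\n";
}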

When I got on my server today I noticed the CPU would spike to 90% or more, then drop back to normal. Every time it was MySQL spinning the CPU as hard as it could, and always when accessing the half-dozen InnoDB tables I had converted.

It turns out that InnoDB could use some serious tuning before you turn it on. InnoDB really benefits from being allowed multiple CPU threads and from a properly sized buffer pool. There are also some general MySQL buffers and MyISAM settings that can help performance.

For MySQL on a Red Hat-based distribution, and many other Linux distros, you can tweak the initial startup settings by editing the my.cnf file. On CentOS it lives in /etc/my.cnf.

Here is what I added to the file in the [mysqld] section:

# default: 1M
innodb_additional_mem_pool_size=32M
# default: 8M
innodb_buffer_pool_size=256M
# default: 0
innodb_thread_concurrency=8
# default: 128K (131072)
join_buffer_size=64M
# default: 8M
myisam_sort_buffer_size=32M
# default: 0 (disabled)
query_cache_size=128M
# default: 2M
sort_buffer_size=32M

The MySQL InnoDB settings page will tell you more about each setting. Google can help you find out what the other settings mean.

You can use the SHOW VARIABLES MySQL command to see what your current settings are. Most of the settings can be added to the my.cnf file and tweaked. Just make sure you read about each setting before blindly guessing at correct values for your server. Learn what they do, tweak, and monitor. Hopefully you can avoid doing this to your CPU and response times:

InnoDB Tuning Fail
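To confirm the new values actually took effect after restarting MySQL, you can read them back with SHOW VARIABLES, either from the mysql client or a quick script. A minimal sketch with placeholder credentials:

<?php
// Sketch: read back the tuned MySQL settings after the restart.
// The credentials are placeholders; adjust for your own server.
$db = new mysqli('localhost', 'dbuser', 'dbpass');

$result = $db->query("SHOW VARIABLES WHERE Variable_name IN
    ('innodb_buffer_pool_size', 'innodb_thread_concurrency',
     'query_cache_size', 'join_buffer_size', 'sort_buffer_size')");

while ($row = $result->fetch_assoc()) {
    echo $row['Variable_name'] . ' = ' . $row['Value'] . "\n";
}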

By increasing the buffer sizes and turning on multiple threads my CPU is not spinning up to 90% or more when an SQL query is sent along from WordPress. With 4GB of RAM on my server I am certain I can tune this even further and there are other settings that will help, but this is a great start to getting MySQL to perform better than it has been.

Lesson Learned

The lesson here, especially if you are on a dedicated server, is to look at your MySQL configuration. The “out of the box” setup is not usually ready for production in ANY environment. I should know better, but when running a business solo it is one of the many “I’ll do it later” tasks that don’t get addressed until it bites you in the ass. Like it did for me today.

My guess is my quad-core 4GB RAM server can more than handle the load my website brings to it.    If I turn all of the right dials in the right direction I should be able to get the page response times to under 1 second.   As one of the most impatient web surfers you’ll meet I find my current 4-second average page response time is simply far too long to wait.

Some good articles on WordPress DB Tuning

http://wordpresswebmaster.net/mysql-database-optimization-on-wordpress/
http://www.prelovac.com/vladimir/wordpress-optimization-guide

Posted on

WordPress Site Performance

The Charleston Software Associates web server came crashing down again this afternoon. About once per month the server has been going into an unresponsive state. Today I finally had enough logging turned on to track down the issue. The problem? The Apache web server was running out of memory.

The server was not under heavy load, but just the right combination of visitors and background processes triggered critical mass.   The wait times for a process to finish were long enough to start putting more things in the queue than could be emptied.   Memory soon ran out and the server stopped responding.

In researching the problem I came across two things that have made a substantial impact on the performance of my WordPress site.   If you are running a WordPress site, even with a limited number of visitors, you  may want to employ these techniques.

W3 Total Cache

W3 Total Cache is one of the most popular and most highly recommended WordPress plugins. It is noted in several areas of the WordPress Codex, as well as in the WordPress forums and on core contributor blogs. It is a powerful site caching plugin that can yield significant improvements in page loading time.

In the simplest configuration you can turn on page caching, which runs the PHP script that builds your WordPress page once and creates a static HTML file. All future requests for that page serve the HTML file instead of loading the entire PHP and WordPress codebase. This is a significant performance boost for many sites. If your content changes, or you change the style of your site, the plugin regenerates the pages automatically.

This is just one simple way that W3 Total Cache can improve your site performance. After reviewing the technology, the configuration parameters, and the various options available, I found W3 Total Cache to be a great way to improve the visitor experience on your site.

W3 Total Cache and Store Locator Plus

Just be careful with the extra options for JavaScript minification and caching, as they can wreak havoc on more complex scripts. The Store Locator Plus scripts, for instance, can have issues on sites that minify the script and/or add date/time stamps or version stamps to the JavaScript calls. Those timestamps create unexpected parameters for the backend AJAX listener.

PHP APC

PHP APC is a PHP caching system that can be implemented by any PHP application.   Enabling this feature is typically done at the system admin level and is a system-wide setting.   Thus, this is more appropriate for people running a dedicated server.  If you are on a shared server you will likely be limited to disk storage caching in plugins like W3 Total Cache or Super Cache.

After installing W3 Total Cache, I noticed settings for opcode-style caching. After some research I found the simplest way to implement the more advanced opcode cache was to install PHP APC. PHP APC, or the Alternative PHP Cache, is a simple install on most Linux systems running PHP; on my CentOS boxes I can just run the yum install php-pecl-apc command, and there is a similarly simple command on most Linux servers. The APC module needs no special compilation: simply install it and restart Apache.

Once you have PHP APC installed, the easiest way to take advantage of it is to go into W3 Total Cache, enable the Object Cache, and set the cache type to Opcode: APC. This is the recommended option and should be used, when possible, over the database cache.

One side note: this can be memory intensive. It is best to use the APC cache only for memory-centric items, such as PHP code modules, so enabling it for the object cache is a great use of APC. Using it to store cached pages, however, is not an optimal use of the memory stack. Your WordPress site probably has more regularly accessed pages than will fit in the memory cache, so use the disk storage setting for the page cache and reserve the APC cache for objects.

When you configure W3 Total Cache to use APC to store objects, the most often used sections of the WordPress core and the more popular plugins are loaded into memory. Now, whenever someone visits your site, much of the calculation and algorithmic “gyrations” that happen to build a page or load configuration settings are already pre-calculated and stored in RAM. Through W3 Total Cache, WordPress can simply fetch the “ready to go” information directly from RAM, saving on disk I/O and dramatically increasing performance.

Configuring APC

It should be noted that, out of the box, APC is set up for a small-to-moderate environment. WordPress with W3 Total Cache is a bit heavier than a simple web app, so you will likely want to change the default APC parameters. You can find the settings in the php.ini file or, on newer configurations, in the apc.ini file in the php.d directory. The first thing you should consider changing is the base memory size reserved for APC. The default is 64M, which is not really enough for WordPress to make good use of it. On my site I find that 128M seems adequate.

If you are not sure how your server is performing with the default settings, hunt down the apc.php script on your server (it was installed when you installed php-apc) and put it somewhere in your web server directory path. I do NOT recommend putting it in the document root, as noted in the APC hints. Instead put it in a protected sub-directory. Access it by surfing directly to the URL path where you installed apc.php.

The first thing you should look at is the “Detailed Memory Usage and Fragmentation” graph. If your fragmentation is over 50%, you probably need to adjust your working memory space or adjust which caches are using APC (in W3 Total Cache, unset Opcode: APC and use disk storage for everything, then turn Opcode: APC back on one cache at a time).

memory fragmentation chart in apc

The second thing to look at, once you’ve adjusted the default memory share, is the free versus used memory for APC. You want to have a small amount of free memory available. Too much and your server has less memory for all the other work required to serve your pages, the stuff that is never cached. Too little (0% free) and your fragmentation rises.

Here is what my server looks like with the 128M setting.   I have a little too much allocated to APC, but changing my setting from 128M to something like 112M isn’t going to gain me much.  The 16M of extra working memory pales in comparison to the average 2.7GB I have available on the server.

memory usage chart in apc
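If you want the same numbers without loading apc.php, the APC extension exposes them to PHP directly. A minimal sketch, assuming the php-pecl-apc module is loaded:

<?php
// Sketch: report APC memory use without the apc.php dashboard.
if (!function_exists('apc_sma_info')) {
    die("APC is not loaded on this server.\n");
}

$sma   = apc_sma_info(true);                  // shared memory allocation info
$total = $sma['num_seg'] * $sma['seg_size'];  // total reserved (apc.shm_size)
$free  = $sma['avail_mem'];
$used  = $total - $free;

printf("APC memory: %.1fMB total, %.1fMB used, %.1f%% free\n",
    $total / 1048576, $used / 1048576, 100 * $free / $total);

// Hit rate for the cache; a low rate means APC is not helping much.
$info = apc_cache_info('', true);
printf("Hit rate: %.1f%%\n",
    100 * $info['num_hits'] / max(1, $info['num_hits'] + $info['num_misses']));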

My Observations

On my server I noticed a few things immediately after spending 30 minutes installing W3 Total Cache and turning on and tuning APC with 128M of APC memory. This is on a dedicated server running CentOS 6.4 with 3.8GB of RAM and two 1.12GHz cores.

Server Load Average went from 1.8 to 0.09. 

Load average is an indicator of how badly “traffic is backed up” for the cores on your server.  On a 2-core server, like mine, you can think of the load average as the “number of cars waiting to cross the 2-lane bridge”.   On my server, if the number is less than 2 that means there is no bottleneck and the CPUs can operate at peak efficiency.  On a 2-core system the goal is to have the number always less than 2 and preferably less than 80% of 2.    At 100% utilization you consume more power and generate more heat which decreases the life span of the CPU.

Memory Consumption went from 1.8GB under light load to 0.6GB under the same load.

Page load time for the home page went from 750ms to 120ms on average.

cached page load time comparison

By the way, that 18.31s spike in response time? That is when the server started tripping over itself because memory and page requests could not keep up. The server load crossed the 2.0 line around 2 PM, and because it was a light-traffic day (Saturday afternoon), it took nearly 2 hours for the traffic jam to get so bad that the server just gave up trying.

The bottom line: even if your site is not getting a lot of visitors, you can still improve the user experience by installing a simple plugin. If you want to take it a step further, look into the PHP APC module. Just be sure to have a backup and, if possible, test first on a staging environment. After all, every server configuration and WordPress installation is different, and not all plugins and themes will play well with a caching system.


Posted on

PHP + PostgreSQL App Performance Example

PHP and PostgreSQL

One of our projects this week was to improve performance in a payment processing system for a client. The system performed well with limited data sets, but as the data set grew larger the response time increased exponentially. In reviewing the application we found several design flaws that were significantly impacting performance. This article outlines some of the issues we found.

Overview

The take-away from this article is that you should know the details of the implementation from ALL angles. If you are a code junkie, make sure you review and understand the database angle. Same thing in reverse: if you are a DB guru, take the time to understand the code. Nobody can do an effective job of application design & implementation without understanding the big picture.

In this case there was way too much emphasis on putting all the business intelligence into the database. While there is nothing wrong with that, and in fact it is often a preferred architecture, it was not well thought out and thus not well implemented here. One of the bigger mistakes was putting the business intelligence into simple views instead of proper stored procedure designs.

Bottom line: sometimes the better solution is to put SOME intelligence in the data engine and move some of the extraction, looping, and other processing logic to the code side. ESPECIALLY in a case like this one, where we know the client will not, at least any time soon, be accessing the data from any application outside the server-side PHP application implemented for them. We know we could put all the intelligence in the code, though that makes for a LOT more work if/when they decide to introduce things like mobile apps.

Lesson 1: Don’t Put Multiple Sub-Selects In A View

This is a simplified example from a simple view that was built in Postgres for the client.

SELECT a,b,
to_char(c_date, (select settings.value from settings where settings.id='date_format')) as c_date_str,
to_char(d_date, (select settings.value from settings where settings.id='date_format')) as d_date_str,
to_char(e_date, (select settings.value from settings where settings.id='date_format')) as e_date_str
...

This view is using a settings table which holds the date format.  The client can use the web interface to change the date format, which is stored in the settings table.   That is a good web app design.

Doing multiple queries to retrieve that date format in a single view is a bad design.   In the simple example we have above we end up hitting the database, and thus doing disk I/O, no less than FOUR TIMES for a single query.  That is going to be slow.

There are myriad better options here; these are the two I would consider:

  • Move the sub-select into a stored procedure and turn it into a function. An intelligent design of that procedure will retain the initial data fetch in a global variable that is tested on each call, blocking future data I/O requests. Data I/O is now 2 calls vs. 4+ for the view.
  • Return the raw data and allow the code to format the strings. The code can fetch the date format ONCE and apply the equivalent PHP formatting call to all of the raw data, which also cuts down the data I/O. Using raw data also increases the chances for the PostgreSQL engine to optimize the query via its internal heuristics (a sketch of this approach follows below).
In our application improvement, the path taken was to avoid this view whenever possible. As it turns out, this view is so complex and convoluted that there are often multiple shortcuts that get to just the few data elements that are needed. Constructing new queries retrieved the necessary data without all the view complexity and data overload. The view is so complex that it hampers performance throughout the application while providing limited benefit. The long-term solution will be to break the view into a subset of stored procedures. For the few cases where the larger complex view is actually viable, we will see improved performance via an intelligent series of cascading stored procedures or code-side logic.
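Here is a minimal sketch of the second option, using the same MDB2 interface as the examples in Lesson 2. The table and column names are hypothetical, and it assumes the stored format string has already been mapped to its PHP date() equivalent:

// Fetch the date format setting ONCE instead of once per column per row.
$format = $database->db()->queryOne(
    "SELECT value FROM settings WHERE id = 'date_format'");

// Pull the raw dates and format them in code instead of in the view.
$result = $database->db()->query('SELECT a, b, c_date, d_date, e_date FROM my_table');
while ($row = $result->fetchRow(MDB2_FETCHMODE_ASSOC)) {
    $row['c_date_str'] = date($format, strtotime($row['c_date']));
    $row['d_date_str'] = date($format, strtotime($row['d_date']));
    $row['e_date_str'] = date($format, strtotime($row['e_date']));
    // ... use $row ...
}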

Lesson 2: Use Parameter Binding

Any modern database and its related interfaces will support parameter binding. If your database does not support this and you are building enterprise-class applications, it is time to select a new data engine. PostgreSQL has supported it for years, and nearly all of the data interfaces, including PHP’s MDB2, have supported parameter binding for years as well. With parameter binding you will get a significant performance boost when iterating over data, especially in nested loops.
In our example the code was doing something similar to this, simplified for instructional purposes:
$qry1 = 'SELECT v_id,a,b,c,d FROM my_first_view WHERE NOT paid';
$result = $database->db()->query($qry1);
$dataset1 = $result->fetchAll(MDB2_FETCHMODE_ASSOC); 
$datastack = $dataset1;
$qry2 = 'SELECT v_id,e,f,g FROM my_second_view WHERE NOT paid';
$result = $database->db()->query($qry2);
$dataset2 = $result->fetchAll(MDB2_FETCHMODE_ASSOC);
foreach ($dataset1 as $data1) {
    foreach ($dataset2 as $data2) {
        if ($data1['v_id'] == $data2['v_id']) { 
             $datastack['subdata'][] = $data2; 
        }
    }
}

There are several significant performance issues here. To start with, there is significant memory consumption, as we need to collect ALL of the data from the database into memory. We then collate two complete data sets in memory to create a single data structure. There are much more efficient ways to do this without fetching all the data into memory first.

The better option would be to fetch the data from dataset1 on a row-by-row basis and push the data onto the stack one record at a time.  The inner loop for dataset2 should then select a subset of data that is specifically for the matching v_id from the outer dataset1 loop.   This is where parameter binding comes in.

Here is the same loop in untested, simplified code, using parameter binding. In our real-world example this one change increased performance more than 50x: the database can be much more intelligent about how it selects subsets of data, and the PHP overhead in both memory and stack management is significantly reduced:

// give us parameter binding in MDB2 please
$database->db()->loadModule('Extended'); 

// setup our queries
$qry1 = 'SELECT v_id,a,b,c,d FROM my_first_view WHERE NOT paid';
$qry2 = 'SELECT v_id,e,f,g FROM my_second_view WHERE v_id = ?';
// get the "outer" data
// since we know we will use all the "outer" data, just put it
// directly on the data stack, cutting this memory consumption in half
$result = $database->db()->query($qry1);
$datastack = $result->fetchAll(MDB2_FETCHMODE_ASSOC);

// still loop through outer data to drive the 2nd query
foreach ($datastack as $data1) {
    // Fetch the data2 subset of matching data as
    // a named array, only getting those records that
    // match on v_id... in essence an SQL JOIN done
    // via code
    $dataset2 = $database->db()->extended()->getAll(
        $qry2,
        null,
        array($data1['v_id']),
        array('integer'),
        MDB2_FETCHMODE_ASSOC
        );
    // Now we attach each of the matching elements in
    // the "inner" data set to the data stack, attaching
    // it under the related v_id 
    //
    foreach ($dataset2 as $data2) {
             $datastack['v_id'][$data1['v_id']]['subdata'][] = $data2; 
    }
}

This can be refined and improved further per our discussion above, but you get the idea. I’ll leave the additional refinements to you.

You may be asking “why didn’t you just do a simple JOIN in the database engine?” Good question. The real-world example is much more complex than this; some of the data elements and views in play make that solution hard to maintain, cause the database engine to trip up on optimization, and actually make it SLOWER in our real-world case. Here we are simplifying to illustrate the general concept only.

Summary

A couple of simple real-world examples of improving performance have been illustrated here.    When refactoring a real-world application there are often complex interactions that need to be researched & evaluated.  The solutions are rarely simple and often can be approached in several ways.   The options shown here are not necessarily the best solutions but are the solutions that were the quickest to implement while providing a significant performance increase to the client.

Finding the balance between results and costs is always a challenge from the business perspective.    From the technical perspective a similar balance is often necessary between database and code intelligence.  Knowing your environment as a whole will help guide you down the proper path toward improving application performance.


Posted on

PostgreSQL – Exclusive Locking On Triggers

We have been working on a client’s project and dealing with an especially complex view in a PostgreSQL database. The view has a few tables linked together as well as several subqueries, a case statement, and a number of elements that perform aggregate functions to build a single record of the view.

After updating the admin panel to allow the client to use the database and admin page for processing orders (versus an email + Excel system), we ran into several issues. The first was a significant degradation in performance whenever the system was under moderate load. This is important to note, as our baseline release testing did not uncover the problem. The logic in the app was syntactically and functionally correct. However, the application misbehaved under load, causing a bottleneck and essentially shutting down the app.

The Table Trigger Loop

We implemented a series of update triggers and stored procedures to propagate certain data elements between tables. Part of the design allowed two tables to update each other. The UI allowed an admin user to update Table A, which called a trigger to push a status field update into Table B (the legacy data). In other places Table B *could* be updated first, so we had a trigger to update Table A. The problem is the triggers would loop: Table A triggers an update in Table B, which triggers an update in Table A, and so on.

The initial design called for turning off the triggers on the other table within the stored procedure. This requires an ALTER TABLE command, and ALTER TABLE forces an exclusive lock on the table. NOTHING else can touch that table while the command runs; it is one of the few (if not the only) truly exclusive locks in the locking hierarchy.

This really slows down bulk updates. “Lock entire table”… insert… “unlock entire table”.

Master/Slave Tables Is A Better Solution

A better solution was to take out the ALTER TABLE commands completely and remove the trigger where Table B updates Table A. We made an application logic decision to make Table A the master and Table B the slave, and to eliminate (and/or log) any case where Table B gets updated first. That should NEVER happen in the new app, but some legacy code may still do it.

This is much faster as it generates no exclusive DB locks.

Entire Application Assessment v. Technical Complexity

In this case, thinking about the overall solution (the application design and implementation as well as the technical syntax at the application and DB layers) is what improved performance. Knowing the application and coming up with an overall design solution is often as effective as fancy technological solutions. Sometimes less complexity or technical “mojo” is the right answer to a complex problem.