Tips for Optimizing & Updating MySQL

Your database is one of the most important aspects of your server software stack.  Call me old fashion, but I still rely on MySQL as my database of choice and don’t see myself changing any time soon.  Over the past 7 years of running FantasySP, I’ve learned quite a bit about maintaining MySQL and making sure my server is running as smooth as possible.

Keeping MySQL Up to Date

The most obvious way to ensure MySQL performs its best is to make sure that your server software is up to date.  Updating to the latest minor release of MySQL should be one of the safer routes to go in.  These fix bugs, security, and minor performance improvements.  You should update to the minor release versions on a pretty regular basis.

A major release version upgrade is much more involved.  If you see that MySQL is nearing its end of life, then it’s time you start planning.   I have done 2 major upgrades from 5.1 -> 5.5, and then from 5.5 -> 5.6.  Chances are you will encounter problems with query performance with a major release update.  MySQL Query Cache Optimizer can behave very differently depending on what major release version you are using.   Queries that ran fine on 5.4 will run poorly on 5.5.

When Upgrading MySQL to a New Major Release

In order to safely upgrade to a newer MySQL major release version you should first make sure that you are running something like NewRelic to monitor your query performance.  Get a good baseline for typical application performance so you can compare it to when you upgrade.

To be safe, you could be running the old & new version of MySQL on your development box if you can.  This way you can plug in two queries with the same dataset on two different version of MySQL and get an idea of performance beforehand.  I usually don’t do this and just deal with the shitstorm that ensues, but some of you may want to play it safe.

Usually when a query performs badly it is because the newer version of MySQL decided to ignore the indices that were previously created for a table.  I have seen this happen quite a few times.  If you are running a vanilla WordPress installation then you probably won’t have to deal with this, but custom apps will probably encounter this problem.  You’ll be able to see bad query performance when you use EXPLAIN on your troublesome query. Full table scans lead to terrible performance, so don’t be alarmed if this happens.

To fix this you may have to develop a better index to speed up performance, or you can force MySQL to use a specific index with the following query:

SELECT * FROM  table_name USE INDEX (names) WHERE last = ‘something’

Another reason an updated major release of MySQL runs poorly could be that your configuration file is outdated.  Sometimes they may change the names of something or remove it and MySQL will start with Query Cache disabled or default settings.  Make sure you save a backup of your original my.conf file.

Useful Scripts to See Performance

Using NewRelic is great, but there are two popular database scripts to help with diagnosing performance issues.  You have probably heard of them, but I just want to double check you have them.  One is mysqltuner.pl and the other is tuning-primer.sh.

They will tell you how your cache is performing, if you have full table scans, how many prunes are happening per day, if your query cache is big enough, etc.  All very useful stuff to determine performance.  I am assuming you know how to deal with fixing those issues, but if you don’t you can Google them or buy a MySQL book or two to learn more.

Software to Manage & Run MySQL Queries

My development environment of choice is Windows.  With that in mind, I like to run a local MySQL application to make my life easier to diagnose queries, create tables, indexes, synchronize data, etc.

My software of choice is SQLyog and can’t say enough about how awesome it is.  If you deal with custom MySQL queries a lot, then I strongly suggest using an application similar to this to make your life easier.

Pruning Old Data

MySQL databases tend to be like old attics where you just keep saving shit even if you don’t actually need it.  If you have data from 2011 that you don’t need then make sure you get rid of it.

If you properly prune your database on an annual basis then you can extend the life of your current hardware stack, save money, and have much better performance.  A smaller database means less space for MyISAM indexes, which means you need less memory for caching.  (Assuming you use MyISAM)

So the big question is, how do you prune old rows in a table with 20 GIG worth of data without using DELETE queries? We all know that DELETE queries are way too expensive to run at this scale.  The solution?  Create a new temporary table with the same indices and table names.  Then copy the data you need from the old table to the new table.  Once your done, DROP the original table and rename your new table to the old table.

I learned this trick from Sean at the Clicky blog and trimmed the size of my database in half.

 

Google Analytics Tracking Change on March 20th?

I noticed a sudden trend in my GA starting on March 20th. I thought it was a fluke, until I noticed that 3 days later the change in numbers still persists.

All signs point to Google Analytics blocking an IP range and/or bots from being tracked.

Starting on March 20th, the number of pageviews and unique visits had a sudden drop.  This also coincided with an increase of pages per visit (shown below).

Pages Per Visit
Pages Per Visit

I have compared my pageview data to Clicky and AdSense and their data shows no change.  This leads me to only one conclusion: GA is blocking some type of traffic and has not announced this yet.

Did you discover similar findings?  Let me know.

How to Designate an IP Address as Permitted Sender

If you have your own domain handling email, then chances are you may run into email issues.  It is extremely important that you properly designate your IP Address as a permitted sender.

First, let’s test to see if your domain is a permitted sender.

Log into your domain’s email account, brant@example.com let’s say.  Send an email to another email account that you have access to.  Just to name a few… gmail, Yahoo!, or Hotmail.

The important part is for you to view headers of this email, often referred to as “Show Full Headers”.

You will see something that looks like this, but note the SPF softfail error:

—- Original message —–

X-Received: by 10.224.25.8 with SMTP id x8mr31561868qab.77.1382465370255;
Tue, 22 Oct 2013 11:09:30 -0700 (PDT)
Return-Path: brant@example.com
Received: from my3.example.com (my3.example.com. [64.131.70.223])
by mx.google.com with ESMTPS id k5si10380639qen.50.2013.10.22.11.09.30
for send@domain.com
(version=TLSv1 cipher=RC4-SHA bits=128/128);
Tue, 22 Oct 2013 11:09:30 -0700 (PDT)
Received-SPF: softfail (google.com: domain of transitioning  brant@example.com does not designate 64.131.70.223 as permitted sender) client-ip=64.131.70.223;
Authentication-Results: mx.google.com;
spf=softfail (google.com: domain of transitioning brant@example.com does not designate 64.131.70.223 as permitted sender) smtp.mail=brant@example.com

 

If you see Received-SPF: pass in the header then you have nothing to worry about.  However, as you can see from above, I have a softfail issue.  You may also see some with “fail”.

The fix is actually pretty easy.  Login to your registrar where you registered the domain.  What we need to do is create a new TXT record and add that specific IP (64.131.70.223) as a permitted sender.

The record will look something like this:

TXT    example.com    “v=spf1 ip4:64.131.70.223 ~all”

Add this entry to your registrar and wait for the record to update across the web.  You can test your TXT / SPF entry with this handy tool.

A few things to note here:

  • Make sure you use the quotation marks around the whole thing.  (Do not just copy and paste from here because WordPress sometimes uses a different characters for quotes.)
  • The last part ~all designates a softfail for any non permitted senders.  Softfail basically means that you haven’t finalized your email settings yet.  These emails have a chance of getting through, even though they fail.
  • Once you send another test email and confirm that you PASS, change the ~all to -all.  This means that any emails not from your designated senders should fail.

Sounds great, but what happens when you use another email provider, like mailjet to send email on your behalf?

Your TXT entry will then look like this:

TXT    example.com    “v=spf1 ip4:64.131.70.223 include:spf.mailjet.com -all”

Setting up email can be a bit frustrating, so hopefully this helps speed things along.

Still confused?  You may also find Eric’s article on setting up SPF records useful.

 

Apache mod_fcgi (FastCGI) vs DSO (mod_php)

My experiment with FastCGI has come to an end.

You may remember my last post about upgrading my server to the latest Apache, PHP 5.4.x, and switching to FastCGI.

The biggest issue with this new setup was the fact that APC was not shared amongst all the PHP processes, which really made things more difficult.  Did I really want to recode my caching solution?  An alternative would be to use Redis, but that is assuming FastCGI was faster than DSO and worth using.

Caching aside, FastCGI is slower than DSO based on testing in a production environment.  In some high traffic situations FastCGI performed significantly slower.

On the other hand, FastCGI was great with resource management compared to DSO.  It managed to keep memory usage low during both real world testing and benchmarks that I threw at it.

In the end, I guess it depends on what you are looking for.  Speed or better resource management?

 

 

 

Initial Impressions of Apache 2.4.x, PHP 5.4.x, and FastCGI (mod_fcgid)

Over the past two weeks my main server for FantasySP has undergone a massive software update.  I wanted to make sure things will be rock solid for burstable traffic.  Sunday mornings during football season can be pretty rough unless you are prepared.

My previous configuration was Apache 2.2.x, PHP 5.3.x, and DSO ( mod_php).

The new configuration is  Apache 2.4.x, PHP 5.4.x, and FastCGI (mod_fcgid).

Let’s start off by showing you guys some performance graphs…

NewRelic Data
NewRelic Data

In order for you to see the correct data, I removed everything except php and httpd to get a fair comparison.  MySQL and other services are irrelevant.

You will notice a few trends:

  1. Because I moved away from mod_php to mod_fcgid the memory allocation is completely different.  The drop in httpd memory is now passed to separate php processes.
  2. CPU Usage is also different due to mod_fcgid and is now shown in the php process rather than httpd.
  3. Overall CPU% is lower under the new configuration.
  4. Overall Memory Usage is also lower under the new configuration.

What accounts for the lower CPU% and Memory usage?  Apache, PHP, mod_fcgid, or a combination?  The newest versions of Apache and PHP are supposed to have better memory utilization.  If you want to learn more about mod_fcgid in comparison to mod_php then this is a must read.

There are a few drawbacks to using mod_fcgid, most notably is that APC is no longer a shared cache to all processes.  Instead there are many instances of APC running, which can use more memory and cause higher loads to MySQL due to more misses than before.

I honestly did not anticipate this and had to react accordingly.  Load times to the site were noticeably affected.  I had no choice but to alter some of my caching to be saved to a separate MySQL table, rather than being stored in APC.

Huge pain in the ass, but something I considered doing previously for historical data purposes.

I’ve seen evidence from the PHP site that there will be speed increases just by upgrading to PHP 5.4.x from 5.3.x.  However, the jury is still out with this for me.  It appears that it could be true based on early New Relic data.  However, I’m still making caching changes due to APC and it has skewed the data a bit.

It appears that I’ve successfully offloaded slow load times to the odd hours of the day (12AM – 5AM), and by the time 8AM hits things are speedy. I won’t know for sure until a month goes by and things normalize.

The biggest question is if Apache 2.4.x is faster than 2.2.x.  Early benchmarks are mixed, but it is entirely way too early to make a definitive call.  I know for a fact that memory and cpu usage have decreased due to the new setup based on the NewRelic graphs above.  However, I cannot confirm that performance has improved.

Unfortunately I could not find my previous Apache Bench numbers based on my older configuration.  That would have likely given me all I needed to know.

I will continue to tweak Apache in the coming weeks and see what happens.  By the end of September I should know for sure if my current Apache configuration is noticeably better or not.  Though I do plan to stick with mod_fcgid for now and see where things go.

Stay tuned for a follow-up post down the line.

UPDATE: Check out the follow-up post here.