Optimization Maverick Blog: 2010

You can hire my services

Good old button testing

Everyone, at some point, does some optimisation testing of button designs. Some people think it's a trivial exercise to undertake when there are bigger fish to fry. Well, I disagree: button design testing is exactly the kind of thing you can be doing quickly and easily with Google Optimiser or similar. We've done loads of testing on buttons in the past, testing colours, sizes, Apply text and so on, but I read an interesting article from Get Elastic on how unusual button designs can give you an easy uplift in conversion. So over the course of a couple of months I tested all the designs you see here on a landing page. No.1 was the default design, and the winner was... No.3, the 'boxed arrow' design, with a 32% uplift in click to apply rate. The arrow-based designs were in the upper end of the results overall, while the, *cough*, phallus-based designs stole an early lead but didn't win out in the end. Give it a go on your website; it's quick, easy and surprisingly fun.
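
For anyone wanting to sanity-check an uplift figure like the one above, here is a minimal Python sketch of how a relative uplift and a rough two-proportion z-score could be worked out. The visitor and click counts are made up purely for illustration (they are not the numbers from this test), and the function names are my own.

```python
import math

def relative_uplift(control_rate, variant_rate):
    """Relative uplift of the variant over the control, as a percentage."""
    return (variant_rate - control_rate) / control_rate * 100.0

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-score for variant B against control A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical counts, chosen only so the uplift comes out near 32%.
default_clicks, default_visitors = 500, 10000
boxed_clicks, boxed_visitors = 660, 10000

uplift = relative_uplift(default_clicks / default_visitors,
                         boxed_clicks / boxed_visitors)
z_score = two_proportion_z(default_clicks, default_visitors,
                           boxed_clicks, boxed_visitors)

print(f"uplift: {uplift:.1f}%  z-score: {z_score:.2f}")
```

As a rule of thumb, a z-score above roughly 1.96 corresponds to about 95% confidence in a two-sided test, though the testing tool's own statistics should take precedence over a back-of-envelope check like this.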

Google Experiments Follow-Up experiments

First off - What is a follow-up experiment?

Google says:

"When the results of an experiment suggest a winning combination, you can choose to stop that experiment and run another where the only two combinations are the original and the winning combination. The winning combination will get most of the traffic while the original gets the remaining. This way, you can effectively install the winner and check to see how it performs against the original to verify your previous results."

And why should I run a follow-up experiment?

"Running a follow-up experiment will give you two benefits. First, it will enable you to verify the results of your original experiment by running a winning combination alongside the original. Second, it will maximize conversions, by delivering the winning combination to the majority of your users. We encourage you to run follow-up experiments to get the best, most confident results for any changes you make to your site."

But what happens when a follow-up experiment delivers contradictory results?

The screenshot below shows the original MVT test results....

I commenced a follow-up test running the winning variant from this test in a head-to-head with the original default. And this is what happened...

The blue line is the original design beating the first test winning variant. This has happened time & again with my follow-up experiments. Then I noticed something. When you set up a follow-up experiment it's easy to overlook the weightings setting or the 'choose the percentage of visitors that will see your selected combination' option of a follow-up test. By default it's set to 95% for your selected combination.

Now I can't offer any explanation, but from previous testing with other tools such as Maxymiser we've seen that when you up-weight one variant in a test in favour of another, invariably its conversion performance goes down, sometimes radically so. I recommend only ever using a 50-50 weighting in a follow-up experiment because, for whatever reason, an unequal weighting seems to skew performance. Just be aware of this possibility and you'll be fine : )

If anyone can offer me a scientific explanation for this behaviour I'm all ears!

By the way, below is the test after the weightings were reset to a 50/50 split. Bit different from the original follow-up experiment, no?
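
I still can't claim to explain the behaviour, but one thing that can be quantified is how much noisier a head-to-head comparison becomes when one arm only gets 5% of the traffic. The Python sketch below is an illustration under assumed numbers (10,000 visitors and a 5% conversion rate in both arms, neither taken from my tests); it compares the standard error of the difference in conversion rates for a 50/50 split and a 95/5 split.

```python
import math

def se_of_difference(rate_a, rate_b, visitors, share_b):
    """Standard error of (rate_b - rate_a) when arm B gets share_b of traffic."""
    n_b = visitors * share_b
    n_a = visitors - n_b
    return math.sqrt(rate_a * (1 - rate_a) / n_a + rate_b * (1 - rate_b) / n_b)

# Assumed figures for illustration only.
total_visitors = 10000
conversion_rate = 0.05

even_split = se_of_difference(conversion_rate, conversion_rate,
                              total_visitors, 0.50)
skewed_split = se_of_difference(conversion_rate, conversion_rate,
                                total_visitors, 0.95)

print(f"50/50 split: {even_split:.4f}")
print(f"95/5 split:  {skewed_split:.4f}")
```

With these assumed figures the 95/5 split is more than twice as noisy as the even split, so an apparent reversal is harder to tell apart from chance. That is a statement about measurement noise only; it does not by itself account for an up-weighted variant consistently performing worse.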

Give a Huq?

It's good to see some examples of AB testing that aren't just about which web page works best. And here's another example of magazine cover split testing. This month's issue of Company magazine is running with two variations of its cover, one with presenter Fearne Cotton, the other with presenter Konnie Huq (who's married to this guy by the way). As you can see, both covers are the same bar the mention of the featured presenter and the hero shot; even the poses are almost identical. I've mentioned magazine cover testing before in a previous article here, but I haven't seen any evidence of the press engaging in this kind of marketing test in recent years. Would be (mildly) interesting to see who wins this particular test.

Google Trends for MVT Terminology

Out of curiosity I ran a couple of queries in Google Trends to see which terms are more popular in the world of web optimisation. I queried 'ab test' versus 'split test' and found the latter to be less widely used. However, I was more keen to see exactly what terms people were using for 'MVT': 'multivariate' versus the old skool 'multivariable'. I thought that the term MVT would be more widely used in this day & age, or indeed be on the increase. It turns out to be far less popular than 'multivariate' and on a par with 'multivariable'. I think it would be handy if we all stuck to a single expression, my personal preference being 'MVT'! This would certainly help with job searches too! This particular blog ranks well for the term 'MVT blog' but is off the radar for 'Multivariate blog', so there's another side-effect of our misaligned industry terminology! : )

UPDATE 16th August 2011: Whilst Google Trends is still a good tool for examining terminology usage, or just trending full stop, a far better tool for this job is Google's Keyword Tool. Obvious really : )

Amazon cross-selling

This is clever. Well, I think so anyway. Amazon are cross-selling their new Mastercard credit card via the medium of your actual shopping basket. They give you a £10 credit if you apply for the card, but take this one step further by applying the credit in a hypothetical, yet obvious, way to your current check-out total. Automatically your eyes are drawn to the discounted price of your purchase. Nothing seems unusual as we're now well accustomed to the concept of discount codes et al. Personally I think this is a fairly slick marketing example.

Online polls

I'm always looking for tools and services that complement the core optimisation activity we do using MVT and AB testing. In the past I've tried getting the general public to rate a selection of test variants using an on-line survey (see earlier post here for details). To me testing is always going to be the ultimate customer feedback on what works for on-line sales conversion, but there's no harm in garnering opinion on your winning web design and page content. I'm now piloting on-line polling on a few landing pages using a Polldaddy poll widget. This simply embeds a rating widget in the top corner of selected pages, enabling visitors to give the page either a thumbs up or a thumbs down. Obviously the shortcoming here is that you can't tell whether people are rating the page design or the page content, but at this stage I am trying to get a very high level user response to the page as a whole. Ideally I would have this widget on every page on the site as a means of flagging up any under-performing pages or highlighting pages that people have an affinity with.

Use of Awards endorsement in web pages

In the past we've included any awards the company or product have attained based upon the assumption that they can only be a positive thing to have on the web page. Well I finally got around to measuring exactly what effect the use of an award logo had when added to a page. Basically I had concluded an earlier optimisation exercise on a landing page using Maxymiser and tweaked the winning page combination to show an award logo instead of the image of two people. I let the adapted test run for 3 weeks and the winning combination fell from a 9.96% uplift to minus 4.05% thus showing that the use of award imagery in this context was ineffective. Click on the image below to enlarge for a summary of what happened.

Is Fatwire any good?

Looking for information on Fatwire and MVT testing?

If you didn't know Fatwire is a content management system...

We have been doing MVT and AB testing on a Fatwire (FW) website for the last 10 months. FW dictates that your website is constructed through a series of templates which don't lend themselves to conventional MVT tagging; well, not easily anyway.

We decided upon using Maxymiser as a testing solution primarily because their one touch solution could be worked into the header of the website's master template, thereby enabling us to conduct tests across the site with zero (and I really mean zero) IT department input from then on. Putting Maxymiser (or any third party testing tool) tagging in the header means it needs to be in a site-wide 'asset' of FW, thus enabling you to conduct MVT or AB testing anywhere within your website. Doing this allowed Maxymiser to then render out alternative/test content on top of your 'restrictive' Fatwire content.

If you want to be consistent with your Fatwire templates, you really need to come up with test variants based on the content restrictions of your current, available templates. This is a high level choice to make. Personally, because Maxymiser could render any designs I liked over the top of the current suite of Fatwire templates we had, I decided to go with page designs and content that didn't bear much, if any, resemblance to the Fatwire content beneath. This was because I wanted to see what gains could be made by testing alternative layouts and then use any uplifts as leverage with our IT department in getting new templates built, because I could prove the benefits of doing so. Additionally, while you're running an MVT test over the top of Fatwire you are effectively 'making hay while the sun shines'. By that I mean that even while you're running a test, nine times out of ten, because your layouts are not restricted by the Fatwire template business rules, you're already going to benefit in customer conversion during the test period alone, a gain you wouldn't have got otherwise. I will talk about using your testing tool as a CMS in a later post. It's a bit controversial : )

In all honesty, if I had the choice I would not be working with Fatwire at all, especially not the very old version we have. It's effectively a step backward in terms of web design, and it's highly restrictive in terms of design flexibility. The illustration I've taken to using recently is that a one-man band building an e-commerce website in his bedroom using something like Microsoft FrontPage actually has the potential means to add 90% more functionality to his website than someone using a traditional content management system such as FW. Think about that. If you have some JavaScript-based web service or dynamic content you want to trial with FW, how much heartache do you have to go through to get that tagging or JavaScript included in your website's header or body? If it's not a major hassle to you or your IT department then fair play, but you're probably not the norm.

Anyway, here are my true feelings about good old FATWIRE laid bare through the medium of satirical animated video.....

Google Optimizer - Landing Page A/B Testing

I launched a split test on one of our highly trafficked Current Account landing pages last week.

This basically saw the rather stale current champion page design challenged by a much more creative led design for the same page. The graphic above shows the readout for the visitor conversion rate, which has the creative led design as the outright winner (16.9% uplift in click to apply rate). However, I have tagged both pages so that we can identify which pages actually result in submitted current account applications, and those results show the original design leading the creative led design. This is yet another example of how pretty design may compel people to click an apply button, but if they haven't read the fine detail they're less likely to go through the entire end to end transactional process.

The original page design (champion) is shown below on the left and the creative led design (challenger) is shown below on the right.
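
To illustrate how the same test can crown a different winner depending on which action you measure, here is a minimal Python sketch. All the counts are hypothetical (picked so the click uplift lands near the figure quoted above); the real numbers live in the GWO report and our sales database.

```python
# Hypothetical per-page counts; not the real figures from this test.
pages = {
    "champion":   {"visitors": 5000, "clicks": 500, "submits": 150},
    "challenger": {"visitors": 5000, "clicks": 585, "submits": 130},
}

def rate(page, metric):
    """Conversion rate for one page on one metric ('clicks' or 'submits')."""
    return pages[page][metric] / pages[page]["visitors"]

def winner(metric):
    """Page with the highest conversion rate on the given metric."""
    return max(pages, key=lambda page: rate(page, metric))

click_winner = winner("clicks")
submit_winner = winner("submits")

print("Winner on click to apply:", click_winner)
print("Winner on submitted applications:", submit_winner)
```

The point of the sketch is simply that the ranking has to be computed per action; a page that wins on clicks can still lose on completed applications.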

An unbiased review of Google Experiments

I notice that a lot of reviews for Google Website Optimizer (GWO) have been done by what at first appear to be independent companies & individuals, but at second glance are actually affiliated or partnered to Google in some way. So this being the case, here's my unbiased opinion & experience of GWO for what it's worth. And please note: I am not pushing or offering any service here relating to Google.

Anyway, I've been itching to give GWO a trial for the past couple of years. Up until now our websites have not been conducive to implementing Google tags and besides which we've been using a managed testing service from Maxymiser. However a few things attract me towards GWO over a paid service.

First off, it's free. If you have the means to implement Google tracking tags into your web pages, why wouldn't you at least try a test or two?

Secondly, even if, like our company, you're paying a third party to build and run multivariate and AB tests on your behalf, no matter what contract you're on there's never enough time and resource to run with all the testing concepts and ideas you might like. That is why I've persisted with GWO for the past couple of months: it's relieved a bottleneck in our testing schedule by letting us run quick & dirty tests on the fly alongside our more formal testing.

If your business is to be an optimization expert you need to try a variety of tools, especially the most commonly used one to add to your knowledge base.

We now have a sub domain where we can do whatever we like on landing pages, including testing and tagging with GWO. I've conducted three landing page tests so far, ranging from AB tests to MVT, and have just launched my fourth.

My first test was on a Personal Loans page where we tried a dozen different page designs and copy changes to see if we could get more people to apply for a loan. It was a fairly simplistic affair but has yielded a 14.77% uplift in visitor conversion by offering up a different page header and product introduction copy.
The second test was on a High Interest Current Account landing page (phase 1). This looked at simplifying overly complicated product information and has yielded a 20% uplift in conversion. I've just launched a phase 2 of this test where I champion/challenge the previous winner, a rather stodgy design against a more creative led design.
My third test was recreating a Bank Accounts comparison page from our main website but in a landing page environment. We then employed the usability web experts Foolproof to come up with an improved design for the page and test it (amongst other designs) against the default page. It's still running and yielding a 9% uplift in submitted current account applications to date.

Now the problems with GWO.

For whatever reason, aside from my most recent test, none of the tests have displayed progress data in the reporting interface of the GWO console! Fortunately in every test I have been able to tag each variant with a bespoke/unique tracking value which appears in our downstream sales database when someone submits an application. Because of this I have been able to run successful tests with reliable MI to go by.
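
As a rough idea of what that kind of variant tagging can look like, here is a minimal Python sketch that appends a tracking value to an apply link so it can be carried through to the application. The parameter name `src`, the example URL and the variant code are all hypothetical; how the value actually reaches a sales database depends entirely on your own systems.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

def tag_apply_url(url, variant_code, param="src"):
    """Return the URL with a tracking parameter set to the variant code.

    Existing query parameters are kept; an existing value for `param`
    is overwritten.
    """
    parts = urlparse(url)
    query = dict(parse_qsl(parts.query))
    query[param] = variant_code
    return urlunparse(parts._replace(query=urlencode(query)))

# Hypothetical apply link and variant code.
tagged = tag_apply_url("https://example.com/apply?product=loan", "LP1-B")
print(tagged)
```

Each test variant would carry its own code, so submitted applications can be counted per variant even when the testing tool's own reporting is blank.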

Customer support for GWO is non-existent and you're reliant on forums to get any kind of useful information about troubleshooting. I have had to solve a number of glitches with the tests run to date myself, at the cost of significant time & effort!

Because I have web developer experience I am able to set up a test and build the page variants for it myself. But this is not the norm, and I can imagine that working with an internal or external agency to implement and update GWO test tags and content might make the whole experience unworkable.

Actually coding the test variants is badly handled in that they just give you a single unformatted text area to do your HTML editing in, so lots of cutting and pasting from a conventional web editing tool such as Dreamweaver or Visual Studio is involved (even Notepad is better!).

The positives of GWO

I love the GWO console. Both the reporting interface (below is the reporting interface showing a current MVT test in progress) and the test build interface are simple but highly functional in design. The step-by-step code implementation instructions are really straightforward. You tell GWO what your original, test and conversion pages are and it tells you what code to paste into those pages to get the test going.

It allows you to weight test variants so that they are displayed to a set portion of your test audience, and the test follow-up feature is great too for validating the results of a concluded test.

If Google could come up with a solution that didn't involve placing tags on test and conversion pages then I think the technology could take off like a rocket.

In conclusion

So besides some teething problems with getting up to speed with GWO, it's been a good learning curve and I've been able to turn in some very reasonable test results to complement my other testing.

If you want to see a previous article comparing GWO to other optimisation providers click here

Culling multivariate test variants in a Maxymiser test

I've covered the topic of culling before on this blog here. Now I'll go through my method of identifying test variants to cull from a running MVT test in Maxymiser where you have multiple actions.

1. Below is a screengrab of an MVT test report in Maxymiser after a test has run for a week. On the left side are the bottom ranking variants for a 'click apply button' action and on the right are the bottom ranking variants for a 'submit application' action.

2. The conversion rate uplift is negative for these variants; they are not adding anything to the test overall and so need to be removed or 'culled'.

3. Identify page combinations (see Page ID field above) that are both negative in uplift and appear in the bottom ranking across both actions, then select the 'remove page' option within the console.

4. Looking at the test report before and after the cull shows the immediate effect on the 'chance to beat all' metric within the test.

The chance to beat all value for the lead variant moves from 27.86% to 28.11%. This shift also cascades downwards through the other variants left in the test.

This culling exercise would then be repeated at periodic intervals for the remainder of the test.

note: Caution should be taken when removing variants from the test. The number of generations and actions should be taken into consideration, whilst over-culling a test can bring it to an early and unproductive conclusion.
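
The selection rule in step 3 can be sketched in a few lines of Python. This is only an illustration of the logic, not anything Maxymiser provides: the page IDs and uplift figures are invented, and `bottom_n` (how many bottom-ranked variants to consider per action) is my own parameter.

```python
def cull_candidates(click_report, submit_report, bottom_n=3):
    """Page IDs that are negative in uplift AND bottom-ranked for both actions.

    Each report maps a page ID to its conversion rate uplift (in %).
    """
    def bottom_negative(report):
        # Rank variants from worst to best uplift and keep the worst few.
        ranked = sorted(report.items(), key=lambda item: item[1])
        return {page_id for page_id, uplift in ranked[:bottom_n] if uplift < 0}

    return sorted(bottom_negative(click_report) & bottom_negative(submit_report))

# Invented uplift figures for the two actions.
click_apply = {"P1": 4.2, "P7": -6.1, "P12": -3.4, "P15": -0.8, "P20": 1.5}
submit_application = {"P1": 2.0, "P7": -9.3, "P12": 0.7, "P15": -2.2, "P20": -1.1}

to_cull = cull_candidates(click_apply, submit_application)
print(to_cull)
```

Only variants that are poor on both actions come out as candidates, which is a deliberately cautious rule given the warning above about over-culling.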

Optimisation providers compared

We recently reviewed optimisation providers and their tools as we needed to procure another year's contract for such a tool and service. Here are a few advantages and disadvantages that we found with a few of the top names in the business at the moment.

Omniture

Advantages

Comes with segmentation engine by default
Has self optimising content feature once test has completed
No limits to the number of domains you can test on

Disadvantages

Requires multiple pieces of code on every test page - very IT resource heavy
Reporting only shows winning design not all results so test within test not possible
Poor reporting interface
Technology-wise they seem to have fallen behind the opposition providers.

Maxymiser

Advantages

Solely undertake optimisation work, not distracted by other functions such as web analytics etc
Minimal IT resource required as tests require a single line of Javascript at the bottom of the test page. Once implemented no further work is required.
Full end to end managed service including design resource.
Product has real time reporting
Manual control over tests if required, i.e. culling of test variants
They have an extensive serving infrastructure that can serve winning content until new templates can be developed.

Disadvantages

Initial quotes were high compared to competition
Sometimes sluggish in terms of test turnaround time, from past experience

Autonomy

Advantages

Offers all the functionality you'd expect from a top-end provider
Full managed service
Very strong test recommendations and results analysis
Appears to cope well with more technically challenging tests
Uses Wave testing approach which means end results are more accurate when implemented live
No limits to numbers of tests and number of domains covered with contract
Segmentation engine included
Minimal IT resource required as tests require a single line of JavaScript at the bottom of the test page. Once implemented no further work is required (although complete clarification on this capability was never obtained from the vendor at the time).

Disadvantages

Poor reporting interface
Solution requires code to be put in the header - this may cause issues for templated sites
3 month contract required to implement a Pilot
Perception of a limitation on doing radical page layout testing, more elemental testing favoured.

Sitespect

Advantages

Once Sitespect server implemented then no additional IT resource required
Unlimited control over what content we wish to test and no limits to the number of tests that can be run (limited only by our resource)
It can test all content including any online forms
Comes with a segmentation engine at no additional cost

Disadvantages

It is not a Managed Service so labour intensive from an eCommerce point of view
Developer know-how required within our team
User interface very complicated and requires full training
Results reporting not very user friendly

Google Optimizer

Advantages

It's free to use
Wide customer base and as a result there seems to be a wide knowledge base although forums are not the best means of resolving a critical testing issue in a timely manner!
Likely to improve over time due to the volume of people using it.
Integrated reporting with Google analytics, which is good if you like the GA interface.
Can be either self optimising by automatically removing under-performing variants or done manually, but once variants are out of a test that's it.
Can be adapted to do event tracking, i.e. onclick events, so in theory you don't need a conversion page to undertake a test, which is dead handy.

Disadvantages

Fairly IT resource heavy in its implementation
Can't easily track across domains
Obviously not a managed service
Can only record an action on the test page and not all the way to application submit (hence missing a critical metric for serious testers).
No consultancy service from Google so dependent on forums for problem solving
Feels like it's still in beta
There has been some noise about outages and slow reporting in terms of seeing test progress or even confirmation that your test is firing correctly.

OUTCOME OF TESTING TOOLS EVALUATION

In the end we chose Maxymiser, mainly because we'd already worked with them before and knew more or less what to expect going forward. Also because they had the ability to remove under-performing variants within a test and their reporting console is user friendly compared to the competition. They are keen and quick to improve their tools and service and welcome feedback.

UPDATE:

Since this article was originally posted we have also decided to pursue use of Google Optimizer in parallel with the chosen managed service, as it's a free service and I personally have the developer know-how to code the test variants without IT input (i.e. it's not for everyone in the testing world).
UPDATE: I've posted my feedback on this in a separate article here.
