I suppose this post has been a long time coming.
It was spurred into reality by an exchange with @bladeguy who pointed out that Cisco, too, sponsors tests of their equipment – just like HP and the Tolly reports. At first, I’d intended to do a comparison of the Tolly reports and the Principled Technologies reports, looking for obvious (or not so obvious) bias. Once I started down that path, however, I realized it really wasn’t necessary. Sponsored tests (from any organization) will always be biased, and therefore unreliable from a technical perspective. There are always tuning parameters that the “loser” will insist were wrong and skewed the results, and there are always different ways to architect the test that would have given the “loser” an edge. That’s why they’re sponsored tests.
I commented briefly before on Tolly’s first HP vs. Cisco UCS report, so I won’t repeat it here. Suffice it to say, the bloggers and such did a pretty good job of chopping it up.
The Tolly reports I’ve seen thus far, the bandwidth debacle and the airflow test, simply don’t pass the smell test. Yes, they’re repeatable. Yes, the testing results are defensible. But the conclusions and declarations of a “winner”? Not so much. I’m not faulting Tolly here. They’re doing their job, producing the product that their customer has asked them for. These tests aren’t sponsored by the HP engineering groups (for whom I have a lot of respect) looking to validate their technological prowess – they’re sponsored by the marketing departments to provide ammunition for the sales process. As such, do you really think they’re going into this looking for a fair fight? Of course not. They’re going to stack the deck in their favor as much as they think they can get away with (and knowing marketing departments, more than they can get away with). That’s what marketing groups (from every vendor) do.
@bladeguy pointed out that Cisco has engaged Principled Technologies to do some testing of UCS equipment versus both legacy and current HP equipment. At first glance, I didn’t detect any significant bias – especially in the tests comparing legacy equipment to current UCS gear. I’m not sure how any bias could be construed, since really they’re just showing the advantages and consolidation ratios achievable when moving from old gear to new gear. Obviously you can’t compare against Cisco’s legacy servers (since there aren’t any), and HP servers are the logical choice since they have a huge server market share. I would suspect that similar results would have been achieved when comparing legacy HP equipment against current HP equipment as well. HP can perform (and I’m sure has performed) similar tests if they’d like to demonstrate that.
The more troublesome tests are those comparing current generations of equipment from two manufacturers. The sponsor of the test will always win, or that report will never see the light of day. That’s simply how it works. Companies like Tolly, Principled Technologies, etc. aren’t going to bite the hand that feeds them. As such, they’re very careful to construct the tests such that the sponsor will prevail. This is no secret in the industry. It’s been discussed many times before.
Even the Principled Technologies tests that compared current generations of hardware looked like pretty fair fights to me. If you look closely at the specifications of the tested systems, they really tend to reveal the benefits of more memory, or other such considerations, as opposed to the hardware itself. @bladeguy pointed out several items in the Principled Technologies tests that, in his opinion, skewed the results towards Cisco. I’m not in any position to refute his claims – but the items he mentioned really come down to tuning. So essentially he’s saying that the HP equipment in the test wasn’t tuned properly, and I’m certainly not going to argue that point. As a sponsored test, the sponsor will be victorious.
And therein lies the problem. Sponsored tests are meaningless, from any vendor. I simply don’t believe that sponsored tests provide value to the technical community. But that’s ok – they’re not targeted at the technical community. They’re marketing tools, used by sales and marketing teams to sway the opinions of management decision makers with lots of “independent” results. If I want to know which server platform is better for my environment, I’m going to do my own research, and if necessary invite the vendors in for a bake-off. Prove it to me, with your tuning as necessary, and I’ll have the other vendors do the same.
My real problem with these tests, understanding that they’re not aimed at the technical community, is that many in the technical community use them to “prove” that their platform is superior to whoever they’re competing against at the moment. Like politics, these kinds of arguments just make me tired. Anyone coming into the argument already has their side picked – no amount of discussion is going to change their mind. My point for blogging about UCS is not to sell it – I don’t sell gear. It’s because I believe in the platform and enjoy educating about it.
I happen to prefer Cisco UCS, yes. If you’ve ever been in one of my UCS classes, you’ll have also heard me say that HP and IBM – Cisco’s chief rivals in this space – also make excellent equipment with some very compelling technologies. The eventual “best” solution simply comes down to what’s right for your organization. I understand why these sponsored tests exist, but to me, they actually lessen the position of the sponsor. They make me wonder, “if your product is so good, why stack the deck in the test?” The answer to that, of course, is that the engineers aren’t the ones requesting or sponsoring the test.
As came up in my UCS class today, for the vast majority of data center workloads, the small differences in performance that you might be able to get out of Vendor X’s or Vendor Y’s server are completely meaningless. When I used to sell/install storage, I was regularly asked which company’s storage to buy if the customer wanted maximum performance. My answer, every single time, was, “HP, or IBM, or HDS, or EMC, or…” Why? Because technology companies are always leapfrogging each other with IOPS and MB/s and any other metric you can think of. What’s the fastest today in a particular set of circumstances will get replaced tomorrow by someone else.
So what’s the solution? Well, true independent testing, of course. How do you do true independent testing? You get a mediator (Tolly, Principled Technologies, etc. are fine), have representatives from both manufacturers agree on the testing criteria, and allow each manufacturer to submit their own architecture and tuning to meet the testing criteria. The mediator then performs the tests. The results are published with the opportunity for both manufacturers to respond to the results. Think any marketing organization from any company would ever allow this to happen? The standard line in the testing industry is “Vendor X was invited to participate, but declined.” Of course they declined. They’ve already lost before the first test was run. I wouldn’t expect Cisco to participate in an HP-sponsored Tolly test any more than I’d expect HP to participate in a Cisco-sponsored Principled Technologies test.
Don’t chase that 1% performance delta. You’ll just waste time and money. Find the solution that meets your organizational needs, provides you the best management and support options, and buy it. Let some other chump chase that magical “fastest” unicorn. It doesn’t exist. As in all things IT, “it depends.”
All comments are welcome, including those from testing organizations!
3 thoughts on “The Problem With Vendor Sponsored Testing”
Reposting with permission from Kevin Tolly:
“In any given test project, we state what we are trying to prove, how we are going about proving same and what we found. We don’t try to make decisions for users, It is up to them to decide whether a given test is relevant to their needs.
We clearly state the level of vendor involvement when running a competitive test. Yes, there are times when Cisco (and other vendors) get involved in tests sponsored by competitors. In fact, it is more often than not that they DO get involved. And, yes, we do conduct tests where HP (and other Tolly test sponsors) is the competitor. In our most recent LAN switch test, sponsored by LG-Nortel and involving Cisco, Cisco notified us that they had a new model switch that would be more appropriate for the test. Cisco shipped us their preferred switch, it was tested in place of the original and those results were published. Furthermore, that test also compared LG-Nortel vs two HP LAN Switches. HP participated in the test and also confirmed the accuracy of their results. Please see: http://www.tolly.com/DocDetail.aspx?DocNumber=210125.
I don’t disagree with you that, in a perfect world, each user should run their own bake-off. While that may be possible with certain technologies and products, there are many tests that we run that require million-dollar-plus test rigs, a lot of test tool expertise and sometimes weeks of lab time. Most prospective buyers just can’t run them. In fact, many publications that feature a “test lab” cannot even test gear properly as they, too, do not have the requisite time, gear or expertise.
Whenever possible, we run our tests according to “common test plans” that we have developed with input from interested technologists from the vendor and analyst/consultant community. They do not yet include server tests but do cover LAN Switch and VoIP “green” tests, Data Loss Prevention, App Switching (L4-7) and other areas. We maintain a Common Test Plan site (http://www.commontestplan.org/) and are always open to suggestions for further test plans.
Thanks for your interest in the matter. Please feel free to post and/or share with your readers as you see fit.
The Tolly Group”
Great post, and I commend you on taking an objective stance. The moral is that the test will always reflect the funding, and therefore these tests must be taken with a grain of salt or thrown out completely.
My major complaint with tests like the ones you’re describing is the summaries of results and assumptions obtained from the data. Some of the recent reports in question start with a summary that is far from an independent assessment.
If as Kevin states:
“In any given test project, we state what we are trying to prove, how we are going about proving same and what we found. We don’t try to make decisions for users, It is up to them to decide whether a given test is relevant to their needs.”
Then they should be very wary of making emphatic statements about the results if the results were based on testing designed to highlight strengths compared to weaknesses.