The Slaughter of QA
We’ll start with a quiz that even readers outside of the tech industry should be able to answer:
Let’s say you run an enterprise of any sort that produces just about anything, and you have a department devoted to Quality Assurance, staffed by testers whose job it is to validate the output of your enterprise. One day you decide to fire your entire QA department.
What will be the most likely effect on your enterprise’s output?
A. Overall quality will improve.
B. Overall quality will stay about the same.
C. Overall quality will decline.
If you answered C, you are of course correct, unless you are a software executive after around 2014, in which case you have your own answer, one not on the list:
D. It does not matter.
No less an industry giant than Microsoft did this very thing in 2014, firing their QA staff. I have read different accounts, some saying they only fired certain categories of testers. But if you have used any Microsoft product since 2014, would you doubt the essential details of this story?
How could Microsoft, never known for high-quality software to begin with, get away with that? In fact, how could much of the software industry follow Microsoft’s lead and do the same?
I know this story from firsthand experience, at the company I worked for in 2017. My development team had a number of fine QA testers, and we all got along well and produced good work together. Then quite suddenly the company decided that Quality Assurance Tester was no longer a necessary job function, and eliminated the title wholesale. It was gut-wrenching. Professional colleagues I had known for years and worked closely with daily were just swept out the door.
It was worse than a “normal” layoff, where the implicit message from the company is, “We need to cut expenses and we can no longer afford to pay you.” It’s easier to receive that sort of message objectively. But with the QA title elimination the message was basically, “We were wrong all these years, it turns out your core professional skills are not really useful to us.” Now that is a real punch in the gut.
Why did executives believe it was okay to eliminate their QA departments? And with what did they replace it?
A Brief History of Software QA
Manual Testing – When I started in the industry in the early 1990s testing was mostly manual. A tester would interact with the software just as a user would—clicking around, entering text, hitting buttons and then observing the outcomes. The developer would first test their own software manually, and then hand it over to the QA department for further manual testing. This worked fine but could be inefficient in that human hands actually had to operate the software, and could do so only at a certain speed.
Programmatic Testing – In the late 1990s I started to see more programmatic testing, where a separate program was written to interact with the main software component in order to test it. For example, instead of a human tester inputting a text string into a field, the special testing program would insert that string into the software component. The advantage is one of speed. A testing program could insert hundreds of different strings into the software component in the same amount of time as a human entering one string, so many more scenarios could be tested, resulting in higher quality.
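To make the idea concrete, here is a minimal sketch of programmatic testing in Python. The component under test (parse_quantity) and the input values are invented for illustration; the point is only that a test program can push a couple hundred inputs through the component in the time a human tester would need to type one.

# The "main software component" for this sketch: turn user input into a
# positive integer quantity, or reject it.
def parse_quantity(text):
    value = int(text.strip())
    if value <= 0:
        raise ValueError("quantity must be positive")
    return value

def run_tests():
    # Inputs a human tester would have had to type one at a time.
    good_inputs = [str(n) for n in range(1, 200)] + ["  7  ", "0042"]
    bad_inputs = ["0", "-5", "", "abc", "1.5", "NaN"]

    failures = []
    for text in good_inputs:
        try:
            parse_quantity(text)
        except Exception as exc:
            failures.append((text, exc))
    for text in bad_inputs:
        try:
            parse_quantity(text)
            failures.append((text, "accepted invalid input"))
        except ValueError:
            pass  # invalid input was rejected, as expected

    print(f"{len(good_inputs) + len(bad_inputs)} cases run, {len(failures)} failures")
    for case in failures:
        print("FAIL:", case)

if __name__ == "__main__":
    run_tests()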
Unit Tests – A kind of programmatic testing that became common starting around 2000 was the unit test. This was a short test of a small area (a unit) of the main software component. These were usually written by the developer to validate their own code, enhancing the developer’s manual testing. There could be hundreds or thousands of these little tests, and all were supposed to pass before the software could be released. Often QA testers also wrote their own unit tests, to make the testing more comprehensive.
Unit tests were one of the last major improvements to software engineering practice. At first, as a developer, I didn’t see the benefit: I had already written the main software component, and now I had to write a separate program to confirm every expected behavior of that component? But the advantage was that after making a change in a large system, you could verify you hadn’t inadvertently broken anything by running the full set of unit tests, obtaining peace of mind in a matter of minutes.
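For readers who have never seen one, here is a minimal sketch of a unit test in Python, using the standard library’s unittest module. The function under test (apply_discount) is invented for this example; a real suite holds hundreds or thousands of such small checks, and one command re-runs them all after any change.

import unittest

def apply_discount(price, percent):
    """Return price reduced by percent, never below zero."""
    if percent < 0 or percent > 100:
        raise ValueError("percent must be between 0 and 100")
    return max(0.0, price * (1 - percent / 100.0))

class ApplyDiscountTests(unittest.TestCase):
    def test_normal_discount(self):
        self.assertAlmostEqual(apply_discount(100.0, 25), 75.0)

    def test_zero_discount_returns_original_price(self):
        self.assertAlmostEqual(apply_discount(80.0, 0), 80.0)

    def test_full_discount_is_free(self):
        self.assertAlmostEqual(apply_discount(80.0, 100), 0.0)

    def test_invalid_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(80.0, 150)

if __name__ == "__main__":
    unittest.main()  # one command re-runs every check after any change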
Developers and QA Testers
The ratio of developers to QA testers at a company was variable. I have seen 8 developers to 1 QA person, 5 to 1, 2 to 1 and even roughly 1 to 1. At times some developers perhaps did not get along with some QA testers. I have always welcomed QA because if they found a bug in my software it meant that it wasn’t found by an actual paying customer, which is much more embarrassing and could also affect the business. But it can also be a stressful situation, as a developer, to do difficult brain work and then to pass it to another educated person whose job it is to poke holes in all of your work.
The Slaughter
By the mid-2010s at my company, and apparently at Microsoft, the testing was done by:
1. Developers doing unit testing
2. QA testers doing manual testing
3. QA testers doing programmatic testing
Category 2 was what was eliminated; those testers were told their job function was no longer necessary. At my company they were given the option to transform themselves into DevOps Engineers (see my prior essay about how that role also encompasses the QA function, among others), and had half a year to make the change, through additional training if needed. Some managed to do this, but most were let go.
Management deemed QA to be too costly, but they still desired quality. They thought they could still obtain it, without a separate QA job function, by enlisting the developers. The view was that developers have all the same technical skills as QA testers, so why not just use part of their time for QA? They were already doing unit tests. Why not have them also check the work of their peers? This view betrayed an astonishing lack of understanding of human nature.
The Painting Company
Imagine someone decides to form a house painting company. He hires four equally good painters, plus a fifth person as supervisor to keep an eye on their work and make sure it meets his standards. The four painters paint their individual rooms, and the supervisor pops in from time to time to point out flaws in their work. The quality level is high and the company thrives.
One day the owner decides that since the quality is good enough that he never gets customer complaints, the supervisor is no longer worth the money. He fires the supervisor and tells the painters that part of their job is now to ensure the quality of their peers. From time to time they should stop their own painting and go to other rooms and check the work of their fellow painters.
What possible incentive would a painter have to find fault in someone else’s work? For one, they are just trying to get their own room done, and to go to another room prevents them from progressing on their own. Also, how comfortable would they be in finding fault, especially since that other painter, if so inclined, could make extra effort to find fault in their own work? The only incentive would be that the boss said they’re supposed to do it. But what are the chances the boss will come by and figure out that they failed to find fault in someone else’s work?
Really, the only person who can check the painters’ work is the boss, or someone to whom the boss has delegated this checking function (i.e. a QA team).
Developers Shall Look After Themselves
That essentially was the plan my company devised to replace the lost QA team. We DevOps engineers would take turns testing and critiquing each other’s code. In practice I don’t recall anyone ever directly testing someone else’s code. Developers would write their own unit tests, and that was the extent of the testing.
Critiquing of code did occur. When developers checked in code, at least one other developer would review it for general quality, looking for problems. I have never seen a code review catch a bug. At most, a code review would produce a lot of pedantic comments about how the code, which had already been shown to work via its unit tests, could be even better in the reviewer’s opinion, by using a more recent version of some library, for example. These steps might improve quality in the abstract, but they did not improve it concretely, as actual testing by a QA tester to find bugs would.
Robot Testing
The QA teams were also partially replaced by a robot called SonarQube [sic]. This was a piece of software that would analyze the developers’ code looking for potential problems. As with human code reviews, I don’t recall SonarQube ever finding an actual bug, but it did find way, way more stylistic problems in the code, claiming that things like bad variable names represented “critical” issues. Again, this contributed to quality in the abstract only, since variable names make no difference to the actually running software. Worse, the managers obsessed over the SonarQube reports and believed that improving the SonarQube report represented an actual improvement in code quality, which did not follow in practice. Bugs would still sneak by the human and robot code reviewers.
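To illustrate the distinction (this is my own sketch in Python, not actual SonarQube output), here is code that behaves correctly and passes its check, yet is exactly the kind of thing a static analyzer would flag for its variable names:

def total(p, q, t):
    # p = unit price, q = quantity, t = tax rate. The behavior is correct;
    # the single-letter names are what a style checker would complain about.
    return p * q * (1 + t)

# The check passes -- the complaint, whatever severity label it carries,
# is about quality in the abstract, not about what the running software does.
assert abs(total(10.0, 3, 0.10) - 33.0) < 1e-9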
Canary Testing
The final technique that replaced the QA department was canary testing, a reference to the canary in the coal mine that dies sooner in bad air than the coal miners do. And the canary here is you, the end user of the software for which you have probably paid a lot of money. Canary testing is easy to do with an online system. You install the new software you want to test, and then divert some portion of the end user traffic, say 5%, to the new software, forcing that 5% of users to use the not fully tested software without their knowledge. You then look for signs of crashes and other errors. If you see too many, you pull the release and bring it back to the shop. If you don’t, you release the software to all of your users.
This essentially forces the end user to test the company’s software, and is a common industry practice.
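For the curious, here is a minimal sketch of the canary mechanism in Python. In practice the routing is done by a load balancer or service mesh rather than application code, and the 5% fraction, the error threshold, and the handler names here are all assumptions made up for illustration.

import random

CANARY_FRACTION = 0.05   # 5% of users get the new, not fully tested build
ERROR_BUDGET = 0.02      # pull the release if more than 2% of canary requests fail

canary_requests = 0
canary_errors = 0

def handle_with_stable(request):
    return "ok"          # stand-in for the current production version

def handle_with_canary(request):
    return "ok"          # stand-in for the new release under observation

def route(request):
    """Send a small slice of live traffic to the canary and watch for errors."""
    global canary_requests, canary_errors
    if random.random() < CANARY_FRACTION:
        canary_requests += 1
        try:
            return handle_with_canary(request)
        except Exception:
            canary_errors += 1   # this failure happened to a real, paying user
            raise
    return handle_with_stable(request)

def should_roll_back():
    """Decide whether to pull the release based on the canary's error rate."""
    if canary_requests == 0:
        return False
    return canary_errors / canary_requests > ERROR_BUDGET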
Conclusion
Does the software industry still care about quality? To a certain extent, yes; but no, not to the level it used to. There is a feeling that the whole industry decided its quality levels were too high, and that it could save a lot of money by lowering the standards while still keeping the same number of customers. Does anyone disagree with this cynical assessment? When was the last time you or your company dropped a software package you had purchased because of its quality issues? I have never seen it happen. The only time companies seem to switch software is to a lower-cost alternative.
And let’s say you get sick of Microsoft Teams’ glitches. To what competing software product would you switch to get improved quality? Maybe there are better ones out there, but all the enterprise software I have been exposed to in the past ten years is of comparable quality. It’s almost like there’s a monopoly on unQuality, like they all agreed to create equally crappy products.
I don’t know the way forward. The way to improve quality is obviously to return to independent QA testing. But as long as companies aren’t held to higher standards by their buyers, they have zero incentive to spend more money on quality.
*About the picture: Those are the steps of the Parthenon temple, which are subtly curved upward. Viewed from the front of the building the top step is 101’ wide and the center curves up 2 3/8”. This was done to compensate for the fact that if the steps were totally parallel to the ground the center would appear to dip. The columns have similar, delicate swellings.