If there ever was any doubt that defective software could kill people - lots of people - then two catastrophic Boeing 737 MAX 8 jetliner crashes erased it.
But so far, there is scant evidence that these tragedies have revolutionised software development. That is a tragedy in itself, given the availability of methods and tools to eliminate defects in safety-critical software.
It’s been 10 months since a Lion Air 737 MAX 8 crashed into the Java Sea off Indonesia, killing all 189 passengers and crew, due to what investigators described as a “glitch” in the plane’s flight-control software.
Following the crash, the Federal Aviation Administration (FAA) issued an emergency notice to operators of Boeing 737 MAX 8 and 9 planes, warning that faulty angle of attack sensor readings “could cause the flight crew to have difficulty controlling the airplane.” This loss of control, the FAA said, in a euphemistic understatement, could lead to “possible impact with terrain.” (Or, in this case, the ocean.)
And then it happened again, a little more than four months later, when an Ethiopian Airlines 737 MAX 8 went down under similar circumstances, killing all 157 people aboard.
That incident prompted the grounding of all 737 MAX 8 jetliners worldwide. They remain grounded today, after US regulators found another software flaw in late June. Word from the FAA is that the planes may not be flying again until December or perhaps even into 2020.
All because of defects in safety-critical software.
Granted, as the efforts to fix the “glitch” have dragged on, inevitable media leaks have suggested that multiple factors are to blame. They include an alleged lack of pilot training, a failure to include instructions for the flight-control system known as MCAS (the manoeuvring characteristics augmentation system) in the flight manual, and the outsourcing of software development to low-paid workers from India.
Still, Boeing essentially acknowledged that software was the primary problem when an unnamed official told CNN Business, “We believe this can be updated through a software fix.”
How to build better safety-critical software
The situation is a stark reminder that modern society is increasingly dependent on the quality and security of software. We look to software not only to provide convenience but also to protect our lives and ensure our safety.
Organisations can ensure the quality and security of safety-critical software in a variety of ways, but they all rest on the same concept: building software integrity in during development. Many organisations rely on the alternative, which is bolting or patching it on at the end. But in the long run, “shifting left,” that is, moving testing and analysis earlier in the development lifecycle, makes it cheaper and faster to build a superior product.
Shifting left requires the use of multiple testing and analysis tools throughout the software development lifecycle, such as:
Architecture risk analysis during design
Static analysis during development
Interactive application security testing (IAST) during testing/QA
Dynamic analysis prerelease
Another essential tool is software composition analysis, which helps developers uncover known bugs and vulnerabilities in open source components.
A similar message came from Eric Elliott, an author and distributed systems expert, in a post on Medium. Securing a “life-critical system,” he said, has to include “many lines of defence in order to assure quality control.”
Among the lines of defence he listed: requirement specification and review, risk analysis, test-driven development, static analysis, and software inspection/code review.
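To make one of those lines of defence concrete, here is a minimal sketch of test-driven development applied to a sensor check. It is purely illustrative and is not drawn from any real flight-control system: the function name validated_angle, the two-sensor arrangement, the 5-degree disagreement threshold and the plus-or-minus 90-degree plausibility range are all assumptions made for the example. The tests state the safety requirements first; the function exists only to satisfy them.

```python
import math


def validated_angle(left_deg, right_deg, max_disagreement_deg=5.0):
    """Return the mean of two redundant readings, or None if they cannot be trusted.

    Hypothetical example: thresholds and ranges are assumptions, not real avionics values.
    """
    for reading in (left_deg, right_deg):
        # Reject NaN, infinity, and physically implausible values.
        if not math.isfinite(reading) or not -90.0 <= reading <= 90.0:
            return None
    # Reject the pair when the two sensors disagree by more than the threshold.
    if abs(left_deg - right_deg) > max_disagreement_deg:
        return None
    return (left_deg + right_deg) / 2.0


# In test-driven development, requirements like these are written before the function.
def test_agreeing_readings_are_averaged():
    assert validated_angle(4.0, 6.0) == 5.0


def test_disagreeing_readings_are_rejected():
    assert validated_angle(4.0, 40.0) is None


def test_non_finite_reading_is_rejected():
    assert validated_angle(float("nan"), 5.0) is None


def test_out_of_range_reading_is_rejected():
    assert validated_angle(120.0, 5.0) is None
```

A test runner such as pytest would collect and run the four test functions. The design choice worth noting is that the function returns None rather than a best guess when its inputs look wrong, so the caller is forced to decide explicitly what to do when the data cannot be trusted.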
Slow is fast in software development
None of this is easy. Sammy Migues, Principal Scientist at Synopsys, said shortly after the Lion Air crash that before automation, “we had a pretty short list of things that could go wrong — metallurgy, engine, construction etc.”
“Now there are a million things that could go wrong, from software, software integration, software errors, software interfaces, unexpected conditions that software has to deal with, and so on. There are way more situations that can adversely impact passenger safety,” he said.
In any case, the result of building quality and security into safety-critical software is a better product. And a “better” product is less likely to be dangerously flawed or to threaten life and safety.
Elliott and numerous other experts warn that there is no such thing as software that is fast, cheap, and good. It’s a dangerous, and potentially lethal, fantasy. The irony is that trying to go “fast and cheap” in the development of software yields the opposite results.
“The phrase ‘slow is fast’ is well known in engineering circles,” he said. “It has origins in the military phrase ‘slow is smooth and smooth is fast,’ and it’s a form of uncommon sense.”
As in lifesaving common sense.
Taylor Armerding is a Security Expert at Synopsys Software Integrity Group.