Dying of Human Errors

A few months ago, the world mourned the loss of 157 people in the ill-fated Ethiopian Airlines crash in Ethiopia. The plane lost contact with the control tower barely six minutes after take-off, and all that was left of the aircraft were the pieces scattered at the crash site. A global response led to the grounding of the 737 Max aircraft. The black boxes were recovered and sent for analysis in Paris rather than the United States. Ethiopian Airlines chose Paris for reasons best known to the airline. One school of thought is that the airline was concerned that an analysis carried out in the United States could be biased because Boeing is an American company, and that it expected an independent, unbiased report from Paris. Charlton (2019) cited Peter Goelz, a former managing director of the National Transportation Safety Board, who suggested that sufficient evidence needed to be gathered against the Federal Aviation Administration (FAA) and Boeing, since the FAA certified the 737 Max jets as airworthy.

A preliminary report released by Ethiopia’s Minister of Transportation revealed that the pilots of the doomed Ethiopian Airlines Flight 302 followed all the procedures recommended by Boeing when the airplane nosedived, but they could not prevent the crash. The minister affirmed that the preliminary report was based on findings from the aircraft’s black boxes.

A day after the release of the preliminary report, Drew Griffin, an investigative reporter at CNN, revealed that current and former Boeing employees had blown the whistle on poor practices within Boeing’s operations. The report showed that Boeing’s anti-stall software could have caused the crash, because the plane nosedived repeatedly despite the pilots’ efforts to follow Boeing’s specified procedures.

According to the U.S. Federal Aviation Administration, the issues raised by the whistleblowers include damage to the wiring of the angle of attack sensor by a foreign object and concerns about the anti-stall system, the Maneuvering Characteristics Augmentation System (MCAS).

It is clear that these employees could no longer stomach the persistent wrongdoing they observed in the workplace. Although the tip-off was anonymous, two questions remain: were they right to report to the regulatory body, and did the report come too late, after the Lion Air and Ethiopian Airlines crashes? Their ethical stance, conscience and values might have motivated them to breach the various agreements they consented to during and after their employment at Boeing.

The design of the 737 Max jets placed an angle of attack sensor on each side of the fuselage, but MCAS takes readings from just one of the two. The angle of attack sensor, an Internet of Things (IoT) device, feeds data to MCAS, and the content of that data determines whether the software-powered MCAS is engaged. Consistently incorrect data will cause MCAS to push the plane into a nosedive repeatedly, putting human lives at risk and casting doubt on the expertise of pilots as they struggle to regain control.

Griffin reported that Boeing’s CEO acknowledged that the incorrect data that led to the MCAS malfunction was one link in the chain of events. This raises the question of why Boeing, with decades of experience, designed a plane with a single point of failure. IoT devices are prone to generating bad data. Harel Kodesh, CTO of GE Digital, stated that 40% of the data generated by IoT networks is “spurious”. Environmental factors affect the data that sensors generate: the harsher the conditions, the more incorrect data a device is likely to produce. Sometimes devices simply begin to malfunction on their own, which could be the result of poor configuration management. Lawson (2016) recommended redundancy in IoT systems, an approach that would address a single point of failure such as the one in Boeing’s design. Moreover, a system with a single point of failure should have its risk profile updated and continuously evaluated.
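The redundancy idea can be illustrated with a minimal sketch in which an automated system acts only when two independent sensors agree. This is not Boeing's MCAS logic: the function names, the disagreement limit and the activation threshold are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical limit: the largest disagreement (in degrees) tolerated between
# the two sensors. Illustrative only; not a Boeing or FAA figure.
MAX_DISAGREEMENT_DEG = 5.0


@dataclass
class Reading:
    """One sample from each of two redundant angle of attack sensors."""
    left_deg: float
    right_deg: float


def validated_angle(reading: Reading) -> Optional[float]:
    """Return the mean angle if the sensors agree, otherwise None."""
    if abs(reading.left_deg - reading.right_deg) > MAX_DISAGREEMENT_DEG:
        return None  # sensors disagree: treat the data as untrustworthy
    return (reading.left_deg + reading.right_deg) / 2.0


def should_engage(reading: Reading, threshold_deg: float = 14.0) -> bool:
    """Engage automation only on validated data above a (hypothetical) threshold."""
    angle = validated_angle(reading)
    return angle is not None and angle > threshold_deg


# One faulty sensor reporting an extreme angle no longer triggers the system.
print(should_engage(Reading(left_deg=4.0, right_deg=74.0)))   # False
print(should_engage(Reading(left_deg=15.0, right_deg=16.0)))  # True
```

With a single input, the faulty 74-degree reading alone would exceed the threshold; with the cross-check, the disagreement is detected and the automation stays disengaged.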

A report by Dominic Gates, aerospace reporter at The Seattle Times, detailed how flawed analysis by Boeing and failed oversight by the FAA affected the 737 Max flight control system. The FAA delegated its responsibility for safety assessment to Boeing. Auditees should not take part in the assessment of their own work; when they do, deficiencies in that work are all but certain to be covered up. A control lapse by a regulatory body led to significant losses.

"When regulations fail, standards diminish and a worst case scenario becomes acceptable."

Inadequacies in the report that Boeing’s team presented to the FAA understated the capabilities of the 737 Max MCAS technology. Such an understatement limits the ability to build accurate risk profiles for the IoT and software development processes used in production, and therefore the ability to determine the catastrophic impact of these technologies and to proffer adequate responses. Gates reported that Boeing’s system safety analysis classified an MCAS failure as a “major failure”, which implies that a system failure could cause distress and injuries but not loss of life. In flight operations, by contrast, the action of MCAS when activated was classified as a “hazardous failure”, which means that a failure of the system could cause fatal injuries to a small number of passengers. Neither classification reflected the true state and potential of MCAS.

Skybrary, an aviation safety data repository, advocates a structured approach to ensure that all potential hazards and likely scenarios are identified and assessed so that hazards are classified correctly. Table 1 is an operational safety assessment hazard classification matrix used to classify and evaluate the impact, or severity, of an aviation risk.

Table 1: Operational Safety Assessment Hazard Classification / Severity Matrix
Source: Skybrary Aviation Safety

 

The Lion Air and Ethiopian Airlines crashes left no survivors; both resulted in multiple fatalities among the occupants of the aircraft. Therefore, the potential impact of MCAS ought to be classified as the most severe. The likelihood, or frequency, of occurrence is another important factor in assessing the risks posed by MCAS technology. MCAS was activated during flight operations on the Lion Air aircraft in two separate incidents on two consecutive days. On the first day an off-duty pilot saved the flight; the next day the aircraft crashed off the coast of Java. Ethiopian Airlines Flight 302, on the other hand, did not survive its first encounter with this technology- and human-inflicted disaster. In five (5) months, the event occurred three times. The likelihood of occurrence is defined in Table 2.

Table 2. Frequency/Likelihood Classification
Source: Civil Aviation Authority

Table 2 was developed by the Civil Aviation Authority, a safety regulator that provides safety guidelines for aerodrome operators and air traffic service providers. Adopting Table 2 as the classification model for this study, the frequency of MCAS activation can be classified as reasonably probable, considering that the gap between the previous and the most recent event is approximately 130 to 150 days.
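The size of that gap can be checked with simple date arithmetic. The sketch below uses the publicly reported dates of the two crashes, which are not stated in this article and are supplied here as an assumption.

```python
from datetime import date

# Crash dates as publicly reported; they are not given in the article itself.
lion_air_crash = date(2018, 10, 29)
ethiopian_crash = date(2019, 3, 10)

# Subtracting two dates yields a timedelta; .days is the whole-day gap.
gap_days = (ethiopian_crash - lion_air_crash).days
print(gap_days)  # 132
```

A gap of 132 days falls inside the 130 to 150 day range used for the classification.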

Fig. 3 is a risk tolerability matrix developed by the Civil Aviation Authority. Determining the risk tolerance level involves weighing the probability of occurrence against its severity. With the most severe impact and a reasonably probable likelihood, the risk posed by MCAS is unacceptable.

 Fig. 3. Risk Tolerability Matrix 
Source: Civil Aviation Authority
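How a tolerability matrix combines the two dimensions can be sketched as a simple lookup. The sketch below does not reproduce the Civil Aviation Authority's matrix: the scale labels, the rank-sum rule and the cut-off scores are hypothetical placeholders, chosen only so that the combination discussed above (most severe impact, reasonably probable) comes out as unacceptable.

```python
# Hypothetical ordinal scales, least to most severe / least to most likely.
# These labels and the scoring rule are assumptions for illustration and
# are not taken from the Civil Aviation Authority guidance.
SEVERITY = ["negligible", "minor", "major", "hazardous", "catastrophic"]
LIKELIHOOD = [
    "extremely improbable",
    "extremely remote",
    "remote",
    "reasonably probable",
    "frequent",
]


def tolerability(severity: str, likelihood: str) -> str:
    """Classify a risk by adding the ranks of its severity and likelihood."""
    score = SEVERITY.index(severity) + LIKELIHOOD.index(likelihood)
    if score >= 6:   # hypothetical cut-off
        return "unacceptable"
    if score >= 4:   # hypothetical cut-off
        return "review"
    return "acceptable"


print(tolerability("catastrophic", "reasonably probable"))  # unacceptable
print(tolerability("minor", "extremely remote"))            # acceptable
```

A real assessment would use the regulator's published cell-by-cell matrix rather than a rank sum; the point of the sketch is only that severity and likelihood are assessed separately and then combined.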

The crashes of new aircraft produced by a mature company like Boeing, which holds CMMI level 5 certifications in several processes, highlight the many overlooked risks inherent in new innovations and products, or perhaps the poor practices and human errors that led to multiple fatalities and whistleblowing.

After the Ethiopian Airlines crash, Boeing’s CEO reiterated at a news conference that the company had followed all the steps in its processes for the design and certification of the aircraft. Developing or adopting new technology may be a good idea for a business, but all the risks should be identified and accurately profiled. Boeing’s response to the MCAS malfunction is to provide a software fix, which is in line with the recommendations of the Civil Aviation Authority: when the consequence of a risk is unacceptable, a redesign of the system may be necessary to reduce the likelihood or severity of that consequence. Gates reported that the fix is expected to alter the MCAS functional design and allow MCAS to receive data from both angle of attack sensors (redundancy). While the risk remains part of the system, the fix is expected to give pilots more control than they have at present.

Counting the cost of grounding the 737 Max globally, the cost of the software redesign, reputational damage, death benefit payouts and the loss of human lives, it is clear that a diligent oversight function and sound risk management practices would have saved Boeing from these blushes and families from lasting pain.

 

REFERENCES 

Charlton, A. (2019). Why France is analyzing Ethiopian jet’s black boxes. Retrieved from https://www.seattletimes.com/business/why-france-is-analyzing-ethiopian-jets-black-boxes/

 

Civil Aviation Authority. (2006). Guidance on the conduct of Hazard Identification, Risk Assessment and the Production of Safety Cases. Retrieved from https://www.icao.int/safety/pbn/Documentation/States/UK CAA CAP760 Guidance on Conduct of Hazard Identif. Risk Ass. Production of Safety Cases .pdf

 

Gates, D. (2019). Flawed analysis, failed oversight: How Boeing and FAA certified the suspect 737 MAX flight control system. Retrieved from https://www.seattletimes.com/business/boeing-aerospace/failed-certification-faa-missed-safety-issues-in-the-737-max-system-implicated-in-the-lion-air-crash/

 

Gates, D. (2019). Facing Sharp Questions, Boeing CEO Refuses to Admit Flaws in 737 MAX Design. Retrieved from https://www.seattletimes.com/business/boeing-aerospace/facing-sharp-questions-boeing-ceo-refuses-to-admit-flaws-in-737-max-design/

 

Griffin, D. (2019). Source: Boeing whistleblowers report 737 Max problems to FAA. Retrieved from https://edition.cnn.com/2019/04/26/politics/faa-hotline-reports/index.html

Lawson, S. (2016). Worm on the Sensor: What Happens When IoT Data is Bad. Retrieved from https://www.cio.com/article/3151081/worm-on-the-sensor-what-happens-when-iot-data-is-bad.html

 

Skybrary. (n.d.). Risk Assessment. Retrieved from https://www.skybrary.aero/index.php/Risk_Assessment#Severity_of_Hazards
