=================================
NASA cultural failures on STS-107
=================================

by Andrew Main (Zefram)
2003-02-07T05Z

abstract
--------

On mission STS-107, the space shuttle Columbia (OV-102) suffered physical damage to its left wing during ascent. It is possible that this damage contributed to the subsequent breakup and loss of the orbiter during descent. During the entire flight, despite being aware that damage had occurred, NASA remained unaware of the extent of the damage, making inadequate efforts to determine the nature of the damage. This error is ascribed to three aspects of NASA's management of manned spaceflights: excessive reliance on checklists, cumbersome EVA procedures, and a lack of autonomy for astronauts in flight.

0. table of contents
====================

    0. table of contents
    1. introduction
    2. failure to ascertain the nature of the damage
    3. excessive reliance on checklists
    4. cumbersome nature of EVA
    5. lack of autonomy for astronauts in flight
    6. conclusion
    7. references

1. introduction
===============

With reports pointing to the Columbia breakup starting at the left wing, preceded by overheating in the same area [MCC-0203], it would now seem strange if the breakup were not related to the unprecedented damage sustained to the left wing during ascent. However, even if it is not the cause of the accident, in the handling of the damage we have witnessed a serious failure of NASA's philosophy for manned spaceflight.

2. failure to ascertain the nature of the damage
================================================

During ascent of the space shuttle Columbia on mission STS-107, a piece of insulating foam became detached from the main fuel tank, and collided with the left wing of the orbiter, causing physical damage. NASA ground crew observed the damage event during ascent, and scrutinised video frame by frame in an attempt to determine the extent of the damage sustained.
Satisfied that the damage was unlikely to be critical, they allowed the mission to proceed as planned, through a sixteen day stay in orbit and into atmospheric re-entry.

The information that could be gleaned concerning the damage from video shot from the ground was quite limited. Of particular concern was the question of how many heat-shield tiles were lost. The images examined by NASA had insufficient resolution to see individual tiles [MER]. Of course, at that stage of ascent the shuttle flies inverted, so the underside of the wing is not directly visible from the ground. The damage site was therefore not directly imaged at any resolution, and so any structural damage not causing fragments to fall off would leave no direct evidence on the video.

Here's what the shuttle program manager, Ron Dittemore, had to say on this matter in a press conference following the accident [DIT-0201]:

    [I]n our review the following day of the launch films, we saw this piece of debris drop off, and it looked to us like it impacted the orbiter on the left wing. Where on the left wing it's very difficult for us to tell. Somewhere between the mid and outward span. Was it the leading edge? We don't know. Was it underneath the leading edge? We really don't know.

There is a simple way in which better information could have been obtained concerning the damage. All it would take is for an astronaut -- of whom there were seven available on the shuttle -- to go outside the vehicle and look. (As the damage site was not visible from the cockpit, it was not possible to make an examination without EVA [HQ-0205].) Even if no analysis could be made by the people in orbit, there were video cameras available, by which close-up images could have been sent to experts on the ground. Even if no cameras could be operated in EVA, the astronauts have scientific training and could surely have verbally given an adequate description of the damage.
Yet at no point during the flight was there any attempt to make a direct examination of the damage site. This failure to do the obviously necessary can be directly attributed to the manner in which NASA manages manned spaceflights. I wish to highlight three aspects of this management that contributed to the error discussed above.

3. excessive reliance on checklists
===================================

Firstly, there is the slavery to the checklist. NASA staff seem to be stuck in a FORTRAN mindset: "if you run out of bullets, you continue anyway because you have no exception-processing ability". When failures occur that are anticipated on the checklist, everything is all right, because there's another checklist to follow to handle it. When the unanticipated happens, this approach fails, and they are lost. Even if a genius on the ground works out the right solution to a novel problem, the solution gets translated into a checklist before it goes to the astronauts. Apollo 13 was a case where a novel checklist worked, but creating and communicating the new checklist almost took more time and paper than was available to the astronauts.

Later in the same press conference quoted above [DIT-0201], Ron Dittemore was asked whether an EVA was considered for examination of the damage. He said:

    We do not have the capability to perform a space walk and do tile repair. We do not have the capability -- as you know, when we go out of the spacecraft, we operate really within the confines of the payload bay. On this particular mission, ... all we had trained to do from a space[walk] perspective were those things that might be an emergency or a latch did not work and the payload bay door closing sequence or something like that. We can go outside and make sure they're closed. We have no capability to go over the side of the vehicle and go underneath the vehicle and look for an area of distress and repair a tile.

This sounds more like a failure of imagination than a lack of physical capability. Granted, getting to the underside of the craft would be difficult, particularly without the manipulator arm (which was not carried on STS-107 [DIT-0201]). Obviously there's no checklist for it. But it's inconceivable that the crew would be incapable, in an emergency, of delivering an astronaut to any area of the vehicle exterior. Columbia was equipped, as usual, with the standard EVA equipment such as tethers [JSC-0205]. The tethers are sized for operations within the payload bay, but no mention is made in [JSC-0205] of possibilities such as linking tethers together. It is revealing that, in discussing their EVA capabilities in [DIT-0201], Dittemore spoke of "all we had trained to do".

Dittemore's discussion of tile repair reveals a second failure of imagination. Apparently he sees no alternative to deorbiting the shuttle as planned, even if the heat shield were damaged to the point of being incapable of safe re-entry. The existence of the International Space Station, with a permanent crew, has added many possibilities to emergency operations. On 2003-02-02 a Progress vehicle launched as scheduled to resupply the ISS. This launch could have been repurposed to supply Columbia, buying the crew time in orbit while a launch of another shuttle on a rescue mission could have been arranged.

(An earlier version of this paper proposed that Columbia could have modified its orbit to dock with the ISS, and shared supplies. However, Columbia was carrying insufficient fuel to reach the ISS orbit.)

NASA has many bright people, both on the ground and in orbit, who between them could have worked out the details of these plans. Instead, NASA stuck to checklists and mission schedules that didn't cover the contingency at hand.

NASA runs its missions in a tightly controlled military fashion, which works well as long as what happens has been foreseen.
But, as any soldier will tell you, when the unexpected happens it's time to come up with a new plan. NASA has to be willing to do things that aren't in the manual, as were the NASA staff that rescued Apollo 13.

4. cumbersome nature of EVA
===========================

Secondly, NASA makes EVA a big deal. They make it difficult. Critically, there's no simple way to perform what should be a simple procedure, such as making an examination of the vehicle's exterior.

NASA's general EVA checklist [EVA-CHECK] occupies 169 pages, and (if this author is interpreting it correctly) describes a procedure taking a minimum of 120 minutes to exit the craft and 45 minutes to re-enter, plus many hours of `prebreathe'. There are hundreds of checklist items to deal with just to make a spacewalk, before considering the requirements of the specific activity at hand. This acts as a disincentive to making any EVA that is not directly critical to the mission, and makes emergency EVA impractical.

EVA has to get easier, so that the mechanics of getting in and out of the vehicle don't get in the way of the work that needs to be done outside the craft. If it were possible to suit up and exit in, for example, twenty minutes, with no specific prior preparation, and re-enter the vehicle in a similar time, then it would be feasible to make ad hoc spacewalks. This would confer benefits for all missions, not just ones where emergency EVA needs arose. This would require not only radical changes to procedures, but also changes to equipment design to make it simpler to operate, and a rethink of the selection of gas mixtures and pressures used in order to eliminate the prebreathe requirement.

5. lack of autonomy for astronauts in flight
============================================

Finally, the decision on EVA was made by people on the ground, rather than the astronauts.
NASA space missions are completely directed from the ground, by people who can't directly see what is happening in space, and who are primarily concerned with the science objectives and PR astronaut interviews. The astronauts, who are present where the action is, and who are most directly affected by the condition of their vehicle, have essentially no autonomy.

This lack of autonomy is perhaps most strikingly demonstrated by the procedure for astronauts waking: their morning alarm consists of music sent over an audio channel from the ground, and the astronauts talk back to mission control just seconds later. It is difficult to imagine anything that would foster a greater feeling of dependence on the ground staff. Astronauts are being trained and conditioned not to think for themselves, but to refer all the thinking to mission control, who lead them through everything.

This culture of helplessness needs to change. Concerning the waking procedure, it really isn't necessary for mission control to manage every minute of the astronauts' time. Astronauts should wake themselves, with their own alarm, and then be allowed to go through their personal waking procedure without greeting from mission control. Like the rest of us, they should not need to report to the boss (mission control) until they are ready to work. It seems inhumane and degrading to deny astronauts even this level of autonomy in time management.

Astronauts are intelligent people, with great problem-solving skills. Of course, they need support from the ground, where a wider range of expertise is available, but they should be encouraged to solve problems themselves where possible. This would give the astronauts valuable experience in manipulating their vehicle and other equipment, as well as building a more self-reliant attitude, both of which are essential to effective handling of emergencies.

If the EVA procedure can be simplified as described in the preceding section, astronauts could gain a lot of independence in the choice to perform EVA and in the timing of EVA. Where a job will require more than one EVA, the astronauts (who have the only firsthand experience with EVA and with the job at hand) should be the ones to determine how many spacewalks to make, and how to divide the work between spacewalks.

In fact, NASA should go further than this, and allow astronauts to make spacewalks that are not directly necessary to the mission, whenever the astronauts feel that it is useful. Astronauts should even be encouraged to make spacewalks for pleasure, which will give them valuable experience in EVA operations without the pressure of a specific task to perform. In an emergency, a crew to whom spacewalking is a familiar activity stand a much better chance of repairing their vehicle or doing whatever else is required than a crew that have only performed EVA when strictly necessary.

6. conclusion
=============

Broadly speaking, NASA's manned missions are being managed in the same manner as the early exploratory flights, when constant contact with the ground was necessary, and nothing, especially EVA, was routine. These are no longer the circumstances in which manned space travel occurs, and the culture of the exploratory missions is no longer appropriate.

In the case of STS-107, several aspects of NASA's mission management style contributed to a failure to remedy a potentially dangerous condition. Improved safety and effectiveness of future manned space missions depend critically on cultural changes being made within NASA.

7. references
=============

[DIT-0201] NASA public briefing, 2003-02-01T15:31-05:00, and .
[EVA-CHECK] NASA, "EVA Checklist: Generic, Rev. G", July 28, 2000, .
[HQ-0205] NASA, "press briefing: STS-107 Columbia accident update, NASA headquarters, February 5, 2003", February 5, 2003, .
[JSC-0205] NASA, "STS-107 Accident Response Briefing, Johnson Space Center, Houston", February 5, 2003, .
[MCC-0203] NASA, "Mission Control Center Status Report #21: STS-107 Accident Response", 2003-02-03T08:00-06:00, .
[MER] NASA, "STS-107 JSC MER Daily Reports", .