Feedback on Reviews of Hall et al. Paper

HYPOTHESIS: Unusually, the main hypothesis of this paper is made explicit, in section 1.3. The authors provide a reference model for the analysis of design-system-environment-operator interaction. They are careful in their claims for this reference model, saying only that (a) it will help in the identification and classification of events, not that it will yet be of practical value in real accident investigations. This hypothesis is made more concrete at the start of section 3, where the reference model is claimed to (b) describe *all undesirable states*. Note that (b) is *much* stronger than (a).

EVIDENCE: One might expect experimental evidence to be presented to support one or the other form of this hypothesis. There is an immediate problem with form (b): it is impossible to identify all possible undesirable states that have occurred or will occur. Thus it is impossible to fully support form (b). Form (a) is more amenable to experimental confirmation: we could present a wide-ranging but finite and feasibly small set of 'undesirable states' for which the reference model can be successfully used to identify and classify the events. Unfortunately, only one example is analysed in this way: the dead battery example in section 5. This is far from sufficient as evidence, even for (a). [Ironically, it is used as an example of predicting hazards rather than analysing accidents. More on this below.] Instead, section 3 is used to unpack the reference model to describe the space of analyses it is capable of. This is the wrong way round for supporting (a) or (b): they need to show that lots of examples can be successfully analysed, not to show the kinds of analysis that might be found.

The unpacking in section 3 *would*, however, provide some partial support for an alternative hypothesis hinted at in the 3rd paragraph of section 5, namely that this reference model provides a good basis for hazard prediction, which is what they then proceed to illustrate in the dead battery example. (Although you'd still want to see more discussion of actual accidents.) Here's an excellent example, then, of the kind of mess people get into when they don't properly marry up their stated hypotheses and evaluation: they carefully state a hypothesis they don't defend and then accidentally provide evidence for a hypothesis they barely hint at.

FLAWS:

* Note an implicit limitation of the reference model. In section 2.1 they describe a (STRIPS-like) cause/effect model consisting of chains of discrete single actions. In the real world, however, interacting actions occur in parallel and some are continuous changes over a time interval. It's not clear they could deal with such complexities, despite claiming that this simplification is without loss of generality.
* The discussion of formal proofs of the form W,S |- R is rather embarrassing and would have been better omitted. Their formalisation of aquaplaning, for instance, is clearly inadequate. If R is provable from W' and S in a monotonic logic like predicate calculus, then R will also be provable from W' ∪ {aquaplaning} and S, so adding aquaplaning to the assumptions cannot make the proof of R fail.
* The frequent references to aquaplaning are a bit confusing to the novice reader. There *is* a famous aeroplane accident that involved aquaplaning, but this is unfortunately *not* one of the many accidents listed in the introduction.
* In the conclusion, the authors restate their hypothesis as "We hypothesise that our reference model brings with it formal, step-by-step, structure to discover, understand and describe human–computer–environment hazards". This seems to confuse the hypotheses (a) and (b) above with the implicit one on predicting hazards in section 5.
They surely meant "accidents" rather than "hazards", which are only potential accidents. This looks to me like further evidence of their methodological confusion.
* The diagrammatic notations from figure 7 onwards look seriously flawed. Why are both W and S represented twice? Why does the ws arrow connect only one pair and the sw arrow connect the other pair? Usually, this kind of 'commuting diagram' notation is used to make the point that one can take alternative routes with the same result, e.g. going from W to W via w is equivalent to going from W to W via ws, s and then sw. Clearly, this is nonsense for this diagram, which is confusing if you were expecting the standard interpretation.

COMMON MISTAKES: Here are some particular points that people frequently got wrong:

1. Several of you stated the hypothesis as being that "all undesirable states are describable as discrepancies between the views of the operator and the designer". I admit the wording of the first paragraph of section 3, where you presumably got this from, is confusing, but they surely could not have intended this reading. For instance, they also discuss undesirable states arising from system implementation errors, which are due to neither designer nor operator. You were, however, right to highlight the designer and operator viewpoints as being the novel feature of this model.

2. You were very critical of the discussion of related work. However, section 6 discusses a lot of rival systems and points out how they differ from the one proposed here and (implicitly) in what ways they are worse. The discussion *does* assume some existing familiarity with these rivals, but this is fair enough in a specialist journal called Safety Science, where the average reader can be assumed to have some background knowledge. The compare-and-contrast could also have been more explicit and based on test cases. Sadly, I've seen much worse than this, and I thought your criticism was a bit OTT.

3. Several of you classified the contribution as including "an exploratory investigation to suggest a hypothesis". Given that there was no "exploratory investigation" and that the hypothesis was first stated in section 1.3, I don't think this can be correct.

4. Several of you classified the contribution as including "establishes a property of a technique or a relationship to other techniques", but provided as evidence the fact that it built on previous similar techniques. The justification of this contribution should rather be some *contrast* on some dimension compared to rival techniques (or some absolute property on such a dimension).

5. Several of you described the work as "combining two or more techniques". There only seemed to be one technique to me: the proposed reference model, which extended earlier ones. The other stuff you listed only seemed to me to be part of this model.

6. Several of you classified the contribution as including "experimental or theoretical evidence supporting a hypothesis". You then went on to point out the absence of such evidence. You were right that there should have been such evidence, but if it is absent it cannot be listed as a contribution.