Feedback on Reviews of Koehn et al. Paper

FLAWS: The main criticisms of this paper that I was hoping you would find are a lack of an explicit statement of a hypothesis and a corresponding absence of any evaluation to support such a hypothesis. This is a classic "I built a system to do X" paper, for which the classic reaction is "So what?", i.e. "What is the scientific contribution?". This is a pity, because the actual implementation looks quite promising and potentially useful, e.g. as part of a future teaching tool.

There was some ambiguity as to whether VES has actually been built or has merely been designed and is under construction. I have the advantage of having actually seen it in operation, so I happen to know it exists, but the paper is ambiguous and lacks a clear statement. The tutoring aspects, however, mostly do *not* exist (see point 2 below). In your own work, be especially careful with the future tense: you may intend it to point forward in the paper, but it can be misinterpreted as implying the work is unfinished. This error is illustrated in the abstract.

The various design choices are not justified, e.g. the choice of the Protege approach to ontologies, the choices of the visualisation algorithms, etc. None of the various algorithms is described in sufficient detail for an expert reader to reproduce or assess them. Conference space limitations probably precluded this, which serves to emphasise the downside of the over-reliance on conference publication in our field. In sections 4.1 and 4.2 two different methods of generating the visualisation are presented, but these are neither compared nor contrasted, and it is unclear what conclusion to draw from this duplication. Nor is it clear which of these methods was incorporated into the final version of VES.

HYPOTHESIS: I thought the closest the paper came to explicitly identifying a hypothesis was the last sentence of the abstract: "In this way we show that VES serves as a case study for ontology based visualisation".
I interpret this sentence as claiming that they have developed an automatic mechanism for generating geometric visualisations of the heart from logical descriptions. To see that the visualisations are not just hard-wired, note that the meta-ontology is supposed to generate models of faulty hearts dynamically from the clinical findings and the healthy heart model; these dynamically generated faulty heart models then need to be visualised for use in the teaching tool. However, no evidence was presented that these faulty heart models were actually generated, let alone visualised. Nor was there any discussion of how the visualisation process varied according to the heart model. Most seriously, there was no evaluation of the quality of the visualisations generated, especially those of faulty hearts. We might, for instance, have expected some medical experts to be asked to assess the visualisations.

Another candidate for a hypothesis was that VES was faster than its rivals. However, no timing data was given, and the speed of the other systems was neither described nor compared.

The biggest missed opportunity, I thought, was some claim on the adaptability (cognitive science) dimension, i.e. a claim about the range of faulty heart models that could or could not be easily generated from the healthy heart model together with the clinical findings. It would be non-trivial to modularise the healthy heart model in just such a way that all and only the observed faulty heart models could easily be generated from it. This issue is not even discussed, although it is critical if the mechanism is to be used as the basis of a teaching tool on diagnosing faulty hearts, especially the rare cases discussed in the introduction that inspired the whole project.

COMMON MISTAKES: Here are some particular points that people often got wrong:

1. The authors were trying to model a natural system, namely the human heart. Not everyone noted this in the "kind of contribution" section.
Some people justified the omission on the grounds either that this was not the main motivation or that the model was not at a fine enough level. I don't think either of these justifications is sufficient reason to omit work that clearly *did* model a natural system to a non-trivial level. Note that tutoring systems often require models of the natural systems that are the subject of the tutoring.

2. The discussion of the proposed teaching aid misled some of you. This teaching aid has not yet really been built and is not, therefore, the main topic of this paper. As implied in the last paragraph of section 5, the tutoring system is intended to be built in future work. The so-called Tutoring Module is currently only an interface to the faulty heart models via the clinical findings ontology, as a careful reading of the penultimate paragraph of section 5 makes clear. In particular, the claim that VES provides a successful tutoring system is *not* a claim to be evaluated in this paper. Rather, VES provides the precursor technology for such a tutoring system.

3. Despite what some of you claimed, there are quite a lot of references to related work, e.g. [1], [2], [3] & [4]. However, the real problem with the related work discussion is that it fails to establish in what way the authors' work improves on the earlier work.

4. Some of your scores for the importance of the work were too high given your previous damning criticisms. It is not enough for work to be well motivated; it must also be good science to earn a high score. Don't be afraid to use the whole scale when assessing work. It is too easy to hedge your bets by overusing the middling grades; this is intellectual cowardice.