The authors show the NP-hardness of, and present a new heuristic for, the following variant of the Quartet Puzzling problem: derive a tree that has the largest summed-weight set of consistent embedded quartets relative to a given complete set of weighted quartets. They also evaluate the performance of this heuristic on a number of biological and non-biological datasets.

Positive aspects of this paper are the explicit proof of NP-hardness of this problem (which was claimed but not proven in previous literature) and the presentation of a simple heuristic for quartet-based phylogeny reconstruction. Given the NP-hardness of phylogeny reconstruction in general relative to most commonly used criteria, as well as the non-trivial algorithmic and run-time complexity of all previously proposed quartet-based heuristics, such a simple heuristic is potentially of great use.

The negative aspects of this paper are primarily matters of presentation. In addition to the various notational inconsistencies and inaccuracies noted below, the paper presents the heuristic as a general clustering method for data, which obscures its specific applicability to biologically motivated problems (and hence its relevance to readers of TCBB).

In summary, though the paper is not acceptable to TCBB in its current form, it should not be rejected, as there is good work here that the TCBB readership should know about. Hence, I suggest the following overall revisions:

1) Revise the title and introduction to highlight the presented algorithm as a quartet-based tree reconstruction heuristic.

2) Delete Section 2.

3) In Section 5, add explicit run-time comparisons (as well as comparisons of theoretical time complexity) to previously proposed quartet-based methods (simply claiming better performance is not sufficient).

4) Revise Section 6 to put the derivation of quartet costs more in the context of biologically motivated applications rather than general clustering.
5) Reorganize the testing results reported in Sections 7-9 to highlight biologically motivated applications (specifically, phylogeny reconstruction).

This, in conjunction with the particular revisions suggested below, should yield a paper of relevance to TCBB. As these revisions are not minor, the revised paper should be re-submitted for a second review.

SPECIFIC COMMENTS

Page 3, last sentence of paragraph 1: Relative to what criteria is optimal-tree search impossible / NP-hard? References should also be given to support this claim.

Page 7, Definition 3.1: This is not a definition, and should be part of the main text.

Page 7: A more formal definition of quartet embedding may be appropriate here.

Pages 8-11: There are two types of problems here: those that return optimal solutions and those that return boolean values, i.e., decision problems. Both types of problems relative to MQC and MQTC should be defined explicitly (using the Definition environment, or as problems in the style of Garey and Johnson (1979)) and consistently; e.g., optimal-solution problems have inputs and outputs, while decision problems have inputs and associated questions. Defining a decision problem after one has started a proof of its NP-hardness (as in Theorem 4.5) is confusing.

Page 9, paragraph 1: Should the function-application notation for C be the same as that for W defined on page 7, i.e., C(uv|wx) rather than C_{uv|wx}?

Page 10, Section 4.1, paragraph 1: The consequences of NP-hardness should perhaps be stated more carefully: such problems are not known to have polynomial-time optimal-solution algorithms, and the best known algorithms for them are typically exponential (though not always; the Turing Halting problem is NP-hard, but is not even computable).
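To make the request in the Pages 8-11 comment concrete, the paired definitions might look like the following sketch in the style of Garey and Johnson (1979). This is illustrative only: the names Q(T) and the exact form of the weight function W are my assumptions based on the problem description in this review, and the authors' own notation should of course be used.

```latex
\noindent\textbf{MQTC (optimization version)}\\
\textsc{Instance:} A set $S$ of $n$ taxa and a weight function $W$
assigning a weight to each possible quartet topology over $S$.\\
\textsc{Output:} An unrooted binary tree $T$ with leaf set $S$
maximizing $\sum_{q \in Q(T)} W(q)$, where $Q(T)$ is the set of
quartet topologies embedded in (consistent with) $T$.

\medskip
\noindent\textbf{MQTC (decision version)}\\
\textsc{Instance:} A set $S$ of $n$ taxa, a weight function $W$ as
above, and a bound $B$.\\
\textsc{Question:} Is there an unrooted binary tree $T$ with leaf
set $S$ such that $\sum_{q \in Q(T)} W(q) \ge B$?
```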
Page 13, end of Section 4.1: PTAS inapproximability of MQC implies PTAS inapproximability of MQTC only if the reduction in Theorem 4.5 is also a polynomial-time approximation-preserving reduction, e.g., an L-reduction (note also that approximability is defined relative to the optimal-solution, not decision, versions of these problems). Proving this would be a very nice addition, and I suspect it is rather easy to do; it would also be another nice justification for giving an explicit reduction such as that in Theorem 4.5.

Pages 13-15, Section 5.1: Perhaps the algorithm could be given in a table or figure and the explanatory remarks turned into main text. It would also be good to mention in the abstract and introduction that the algorithm is a randomized / Monte Carlo heuristic, as opposed to a deterministic one, e.g., local search. The use of the probability distribution on k-mutations to ensure that the algorithm does not get trapped in a local optimum is interesting. The authors already note that this is somewhat like simulated annealing (SA), with the algorithm in this paper corresponding to an SA algorithm with a single-temperature cooling schedule. Perhaps one could use a parallel invocation scheme (like that underlying the agreement termination condition described in Section 5.3) to create an SA algorithm with a truly SA-like variable cooling schedule; for example, as the degree of agreement increases, the probability of large-k mutations could decrease. A comparison of this variant with the original algorithm would be nice, and would bolster the technical content of the paper.

Page 19, paragraph 1: The specifications of r-values relative to n are interesting; where did they come from? If they were empirically derived, say so; if there is theoretical justification, it should be given.
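The variable-cooling variant suggested for Section 5.1 could be sketched as follows. This is a hypothetical illustration only, not the authors' method: the function names, the exponential form of the distribution over mutation sizes, and the mapping from run agreement to an effective temperature are all my assumptions. The intended behavior is simply that as agreement among parallel runs approaches 1, the distribution over mutation sizes k concentrates on small k.

```python
import math
import random

def mutation_size_distribution(n_sizes, agreement, alpha=2.0):
    """Probability distribution over mutation sizes k = 1..n_sizes.

    Hypothetical sketch: the effective "temperature" falls as the
    degree of agreement (a value in [0, 1]) among parallel runs
    rises, shifting probability mass toward small-k mutations and
    mimicking a simulated-annealing cooling schedule.
    """
    temperature = max(1e-9, 1.0 - agreement)
    # Boltzmann-style weights: larger mutations are exponentially
    # less likely, and more sharply so at low temperature.
    weights = [math.exp(-alpha * (k - 1) / temperature)
               for k in range(1, n_sizes + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def sample_mutation_size(n_sizes, agreement, rng=random):
    """Draw a mutation size k according to the distribution above."""
    probs = mutation_size_distribution(n_sizes, agreement)
    r = rng.random()
    cumulative = 0.0
    for k, p in enumerate(probs, start=1):
        cumulative += p
        if r <= cumulative:
            return k
    return n_sizes  # guard against floating-point round-off
```

A comparison could then run the original fixed-distribution algorithm against this variant, updating `agreement` from the parallel invocation scheme of Section 5.3 after each round.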