From: AI 2015
Date: Sat, Feb 28, 2015 at 9:11 PM
Subject: AI 2015 notification for paper 58
To: Scott Watson

Dear Scott,

Following up on a previous email, we are sending you the reviewers' scores to supplement the narrative reviews you already received. We hope you will find them useful.

Please note that there was discussion among the PC before the decision was reached. It was more than a pure numerical exercise, as it depended on the confidence of the reviewers and on the completeness and degree of elaboration and justification of their review scores. Strong positive or negative scores without adequate justification were discounted during the deliberations.

Regards,
Evangelos & Denilson

----------------------- REVIEW 1 ---------------------
PAPER: 58
TITLE: Exploring Options for Efficiently Evaluating the Playability of Computer Game Agents
AUTHORS: Todd Wareham and Scott Watson

OVERALL EVALUATION: -2 (reject)
REVIEWER'S CONFIDENCE: 4 (high)
Is the research objective clearly stated?: 3 (fair)
Is the description of the methodology clear?: 2 (poor)
How is the readability?: 2 (poor)
How is the organization?: 3 (fair)
Is related work adequately referenced?: 3 (fair)
Are the results / claims compelling and well supported by experimental evaluation or proofs?: 2 (poor)
How is the originality / novelty?: 2 (poor)
Is the paper technically sound?: 2 (poor)

----------- REVIEW -----------
This paper studies game playability evaluation. It is intended to analyze when such evaluation is and is not tractable. However, no rigorous analysis is presented, as I elaborate below.

First, the types of games considered are not clearly presented. Section 2 attempts to define games in which agents can only exchange items and facts, but which practical games can be represented in this way is not clearly articulated. The example given on page 4 does not illustrate the issue clearly. What is the objective of the game? What are the consequences of the actions?
As presented, the interaction sequences in Fig. 2 make no sense to the reader.

In Sections 3, 4 and 5, a total of 7 complexity results on playability evaluation are presented. None of them has a proof or a sufficiently convincing justification; the reader is expected to simply accept them as correct. As a result, the paper makes no convincing contribution at all to the issue of playability evaluation.

----------------------- REVIEW 2 ---------------------
PAPER: 58
TITLE: Exploring Options for Efficiently Evaluating the Playability of Computer Game Agents
AUTHORS: Todd Wareham and Scott Watson

OVERALL EVALUATION: -1 (weak reject)
REVIEWER'S CONFIDENCE: 4 (high)
Is the research objective clearly stated?: 3 (fair)
Is the description of the methodology clear?: 2 (poor)
How is the readability?: 3 (fair)
How is the organization?: 3 (fair)
Is related work adequately referenced?: 4 (good)
Are the results / claims compelling and well supported by experimental evaluation or proofs?: 1 (very poor)
How is the originality / novelty?: 3 (fair)
Is the paper technically sound?: 2 (poor)

----------- REVIEW -----------
In this paper, Wareham and Watson present some theoretical results on the difficulty of, essentially, testing whether interactions between an NPC and a player can achieve given goals. As expected, without quite substantial restrictions, this problem is intractable in the usual theoretical sense, i.e., not solvable in polynomial time. Since these results are rather expected, many people might find this research uninteresting, especially given that from a practical point of view (i.e., that of game developers) general intractability has little impact, as long as the particular algorithm used to create NPCs comes with guarantees that the interactions a player needs to achieve the goals are possible (for example, by requiring that limited searches suffice to find the right interactions).
Despite that, I would lean toward accepting the paper if the theory part were stronger; in other words, if the proofs, or at least proof sketches, were provided. To me, these proof sketches would be the contribution of the paper, and naturally the contribution should be in the paper itself, not require looking them up on the Internet. I am aware that 12 pages are a big challenge for this, but if the authors need more space, they should consider a conference allowing more pages (or a journal). In its current form, the paper does not provide any interesting contribution to me, although there is clearly potential. Hence my weak reject.

----------------------- REVIEW 3 ---------------------
PAPER: 58
TITLE: Exploring Options for Efficiently Evaluating the Playability of Computer Game Agents
AUTHORS: Todd Wareham and Scott Watson

OVERALL EVALUATION: 0 (borderline paper)
REVIEWER'S CONFIDENCE: 3 (medium)
Is the research objective clearly stated?: 4 (good)
Is the description of the methodology clear?: 3 (fair)
How is the readability?: 3 (fair)
How is the organization?: 3 (fair)
Is related work adequately referenced?: 3 (fair)
Are the results / claims compelling and well supported by experimental evaluation or proofs?: 2 (poor)
How is the originality / novelty?: 2 (poor)
Is the paper technically sound?: 3 (fair)

----------- REVIEW -----------
The paper presents proofs that in most cases, evaluating agent playability is NP-hard and not fixed-parameter tractable. The proofs appear to be correct, but there are some problems with the paper that need to be addressed:

- The results appear to be quite straightforward, even though the setting used is not as general as other intractability results from multi-agent systems. The proof techniques used are standard. Therefore, it is not clear what the novelty of this approach is.

- The results show that only unrealistic cases are tractable.
The paper discusses this issue and suggests that the results provide ways of breaking a game into solvable pieces, but the argument presented is still quite weak. In practice, using simulations as well as human testers is a methodology that has been used quite successfully and will likely continue to be used. This is akin to the fact that, in theory, proving that a program is correct or conforms to a specification is very hard, yet in practice people use test cases and establish correctness anyway (albeit not in a formal manner). Hence, these results are likely to have very little practical impact. The significance of the results from a practical point of view needs to be discussed further.

- While the paper is generally well written, it is troublesome that all the proofs are in the appendix. It would be better if the example agents were significantly shortened and replaced with an overview of the main proof steps, since the proofs appear to be the main contribution.