Research Report 01-0024 Artificial Intelligence Programs The University of Georgia Athens, Georgia 30602 Available by ftp from aisun1.ai.uga.edu (128.192.12.9) Series editor: Michael Covington mcovingt@aisun1.ai.uga.edu 0 1 From English to Prolog via Discourse Representation Theory ACMC Research Report 01-0024 Michael A. Covington Donald Nute Nora Schmitz David Goodman Advanced Computational Methods Center University of Georgia Athens, Georgia 30602 April 1988 1. Introduction/Abstract This is a preliminary report on a set of techniques for translating the discourse representation structures (DRSes) of Kamp (1981) into semantically equivalent clauses in a slightly extended form of Prolog. Together with discourse representation theory (DRT) itself, these techniques yield a system for translating English into Prolog. A working prototype has been built using Quintus Prolog on a VAX workstation.1 2. Methodology Both computational linguistics and natural language semantics have reached the stage where precise solutions to small problems are of greater value than vague proposals for solutions to large problems. This paper is offered in that spirit. Discourse representation theory (DRT) covers only a subset of the semantic phenomena of English, not including, for example, plurals or definite noun phrases. Prolog, even with our extensions, is less powerful than classical logic, which in turn is less powerful than natural language. So any system that translates English to Prolog via DRT can accept no more than a subset of English. Nonetheless, this subset is large enough to display clearly the solutions to a variety of problems of semantic representation -- solutions that will carry over into a more comprehensive theory when one is developed. The following are among the design goals of our system: 1 This work was supported by National Science Foundation Grant Number IST-85-02477. Opinions and conclusions expressed here are solely those of the authors. 2 (1) To preserve the main advantage of Prolog -- its ability to perform inference more rapidly than a full resolution theorem prover by constraining its search space. This contrasts with the approaches taken by Guenthner (1986), Kolb (1985, 1987), and a group at Imperial College, London (Gabbay, personal communication), who built inference engines for discourse representation structures. (2) To preserve the main advantage of DRT -- the ability to represent a text as a single unit by focusing on its semantic and logical structure rather than on individual sentences or propositions, and hence incorporating the textual context that is necessary for interpreting each subsequent sentence. (3) To represent knowledge and its semantic structure in a form as close as possible to ordinary Prolog clauses. We extend Prolog by adding conditional queries (the ability to ask whether an "if-then" relation holds), explicit negation (rather than negation as failure), and a table of identity that enables non-unifiable terms to be treated as equivalent for some purposes. 3. Discourse representation theory 3.1. Overview of DRT Discourse representation theory (DRT), introduced by Kamp (1981), is a more satisfactory representation of natural language semantics than earlier representations such as classical logic or Montague grammar. The key properties of DRT include the following: (1) It is not sentence-based. DRT constructs representations of discourses, not sentences. These representations are called discourse representation structures (DRSes). Rather than building a DRS for each sentence, we build a single DRS to which each sentence contributes some material, using the previously present material as context. (2) It is not tied closely to the syntax of English nor to a particular theory of syntax. Any syntactic analysis suitable for determining meaning can be used in an implementation of DRT. (3) Structural restrictions on the accessibility of anaphoric antecedents are predicted correctly. (4) A theory of truth-conditions is built in. Truth is defined in terms of embedding the DRS in a model. Thus DRT preserves the advantages of predicate logic as a representation language while bringing the formalization closer to the structure of natural language. 3.2. Discourse representation structures 3 A discourse representation structure is an ordered pair where U is a universe of discourse, i.e., a set of discourse entities, and Con is a set of conditions, i.e., predicates or equations that these entities must fulfill. For example, the sentence A farmer owns a donkey. is represented by the DRS U={X1,X2} Con={farmer(X1),donkey(X2),owns(X1,X2)} or in more usual notation: Ŀ X1 X2 Ĵ farmer(X1) donkey(X2) owns(X1,X2) The truth of a DRS is defined by trying to embed it in a model. A model comprises a domain D, i.e., a set of entities that can be discussed, and an interpretation function I that maps every n-place predicate of the language onto a set of n-tuples of elements of D, and maps every logical constant of the language onto an individual in D. Intuitively, D is the set of things you can talk about (including imaginary as well as real entities); I(farmer) is the set of elements of D that are farmers; and I(owns) is the set of pairs of elements of D such that the first one owns the second one. To embed a DRS into a model, assign each discourse variable (X1 and X2 above) to an element of D. The DRS is true in the model if and only if it can be embedded in such a way that all of the conditions are satisfied. Thus, the DRS above is true in a particular model if, within that model, it is possible to assign X1 to a farmer and X2 to a donkey which is owned by that farmer. 3.3. Equations Our implementation treats is as a predicate that requires two discourse variables to refer to the same individual. For example, the sentence Pedro is a farmer. is represented as: Ŀ X1 X2 4 Ĵ named(X1,'Pedro') farmer(X2) X1 = X2 This is a temporary measure that will be replaced by a fully developed table of identity (section 6.1 below). 3.4. Conditionals ("if-thens")2 An "if-then" relationship between two propositions is expressed by a special type of DRS condition. Thus A farmer owns a donkey. If it is hungry he feeds it. is represented as: Ŀ X1 X2 Ĵ farmer(X1) donkey(X2) Ŀ Ŀ Ĵ Ĵ hungry(X2) ==> feeds(X1,X2) Crucially, one of the conditions of this DRS consists of two more DRSes joined by '==>'. In general, the condition DRS1 ==> DRS2 is satisfied (in a particular model) if and only if every embedding that satisfies DRS1 also satisfies, or can be extended to satisfy, DRS2. By saying 'extended' we leave open the possibility that DRS2 may contain new variables of its own, which will need to be assigned to individuals in the universe of discourse. This extends the embedding of DRS1 and of the surrounding larger DRS, which did not assign these variables. 2 We will call these "if-thens" rather than "conditionals" because the term "conditional" is too easily confused with "condition," and because some DRT "if-then" structures correspond to sentences such as All men are mortal, which are called universals rather than conditionals in ordinary philosophical discourse. 5 Note that the new variables of DRS1, if any, have implicit universal quantifiers, and the new variables of DRS2 have implicit existential quantifiers. To satisfy the whole condition, we must find some embedding of DRS2 to go with every embedding of DRS1. A universal sentence such as Every farmer owns a donkey. is actually a kind of if-then; it means "If X is a farmer, then X owns a donkey" and is represented as: Ŀ Ĵ Ŀ Ŀ X1 X2 Ĵ ==> Ĵ farmer(X1) donkey(X2) owns(X1,X2) That is: "Every embedding that assigns X1 to a farmer can be extended to form an embedding that assigns X2 to a donkey owned by X1." Or in ordinary language, "For every farmer X1, there is some donkey X2 owned by X1." Because the quantifiers are implicit and there are no sentence boundaries, some familiar problems of quantifier scope no longer arise. Consider for example the discourse A farmer owns a donkey. He feeds it. In classical logic the first sentence would be (Some X)(Some Y) farmer(X) & donkey(Y) & owns(X,Y). The second sentence must somehow have access to the same donkey and the same farmer, but if it is treated as a separate proposition, its variables cannot be under the scope of the same quantifiers. Any translation into classical logic will therefore need a special, explicit rule for combining two propositions into one. In DRT, on the other hand, this combining is implicit and automatic; sentences keep being added to the same DRS until there is a reason not to do so. 3.5. Anaphora and accessibility A DRS can use its own discourse variables and those of DRSes which are superordinate to it. Whenever a DRS contains another DRS, the outer DRS is superordinate to the inner one. Further, the left side of an if-then is superordinate to the right side, and superordinateness is transitive. 6 When one DRS is superordinate to another, we will say that the second DRS is subordinate to the first. This restriction makes correct predictions about the accessibility of antecedents to anaphors (Kamp 1981). 3.6. Negation DRT represents negated assertions as DRSes within DRSes, preceded by the negation operator (here written NEG). Thus Pedro owns a donkey. He does not beat it. is represented by: Ŀ X1 X2 Ĵ named(X1,'Pedro') donkey(X2) owns(X1,X2) Ŀ NEG:Ĵ beats(X1,X2) A condition of the form NEG DRSx is satisfied if and only if there is no extension of the current embedding that makes DRSx true. Thus the above DRS is true in a model if there is an assignment of X1 and X2 such that X1 is Pedro and X2 is a donkey owned by Pedro, and this assignment cannot be extended to satisfy the condition that X1 beats X2. We note in passing a logical problem: what if Pedro owns two donkeys? DRT allows us to choose the "wrong" donkey -- the one he does not beat -- and use it to satisfy the above DRS even though it is not clear that this move would be faithful to the meaning of the original English-language discourse. This is related to the problem that present versions of DRT do not address plurals at all. 3.7. Questions Kamp's DRT provides no way to distinguish statements from questions. We treat yes-no questions as DRSes within DRS-conditions, marked with the operator QUERY. Conditions of this type can appear only in the topmost DRS, not in subordinate DRSes. The most common use of questions in a natural language understanding system is to request information from the computer. Accordingly, we posit 7 that a question does not contribute to the knowledge base, but rather directs the hearer to test the truth of the included DRS and report the result. Thus the discourse: Maria is a woman. Is she happy? is represented as Ŀ X1 Ĵ named(X1,'Maria') woman(X1) Ŀ QUERY:Ĵ happy(X1) A more general representation of questions would also be able to specify variables whose values should be reported (e.g., "Who owns a donkey?"). 4. DRS construction The DRS-building algorithm used in this project is that described by Covington and Schmitz (1988), a unification-based grammar modeled closely on that of Johnson and Klein (1986). A DRS is represented by a Prolog term drs(U,Con) where U is a list of discourse variables and Con is a list of structures containing these variables. In Prolog, the fastest way to build a list is to add material at the beginning. As a result, the elements of U and Con appear in the opposite of the order in which they occur in the discourse; this obviously has no effect on their truth conditions. Further, Con contains additional predicates specifying the gender of every discourse variable, to make it possible to find the antecedents of anaphoric pronouns. In the Covington and Schmitz implementation, discourse variables were represented by unique integers, but here they are unique Prolog variables. This makes it possible to perform skolemization (described below) by simply instantiating heretofore uninstantiated Prolog variables. For example, A man owns a donkey is represented as drs([X2,X1],[owns(X1,X2), gender(X2,n), donkey(X2), gender(X1,m), man(X1)]. 8 Proper names are represented by the predicate named, as in named(X3,'Pedro') thus leaving open the possibility of two or more people with the same name, or two or more names for the same person. Special DRS-conditions of the following forms represent, respectively, negated assertions, questions, and if-thens: neg(drs(...,...)) query(drs(...,...)) ifthen(drs(...,...)) These are elements of Con just like ordinary conditions. The implementation has the ability to resolve anaphors, and for this purpose, it uses the gender predicate to specify the gender of each discourse variable. The antecedent of he, she, or it is the most recently mentioned discourse variable of the appropriate gender which is found in the current DRS or a DRS superordinate to it. The discourse variable for the anaphor and the discourse variable for the antecedent are unified by the anaphora resolver so they are for all practical purposes identical. 5. Translation into Prolog 5.1. Discarding irrelevant information The first step in translating a DRS into a set of Prolog clauses and/or queries is to "clean up" the output of the DRS-builder by discarding irrelevant information. This involves discarding all the DRS-conditions that describe gender. Further, the verb is introduces conditions that say that two discourse variables must be the same individual. For example, the sentence Pedro is a farmer. is represented (ignoring gender information) as: Ŀ X1 X2 Ĵ named(X1,'Pedro') farmer(X2) X1 = X2 9 In the present implementation, the clean-up routine simply unifies X1 with X2, giving: Ŀ X1 Ĵ named(X1,'Pedro') farmer(X1) We will see in Section 6.1 that in some cases, two equated discourse variables may not be unifiable after skolemization. A more adequate implementation would leave the equation in the DRS and later mark the equated discourse variables as equivalent in the table of identity. 5.2. Simple questions The easiest sentences to translate into Prolog are simple questions such as Does Pedro own a donkey? which are, in effect, instructions to test the truth of a DRS in the model defined by the current knowledge base. The queried DRS is: Ŀ X1 X2 Ĵ named(X1,'Pedro') donkey(X2) owns(X1,X2) We want to know whether there is an assignment of values to X1 and X2 such that named(X1,'Pedro'), donkey(X2), and owns(X1,X2) will all be satisfied. Ex hypothesi, the interpretations of named, donkey, and owns -- that is, the sets of argument tuples that satisfy them -- are given by the definitions of these predicates in Prolog. Thus our task is exactly equivalent to solving the Prolog query: ?- named(X1,'Pedro'), donkey(X2), owns(X1,X2). Variables in Prolog queries have implicit existential quantifiers; so do free variables in DRSes whose truth is being tested. For this type of sentence, then, the semantics of Prolog and of DRT match closely. 5.3. Simple assertions 10 Assertions -- statements of fact -- are slightly more complicated to handle. Clearly, the DRS for A farmer owns a donkey. -- namely Ŀ X1 X2 Ĵ farmer(X1) donkey(X2) owns(X1,X2) should be translated by adding something to the knowledge base. But what should we add? We can't just leave the variables free, generating the clauses farmer(X1). donkey(X2). owns(X1,X2). because free variables in Prolog facts and rules have implicit universal quantifiers. In satisfying a Prolog query, free variables match anything, so these clauses would mean "Anything is a farmer; anything is a donkey; anything owns anything." In fact, because like-named variables in different clauses are distinct, we haven't even succeeded in saying that the donkey is the same as the thing that is owned. Rather, we must provide "dummy names" for existentially quantified entities -- one-element lists with distinct integers inside. We will call the hypothetical farmer [1] and call the hypothetical donkey [2], and assert the facts: farmer([1]). donkey([2]). owns([1],[2]). Now if we "Is there a farmer that owns a donkey?" -- i.e., the query ?- farmer(X), donkey(Y), owns(X,Y). we get the answer that there is, and that the farmer is known as [1] and the donkey is known as [2]. This is a special case of skolemization, to be dealt with below. In a later implementation, we will use a table of identity to ensure that distinct names can be recognized as referring to the same individual when necessary. 11 Even proper names are not treated as logical constants. Thus, Pedro owns a donkey goes into Prolog as: named([1],'Pedro'). owns([1],[2]). donkey([2]). There would be little advantage in using proper names as logical constants because no proper names are available for most individuals. Further, by using the predicate named we allow for ambiguous names and for individuals with more than one name. 5.4. Simple if-thens In the simplest case, an if-then DRS condition is equivalent to a Prolog rule. Consider the DRS condition: Every old donkey is gray. Ŀ Ŀ X1 Ĵ Ĵ donkey(X1) ==> gray(X1) old(X1) This condition is true if and only if every embedding that satisfies its antecedent -- that is, every embedding that assigns X1 to an old donkey - - also assigns X1 to a gray donkey. If this if-then condition is known to be true, then, for any X, we can infer that X is gray if we can prove that X is old and is a donkey. This is exactly equivalent to the Prolog rule: gray(X) :- donkey(X), old(X). Here again Prolog semantics exactly matches DRT. 5.5. Distributing consequents A minor syntactic problem arises if the consequent contains more than one predicate, as in the following: Every donkey is furry and warm-blooded. Ŀ Ŀ X1 Ĵ Ĵ donkey(X1) ==> furry(X1) warm-blooded(X1) 12 A Prolog rule cannot have two predicates in its consequent; rules of the form furry(X1), warm-blooded(X1) :- donkey(X1). are not permitted. We deal with this by defining, for purposes of internal representation, an operator ::- which is like the usual 'if' (:- ) except that: (1) Both the antecedent and the consequent can be compound goals; (2) Queries can be headed by ::-, i.e., one can ask whether an if- then relation holds. Rules headed by ::- are never asserted directly into the knowledge base. Instead, when an if-then is to be asserted, the consequent is broken up and a series of ordinary Prolog rules is generated. A rule of the form a, b, c ::- d, e, f. is added to the knowledge base as the three rules: a :- d, e, f. b :- d, e, f. c :- d, e, f. This is known as distributing consequents. For very complex consequents, a more compact representation could be obtained by creating a new symbol x (taking as arguments are all the arguments of d, e, and f, if any) and asserting, instead, the four rules: x :- d, e, f. a :- x. b :- x. c :- x. This was not done in the current implementation. 5.6. Skolemization If the consequent of an if-then introduces new variables, these variables have implicit existential quantifiers, which Prolog cannot represent. Thus Every farmer owns a donkey. Ŀ Ŀ X1 X2 Ĵ Ĵ 13 farmer(X1) ==> donkey(X2) owns(X1,X2) is true if and only if, for every embedding that assigns X1 to a farmer, there is some embedding that assigns X2 to a donkey owned by X1. In Prolog (extended with ::-), if we were to say donkey(X2), owns(X1,X2) ::- farmer(X1). we would be saying that every farmer owns every donkey. If we gave the donkey a dummy name, say [92], we would not be much better off, because donkey([92]), owns(X1,[92]) ::- farmer(X1). would mean that every farmer owns the same donkey (this particular one identified as [92]). What we want to say is that every farmer owns a different donkey. This is achieved by giving the donkey a dummy name that contains X1 so that, in effect, the name depends on the assignment of X1:3 donkey([92|X1]), owns(X1,[92,X1]) ::- farmer(X1). Then if we assign X1 to an individual named [83] who is a farmer, we get a donkey named [92,83]; if we assign X1 to another farmer named [17], we get a donkey named [92,17]; and so on. The table of identity described below will enable us to equate [92,17] with a donkey already known under another dummy name if necessary. This is a form of skolemization, the method of eliminating existential quantifiers proposed by Skolem (1928). Skolemization replaces every existentially quantified variable with a function whose value is an individual that satisfies the formula. For instance, in classical logic, (All X) (Some Y) g(X,Y) can be replaced by (All X) g(X,f(X)) where f is a function which, given a value of X, yields a value of Y that would satisfy the original formula. The existential quantifier is nothing more than a claim that such a function exists. Its arguments are all the universally quantified variables in the scope of whose quantifier the existentially quantified variable occurs. Thus if every farmer owns a donkey, there is (or can be) a different donkey for each farmer, and the 3 Recall that | divides a list into head and tail in Prolog, so that [a|[b,c,d]] is equivalent to [a,b,c,d]. 14 Skolem function for the existentially quantified donkey must take as an argument the universally quantified farmer. In large formulae, there are often some universally quantified variables that can be shown to be irrelevant to a particular predicate; these need not be arguments of the Skolem function, though it does no harm to use them. Andrews (1986:123-127) compares various methods of skolemization and shows how to eliminate unneeded arguments. To skolemize an if-then structure in a DRS, we replace all the variables in the consequent with Skolem functions of the variables in the antecedent. A Prolog term such as [92|X1] has a value that is a function of the value of X1; this makes it an appropriate way to encode a Skolem function. Thus, for example, the DRS-condition Ŀ Ŀ X1 X2 Ĵ Ĵ farmer(X1) ==> donkey(X2) owns(X1,X2) is transformed into: Ŀ Ŀ X1 [92,X1] Ĵ Ĵ farmer(X1) ==> donkey([92,X1]) owns(X1,[92,X1]) which goes into extended Prolog as donkey([92,X1]), owns(X1,[92,X1]) ::- farmer(X1) and is asserted as the two Prolog rules: donkey([92,X1]) :- farmer(X1). owns(X1,[92,X1]) :- farmer(X1). The number 92 here is of course arbitrary, produced by a routine that returns a heretofore unused integer every time it is called. 5.7. Queried if-thens "Is every old donkey gray?" may mean either of two things. It may mean, "If you are told that something is an old donkey, can you deduce that it is gray?" Or it may mean "Are all the old donkeys that you know of gray?" We will call these the deductive and inductive approaches to querying an if-then. 15 The deductive approach is easily implemented using a technique suggested by Gabbay and Reyle (1984): invent a hypothetical individual, temporarily assert that it is an old donkey, and test whether it can be deduced to be gray. We put this into our implementation by defining ::- as a Prolog predicate so that queries of the form A ::- B can be answered: (A ::- B) :- skolemize(B,[]), asserta(B), test(A,Result), retract(B), Result = yes. That is, to test whether A ::- B holds, first replace all the variables in B with dummy names, then temporarily add B to the knowledge base,4 then try to deduce A. Here test(A,Result) is a metalogical predicate that instantiates Result to yes or no depending on whether A succeeds; test itself succeeds in either case. Note that this works only as long as A ::- B does not rely on negation as failure. If we were trying to test the validity of gray(X) ::- donkey(X) and the knowledge base contained only the rule gray(X) :- donkey(X), not young(X). we would get wrong results. We would assert something like donkey([27]) and then query gray([27]). The hypothetical donkey would be taken to be "not young" simply because we did not assert that it is young, and gray([27]) would therefore be derivable, leading to the mistaken conclusion that gray(X) is derivable from donkey(X) in all cases. We avoid this problem by using an explicit (positive) way of representing negation (Section 6.2). The inductive approach says that A ::- B is true if (a) there are no cases in the knowledge base that satisfy B without satisfying A, and (b) there are enough cases that satisfy B to warrant making an inductive generalization. That is, all donkeys are gray if the knowledge base contains a sufficient number of donkeys and none of them fail to be gray. The value of this "sufficient number" is open to question; in a human thinker it depends on, among other things, the size of the overall sample, the rarity of the phenomenon being described, the expected regularity, and the thinker's training in statistics. Here we will take 4 A practical implementation would allow B to be a compound goal and would replace asserta and retract with procedures that assert and retract all of the conjuncts of which B is composed. 16 it to be 1. Thus we add another rule to the definition of the predicate ::- as follows: (A ::- B) :- not (B, not A), /* No counterexamples */ B. /* One positive example */ This provides all that is needed to handle queries that ask whether an if-then is true. Note that queried if-thens are not skolemized. In the context of a query, free Prolog variables are taken to be existentially quantified, exactly as the DRT semantics requires. 5.8. Nested if-thens 5.8.1. Nested antecedents: Prolog subgoals Because if-thens can be queried on the Prolog level, they can appear as subgoals in a Prolog rule. This provides a way to handle an if-then within the antecedent of an if-then. Consider for example the DRS: If every farmer owns a donkey, a man is happy. Ŀ Ĵ Ŀ Ŀ Ŀ X1 X2 X3 Ĵ==> Ĵ ==> Ĵ farmer(X1) donkey(X2) man(X3) owns(X1,X2) happy(X3) ٳ Because the antecedent of an if-then is like a query, the inner if-then will not be skolemized.5 X3 is skolemized by a Skolem function with no arguments; its arguments should be the variables of the antecedent as a whole, and in this example there are none. So the resulting Prolog clauses are: man([57]) :- ( (donkey(X2), owns(X1,X2)) ::- farmer(X1) ). happy([57]) :- ( (donkey(X2), owns(X1,X2)) ::- farmer(X1) ). That is: [57] is a man and is happy if it is the case that for every farmer X1, there is a donkey X2 that is owned by X1. 5.8.2. Nested consequents: Exportation 5 The present implementation erroneously skolemizes the antecedents of all if-then structures, even in queried contexts. 17 Nesting in the consequent of an if-then is more complicated. Consider the sentence: If Pedro is brave then every woman admires him. The corresponding DRS is: Ŀ Ŀ X1 Ĵ Ĵ ==> Ŀ Ŀ named(X1,'Pedro') X2 brave(X1) Ĵ ==> Ĵ woman(X2) admires(X2,X1) Naively, this should go into Prolog as something like ( admires(X2,X1) ::- woman(X2) ) :- named(X1,'Pedro'), brave(X1). But this will not do. The above DRS can be used to infer that a particular woman -- Maria, for instance -- admires Pedro. The Prolog rule cannot; at best it would match a query asking whether the if-then relation admires(X2,X1) ::- woman(X2) holds. In cases like this we employ the procedure of exportation, familiar from classical logic, which enables us to transform if P then (if Q then R) into if P and Q then R. Actual DRSes are not as neat as the formulae of classical logic, and practical questions arise, illustrated by the following example: 18 Every farmer owns a donkey, and if it's hungry he feeds it. Ŀ Ŀ X1 X2 Ĵ ==> Ĵ farmer(X1) donkey(X2) owns(X1,X2) Ŀ Ŀ Ĵ Ĵ hungry(X2) ==> feeds(X1,X2) Like all if-thens, this is not a self-standing DRS, but rather a condition in a larger DRS. Exportation should break it into two conditions: Ŀ Ŀ X1 X2 Ĵ ==> Ĵ farmer(X1) donkey(X2) owns(X1,X2) Ŀ Ŀ X1 X2 Ĵ ==> Ĵ farmer(X1) feeds(X1,X2) donkey(X2) owns(X1,X2) hungry(X2) The first of these comprises the material in the original if-then that was not affected by exportation. The second is the result of exporting the inner if-then. Its consequent is the consequent of the inner if-then. Its antecedent is a combination of the antecedent of the outer if-then, the antecedent of the inner if-then, and, crucially, the consequent of the outer if-then, except for the part actually being exported. If donkey(X2) were not in the antecedent of the second structure, nothing would say what kind of animal the farmer was feeding. This procedure, if followed ruthlessly, leads to excessively complex formulas. Suppose the outer if-then has two if-thens in its consequent, and we are exporting one of them. The antecedent of the result should contain, inter alia, all the other conditions from the consequent of the original outer if-then. And one of these is itself an if-then. This is not forbidden -- after all, nesting of if-thens in the antecedent is permitted. But the two if-thens in the original consequent cannot use each other's variables, since neither is superordinate to the other. So 19 it seems unlikely, if not impossible, for one of them to play an essential role in identifying the entities used by the other. The actual implementation takes a more modest approach. The if-then structure created by exportation has in its antecedent all the variables and conditions of the antecedents of the inner and outer if-thens, and all the variables, but only some of the conditions, of the consequent of the outer if-then. Specifically, it has the conditions which (a) are simple, not involving nesting of if-thens, and (b) occurred prior to the inner if-then in the discourse. This seems to be adequate for natural langauge in actual use. 6. Remaining issues 6.1. The table of identity Individuals introduced into the discourse under different names or Skolem functions may later be discovered to be identical. For example: Thales observes Hesperus. Aristotle observes Phosphorus. Hesperus is (the same as) Phosphorus. The DRS for the first two sentences is: Ŀ X1 X2 X3 X4 Ĵ named(X1,'Thales') named(X2,'Hesperus') observes(X1,X2) named(X3,'Aristotle') named(X4,'Phosphorus') observes(X3,X4) Assuming incremental processing, each DRS condition will be skolemized and entered into the knowledge base as soon as its sentence is processed, if not sooner. Thus, at the end of the second sentence, we have: named([1],'Thales'). named([2],'Hesperus'). observes([1],[2]). named([3],'Aristotle'). named([4],'Phosphorus'). observes([3],[4]). But now we discover from the third sentence that Hesperus and Phosphorus are different names for the same entity. We would like to note this fact in such a way that we can later ask, 20 Does Aristotle observe Hesperus? ?- named(X,'Aristotle'), named(Y,'Hesperus'), observes(X,Y). and get "yes" as a reply. But Hesperus and Phosphorus have been skolemized as non-unifiable entities, [3] and [4], so we can't just unify them. We have two options. One is to go back through the knowledge base and change all occurrences of [4] to [3], or vice versa. The other, which we actually adopt, is to maintain a table of identity so that non-unifiable terms can be recognized as equivalent. We will then formulate the query as: ?- named(X,'Aristotle'), identical(X,X1), named(Y,'Hesperus'), identical(Y,Y1), observes(X1,Y1). Each argument -- X and Y -- is now passed through an extra level of indirectness. On the first attempt, the identical predicate simply unifies its two arguments, instantiating X1 to X and Y1 to Y, so that the query works exactly like our earlier proposal. But if this fails, it looks at the table of identity and tries to instantiate X1 and X to two different terms that refer to the same object, and likewise with Y1 and Y. Thus, if [3] and [4] have been entered in the table, they will be treated as equivalent. This query looks verbose, but in fact it is generated by a simple transformation of the original query: just insert calls to identical for any arguments that are used in more than one subgoal. An alternative would be to modify the inference engine so that it consults the table of identity when solving queries. The table of identity has been prototyped but not yet integrated with the main English-to-Prolog translation system. The problem is that identity is a symmetric (commutative) and transitive relation. Thus it would require the rules identical(X,Y) :- identical(Y,X). identical(X,Z) :- identical(X,Y), identical(Y,Z). which would cause loops under some conditions by calling themselves endlessly. Our solution is to store the table as an identity matrix represented by a set of Prolog facts. The predicate identical is defined by only facts, not rules, and hence cannot loop. All the appropriate facts are generated when an entry is made into the table. Initially, only the clause identical(X,X). 21 is in the knowledge base; this corresponds to the main diagonal of the matrix and ensures that any term will be treated as identical to itself. Additional entries are made by a predicate make_identical. Calling make_identical(a,b), for example, adds not only the facts identical(a,b). identical(b,a). but also any other facts called for by transitivity. For example, if identical(a,c) were already in the knowledge base, then make_identical(a,b) would add four facts: identical(a,b). identical(b,a). identical(c,b). identical(b,c). The Prolog code to do this reflects the 2x2 dimensionality of the identity matrix: make_identical(X,Y) :- setof(X1,identical(X,X1),XMatches), setof(Y1,identical(Y,Y1),YMatches), make_id_list_squared(XMatches,YMatches). make_id_list_squared([],_). make_id_list_squared([H|T],List) :- make_id_list(H,List), make_id_list_squared(T,List). make_id_list(_,[]). make_id_list(X,[H|T]) :- (identical(X,H) ; assert(identical(X,H))), (identical(H,X) ; assert(identical(H,X))), make_id_list(X,T). If the table of identity contains information for n individuals, it will contain at most n2 clauses for identical, and usually considerably fewer. 6.2. Negation The current implementation does not handle negation. We propose to handle negation as a metalogical predicate neg, as in d-Prolog (Nute and Lewis 1986:6-7). Thus it will be possible to assert neg donkey([2]). to say "Individual [2] is not a donkey." 22 Negative statements are not defined in terms of affirmative ones, nor vice versa. From the inference engine's point of view, donkey([2]) and neg donkey([2]) are completely unrelated facts and each must be queried separately. Crucially, neither of them follows from the inability to derive the other. The knowledge base could even contain both of them, though it would then express a contradiction. Special routines could of course be written to detect contradictions and identify the premises from which they arise. This solves the problem with queried if-thens that we mentioned earlier. In this system, the sentence "Every donkey that is not young is gray" translates to: gray(X) :- donkey(X), neg young(X). Now, in order to test whether all donkeys are gray, we assert the hypothesis donkey([23]). and see if we can prove gray([23]). Using the above rule, we cannot. The subgoal donkey([23]) succeeds, but the subgoal neg young([23]) does not, because we never asserted neg young([23]) or anything from which it is derivable. Thus the erroneous result does not occur. 6.3. Modal subordination The DRT account of pronominal anaphora in conditional sentences presupposes a simple discourse structure in which antecedent clauses are always superordinate to their consequent clauses. For example: (1) If a farmer owns a donkey, he loves it. (2) If a farmer owns every donkey, he beats it. In (1), it can refer to a donkey, which is in a DRS superordinate to it. But in (2), it cannot refer to every donkey, because the donkey is introduced in a subordinate DRS: If a farmer owns every donkey, he beats it. Ŀ Ŀ X1 Ĵ Ĵ farmer(X1) ==> beats(X1,X2) Ŀ Ŀ X2 Ĵ==> Ĵ donkey(X2) owns(X1,X2) 23 Hence the underlined occurrence of X2 is illegal and the anaphoric reference is not permitted. However, consider the following example (Roberts 1985, 1987): (3) a. If a farmer owns a donkey, he feeds it lots of hay. b. It soon grows fat on this diet. Fragment (3) cannot possibly be called ungrammatical or even dubiously grammatical. Clearly, sentence (3b) needs to go into the consequent of the if-then structure established by (3a). Neither the truth-conditions nor the anaphoric reference is handled correctly unless this is done. But the rules for DRT, as formulated so far, do not provide for this. Roberts (1985, 1987) refers to this phenomenon as "modal subordination" - - semantically, the mood (modus) of the second sentence makes it behave like a subordinate clause in the first sentence, even though syntactically it is an independent clause. Goodman (1988) independently noticed the phenomenon and called it "multi-sentence consequents." To handle modal subordination correctly, we must: (1) formulate rules for recognizing modally subordinate sentences, based on appropriate choices of verb moods and tenses, presence of anaphors, and other indicators; (2) build these rules into the DRS construction algorithm. Implicit in this refinement of DRT is the recognition that discourse is more complex than early versions of DRT admitted. Instead of one single topmost DRS into which all independent clauses are inserted, an adequate theory of discourse will need to provide for a complex hierarchical structure including such things as subplots within a main plot, different mainplots at the same level, dialogues, and the like. Present DRT is indeed discourse-oriented (as opposed to sentence- or proposition-oriented) when dealing with simple declarative sentences, but as soon as if-thens, negations or queries are involved, the DRS construction rules crucially rely on syntactic sentence boundaries (sentence final punctuation) as a trigger for DRS embedding. Non- syntactic intersentential links, for example modal subordination, are ignored. 6.4. Loop removal In the present implementation, a sentence such as Every gray donkey is an old donkey. goes into extended Prolog as donkey(X), old(X) ::- donkey(X), gray(X). 24 and is asserted as: donkey(X) :- donkey(X), gray(X). old(X) :- donkey(X), gray(X). The first of these clauses sends Prolog into endless recursion. A simple syntactic readjustment rule needs to be added to remove loops of this type. Bibliography Andrews, P. B. (1986) An introduction to mathematical logic and type theory: to truth through proof. Orlando: Academic Press. Covington, M. A. (1987) GULP 1.1: An extension of Prolog for unification- based grammar. ACMC Research Report 01-0021, University of Georgia. Covington, M. A.; Nute, D.; and Vellino, A. (1988) Prolog programming in depth. Glenview, Ill.: Scott, Foresman. Covington, M. A., and Schmitz, N. (1988) An implementation of discourse representation theory. ACMC Research Report 01-0023, University of Georgia. Gabbay, D. M., and Reyle, U. (1984) N-PROLOG: an extension of Prolog with hypothetical implications, I. Journal of Logic Programming 4:319- 355. Guenthner, F., et al. (1986) A theory for the representation of knowledge. IBM Journal of Research and Development 30:1.39-56. Johnson, M., and Klein, E. (1986) Discourse, anaphora, and parsing. CSLI Research Report 86-63, Stanford University. Kamp, H. (1981) A theory of truth and semantic representation. In J. Groenendijk et al. (eds.) Formal methods in the study of language, 277-322. University of Amsterdam. Reprinted in J. Groenendijk et al. (eds.), Truth, interpretation, and information (Groningen- Amsterdam Studies in Semantics, 2), 1-40. Dordrecht: Foris. Kleene, S. C. (1967) Mathematical logic. New York: Wiley. Kolb, H.-P. (1985) Aspekte der Implementation der Diskursreprsentationstheorie. FNS-Script 85-1, University of Tbingen. --- (1987) Diskursreprsentationstheorie und Deduktion, Linguistische Berichte 110:247-282. Nute, D., and Lewis, M. (1986) A user's manual for d-Prolog. ACMC Research Report 01-0017, University of Georgia. 25 Roberts, C. (1985) Modal subordination and pronominal anaphora in discourse. Manuscript, University of Massachusetts at Amherst. --- (1987). Modal subordination, anaphora and distributivity. Ph.D. Thesis, University of Massachusetts at Amherst. Skolem, T. (1928) ber die mathematische Logik. Norsk matematisk tidskrift 10:125-142. Cited by Kleene (1967). Spencer-Smith, R. (1987) Semantics and discourse representation. Mind and Language 2.1:1-26.