pl1res

Automated theorem prover for first-order predicate logic based on resolution

Usage no npm install needed!

<script type="module">
  import pl1res from 'https://cdn.skypack.dev/pl1res';
</script>

README

PL1RES

Find proofs in first-order predicate logic using resolution. Note that the current version has no built-in equality capabilities, and cannot parse infix equality or inequality literals. This may change in a future release.

Module Usage

var pl1res = require('pl1res');

pl1res exposes the functions

  • createProver: to create a prover for a given set of clauses (in CNF)
  • fofListToCnf: to convert first-order formulas (FOF) into an equivalent set of clauses
  • tptpToCnf: to convert a TPTP problem into a set of clauses

Details are explained below. Note that we refer to a clause as an active clause if all possible factors and all possible resolvents between that clause and all other active clauses were created and added to the set of non-active or passive clauses. At the start of the proof search (i.e., when starting from scratch), there are no active clauses, and the set of passive clauses consists of all given or input clauses. One by one, employing heuristic selection criteria, passive clauses migrate to the set of active clauses, or in other words are activated. Active clauses take part in resolution and factorization (factoring), which derives new clauses that are added to the set of passive clauses. This procedure is typically referred to as saturation. Hence the prover presented here is a saturation-based theorem prover.

createProver (clauses, options)

This function creates a prover for the given clauses. The clauses are supplied as an array of strings. Each string represents a clause. The syntax is based on a subset of the TPTP syntax for formulas in CNF.

In a nutshell, TPTP syntax adopts Prolog conventions. Hence, variable names start with an upper case letter (from the Latin alphabet). Function and predicate names start with a lower case letter (from the Latin alphabet). All names may contain digits, but not at the first position. We deviate slightly from the TPTP syntax in that we allow predicate symbols to begin with an upper case letter, and variables to begin with the underscore character '_'. The underscore character may also occur in predicate and function names except at the first position. Single-quoted predicate and function names are supported.

The logical disjunction that connects the literals of a clause is expressed through character '|', while negation is expressed through '~'.

As an example, take this set of three clauses: ['p(X_men) | q(X_men,Y)', '~q(a1,a2)', '~p(a1)']. There are two predicate symbols p and q, and two constants a1 and a2. The first clause has two variables X_men and Y.

A clause can be associated with a role. This can be achieved by providing a clause as an object instead of a string. The object must have the two properties clause and role, the former holding the clause as described above. The value of property role is also a string and must be one of the following TPTP roles: 'axiom', 'hypothesis', 'lemma', 'theorem', 'corollary', 'negated_conjecture'. Apart from these roles the role 'sos' is also recognized.

An alternative to the object is to add a role as a prefix to the string representing a clause, preceded by the character # and separated from the clause by at least one SPACE character, e.g.

'#axiom ~p(X) | p(s(X))'

which is equivalent to

{
  clause:'~p(X) | p(s(X))',
  role:'axiom'
}

Roles are employed to facilitate usage of the so-called Set of Support (SOS) strategy. See option sos of prover function run described below.

Parameter options is optional. If supplied it must be an object that can have the following property:

  • logging: its value is an object that provides a logger and a log level through these properties:

    • logger: a function that takes one string argument and consumes the provided information in a suitable way, e.g. console.log

    • level: the log level that determines the amount or detail of logged information; it should be a an integer between 0 and 5 (inclusive). The default is 0 and means no logging.

The return value of createProver is an object that acts as a prover. It offers the following properties, all of which are functions:

  • run (options): starts or resumes a proof run. It returns the state of the prover, which is one of the following strings:

    • 'consistent': the set of clauses is consistent

    • 'inconsistent': the set of clauses is inconsistent, i.e. the empty clause was derived

    • 'stopped': the limit set through option maxClausesActivated or timeout was exceeded (see below)

    Parameter options is optional. If provided it must be an object that can have the properties listed below. Note that all options except maxClausesActivated and timeout are taken into account only when starting from scratch, i.e. the very first execution of run or any execution of rerun.

    • logging: see option logging of createProver options; use this option to overwrite the currently used logger or log level.

    • maxClausesActivated: the maximal number of clauses that can be activated before the prover enters state stopped. Accepted values are integers greater than 0. Any other value grants the prover an unlimited number of clause activations. The default is no limit.

    • timeout: the timeout in milliseconds. Accepted values are non-negative integers. Any other value means no timeout. The default is no timeout.

    • fctWeight: the weight of predicate and function symbols used when computing the heuristic weight of a clause. It must be an integer greater than 0. The default is 2.

    • varWeight: the weight of variables used when computing the heuristic weight of a clause. It must be an integer greater than 0. The default is 1.

    • depthWeight: the weight of the depth component of the heuristic weight of a clause. It must be a non-negative integer. The default is 0, i.e. the depth of a clause is irrelevant. The depth of an input clause is 0. The depth of a derived clause is the maximum of the depths of its parent clauses (or clause in case of factorization) plus 1.

    • minWeightGiven: a flag that when set to a value equivalent to true assigns the minimal weight of 1 to all input clauses, with the effect that all input clauses will be activated before any other (derived) clause. The default is false.

    • sos: a specification for employing the Set Of Support (SOS) strategy. If provided it must be an object with some or all of the following properties: ownSelectionWeight, otherSelectionWeight, roles. See appendix Set Of Support for details.

    • useFVIndex: a flag that when set to a value equivalent to true enables the use of the feature vector indexing technique to speed up subsumption activities by reducing the number of failing subsumption attempts. See for instance this paper for details. The default is false.

    • patterns: defines a set of patterns employed to modify the weights of clauses exhibiting at least one of these patterns, typically to increase the weight and hence delay the selection of clauses deemed to be an impediment or distraction during proof search. See appendix Patterns for details.

  • rerun (options): starts or restarts a proof run. It always starts from scratch and discards any data or results of a previous run. Parameter options is optional. For details see the description of parameter options of run. The return value of this function is the state of the prover. See run for details.

  • getState (): returns the state of the prover as one of the strings 'consistent', 'inconsistent', or 'stopped'. See return value of run above for more details.

  • getGivenClausesCount(): returns the number of given or input clauses

  • getGivenClauses(withRole): returns the given or input clauses as an array of strings. If parameter withRole evaluates to true the role of each clause (that was assigned a role) is prepended as #<role>.

  • getActiveClausesCount(): returns the number of active clauses

  • getActiveClauses(): returns the active clauses as an array of strings

  • getStatistics(): returns statistics as an object (properties with numerical values)

  • getStatisticsAsString(): returns statistics as a formatted string

  • getTimeElapsed(): returns the time elapsed during the most recent run as a human-readable string

  • getHrtimeElapsed(): returns the time elapsed during the most recent run in process.hrtime format

  • getProof(): returns the proof found. If no proof was found then null is returned. Otherwise the proof is returned as an object with the following properties, all of which are functions:

    • length(): returns the length of the proof, or in other words the total number of clauses involved in the proof (including the empty clause)

    • depth(): returns the depth of the proof, which is maximal depth of a leaf node when viewing the proof as a directed graph with the empty clause as its root, and input clauses as its leaves

    • efficiency(): returns the quotient between the proof length and the total number of clauses activated during proof search. This quotient is greater than 0 and not greater than 1. A quotient of 1 signifies a 'perfect' search in the sense that every activated clause contributed to the proof found.

    • efficiencyAsString(fl): returns the proof efficiency as a string, limiting the number of fractions to a maximum of 3, or as per fl if supplied as an integer number greater than zero

    • toString(): returns a string representation of the proof, which is a list of the clauses that constitute the proof, listed in the order in which they were activated, including information on their origins (i.e., whether a clause is an input clause, or whether it was derived from which clause or clauses through factorization or resolution); a clause ID preceded by the character '*' signifies that the clause is a member of the set of support.

  • nonProofClausesCount(): returns the number of active clauses that did not play a role in the proof found. If no proof was found then the return value is undefined.

  • nonProofClauses(): returns the set of active clauses that did not play a role in the proof found. If no proof was found then the return value is undefined. Otherwise the set of unused active clauses is returned as an array of strings. Each string represents an unused active clause. Note that the character '*' preceding the ID of a clause signifies that the clause is a member of the set of support. Furthermore we want to point out that the origin of a clause may refer to clauses that are neither in the proof found nor in this set of clauses. Such clauses were removed on account of backward subsumption.

  • getSosOptions: returns the set-of-support options that were employed; the return value is null if the set-of-support strategy was not active. Otherwise it returns an object with the properties ownSelectionWeight, otherSelectionWeight, roles (cp. option sos of run and rerun; see appendix Set Of Support for details).

Example

The following lines of code create a prover for a simple propositional problem, prove the inconsistency of the set of clauses with little output to the console, and then rerun the problem with more output to the console, and finally show the proof of inconsistency on the console:

const pl1res = require('pl1res'); // or whichever way the module can be loaded
let prover = pl1res.createProver(['p1', '~p1 | p2', '~p2'],{logging:{logger:console.log, level:1}});
prover.run();
prover.rerun({logging:{level:2}});
console.log(prover.getProof().toString());

fofListToCnf (fofList)

This function converts a list of first-order formulas into a logically equivalent set of clauses (in CNF). The syntax of a first-order formula is based on a subset of the TPTP syntax for logic formulas. It is acceptable to omit the brackets surrounding the list of variables following a quantifier. A list of formulas is passed through parameter fofList as an array. Each element of the array represents a formula. Similar to clauses (cp. parameter clauses of function createProver) a formula can be specified as a string with the optional role prefix, e.g.

#conjecture ?X:(isCretan(X) & ~isLiar(X))

or an object (representing the same formula as the one above):

{
  formula: '?X:(isCretan(X) & isLiar(X))',
  role: 'conjecture'
}

The following TPTP roles are accepted: 'axiom', 'hypothesis', 'lemma', 'theorem', 'corollary', 'conjecture', 'negated_conjecture'. Apart from these roles the role 'sos' is also recognized.

tptpToCnf (tptpProblem)

This function converts a TPTP problem (given as a string) into a set of clauses that can be passed to createProver. In other words the return value of this function is an array that can be used as parameter clauses of createProver.

All include statements of the TPTP problem (if any) must have been resolved prior to calling this function. A given TPTP problem may contain only the TPTP languages cnf and fof. The roles of clauses or formulas in the original TPTP problem are inherited by the clauses in the returned array. Formulas with the role conjecture are transformed into clauses with the role negated_conjecture.

If you wish to download a TPTP problem (with all include statements resolved) we recommend using package tptp, and in particular function getProblemWithAxiomSetsIncluded of the tptpClient.

Run From Command Line

Use run.js to run the prover from the command line. run.js is a convenience program that utilizes the index module to tackle problems in CNF or FOF. Execute

node run

for usage information.

Problems can be specified in a simple format (see files in directory samples), or in the TPTP format. If a file name fits the TPTP problem name format, it is assumed to be a TPTP problem. File names for the simple format should use the extensions .cnf and .fof if CNF or FOF is used, respectively.

It is worth noting that it is possible to pipe problems, and in particular TPTP problems, to run, which comes in particularly handy if a TPTP client such as the client provided by package tptp is available. Assuming package tptp is installed side-by-side with package pl1res the following command line downloads TPTP problem MSC001-1.p, lists the initial set of clauses and proves it to be inconsistent:

node ../tptp/tests/tptpClient -pi MSC MSC001-1.p | node run -li -l1 =tptp

Appendix

Set Of Support

The set of support strategy, or its variant implemented here, proceeds as follows. In the beginning, it labels certain initial clauses as members of the set of support. They nonetheless start out as passive clauses in the sense outlined above. Members of the set of support, and other passive clauses, are selected based on a weighting scheme, e.g. two passive clauses from the set of support for each other passive clause.

A selected clause of the (passive) set of support becomes an active set of support clause. A clauses derived with the help of at least one clause from the set of support will itself also become a clause of the (at first passive) set of support. Should the passive set of support run dry at some point the weighting scheme is suspended until such clauses become available again. In other words the prover will not stop because there are no more set of support clauses, and simply continue with whatever clauses are at its disposition.

The weighting scheme is specified through the properties ownSelectionWeight and otherSelectionWeight. Thus, if you want the scheme to be two set of support clauses for one other clause, then specify option sos to be

sos:{
  ownSelectionWeight: 2,
  otherSelectionWeight: 1
}

If not supplied each of these weights defaults to 1. Note that it is possible to choose the weight 0 in both cases. Weight 0 implies that the respective type of clause is selected only if there are no more passive clauses of the other kind. Choosing 0 for both weights is effectively the same as

sos:{
  ownSelectionWeight: 1,
  otherSelectionWeight: 0
}

Keep in mind that a weight of 0 can easily destroy completeness of the prover in the sense that a proof will never be found, even if one exists.

The initial assignment to the set of support is accomplished through the roles of the clauses. The special role sos always puts the respective clause into the set of support. Other roles are treated like role sos with the help of property roles, an array of role names. For instance, the following set of support specification puts all negated conjectures and lemmas into the set of support (as well as all clauses with roles sos), and sets up a weighting scheme of 3:1 in favor of set of support clauses:

sos:{
  ownSelectionWeight: 3,
  otherSelectionWeight: 1,
  roles:['negated_conjecture', 'lemma']
}

Note that the set of support strategy is active only if option sos is present. Hence having clauses with role sos does not suffice. Option sos must be supplied, if only as an empty object.

Finally, a set of support clause is shown with character '*' preceding its ID when logging at level 2 or greater during a proof run, or when obtaining the list of clauses that did or did not contribute to a proof eventually found.

Patterns

A pattern is specified through an object with the properties pattern and mod. The set of patterns is an array of such objects. The value of property pattern is a string that defines a pattern. A pattern is a term construed with functions symbols and the following special symbols:

  • *: matches any term

  • +: matches any term except variables

  • ?: matches any variable

As an example consider sub(sub(*)), for instance. This pattern matches any term of the form sub(sub(t)), where t is any arbitray term. A pattern matches a clause if it matches any of the terms occurring in that clause, or more precisely in arguments of the literals of that clause.

The first pattern of the set of patterns to match a clause determines the weight modification of that clause. The modification of the weight is controlled through the second property mod. It can be a number or a function that accepts one argument, namely the original weight of the clause, and returns the modified weight for that clause. Providing a number n is a shortcut for the function w => w + n.

Let us take a look at TPTP problem PUZ005-1.p from domain PUZ. With the standard weighted symbol count this problem is rather elusive to classic resolution. It is well known that the appearance of numerous clauses with nested occurrences of the function symbol yesterday uses up resources to no avail. The following set of patterns (consisting of just one pattern) counters this problem:

patterns: [{
  pattern:'yesterday(yesterday(*))',
  mod:30
}]

Combine this pattern with the set of support strategy:

sos:{
  roles:['negated_conjecture'],
  ownSelectionWeight:1,
  otherSelectionWeight:1
}

and PUZ005-1.p can be proved to be inconsistent in no time at all. See also samples/options.json or execute

node run samples/PUZ005-1.p -O samples/options.json

The concept of patterns in its current simple form can and may be extended and enhanced in the future.