PLT Redex FAQ

2017-09-25 :: tutorial, PLT Redex

A short guide to Redex concepts, conventions, and common mistakes.

To contribute to this FAQ, submit issues and pull requests to: https://github.com/nuprl/website/

Q. What is Redex useful for?

declaring regular tree grammars
defining pattern-based judgments and relations on terms
testing properties of the above

More generally, Redex is helpful for experimenting with a programming language design, and helping you decide what you might want to prove about a language.

Q. What is Redex not useful for?

Proving theorems about a grammar, judgment, or relation.

Q. What is a term?

Informally, a term is:

a Redex “atom”, or
an object that represents a sequence of characters.

More formally, a term is the result of evaluating (term X), where X is any syntactically-correct Racket expression.

Examples:

$ racket
Welcome to Racket v6.10.0.3.
> (require redex/reduction-semantics)
> (term 42)
42
> (term (+ 2 2))
'(+ 2 2)
> (term ("hello" world (#false)))
'("hello" world (#f))

Some terms may look strange. That’s OK, because a term by itself has no meaning.

Terms can refer to previously-defined values using the unquote escape.

> (define x (term 42))
> (term (+ 2 x))
'(+ 2 x)
> (term (+ ,x (unquote x)))
'(+ 42 42)

Q. What is a Redex model?

A Redex model is collection of tools for working with terms. The tools may include:

languages, to define a grammar for terms
judgments, to describe properties of terms or relations between terms
metafunctions, to transform terms
reduction relations, to define a term rewriting system

The goal of these tools is to encode a “real thing” (maybe, a programming language) using Redex terms.

Q. What is a language?

A Redex language is a named set of non-terminals, patterns, and binding forms. For example, a Redex model of the natural numbers might start with this language:

(define-language nat
  [N ::= Zero
         (Plus1 N)])

the name of the language is nat
the non-terminal N is associated with two patterns: Zero and (Plus1 N)
there are no binding forms

Each pattern describes a syntactic category of terms. Each non-terminal gives a name to the union of the patterns that follow it.

The non-terminal N describes all terms that are either:

the symbol Zero
lists of the form (Plus1 N), where N is either Zero or another “Plus1”

For example,

(term Zero)
(term (Plus1 Zero))
(term (Plus1 (Plus1 Zero)))
(term (Plus1 (Plus1 (Plus1 Zero))))
;; .... and so on

If a language has binding forms, then some terms can introduce names. See the question on binding forms (below) for an example.

Q. What is a pattern?

A pattern is a sequence of characters and variables. If you have: (1) a language, and (2) a pattern that contains named non-terminals from the language, then you can ask whether a Redex term matches the pattern.

A named non-terminal for a language L is an identifier made of: (1) a non-terminal from L, (2) an underscore (_), and (3) any other identifier. See the FAQ entry below.

For example, (redex-match? L p t) returns #true if the term t matches the pattern p relative to the language L.

(define-language nat
  [N ::= Zero (Plus1 N)])

(redex-match? nat N_some-name (term Zero))
;; #true
(redex-match? nat (Plus1 N_a) (term Zero))
;; #false
(redex-match? nat (Plus1 N_0) (term (Plus1 (Plus1 Zero))))
;; #true

If (redex-match? L p t) is #true, then (redex-match L p t) shows how named non-terminals in the pattern bind to subterms of t.

(redex-match nat N_0 (term Zero))
;; (list (match (list (bind 'N_0 'Zero))))
(redex-match nat (Plus1 N_0) (term Zero))
;; #f
(redex-match nat (Plus1 N_0) (term (Plus1 (Plus1 Zero))))
;; (list (match (list (bind 'N_0 '(Plus1 Zero)))))

Q. What is a named non-terminal?

A named non-terminal in a language L is an identifier of the form X_Y, where:

X is a non-terminal from L
Y is any identifier

The name helps when one pattern contains multiple occurrences of the same non-terminal. If you want the two occurrences to bind the same term, then use the same name.

(define-language trees
  [binary-tree ::= Leaf
                   (Node binary-tree binary-tree)])

(redex-match trees
  (Node binary-tree_left binary-tree_right)
  (term (Node Leaf (Node Leaf Leaf))))
;; (list
;;  (match
;;   (list (bind 'binary-tree_left 'Leaf)
;;         (bind 'binary-tree_right '(Node Leaf Leaf)))))

Q. What else can patterns express?

Redex patterns may contain special identifiers to guide pattern-matching. For instance:

The _ pattern matches any term (and does not bind).
A pattern (p …) matches any sequence whose elements match the pattern p. If the pattern p is a named non-terminal, then the non-terminal binds a sequence of subterms.

Examples:

(redex-match? nat (Plus1 _) (term (Plus1 9)))
;; #true
(redex-match? nat (N_0 ...) (term ()))
;; #true
(redex-match? nat (N_0 ...) (term (Zero)))
;; #true
(redex-match nat (N_0 ...) (term (Zero Zero Zero)))
;; (list (match (list (bind 'N_0 '(Zero Zero Zero)))))

See the Redex reference for the full pattern language, including the named and unique non-terminals of the form X_!_Y.

Q. What can patterns not express?

Disjunctions of patterns, e.g., “number or boolean”. (Use a language non-terminal.)
Negations of patterns. (Compose not with redex-match?.)
Some non-regular patterns. (Named dots ..._N or define-language with a side condition may be able to help.)

Q. What is a judgment?

A Redex judgment form defines a relation on terms. The relation is defined by a set of inference rules.

Programming languages papers use inference rules all the time. Redex can express many of the judgments in papers; for example:

well-formedness conditions (i.e., whether a term contains free variables)
type checking rules
type inference rules
evaluation relations

Every judgment needs (1) a language (2) a mode (3) a contract (4) a set of inference rules. For example, the following judgment defines an equality relation on natural numbers.

(define-language nat
  [N ::= Zero (Plus1 N)])

(define-judgment-form nat
  #:mode (N= I I)
  #:contract (N= N N)
  [
   --- Zero=
   (N= Zero Zero)]
  [
   (where (Plus1 N_0--) N_0)
   (where (Plus1 N_1--) N_1)
   (N= N_0-- N_1--)
   --- Plus1=
   (N= N_0 N_1)])

the language is nat; Redex uses the language to interpret patterns
the mode is (N= I I); this means N= is the name of a judgment that expects two input terms (or, N= is a binary relation on terms)
the contract is (N= N N); in other words, N= expects two terms that match the N non-terminal from the nat language
there are two inference rules, named Zero= and Plus1=
the Zero= rule states that (N= Zero Zero) always holds
the Plus1= rule states that (N= N_0 N_1) holds if N_0 and N_1 are both Plus1 terms whose contents are related by N=

The where clauses are guards. When Redex tries to apply a rule with premises of the form (where pattern term), it checks that each pattern matches the corresponding term. If not, Redex stops applying the rule. See the Redex reference for more.

(judgment-holds (N= Zero Zero))
;; #true
(judgment-holds (N= (Plus1 (Plus1 Zero)) (Plus1 (Plus1 Zero))))
;; #true
(judgment-holds (N= (Plus1 Zero) (Plus1 (Plus1 Zero))))
;; #false

Note: the inference rules form a set, not a sequence. So when you ask Redex whether (judgment-holds (N= Zero Zero)), it applies all rules that match (N= Zero Zero). For N= this is just one rule, but in general it could be many rules.

Q. What is a judgment form #:mode?

A #:mode declaration expects a list of the form (id pos-use …), where id is an identifier and each pos-use is either I or O. These declarations say four things:

id is the name of a new judgment form
id expects N arguments, where N is the number of pos-use symbols
id expects an input at each position where the mode contains an I
id produces an output at each position where the mode contains an O

For example, a type inference judgment may take an expression as input and output a type. Here’s a fast and easy type inference judgment for arithmetic expressions. Given any term e_0, the judgment outputs the type Int.

(define-language Arith
  (e ::= integer (e + e))
  (τ ::= Int))

(define-judgment-form Arith
  #:mode (infer-type I O)
  #:contract (infer-type e τ)
  [
   --- T-Int
   (infer-type e_0 Int)])

Q. What can judgments not express?

Redex does not support inference rules that require guessing.

One example of this is a transitivity rule: "τ_0 is related to τ_2 if there exists a τ_1 such that τ_0 is related to τ_1 and τ_1 is related to τ_2". The following example tries to define a transitive subtyping (<:) relation, but Redex rejects the S-Trans rule.

(define-language SomeTypes
  (τ ::= (→ τ τ) Integer))

(define-judgment-form SomeTypes
  #:mode (<: I I)
  #:contract (<: τ τ)
  [
   (<: τ_0 τ_1)
   (<: τ_1 τ_2)
   --- S-Trans
   (<: τ_0 τ_2)]
  [
   --- S-Refl
   (<: τ_0 τ_0)]
  [
   (<: τ_dom-1 τ_dom-0)
   (<: τ_cod-0 τ_cod-1)
   --- S-Arrow
   (<: (→ τ_dom-0 τ_cod-0) (→ τ_dom-1 τ_cod-1))])

The error is that in the rule S-Trans, the named non-terminal τ_1 appears in an input position but is not bound to a term.

Q. What is a metafunction?

A metafunction is a term-level function on terms.

Every metafunction needs: (1) a language (2) a name (3) a contract (4) a sequence of guarded input/output cases.

Here is a metafunction that returns #true when given two equal natural numbers. The definition is similar to the N= judgment form.

(define-metafunction nat
  N=? : N N -> boolean
  [(N=? Zero Zero)
   #true]
  [(N=? N_0 N_1)
   (N=? N_0-- N_1--)
   (where (Plus1 N_0--) N_0)
   (where (Plus1 N_1--) N_1)]
  [(N=? N_0 N_1)
   #false])

the metafunction is named N=?
its contract is N N -> boolean, this means N=? expects 2 terms that match the N pattern and returns a term that matches the pattern boolean
there are three cases; the second case is guarded by two where clauses

Any occurrence of (N=? ….) in any term is evaluated.

(term (N=? (Plus1 (Plus1 Zero)) (Plus1 (Plus1 Zero))))
;; #true
(term ((N=? Zero Zero) Zero))
;; '(#true Zero)
(term (N=? (Plus1 Zero) (Plus1 (Plus1 Zero))))
;; #false

Any occurrence of N=? outside a term is an error.

Metafunction where-clauses are analogous to judgment form where-clauses. See the Redex reference for more.

Note: the cases in a metafunction form a sequence, not a set. To evaluate a metafunction application, Redex tries each case in order and returns the result of the first case that (1) matches the call-site (2) for which all guards succeed.

Q. Should I use a metafunction or a judgment form?

Use a judgment form.

Metafunctions are handy, but judgments are easier to read and debug and maintain.

Q. What is a reduction relation?

A reduction relation is a set of term-rewriting rules.

Every reduction relation needs: (1) a language (2) a domain (3) a set of rewrite rules.

The language tells Redex how to interpret patterns.
The domain is a pattern. Input to the reduction relation must match the pattern, and output from the reduction relation must match the pattern.
The rewrite rules have the form (—> term term guard …). The term on the left is the input, the term on the right is the output.

See the Redex reference for a full description of the guards.

The preferred way to define a reduction relation is to define a language that includes three non-terminals:

a non-terminal for the domain of the reduction relation
a non-terminal for a subset of the domain that cannot reduce further
a non-terminal for evaluation contexts

An evaluation context is a term that contains a hole. A reduction relation can match a term against an evaluation context to find a sub-term to rewrite — in particular, the sub-term that matches the hole.

In the following example, bexp is the domain of a reduction relation. A bexp term represents a boolean expression, which can be #true or #false or a conjunction of expressions or a disjunction of expressions. The boolean expressions #true and #false are also values (val); these cannot reduce further. The non-terminal E is for evaluation contexts.

(define-language Bool
  (bexp ::= #true #false (bexp ∧ bexp) (bexp ∨ bexp))
  (val ::= #true #false)
  (E ::= hole (E ∧ bexp) (val ∧ E) (E ∨ bexp) (val ∨ E)))

(define step
  (reduction-relation Bool
    #:domain bexp
    [--> (in-hole E (val_lhs ∧ val_rhs))
         (in-hole E val_new)
         ∧-step
         (where val_new ,(and (term val_lhs) (term val_rhs)))]
    [--> (in-hole E (val_lhs ∨ val_rhs))
         (in-hole E val_new)
         ∨-step
         (where val_new ,(or (term val_lhs) (term val_rhs)))]))

The function apply-reduction-relation applies a reduction relation to a term and returns a list of ways that the term can step.

(apply-reduction-relation step (term #true))
;; '()
(apply-reduction-relation step (term (#true ∧ #true)))
;; '(#true)
(apply-reduction-relation step (term (#true ∧ #false)))
;; '(#false)
(apply-reduction-relation step (term ((#true ∨ #false) ∧ #true)))
;; '((#true ∧ #true))

Three things about the reduction relation step:

Using in-hole on the first argument of —> searches a term for a subterm that Redex can apply a reduction rule to.
Using in-hole on the second argument of —> puts a new value back into the hole in the evaluation context.
The unquote operator (,) escapes to “Racket mode” (see below) to evaluate a conjunction or disjunction.

A judgment or metafunction is a formal alternative to “escaping to Racket”, but escaping can be convenient.

Note: the cases in a reduction relation form a set, not a sequence. If more than one case matches, Redex applies them all.

Q. What is “Racket mode”? What is “Redex mode”?

Code in a Redex model is sometimes evaluated in “Racket mode” and sometimes evaluated in “Redex mode”. Racket mode evaluates Racket syntax to Racket values. Redex mode evaluates Racket syntax (possibly containing metafunction names) to terms.

Key points:

A Redex program starts in Racket mode.
The X in (term X) is evaluated in Redex mode …
… unless X contains unquoted sub-expressions. Unquoting escapes to Racket mode …
… and terms inside an unquoted sub-expression are evaluated in Redex mode.

In other words, term enters Redex mode and unquote (,) escapes back to Racket.

Redex implicitly switches to Redex mode in a few other places, for example:

the right-side of a where clause is in Redex mode
the result of a metafunction is in Redex mode

When in doubt, try using an unquote. Redex will raise an exception if it finds an unquote in Racket mode.

Q. Are side-conditions evaluated in “Racket mode” or “Redex mode”?

A (side-condition e) sometimes evaluates e as a Racket expression and sometimes evaluates e as a Redex expression.

reduction relations and metafunctions expect a Racket expression
judgments expect a Redex expression

Q. What is a binding form?

In the lambda calculus, λ-terms bind variables. A term (λ x M) means that any free occurrence of x in the sub-term M refers to the x from the λ-term.

Redex can express this idea with a binding form.

(define-language Λ
  [e ::= (e e) x (λ x e)]
  [x ::= variable-not-otherwise-mentioned]
  #:binding-forms
  (λ x_0 e_0 #:refers-to x_0))

Note: all the non-terminals in a language must be defined before the #:binding-forms keyword. If a non-terminal definition appears after the #:binding-forms keyword, then Redex will interpret the “definition” as a binding form.

Binding forms work together with Redex’s functions for substitution and alphabetic equivalence.

(alpha-equivalent? Λ
  (term (λ x x))
  (term (λ y y))))
;; #true

(define-metafunction Λ
  test-substitute : e -> e
  [(test-substitute (λ x_0 e_0))
   (substitute e_0 x_0 y)])
(term (test-substitute (λ z (z z))))
;; '(y y)

Q. What is …? What is ….?

Three dots (…) is for building patterns. If p is a pattern then (p …) matches any list whose elements all match p.

(define-language L)
(redex-match? L (number ... boolean ...) (term (1 2 #true #true)))
;; #true

Four dots (….) may be used in define-extended-language to extend a previously-defined non-terminal.

(define-language C
  (keyword ::= auto break case))
(define-extended-language C++
  C
  (keyword ::= .... class))

(redex-match? C keyword (term auto))
;; #true
(redex-match? C keyword (term class))
;; #false
(redex-match? C++ keyword (term auto))
;; #true
(redex-match? C++ keyword (term class))
;; #true

Q. Where to learn more about Redex?

“Critical path” resources:

Code examples for this post: https://github.com/nuprl/website/blob/master/blog/static/redex-faq.rkt
Redex documentation: http://docs.racket-lang.org/redex/index.html
An Introduction to Redex with Abstracting Abstract Machines by David Van Horn
Playing the Game with PLT Redex by Leandro Facchinetti
Experimenting with Languages in Redex by William J. Bowman

“Procrastination” resources:

Tree Automata Techniques and Applications
Should your Specification Language be Typed?
Redex source code (see redex-lib/): https://github.com/racket/redex

Inside PRL