6 Match
Due: Wednesday, February 5, 11:59pm.
6.1 Faster matching with continuation-passing style
In lecture we saw one implementation of a subset of Racket’s match. In this homework you will create another. This version will avoid much of the performance overhead of the previous implementation. The key is to use continuation-passing style. Rather than return a value representing success or failure, each matching procedure will receive functions to call in the case of success or failure.
In the runtime, patterns are implemented by procedures that receive the value to scrutinize, a on-success continuation, and an on-fail continuation. The success continuation receives as arguments any sub-values extracted in the match. For example, the cons pattern’s success continuation takes in the car and cdr of a pair. The failure continuation receives no arguments and is called when the value doesn’t match the pattern, like if the value isn’t a pair for the cons pattern.
6.1.1 The runtime functions
The following starter code sketches the signatures of the runtime functions we will use:
; A FailureK is a (-> Any) ; FailureK (define (error-on-fail) (error 'minimatch "all patterns failed to match")) ; — — — — — — — — — — — — — — — — — — - ; Any Any (-> Any) FailureK -> Any ; Matches values that are equal? to val. The success ; continuation is called with no arguments. (define (match-== target val on-success on-fail) (error 'match-== "not implemented")) ; — — — — — — — — — — — — — — — — — — - ; Any (Any -> Any) FailureK -> Any ; Matches any value. The success continuation is called ; with the matched target value. (define (match-var target on-success on-fail) (error 'match-var "not implemented")) ; — — — — — — — — — — — — — — — — — — - ; Any (Any Any -> Any) FailureK -> Any ; Matches a pair value. The success continuation is called ; with the car and cdr of the pair. ; Examples: ; (match-cons ; (cons 1 2) ; (lambda (car-val cdr-val) (+ car-val cdr-val)) ; error-on-fail) ; Evaluates to: ; 3 ; ; (match-cons ; 4 ; (lambda (car-val cdr-val) (+ car-val cdr-val)) ; error-on-fail) ; Raises the error with message "minimatch: all patterns failed to match" (define (match-cons target on-success on-fail) (error 'match-cons "not implemented"))
Put all these definitions in a file called runtime.rkt. Based on the provided signatures and examples, implement match-==, match-var, and match-cons.
6.1.2 Examples in the runtime
We will expand a subset of Racket’s match syntax into compositions of these procedures. For example, the following:
(define (f x) (match x [(cons '1 b) (* b 2)]))
will translate to:
(define (f x) (match-cons x (lambda (car-val cdr-val) (match-== car-val '1 (lambda () (match-var cdr-val (lambda (b) (* b 2)) error-on-fail)) error-on-fail)) error-on-fail))
Following this pattern, hand-translate the following uses of match into compositions of the matching procedures. Add them as part of unit tests in a module+ test submodule in your runtime.rkt.
(match (cons 1 2) [(cons a b) (+ a b)]) (match (list 3 4 5) [(cons a (cons b (cons c '()))) (+ a b c)])
6.2 A macro layer
Translating complex patterns to use our runtime matching functions is painful. Our code ends up getting very nested and indented. Gross! But it is an efficient way to structure the code for pattern matching, because the calls to the success and failure continuations jump directly to the next step of the pattern match. Also notice that the match-var success continuations bind the pattern variables such that they are visible in the clause action. We don’t need to accumulate a list of matched values like we did with the embedding in class.
Now that we’ve done this tedious translation ourselves, we’re ready to write a macro to automate it! Implement a macro called do-match:
syntax
(do-match target-id pat success-body-expr on-fail-id)
Match the target-id against the pat. If the match succeeds, evaluate the success-body-expr with the pattern variables from the pattern bound. If the match fails, invoke the failure continuation specified by the on-fail-id.
In the notation we have been using in lecture, the grammar of the form is:
Fixed from the original release, where the cons production was (cons <id> <id>)
(do-match <id> <pat> <expr> <id>) |
<pat> := (quote <datum>) |
| <id> |
| (cons <pat> <pat>) |
The match-like form we implement later will use this macro to generate code for each clause. For example, a match like:
(let ([x (cons 1 2)]) (match x [(cons a b) (+ a b)]))
will expand to a use of do-match like this:
(let ([x (cons 1 2)]) (do-match x (cons a b) (+ a b) error-on-fail))
That should evaluate to 3.
Your macro should generate code that uses the continuation-passing style runtime functions you implemented earlier. You can assume that the same variable is not used more than once in a pattern.
Note that the success-body-expr is an expression (like (+ a b) above) that the macro will place inside a success continuation, but it is not a procedure or continuation itself. The on-fail-id should be a variable that is bound to a failure continuation procedure.
Also remember to use ~datum to parse literal symbols like cons.
Place your implementation in a file called minimatch.rkt that requires the runtime functions from runtime.rkt. Write tests including scenarios of succesful matches, runtime match failures, and syntax errors for malformed patterns (using convert-compile-time-error).
6.3 Multiple clauses
do-match only handles a single clause.
With match, if a pattern fails to match, the next clause/branch is tried. If the last clause’s pattern fails to match, we error. We can implement this behavior using our failure continuation!
For example, consider the following match expression:
(define (f v) (match v [(cons 1 (cons x '())) "result 1"] [anything "result 2"]))
It can expand to two uses of do-match, chained together via the failure continuations:
Fixed from the original release, where the lambda mistakenly received an argument v
(define (f v) (let ([on-clause1-fail (lambda () (do-match v anything "result 2" error-on-fail))]) (do-match v (cons 1 (cons x '())) "result 1" on-clause1-fail)))
6.3.1 Example expansions
Translate the following use of match into a similar chain of do-match uses:
(define (g v) (match v [(cons a '()) "one thing"] [(cons a (cons b '())) "two things"] [(cons (cons a b) (cons c d)) "binary tree"]))
Add this translation as part of a test in minimatch.rkt.
6.3.2 Implementation
Now that you’ve gone through that process yourself, automate it! Implement a macro minimatch, with syntax similar to match:
syntax
(minimatch target-expr [pat expr] ...)
Match the target-expr against each pattern in turn, and return the result of evaluating the expr of the matching clause. If no pattern matches, raises an error.
Remember to avoid duplicate evaluations of the target expression. I found it helpful to create a helper macro minimatch-val that is like minimatch, but that requires an identifier rather than an expression for the target.
Write unit tests in a module+ test submodule to validate that your minimatch selects the correct clause and raises the appropriate failure error when no clause matches.
6.4 Inlining the runtime
We implemented do-match in terms of the runtime functions from runtime.rkt, which kept the macro simple. Sometimes it’s a good idea keep the expansion of a macro small and place as much of the logic your DSL as possible in the runtime procedures. But in other situations, it might actually be better to just inline the bodies of those procedures and expand to the logic code directly. Think about the implementation of your match-cons procedure. It only needs simple operations like if, let, car, and cdr. Those operations compile to very efficient machine code! By comparison, the procedure call to invoke match-cons could be more expensive, adding performance overhead.
Create a second implementation in a new file, inlined.rkt. In this implementation, do-match should not expand to calls to the procedures from runtime.rkt but instead should expand to code equivalent to the bodies of those procedures. You should be able to copy over the minimatch macro from minimatch.rkt without modification.
For example, revising the translation from earlier,
(define (f x) (minimatch x [(cons '1 b) (* b 2)]))
should expand to something like:
(define (f x) (if (pair? x) (let ([car-val (car x)] [cdr-val (cdr x)]) (if (equal? car-val '1) (let ([b cdr-val]) (* b 2)) (error-on-fail))) (error-on-fail)))
Copy over your unit tests from minimatch.rkt to ensure that inlined.rkt works correctly as well.
6.5 Extra credit: additional pattern forms
Added February 2
Racket’s match supports many additional pattern forms, as described in its documentation. The extensions described below are relatively easy to add to our implementation. For each you choose to implement, add it to your inlined.rkt implementation along with module+ test tests.
In the grammars below, I write <pat> := .... to indicate that the grammar is extending the existing language of patterns defined earlier with the subsequent pattern forms.
6.5.1 list patterns
Grammar:
<pat> := .... |
| (list <pat> ...) |
Example:
(define (g v) (minimatch v [(list a) "one thing"] [(list a b) "two things"] [(list a b c) "binary tree"])) (g '(1 2)) ; => ; "two things"
list patterns are equivalent to cons patterns with a '() in the final cdr position, but are much more convenient. (The real Racket match also supports additional features like ellipsis patterns; we are omitting those here.)
One easy way of implementing list patterns is to translate them into cons patterns. Your do-match macro can match on a list pattern and recursively call itself with an equivalent cons pattern.
6.5.2 and patterns
Grammar:
<pat> := .... |
| (and <pat> <pat>) |
Example:
(minimatch '(1 2) [(and l (list a b)) (format "the list ~a has elements ~a and ~a" l a b)]) ; => "the list (1 2) has elements 1 and 2"
The pattern (and p1 p2) matches when both p1 and p2 match against the target, and variables from both sub-patterns get bound. The pattern fails to match if either sub-pattern fails.
This can naturally generalize to any number of sub-patterns if you wish.
6.5.3 or patterns
Grammar:
<pat> := .... |
| (or <pat> <pat>) |
Example:
(minimatch (cons (cons 1 2) 3) [(or (cons (cons a b) c) (cons a (cons b c))) (list a b c)]) ; => '(1 2 3)
The pattern (or p1 p2) matches when either p1 or p2 match against the target. The pattern variables in the body receive the values from the first of the patterns to match. The match fails if both sub-patterns fail.
or patterns have a wrinkle that makes them tricky. Consider a program like the following:
(define (f v) (minimatch v [(or (cons a b) c) a]))
What should (f 5) return? The a only gets bound when v is a pair! To avoid this problem, Racket’s match requires that the set of bound variables in the two branches of the or must be identical. Your implementation should make this same static check. In this situation you can compare identifiers for equality using bound-identifier=?, which will be discussed in class Monday.
A naive implementation of or patterns would end up duplicating the body of the match clause. Your implementation should avoid this. To do so, you will need to compute the set of variables bound by the pattern and create a let-bound lambda that accepts these arguments. It may help to refer to the implementation from lecture. Make sure that your helper fuction to compute bound variables supports all of the pattern forms you implement.
This can naturally generalize to any number of sub-patterns if you wish.
6.5.4 quasiquote patterns
Racket’s quasiquote makes it easy to construct complex S-expressions with varying elements inserted via unquote.
For example, the following code constructs an X-expression representation of an HTML page based on the provided title and content:
(define (build-page title content) |
(quasiquote |
(html (head (title (unquote title))) |
(body |
(h1 (unquote title)) |
(p (unquote content)))))) |
Just as quote expressions such as (quote (1 2)) can be written with the shorthand '(1 2), expressions using quasiquote and unquote can be written in shorthand using ` and ,. For example, the above code can be written:
(define (build-page title content) `(html (head (title ,title)) (body (h1 ,title) (p ,content))))
Analagously, Racket’s match supports a quasiquote pattern form for deconstructing complex S-expressions. Here are a few examples:
> (match (list 1 2) [`(3 ,a) "nope"] [`(x ,a) "nope"] [`(1 ,a) a]) 2
> (match '(((1))) [`(((,x))) x]) 1
> (match 2 [`,x x]) 2
For this portion of the extra credit, augment your minimatch with a subset of this functionality based on the following grammar:
<pat> := .... |
| (quasiquote <quasipat>) |
<quasipat> := <datum> |
| <id> |
| (<quasipat> ...) |
| (unquote <pat>) |
(We are omitting many features of real quasipatterns, such as unquote-splicing, ellipses, and dotted pairs.)
Note that Racket’s quasiquote patterns treat unquote within nested quasiquotations differently than quasiquote expressions do. quasiquote patterns don’t have a notion of quotation depth, so the following should match:
(minimatch ``1 [``,x x]) ; => 1
As with list patterns, this subset of quasiquote patterns can be expanded into uses of the other pattern forms.
6.6 Submission
Submit to Gradescope, in a group submission with your partner: https://www.gradescope.com/courses/941020/assignments/5705213
Upload your runtime.rkt, minimatch.rkt, and inlined.rkt files.