8 syntax-spec DSLs A
Due: Wednesday, February 26, 11:59pm.
I expect this assignment to be more challenging than the previous ones. Please start early, consult the syntax-spec docs and tutorial as needed, and ask questions on Piazza or at office hours if you get stuck.
8.1 Match
8.1.1 A syntax-spec definition of the match core language
As we saw in lecture, syntax-spec allows us to define the grammar for the core language of a DSL, and then create macros that extend it. You will create such a definition for a variant of the match DSL from the last homework assignment.
The grammar of the core language is as follows:
(macromatch <expr> <clause> ...) |
|
<clause> := [<pat> <expr>] |
<pat> := (== <expr>) |
| <id> |
| (cons <pat> <pat>) |
The version of the match DSL you implemented previously included a quote form that only allowed literal data. By contrast, this DSL uses an == pattern. This pattern matches if the value produced by the Racket expression within the pattern is equal to the value being matched.
Here is a tweaked version of the inlined, CPS version of minimatch from the last assignment that implements a compiler for this core language, calling it minimatch: inlined.rkt.
Write a syntax-spec definition for this language and compile it to the provided minimatch.
Hints: You will need to define a nonterminal/exporting for patterns and a host-interface/expression for the macromatch form. Use racket-var as the binding class for pattern variables, and racket-expr as the nonterminal for clause bodies. Declare that the pattern forms are defined in a new binding space with the #:binding-space option on the nonterminal definition in order to avoid collisions with Racket’s built-in cons form.
This initial implementation doesn’t achieve anything more than minimatch does. However, thanks to syntax-spec it is easy to make macro-extensible! Add an extension-class for pattern macros to your syntax-spec definition, and declare that the pattern nonterminal allows macros of that extension class using #:allow-extension. In the extension class definition, use the same #:binding-space option as you used for the pattern nonterminal. This will avoid later collisions between the list and quasiquote pattern macros and the standard Racket meanings of those names.
Next, use define-dsl-syntax to implement the following syntactic extensions to the macromatch language without modifying the syntax-spec definition or compiler. Include unit tests of each extension that cover the different cases of each macro in a module+ test submodule.
8.1.2 quote patterns as a pattern macro
<pat> := .... |
| (quote <datum>) |
Example:
(define (g v) (macromatch v ['(1) "one thing"] ['(1 2) "two things"]))
8.1.3 list patterns as a pattern macro
Grammar:
<pat> := .... |
| (list <pat> ...) |
Example:
(define (g v) (macromatch v [(list a) "one thing"] [(list a b) "two things"])) (g '(1 2)) ; => ; "two things"
list patterns are equivalent to cons patterns with a '() in the final cdr position. (The real Racket match also supports additional features like ellipsis patterns; we are omitting those here.)
8.1.4 quasiquote patterns as a pattern macro
Racket’s quasiquote makes it easy to construct complex S-expressions with varying elements inserted via unquote.
For example, the following code constructs an X-expression representation of an HTML page based on the provided title and content:
(define (build-page title content) |
(quasiquote |
(html (head (title (unquote title))) |
(body |
(h1 (unquote title)) |
(p (unquote content)))))) |
Just as quote expressions such as (quote (1 2)) can be written with the shorthand '(1 2), expressions using quasiquote and unquote can be written in shorthand using ` and ,. For example, the above code can be written:
(define (build-page title content) `(html (head (title ,title)) (body (h1 ,title) (p ,content))))
Analagously, Racket’s match supports a quasiquote pattern form for deconstructing complex S-expressions. Here are a few examples:
> (match (list 1 2) [`(3 ,a) "nope"] [`(x ,a) "nope"] [`(1 ,a) a]) 2
> (match '(((1))) [`(((,x))) x]) 1
> (match 2 [`,x x]) 2
Create a pattern macro that augments your macromatch with a subset of this functionality based on the following grammar:
<pat> := .... |
| (quasiquote <quasipat>) |
<quasipat> := <datum> |
| <id> |
| (<quasipat> ...) |
| (unquote <pat>) |
(We are omitting many features of real quasipatterns, such as unquote-splicing, ellipses, and dotted pairs.)
You can use a ~datum syntax-parse pattern to recognize the unquote form.
Note that Racket’s quasiquote patterns treat unquote within nested quasiquotations differently than quasiquote expressions do. quasiquote patterns don’t have a notion of quotation depth, so the following should match:
(macromatch ``1 [``,x x]) ; => 1
As with list patterns, this subset of quasiquote patterns can be expanded into uses of the other pattern forms.
8.2 Route
In a lecture motivating the benefits of macros, I introduced a DSL for expressing routes in a web application. In contrast to a regular pattern matcher, this routing DSL features a static check specialized to the task of recognizing when a route specification has an unreachable route. For this homework, you will implement a more sophisticated version of this DSL using syntax-spec, including the static check.
8.2.1 The router language
(router <expr> <expr> <route> ...) |
<route> := (<method> <path> <expr>) |
<method> := get | post | put | delete |
<path-pat> := [<path-segment-pat> ...] |
<path-segment-pat> := <id> |
| (number-parameter <id>) |
| (string-parameter <id>) |
The pattern variable names provided as part of string and number parameters of a route should be bound within the Racket expression that is the action of the route.
A given pattern variable name must only occur once in a path pattern.
A use with a given method and path dispatches to a route when the method of the route matches and the path matches the path pattern. A path pattern matches when each segement of the path matches each segment of the path pattern. Path segment patterns match as follows:
A literal identifier matches a path segment that is a string with the same contents as the symbol’s string.
A string parameter matches any path segment. The pattern variable is bound to the string containing that path segment.
A number parameter matches a path segment that can be parsed to a number via string->number. The pattern variable is bound to the number.
It is an error to specify an ordering of routes such that one of the routes can never be dispatched to because a previous route handles all the same paths.
8.2.2 Examples
This is a valid use of router:
> (define (route1 method path) (router method path (get [blog] 'list-articles) (get [blog (number-parameter article-no)] (list 'numbered article-no)) (get [blog latest] 'latest) (get [blog (string-parameter article-name)] (list 'named article-name)))) > (route1 'get '("blog")) 'list-articles
> (route1 'get '("blog" "23")) '(numbered 23)
> (route1 'get '("blog" "language-oriendted-programming-is-awesome")) '(named "language-oriendted-programming-is-awesome")
> (route1 'get '("blog" "latest")) 'latest
Notice that in the route with the number parameter, the input path contains a string with characters "23" that can parse as a number, but when the DSL matches the path, it binds pattern variable article-no to the parsed number value 23.
A variety of other orderings of the same routes are invalid because some routes become unreachable. The static check raises a compile-time error in these cases.
The latest route is unreachable if it is placed after the route that accepts a string parameter which can be any string:
> (define (route1 method path) (router method path (get [blog] 'list-articles) (get [blog (number-parameter article-no)] (list 'numbered article-no)) (get [blog (string-parameter article-name)] (list 'named article-name)) (get [blog latest] 'latest))) eval:9:0: router: unreachable route
in: (get (blog latest) (#%host-expression (quote latest)))
The route with the number parameter is also unreachable if it is placed after the route that accepts a string parameter, because the path segments matched by a number parameter are a subset of those matched by a string parameter:
> (define (route1 method path) (router method path (get [blog] 'list-articles) (get [blog latest] 'latest) (get [blog (string-parameter article-name)] (list 'named article-name)) (get [blog (number-parameter article-no)] (list 'numbered article-no)))) eval:10:0: router: unreachable route
in: (get (blog (number-parameter article-no))
(#%host-expression (list (quote numbered) article-no)))
8.2.3 Implementing the router language
8.2.3.1 The syntax-spec
Start by defining the syntax with a syntax-spec. My definition includes two nonterminals, one nonterminal/exporting, and one host-interface/expression. Remember to include binding rules for the pattern variables in the path patterns to make them visible in the route bodies.
8.2.3.2 The code generator
Next, implement a code generator for the DSL via a define-syntax macro called compile-routes using syntax-parse. In your host-interface form for the router language, expand to this code generator. My implementation of compile-routes expands to a match form that dispatches to the appropriate route based on the method and path. Here is an example of that compilation:
(compile-router method path (get [blog] 'list-articles) (get [blog (number-parameter article-no)] (list 'numbered article-no)) (get [blog latest] 'latest) (get [blog (string-parameter article-name)] (list 'named article-name))) ; Expands to: (match (list method path) [(list 'get (list "blog")) 'list-articles] [(list 'get (list "blog" (app string->number (? number? article-no)))) (list 'numbered article-no)] [(list 'get (list "blog" "latest")) 'latest] [(list 'get (list "blog" article-name)) (list 'named article-name)])
Notice that literal path segments like blog expand to string literal patterns like "blog". Number parameters expand to an app pattern that first parsers the path segment to a number using string->number, then matches the result of that parse with a (? number? pvar) pattern. That pattern will fail to match if the string was not parseable as a number; if it was parsed as a number, the pvar is bound to the number. String parameters expand to a pattern that matches anything and binds the pattern variable.
My compile-routes relies on a compile-time helper function compile-path-fragment-to-pattern to accomplish these translations.
At this point, your router DSL should be usable, though it does not yet implement the static semantics. Write tests that cover the basic behavior of the router DSL in a module+ test submodule.
8.2.3.3 The static check
Finally, implement a static check that raises an error when a route is unreachable. This check should be implemented as a separate compile-time function that is called by the router host-interface form, check-unreachable-routes!. Here is its definition:
(begin-for-syntax ; (ListOf RouteSyntax) -> Void or error (define (check-unreachable-routes! routes) (define unreachable-route (first-unreachable-route routes)) (when unreachable-route (raise-syntax-error 'router "unreachable route" unreachable-route))))
This checking function relies on a helper function first-unreachable-route with signature:
;; (ListOf RouteSyntax) -> (or RouteSyntax #f) |
(define (first-unreachable-route routes) #| TODO |#) |
This function returns the first unreachable route in a list of routes.
I needed about 50 lines of code to implement the functionality of first-unreachable-route. You will have a much easier time if you follow the design recipe to break this functionality into several helper functions, following the structure of the grammar of routes, paths, and path segments.
You can use phase1-eval from syntax/macro-testing to write unit tests for these compile-time helpers, and to explore their behavior in the interactions window. For example, these unit tests should pass:
(check-equal? (phase1-eval (first-unreachable-route (list #'(get [blog latest] (latest-article)) #'(get [blog (string-parameter article-name)] (show-named-article article-name))))) #f) (check-equal? (phase1-eval (first-unreachable-route (list #'(get [blog (string-parameter article-name)] (show-named-article article-name)) #'(get [blog latest] (latest-article))))) '(get [blog latest] (latest-article)))
Write tests for the static check in a module+ test submodule. You may write these tests in terms of your first-unreachable-route helper or in terms of convert-compile-time-error checks for the overall router form. Your tests should cover all interesting cases of the static check.
8.2.4 Submission
Submit your macromatch implementation as a file named macromatch.rkt. This file should contain the syntax-spec definition and the define-dsl-syntax macro definitions of the quote, list, and quasiquote pattern macros.
Submit your router implementation as a file named router.rkt. This file should contain the syntax-spec definition, the define-syntax macro definition of compile-routes, compile-time functions for the static check, and tests for the static check.
Submit to Gradescope, in a group submission with your partner: https://www.gradescope.com/courses/941020/assignments/5830858