March, 2015

  1. Optimize a clojure macro

    March 14, 2015 by xudifsd

    Note: if you're already a macro guru, feel free to skip this post; it just records my path toward mastering Clojure macros.

    I often find myself using format to build a string from a template, where the values filled in almost always come from local variables. So I wondered whether a macro could generate this kind of code for me.

    So I wrote the first version:

    (ns before-opt
      (:require [clojure.string :as str]))
    
    ; matches {a} {a-b} {a_b} but not {:a} {a/b}
    (defonce template-var-re #"\{((?:\w|\-)+)\}")
    
    (defmacro gen-from-tpl
      "gen-from-tpl should be invoked with a template string such as
      \"exception {ex} while {method} {url}\"; it generates a new string
      from the values of the symbols in scope at the call site.
      For example (let [foo 1 bar 2] (gen-from-tpl \"{foo}:{bar}\")) => \"1:2\""
      [tpl-str]
      {:pre [(string? tpl-str)]}
      (let [matches (re-seq template-var-re tpl-str)
            str-to-replace (mapv (fn [[orign key]]
                                   [orign (symbol key)])
                                 matches)]
        `(reduce (fn [acc# [orign# val#]]
                   (str/replace acc# orign# (str val#)))
                 ~tpl-str
                 ~str-to-replace)))
    

    With this macro I can write

    (gen-from-tpl "{scheme}://{subdomain}.{host}/foo/bar/{token}")

    instead of

    (format "%s://%s.%s/foo/bar/%s" scheme subdomain host token)

    Yes, this doesn't save much typing, but I no longer need to check that the number of "%s" matches the number of args, and the template looks prettier. I happily used it in a few places, but after some days I realized the generated code is not optimal: it expands to a reduce call with a string replace in the reducing fn, and that work happens at runtime, not at compile time. Every call scans and rebuilds the string once per placeholder, even though the template is already known when the macro expands.

    I want the generated code to be more efficient. To do that, I have to move some computation from runtime to compile time and generate something like

    (str scheme "://" subdomain "." host "/foo/bar/" token)

    This would be much better.

    To achieve this, I need a utility function that partitions a string with a regular expression and returns the matched and unmatched pieces in order, instead of returning only the matches like re-seq does. After some searching I couldn't find one, so I had to implement it myself, and finally finished the optimized version:
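
    To make the difference concrete, here is a hand-written approximation of the two expansions, wrapped in functions so they can be called directly. This is not actual macroexpand output; url-via-reduce and url-via-str are hypothetical names, and the sample values are made up.

    ```clojure
    (require '[clojure.string :as str])

    (defn url-via-reduce
      "Roughly what the first version expands to: one str/replace pass per
      placeholder, every time the code runs."
      [scheme host token]
      (reduce (fn [acc [orig v]]
                (str/replace acc orig (str v)))
              "{scheme}://{host}/foo/bar/{token}"
              [["{scheme}" scheme] ["{host}" host] ["{token}" token]]))

    (defn url-via-str
      "The target expansion: the template is already split into pieces,
      so only a single str call is left for runtime."
      [scheme host token]
      (str scheme "://" host "/foo/bar/" token))

    (url-via-reduce "https" "example.com" "abc")
    (url-via-str "https" "example.com" "abc")
    ;; both return "https://example.com/foo/bar/abc"
    ```

    The result is the same; the second form just has nothing left to parse when it runs.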

    (ns after-opt)
    
    ; matches {a} {a-b} {a_b} but not {:a} {a/b}
    (defonce template-var-re #"\{((?:\w|\-)+)\}")
    
    (defn- re-partition
      "partition string by regex"
      [re string]
      {:pre [(string? string)]}
      (letfn [(construct-not-matched [start end]
                {:match? false
                 :result (subs string start end)})
              (construct-matched [start end]
                (let [matched-str (subs string start end)]
                  {:match? true
                   :result (re-matches re matched-str)}))]
        (let [matcher (re-matcher re string)
              str-count (count string)]
          (loop [last-index 0
                 result []]
            (if (.find matcher)
              (let [start (.start matcher)
                    end (.end matcher)
                    cur-result (construct-matched start end)]
                (if (= start last-index)
                  (recur end (conj result cur-result))
                  (recur end (conj result
                                   (construct-not-matched last-index start)
                                   cur-result))))
              (if (= last-index str-count)
                result
                (conj result (construct-not-matched last-index str-count))))))))
    
    (defmacro gen-from-tpl
      "gen-from-tpl should be invoked with a template string such as
      \"exception {ex} while {method} {url}\"; it generates a new string
      from the values of the symbols in scope at the call site.
      For example: (let [foo 1 bar 2] (gen-from-tpl \"{foo}:{bar}\")) => \"1:2\""
      [tpl-str]
      {:pre [(string? tpl-str)]}
      (let [partitions (re-partition template-var-re tpl-str)
            string-and-symbol (map (fn [{:keys [match? result]}]
                                     (if match?
                                       (-> result second symbol)
                                       result))
                                   partitions)]
        `(str ~@string-and-symbol)))
    

    re-partition is admittedly a little ugly, but it works with a Java Matcher object and has to handle some corner cases (a match at the very start, adjacent matches, trailing text), so this is the best I can do.
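
    One possible simplification, as a sketch rather than a replacement: Clojure's re-groups reads the groups of the matcher's most recent find, so the matched substring doesn't have to be run through re-matches a second time. re-partition* is a hypothetical name, and the output shape is assumed to be the same as re-partition's.

    ```clojure
    (defn re-partition*
      "Splits s into matched and unmatched pieces, in order.
      Matched pieces carry whatever re-groups returns for that match."
      [re s]
      (let [^java.util.regex.Matcher m (re-matcher re s)]
        (loop [last-end 0
               acc []]
          (if (.find m)
            (let [start (.start m)
                  end   (.end m)
                  ;; keep the text between the previous match and this one, if any
                  acc   (cond-> acc
                          (< last-end start)
                          (conj {:match? false :result (subs s last-end start)}))]
              (recur end (conj acc {:match? true :result (re-groups m)})))
            ;; no more matches: keep the trailing text, if any
            (cond-> acc
              (< last-end (count s))
              (conj {:match? false :result (subs s last-end)}))))))

    (re-partition* #"\{((?:\w|\-)+)\}" "{foo}:{bar}!")
    ;; => [{:match? true,  :result ["{foo}" "foo"]}
    ;;     {:match? false, :result ":"}
    ;;     {:match? true,  :result ["{bar}" "bar"]}
    ;;     {:match? false, :result "!"}]
    ```

    Besides being shorter, this avoids re-matching a substring out of context, which could behave differently for a regex that uses lookahead or lookbehind.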

    The problem with this macro is that its argument must be a string literal; a variable that holds a string won't work, because the macro only sees the unevaluated symbol at expansion time. We could get around this by checking the argument's type inside the macro, generating the optimal code for a string literal and code that does the re-partition at runtime for anything else. Since I don't need that ability I didn't do it, but maybe you can give it a try.

    The best thing about macros: you can use the same Lisp functions at both runtime and compile time. I really appreciate Lisp's homoiconicity now.
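
    Here is a rough sketch of what that could look like. A template that only exists at runtime can't be turned into symbols at expansion time, so this version assumes it is acceptable to capture every local through the macro's implicit &env and look names up in a map. gen-from-tpl*, render-at-runtime and literal-parts are hypothetical names, and leaving unknown placeholders untouched in the runtime branch is an assumption too.

    ```clojure
    (ns tpl-sketch
      (:require [clojure.string :as str]))

    (def template-var-re #"\{((?:\w|\-)+)\}")

    (defn- literal-parts
      "Compile-time helper: turns \"{a}:{b}\" into (a \":\" b).
      With limit -1, split keeps trailing empty strings, so there is always
      exactly one more literal than there are placeholders."
      [tpl]
      (let [lits (str/split tpl template-var-re -1)
            syms (map (comp symbol second) (re-seq template-var-re tpl))]
        (remove #(= "" %)
                (cons (first lits) (interleave syms (rest lits))))))

    (defn render-at-runtime
      "Runtime fallback: looks each {name} up in a map of local name -> value.
      Unknown names are left as they are (an assumption of this sketch)."
      [tpl locals]
      (str/replace tpl template-var-re
                   (fn [[whole k]]
                     (if (contains? locals k)
                       (str (get locals k))
                       whole))))

    (defmacro gen-from-tpl*
      [tpl]
      (if (string? tpl)
        ;; string literal: split now, leave only a str call for runtime
        `(str ~@(literal-parts tpl))
        ;; anything else: capture the locals and substitute at runtime
        (let [locals (into {} (map (fn [sym] [(name sym) sym])) (keys &env))]
          `(render-at-runtime ~tpl ~locals))))

    (let [scheme "https"
          host   "example.com"
          tpl    "{scheme}://{host}"]
      [(gen-from-tpl* "{scheme}://{host}")   ; expands to (str scheme "://" host)
       (gen-from-tpl* tpl)])                 ; goes through render-at-runtime
    ;; => ["https://example.com" "https://example.com"]
    ```

    The runtime branch is more forgiving but also more expensive, and it keeps a reference to every local in scope, so it probably only makes sense as a fallback.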