Transducers

Posted on January 3, 2016 by Josh

TL;DR They’re not tough. Play around with the following functions in a repl, and you’ll master this super useful pattern:

reducing fns: [conj str]
transducers: [(map inc) (filter even?)]
appliers: [transduce eduction]

Why Transducers? Composition.

A good solution to a problem is one that you can easily pair with other solutions, and blissfully ignore extraneous details, such as what specifically the current solution is getting applied to.

This is, of course, composition. And different shapes of solutions compose in different ways. For example, plain functions compose with clojure’s (comp ...).

ex: (comp (partial + 1) (partial * 2)) returns a function that now acts like (fn [x] (-> x (* 2) (+ 1)))

Transducers are a pattern for comping modifications that you want to make to a reducing function.

Reducing Functions (or rfs)

You’ll find over your career that a lot of solutions you write can reduce to something that looks like this:

(fn reducing-fn
  [accumulator piece] ...) ;; => new-accumulator

Examples of reducing fns:

(conj [1 2] 3) ;; => [1 2 3]
(str "accumulating string, " "new piece") ;; => "accumulating string, new piece"

Note: reduce is called fold in haskell, and you’ll see various flavors of fold, but which stay close to these semantics.

Transducers, aka xforms (or xfs)

When using reducing functions, it’s often important (in the name of composition!) to be able to manipulate the stuff flowing into our function. I may want to filter it, or apply a fn to it, for example.

So, there are another class of functions called transducers, which you will see called xforms, or xfs. Their signature looks like this:

(fn [rf] ...) ;; => new-rf

So, xfs manipulate rfs.

Examples of xforms:

(def xf (map inc)) ;; => an xform that `inc`s inputs before they reach the reducing function (eg: `conj`, or `str`)
((xf conj) [] 1) ; => [2]
((xf str) "" 1)  ; => "2"

xform composition

Notice how xforms take rfs and return rfs? We can chain them together using (comp ...).

(comp xf-1 xf-2) ;; => new xf

Example of composition:

;; Composition Order: top-most happens first
;;   (that's opposite of normal fn comp, fyi. 
;;    There's a good reason for it.)

(((comp (map inc)
        (map (partial * 2)))
  conj) [] 1) ; => [4]

(((comp (map (partial * 2))
        (map inc))
   conj) [] 1) ; => [3]

And put em all together

  (let [xform (fn xform [rf] ...)       ;=> new rf
        rf    (fn rf [c x] ...)]        ;=> new c
    ((xform rf) collection value)))     ;=> new c

transduce

Consider the signature of reduce: (reduce rf collection) ; => new-collection

transduce just adds a transducer/xform into the mix: (transduce xform rf collection) ; => new-collection

So, transduce processes things (like a vector, or an async channel) with xform, and then reduces them with rf

(def xf (comp (map inc)
              (filter even?)
              (map str)))
(transduce xf conj [1 2 3]) ;; => ["2" "4"]
(transduce xf str  [1 2 3]) ;; => "24"

Notice, the xform says to take numbers, inc em, keep only even? ones, turn em to str.

That same xform works with both conj and str just fine! And it’d work great with, say, an async/pipeline too!

eduction

An eduction basically runs a transducer over some traversable thing, like a vector:

(eduction (map inc) [1 2 3])            ; => (2 3 4)
(eduction (filter even?) (range 5))     ; => (0 2 4)
(eduction (comp (filter even?)
                (map inc))
          (range 5))                    ; => (1 3 5)

Custom xforms and rfs

I want to leave you with a bigger example that draws from everything we just learned:

This transducer duplicates inputs.

(let [new-concat (fn new-concat
                   ([] [])                   ; 0-arity
                   ([c x] (concat c x)))     ; 2-arity
                   
      xform      (fn xform [rf]
                   (fn duplicate
                     ([] (rf))               ; 0-arity
                     ([x] x)                 ; 1-arity
                     ([c x] (rf c [x x]))))] ; 2-arity

  ((xform new-concat) [] 1)                  ; => (1 1)
  (transduce xform new-concat (range 3))     ; => (0 0 1 1 2 2)
  (eduction xform (range 3)))                ; => ([0 0] [1 1] [2 2])

Arities for Extra Credit

You’ll get far with what you now know. But you’ll notice in the example above the multiple arities of the transducer and the reducing function, and understanding this will round out your understanding, and allow you to write your own reducing functions and transducers.

The multiple arities basically completes the “plumbing” needed to make transducing work.

If you want something far more technical than what I’ve dissected here, I used this blog post to help me figure things out.

Arities of rfs

  • 0-arity - empty objects of the correct type

    Reducing functions can declare the sort of data they work with by calling their 0-arity version:

    eg: (conj) ; => [] (str) ; => ""

    And transduce needs that information, which you can see in it’s 0-arity version which calls (rf)

  • 1-arity - completion function

    This one’s a little confusing. This arity is called when the reduction is nearly complete, and allows some final work to be done.

    Perhaps, take a look at partition-by in the clojure core.

    Notice the let [a (java.util.ArrayList.) ...] which holds state throughout the reduction. The 1-arity version of partition-by’s transducer pulls out the left-over state to stick it in the final return value, whereas it would have been lost otherwise.

  • 2-arity - step function

    This is the muscle of the reducing function. It is this arity version which gets passed an accumulator, and an object, and must return a new accumulator.

Arities of xforms

  • 0-arity - initial collection

    This often simply calls the 0-arity version of the reducing function to get an empty object to work with.

  • 1-arity - returns a new rf

    Here are the guts of a transducing fn - it takes an rf and returns a modified rf.

Reference

The following functions are related to transducers, and therefore fun to learn. I leave it as a reference since it frustrated me in the beginning of this academic journey to know what worked with transducers and what didn’t. This was my “bingo” card while studying them :)

cat
completing
dedupe
distinct
drop
drop-while
eduction
filter
interpose
keep
keep-indexed
map
map-indexed
mapcat
partition-all
partition-by
random-sample
remove
replace
sequence
take
take-nth
take-while
transduce