Introduction

Clojure’s documentation describes transducers as composable algorithmic transformations: transformations that are independent of their input and output sources, and that compose without awareness of where inputs come from or of the aggregate values being built.

When talking about transducers, we should first look at the thing that gave rise to the abstraction: the reduction.

A reducing function is essentially any function that you can pass to a reduce call. It takes the result so far, generally known as the accumulator, and the new value, and returns a new result. So a reducing function is

accumulated-result, input -> accumulated-result
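For instance, + is already a reducing function, as is any hand-written two-argument function of accumulator and value:

```clojure
;; + takes the result so far and the next value: a reducing function
(reduce + 0 [1 2 3 4]) ;; => 10

;; an equivalent hand-written reducing function
(reduce (fn [acc v] (+ acc v)) 0 [1 2 3 4]) ;; => 10
```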

While a transducer is the more general transformation defined as something which takes a reducing function and returns a new reducing function

(accumulated-result, input -> accumulated-result) -> (accumulated-result, input -> accumulated-result)

The primary power of transducers comes from their fundamental decoupling - they don’t care (or know about):

  • The ‘job’ being done (what the reducing function does)
  • The context of use (what kind of result, if any, is being built)
  • The source of inputs (where input comes from)
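As a small taste of this decoupling, here is a sketch using Clojure’s built-in map transducer (whose mechanics we unpack below): the very same transformation can be reused across different contexts and output types.

```clojure
(def xf (map inc)) ;; a transducer: knows nothing about source or output

(into [] xf (range 5))       ;; build a vector   => [1 2 3 4 5]
(sequence xf (range 5))      ;; build a lazy seq => (1 2 3 4 5)
(transduce xf + 0 (range 5)) ;; reduce to a sum  => 15
```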

Now all this sounds really fancy, but I always have a hard time wrapping my head around abstract terminology like this before I understand the nuts and bolts of it. So in this post, we will walk through a simple and elegant way of building up your thinking about transducers. I am in no way an expert at this, but this approach has definitely helped me understand the concepts thoroughly and find joy in applying these techniques in my day-to-day work.

Down the rabbit hole

As described in the official documentation and this talk, the origin of the idea of transducers is the reducer. So let’s start by writing a very simple function.

How to write map in terms of a reduction

(defn rmap [f coll]
  (reduce (fn [acc v]
            (conj acc (f v)))
          []
          coll))

Simple enough: reduce over the collection and conj each value into the accumulator. The value conj’d is the result of applying the function f, which gives us the behaviour of map.
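A quick check at the REPL (assuming the rmap defined above):

```clojure
(rmap inc [1 2 3]) ;; => [2 3 4]
(rmap str [1 2 3]) ;; => ["1" "2" "3"]
```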

How to write filter in terms of a reduction

(defn rfilter [f coll]
  (reduce (fn [acc v]
            (if (f v)
              (conj acc v)
              acc))
          []
          coll))

Again, reduce over the collection, but now only conj a value if the predicate returns true.
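And a quick check of rfilter (assuming the definition above):

```clojure
(rfilter even? (range 10)) ;; => [0 2 4 6 8]
```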

Let us look at these functions closely and try to identify where there is tight coupling. One of the first things that should jump out is the use of conj and the empty vector passed as the initial value. These are sequence-building operations and, strictly speaking, have nothing to do with the transformations, i.e. the map and filter operations.

Let’s try to isolate the reducing functions of map and filter from the sequence building. We can focus on just the internal anonymous functions above, which take an accumulator and a value and return a new accumulated result.


(defn mapping [f]
  (fn [acc v]
    (conj acc (f v))))


(defn filtering [f]
  (fn [acc v]
    (if (f v)
      (conj acc v)
      acc)))

This allows us to pull the reducing logic out and put it in one place, like this:


(def data (vec (range 20)))

(reduce (mapping inc)
        []
        data)

This is a step in the right direction. Now our transformation is independent, and we can supply any initial value we desire. But the coupling between conj and the map is still a concern.
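Because the initial value is now ours to supply, we can, for instance, start from a non-empty accumulator, and the same pattern works for filtering (reusing the mapping and filtering functions above):

```clojure
;; start from a non-empty accumulator
(reduce (mapping inc) [:seed] [1 2 3])
;; => [:seed 2 3 4]

;; the same pattern with filtering
(reduce (filtering even?) [] (range 10))
;; => [0 2 4 6 8]
```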

Parameterize the sequence builder

Let’s take the function which builds the sequence as an argument! How would that look?


(defn trans-mapping
  "Take f as an argument and return a function which takes
  a reducing function xf (e.g. conj) and applies f to each
  value before handing it to xf"
  [f]
  (fn [xf]
    (fn [acc v]
      (xf acc (f v)))))


(defn trans-filtering [f]
  (fn [xf]
    (fn [acc v]
      (if (f v)
        (xf acc v)
        acc))))

Let’s take a minute to see exactly what is happening here. trans-mapping takes a function, just like map would. But instead of returning a value, it returns another function which can take any sequence-building function, like conj! That’s promising, right? Now our initial value and the sequence-building operation are both independent of the actual transformation. Let’s update our test code from earlier.

;; Let's build a mapping which squares its input value
(def rfn (trans-mapping #(* % %)))

(reduce (rfn conj)
        []
        data)

But if you look very closely, you realise that conj isn’t really special here. Nor is the [] initial value. We can just as easily substitute both of them with an operation which combines values and a matching initial value to start from. So why not do this?

(reduce (rfn +)
        0
        data)

What do you know! This works too :D It adds the squares of all the numbers in the data sequence.
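We can sanity-check that claim against a plain map-then-reduce (reusing data and trans-mapping from above):

```clojure
(reduce ((trans-mapping #(* % %)) +) 0 data)
;; => 2470

;; the same sum of squares via an intermediate sequence
(reduce + 0 (map #(* % %) data))
;; => 2470
```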

Composing transformations

Consider the case of a composite reducing function

(def rfn (comp (trans-mapping inc)
               (trans-filtering even?)))

Now if we look at the rfn function, what is its arity and what are its arguments? rfn is the function returned by trans-mapping and/or trans-filtering, so it has an arity of 1. It takes the sequence-building function and returns the transformation function: (rfn +) -> transformation function.

The transformation function takes 2 arguments: the first is the accumulated result and the second is the new value.

;; So this is a very valid way of using rfn
((rfn +) 42 5)

What do you think this returns? comp applies its functions from right to left, so you might expect it to first check whether the value is even and, only if it is, increment it. In that case the call should fail at the filter stage, since 5 is odd. But if you run this code, you will see the result is 48!

This happens because rfn will first do the map and then the filter. comp is still applying right-to-left, but the way our trans-mapping function works, the filtering step ends up nested inside the mapping. In its expanded form, the comp form actually turns into

(((trans-mapping inc)
  ((trans-filtering even?) +)) 42 5)

Now the trans-filtering form takes the + fn and embeds it inside its own transformation as the xf argument, returning a function which adds to its accumulator if the input is even. That function is passed in turn to the trans-mapping form, which embeds it as its own xf and applies inc to each input before calling it. Let’s look at a couple of expanded forms.

(((fn [xf] ;; (trans-mapping inc)
    (fn [acc v]
      (xf acc (inc v))))
  (fn [acc v] ;; ((trans-filtering even?) +)
    (if (even? v)
      (+ acc v)
      acc)))
 42 5)

;; Now if we replace the xf in the trans-mapping form
;; with the function returned by trans-filtering

((fn [acc v]
   ;; the inner fn is ((trans-filtering even?) +),
   ;; used as the xf of the mapping
   ((fn [acc v]
      (if (even? v)
        (+ acc v)
        acc)) acc (inc v)))
 42 5)
;; => 48

So as you can see, due to the way the trans-mapping and trans-filtering functions nest their arguments, comp’s right-to-left application results in the transformations being applied to the inputs in left-to-right order, i.e. inc first and even? later.

Keep this in mind. This is important when you build transformation pipelines with comp and Clojure’s built-in transducers.
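With Clojure’s built-in transducers the same left-to-right reading applies: in a comp pipeline, the first-listed transducer sees each input first.

```clojure
;; inc first, then keep the evens
(into [] (comp (map inc) (filter even?)) (range 10))
;; => [2 4 6 8 10]

;; keep the evens first, then inc
(into [] (comp (filter even?) (map inc)) (range 10))
;; => [1 3 5 7 9]
```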

Custom reducing functions

So far we have seen some built-in reducing functions like conj and +, but we are not restricted to those. Let’s write a string-rf function which builds a string instead of a vector.

(defn string-rf
  [^StringBuilder acc ^Character c]
  (.append acc c))

Simple enough: it takes an accumulator and a character, and calls .append on the accumulator, returning the StringBuilder.

Now let’s see how we can use such a reducing function.

(def xform
  (comp
    (map int)
    (filter odd?)
    (map char)))

(str (reduce (xform string-rf)
             (StringBuilder.) ;; Initialise
             "Hello world"))

Here we call reduce with the transformation applied to string-rf instead of the conj or + functions as above. string-rf plays the role of the xf argument, appending each surviving character to the sequence being built.

For the reduction to work correctly, we have to give it a sane initial value, so a new StringBuilder instance is as good a value as any. And since string-rf returns a StringBuilder, we have to call str on it to get an actual string.

A multi-arity reducing function

In the code above, the initialisation and the final realisation of the value are still too tightly coupled to the reduce call. The initial value has to be a StringBuilder, and we have to call str on the result. But if you look closely, these are requirements of the string-rf function which are leaking out! Can we do better? Of course we can… What if we let the same string-rf function tell us how to initialise the sequence? That could be achieved by having another arity which takes no args and returns a new StringBuilder, right?

(defn string-rf
  ;; this is to build a new string-builder
  ([] (StringBuilder.))
  ;; this is the construction
  ([^StringBuilder acc ^Character c]
   (.append acc c)))

Now, string-rf can also tell us how to realise / get the final value out of the transformation. In this final step, string-rf actually needs only 1 arg, right? So let’s use that arity.

(defn string-rf
  ;; this is to build a new string-builder
  ([] (StringBuilder.))
  ;; this is to convert the return value
  ([^StringBuilder s] (.toString s))
  ;; this is the construction
  ([^StringBuilder acc ^Character c]
   (.append acc c)))

And voila! What do you know, you have yourself a transducer. So let’s use it with the transduce function!

;; first arg is the transformation
;; second arg is how you build the collection
(transduce xform string-rf "Hello World")

This form is equivalent to

(let [f (xform string-rf)]
  (f (reduce f (f) "Hello World")))

If you notice, above I used the built-in map and filter to create the xform instead of trans-mapping and trans-filtering. Why do you think that is?

Well, that’s because, similar to the reducing function, the transformation also needs to handle multiple arities. So let’s change our mapping and filtering functions to do that!

(defn mapping
  [f]
  (fn [xf]
    (fn
      ([] (xf))
      ([acc] (xf acc))
      ([acc v]
       (xf acc (f v))))))


(defn filtering
  [f]
  (fn [xf]
    (fn
      ([] (xf))
      ([acc] (xf acc))
      ([acc v]
       (if (f v)
         (xf acc v)
         acc)))))


;; and now the xform can look like
(def xform
  (comp (mapping int)
        (mapping inc)
        (filtering even?)
        (mapping char)))
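With these multi-arity versions, our hand-rolled transducers can be handed straight to transduce, just like the built-ins (using the three-arity string-rf from above):

```clojure
(transduce xform string-rf "abc")
;; => "bd"  (\a -> 97 -> 98 -> \b; \b -> 99, dropped; \c -> 100 -> \d)
```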

Now if you peer into the implementations of Clojure’s functions that return transducers, you will see something exactly like the functions above! And that’s all there is to transducers, really. The rest of the magic is, as always, in how we, the programmers, decide to use it!

Finally, let me leave you with a simple template for any custom transducers that you may want to write.

(defn template-transducer [xf]
  (fn
    ([] (xf))                            ;; Init
    ([result] (xf result))               ;; Completion
    ([result input] (xf result input)))) ;; Step: process one input
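Note that, as written, template-transducer is itself a complete (if do-nothing) transducer: every arity just forwards to xf, so it can be dropped into a pipeline as-is.

```clojure
(transduce template-transducer conj [] [1 2 3])
;; => [1 2 3]

(into [] (comp (map inc) template-transducer) [1 2 3])
;; => [2 3 4]
```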

Given the power of Clojure’s built-in functions, it’s unlikely that you will ever need to write your own transducers, but it’s always fun to know that you can!

Use-cases for transducers

By now, hopefully you understand what goes on behind the scenes of transducers. But understanding where to use them and where not to use them is a skill we can only build with time, as we see and write more and more code. It’s always nice to start with some pointers, though, and there’s no better place than Clojure’s own docs to get you started.
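For example, one of the standard motivations: a threaded ->> pipeline of map and filter builds an intermediate lazy sequence at each step, while the transducer version fuses the whole pipeline into a single pass.

```clojure
;; two intermediate sequences
(->> (range 1000)
     (map inc)
     (filter even?)
     (reduce +))
;; => 250500

;; one pass, no intermediate sequences
(transduce (comp (map inc) (filter even?)) + (range 1000))
;; => 250500
```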

Credits

  • Transducers talk by Rich Hickey
  • Pivotshare tutorial by Tim Baldridge