
I Snuck Clojure Into Work

Date: [2025-08-02 Sat]

A few months ago, I received an interesting task to retrieve some user data that was missing from one of our web applications at work. In order to retrieve the data we had to build an "off the side of my desk," one-off tool to do the job. Because this tool wasn't going to be a critical piece of software that absolutely everyone needed to know how to support, I could use whatever I wanted to build it with.

The project requirements

The upstream authentication server where our missing data lives is old. So old that it uses the SOAP protocol and requires structured XML POST bodies. To solve my problem, I needed to keep a couple of considerations in mind:

  1. Use as few third-party dependencies as possible and stick to a trusty set of core tools.
  2. Have some existing support for XML parsing and serialization.

Since we are a Java shop in my ministry and Java has first-class support for XML, the choice was easy: use Java... er, Clojure.

While I have respect for our Java stack and admire how well it works, I don't want to write more Java unless I have to. I've been looking for an excuse to write some "real" Clojure for a long time.

I received a large CSV file that contained rows of user IDs with columns of missing data. My tool needed to consume that CSV, construct a call-out to the upstream web service, receive the response XML with the complete data, and generate a new CSV output with that data. In the rare but expected case where a user ID is bogus, we need to retain the original user IDs and insert them back into the CSV as not found.

Moreover, the SOAP web service can only handle 100 batched requests at a time, so I had to chunk my requests and make sure it all stays concurrent.

Anything that I write for the ministry is highly visible to my peers, so I needed to be able to at least explain why I chose Clojure. Aside from the fact that it satisfies the two requirements stated above, it's also really easy to write, is data-driven, and uses immutable data by design. It also has excellent concurrency, is nearly as fast as Java, and runs on the comfortably familiar Java Virtual Machine. So, without asking for permission, and being ready to beg for forgiveness, I started writing.

Implementation

The first thing I needed was some way to compose an XML SOAP body. I also needed a way to de-serialize XML into Clojure key-value maps. Thankfully Clojure has me covered with clojure/data.xml.

(ns soap-sync.xml-soap
  (:require [clojure.data.xml :refer [element emit-str parse-str]]))

;; Re-export these for use in core.clj
(defn to-string [xml-object] (emit-str xml-object))
(defn to-xml-map [xml-string] (parse-str xml-string))

;; Use-case specific XML generator for a single account detail item request
(defn create-account-detail-item [ids]
  (let [[guid user-id] ids]
    (element :AccountDetailListRequestItem {}
             (cond (seq guid) (element :userGuid {} guid)
                   (seq user-id) (element :userId {} user-id)
                   :else (throw (ex-info "GUID or userId must be supplied" {:ids ids})))
             (element :accountTypeCode {} "Business"))))

;; Generate a "list" of account-detail items for a batched request
(defn create-account-detail-list [ids]
  (let [xmlns (System/getenv "ACTION_ROOT")]
    (element :getAccountDetailList {:xmlns xmlns}
             (element :accountDetailListRequest {}
                      (element :onlineServiceId {}
                               (System/getenv "SERVICE_ID"))
                      (element :requesterAccountTypeCode {} "Internal")
                      (element :requesterUserGuid {}
                               (System/getenv "ACCOUNT_GUID"))
                      (element :requestItemList {}
                               (map create-account-detail-item ids))))))

;; Generic, multi-purpose wrappers
(defn create-soap-body [body-content]
  (element :soapenv:Body {} body-content))

(defn create-soap-envelope [body]
  (element :soapenv:Envelope {:xmlns:soapenv
                              "http://schemas.xmlsoap.org/soap/envelope/"}
           (element :soapenv:Header)
           body))

The data.xml library comes with an element function that will return a map of a single branch of an XML AST (abstract syntax tree) that is ready for serialization. My intention was to make it really simple to re-use these functions to perform any kind of SOAP request later on.

The response data from the upstream web service is very verbose and filled with a lot of useless and redundant XML element trees. To access the child elements, you could know exactly where you're going and path down the tree with something like (first (:content (nth (:content xml-tree) 2))). But what if things aren't in perfect order, or the order changes? Since I am a web developer first, I opted to write some web-familiar functions to help me out.

(defn get-elements-by-tag-name
  "Get all elements by tag name."
  [tag-name xml-map]
  (filter #(= (:tag %) tag-name)
          (tree-seq #(seq? (:content %)) #(:content %) xml-map)))

(defn get-element-by-tag-name
  "Get the first element by tag name."
  [tag-name xml-map]
  (first (get-elements-by-tag-name tag-name xml-map)))

(defn get-tag-value
  "Many objects have a nested 'value' tag that actually holds the tag value.
  This makes extracting that nested value feel less repetitive."
  [xml-tag]
  (->> xml-tag :content first :content first))

The get-elements-by-tag-name function does most of the heavy lifting here with the help of the incredible tree-seq function. It traverses the entire syntax tree and returns a flattened sequence of tree branches that match a specific XML tag name.
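
To make the traversal concrete, here is a minimal sketch that runs the same tree-seq walk over plain maps. The sample-tree data and the elements-by-tag name are made up for illustration; real parsed elements are data.xml records that expose the same :tag and :content keys. The content is built with list because the branch predicate checks seq?, which is false for vectors.

```clojure
;; Plain maps standing in for parsed data.xml elements (illustrative only).
(def sample-tree
  {:tag :response
   :content (list {:tag :item
                   :content (list {:tag :userId
                                   :content (list {:tag :value
                                                   :content (list "alice")})})}
                  {:tag :item
                   :content (list {:tag :userId
                                   :content (list {:tag :value
                                                   :content (list "bob")})})})})

;; Same shape as get-elements-by-tag-name: tree-seq flattens the tree
;; depth-first, then filter keeps the nodes whose :tag matches.
(defn elements-by-tag [tag-name xml-map]
  (filter #(= (:tag %) tag-name)
          (tree-seq #(seq? (:content %)) :content xml-map)))

;; Unwrap the nested "value" element the same way get-tag-value does.
(def user-ids
  (map #(->> % :content first :content first)
       (elements-by-tag :userId sample-tree)))
;; user-ids => ("alice" "bob")
```

String leaves are visited too, but (:tag "alice") is nil, so they never match a tag name and simply fall out of the filter.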

In core.clj, I use these functions to pull the data I need out of the massive response body and disregard the rest. Here are a couple of the functions I use to map over <AccountDetailListResponseItem /> elements and construct a CSV list of data rows.

(defn get-detail-list-presponse-item [[response, ids]]
  (map vector
       (xs/get-elements-by-tag-name :AccountDetailListResponseItem
                                    (xs/to-xml-map response))
       ids))

(defn extract-csv-row [[detail-list-item [guid bceid]]]
  (let [individual (first (xs/get-elements-by-tag-name
                            :individualIdentity detail-list-item))]
    [(->> detail-list-item
          (xs/get-element-by-tag-name :userId)
          xs/get-tag-value)
     (->> detail-list-item
          (xs/get-element-by-tag-name :guid)
          xs/get-tag-value)
     (->> individual
          (xs/get-element-by-tag-name :firstname)
          xs/get-tag-value)
     (->> individual
          (xs/get-element-by-tag-name :surname)
          xs/get-tag-value)
     (->> detail-list-item (xs/get-element-by-tag-name :failureCode)
          :content first)
     bceid
     guid]))
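
One detail worth spelling out: (map vector items ids) pairs each response item with the IDs that were sent for it, which is how the source IDs survive for the not-found rows. The pairing is purely positional, so it only works if the service returns one item per requested ID, in request order. A small sketch with made-up stand-in data (source-columns is a hypothetical helper, not part of the tool):

```clojure
;; Made-up stand-ins for parsed response items and the (guid user-id)
;; pairs that were sent in the request (not real service data).
(def response-items
  [{:tag :AccountDetailListResponseItem :content (list "found")}
   {:tag :AccountDetailListResponseItem :content (list "failed")}])

(def requested-ids
  [(list "guid-1" "user-1")
   (list "" "bogus-user")])

;; Pair each response item with the IDs at the same position.
(def paired (map vector response-items requested-ids))

;; Destructure a pair the same way extract-csv-row does.
(defn source-columns [[_item [guid bceid]]]
  [bceid guid])

(def source-rows (map source-columns paired))
;; source-rows => (["user-1" "guid-1"] ["bogus-user" ""])
```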

I also needed some way to consume and output a CSV file. The clojure/data.csv library had me covered, and the work to write the CSV module was minimal.

(ns soap-sync.csv
  (:require [clojure.data.csv :as csv]
            [clojure.java.io :as io]))

(defn get-input-data [& resource-path]
  (let [input-file (or (when resource-path
                         (or (io/resource (first resource-path))
                             (first resource-path)))
                       (System/getenv "INPUT_CSV"))
        data (with-open [reader (io/reader input-file)]
               (doall
                (csv/read-csv reader)))]
    (map zipmap
         (->> (first data)
              repeat)
         (rest data))))

(defn write-output-data [csv-data & out-path]
  (with-open [writer (io/writer (or (first out-path) "out.csv"))]
    (csv/write-csv writer csv-data)))
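
The (map zipmap (repeat header) rows) idiom at the end of get-input-data turns each CSV row into a map keyed by the header row, which is why later code can look values up by column name. A minimal sketch with made-up rows (rows->maps is a hypothetical name, not part of the module):

```clojure
;; Made-up parsed CSV data: the first row is the header.
(def parsed-rows
  [["GUID" "BCeID"]
   ["ABC123" "jsmith"]
   ["" "mjones"]])

;; Zip the header onto every data row to get one map per row.
(defn rows->maps [[header & rows]]
  (map zipmap (repeat header) rows))

(def row-maps (rows->maps parsed-rows))
;; (first row-maps) => {"GUID" "ABC123", "BCeID" "jsmith"}
```

Because the keys are the header strings themselves, a lookup such as (get row "BCeID") depends on the input file using exactly those column names.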

Putting it all together

The end result was a 90-line core.clj file that handled the entire process: get the input data, break it into chunks of 100, prepare the chunks to be used as POST XML bodies, process each chunk with clojure/core.async, extract the useful data from the chunked responses, flatten them, then generate a CSV file.

(ns soap-sync.core
  (:gen-class)
  (:require [clojure.core.async :refer [chan thread <!! >!! close!]]
            [soap-sync.csv :as csv]
            [soap-sync.xml-soap :as xs]
            [soap-sync.utils :refer [chunk-rows]]
            [clj-http.client :as client]))

(defn send-soap-request [action soap-envelope]
  (let [url (str (System/getenv "SERVICE_SITE")
                 "/webservices/client/V10/BCeIDService.asmx?WSDL")
        username (System/getenv "ACCOUNT_NAME")
        password (System/getenv "ACCOUNT_PASSWORD")
        headers {"SOAPAction"
                 (str (System/getenv "ACTION_ROOT") action)}]
    (client/post url
                 {:basic-auth [username password]
                  :headers headers
                  :body soap-envelope
                  :content-type "text/xml;charset=utf-8"
                  :throw-exceptions false})))

(defn extract-csv-row [[detail-list-item [guid bceid]]]
  (let [individual (first (xs/get-elements-by-tag-name
                            :individualIdentity detail-list-item))]
    [(->> detail-list-item
          (xs/get-element-by-tag-name :userId)
          xs/get-tag-value)
     (->> detail-list-item
          (xs/get-element-by-tag-name :guid)
          xs/get-tag-value)
     (->> individual
          (xs/get-element-by-tag-name :firstname)
          xs/get-tag-value)
     (->> individual
          (xs/get-element-by-tag-name :surname)
          xs/get-tag-value)
     (->> detail-list-item (xs/get-element-by-tag-name :failureCode)
          :content first)
     bceid
     guid]))

(defn prepare-callout [chunk]
  (let [ids (map #(list (get % "GUID")
                        (get % "BCeID")) chunk)]
    (-> ids
        xs/create-account-detail-list
        xs/create-soap-body
        xs/create-soap-envelope
        xs/to-string
        (vector ids))))

(defn process-api-callouts [callouts]
  (let [channel (chan)]
    (thread (doseq [[envelope ids] callouts]
              (>!! channel (vector (send-soap-request "getAccountDetailList" envelope)
                                   ids)))
            (close! channel))
    (loop [responses []
           callout 0]
      (if-let [[res ids] (<!! channel)]
        (do
          (println "Received response from callout " callout)
          (println "API Response Code: " (:status res))
          (recur (if-not (= 200 (:status res))
                   responses
                   (conj responses (vector (:body res) ids)))
                 (inc callout)))
        responses))))

(defn get-detail-list-presponse-item [[response, ids]]
  (map vector
       (xs/get-elements-by-tag-name :AccountDetailListResponseItem
                                    (xs/to-xml-map response))
       ids))

(defn generate-output-data [response-body]
  (into [] (concat
             [["User ID" "GUID" "First Name" "Last Name" "Failure Code"
               "Source BCeID" "Source GUID"]]
             (map extract-csv-row response-body))))

(defn -main []
  (let [api-responses (->> (chunk-rows 100 (csv/get-input-data))
                           (map prepare-callout)
                           process-api-callouts
                           (mapcat get-detail-list-presponse-item))]
    (csv/write-output-data (generate-output-data api-responses))))
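
The chunk-rows helper comes from soap-sync.utils, which isn't shown here. A minimal sketch of what such a helper could look like, assuming it is a thin wrapper around partition-all (the actual implementation in the repository may differ):

```clojure
;; Hypothetical stand-in for soap-sync.utils/chunk-rows.
;; partition-all keeps the final, smaller chunk instead of dropping it.
(defn chunk-rows [size rows]
  (partition-all size rows))

(def chunks (chunk-rows 100 (range 250)))
;; (map count chunks) => (100 100 50)
```

The choice of partition-all over partition matters here: partition would silently drop the last 50 rows in this example.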

Unfortunately, this is such a niche use case that I doubt this tool will be useful for anybody but myself and my colleagues. If you're curious or want to steal pieces of it for your own project, the source code is on GitHub: https://github.com/bcgov/bceid-soap-sync

Closing thoughts

I really enjoy REPL (read-eval-print loop) driven development. It made the XML syntax tree easy to traverse in Emacs. You just have to evaluate a function call, then use cider-inspect-last-result and go play in the return value. I could also traverse the responses of my helper functions and figure out which adjustments I needed to make. Clojure really made this project go as smoothly as it could have.

In terms of CI/CD, with cloverage, I could write tests and get pretty cool test coverage reports.

I really enjoyed working with Clojure and hope I get another good reason to use it.