Playing with Chain.jl
Introduction
DataFrames.jl was designed to support chaining of operations well. For a long time my favorite package that helped with this was Pipe.jl. It is very easy to understand how it works and is clear visually.
There are many alternative packages that support chaining, but they all required much higher mental effort from the developer to master them. However, in November 2020 Chain.jl was created and it really is as simple as Pipe.jl, but at the same time more powerful. In this post I briefly investigate what it has to offer.
This post was written with Julia 1.5.3, Chain.jl 0.4.2, and Combinatorics 1.0.2.
Experimenting with Chain.jl
The Chain.jl README.md does a really great job of explaining why and how of the package so I refer you to the website to read the details. In short it introduces:
- a macro
@chain
and an extra annotation, - an
@aside
annotation that can be used inside a@chain
block to produce side effects, _
is used to signal where the value of the previous expression should be inserted (unless it is a first argument in which case_
can be omitted).
Let me give one exemplary usage of the @chain
macro. Assume we have a 6
element set and want to get all permutations of its 4 element subsets (if you
ever try to implement a Mastermind solver you might need it).
Here is how you can generate it using Chain.jl:
julia> using Chain
julia> using Combinatorics
julia> @chain 1:6 begin
combinations(4)
@aside println("# of combinations: ", length(_))
collect # we could skip this step
@. permutations
mapreduce(collect, vcat, _)
end
# of combinations: 15
360-element Array{Array{Int64,1},1}:
[1, 2, 3, 4]
[1, 2, 4, 3]
[1, 3, 2, 4]
⋮
[6, 4, 5, 3]
[6, 5, 3, 4]
[6, 5, 4, 3]
Note that in the first call combinations(4)
Chain.jl has put _
implicitly
as the first argument of the combinations
function, so the actual call is
combinations(1:6, 4)
.
In line collect
result of combinations(1:6, 4)
is passed as a single
argument to collect
(you do not need to write parentheses). Similarly in line
@. permutations
we use the same pattern but this time we broadcast the
permutations
function over a collection passed from the previous step of the
chain. If we want to pass other than the first argument then _
is used as
shown in the mapreduce(collect, vcat, _)
line.
In the second line @aside
is executed but is ignored in the pipeline. Note
that it would be tempting to write
@aside println("# of combinations: ", length)
instead of
@aside println("# of combinations: ", length(_))
The reason is that length
takes only one argument. However, in this case a
call to length
is nested so you have to pass _
explicitly.
You can see that Chain.jl has two key features:
- everything is wrapped in
begin
-end
block, - there is no visual separator (like standard
|>
in e.g. Pipe.jl) signaling an end of the expression.
Many people will find that it exactly fits their needs, but here is an alternative syntax that I have found to be potentially usable with Chain.jl:
@chain 1:6 (
combinations(4);
@aside println("# of combinations: ", length(_));
collect; # we could skip it
@. permutations;
mapreduce(collect, vcat, _);
)
which produces the same result.
The difference here is that I replace begin
-end
block with (
and )
, so
it is a bit less typing. In this case one has to separate the expressions with
;
. I also added ;
at the end of the last expression, though it is not
strictly necessary, as in this way you can safely add/remove lines in @chain
without changing the remaining lines.
If having to add ;
in this style is good or bad is a matter of taste. On one
hand it adds typing, but on the other hand it clearly shows the end of one
expression (which in begin
-end
style is not explicit, sometimes it might
be confusing if someone needed to add a line break in an expression, and e.g.
indentation should be used to signal line continuation then).
Also (
-)
style has an additional benefit that in some editors it is easy to
select the code block enclosed in the parentheses if you would need to
copy-paste the contents of @chain
.
Conclusions
I think Chain.jl is excellent. If you like chaining function calls in your code I really recommend you to check it out.