Introduction

It is a well known performance recommendation in the Julia language that avoiding allocations matters and using immutable objects is a good practice. In this post, in the context of creation of agent based models, I want to show two examples of a toy codes that highlight the key aspects of this issue.

All the examples are tested under Julia 1.5.2. The Setfield.jl package version used is 0.7.

Avoiding allocations

Consider the following code:

using Statistics

struct AgentI
    loc::Int
end

mutable struct AgentM
    loc::Int
end

fI(n) = mean(x -> x.loc, [AgentI(i) for i in 1:n])
fM(n) = mean(x -> x.loc, [AgentM(i) for i in 1:n])

The only difference between functions fI and fM is that they respectively work on immutable and mutable objects. I use mean for aggeration to make sure that the compiler does not optimize out too much.

Let us benchmark these codes:

julia> fI(1); fM(1); GC.gc(); # force compilation and collect garbage

julia> @time fI(10^8)
  0.385592 seconds (2 allocations: 762.940 MiB, 0.73% gc time)
5.00000005e7

julia> @time GC.gc()
  0.091111 seconds, 100.00% gc time

julia> @time fM(10^8)
  3.498990 seconds (100.00 M allocations: 2.235 GiB, 60.79% gc time)
5.00000005e7

julia> @time GC.gc()
  0.295949 seconds, 100.00% gc time

We see that working with mutable objects was ten times slower and also it lead to a higher garbage collection cost after fM terminated. In particular note that over 60% of run time of fM was spent on garbage collection.

The reason is the following. Allocating objects has three costs:

cost of actual allocation
cost of having to store and use object references instead of objects directly in the container
cost of cleaning-up (i.e. running garbage collector that frees unused memory)

The cost of data movement

A huge advantage of mutable objects is, well, that they are mutable. This makes it simple to update only their selected fields. One might wonder if the cost of having to create the immutable object anew each time in such cases is important. Let us investigate. First we set up our experiment:

using Statistics
using Setfield

struct Agent2I
    loc::Int
    junk::NTuple{100, Int}
end

mutable struct Agent2M
    loc::Int
    junk::NTuple{100, Int}
end

const REF_TUP = ntuple(i -> 0, 100)

function gI(n)
    x = [Agent2I(i, REF_TUP) for i in 1:n]
    for i in 1:n
        xi = x[i]
        x[i] = @set xi.loc += 1
    end
    return mean(x -> x.loc, x)
end

function gM(n)
    x = [Agent2M(i, REF_TUP) for i in 1:n]
    for i in 1:n
        x[i].loc += 1
    end
    return mean(x -> x.loc, x)
end

Note that we used REF_TUP to make the Agent2I object have a larger memory footprint. The size of agent state I used is usually more than enough in practice. As you can see in the code the Setfield.jl package makes it easy to update only selected field of an immutable object using the @set macro.

It is time to start benchmarking:

julia> gI(1); gM(1); GC.gc();

julia> @time gI(10^7)
  2.714677 seconds (2 allocations: 7.525 GiB, 0.10% gc time)
5.0000015e6

julia> @time GC.gc()
  0.277477 seconds, 100.00% gc time

julia> @time gM(10^7)
  7.287978 seconds (10.00 M allocations: 7.674 GiB, 59.53% gc time)
5.0000015e6

julia> @time GC.gc()
  0.965113 seconds, 100.00% gc time

However, as rightly pointed out by George Datseris this benchmark is incorrect, as it mixes cost of creation and operation on the objects. Therefore we should not pay to much attention to it (I am leaving it as it was originally posted).

The correct benchmark is as follows:

julia> using BenchmarkTools

julia> function gIx(x)
           for i in 1:length(x)
               xi = x[i]
               x[i] = @set xi.loc += 1
           end
           return mean(x -> x.loc, x)
       end
gIx (generic function with 1 method)

julia> function gMx(x)
           for i in 1:length(x)
               x[i].loc += 1
           end
           return mean(x -> x.loc, x)
       end
gMx (generic function with 1 method)

julia> x = [Agent2I(i, REF_TUP) for i in 1:10^6];

julia> @benchmark gIx($x)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     24.361 ms (0.00% GC)
  median time:      49.750 ms (0.00% GC)
  mean time:        48.954 ms (0.00% GC)
  maximum time:     50.191 ms (0.00% GC)
  --------------
  samples:          103
  evals/sample:     1

julia> x = [Agent2M(i, REF_TUP) for i in 1:10^6];

julia> @benchmark gMx($x)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     23.198 ms (0.00% GC)
  median time:      44.621 ms (0.00% GC)
  mean time:        44.114 ms (0.00% GC)
  maximum time:     45.086 ms (0.00% GC)
  --------------
  samples:          114
  evals/sample:     1

Note that this time we benchmark only mutation, as object creation was moved out of the function. As you can see (and this is what I initially expected by creating a very wite agent representation) using a mutable agent is a bit faster. The reason is that there is much less data movement in this case.

What would happen if agent had a narrower representation. Let us see by making the tuple have only 10 elements, and not 100 as previously (start a fresh Julia session):

julia> using Statistics

julia> using Setfield

julia> struct Agent2I
           loc::Int
           junk::NTuple{10, Int}
       end

julia> mutable struct Agent2M
           loc::Int
           junk::NTuple{10, Int}
       end

julia> const REF_TUP = ntuple(i -> 0, 10)
(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)

julia> using BenchmarkTools

julia> function gIx(x)
           for i in 1:length(x)
               xi = x[i]
               x[i] = @set xi.loc += 1
           end
           return mean(x -> x.loc, x)
       end
gIx (generic function with 1 method)

julia> function gMx(x)
           for i in 1:length(x)
               x[i].loc += 1
           end
           return mean(x -> x.loc, x)
       end
gMx (generic function with 1 method)

julia> x = [Agent2I(i, REF_TUP) for i in 1:10^6];

julia> @benchmark gIx($x)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     12.480 ms (0.00% GC)
  median time:      16.633 ms (0.00% GC)
  mean time:        16.645 ms (0.00% GC)
  maximum time:     17.387 ms (0.00% GC)
  --------------
  samples:          301
  evals/sample:     1

julia> x = [Agent2M(i, REF_TUP) for i in 1:10^6];

julia> @benchmark gMx($x)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     12.923 ms (0.00% GC)
  median time:      18.289 ms (0.00% GC)
  mean time:        18.201 ms (0.00% GC)
  maximum time:     20.137 ms (0.00% GC)
  --------------
  samples:          275
  evals/sample:     1

And we see that now immutable approach is a bit faster as we move less data.

Conclusions

In general working with mutable code is easier than working with immutable code. However, if you are working with performance critical code, avoiding using mutable objects is one of the first recommendations to consider when trying to optimize it. Also Setfield.jl (and upcoming Accessors.jl) make working with immutable objects relatively smooth.