How much do collections of allocated objects cost?
Introduction
In my recent comment on StackOverflow I have said that using a vector of vectors is: a) slow, b) uses more memory than needed, and c) puts much more stress on garbage collection. Having read it one of my students asked me to expand on this issue. In this post I want to give you some examples that were designed to clarify this issue.
The post was tested under Julia 1.6.1 on Linux with a machine having 16 GB of RAM (the last point affects the frequency of triggering GC).
In the post we will compare the same operations for vector of vectors and vector of tuples defined using the following functions:
test1(n) =[rand(2) for _ in 1:n]
test2(n) =[(rand(), rand()) for _ in 1:n]
The difference is that test1
creates a vector of references to dynamically
allocated objects, while in test2
the tuples are stored directly in the vector.
Let us test all three claims. In the comparisons I use a fresh session in each code block (as the examples given use a lot of memory). This means that the timings will include compilation time, but the size of the computations is large enough so that this is relatively negligible.
Examples
Start with a vector of vectors:
julia> test1(n) =[rand(2) for _ in 1:n]
test1 (generic function with 1 method)
julia> @time t1 = test1(10^8);
21.907014 seconds (100.00 M allocations: 9.686 GiB, 67.93% gc time)
julia> @time sum(x -> x[1], t1);
0.768779 seconds (179.18 k allocations: 11.105 MiB, 7.44% compilation time)
julia> @time Base.summarysize(test1(10^7))
262.832256 seconds (50.06 M allocations: 2.864 GiB, 4.58% gc time, 0.02% compilation time)
640000040
julia> @time GC.gc()
4.631474 seconds (100.00% gc time)
julia> @time GC.gc()
2.647404 seconds (100.00% gc time)
julia> @time GC.gc(false)
0.004821 seconds (99.89% gc time)
julia> @time GC.gc(false)
0.005526 seconds (99.89% gc time)
And now vector of tuples (fresh Julia session):
julia> test2(n) =[(rand(), rand()) for _ in 1:n]
test2 (generic function with 1 method)
julia> @time t2 = test2(10^8);
1.458052 seconds (25 allocations: 1.490 GiB, 0.36% gc time)
julia> @time sum(x -> x[1], t2);
0.164072 seconds (178.23 k allocations: 11.036 MiB, 35.07% compilation time)
julia> @time Base.summarysize(test2(10^7))
0.190496 seconds (24.27 k allocations: 153.937 MiB, 3.25% gc time, 17.60% compilation time)
160000040
julia> @time GC.gc()
0.070696 seconds (99.99% gc time)
julia> @time GC.gc()
0.102222 seconds (99.99% gc time)
julia> @time GC.gc(false)
0.000517 seconds (99.13% gc time)
julia> @time GC.gc(false)
0.000523 seconds (98.97% gc time)
As you can see:
- creation of vector of vectors is much slower; in particular a lot of small allocations happens (which is expensive) and in total also more memory is allocated.
- a simple aggregation with
sum
is also slower because for vector of vectors we have to go through references to objects (which takes time) which also means that this is less CPU cache friendly. - With
Base.summarysize
we can check that using vectors also uses up much more memory; also as a side issue we learn that functions likeBase.summarysize
which traverse the tree of object references are much, much slower for a vector of vectors. - Finally both full sweep
GC.gc()
and incremental sweepGC.gc(false)
are slower with vector of vectors; this is especially visible for full sweep case (fortunately it is triggered less often in normal usage). The important thing to note here is that for garbage collection time to be affected it is enough that the vector of vectors is somewhere in the memory; it does not have to be used in some operation you do.
Conclusions
The conclusion is something that seasoned Julia developers know very well: avoid having many small allocated objects in your Julia programs. Having read this post I hope you now have a better understanding what aspects of the performance of your code can be affected when you have to use such data.
Before I finish let me add that collections of any mutable objects will be
affected by the same issue, and even some special immutable objects (like
String
, or to some extent e.g. Symbol
) can cause issues like presented
in this post.