Introduction

Julia users often want to squeeze-out maximum performance from their programs. In the search for efficiency, they soon discover the vec and reshape functions that allow for changing of the shape of the input array without copying data. In this post I want do discuss how these functions work and share with you the rules I use when deciding if I want to use them.

The post was written under Julia 1.7.2.

The contract

When you first learn some function you must look up its contract in its docstring. Let us check vec and reshape (I abbreviated the docstrings to focus on the key parts):

help?> vec
  vec(a::AbstractArray) -> AbstractVector

  Reshape the array a as a one-dimensional column vector.
  Return a if it is already an AbstractVector.
  The resulting array shares the same underlying data as a,
  so it will only be mutable if a is mutable,
  in which case modifying one will also modify the other.

help?> reshape
search: reshape promote_shape

  reshape(A, dims...) -> AbstractArray

  Return an array with the same data as A,
  but with different dimension sizes or number of dimensions.
  The two arrays share the same underlying data, so that the result is mutable
  if and only if A is mutable,
  and setting elements of one alters the values of the other.

In short, both functions allow you to change the shape of some array without copying of the data. vec always returns a vector, while reshape is more flexible and allows you to produce an array of any dimension.

Let me show you some use cases of these functions. First, assume I want to produce a cartesian product of two collections:

julia> collect(Iterators.product('a':'b', 1:3))
2×3 Matrix{Tuple{Char, Int64}}:
 ('a', 1)  ('a', 2)  ('a', 3)
 ('b', 1)  ('b', 2)  ('b', 3)

By default the collect function produced me a matrix. If for some reason I needed a vector instead I could write:

julia> vec(collect(Iterators.product('a':'b', 1:3)))
6-element Vector{Tuple{Char, Int64}}:
 ('a', 1)
 ('b', 1)
 ('a', 2)
 ('b', 2)
 ('a', 3)
 ('b', 3)

The important benefit of this operation is that vec is non-copying so adding this step is efficient. Let me give you another example, this time using broadcasting:

julia> string.(['a' 'b'], 1:3)
3×2 Matrix{String}:
 "a1"  "b1"
 "a2"  "b2"
 "a3"  "b3"

julia> vec(string.(['a' 'b'], 1:3))
6-element Vector{String}:
 "a1"
 "a2"
 "a3"
 "b1"
 "b2"
 "b3"

Now let us have a look at reshape:

julia> reshape(1:6, 2, 3)
2×3 reshape(::UnitRange{Int64}, 2, 3) with eltype Int64:
 1  3  5
 2  4  6

Why reshape would be useful? Consider for example a simple function changing pairs of consecutive elements of a vector into a tuple. One of the ways (for sure not the only way) to implement this would be:

julia> totuples(x::AbstractVector) = Tuple.(eachcol(reshape(x, 2, :)))
totuples (generic function with 1 method)

julia> totuples(1:6)
3-element Vector{Tuple{Int64, Int64}}:
 (1, 2)
 (3, 4)
 (5, 6)

The dangers

While the vec and reshape functions can be useful there are some risks of using them. Let me discuss some common pitfalls.

The first is that when you reshape a collection you may leave a permanent mark in the source that it was used in reshape (even though reshape has no ! as its suffix). This can lead to hard to catch bugs. Let us check the following code:

julia> x = [1, 2, 3, 4]
4-element Vector{Int64}:
 1
 2
 3
 4

julia> totuples(x)
2-element Vector{Tuple{Int64, Int64}}:
 (1, 2)
 (3, 4)

julia> push!(x, 5)
ERROR: cannot resize array with shared data

As you can see, although the use of reshape was done in the totuples function and the reshaped matrix we created there with reshape(x, 2, :) is already out of scope the fact that we used reshape on x permanently disallows its resizing.

The second risk is that vec and reshape may, or may not, create a new object, as they might just return a source object. Let us check the following code that extends the original totuples function to accept any AbstractArray. In the code I write y = vec(x), but the same behavior would be present with y = reshape(x, :).

julia> function totuples2(x::AbstractArray)
           y = vec(x)
           isodd(length(x)) && push!(y, last(y))
           return totuples(y)
       end
totuples2 (generic function with 1 method)

julia> x = [1;;]
1×1 Matrix{Int64}:
 1

julia> totuples2(x)
1-element Vector{Tuple{Int64, Int64}}:
 (1, 1)

julia> x
1×1 Matrix{Int64}:
 1

julia> x = [1]
1-element Vector{Int64}:
 1

julia> totuples2(x)
1-element Vector{Tuple{Int64, Int64}}:
 (1, 1)

julia> x
2-element Vector{Int64}:
 1
 1

As you can see our totuples2 function left [1;;] unchanged, since it is a matrix, but [1] was updated. The reason is that vec(x) in this case just returned its argument.

Finally, as an another application of the same rule, note that that the vec function (and similarly reshape when reshaping to a vector), may or may not produce a vector that can be resized:

julia> totuples2([1 2; 3 4])
2-element Vector{Tuple{Int64, Int64}}:
 (1, 3)
 (2, 4)

julia> totuples2(reshape(1:4, 2, 2))
2-element Vector{Tuple{Int64, Int64}}:
 (1, 2)
 (3, 4)

julia> totuples2([1;;])
1-element Vector{Tuple{Int64, Int64}}:
 (1, 1)

julia> totuples2(reshape(1:1, 1, 1))
ERROR: MethodError: no method matching resize!(::UnitRange{Int64}, ::Int64)

As you can see, this is tricky, as the function you use, like totuples2 in our case, might throw an error only in some cases, but work in other cases. In the totuples2 case the reason is that we use push! only if the length of the collection is odd.

The point of all these examples is that code using vec or reshape can lead to hard-to-diagnose errors. The reason is that you might notice the problems caused by using them much later in the code than when you used them.

Conclusions

The vec and reshape functions are nice utilities and I use them quite often. However, to safely use them I always follow the following two rules:

  • when writing a function never use reshape on an array that is an argument of the function; the reason is, that in some cases reshape will silently leave the “no resize” mark on the source vector; if I use reshape I make sure that its source is always some short lived object from a local scope;
  • do not resize the vector produced by vec/reshape as such resizing may, or may not affect the source and keeping a mental record if this is the case is hard (and most likely readers of such code will not be able to easily know it); as a softer rule I generally avoid any mutation of the output of vec/reshape as it is not guaranteed that it will be mutable.

In short: the best uses for vec and reshape are situations when their source is a short lived object and you do not want to mutate their output in any way.