Introduction

Today I want to write about a quite tricky element of Julia's design. The issue is that it is sometimes hard to predict the element type of the output collection produced by an operation that transforms an input collection.

The description above sounds abstract, but the problem does come up in practice, so let me explain it with examples.

The post was written under Julia 1.9.0-rc1.

A basic example

Assume you have some input collection [1, 2, 3] and you want to compute the square root of each of its elements.

Let us consider three standard ways to do it:

julia> x = [1, 2, 3]
3-element Vector{Int64}:
 1
 2
 3

julia> sqrt.(x)
3-element Vector{Float64}:
 1.0
 1.4142135623730951
 1.7320508075688772

julia> map(sqrt, x)
3-element Vector{Float64}:
 1.0
 1.4142135623730951
 1.7320508075688772

julia> [sqrt(v) for v in x]
3-element Vector{Float64}:
 1.0
 1.4142135623730951
 1.7320508075688772

As you can see, in each case a proper element type, that is Float64, was determined for the returned collection.

This behavior is useful, as the user does not have to think about specifying the output element type. In fact, in combination with the identity function, this behavior can be used to conveniently narrow down the element type of a collection:

julia> y = Any[1, 2, 3]
3-element Vector{Any}:
 1
 2
 3

julia> identity.(y)
3-element Vector{Int64}:
 1
 2
 3

julia> map(identity, y)
3-element Vector{Int64}:
 1
 2
 3

julia> [v for v in y]
3-element Vector{Int64}:
 1
 2
 3

This pattern comes in handy if we have input data that does not have a known element type, but later we want to narrow the element type when processing it. One of the major benefits of such narrowing is that processing vectors of Any values is slow, so we typically want to avoid it.
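To make the narrowing behavior more concrete, here is a small sketch of what happens when the elements of the Any vector do not all share one type. The input vectors are made up for illustration. The takeaway is that narrowing is only as good as the data allows: values are kept as they are, and the element type becomes their narrowest common type.

```julia
# All elements share one concrete type: the result is narrowed to it.
a = identity.(Any[1, 2, 3])
println(eltype(a))    # Int64 (on a 64-bit machine)

# Mixed Int and Float64: the values are not promoted,
# and the element type is their common supertype.
b = identity.(Any[1, 2.0])
println(eltype(b))    # Real

# Int and missing: the element type is a small Union.
c = identity.(Any[1, missing])
println(eltype(c))    # Union{Missing, Int64}

# No useful common supertype: the element type stays Any.
d = identity.(Any[1, "a"])
println(eltype(d))    # Any
```

So if you need a concrete element type after narrowing, it is worth checking eltype of the result instead of assuming it.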

The hard case

Automatic output element type detection works nicely most of the time. Unfortunately, when we work with empty collections, it becomes hard to predict. Here is a simple example:

julia> String.([])
Any[]

julia> map(String, [])
String[]

julia> [String(v) for v in []]
Any[]

julia> string.([])
AbstractString[]

julia> map(string, [])
Any[]

julia> [string(v) for v in []]
Any[]

As you can see, broadcasting, map, and comprehension each use a different set of rules to automatically determine the produced element type. These rules of course exist and could be learned, but the point is that the issue is non-trivial.

The problem is that when you are writing production code (e.g. you are developing a package), you want to be sure what the element type of the collection you produce will be, because you often cannot know up front whether the input collection the user provides will be empty.
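For contrast, here is a quick check (not part of the examples above) of the same three approaches on a non-empty input. The vector x is made up for illustration. In this case all three produce the same element type, which is why the discrepancy is easy to miss until an empty collection shows up.

```julia
x = [1, 2, 3]

r1 = string.(x)                 # broadcasting
r2 = map(string, x)             # map
r3 = [string(v) for v in x]     # comprehension

println(typeof(r1))   # Vector{String}
println(typeof(r2))   # Vector{String}
println(typeof(r3))   # Vector{String}
```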

The solution I use

In situations where the element type of the collection produced by some transformation matters, I use comprehensions with an output element type annotation:

julia> [string(v) for v in []]
Any[]

julia> String[string(v) for v in []]
String[]

Such an annotation has an additional consequence: the produced elements are converted to the target type if needed:

julia> using Test

julia> s = ["a", GenericString("a")]
2-element Vector{AbstractString}:
 "a"
 "a"

julia> [string(v) for v in s]
2-element Vector{AbstractString}:
 "a"
 "a"

julia> typeof.([string(v) for v in s])
2-element Vector{DataType}:
 String
 GenericString

julia> String[string(v) for v in s]
2-element Vector{String}:
 "a"
 "a"

julia> typeof.(String[string(v) for v in s])
2-element Vector{DataType}:
 String
 String

Note that in this example, prefixing the comprehension with String ensured that the result of the operation has String element type and that all produced values have this type.
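As a sketch of how this could look in package code, consider the function below. The name labels is hypothetical and only serves as an illustration. The second part shows a caveat of the conversion: if a produced value cannot be converted to the annotated type, you get an error rather than a wider element type.

```julia
# Hypothetical package function: the annotation guarantees Vector{String},
# also when the input is empty.
labels(xs) = String[string(x) for x in xs]

println(typeof(labels([])))          # Vector{String}
println(labels([1, 2.5]))            # ["1", "2.5"]

# The conversion must be possible: sqrt(2) cannot be stored as an Int.
caught = try
    Int[sqrt(v) for v in [2]]
    nothing
catch err
    err
end
println(typeof(caught))              # InexactError
```

In my view failing early like this is usually preferable in production code to silently returning a collection with an unexpected element type.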

Element type widening

Let me comment on one common related operation. What if we want to initialize some container with a given value but we want its element type to be wider? This is not an artificial case - it often happens with missing (where we initialize some container with this value only to later replace missing with proper values).

Using the fill function is the first thing we might try:

julia> fill(missing, 3)
3-element Vector{Missing}:
 missing
 missing
 missing

However, the produced container has Missing element type, which is not useful if, for example, we later want to store integers in it.

One can use a comprehension annotated with a proper output element type instead:

julia> Union{Int, Missing}[missing for _ in 1:3]
3-element Vector{Union{Missing, Int64}}:
 missing
 missing
 missing

The pattern with missing is needed often enough that we have a custom function in the Missings.jl package that can be used to get the desired result more conveniently:

julia> using Missings

julia> missings(Int, 3)
3-element Vector{Union{Missing, Int64}}:
 missing
 missing
 missing
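The sketch below shows why the element type matters here: storing an integer in the vector produced by fill fails, while it works for the widened vector. The last part uses the Array constructor from Base that accepts missing as the initial value. That alternative is my addition for the case when you prefer not to depend on Missings.jl.

```julia
v = fill(missing, 3)            # Vector{Missing}

# Storing an integer fails, as 1 cannot be converted to Missing.
failed = try
    v[1] = 1
    false
catch
    true
end
println(failed)                 # true

# With the widened element type the assignment works.
w = Union{Int, Missing}[missing for _ in 1:3]
w[1] = 1
println(w[1])                   # 1

# Base-only alternative to the comprehension.
u = Vector{Union{Int, Missing}}(missing, 3)
println(typeof(u) == typeof(w)) # true
```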

Conclusions

Fortunately, in interactive use the problem of setting the proper element type of a collection does not occur often. However, when I write production programs, I always consider whether I need a comprehension with an element type annotation to ensure the type stability of my code.