Introduction

As I have recently written in this post the Julia for Data Analysis book that I am preparing is already available in MEAP preview. Manning is releasing its chapters sequentially (approximately one chapter per several weeks). I thought that when some chapter gets released in MEAP I will write on the blog about related things that are not included in the book.

This week chapter 3 was released. It includes discussion of multiple dispatch in Julia. Therefore, in this post I thought that I will comment on some specific cases of method dispatch that I found important to understand.

This post was tested under Julia 1.7.2.

Dispatch with Unions

In the Julia Manual you can read in the Methods section that:

When a function is applied to a particular tuple of arguments, the most specific method applicable to those arguments is applied.

It is important to remember that this rule also applies to Unions. Why is this relevant?

Consider that you want to add a method to a function that already has the following methods defined:

fun(x, y) = "slow"
fun(x::Int, y::Number) = "faster"

Your additional method is very fast, but expects y to be Int and x to be either Int or Float64. The first approach would be to add the following definition:

fun(x::Union{Int, Float64}, y::Int) = "fastest"

However, it will lead to dispatch ambiguity:

julia> fun(1, 1)
ERROR: MethodError: fun(::Int64, ::Int64) is ambiguous. Candidates:
  fun(x::Int64, y::Number) in Main at REPL[2]:1
  fun(x::Union{Float64, Int64}, y::Int64) in Main at REPL[3]:1
Possible fix, define
  fun(::Int64, ::Int64)

The reason is that we have two conflicting methods that could be applied to this set of arguments and neither of them is more specific. This is an expected behavior from a dispatch perspective. However, it means that you need to duplicate code and instead of fun(x::Union{Int, Float64}, y::Int) = "fastest" write two definitions:

fun(x::Float64, y::Int) = "fastest"
fun(x::Int, y::Int) = "fastest"

and now all will work as expected. The point is that using Union is not the same as writing several separate definitions “unrolling” the Union from a dispatch perspective.

However, there is one slight problem. In our case the definition of fun was quite short, but what if it were 100 LOC? Doing copy-paste would lead to a lot of code duplication (which is not recommended). One of the solutions in such cases is to use the @eval macro and do the required definitions like this:

for T in (Int, Float64)
    @eval fun(x::$T, y::Int) = "fastest"
end

Dispatch with keyword arguments

In the Julia Manual you can read in the Note on Optional and keyword Arguments section that:

Keyword arguments behave quite differently from ordinary positional arguments. In particular, they do not participate in method dispatch. Methods are dispatched based only on positional arguments, with keyword arguments processed after the matching method is identified.

The reality is that in some cases keyword arguments are taken into account when deciding about dispatch. This situation happens if you have some methods that define keyword arguments and some other methods that do not define them.

Again, let me illustrate it with an example. I define the following methods:

julia> g(x::Int) = "int"
g (generic function with 1 method)

julia> g(x; kwarg) = "any"
g (generic function with 2 methods)

julia> g(1)
"int"

julia> g(1; kwarg=nothing)
"any"

In this case the fact that one method had no keyword arguments and the other had a keyword argument influenced dispatch.

However, consider the following scenario, which is similar:

julia> h(x::Int; kwarg2) = "int"
h (generic function with 3 methods)

julia> h(x; kwarg) = "any"
h (generic function with 4 methods)

julia> h(1; kwarg=nothing)
ERROR: UndefKeywordError: keyword argument kwarg2 not assigned

This time, since the first method defined some keyword argument dispatch selected it, and next produced an error since the name of the keyword argument was incorrect.

Let me give you one more example that is related to methods design with keyword arguments. It happens in practice so it is worth remembering.

julia> f(args...) = "positional"
f (generic function with 1 method)

julia> f(; kwargs...) = "keyword"
f (generic function with 2 methods)

julia> methods(f)
# 2 methods for generic function "f":
[1] f(; kwargs...) in Main at REPL[2]:1
[2] f(args...) in Main at REPL[1]:1

As you can see these definitions indeed result in two separate methods for f. Now, a question to you is what will happen if you write f()?

As you might have predicted the keyword argument method is invoked because it is more specific in terms of positional arguments:

julia> f()
"keyword"

Conclusions

The examples I have chosen are not artificial. All of them were relevant (and lead to bugs in development) when I was working on DataFrames.jl.

I hope you will find them useful. I omit them in my book as I feel they would distract readers’ attention, but once one goes deeper into Julia they can be encountered in practice.