Introduction

Julia is a powerful language giving a lot of flexibility to developers. This flexibility was one of the factors that convinced me to use Julia as my go-to language in my work.

Expert advice

(source: https://www.facebook.com/nixcraft/photos/a.431194973560553/3484925554854131)

However, with great power comes great responsibility. New Julia users are often overwhelmed by how much there is to learn: type system, multiple dispatch, memory management, multi-threading, advanced indexing, …

This post is aimed at new users of Julia. Its goal is to present my personal perspective how I think people should try to learn and understand Julia. In short, I recommend you do not try to follow all advice from Julia experts on idiomatic Julia code as it could quickly get overly complicated. Instead write code that you can easily understand that follows few fundamental principles (like avoiding global variables or writing type stable code). The advanced tips related to writing fully generic code are mostly relevant if you want to start writing packages in Julia, which is not what new users start with.

I will try to explain you this issue by discussing how I approach 1-based indexing in Julia.

The post was written under Julia 1.8.0 and OffsetArrays.jl 1.12.7.

What is 1-based indexing?

If you read the Julia manual you learn that:

In Julia, indexing of arrays, strings, etc. is 1-based not 0-based.

This means that if you have some collection, like for example a Vector or a String its first element has index of 1. Here is an example:

julia> v = [5, 7, 13]
3-element Vector{Int64}:
  5
  7
 13

julia> v[1]
5

julia> firstindex(v)
1

julia> s = "abc"
"abc"

julia> s[1]
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

julia> firstindex(s)
1

The firstindex function in Julia informs us what is a firs index in a collection. In both cases it returned 1.

If you are a new Julia user then I advise you that you assume that all standard collections in Julia use 1 as a starting index. So, for example, the following code for calculation of the sum of some vector is in my opinion acceptable (assuming we want to accept only non-empty collections):

julia> function mysum(v::AbstractVector)
           n = length(v)
           n == 0 && throw(ArgumentError("the collection is empty"))
           s = v[1]
           for i in 2:n
               s += v[i]
           end
           return s
       end
mysum (generic function with 1 method)

julia> mysum(v)
25

Does Julia use 1-based indexing

However, when you show such a code to an experienced Julia developer you will very quickly get a feedback that it is incorrect as vectors in Julia do not have to use 1-based indexing. Instead you might get a recommendation of the code that looks, for example, like this:

julia> function mysum2(v::AbstractVector)
           n = length(v)
           n == 0 && throw(ArgumentError("the collection is empty"))
           s = v[begin]
           for i in firstindex(v)+1:lastindex(v)
               s += v[i]
           end
           return s
       end
mysum2 (generic function with 1 method)

julia> mysum2(v)
25

This code will work with any AbstractVector and correctly handle non 1-based cases. Let us check it by creating a non 1-based vector:

julia> using OffsetArrays

julia> ov = OffsetArray([5, 7, 13], -2)
3-element OffsetArray(::Vector{Int64}, -1:1) with eltype Int64 with indices -1:1:
  5
  7
 13

julia> firstindex(ov)
-1

julia> lastindex(ov)
1

julia> ov[-1]
5

Now check both functions we have defined above:

julia> mysum(ov)
ERROR: BoundsError: attempt to access 3-element
OffsetArray(::Vector{Int64}, -1:1) with eltype Int64
with indices -1:1 at index [2]

julia> mysum2(ov)
25

Indeed, as expected mysum does not work correctly, but mysum2 is OK.

So why do I say that the mysum function is acceptable. There are the following reasons:

  • In practice you are very unlikely to encounter OffsetArrays.jl, and if you will, you will know it; all standard collections in Julia are 1-based.
  • mysum function is much easier to write and reason about than mysum2. To write the mysum function you need to have only a minimal knowledge of Julia. To write mysum2 you need to know much more.
  • Even if you pass a vector that is not 1-based to mysum then in the worst case what you will get is an error, so in this case you will not get a wrong result silently. (of course in general there could be a case that you silently obtained an incorrect result, but this is not the case when you want to iterate all elements of a vector, which is the most frequent use-case).
  • Descriptions of many algorithms in books assume 1-based indexing, so it is easy to port them to code if we assume a fixed first index. Of course it is usually relatively easy to change the algorithm to support other indexing base, but it requires mental effort.

Now, if you want extra safety you can add a call to the Base.require_one_based_indexing function on top of your function like this:

julia> function mysum3(v::AbstractVector)
           Base.require_one_based_indexing(v)
           n = length(v)
           n == 0 && throw(ArgumentError("the collection is empty"))
           s = v[1]
           for i in 2:n
               s += v[i]
           end
           return s
       end
mysum3 (generic function with 2 methods)

julia> mysum3(ov)
ERROR: ArgumentError: offset arrays are not supported
but got an array with index other than 1

Now you are sure that you only work with vectors supporting 1-based indexing and using other vectors throw an informative error.

Practical examples of handling of 1-based indexing

Let me give you two practical examples what is done in some popular packages.

First let us look at the shuffle! function implementation in the Random module:

function shuffle!(r::AbstractRNG, a::AbstractArray)
    require_one_based_indexing(a)
    n = length(a)
    n <= 1 && return a # nextpow below won't work with n == 0
    @assert n <= Int64(2)^52
    mask = nextpow(2, n) - 1
    for i = n:-1:2
        (mask >> 1) == i && (mask >>= 1)
        j = 1 + rand(r, ltm52(i, mask))
        a[i], a[j] = a[j], a[i]
    end
    return a
end

Note two things. First, although in general it would be possible to re-write this code to accept non 1-based indices Julia developers decided not to do so (I assume this is to make the code easier to read). However, there is more to it. The second point is that I have never heard anyone complain that shuffle! implementation should be changed. In general, of course it could, but the reality is that during the last 10 years of Julia usage it seems that this functionality gap was not hurting anyone enough to warrant fixing this design flaw.

The second example is from DataFrames.jl. Some time ago we got a request to correctly handle non 1-based indexed vectors as columns of a DataFrame. We could have done one of two things:

  1. allow non 1-based vectors and adjust all code to support this change;
  2. disallow non 1-based vectors and throw an error if the user tries to put such a vector in a DataFrame.

Again, we could have went the more flexible path of allowing non 1-based vectors, but I preferred not to do so. The reasons are:

  1. We would need to increase the complexity of the API of DataFrames.jl and additionally our users would constantly need to think and check if some data frame uses 1-based or some other indexing;
  2. Internal code would be more complex. Any change in the DataFrames.jl package would need to be designed in a way that correctly handles non 1-based indexing;
  3. All tests in DataFrames.jl would need to be adjusted to add coverage of non 1-based indexed case for all functions.

I thought it was too much effort for too little benefit. It is much easier, both for users and for developers to assume and enforce that we only accept as columns vectors that use 1-based indexing.

Conclusions

When learning Julia I recommend you try to keep it simple. Learn the functionality of Base Julia and how fundamental types defined in it work.

Do not try to write overly general code or try to increase the speed of code too early (e.g. by using the @inbounds macro - usually using it does not give much benefit and at the same time introduces the risk of bugs in your code).

What I recommend is writing simple code but making sure it is correct for the cases you expect to encounter. In particular make sure you write tests of your code (it is really easy in Julia given the functionality of the Test module).

After you get comfortable with using Julia you will learn that most of the things you assume how it works can be changed. Then you will be ready to go to the next level and write a fully generic code if you decide you need to.

In summary, assuming that Julia uses 1-based indexing is a safe enough bet that is not likely to hurt you in practice (even if it is not 100% correct in general). Therefore, when you learn Julia I recommend you make this assumption and just enjoy the language. You will have time to learn advanced design patterns later, after you get comfortable with the language.