I think the most common type of question about the DataFrames.jl package comes from users reporting that some functionality does not work as documented.
Sometimes it is indeed a bug, but in the majority of cases the reason is that the user does not have the correct version of the package installed. In this post I discuss several ways of checking the version of a package in the current project environment.
This post was written under Julia 1.6.0-rc1, DataFrames.jl 0.22.5 and Chain.jl 0.4.4 (so as usual you can expect some exercises in using DataFrames.jl).
If you are working interactively, you have two basic options. The first one
uses the package manager mode. Press
] in the Julia REPL and then write the following:
(bkamins) pkg> status
Status `~/Project.toml`
  [4c9194b5] ABCDGraphGenerator v0.1.0 `https://github.com/bkamins/ABCDGraphGenerator.jl#master`
  [8be319e6] Chain v0.4.4
  [9a962f9c] DataAPI v1.6.0 `~/.julia/dev/DataAPI`
  [a93c6f00] DataFrames v0.22.5
or, if you are interested in a particular package, do:
(bkamins) pkg> status DataFrames
Status `~/Project.toml`
  [a93c6f00] DataFrames v0.22.5
In the above output it is worth noting two common scenarios:
- ABCDGraphGenerator.jl is tracking the master branch of a GitHub repository as its source (so it was not installed from a Julia registry);
- DataAPI.jl is checked out for development (using the dev command) and the package is tracking a local folder.
Alternatively, we could have generated the same output using the Pkg API like this:
julia> using Pkg

julia> Pkg.status()
Status `~/Project.toml`
  [4c9194b5] ABCDGraphGenerator v0.1.0 `https://github.com/bkamins/ABCDGraphGenerator.jl#master`
  [8be319e6] Chain v0.4.4
  [9a962f9c] DataAPI v1.6.0 `~/.julia/dev/DataAPI`
  [a93c6f00] DataFrames v0.22.5

julia> Pkg.status("DataFrames")
Status `~/Project.toml`
  [a93c6f00] DataFrames v0.22.5
The downside of both approaches is that they print information to the screen. However, one often wants to process the status of installed packages programmatically.
The Pkg.dependencies function returns a dictionary mapping package UUIDs
to information about them. As you can check in the documentation string
of the function, the available information is stored in the following fields:
- name: the name of the package
- version: the version of the package (this is Nothing for stdlibs)
- is_direct_dep: the package is a direct dependency
- is_tracking_path: whether a package is directly tracking a directory
- is_pinned: whether a package is pinned
- source: the directory containing the source code for that package
- dependencies: the dependencies of that package as a vector of UUIDs
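To make the shape of this dictionary concrete, here is a minimal sketch that walks over it and prints the direct dependencies of the active project. The variable name direct is my own choice for illustration; only the name, version, and is_direct_dep fields listed above are used.

```julia
using Pkg

# Collect (name, version) pairs for the direct dependencies of the
# active project. The keys of the dictionary are UUIDs; here we only
# need the values, which carry the fields described above.
direct = [(info.name, info.version)
          for info in values(Pkg.dependencies())
          if info.is_direct_dep]

# Print the pairs sorted by package name. For stdlibs the version may
# be `nothing`, in which case "nothing" is printed.
for (name, version) in sort(direct; by = first)
    println(name, " => ", version)
end
```

The output depends on what is installed in your project, so none is shown here.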
Using Pkg.dependencies we can easily write a function that returns the version
of a package. Here is an example:
julia> using Chain

julia> get_pkg_version(name::AbstractString) =
           @chain Pkg.dependencies() begin
               values
               [x for x in _ if x.name == name]
               only
               _.version
           end
get_pkg_version (generic function with 1 method)

julia> get_pkg_version("DataFrames")
v"0.22.5"
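If you prefer not to depend on Chain.jl, the same lookup can be sketched in plain Julia. The two helpers below, find_pkg_version and has_min_version, are hypothetical names introduced for illustration. Unlike the version above, which throws an error when the package is missing (because of only), this sketch returns nothing in that case.

```julia
using Pkg

# Hypothetical helper (not part of Pkg): look up a package version by
# name in the active project; return `nothing` if it is not found.
function find_pkg_version(name::AbstractString)
    for info in values(Pkg.dependencies())
        info.name == name && return info.version
    end
    return nothing
end

# Hypothetical helper: check that a package is present in a version at
# least as new as `min_version`. Packages whose version is `nothing`
# (which can happen for stdlibs) are reported as not satisfying it.
function has_min_version(name::AbstractString, min_version::VersionNumber)
    v = find_pkg_version(name)
    return v !== nothing && v >= min_version
end
```

For example, has_min_version("DataFrames", v"0.22") is a quick way to check whether a feature documented for the 0.22 release can be expected to work in your project. It relies on the fact that VersionNumber values can be compared with the usual operators.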
Here is another example that collects summary information about installed packages in a data frame:
julia> using DataFrames

julia> get_pkg_status(;direct::Bool=true) =
           @chain Pkg.dependencies() begin
               values
               DataFrame
               direct ? _[_.is_direct_dep, :] : _
               select(:name, :version,
                      [:is_tracking_path, :is_tracking_repo, :is_tracking_registry] =>
                      ByRow((a, b, c) -> ["path", "repo", "registry"][a+2b+3c]) =>
                      :tracking)
           end
get_pkg_status (generic function with 1 method)

julia> get_pkg_status()
4×3 DataFrame
 Row │ name                version  tracking
     │ String              Union…   String
─────┼───────────────────────────────────────
   1 │ DataAPI             1.6.0    path
   2 │ DataFrames          0.22.5   registry
   3 │ Chain               0.4.4    registry
   4 │ ABCDGraphGenerator  0.1.0    repo
As you can see, I have chosen to show only the most essential information about packages in the output: name, version, and whether the package is tracking a registry, a local path, or an external repository.
If you pass
direct=false you get information about all available
packages (direct and indirect dependencies of the project). This is usually not
very useful, however, as the list tends to be long, as you can see here:
julia> get_pkg_status(direct=false)
62×3 DataFrame
 Row │ name                         version  tracking
     │ String                       Union…   String
─────┼────────────────────────────────────────────────
   1 │ OrderedCollections           1.4.0    registry
   2 │ LibSSH2_jll                           registry
   3 │ Statistics                            registry
   4 │ ArgTools                              registry
   5 │ Compat                       3.25.0   registry
   6 │ Reexport                     1.0.0    registry
   7 │ SharedArrays                          registry
  ⋮  │              ⋮                  ⋮        ⋮
  56 │ Dates                                 registry
  57 │ MbedTLS_jll                           registry
  58 │ Serialization                         registry
  59 │ IteratorInterfaceExtensions  1.0.0    registry
  60 │ Libdl                                 registry
  61 │ Artifacts                             registry
  62 │ InteractiveUtils                      registry
                                       48 rows omitted
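The tracking column in get_pkg_status is computed with a small indexing trick that may not be obvious at first sight. Here is a standalone sketch of it. The function name tracking_label is hypothetical, and the sketch assumes that exactly one of the three flags is true for every package, which is the case in all the outputs shown above.

```julia
# Hypothetical helper mirroring the expression used in get_pkg_status.
function tracking_label(is_path::Bool, is_repo::Bool, is_registry::Bool)
    labels = ["path", "repo", "registry"]
    # Bool values behave like 0 and 1 in arithmetic, so when exactly one
    # flag is true the weighted sum is 1, 2, or 3 and selects its label.
    return labels[is_path + 2 * is_repo + 3 * is_registry]
end

tracking_label(true, false, false)   # "path"
tracking_label(false, true, false)   # "repo"
tracking_label(false, false, true)   # "registry"
```

If the assumption were violated, the result would be wrong or an error: with no flag set the index would be 0, which is out of bounds, and with both the path and repo flags set the index would be 3, which selects "registry".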
I hope you find these patterns useful in your work with the Julia language.
Before finishing, let me mention one other case that you might occasionally need. The above examples show the version of the package in your current project environment. However, in one Julia session you can change the active project environment many times. If you are interested in the version of the currently loaded package, here is a way to get it (this will not work for packages from stdlib, as they are bundled with Julia and have a fixed version):
julia> Pkg.TOML.parsefile(joinpath(pkgdir(DataFrames), "Project.toml"))["version"]
"0.22.5"
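The one-liner above throws an error if the module has no package directory or if its Project.toml has no version entry. Here is a slightly more defensive sketch of the same idea. The function name loaded_version is hypothetical, and I use the TOML standard library directly, assuming Julia 1.6 or later.

```julia
using TOML

# Hypothetical helper: return the version recorded in the Project.toml
# of the package that provides module `m`, or `nothing` if it cannot be
# determined (no package directory, no Project.toml, or no version entry).
function loaded_version(m::Module)
    dir = pkgdir(m)
    dir === nothing && return nothing
    project = joinpath(dir, "Project.toml")
    isfile(project) || return nothing
    version = get(TOML.parsefile(project), "version", nothing)
    return version === nothing ? nothing : VersionNumber(version)
end
```

With DataFrames.jl loaded as in this post, loaded_version(DataFrames) should give the same information as the one-liner, but as a VersionNumber instead of a string. On Julia 1.9 and later the built-in pkgversion function serves the same purpose; it is not available under the Julia version used in this post.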
Let us check that the loaded version indeed does not change if we change the project environment:
(bkamins) pkg> status DataFrames
Status `~/Project.toml`
  [a93c6f00] DataFrames v0.22.5

(bkamins) pkg> add DataFrames@0.21
   Resolving package versions...
    Updating `~/Project.toml`
  [a93c6f00] ↓ DataFrames v0.22.5 ⇒ v0.21.8
    Updating `~/Manifest.toml`
  [324d7699] ↓ CategoricalArrays v0.9.3 ⇒ v0.8.3
  [a8cc5b0e] - Crayons v4.0.4
  [a93c6f00] ↓ DataFrames v0.22.5 ⇒ v0.21.8
  - Formatting v0.4.2
  [2dfb63ee] ↓ PooledArrays v1.1.0 ⇒ v0.5.3
  [08abe8d2] - PrettyTables v0.11.1
  [189a3867] ↓ Reexport v1.0.0 ⇒ v0.2.0
  Progress [========================================>]  3/3
  ? DataFrames
2 dependencies successfully precompiled in 2 seconds (21 already precompiled)
1 dependency failed but may be precompilable after restarting julia

(bkamins) pkg> status DataFrames
Status `~/Project.toml`
  [a93c6f00] DataFrames v0.21.8

julia> Pkg.TOML.parsefile(joinpath(pkgdir(DataFrames), "Project.toml"))["version"]
"0.22.5"
and we see that although the project version of the package has changed, the loaded version remains the same.