Reducing time to first model #448

Open
@nalimilan

Description

I've tried reducing latency ("time to first model") by enabling precompilation. Following the strategy adopted for DataFrames, I used SnoopCompile to extract all methods that are compiled when running the test suite:

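using SnoopCompileCore
# Record inference timings for the methods compiled while running the test suite
inf_timing = @snoopi include("test/runtests.jl")
using SnoopCompile
# Group the recorded methods by the module that owns them
pc = SnoopCompile.parcel(inf_timing)
# Write the precompile directives assigned to GLM to a file
SnoopCompile.write("precompile_tmp.jl", pc[:GLM], always=true)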

Unfortunately, the result is disappointing even for examples that are in the tests:

julia> using DataFrames, GLM
julia> @time begin
           data = DataFrame(X=[1,2,3], Y=[2,4,7])
           ols = lm(@formula(Y ~ X), data)
           show(stdout, "text/plain", ols)
       end
# Before
10.970611 seconds (28.70 M allocations: 1.659 GiB, 5.06% gc time, 2.04% compilation time)
# After
7.954707 seconds (19.61 M allocations: 1.109 GiB, 4.68% gc time, 2.84% compilation time)

julia> using DataFrames, GLM

julia> @time begin
           data = DataFrame(X=[1,2,2], Y=[1,0,1])
           probit = glm(@formula(Y ~ X), data, Binomial(), ProbitLink())
           show(stdout, "text/plain", probit)
       end
# Before
11.608224 seconds (29.52 M allocations: 1.704 GiB, 5.59% gc time, 7.11% compilation time)
# After
9.601293 seconds (23.09 M allocations: 1.319 GiB, 5.27% gc time, 10.48% compilation time)
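
To see how much of such a timing is latency rather than run time, the same call can be timed twice in a fresh session: the first call pays for compilation, the second does not. A minimal sketch of that comparison, where `first_model` is a hypothetical stand-in for the `lm` + `show` sequence above and uses only the standard library (it is not GLM's code):

```julia
using LinearAlgebra

# Hypothetical stand-in for "fit a model": least squares with an intercept.
function first_model(x::Vector{Float64}, y::Vector{Float64})
    X = hcat(ones(length(x)), x)   # design matrix: intercept column plus x
    return X \ y                   # least-squares coefficients
end

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]

# First call: includes compilation of first_model for these argument types.
t_first = @elapsed coef = first_model(x, y)
# Second call: the compiled code is reused, so this is run time only.
t_second = @elapsed first_model(x, y)

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

The difference between the two timings is the part that precompilation can hope to reduce; the second timing is a floor that it cannot go below.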

This is probably due to two reasons:

  • Precompilation doesn't save everything: it currently caches the results of type inference, but not native machine code. Indeed, if I run the precompile directives directly in the session, I get slightly better timings (about 5s for the first example).
  • Most of the methods to precompile are in other modules. Only 86 methods are in GLM, versus 280 in Base and 221 in StatsModels (the other modules are marginal). Precompiling some common methods in StatsModels might help a lot here.
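
For readers unfamiliar with what a precompile directive looks like, here is a minimal sketch using only `Base.precompile`. The function `ols_slope` is a hypothetical stand-in for a method that would appear in the snooping output; it is not part of GLM or StatsModels. Running the directive in a session, as in the first point above, compiles the method for the given argument types without calling it.

```julia
# Hypothetical stand-in for a method listed in the generated precompile file.
function ols_slope(x::Vector{Float64}, y::Vector{Float64})
    mx = sum(x) / length(x)
    my = sum(y) / length(y)
    return sum((x .- mx) .* (y .- my)) / sum(abs2, x .- mx)
end

# Compile ols_slope for this argument-type tuple without executing it.
# `precompile` returns a Bool indicating whether compilation succeeded.
ok = precompile(ols_slope, (Vector{Float64}, Vector{Float64}))
println("precompiled: ", ok)

# The first real call can now reuse the code compiled above.
slope = ols_slope([1.0, 2.0, 3.0], [2.0, 4.0, 7.0])
println("slope: ", slope)
```

A directive placed in a package only helps for methods that the package is allowed to cache, which is one reason why the methods owned by Base and StatsModels matter here.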
