Skip to content
Jason Gross edited this page Aug 24, 2024 · 8 revisions

(Part of the Coq FAQ)

What can I do when Qed. is slow?

To debug a slow Qed, you can write Set Debug "backtrace". Timeout 1 Qed. to determine what component of the type checker is responsible (for example, whether the issue is universes or something else).

To minimize what part of the proof is causing the issue, you can do a sort-of manual binary search on the proof script to find out what tactic invocation is responsible for slowness. To do this, add something like Axiom admit : forall {T}, T. before your proof and then try all: exact admit. Unshelve. all: exact admit. Time Qed. at various points in the proof.

A Qed that is much slower than the tactic script generating the proof is almost always either

  1. conversion not recorded in the proof trace (especially simpl / cbn);
  2. an explosion of universes;
  3. the guard checker.

Conversion not recorded in the proof trace

Coq's kernel uses a different algorithm for checking when two terms are equal than most tactics do; it is closest to the lazy tactic. The tactics vm_compute and native_compute when used on the goal only leave a cast node in the proof term instructing the kernel to use the corresponding reduction machines to fully reduce the term. The tactics simpl, cbn, lazy, cbv, compute, hnf, unfold, and red (but not change, see Issue #13984) when used on the goal only will leave a cast in the proof term. This cast records the type of the goal prior to reduction, but does not record the reduction strategy used nor the result of reducing (which is often inferable from the remaining proof term). Hence lazy, and to a lesser extent cbv, compute, and unfold are relatively safe to use when applied to the goal only. (By contrast, the reduction strategy of simpl and cbn is so far from the reduction strategy used by the kernel that merely recording the start and end states of reduction is insufficient for preventing blowup.) All other use of reduction tactics (simpl, cbn, change, hnf, red, vm_compute in H, native_compute in H, as well as lazy, cbv, compute, and unfold when used in hypotheses, and to a lesser extent when used on the goal) are prime candidates for Qed-slowdown. As a rule of thumb, if your Qed is slow, most of the time the culprit will be a call to simpl or cbn.

When the issue of slowness is due to reducing in a hypothesis, you can replace calls to simpl in H with let T := type of H in let T' := (eval simpl in T) in replace T with T' in H by reflexivity. Similarly, instead of doing vm_compute in H, do let T := type of H in let T' := (eval vm_compute in T) in replace T with T' in H by (vm_compute; reflexivity). The reason these work is that they separate out the reduction problem from the rest of the proof term, and limit the scope of the conversion problem Coq has to solve. Similarly, if you need to simplify only a subterm of your goal, rather than running simpl on the entire goal, you can do something like let x := constr:(term to simplify) in let x' := (eval simpl in x) in replace x with x' by reflexivity.

If you want to pay the cost of Qed upfront rather than later on, you can use use the abstract tactic, as in abstract (vm_compute; reflexivity) (or, better, match goal with |- ?A = _ => abstract (vm_cast_no_check (eq_refl A)) end so that vm_compute is run only once) which makes as if you had stated some local lemma.

If you know exactly how the kernel is going wrong in its lazy algorithm, you can apply a more fine-grained surgical approach to solving this problem. If you know which definitions are being unfolded too early or too late, you can issue Strategy commands before starting your proof (or before Qed), such as Strategy 100 [id] or Strategy -100 [id] to tell the conversion algorithm when to unfold the identifiers between the square brackets. By default, all identifiers are at strategy level 0; higher numbers get unfolded later than lower numbers. See the reference manual for more details on Strategy. One quick-and-easy way to use strategy in conjunction with cbv/lazy is that, so long as you always use the same whitelist/blacklist, you should issue Strategy commands before your proof that set all identifiers in your whitelist(s) to some negative number lower than the strategy of any other identifier used in your goal/statement/proof, and that set all identifiers in your blacklist(s) to some positive number higher than the strategy of any other identifier used in your goal/statement/proof.

More discussion of these suggestions can be found on Zulip's topic "How to Debug slow QED?".

If you find yourself dabbling in the arcane art of performance-optimizing the conversion problems sent to the kernel, be aware that the conversion algorithm is not symmetric; it can be sensitive to which term is on the left and which is on the right. On important trick to learn is how to control whether the kernel sees the conversion problem G' ≡ G or the conversion problem G ≡ G'. For example,

Fixpoint fact (n : nat) := match n with 0 => 1 | S n' => n * fact n' end.
Axiom P : nat -> Prop.
Check (fun (H : P (id fact 10))
       => (fun (f : P (fact 10) -> P (fact 10)) => (f : P (id fact 10) -> P (fact 10)))
            (fun x => x)
            H)
  : P (id fact 10) -> P (fact 10). (* fast *)
Check (fun (H : P (id fact 10))
       => (H : P (fact 10)))
  : P (id fact 10) -> P (fact 10). (* very slow *)

An explosion of universes

This will only happen if you have Set Universe Polymorphism or are using libraries which Set Universe Polymorphism. The only part of the standard library which currently uses universe polymorphism (as of the writing of this sentence) is setoid rewriting in Type (setoid rewriting in Prop does not), i.e., CMorphisms, CEquivalence, and CRelationClasses.

Set Printing Universes is your friend in debugging these problems. If you're seeing more than a couple hundred universes per definition, then this may well be your culprit. At a couple hundred thousand universes in a single definition, the universe checking algorithm tends to slow to a crawl, and even a couple thousand universes may be enough to make your Qed take tens of seconds rather than fractions of a second.

The only advice the author of this section (Jason Gross) has to offer here is to try tracking and minimizing universes carefully. I'm not aware of any silver bullets.

The guard checker

If your fixpoint has many arguments and no annotation which one is the structurally decreasing argument, it might take the guardedness checker a long time to figure it out by itself. If that is your problem a quick fix is to add an annotation like {struct x} to the end of the Fixpoint declaration where x is the relevant argument that gets smaller.

Otherwise this will likely come up if you define a Fixpoint (or CoFixpoint) in proof mode, or use the fix or cofix tactics. You can check whether or not the guard checker is your problem by inserting Time Guarded. before Qed. If Guarded is slow, then this is the culprit (but even if the command is fast the guard checker might still be responsible because they behave slightly different, see message on Zulip). A common issue is that fixpoint refolding is slow, especially on fixpoints with large bodies (see Issue #11887). An easy first step to fixing a Qed that is slow due to the guard checker is to factor out the body of the fixpoint. Instead of writing

Fixpoint F (args : argsT) : T.
Proof. <tactic script> Qed.

write

Definition F_body (F : forall args : argsT, T) (args : argsT) : T.
Proof. <tactic script> Defined.
Fixpoint F (args : argsT) : T := F_body F args.

Use F_body (@F) args instead if your first argument is implicit and maximally inserted (if your first argument is between curly braces ({ })). In most cases, this simple factoring is enough to fix issues with the guard checker being slow. Additionally, this factoring improves both the performance and usability of the fixpoints in other proofs, performance because of Issue #11887 and usability because it's easier to control their reduction.

If this factoring is insufficient, the next thing to try is to move recursive calls to be as early as possible and to hide as much computation as possible behind definitions which do not get access to the unguarded recursive function. This way, the term the kernel needs to check for guardedness is as small as possible.

What if Qed is still slow for some reason that is not one of these three?

Please ask about it on Zulip and consider tagging Jason Gross, who will be interested in expanding his knowledge of what makes Coq slow.

Why does Reset Initial. not work when using coqc?

The initial state corresponds to the state of coqtop when the interactive session began. It does not make sense in files to compile.

What can I do if I get "No more subgoals but non-instantiated existential variables"?

This means that eauto or eapply didn’t instantiate an existential variable which eventually got erased by some computation. You may backtrack to the faulty occurrence of eauto or eapply and give the missing argument a explicit value. Alternatively, you can use the commands Show Existentials. and Existential. to display and instantiate the remaining existential variables.

Coq < Lemma example_show_existentials : forall a b c:nat, a=b -> b=c -> a=c.

1 subgoal

  ============================
  forall a b c : nat, a = b -> b = c -> a = c

example_show_existentials < Proof.

1 subgoal

  ============================
  forall a b c : nat, a = b -> b = c -> a = c

example_show_existentials < intros.

1 subgoal

  a, b, c : nat
  H : a = b
  H0 : b = c
  ============================
  a = c

example_show_existentials < eapply eq_trans.

2 focused subgoals
(shelved: 1)

  a, b, c : nat
  H : a = b
  H0 : b = c
  ============================
  a = ?y

subgoal 2 is:
  ?y = c

example_show_existentials < Show Existentials.
Existential 1 =
?y : [a : nat b : nat c : nat H : a = b H0 : b = c |- nat]
Existential 2 =
?Goal : [a : nat b : nat c : nat H : a = b H0 : b = c |- a = ?y]
Existential 3 =
?Goal0 : [a : nat b : nat c : nat H : a = b H0 : b = c |- ?y = c]

example_show_existentials < eassumption.

1 subgoal

  a, b, c : nat
  H : a = b
  H0 : b = c
  ============================
  b = c

example_show_existentials < assumption.
No more subgoals.

example_show_existentials < Qed.
example_show_existentials is defined

What can I do if I get “Cannot solve a second-order unification problem”?

You can help Coq using the pattern tactic.

I copy-paste a term and Coq says it is not convertible to the original term. Sometimes it even says the copied term is not well-typed.

This is probably due to invisible implicit information (implicit arguments, coercions and Cases annotations) in the printed term, which is not re-synthesised from the copied-pasted term in the same way as it is in the original term.

Consider for instance (@eq Type True True). This term is printed as True=True and re-parsed as (@eq Prop True True). The two terms are not convertible (hence they fool tactics like pattern).

There is currently no satisfactory answer to the problem. However, the command Set Printing All is useful for diagnosing the problem.

Due to coercions, one may even face type-checking errors. In some rare cases, the criterion to hide coercions is a bit too loose, which may result in a typing error message if the parser is not able to find again the missing coercion.

What can I do when setoid_rewrite hangs?

There are some suggestions for debugging typeclass logs (accessible via Set Typeclasses Debug or Typeclasses eauo := debug) on StackExchange, in Issue #6141, and in Issue #3730.

Clone this wiki locally