We describe algorithms for symbolic reasoning about executable models of type systems, supporting three queries intended for designers of type systems. First, we check for type soundness bugs and synthesize a counterexample program if such a bug is found. Second, we compare two versions of a type system, synthesizing a program accepted by one but rejected by the other. Third, we minimize the size of synthesized counterexample programs.

These algorithms symbolically evaluate typecheckers and interpreters, producing formulas that characterize the set of programs that fail or succeed in the typechecker and the interpreter. However, symbolically evaluating interpreters poses efficiency challenges, which are caused by having to merge execution paths of the various possible input programs. Our main contribution is the Bonsai tree, a novel symbolic representation of programs and program states which addresses these challenges. Bonsai trees encode complex syntactic information in terms of logical constraints, enabling more efficient merging.

We implement these algorithms in the BONSAI tool, an assistant for type system designers. We perform case studies on how BONSAI helps test and explore a variety of type systems. BONSAI efficiently synthesizes counterexamples for soundness bugs that have been inaccessible to automatic tools, and is the first automated tool to find a counterexample for the recently discovered Scala soundness bug SI-9633.