Split overlapping_{inherent,trait}_impls
This yielded a small perf improvement in my local measurements. It reduces the number of calls to the `impl_trait_header` query, but I think the LLVM optimization is the more relevant factor.
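To illustrate the general shape of the change, here is a toy sketch, not the actual rustc code. The `Impl` struct, the `header_lookups` counter (standing in for `impl_trait_header` query calls), and the string-equality "overlap" check are all hypothetical simplifications; the real overlap check works on rustc's own types and is far more involved.

```rust
// Hypothetical stand-in for an impl; not rustc's real data structure.
struct Impl {
    // `Some(trait name)` for a trait impl, `None` for an inherent impl.
    trait_name: Option<&'static str>,
    self_ty: &'static str,
}

// Combined shape: every call looks up both headers just to learn
// which kind of impl it was given, then branches on the result.
fn overlapping_impls(a: &Impl, b: &Impl, header_lookups: &mut u32) -> bool {
    *header_lookups += 2; // stand-in for two header query calls
    match (a.trait_name, b.trait_name) {
        (Some(ta), Some(tb)) => ta == tb && a.self_ty == b.self_ty,
        (None, None) => a.self_ty == b.self_ty,
        _ => false,
    }
}

// Split shape, inherent case: the caller already knows both impls are
// inherent, so no header lookup is needed and the body has no branch.
fn overlapping_inherent_impls(a: &Impl, b: &Impl) -> bool {
    a.self_ty == b.self_ty
}

// Split shape, trait case: the header is still needed here, but the
// inherent/trait branch is gone.
fn overlapping_trait_impls(a: &Impl, b: &Impl, header_lookups: &mut u32) -> bool {
    *header_lookups += 2; // stand-in for two header query calls
    a.trait_name == b.trait_name && a.self_ty == b.self_ty
}
```

In this sketch the inherent path performs no header lookups at all, and each split function is smaller and branch-free, which is the kind of code an optimizer tends to handle better.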
For a high-level intro to how type checking works in rustc, see the type checking chapter of the rustc dev guide.