Fixed checkpoint weight-tying to resolve tie_word_embeddings top-level-first, matching Hugging Face tying behavior.
Merged PR fixing checkpoint weight-tying so tie_word_embeddings is resolved top-level-first, matching Hugging Face’s tying semantics.