01 // record linkage · LLM agent
Resume Deduplication for an ATS
Two-stage entity matching that refuses to pay the O(n²) tax. Cheap hard filters and blocking keys (phonetic, career-fingerprint and HNSW vector blocks) collapse the pairwise space before weighted fuzzy scoring ever decides a merge. I also built the candidate-matching funnel: applicants are ranked against a role through cascading stages (hard filters → embeddings → reranker → AI-agent scoring), each stage cheaper and broader than the one after it. Merges into the canonical record are non-destructive and the scoring algorithm is versioned, so full source history survives and every result can be reprocessed as the model improves.
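A minimal sketch of the block-then-score idea, not the production code. Everything named here is an assumption for illustration: the record fields, the `WEIGHTS` and `MERGE_THRESHOLD` values, and the helper names. The phonetic key is a crude consonant skeleton standing in for a real phonetic algorithm, the career fingerprint is reduced to the employer name, and the HNSW vector block is left out entirely.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical weights and threshold, chosen for illustration only.
WEIGHTS = {"name": 0.5, "email": 0.3, "employer": 0.2}
MERGE_THRESHOLD = 0.85


def blocking_keys(rec):
    """Cheap keys; two records are compared only if they share at least one."""
    keys = set()
    tokens = rec.get("name", "").lower().split()
    if tokens:
        # Crude stand-in for a phonetic key: first letter of the surname
        # plus its remaining consonants (vowels, y, h and w dropped).
        surname = tokens[-1]
        skeleton = surname[0] + "".join(
            c for c in surname[1:] if c not in "aeiouyhw"
        )
        keys.add(("name", skeleton[:4]))
    if rec.get("employer"):
        # Simplified "career fingerprint": the normalized employer name.
        keys.add(("career", rec["employer"].lower()))
    return keys


def candidate_pairs(records):
    """Stage 1: group records by blocking key, emit pairs within each block."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        for key in blocking_keys(rec):
            blocks[key].append(idx)
    pairs = set()
    for members in blocks.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs


def similarity(a, b):
    """Fuzzy string similarity in [0, 1]; a missing value scores 0."""
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def score(rec_a, rec_b):
    """Stage 2: weighted sum of per-field fuzzy similarities."""
    return sum(
        weight * similarity(rec_a.get(field, ""), rec_b.get(field, ""))
        for field, weight in WEIGHTS.items()
    )


records = [
    {"name": "Jon Smith", "email": "jon.smith@example.com", "employer": "Acme"},
    {"name": "John Smyth", "email": "jon.smith@example.com", "employer": "Acme"},
    {"name": "Maria Garcia", "email": "mgarcia@example.com", "employer": "Globex"},
]

pairs = candidate_pairs(records)
merges = [(i, j) for i, j in pairs if score(records[i], records[j]) >= MERGE_THRESHOLD]
```

With these three sample records, blocking leaves one candidate pair out of the three possible, so the fuzzy scorer runs once instead of three times. The same shape is what keeps the comparison count well below n² on real data, provided the blocks stay small.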