Fixing (and Preventing) the Affiliation Data Mess

"MIT", "MIT Sloan", "Sloan School of Management", "Massachusetts Institute of Technology" — these all refer to the same institution, but free-text affiliation fields create endless variations. This messy data causes real problems for conference management.

The Hidden Cost of Inconsistent Affiliations

When affiliations are entered as free text, the same institution can appear dozens of different ways:

"CMU", "Carnegie Mellon", "Carnegie Mellon University"
"NUS", "National University of Singapore", "NUS Business School"
"LSE", "London School of Economics", "London School of Economics and Political Science"

This inconsistency ripples through the system:

Conflict of Interest Detection — If two authors list "Stanford" and "Stanford University", are they from the same institution? Automated COI checks may miss the connection, potentially assigning reviewers to papers from their own colleagues.

Reviewer Discovery — When chairs search for reviewers by institution, they may miss qualified candidates simply because the affiliation was entered differently.

Analytics & Reporting — Conference statistics on institutional representation become unreliable. Is "UC Berkeley" the same as "University of California, Berkeley"? The data can't tell you.

Author Profiles — The same researcher might appear with different affiliations across submissions, making it harder to track contributions and build accurate author profiles.

Our Solution: AI-Assisted Affiliation Matching

We built a system that links user-entered affiliations to the Research Organization Registry (ROR) — a community-led, open registry of research organizations worldwide. ROR provides unique, persistent identifiers for institutions, solving the consistency problem at its root.
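To illustrate why a persistent identifier helps, here is a minimal sketch of a same-institution check that compares ROR IDs when both affiliations are linked and falls back to plain text otherwise. The `Affiliation` class, the `same_institution` helper, and the placeholder ID are hypothetical names for this example, not PaperFox's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Affiliation:
    text: str               # what the user typed
    ror_id: Optional[str]   # linked ROR identifier, or None if not yet matched


def same_institution(a: Affiliation, b: Affiliation) -> bool:
    """Compare by ROR ID when both sides are linked, else by normalized text."""
    if a.ror_id and b.ror_id:
        return a.ror_id == b.ror_id
    return a.text.strip().lower() == b.text.strip().lower()


# Placeholder value for illustration only; this is not a real ROR ID.
PLACEHOLDER_ID = "https://ror.org/example-stanford"

author = Affiliation("Stanford", PLACEHOLDER_ID)
reviewer = Affiliation("Stanford University", PLACEHOLDER_ID)
unlinked_author = Affiliation("Stanford", None)

print(same_institution(author, reviewer))           # True: IDs agree
print(same_institution(unlinked_author, reviewer))  # False: text differs
```

The second call shows the failure mode described earlier: without a linked identifier, "Stanford" and "Stanford University" look like different institutions.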

Here's how it works:

1. Smart Matching with Country Context

When a user enters their affiliation, we now ask for their country first. This simple UX change dramatically improves matching accuracy.

Consider "CMU" — searching globally returns matches from multiple countries. But when we know the user selected "United States", Carnegie Mellon University rises to the top immediately. Same information from the user, just collected in a smarter order.
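One way to picture the effect of country context is a ranking step that puts candidates from the user's country ahead of the rest. The sketch below is an assumption about how such a ranking could work, not a description of the production system; the `Candidate` class, the `rank` function, and the scores are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    country_code: str
    score: float  # text-match score from the search step, higher is better


def rank(candidates: List[Candidate], user_country: Optional[str] = None) -> List[Candidate]:
    """Candidates in the user's country come first; ties are broken by score."""
    return sorted(
        candidates,
        key=lambda c: (c.country_code == user_country, c.score),
        reverse=True,
    )


# Illustrative scores only; they are not real search results.
cmu_candidates = [
    Candidate("Chiang Mai University", "TH", 0.80),
    Candidate("Carnegie Mellon University", "US", 0.78),
    Candidate("Central Michigan University", "US", 0.70),
]

print(rank(cmu_candidates)[0].name)        # Chiang Mai University
print(rank(cmu_candidates, "US")[0].name)  # Carnegie Mellon University
```

With no country, the ordering depends on text score alone. Once the country is known, the candidate set narrows before the score is even considered.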

2. AI Verification with Human Oversight

For unmatched affiliations, we use AI-assisted matching:

ROR API Search: We query the ROR registry with the user's affiliation text, filtered by their country when available
AI Verification: GPT-5 with web search verifies the match and identifies the correct institution level (e.g., "MIT Sloan School of Management" should map to "Massachusetts Institute of Technology")
Human Review: An admin reviews each match before approval as a final accuracy check
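The three steps above can be sketched as a small pipeline. This is a hedged outline rather than the actual implementation: the endpoint and the `affiliation` and `filter` query parameters reflect our reading of the public ROR REST API and should be checked against its current documentation. `build_ror_query`, `propose_match`, `review`, and the two `fake_*` stand-ins are hypothetical names. The search and verification steps are passed in as functions so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional
from urllib.parse import urlencode

ROR_ENDPOINT = "https://api.ror.org/v2/organizations"  # assumed; verify against ROR docs


def build_ror_query(text: str, country_code: Optional[str] = None) -> str:
    """Step 1: build the search URL, adding a country filter when one is known."""
    params = {"affiliation": text}
    if country_code:
        # Assumed filter syntax; check the current ROR API documentation.
        params["filter"] = f"country.country_code:{country_code}"
    return f"{ROR_ENDPOINT}?{urlencode(params)}"


@dataclass
class ProposedMatch:
    affiliation_text: str
    institution: str
    status: str = "pending_review"  # nothing is applied until an admin decides


def propose_match(
    text: str,
    country_code: Optional[str],
    search: Callable[[str], List[str]],
    verify: Callable[[str, List[str]], Optional[str]],
) -> Optional[ProposedMatch]:
    """Steps 1 and 2: search, then let the verifier pick a candidate or decline."""
    candidates = search(build_ror_query(text, country_code))
    chosen = verify(text, candidates)
    return ProposedMatch(text, chosen) if chosen else None


def review(match: ProposedMatch, approved: bool) -> ProposedMatch:
    """Step 3: record the admin's decision."""
    match.status = "approved" if approved else "rejected"
    return match


# Stand-ins so the sketch runs offline; real versions would call ROR and the model.
def fake_search(url: str) -> List[str]:
    return ["Massachusetts Institute of Technology"]


def fake_verify(text: str, candidates: List[str]) -> Optional[str]:
    return candidates[0] if candidates else None


match = propose_match("MIT Sloan School of Management", "US", fake_search, fake_verify)
print(match.status)                # pending_review
print(review(match, True).status)  # approved
```

Keeping the default status at `pending_review` mirrors the human-oversight step: a proposed match changes nothing until an admin approves it.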

3. Privacy by Design

We're careful about what data we send to the OpenAI API:

Only public information: name, affiliation text, and country
Email domain hints: just the part after the @ (e.g., "stanford.edu" helps confirm Stanford University)
No private data: private data held in the system is never sent
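A simple way to enforce rules like these is an allow-list: build the outgoing payload from named fields only, so nothing else on the user record can slip through. The sketch below assumes a user record shaped like a dictionary. The field names and the `build_verification_payload` and `email_domain` helpers are hypothetical.

```python
def email_domain(email: str) -> str:
    """Return only the part after the last '@', lowercased; empty if there is no '@'."""
    if "@" not in email:
        return ""
    return email.rsplit("@", 1)[1].strip().lower()


def build_verification_payload(user: dict) -> dict:
    """Copy only allow-listed fields; everything else on the record is left out."""
    return {
        "name": user["name"],
        "affiliation": user["affiliation"],
        "country": user["country"],
        "email_domain": email_domain(user["email"]),
    }


# Example record with extra fields that must not leave the system.
user_record = {
    "name": "Jane Doe",
    "affiliation": "Stanford",
    "country": "United States",
    "email": "jane.doe@stanford.edu",
    "password_hash": "not-a-real-hash",
    "review_history": ["paper-1", "paper-2"],
}

payload = build_verification_payload(user_record)
print(payload["email_domain"])  # stanford.edu
print(sorted(payload))          # ['affiliation', 'country', 'email_domain', 'name']
```

Because the payload is constructed field by field instead of by deleting sensitive keys from a copy, a new private field added to the record later is excluded by default.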

When a match is approved, users receive a friendly email explaining what changed and why, with a link to their ROR record. They can update their affiliation anytime if something doesn't look right.

[Figure: Internal admin interface, shown with test data]

Related Docs

Account Settings: Update your affiliation and other profile settings
COI Check: How PaperFox detects and manages COI
