A University of Washington-led research team created GovScape, an efficient search system for PDFs from the End of Term Web Archive. Users can look up exact keywords, like “FAFSA,” or use semantic search, which finds documents on a topic even if the exact search terms don’t appear on the page. Because the researchers used highly efficient artificial intelligence models, processing the 10 million PDFs hosted online during Donald Trump’s first term cost less than $1,500, or about $1 per 47,000 pages.