ARCHIGNES

Searching. Unscrambled.

ARCHIGNES is a research & development lab building tools for exploring & evaluating search.

May 2024

An exciting month! We made significant progress on the ARCHIGNES toolset and gained clarity on our approach. You can read these search highlights or jump to notes on Searchjunct, Searchevals, the ARCHIGNES core toolset, and more.

Search Highlights

Yangqing Jia sparked a discussion on Twitter with the query: [what did the most popular post in hacker news say today?]. This is a nice example of not only the difficulty of the task we've set for search systems, but of the difficulty of evaluating correctness (what was the 'right' answer?) and fullness (was it a full test of the relevant parameters or models?) of search systems.

The Google document leak and subsequent commentary (see this post from Brent Payne) reaffirmed the work ARCHIGNES has set out to do: "bolstering user search complaints to make them legible and defensible", recalling Daniel's research on search quality complaints with Emma Lurie.

Initial reporting of the leak:

We replicated search results for a complaint about Brave's performance on [luke 16:3 kjv], Google's performance with 'AI Overviews' for [felony murder], and malicious sponsored search results on Google for [bitwarden].

We also saw the difficulty of evaluating the quality of search complaints (and the lack of support within the search systems for carefully comparing and contrasting generated responses with original sources) in a discussion about washing your washing machine.

We also saw gpt2-chatbot fail the Claude Shannon hallucination false premise test.

Perplexity released Perplexity Pages, which Daniel had been invited to beta test in late April. Early test queries included [Why are pineapples spikey?] and [Who created Jelly, what was it, and when was it launched? What was unique?]. It could perhaps still use more controls for human edits. Daniel shared a comment on Twitter:

I think pulling in users to help edit and curate new content makes a lot of sense (more interaction data to learn from, potential for more fine-grained feedback through edits). Another tool that has been exploring the "AI Wikipedia" / pages space is @ask_pandi's PandiPedia.
Screenshot of Google's knowledge graph

We followed Gagan Ghotra's lead in showcasing some examples of mistakes in Google's knowledge graph, or what Lily Ray wondered might be a "Knowledge graph implosion?"

Screenshot of Google's knowledge graph

At the end of the month, Daniel was interviewed by Will Knight at WIRED about Google's AI Overviews, and the published article mentioned the ARCHIGNES work: "Google's AI Overviews Will Always Be Broken. That's How AI Works" (May 31).

Searchjunct

We now have 55 search systems. We added only one in May: Simplicity (searchjunct.com/simplicity). Simplicity is a demo from Pongo of their semantic filtering within a RAG pipeline using Exa.ai search results. We still have the two default search shortcuts (/links and /beta). We received our first two GitHub issues from the community.

June

A key goal for the next month is improving the web page load time and ironing out bugs related to sharing links to queries. We also want to add easy suggested evaluation queries on the system pages, via EvalQueries, and directly link to relevant evaluations on SearchEvals.

Searchevals

At the end of May we had 135 evals (up from 52 in April). We converted the homepage marquee to a scrollable feed of evals.

We are refactoring our evals extract userscript, Searchevals.extract, and porting it into a Chrome extension. It is currently in private development, but reach out if you want to try it unpacked. It includes basic OCR (via a separate API built with Tesseract.js), simple LLM calls to extract queries and search systems from images of SERPs and texts of tweets, and a flow to facilitate the creation of eval objects.
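To make the extraction step more concrete, here is a minimal sketch of turning the text of a tweet into a draft eval object. It is not the Searchevals.extract implementation: a regular expression and a simple name match stand in for the OCR and LLM calls, and the `EvalDraft` fields, the `KNOWN_SYSTEMS` list, and the `extract_eval_draft` function are all hypothetical names chosen for illustration. It assumes queries are written in square brackets, as they are in this newsletter.

```python
import re
from dataclasses import dataclass, field

# Hypothetical list of system names; a real list might come from Searchjunct.
KNOWN_SYSTEMS = ["Google", "Brave", "Perplexity"]

# Matches queries written in square brackets, e.g. [felony murder].
BRACKETED_QUERY = re.compile(r"\[([^\[\]]+)\]")


@dataclass
class EvalDraft:
    """Hypothetical draft eval object; the field names are illustrative only."""
    source_text: str
    queries: list[str] = field(default_factory=list)
    systems: list[str] = field(default_factory=list)


def extract_eval_draft(text: str) -> EvalDraft:
    """Pull bracketed queries and known system names out of a piece of text."""
    queries = [q.strip() for q in BRACKETED_QUERY.findall(text)]
    lowered = text.lower()
    # Case-insensitive substring match, reported in KNOWN_SYSTEMS order.
    systems = [name for name in KNOWN_SYSTEMS if name.lower() in lowered]
    return EvalDraft(source_text=text, queries=queries, systems=systems)


if __name__ == "__main__":
    draft = extract_eval_draft(
        "Brave's performance on [luke 16:3 kjv] and Google's AI Overviews for [felony murder]"
    )
    print(draft.queries)
    print(draft.systems)
```

A draft like this would still need human review before becoming a published eval, since a substring match cannot tell which system a given query was actually run on.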

June

A key goal for the next month is allowing user submissions from the website to go into a moderation queue. We will also be fixing bugs and UI issues in the search interface, updating SearchOnEvals to include the full list of systems from Searchjunct, and inserting suggested queries (via EvalQueries) for evaluation directly into the homepage feed. We also want to improve the Open Graph share cards to show the screenshots in miniature.

ARCHIGNES core toolset

June
  • Enlist private beta users for the extensions.
  • Clearly outline how the features under development support key jobs to be done that are not satisfied by existing tools.

Other projects

We are using the @ARCHIGNES account on Twitter more - check it out!

We are currently exploring fundraising and public outreach opportunities. If you have any ideas or suggestions regarding grants or other funding, if you're interested in using these tools in your workplace or classroom, or if you're interested in a presentation or workshop on exit and voice in search, please reach out to us at daniel@archignes.com.