From e5f28b16ee574fb98bf01a4f6d5c3ab4c0a2ae20 Mon Sep 17 00:00:00 2001 From: ThatGuySam Date: Sun, 15 Mar 2026 13:03:57 -0500 Subject: [PATCH] docs(research): add pagefind feature parity memo Capture user-visible parity requirements for a future Pagefind migration. This keeps the earlier viability memo focused on engine fit and documents the recommended adapter approach, carry-over patterns, and remaining prototype risks around ranking and title highlighting. --- .../pagefind-feature-parity-2026-03-15.md | 306 ++++++++++++++++++ 1 file changed, 306 insertions(+) create mode 100644 docs/research/pagefind-feature-parity-2026-03-15.md diff --git a/docs/research/pagefind-feature-parity-2026-03-15.md b/docs/research/pagefind-feature-parity-2026-03-15.md new file mode 100644 index 0000000..708583b --- /dev/null +++ b/docs/research/pagefind-feature-parity-2026-03-15.md @@ -0,0 +1,306 @@ +# Pagefind Feature Parity For doesitarm + +Date: 2026-03-15 + +## Scope + +Read alongside `docs/research/pagefind-viability-2026-03-15.md`. + +Investigate how a Pagefind migration could preserve the current Stork-backed +search UX in `doesitarm`, focusing on user-visible behavior rather than on +whether Pagefind is viable in the abstract. + +## Short Answer + +Yes, most of the current search experience can be carried over without the user +feeling a major regression, but only if Pagefind is treated as the search core +under a custom Vue adapter. + +Recommended parity path: + +1. Keep the current server-rendered initial lists, pagination links, summary + block, and page chrome exactly as they are. +2. Replace only the "user has started searching/filtering" path with Pagefind's + JavaScript API. +3. Build the Pagefind index from the existing sitemap/listing data, not from an + HTML crawl. +4. Use Pagefind `filters` for status/category/type scoping. +5. Use Pagefind `meta` only for simple scalar fields needed in result rendering. +6. Reattach richer card UI state such as `searchLinks` from a local + URL-or-slug keyed map instead of trying to force arrays into Pagefind + metadata. + +The one place where a prototype may still change the implementation choice is +search quality. If `addCustomRecord()` does not rank app-name and alias matches +well enough, the next-best parity option is to generate virtual HTML records via +`addHTMLFile()` so Pagefind can use `h1` weighting and `data-pagefind-*` +attributes. + +## Current UX Contract In The Repo + +From `components/search-stork.vue`, `helpers/stork/toml.js`, and the scoped +Astro pages: + +- The page initially shows the existing paginated list from the API when the + user has not typed anything yet. +- Search is search-as-you-type, with loading placeholders while results are + pending. +- The UI exposes quick status chips. +- Scoped pages such as `/kind/...` and `/games` inject base filters so the same + component behaves like "search within this slice". +- Empty results on a scoped page show a "Search Everything" escape hatch. +- Query results show highlighted snippets and a detail link. +- Non-query cards can also show timestamps and auxiliary action buttons such as + benchmark/performance links. +- The current Stork index injects synthetic searchable tokens for `status_*`, + category, and route type, in addition to title/content/description/aliases and + tags. +- Stork also post-filters query results so every typed term must be present + somewhere in the returned title/URL/excerpts. + +That means parity is not just "can users search", but: + +- can they search globally and within a scoped page +- can they click status chips +- can they still get good snippets and stable detail URLs +- can the initial browse mode remain unchanged + +## What The Evidence Says + +Confirmed from Pagefind docs and repo activity: + +- The Node API supports `addCustomRecord()` with `url`, `content`, `language`, + optional flat `meta`, optional flat `filters`, and optional flat `sort`. +- The Node API also supports `addHTMLFile()` for virtual HTML pages and + `writeFiles()` / `getFiles()` for writing the bundle to `/pagefind/`. +- The browser API is intended for custom search interfaces, not just the stock + widget. +- `pagefind.init()` can be called on focus, and `pagefind.preload()` / + `pagefind.debouncedSearch()` exist specifically to reduce first-search + latency. +- `result.data()` returns `url`, `excerpt`, `meta`, and related result data. + The docs explicitly say `excerpt` is safe to use as `innerHTML`, while + `content` and `meta` are raw. +- The JS API supports filter-only browsing by calling + `pagefind.search(null, { filters: ... })`. +- The JS API can also return filter counts via `pagefind.filters()`, plus + remaining-result counts on subsequent searches. +- Filtering defaults to AND semantics, and compound `any` / `all` / `none` / + `not` logic is available. +- Sorting can be applied at search time, but records missing a sort value are + omitted when that sort is active. +- Highlighting on destination pages is supported via `highlightParam` and + `pagefind-highlight.js`. +- Historical GitHub issues `#198` and `#277` asked for direct non-HTML input; + both are now closed, and the current docs document that capability. +- The latest stable release is `v1.4.0`, published on 2025-09-01. +- Issue `#574` about the `npx` wrapper on `ubuntu-latest` is still open as of + 2026-03-15, so a pinned dependency or direct binary path is safer than a + casual CLI swap. + +Community signal: + +- In the main HN discussion for Pagefind's launch, the maintainer explicitly + said multi-word query merging is built in. +- Another HN commenter reported that deploying Pagefind was "pleasingly easy" + and the result was "reasonably nippy". +- Zach Leatherman's `pagefind-search` component is a concrete GitHub example of + treating Pagefind as a customizable UI layer with explicit fallback content + and controlled asset loading. + +## Feature Mapping + +| Current user-visible feature | Carry-over path with Pagefind | Confidence | Notes | +| --- | --- | --- | --- | +| Search-as-you-type | `pagefind.debouncedSearch()` or manual debounce + `pagefind.preload()` | High | This is native to the JS API. | +| Lazy first-load behavior | `pagefind.init()` on focus, or rely on first search | High | This matches the current deferred Stork load pattern. | +| Scoped search pages | Keep current initial page data, then call `pagefind.search(term-or-null, { filters })` | High | Better fit than the current synthetic token approach. | +| Quick status chips | Map chips to `filters.status` values | High | Pagefind filters are cleaner than indexing `status_native` into content. | +| Empty-state "Search Everything" | Clear base filters and rerun, or keep current link to `/` | High | User-visible behavior is easy to preserve. | +| Highlighted excerpts | Render `result.data().excerpt` | High | Officially documented and safe as `innerHTML`. | +| Highlighted title text | No first-class JS API equivalent was found in the docs | Medium | Likely plain title unless we add client-side emphasis ourselves. | +| Detail links | Use `result.data().url` | High | Direct match. | +| Relative timestamp text | Put timestamp in `meta`, or join from local listing data | High | `meta` is string-only, so store ISO strings if using metadata. | +| Benchmark / Performance buttons | Join from local listing data keyed by URL or slug | High | Inference: better than encoding arrays as metadata strings. | +| Status / category / type scoping | Use Pagefind `filters`, not fake searchable tokens | High | Cleaner and more explicit than the current Stork trick. | +| "Every typed term must match somewhere" behavior | Likely client-side post-filter using returned raw `content` if needed | Medium | Current Stork behavior is explicit; Pagefind query semantics need a parity check in a prototype. | +| Result ordering that favors app names and aliases | Start with `addCustomRecord()` content shaping; fall back to `addHTMLFile()` if needed | Medium | Custom-record metadata appears display-oriented, not ranking-oriented. | + +## Options + +### 1. Custom Vue adapter over `addCustomRecord()` output + +This is the lowest-risk parity path. + +Why it works: + +- It matches the repo's existing data-first indexing model. +- It preserves the current page shell and only swaps the query engine. +- It uses Pagefind features the way they are documented today: + `meta` for display fields, `filters` for scoping, `sort` for explicit sort + options. + +Tradeoffs: + +- `meta` is for returned metadata, not clearly for ranking. +- Complex card state such as `searchLinks` does not fit naturally into flat + string metadata. +- The docs do not show title-highlight ranges in the JS API, so exact title + highlighting may need custom client logic. + +### 2. Custom Vue adapter over generated virtual HTML via `addHTMLFile()` + +This is the "higher parity if ranking is off" option. + +Why it might be worth it: + +- Pagefind documents default weighting for HTML headings. +- Pagefind documents `data-pagefind-weight`, `data-pagefind-meta`, + `data-pagefind-filter`, and `data-pagefind-sort`. +- If app-name, alias, and status text need finer relevance tuning than a plain + custom record gives, virtual HTML gives more levers. + +Tradeoffs: + +- More adapter code. +- Harder to justify unless a real query corpus shows ranking problems. + +### 3. Replace the current component with the stock Pagefind UI + +Not recommended. + +Why: + +- It discards the current browse-first behavior and scoped-page behavior. +- It loses the current empty-state copy and action-button treatment. +- It would make parity depend on overriding Pagefind's UI instead of preserving + the repo's existing search component contract. + +## What Works + +- Keep "no query" mode exactly as it is today and switch to Pagefind only after + the user types or toggles a filter. +- Build one Pagefind record per listing/detail route, using the same sitemap + payloads already feeding the Stork pipeline. +- Put searchable text in `content`, starting with the fields users most expect + to match: app name, aliases, support text, description, tags, and any status + phrasing users already see. +- Put render-only scalar fields in `meta`, such as title, slug, status label, + last-updated ISO timestamp, and short display text. +- Use `filters` for `status`, `category`, `kind`, and other scoped-page + constraints. +- Build a parallel local result-decoration map keyed by URL or slug so Pagefind + results can be decorated with the same `searchLinks`, timestamps, or other + card chrome without turning the index into a transport format for the whole + listing object. +- Call `pagefind.filters()` once if you want the chip row to expose counts or + disabled states. + +Inference: +For `doesitarm`, this "Pagefind for retrieval + local map for decoration" +split is probably the cleanest way to preserve the current UI without bloating +Pagefind metadata or weakening search semantics. + +## What To Avoid + +- Do not replace the entire search component with the stock Pagefind UI if the + goal is parity. +- Do not assume `meta` alone is enough for search quality. Metadata is clearly + documented as returned result data; searchable content still needs to live in + `content`. +- Do not try to stuff arrays or nested structures like `searchLinks` into flat + Pagefind metadata if the same information already exists in local page data. +- Do not apply Pagefind sort options to sparse fields unless every record has a + value, because missing sort keys are omitted from sorted results. +- Do not assume the `npx pagefind` wrapper is production-safe on Ubuntu CI + without pinning and testing. + +## Recommendation + +Recommended implementation order: + +1. Keep the current Astro pages and initial list rendering exactly as-is. +2. Build a Pagefind prototype with `addCustomRecord()` from the existing sitemap + payloads. +3. Map the current scoped-page `baseFilters` to real Pagefind filters. +4. Add a thin Pagefind adapter inside the current Vue component rather than + replacing the component. +5. Use a local `listingByUrl` or `listingBySlug` map to reattach rich card UI + fields. +6. Compare a real query set against Stork, especially app-name, alias, and + multi-term searches. +7. Only if ranking quality is weaker than expected, move the prototype from + `addCustomRecord()` to generated `addHTMLFile()` records with weighted + markup. + +Why this is the best default: + +- It preserves the current user-visible experience in the cheapest way. +- It uses Pagefind features where Pagefind is strongest: retrieval, snippets, + filtering, counts, and static bundle delivery. +- It avoids forcing Pagefind to become the canonical source of every UI field. + +## Missing Information + +- I did not find a documented JS API for title highlight ranges equivalent to + the Stork title-range behavior. +- I did not find clear documentation on exact multi-term query semantics beyond + Pagefind supporting multi-word queries in practice. +- I did not find a high-signal Stack Overflow thread that added more than the + official docs for this migration. +- The Lobsters URL surfaced during search no longer resolved, so I did not use + it as evidence. + +## Recommended Next Inspection Steps + +1. Build a small Pagefind prototype against 100-200 representative listings. +2. Test 25-50 real queries from the current site vocabulary: + app names, aliases, status words, category words, and mixed multi-term + queries. +3. Decide whether status chips should stay effectively single-select, matching + current behavior, or become explicit OR filters within the same filter key. +4. Verify whether plain title rendering is acceptable, or whether custom + client-side title emphasis is needed. +5. Measure first-search latency on mobile before removing Stork. + +## Source Links + +- Pagefind Node API docs: + https://pagefind.app/docs/node-api/ +- Pagefind browser API docs: + https://pagefind.app/docs/api/ +- Pagefind filtering docs: + https://pagefind.app/docs/filtering/ +- Pagefind JS API filtering docs: + https://pagefind.app/docs/js-api-filtering/ +- Pagefind sorting docs: + https://pagefind.app/docs/sorts/ +- Pagefind JS API sorting docs: + https://pagefind.app/docs/js-api-sorting/ +- Pagefind metadata docs: + https://pagefind.app/docs/metadata/ +- Pagefind JS API metadata docs: + https://pagefind.app/docs/js-api-metadata/ +- Pagefind weighting docs: + https://pagefind.app/docs/weighting/ +- Pagefind ranking docs: + https://pagefind.app/docs/ranking/ +- Pagefind highlighting docs: + https://pagefind.app/docs/highlighting/ +- Pagefind sub-results docs: + https://pagefind.app/docs/sub-results/ +- Pagefind latest release `v1.4.0` (published 2025-09-01): + https://github.com/Pagefind/pagefind/releases/tag/v1.4.0 +- Pagefind issue `#198` ("Manually defining content, without passing HTML"): + https://github.com/Pagefind/pagefind/issues/198 +- Pagefind issue `#277` ("Can pagefind pull its data from a json index file?"): + https://github.com/Pagefind/pagefind/issues/277 +- Pagefind issue `#574` (`ubuntu-latest` / `npx` wrapper failure, still open on + 2026-03-15): + https://github.com/Pagefind/pagefind/issues/574 +- HN discussion for Pagefind launch: + https://news.ycombinator.com/item?id=32290634 +- HN item API for the same discussion: + https://hn.algolia.com/api/v1/items/32290634 +- Zach Leatherman's `pagefind-search` web component: + https://github.com/zachleat/pagefind-search