mirror of
https://github.com/ThatGuySam/doesitarm.git
synced 2026-05-18 06:44:46 -07:00
docs(research): add pagefind feature parity memo
Capture user-visible parity requirements for a future Pagefind migration. This keeps the earlier viability memo focused on engine fit and documents the recommended adapter approach, carry-over patterns, and remaining prototype risks around ranking and title highlighting.
This commit is contained in:
parent
2f667357af
commit
e5f28b16ee
1 changed files with 306 additions and 0 deletions
306
docs/research/pagefind-feature-parity-2026-03-15.md
Normal file
306
docs/research/pagefind-feature-parity-2026-03-15.md
Normal file
|
|
@ -0,0 +1,306 @@
|
|||
# Pagefind Feature Parity For doesitarm
|
||||
|
||||
Date: 2026-03-15
|
||||
|
||||
## Scope
|
||||
|
||||
Read alongside `docs/research/pagefind-viability-2026-03-15.md`.
|
||||
|
||||
Investigate how a Pagefind migration could preserve the current Stork-backed
|
||||
search UX in `doesitarm`, focusing on user-visible behavior rather than on
|
||||
whether Pagefind is viable in the abstract.
|
||||
|
||||
## Short Answer
|
||||
|
||||
Yes, most of the current search experience can be carried over without the user
|
||||
feeling a major regression, but only if Pagefind is treated as the search core
|
||||
under a custom Vue adapter.
|
||||
|
||||
Recommended parity path:
|
||||
|
||||
1. Keep the current server-rendered initial lists, pagination links, summary
|
||||
block, and page chrome exactly as they are.
|
||||
2. Replace only the "user has started searching/filtering" path with Pagefind's
|
||||
JavaScript API.
|
||||
3. Build the Pagefind index from the existing sitemap/listing data, not from an
|
||||
HTML crawl.
|
||||
4. Use Pagefind `filters` for status/category/type scoping.
|
||||
5. Use Pagefind `meta` only for simple scalar fields needed in result rendering.
|
||||
6. Reattach richer card UI state such as `searchLinks` from a local
|
||||
URL-or-slug keyed map instead of trying to force arrays into Pagefind
|
||||
metadata.
|
||||
|
||||
The one place where a prototype may still change the implementation choice is
|
||||
search quality. If `addCustomRecord()` does not rank app-name and alias matches
|
||||
well enough, the next-best parity option is to generate virtual HTML records via
|
||||
`addHTMLFile()` so Pagefind can use `h1` weighting and `data-pagefind-*`
|
||||
attributes.
|
||||
|
||||
## Current UX Contract In The Repo
|
||||
|
||||
From `components/search-stork.vue`, `helpers/stork/toml.js`, and the scoped
|
||||
Astro pages:
|
||||
|
||||
- The page initially shows the existing paginated list from the API when the
|
||||
user has not typed anything yet.
|
||||
- Search is search-as-you-type, with loading placeholders while results are
|
||||
pending.
|
||||
- The UI exposes quick status chips.
|
||||
- Scoped pages such as `/kind/...` and `/games` inject base filters so the same
|
||||
component behaves like "search within this slice".
|
||||
- Empty results on a scoped page show a "Search Everything" escape hatch.
|
||||
- Query results show highlighted snippets and a detail link.
|
||||
- Non-query cards can also show timestamps and auxiliary action buttons such as
|
||||
benchmark/performance links.
|
||||
- The current Stork index injects synthetic searchable tokens for `status_*`,
|
||||
category, and route type, in addition to title/content/description/aliases and
|
||||
tags.
|
||||
- Stork also post-filters query results so every typed term must be present
|
||||
somewhere in the returned title/URL/excerpts.
|
||||
|
||||
That means parity is not just "can users search", but:
|
||||
|
||||
- can they search globally and within a scoped page
|
||||
- can they click status chips
|
||||
- can they still get good snippets and stable detail URLs
|
||||
- can the initial browse mode remain unchanged
|
||||
|
||||
## What The Evidence Says
|
||||
|
||||
Confirmed from Pagefind docs and repo activity:
|
||||
|
||||
- The Node API supports `addCustomRecord()` with `url`, `content`, `language`,
|
||||
optional flat `meta`, optional flat `filters`, and optional flat `sort`.
|
||||
- The Node API also supports `addHTMLFile()` for virtual HTML pages and
|
||||
`writeFiles()` / `getFiles()` for writing the bundle to `/pagefind/`.
|
||||
- The browser API is intended for custom search interfaces, not just the stock
|
||||
widget.
|
||||
- `pagefind.init()` can be called on focus, and `pagefind.preload()` /
|
||||
`pagefind.debouncedSearch()` exist specifically to reduce first-search
|
||||
latency.
|
||||
- `result.data()` returns `url`, `excerpt`, `meta`, and related result data.
|
||||
The docs explicitly say `excerpt` is safe to use as `innerHTML`, while
|
||||
`content` and `meta` are raw.
|
||||
- The JS API supports filter-only browsing by calling
|
||||
`pagefind.search(null, { filters: ... })`.
|
||||
- The JS API can also return filter counts via `pagefind.filters()`, plus
|
||||
remaining-result counts on subsequent searches.
|
||||
- Filtering defaults to AND semantics, and compound `any` / `all` / `none` /
|
||||
`not` logic is available.
|
||||
- Sorting can be applied at search time, but records missing a sort value are
|
||||
omitted when that sort is active.
|
||||
- Highlighting on destination pages is supported via `highlightParam` and
|
||||
`pagefind-highlight.js`.
|
||||
- Historical GitHub issues `#198` and `#277` asked for direct non-HTML input;
|
||||
both are now closed, and the current docs document that capability.
|
||||
- The latest stable release is `v1.4.0`, published on 2025-09-01.
|
||||
- Issue `#574` about the `npx` wrapper on `ubuntu-latest` is still open as of
|
||||
2026-03-15, so a pinned dependency or direct binary path is safer than a
|
||||
casual CLI swap.
|
||||
|
||||
Community signal:
|
||||
|
||||
- In the main HN discussion for Pagefind's launch, the maintainer explicitly
|
||||
said multi-word query merging is built in.
|
||||
- Another HN commenter reported that deploying Pagefind was "pleasingly easy"
|
||||
and the result was "reasonably nippy".
|
||||
- Zach Leatherman's `pagefind-search` component is a concrete GitHub example of
|
||||
treating Pagefind as a customizable UI layer with explicit fallback content
|
||||
and controlled asset loading.
|
||||
|
||||
## Feature Mapping
|
||||
|
||||
| Current user-visible feature | Carry-over path with Pagefind | Confidence | Notes |
|
||||
| --- | --- | --- | --- |
|
||||
| Search-as-you-type | `pagefind.debouncedSearch()` or manual debounce + `pagefind.preload()` | High | This is native to the JS API. |
|
||||
| Lazy first-load behavior | `pagefind.init()` on focus, or rely on first search | High | This matches the current deferred Stork load pattern. |
|
||||
| Scoped search pages | Keep current initial page data, then call `pagefind.search(term-or-null, { filters })` | High | Better fit than the current synthetic token approach. |
|
||||
| Quick status chips | Map chips to `filters.status` values | High | Pagefind filters are cleaner than indexing `status_native` into content. |
|
||||
| Empty-state "Search Everything" | Clear base filters and rerun, or keep current link to `/` | High | User-visible behavior is easy to preserve. |
|
||||
| Highlighted excerpts | Render `result.data().excerpt` | High | Officially documented and safe as `innerHTML`. |
|
||||
| Highlighted title text | No first-class JS API equivalent was found in the docs | Medium | Likely plain title unless we add client-side emphasis ourselves. |
|
||||
| Detail links | Use `result.data().url` | High | Direct match. |
|
||||
| Relative timestamp text | Put timestamp in `meta`, or join from local listing data | High | `meta` is string-only, so store ISO strings if using metadata. |
|
||||
| Benchmark / Performance buttons | Join from local listing data keyed by URL or slug | High | Inference: better than encoding arrays as metadata strings. |
|
||||
| Status / category / type scoping | Use Pagefind `filters`, not fake searchable tokens | High | Cleaner and more explicit than the current Stork trick. |
|
||||
| "Every typed term must match somewhere" behavior | Likely client-side post-filter using returned raw `content` if needed | Medium | Current Stork behavior is explicit; Pagefind query semantics need a parity check in a prototype. |
|
||||
| Result ordering that favors app names and aliases | Start with `addCustomRecord()` content shaping; fall back to `addHTMLFile()` if needed | Medium | Custom-record metadata appears display-oriented, not ranking-oriented. |
|
||||
|
||||
## Options
|
||||
|
||||
### 1. Custom Vue adapter over `addCustomRecord()` output
|
||||
|
||||
This is the lowest-risk parity path.
|
||||
|
||||
Why it works:
|
||||
|
||||
- It matches the repo's existing data-first indexing model.
|
||||
- It preserves the current page shell and only swaps the query engine.
|
||||
- It uses Pagefind features the way they are documented today:
|
||||
`meta` for display fields, `filters` for scoping, `sort` for explicit sort
|
||||
options.
|
||||
|
||||
Tradeoffs:
|
||||
|
||||
- `meta` is for returned metadata, not clearly for ranking.
|
||||
- Complex card state such as `searchLinks` does not fit naturally into flat
|
||||
string metadata.
|
||||
- The docs do not show title-highlight ranges in the JS API, so exact title
|
||||
highlighting may need custom client logic.
|
||||
|
||||
### 2. Custom Vue adapter over generated virtual HTML via `addHTMLFile()`
|
||||
|
||||
This is the "higher parity if ranking is off" option.
|
||||
|
||||
Why it might be worth it:
|
||||
|
||||
- Pagefind documents default weighting for HTML headings.
|
||||
- Pagefind documents `data-pagefind-weight`, `data-pagefind-meta`,
|
||||
`data-pagefind-filter`, and `data-pagefind-sort`.
|
||||
- If app-name, alias, and status text need finer relevance tuning than a plain
|
||||
custom record gives, virtual HTML gives more levers.
|
||||
|
||||
Tradeoffs:
|
||||
|
||||
- More adapter code.
|
||||
- Harder to justify unless a real query corpus shows ranking problems.
|
||||
|
||||
### 3. Replace the current component with the stock Pagefind UI
|
||||
|
||||
Not recommended.
|
||||
|
||||
Why:
|
||||
|
||||
- It discards the current browse-first behavior and scoped-page behavior.
|
||||
- It loses the current empty-state copy and action-button treatment.
|
||||
- It would make parity depend on overriding Pagefind's UI instead of preserving
|
||||
the repo's existing search component contract.
|
||||
|
||||
## What Works
|
||||
|
||||
- Keep "no query" mode exactly as it is today and switch to Pagefind only after
|
||||
the user types or toggles a filter.
|
||||
- Build one Pagefind record per listing/detail route, using the same sitemap
|
||||
payloads already feeding the Stork pipeline.
|
||||
- Put searchable text in `content`, starting with the fields users most expect
|
||||
to match: app name, aliases, support text, description, tags, and any status
|
||||
phrasing users already see.
|
||||
- Put render-only scalar fields in `meta`, such as title, slug, status label,
|
||||
last-updated ISO timestamp, and short display text.
|
||||
- Use `filters` for `status`, `category`, `kind`, and other scoped-page
|
||||
constraints.
|
||||
- Build a parallel local result-decoration map keyed by URL or slug so Pagefind
|
||||
results can be decorated with the same `searchLinks`, timestamps, or other
|
||||
card chrome without turning the index into a transport format for the whole
|
||||
listing object.
|
||||
- Call `pagefind.filters()` once if you want the chip row to expose counts or
|
||||
disabled states.
|
||||
|
||||
Inference:
|
||||
For `doesitarm`, this "Pagefind for retrieval + local map for decoration"
|
||||
split is probably the cleanest way to preserve the current UI without bloating
|
||||
Pagefind metadata or weakening search semantics.
|
||||
|
||||
## What To Avoid
|
||||
|
||||
- Do not replace the entire search component with the stock Pagefind UI if the
|
||||
goal is parity.
|
||||
- Do not assume `meta` alone is enough for search quality. Metadata is clearly
|
||||
documented as returned result data; searchable content still needs to live in
|
||||
`content`.
|
||||
- Do not try to stuff arrays or nested structures like `searchLinks` into flat
|
||||
Pagefind metadata if the same information already exists in local page data.
|
||||
- Do not apply Pagefind sort options to sparse fields unless every record has a
|
||||
value, because missing sort keys are omitted from sorted results.
|
||||
- Do not assume the `npx pagefind` wrapper is production-safe on Ubuntu CI
|
||||
without pinning and testing.
|
||||
|
||||
## Recommendation
|
||||
|
||||
Recommended implementation order:
|
||||
|
||||
1. Keep the current Astro pages and initial list rendering exactly as-is.
|
||||
2. Build a Pagefind prototype with `addCustomRecord()` from the existing sitemap
|
||||
payloads.
|
||||
3. Map the current scoped-page `baseFilters` to real Pagefind filters.
|
||||
4. Add a thin Pagefind adapter inside the current Vue component rather than
|
||||
replacing the component.
|
||||
5. Use a local `listingByUrl` or `listingBySlug` map to reattach rich card UI
|
||||
fields.
|
||||
6. Compare a real query set against Stork, especially app-name, alias, and
|
||||
multi-term searches.
|
||||
7. Only if ranking quality is weaker than expected, move the prototype from
|
||||
`addCustomRecord()` to generated `addHTMLFile()` records with weighted
|
||||
markup.
|
||||
|
||||
Why this is the best default:
|
||||
|
||||
- It preserves the current user-visible experience in the cheapest way.
|
||||
- It uses Pagefind features where Pagefind is strongest: retrieval, snippets,
|
||||
filtering, counts, and static bundle delivery.
|
||||
- It avoids forcing Pagefind to become the canonical source of every UI field.
|
||||
|
||||
## Missing Information
|
||||
|
||||
- I did not find a documented JS API for title highlight ranges equivalent to
|
||||
the Stork title-range behavior.
|
||||
- I did not find clear documentation on exact multi-term query semantics beyond
|
||||
Pagefind supporting multi-word queries in practice.
|
||||
- I did not find a high-signal Stack Overflow thread that added more than the
|
||||
official docs for this migration.
|
||||
- The Lobsters URL surfaced during search no longer resolved, so I did not use
|
||||
it as evidence.
|
||||
|
||||
## Recommended Next Inspection Steps
|
||||
|
||||
1. Build a small Pagefind prototype against 100-200 representative listings.
|
||||
2. Test 25-50 real queries from the current site vocabulary:
|
||||
app names, aliases, status words, category words, and mixed multi-term
|
||||
queries.
|
||||
3. Decide whether status chips should stay effectively single-select, matching
|
||||
current behavior, or become explicit OR filters within the same filter key.
|
||||
4. Verify whether plain title rendering is acceptable, or whether custom
|
||||
client-side title emphasis is needed.
|
||||
5. Measure first-search latency on mobile before removing Stork.
|
||||
|
||||
## Source Links
|
||||
|
||||
- Pagefind Node API docs:
|
||||
https://pagefind.app/docs/node-api/
|
||||
- Pagefind browser API docs:
|
||||
https://pagefind.app/docs/api/
|
||||
- Pagefind filtering docs:
|
||||
https://pagefind.app/docs/filtering/
|
||||
- Pagefind JS API filtering docs:
|
||||
https://pagefind.app/docs/js-api-filtering/
|
||||
- Pagefind sorting docs:
|
||||
https://pagefind.app/docs/sorts/
|
||||
- Pagefind JS API sorting docs:
|
||||
https://pagefind.app/docs/js-api-sorting/
|
||||
- Pagefind metadata docs:
|
||||
https://pagefind.app/docs/metadata/
|
||||
- Pagefind JS API metadata docs:
|
||||
https://pagefind.app/docs/js-api-metadata/
|
||||
- Pagefind weighting docs:
|
||||
https://pagefind.app/docs/weighting/
|
||||
- Pagefind ranking docs:
|
||||
https://pagefind.app/docs/ranking/
|
||||
- Pagefind highlighting docs:
|
||||
https://pagefind.app/docs/highlighting/
|
||||
- Pagefind sub-results docs:
|
||||
https://pagefind.app/docs/sub-results/
|
||||
- Pagefind latest release `v1.4.0` (published 2025-09-01):
|
||||
https://github.com/Pagefind/pagefind/releases/tag/v1.4.0
|
||||
- Pagefind issue `#198` ("Manually defining content, without passing HTML"):
|
||||
https://github.com/Pagefind/pagefind/issues/198
|
||||
- Pagefind issue `#277` ("Can pagefind pull its data from a json index file?"):
|
||||
https://github.com/Pagefind/pagefind/issues/277
|
||||
- Pagefind issue `#574` (`ubuntu-latest` / `npx` wrapper failure, still open on
|
||||
2026-03-15):
|
||||
https://github.com/Pagefind/pagefind/issues/574
|
||||
- HN discussion for Pagefind launch:
|
||||
https://news.ycombinator.com/item?id=32290634
|
||||
- HN item API for the same discussion:
|
||||
https://hn.algolia.com/api/v1/items/32290634
|
||||
- Zach Leatherman's `pagefind-search` web component:
|
||||
https://github.com/zachleat/pagefind-search
|
||||
Loading…
Add table
Add a link
Reference in a new issue