WaveSearch Labs: Indexing, Search, and Recommendations

This document explains how product data is indexed, how the search and recommendations APIs work, and how the demo platform is composed and operated.

Demo-only · AKS dev overlay · In-cluster Kaniko builds · Zero-trust tokens via STS

Architecture (runtime)

Control Plane

  • wave-sts: issues and validates audience-scoped JWTs.
  • wavesearch-frontend: operator UI for ingest, query tests, merchandising rules, and analytics.

Commerce Plane

  • wavestore-erp-api: source of truth for products, stock, pricing, orders, and export catalog.
  • wavestore-frontend: shopper UI with basket, checkout, and account modules, plus ERP order placement.

Search Plane

  • wavesearch-api: indexing runtime, query API, recommendation API, events pipeline, and admin controls.
  • Ingress (nginx): host/path routing through one shared public IP.

Modular integration boundaries

Storefront modules

  • Basket module: local basket state, quantity controls, totals, and checkout payload construction.
  • Checkout module: calls POST /v2/checkout and persists placed orders to account history.
  • Account module: sign-in via STS and order history via GET /v2/account/orders.
  • Promotions module: loads ERP offers via GET /v2/offers; clicking a banner applies the offer's query, category, and productIds, then runs a storefront search.

ERP modules

  • Order module: POST /erp/orders validates the items, prices them from ERP pricing state, decrements stock, and creates an invoice.
  • Catalog module: /erp/products, /erp/stock, /erp/pricing, /erp/offers, /erp/export/catalog.
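
The order-module steps can be sketched as follows. This is a minimal illustration, not the ERP implementation: the `place_order` helper, the `Invoice` type, and the in-memory `pricing` and `stock` dictionaries are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    lines: list  # (product_id, quantity, unit_price) tuples
    total: float


def place_order(items, pricing, stock):
    """Validate items, price them from ERP state, decrement stock, build an invoice.

    `items` maps product id -> requested quantity (an assumed request shape).
    """
    # Validate every line first so a bad line never leaves stock half-decremented.
    for product_id, qty in items.items():
        if qty <= 0:
            raise ValueError(f"invalid quantity for {product_id}")
        if product_id not in pricing:
            raise ValueError(f"unknown product {product_id}")
        if stock.get(product_id, 0) < qty:
            raise ValueError(f"insufficient stock for {product_id}")

    lines = []
    for product_id, qty in items.items():
        unit_price = pricing[product_id]  # price comes from ERP state, not the client
        stock[product_id] -= qty
        lines.append((product_id, qty, unit_price))

    total = round(sum(qty * price for _, qty, price in lines), 2)
    return Invoice(lines=lines, total=total)
```

Pricing from server-side state rather than from the checkout payload means a tampered basket cannot change what the shopper is charged.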

Search modules

  • Ingest module: /search/ingest/from-erp materializes catalog snapshots into runtime indexes.
  • Query module: /search/query handles retrieval + facets.
  • Merchandising module: /search/admin/rules applies boost/bury/pin controls at ranking time.

How indexing works

Search and recommendations APIs

Search

  • POST /search/query with query text, optional filters, and page size.
  • Requires search.query scope.
  • Returns ranked products with facets and metadata from indexed catalog fields.
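
A search call might be constructed as in the sketch below. Only the path and the search.query scope requirement come from this document; the JSON field names (`query`, `filters`, `pageSize`), the Bearer-token header, and the `build_search_request` helper are assumptions made for illustration.

```python
import json
import urllib.request


def build_search_request(base_url, token, query, filters=None, page_size=20):
    """Build (but do not send) a POST /search/query request."""
    body = {"query": query, "pageSize": page_size}  # field names are assumptions
    if filters:
        body["filters"] = filters
    return urllib.request.Request(
        url=f"{base_url}/search/query",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Assumes the STS-issued token (with search.query scope) is sent as a Bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to send it. A recommendations call to POST /search/recommend would presumably follow the same pattern with productId and visitorId in the body.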

Recommendations

  • POST /search/recommend with productId, visitorId, and page size.
  • Uses catalog relationships and behavior signals to produce related products.
  • Also requires search.query scope.

Operational APIs

  • POST /search/events for click/search telemetry ingestion.
  • GET /search/admin/analytics for click-through and usage summaries.
  • POST /search/admin/rules for boost/bury/pin controls.

How results are generated (user-level view)

How results are generated (implementation-level deep dive)

Search ranking pipeline

  1. Candidate generation: tokenize query; retrieve candidate products from in-memory runtime indexes.
  2. Base scoring: compute relevance score from term overlap / backend rank.
  3. Merchandising adjustments: apply boost/bury/pin actions from admin rules.
  4. Inventory adjustment: penalize out-of-stock items and give a slight uplift to items with healthy stock.
  5. Filter + sort: apply category/brand/availability/price/stock filters and finalize ranking.
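
The five stages above can be sketched end to end in a few lines. This is a toy model under stated assumptions: the `Product` type, the `rank` function, the rule format, and every numeric weight (the 0.6 out-of-stock penalty, the 0.05 uplift, the stock threshold of 10) are hypothetical and are not taken from WaveSearch.

```python
import re
from dataclasses import dataclass


@dataclass
class Product:
    id: str
    title: str
    category: str
    stock: int


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, products, rules=None, category=None):
    """Return product ids in ranked order.

    `rules` maps product id -> (action, weight), action in {"boost", "bury", "pin"}.
    """
    rules = rules or {}
    terms = set(tokenize(query))
    scored = []
    for p in products:
        # 1. Candidate generation: keep products sharing at least one query term.
        overlap = terms & set(tokenize(p.title))
        if not overlap:
            continue
        # 2. Base scoring: fraction of query terms matched.
        score = len(overlap) / len(terms)
        # 3. Merchandising adjustments from admin rules.
        action, weight = rules.get(p.id, (None, 0.0))
        if action == "boost":
            score += weight
        elif action == "bury":
            score -= weight
        # 4. Inventory adjustment: penalize out-of-stock, nudge healthy stock up.
        if p.stock == 0:
            score -= 0.6
        elif p.stock >= 10:
            score += 0.05
        scored.append((action == "pin", score, p))
    # 5. Filter + sort: pinned items first, then by descending score, then by id.
    if category is not None:
        scored = [s for s in scored if s[2].category == category]
    scored.sort(key=lambda s: (not s[0], -s[1], s[2].id))
    return [p.id for _, _, p in scored]
```

Treating a pin as a sort key rather than a large score bonus keeps pinned items on top regardless of how the other weights are tuned.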

Facets and metadata return shape

  • Facets are computed from the final result set using indexed fields (category, brand, availability counters).
  • Result documents include product metadata copied from indexed catalog snapshots, with current stock and pricing context overlaid.
  • This enables one response to populate cards, filters, and status badges in the storefront.
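
Facet counting over the final result set can be illustrated as below. The `compute_facets` helper and the dictionary-shaped result documents are assumptions for the example; the actual response shape is not specified in this document.

```python
from collections import Counter


def compute_facets(results, fields=("category", "brand", "availability")):
    """Count facet values over the final result set (a list of dicts)."""
    facets = {field: Counter() for field in fields}
    for doc in results:
        for field in fields:
            value = doc.get(field)
            if value is not None:  # skip documents missing the indexed field
                facets[field][value] += 1
    return {field: dict(counts) for field, counts in facets.items()}
```

Because the counts are taken after filtering, the numbers shown next to each filter option always match what the shopper sees in the result list.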

Recommendations pipeline

  • Input context: productId and optional visitor context.
  • Candidate generation favors related categories/brands and known similarity relationships.
  • Scoring reuses merchandising + availability controls so recommendations respect boost/bury and stock posture.
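
A rough sketch of this flow follows. It is deliberately simplified: visitor behavior signals are not modeled, and the `recommend` function, the catalog shape, the `similar` map, and all weights are hypothetical choices made for the example.

```python
def recommend(product_id, catalog, similar=None, rules=None, page_size=5):
    """Return related product ids for a seed product.

    `catalog` maps id -> {"category", "brand", "stock"}.
    `similar` maps id -> set of ids with a known similarity relationship.
    `rules` maps id -> score delta (positive boosts, negative buries).
    """
    similar = similar or {}
    rules = rules or {}
    seed = catalog[product_id]
    scored = []
    for pid, item in catalog.items():
        if pid == product_id:
            continue
        # Candidate generation: shared category, shared brand, or known similarity.
        score = 0.0
        if item["category"] == seed["category"]:
            score += 1.0
        if item["brand"] == seed["brand"]:
            score += 0.5
        if pid in similar.get(product_id, ()):
            score += 2.0
        if score == 0.0:
            continue  # unrelated product, not a candidate
        # Reuse merchandising and availability controls from search ranking.
        score += rules.get(pid, 0.0)
        if item["stock"] == 0:
            score -= 1.0
        scored.append((score, pid))
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [pid for _, pid in scored[:page_size]]
```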

Clickstream ingestion and usage

  1. Storefront posts events to POST /search/events (search, click, view, etc.).
  2. WaveSearch appends events to the tenant/partition append-log for durable replay and auditability.
  3. In-memory counters/analytics are updated and surfaced via GET /search/admin/analytics.
  4. Signals are used for operator decisions (boost/bury tuning, no-result diagnosis, promo effectiveness checks).
  5. On rebuild/recovery, durable snapshots + append logs can reconstruct state and analytics context.
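
The append-then-count flow and the replay on recovery can be sketched as below. This assumes a single JSON-lines file and an event `type` field; the `EventLog` class and its methods are hypothetical, and tenant/partition handling and snapshots are left out.

```python
import json
from collections import Counter


class EventLog:
    """Append-only JSON-lines log with in-memory counters (illustrative only)."""

    def __init__(self, path):
        self.path = path
        self.counters = Counter()

    def append(self, event):
        # Write to the durable log first, then update the in-memory counters.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")
        self._apply(event)

    def _apply(self, event):
        self.counters[event["type"]] += 1

    def replay(self):
        """Rebuild counters from the log, for example after a restart."""
        self.counters.clear()
        try:
            with open(self.path, encoding="utf-8") as f:
                for line in f:
                    if line.strip():
                        self._apply(json.loads(line))
        except FileNotFoundError:
            pass  # nothing logged yet

    def click_through_rate(self):
        searches = self.counters["search"]
        return self.counters["click"] / searches if searches else 0.0
```

Since the counters are derived entirely from the log, a fresh process that calls `replay()` arrives at the same analytics as the one that originally ingested the events.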

Key design assumptions

Prompt to rebuild the platform from scratch

Use this prompt with Copilot/agent mode to recreate the stack end-to-end: