Minwin is a social platform where people share visual content — think posts with images, captions, and hashtags. The core problem: when a user opens the app, what should they see? Not just what’s new, but what’s relevant to them.
I built a recommendation engine that uses vector embeddings, pgvector, and a multi-stage feed algorithm to deliver personalized content. Here’s how it works end to end.
The embedding pipeline
Every post that gets uploaded goes through a three-step pipeline:
- Image processing — Sharp resizes the image to 1080×1350 for posts and generates a 300×300 thumbnail, stored in Cloudflare R2.
- Vision analysis — Groq’s Llama 4 Scout model analyzes the image and generates semantic tags: product categories, styles, occasions, demographics, color palettes, aesthetics. Up to 40 tags per post.
- Embedding generation — The tags, caption, and hashtags are concatenated and fed into OpenAI’s text-embedding-3-small model, producing a 512-dimensional vector. This gets stored in a media_embedding table using pgvector.
I chose 512 dimensions instead of the default 1536 — the similarity quality is nearly identical for this use case and the searches are noticeably faster.
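The third step can be sketched as a small helper. The names here (`PostContent`, `buildEmbeddingInput`) are hypothetical, and the OpenAI call in the comment shows the shape of the request only:

```typescript
// Hypothetical sketch of step 3: build one text blob from the
// vision tags, caption, and hashtags, which then goes to the
// embedding model.

interface PostContent {
  caption: string;
  hashtags: string[];
  visionTags: string[]; // up to 40 tags from the vision model
}

// Join everything into a single input string for the embedding model.
function buildEmbeddingInput(post: PostContent): string {
  return [
    post.visionTags.join(", "),
    post.caption,
    post.hashtags.map((h) => `#${h}`).join(" "),
  ]
    .filter((part) => part.length > 0)
    .join("\n");
}

// The actual call would look roughly like this with the OpenAI Node SDK:
//
//   const res = await openai.embeddings.create({
//     model: "text-embedding-3-small",
//     input: buildEmbeddingInput(post),
//     dimensions: 512, // down from the default 1536
//   });
//   const vector = res.data[0].embedding;
```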
User profiles as vectors
When a user likes, saves, comments on, or clicks a product in a post, it triggers a background job (BullMQ) that recomputes their user embedding.
The computation:
- Fetch the last 90 days of interactions
- Apply weights: saves (2.0×) > comments (1.5×) > likes (1.0×) = clicks (1.0×)
- Apply recency decay with a 30-day half-life: decay = 0.5^(age / halfLife)
- Retrieve the media embeddings for all interacted posts
- Compute a weighted average vector
- L2 normalize the result for cosine similarity
The output is a single 512-dimensional vector that represents what this user is interested in right now. It’s debounced to compute at most once every 30 minutes per user.
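The steps above can be sketched as a pure function (types and names are illustrative, not the production code):

```typescript
// Hypothetical sketch of the user-embedding job: weight each
// interaction by type, decay by age with a 30-day half-life,
// compute a weighted average of the media vectors, and
// L2-normalize the result for cosine similarity.

type Vec = number[];

interface Interaction {
  kind: "save" | "comment" | "like" | "click";
  ageDays: number; // how long ago the interaction happened
  embedding: Vec;  // 512-dim media embedding of the post
}

const WEIGHTS = { save: 2.0, comment: 1.5, like: 1.0, click: 1.0 };
const HALF_LIFE_DAYS = 30;

function computeUserEmbedding(interactions: Interaction[]): Vec {
  const dim = interactions[0].embedding.length;
  const sum = new Array<number>(dim).fill(0);
  let totalWeight = 0;

  for (const it of interactions) {
    // decay = 0.5^(age / halfLife)
    const decay = Math.pow(0.5, it.ageDays / HALF_LIFE_DAYS);
    const w = WEIGHTS[it.kind] * decay;
    totalWeight += w;
    for (let i = 0; i < dim; i++) sum[i] += w * it.embedding[i];
  }

  // Weighted average, then L2-normalize to unit length.
  const avg = sum.map((x) => x / totalWeight);
  const norm = Math.hypot(...avg) || 1;
  return avg.map((x) => x / norm);
}
```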
The feed algorithm
The personalized feed is a three-phase pipeline with cursor-based pagination:
Phase 1: Followed creators. Posts from people the user follows, sorted by recency. Every 5 posts, a trending post gets interleaved to add variety.
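The interleaving in Phase 1 is simple enough to sketch directly (a generic helper, assuming both lists are already sorted):

```typescript
// Hypothetical sketch of Phase 1 interleaving: after every 5
// followed-creator posts, splice in one trending post for variety.

function interleaveTrending<T>(followed: T[], trending: T[], every = 5): T[] {
  const out: T[] = [];
  let t = 0;
  for (let i = 0; i < followed.length; i++) {
    out.push(followed[i]);
    // After each full group of `every` followed posts, add a trending one.
    if ((i + 1) % every === 0 && t < trending.length) {
      out.push(trending[t++]);
    }
  }
  return out;
}
```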
Phase 2: Discovery. This is where the embeddings shine. A pgvector cosine similarity query finds the closest content:
SELECT m.id FROM media_embedding me
JOIN media m ON m.id = me."mediaId"
WHERE m.purpose = 'post'
AND m.status IN ('uploaded', 'processing', 'processed')
AND m."userId" != ALL($1::text[])
ORDER BY me.embedding <=> $2::vector ASC
LIMIT $3
The <=> operator is pgvector’s cosine distance — lower means more similar. I fetch 3× the needed limit to account for filtering, then apply a bloom filter to exclude posts the user has already seen.
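For intuition, cosine distance is just 1 minus cosine similarity. A sketch of the math `<=>` performs:

```typescript
// What pgvector's <=> operator computes: cosine distance,
// defined as 1 - cosine similarity.
// 0 = same direction, 1 = orthogonal, 2 = opposite.

function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```

Because the user embedding is already L2-normalized, ordering by cosine distance is equivalent to ordering by dot product here.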
Phase 3: Chronological fallback. When follow and discovery are exhausted, fall back to newest-first ordering. The user always has something to scroll.
Bloom filter for deduplication
A Redis bloom filter (capacity 100K, 1% error rate, 30-day TTL) tracks which posts each user has seen. Before returning any result, I check BF.MEXISTS and after displaying, mark with BF.MADD. This prevents the same post from appearing twice across sessions without maintaining expensive per-user seen lists.
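The dedup step splits into a Redis round-trip and a pure filter. A sketch, where the Redis calls in the comment are shown for shape only and `filterUnseen` is the testable part:

```typescript
// Hypothetical sketch of bloom-filter dedup: BF.MEXISTS returns one
// 0/1 flag per candidate id; keep only candidates whose flag is 0.

function filterUnseen<T>(candidates: T[], seenFlags: number[]): T[] {
  return candidates.filter((_, i) => seenFlags[i] === 0);
}

// With a RedisBloom-capable client (e.g. node-redis), roughly:
//
//   const flags = await redis.sendCommand(["BF.MEXISTS", key, ...ids]);
//   const unseen = filterUnseen(candidates, flags);
//   // ...after the page is returned to the client:
//   await redis.sendCommand(["BF.MADD", key, ...shownIds]);
```

Note the 1% error rate means roughly 1 in 100 unseen posts is wrongly treated as seen, which is an acceptable trade for never showing a duplicate.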
Trending computation
A BullMQ job runs hourly and computes trending scores across three windows (24h, 7d, 30d):
rawScore = (w.likes × likes) + (w.comments × comments)
+ (w.saves × saves) + (w.clicks × clicks)
ageDecay = max(0.5, 1 - (ageInHours / windowHours) × 0.5)
score = rawScore × ageDecay
Scores go into Redis sorted sets. Weights are configurable from a database table so I can tune without deploying.
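The scoring formula above, as a pure function (the weights here are illustrative; in the real system they come from the database table):

```typescript
// Sketch of the hourly trending score for one window.

interface Counts { likes: number; comments: number; saves: number; clicks: number }
interface Weights { likes: number; comments: number; saves: number; clicks: number }

function trendingScore(
  c: Counts,
  w: Weights,
  ageInHours: number,
  windowHours: number,
): number {
  const rawScore =
    w.likes * c.likes + w.comments * c.comments +
    w.saves * c.saves + w.clicks * c.clicks;
  // Linear decay from 1.0 down to a floor of 0.5 at the window edge,
  // so old-but-popular posts are dampened, never zeroed out.
  const ageDecay = Math.max(0.5, 1 - (ageInHours / windowHours) * 0.5);
  return rawScore * ageDecay;
}
```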
Why pgvector and not a dedicated vector DB
I evaluated Pinecone and Weaviate early on. Both are excellent, but for a mid-scale app:
- pgvector lives inside the same Postgres I already run — zero additional infrastructure
- Joins between vectors and relational data (moderation status, user blocks, post status) happen in a single query
- At the current scale, similarity search over the dataset is fast enough with the pgvector index
- One fewer service to monitor, backup, and pay for
If the dataset grows past a few million embeddings and query latency becomes an issue, I can move to a dedicated vector DB. For now, pgvector works fine.
What I’d improve
The user embedding is a single vector averaging all interests — a user who likes both photography and cooking gets a blended vector that’s great at neither. Cluster-based profiles (maintaining 3-5 interest vectors per user) would improve discovery precision.
The vision tagging could also be more structured — right now it’s freeform tags from the LLM, which sometimes drift. A fixed taxonomy with the LLM classifying into predefined categories would give more consistent embeddings.
But honestly, the current system works well for the scale we’re at. The blend of follow-based content, vector similarity, trending interleaving, and bloom filter deduplication produces a feed that feels relevant without being a filter bubble.