Custom Recommendation Engine Development: Build for the Data You Have

Apr 22, 2026 · 8 min read

Most recommendation engine briefs copy Amazon, which needs interaction data you do not have. The cold-start problem, not the algorithm, is the real constraint. Here is a staged build that fits the data you have today.

Almost every custom recommendation engine development brief starts by describing Amazon. The buyer wants “customers who bought this also bought that,” the kind of collaborative filtering that powers the big platforms. The problem is that those systems run on billions of interactions, and most businesses have thousands. The first real question is not which algorithm to use. It is whether you have enough behavioural data to feed any of them, and what to build while you do not.

Why most recommendation engine briefs copy a pattern they cannot feed

Collaborative filtering, the technique behind the famous recommendation systems, learns from the way many users behave across many items. It needs volume and density: lots of users, lots of overlap in what they touch. A retailer with 5,000 monthly visitors and 800 products has neither. Feed that algorithm sparse data and it recommends bestsellers to everyone, which is not a recommendation engine. It is a popularity chart.
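To make "density" concrete, here is a minimal sketch that computes how much of the user-item matrix is actually filled. The visitor and product counts come from the retailer example above; the 15,000 distinct user-item interactions (about three products per visitor) is an assumption chosen purely for illustration, and `interaction_density` is a hypothetical helper name.

```python
def interaction_density(n_users: int, n_items: int, n_interactions: int) -> float:
    """Share of user-item cells that hold at least one logged interaction.

    Assumes n_interactions counts distinct (user, item) pairs.
    """
    cells = n_users * n_items
    if cells == 0:
        raise ValueError("need at least one user and one item")
    return n_interactions / cells


# 5,000 monthly visitors and 800 products, as in the example above.
# 15,000 distinct interactions is an illustrative assumption, not real data.
density = interaction_density(5_000, 800, 15_000)
print(f"{density:.4%} of the matrix is filled")
```

Under these assumed numbers well under one percent of the matrix is populated, so most pairs of users share no items at all, and that overlap is exactly what collaborative filtering needs to learn from.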

The constraint that actually decides your build is the cold-start problem: new users with no history, new products with no interactions, and a catalogue that turns over faster than the data accumulates. This is where most projects to build a recommendation system quietly stall. The model was never the hard part. The hard part is producing useful suggestions before you have the behavioural data that makes the textbook approach work.

Before scoping an engine, pin down one number: how many interactions per active item per month do you log? If the answer is under a few hundred, collaborative filtering will likely underperform a good rules-and-content approach. The data volume, not the ambition, sets the starting architecture.
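One way to get that number from an event log is sketched below. It assumes each logged interaction can be reduced to an `(item_id, date)` pair, treats an item as "active" in a month if it logged at least one interaction, and reports the median rather than the mean so a handful of bestsellers does not flatter the result. The function name and the sample events are hypothetical.

```python
from collections import Counter
from datetime import date
from statistics import median


def monthly_interactions_per_item(events):
    """Return {(year, month): median interactions per active item}.

    events: iterable of (item_id, date) pairs, one per logged interaction.
    An item is active in a month if it logged at least one interaction.
    """
    counts = Counter((d.year, d.month, item) for item, d in events)
    per_month = {}
    for (year, month, _item), n in counts.items():
        per_month.setdefault((year, month), []).append(n)
    return {ym: median(ns) for ym, ns in per_month.items()}


# Hypothetical sample log, for illustration only.
events = [
    ("mug", date(2026, 3, 1)),
    ("mug", date(2026, 3, 2)),
    ("mug", date(2026, 3, 9)),
    ("lamp", date(2026, 3, 4)),
    ("lamp", date(2026, 4, 1)),
]
per_item = monthly_interactions_per_item(events)
```

Run against a real log, the monthly values show directly whether the catalogue sits above or below the "few hundred" threshold described above.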

How a custom recommendation system grows with your data

The right design is staged. You start with what works on thin data and add sophistication only as the interactions accumulate to support it.

  • Stage one: rules and content similarity. Recommend by attributes you already know: category, price band, tags, description text. This needs zero interaction history and handles new products on day one. For most small catalogues it is 70% of the value.
  • Stage two: lightweight behavioural signals. Layer in co-views and co-purchases as they accumulate. You do not need billions of rows for “people who viewed this also viewed” to beat a static list, just enough density on your top items.
  • Stage three: a learned model, when earned. Once the data is dense enough to evaluate honestly, a trained model can lift results further. Whether you reach this stage at all is a data question, not a status symbol.
  • Throughout: a measurable objective. Decide upfront what the engine optimises (click-through, basket size, or retention) and measure against a control. An engine without a metric is a black box nobody can defend.
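The first two stages can be sketched in a few lines. This is an illustration under stated assumptions, not a production design: items are described by a set of tags, content similarity is plain Jaccard overlap, the behavioural signal is a same-session co-view count, and a pair only counts as behavioural evidence once it has been seen together `min_co_views` times. All names, the threshold, and the sample catalogue are hypothetical.

```python
from collections import Counter
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Overlap between two tag sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def co_view_counts(sessions):
    """Count how often two items appear in the same session."""
    counts = {}
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def recommend(item, catalogue, co_views, k=3, min_co_views=2):
    """Rank other items: behavioural signal first, content similarity as fallback."""
    scored = []
    for other, attrs in catalogue.items():
        if other == item:
            continue
        seen_together = co_views.get(item, Counter())[other]
        # Stage two only applies once a pair has enough evidence behind it.
        behaviour = seen_together if seen_together >= min_co_views else 0
        # Stage one is always available, even for brand-new products.
        content = jaccard(catalogue[item], attrs)
        scored.append((behaviour, content, other))
    scored.sort(key=lambda t: (-t[0], -t[1], t[2]))
    return [other for _, _, other in scored[:k]]


# Hypothetical catalogue and sessions, for illustration only.
catalogue = {
    "oak-desk": {"furniture", "office", "wood"},
    "oak-shelf": {"furniture", "wood", "storage"},
    "desk-lamp": {"lighting", "office"},
    "steel-mug": {"kitchen", "steel"},
}
sessions = [
    ["oak-desk", "desk-lamp"],
    ["oak-desk", "desk-lamp"],
    ["oak-desk", "steel-mug"],
]
co_views = co_view_counts(sessions)
for_desk = recommend("oak-desk", catalogue, co_views, k=2)
for_new_item = recommend("oak-shelf", catalogue, co_views, k=1)
```

Here the desk gets the lamp first because the pair has been co-viewed twice, then the shelf on tag overlap. The shelf has no sessions at all, so it falls back entirely to content similarity, which is the day-one behaviour stage one is meant to provide.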

An AI recommendation engine is judged on lift, not cleverness

The only honest measure of an AI recommendation engine is whether it beats the simple baseline you would have shipped anyway. That baseline is usually “show the bestsellers in this category.” A surprising number of expensive engines fail to beat it, because the data could not support the technique the vendor sold. So the first thing a good partner builds is the measurement: a control group seeing the baseline, a test group seeing the engine, and a metric that maps to revenue.
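A minimal sketch of that measurement follows. It assumes the metric is a conversion rate, that users were split randomly between control and test, and it uses a pooled two-proportion z-test as one common way to check whether the difference is more than noise. The function name and the sample counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def measure_lift(control_conv, control_n, test_conv, test_n):
    """Return (relative lift, z statistic, two-sided p-value).

    control_*: the group seeing the bestsellers baseline.
    test_*: the group seeing the engine.
    """
    p_control = control_conv / control_n
    p_test = test_conv / test_n
    if p_control == 0:
        raise ValueError("baseline conversion rate is zero; lift is undefined")
    lift = (p_test - p_control) / p_control
    # Pooled standard error under the null hypothesis of no difference.
    pooled = (control_conv + test_conv) / (control_n + test_n)
    se = sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / test_n))
    z = (p_test - p_control) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return lift, z, p_value


# Hypothetical counts: 200 of 10,000 convert on the baseline, 240 of 10,000 on the engine.
lift, z, p_value = measure_lift(200, 10_000, 240, 10_000)
print(f"lift={lift:.1%}  z={z:.2f}  p={p_value:.3f}")
```

With these assumed counts the engine shows a 20% relative lift, yet the z statistic of about 1.93 falls just short of the conventional 1.96 threshold. A headline lift figure on its own is not enough: the test needs sufficient traffic before the result can be defended.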

This is also where the question of whether you need a trained model at all gets decided honestly. As covered in choosing a partner to build custom machine learning models, a large share of “we need a model” briefs are better served by retrieval and rules than by training anything. Recommendations are a textbook case: the trained model is stage three, and many businesses get their full return at stage one. You can see how we scope builds to the data that exists rather than the data a buyer wishes they had.

The strategic point is simple. A recommendation engine creates value the same way any AI does, by expanding capacity or sharpening a decision, and it only does that if the suggestions are good enough to change behaviour. Cleverness that does not move the metric is cost.

When a recommendation engine is the wrong project

Several situations mean the engine is not your highest-return build.

  • Your catalogue is tiny. If a customer can see your whole range in one scroll, recommendations add little. Good search and clear categories do more.
  • Discovery is not your bottleneck. If customers know exactly what they want and the friction is checkout or delivery, fixing those returns more than a smarter suggestion ever will.
  • You cannot measure the result. Without the analytics to run a controlled test, you will never know if the engine works, and an unmeasurable engine is impossible to improve or defend.

Key takeaways

  • Most briefs copy Amazon-style collaborative filtering, which needs interaction volume most businesses do not have.
  • The cold-start problem, not the algorithm choice, is the real constraint on a custom recommendation system.
  • Build in stages: rules and content first, behavioural signals as they accumulate, a trained model only when the data earns it.
  • Judge the engine on measured lift over a bestsellers baseline, not on how sophisticated the technique sounds.
  • Skip it if your catalogue is tiny, discovery is not the bottleneck, or you cannot run a controlled test.

Good recommendation engine development starts from the data you log today, not the platform you admire. The audit is where you find out which stage your data can actually support and whether the engine is even your best first build. gamgi runs a two-week diagnostic that ends with a scoped project you own, sized to your real numbers. How many interactions per item do you log in a month?