Contents, 14 sections
- 01 How this workbook was built
- 02 Dataset at a glance
- 03 Data source
- 04 Cell grid and sampling
- 05 Cleaning and aggregation
- 06 The "diverges" finding
- 07 Per-category multipliers
- 08 The Demand Read
- 09 Quality safeguards
- 10 Two findings worth citing
- 11 What this is NOT
- 12 A walked-through lookup
- 13 References
- 14 How to cite this
Research methodology, v3
How this workbook was built.
Population-level lookup tables for resellers, answering: at a given BSR with a given number of 30-day rank drops, what did similar badged products in this category actually sell? Built from a fresh quarterly sample of live Keepa data, then validated through six layers of quality control before any number reaches the workbook. v3 adds the Demand Read (§8), a synthesis layer over the same validated sample that triangulates rank velocity and review velocity to size demand where the badge floor censors it.
Methodology revision:
US collection window: April 2026
CA collection window: April to May 2026
Refresh cadence: quarterly
The dataset at a glance.
Headline numbers, after the full cleaning and aggregation pipeline. Every figure traces back to a verifiable count in the workbook itself.
| | US (Amazon.com) | Canada (Amazon.ca) | Combined |
|---|---|---|---|
| Unique products sampled | 187,782 ASINs | 74,502 ASINs | 254,407 (some overlap) |
| Sample observations | 215,306 rows | 91,821 rows | 307,127 |
| Category sheets in workbook | 231 | 140 | 371 |
| Categories with sales-per-drop | 259 | 182 | 441 |
| Cell rows aggregated | 6,541 | 4,345 | 10,886 |
| Data collection window | April 2026 | April to May 2026 | · |
Data source.
Every observation comes from Amazon's product catalog via the Keepa API, the same upstream source used by most major reseller research tools. We never scrape Amazon directly.
3a. Fields used.
Keepa supplies the following on a per-ASIN basis:
| Field | What it is |
|---|---|
| monthlySold | Amazon's "Bought in past month" badge value, variation-specific. Sample requires a non-null badge. |
| drops30 | Count of significant sales-rank drops over the trailing 30 days, computed from the parent ASIN's BSR history. |
| reviewsAdded30 | Count of new reviews in the trailing 30 days. Review-velocity input, not cumulative review count. |
| categoryTree | Keepa's category breadcrumb path. Depth 0 is root, depth 1+ is sub. |
3b. The monthlySold badge.
Amazon's "X bought in past month" badge is reported in bucket-aligned tiers (50, 100, 200, 300, ..., 1,000, 2,000, ..., 100,000+). We use the badge value as-is. We do not synthesize sales from BSR.
A widely cited critique from r/FulfillmentByAmazon[1]: "all sales rank based sales estimators are inherently inaccurate to the same degree. Think of it this way; you can't measure the average speed of your car just by looking at the speedometer once."[2]
We agree. Rank is a relative position, not a sales count, so we sampled actual sales-revealing badge values rather than inferring sales from rank.
3c. The badge floor.
The badge is shown only for products selling 50 per month or more. Roughly 60% of the Amazon marketplace is unbadged and structurally invisible to this sample. Cells dominated by the badge floor get explicit two-tier flags (see §5d).
Through v2 the guidance for hard-floor cells was "rule out, do not decide." v3 supersedes that: hard-floor cells are now exactly where the Demand Read (§8) earns its keep. A product hiding behind the badge floor still leaves two footprints the badge cannot censor, its rank drops and its added reviews, and the Demand Read sizes the slot from those instead of the censored median.
3d. Variation handling.
Amazon's variation-parent listings aggregate child-variant sales into a single monthlySold value while reporting only the parent's own rank drops. A parent with 80,000 monthly sold and 50 drops can show a per-drop ratio two orders of magnitude above the category typical. We detect and remove suspect rollup samples in stage 2 (see §5b).
The cell grid and sampling.
Each category is sliced into a fixed grid of 28 cells: 9 BSR bands, each split into 2 to 4 drops bands. Every reseller question reduces to: which cell does my product land in?
4a. BSR bands.
Nine bands covering ranks 1 to 300,000. Ranks above 300K are outside scope.
| BSR band | Drops bands (nested) |
|---|---|
| 1-100 | 0-15, 16+ |
| 101-500 | 0-15, 16-50, 51+ |
| 501-2K | 0-15, 16-50, 51+ |
| 2K-5K | 0-15, 16-40, 41-100, 101+ |
| 5K-10K | 0-15, 16-40, 41-100, 101+ |
| 10K-20K | 0-15, 16-40, 41-100, 101+ |
| 20K-50K | 0-15, 16-50, 51+ |
| 50K-100K | 0-15, 16-50, 51+ |
| 100K-300K | 0-15, 16+ |
Mid-BSR rows use four drops bands (0-15, 16-40, 41-100, 101+) because there is more drop variance to resolve in that range. Head and tail BSRs use coarser splits.
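The lookup from a product to its cell can be sketched as below. This is a minimal illustration, not the workbook's own code: the names `cell_key`, `BSR_EDGES`, and `DROPS_EDGES` are hypothetical, and it assumes each band's upper edge is inclusive (so rank 2,000 falls in 501-2K), which the table does not state.

```python
from bisect import bisect_left

# Inclusive upper edges of the nine BSR bands (ASSUMPTION: edges are inclusive).
BSR_EDGES = [100, 500, 2_000, 5_000, 10_000, 20_000, 50_000, 100_000, 300_000]
BSR_LABELS = ["1-100", "101-500", "501-2K", "2K-5K", "5K-10K",
              "10K-20K", "20K-50K", "50K-100K", "100K-300K"]

# Inclusive upper edges of the closed drops bands; the last band is open-ended.
DROPS_EDGES = {
    "1-100": [15],
    "101-500": [15, 50],
    "501-2K": [15, 50],
    "2K-5K": [15, 40, 100],
    "5K-10K": [15, 40, 100],
    "10K-20K": [15, 40, 100],
    "20K-50K": [15, 50],
    "50K-100K": [15, 50],
    "100K-300K": [15],
}


def drops_label(edges, drops):
    """Name the drops band that contains `drops` for one BSR band."""
    i = bisect_left(edges, drops)          # index of the first edge >= drops
    low = 0 if i == 0 else edges[i - 1] + 1
    return f"{low}+" if i == len(edges) else f"{low}-{edges[i]}"


def cell_key(bsr, drops):
    """Return (bsr_band, drops_band), or None when the product is out of scope."""
    if bsr < 1 or bsr > BSR_EDGES[-1] or drops < 0:
        return None
    band = BSR_LABELS[bisect_left(BSR_EDGES, bsr)]
    return band, drops_label(DROPS_EDGES[band], drops)
```

For example, a product at BSR 7,500 with 36 drops lands in the `("5K-10K", "16-40")` cell, and summing the bands in `DROPS_EDGES` recovers the 28 cells of the grid.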
4b. Sampling.
For each category in Keepa's depth-≤2 taxonomy (775 categories in the US tree, 696 in CA), we draw a stratified sample across the cell grid. The sampler:
- Targets ~50 products per cell where possible.
- Walks the full BSR by drops grid per category, so a category can contribute up to ~1,400 ASINs.
- Pulls only badged products: products where Keepa reports a non-null monthlySold value. This is the single largest methodological constraint and we surface it on every sheet. Roughly 60% of Amazon products are unbadged and structurally invisible to us.
- Reaches depth 2 in the category tree so subcategory data is captured, not just root level.
After sampling: 307,127 observation rows from 254,407 unique ASINs across both markets (some overlap between US and CA samples, since multi-marketplace ASINs can earn badges in both).
Cleaning and aggregation.
Every sample passes through six cleaning gates, then through a per-cell aggregator that trims outliers, removes suspect variation-parent rollups, computes percentiles, detects bimodality, and assigns confidence and floor flags.
5a. Six cleaning gates (stage 1).
Rows that fail any gate are dropped with a logged reason:
- ASIN present.
- Category ID present and resolvable in the cat tree.
- Cell key parses (BSR band plus drops band).
- BSR present.
- Listing age determinable (either isYoung or listedSinceDays).
- Category exists in our loaded cat tree.
This step also bins each surviving product into its cell using Keepa-aligned band boundaries.
5b. Top-5% trim plus rollup-suspect filter (stage 2).
For each (category by cell) tuple we compute percentiles, but only after two defenses.
Trim. Sort the cell's monthlySold values, drop the highest 5%, then compute p10 / p25 / p50 / p75 on what remains. The trim defends against Amazon's variation-parent rollup problem. p95 and max are computed on the untrimmed data so true outliers stay visible.
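A minimal sketch of the trim step follows. The function names are hypothetical, and two details are assumptions the text does not specify: percentiles use linear interpolation, and the number of values trimmed is rounded down.

```python
import math


def percentile(sorted_vals, q):
    """Linear-interpolation percentile on pre-sorted data (q in 0..100)."""
    pos = (len(sorted_vals) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def cell_percentiles(monthly_sold, trim_pct=5):
    """p10/p25/p50/p75 after dropping the top 5%; p95 and max on untrimmed data."""
    vals = sorted(monthly_sold)
    n_drop = len(vals) * trim_pct // 100      # ASSUMPTION: round the trim count down
    trimmed = vals[:len(vals) - n_drop]
    out = {f"p{q}": percentile(trimmed, q) for q in (10, 25, 50, 75)}
    out["p95"] = percentile(vals, 95)         # untrimmed, so outliers stay visible
    out["max"] = vals[-1]
    return out
```

With nineteen products at 100 and one 80,000 rollup in a cell, the trimmed p50 stays at 100 while `max` still reports 80,000.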
Rollup-suspect filter (k=5). Variation parents can aggregate child sales into a single monthlySold value while still reporting parent-level drops. A parent with 80,000 monthly sold and only 50 drops shows a per-drop ratio (1,600) two orders of magnitude above the category typical, which is often 3 to 10 per drop. Without filtering, a handful of these parents distort every cell they land in.
The filter:
- Per category, compute the median of monthlySold ÷ drops30 across qualifying samples (where drops30 ≥ 5 and monthlySold ≥ 100).
- Mark any sample whose ratio exceeds 5× that median as rollup-suspected.
- Drop it from the cell aggregator.
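The three steps above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline's code: `split_rollup_suspects` is a hypothetical name, samples are plain `(monthlySold, drops30)` pairs from one category, and samples with zero drops (no defined ratio) are assumed never to be flagged.

```python
from statistics import median


def split_rollup_suspects(samples, k=5.0, min_drops=5, min_sold=100):
    """Split one category's (monthly_sold, drops30) pairs into (kept, suspects)."""
    # The category baseline uses qualifying samples only.
    qualifying = [sold / drops for sold, drops in samples
                  if drops >= min_drops and sold >= min_sold]
    if not qualifying:
        return list(samples), []
    cutoff = k * median(qualifying)
    kept, suspects = [], []
    for sold, drops in samples:
        # ASSUMPTION: zero-drop samples have no ratio and are never flagged.
        is_suspect = drops > 0 and sold / drops > cutoff
        (suspects if is_suspect else kept).append((sold, drops))
    return kept, suspects
```

In a category where typical products run at 5 sales per drop, the cutoff is 25, so the 80,000-sold, 50-drop parent (ratio 1,600) is set aside while the ordinary samples pass through.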
12,074 samples dropped on US (5.6% of cleaned input), 3,667 on CA (4.0%). Affected cells get an N parents removed annotation when N is 2 or more.
5c. Bimodality detection.
Some cells contain two distinct sales populations (a slot where half the products sell ~100/mo and half sell ~1,000/mo, for instance). A simple median misleads in those cases.
The algorithm:
- Bin monthlySold values into the 29 Keepa-aligned tier buckets: 50, 100, 200, ..., 1K, 2K, ..., 100K.
- Require at least 30 total samples in the cell.
- Find local-maxima buckets with at least 3 samples and at least as many as their immediate neighbours.
- Take the two strongest peaks, order low to high.
- Reject if peaks are less than 2 bucket indices apart.
- Reject if the smallest valley count is not at least half the smaller peak count.
Cells passing every check are flagged bimodal and earn a two tiers (X vs Y) note showing the two peak bucket centres.
1,137 of 6,541 US cells flagged bimodal (17.4%). 140 of 4,345 CA cells flagged bimodal (3.2%).
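A minimal sketch of the bimodality check, with hypothetical names throughout. One assumption needs flagging loudly: the final valley rule is read here as "the lowest bucket between the two peaks must hold at most half as many samples as the smaller peak", which is the usual direction for a two-population test; the wording of that step leaves the direction open to interpretation.

```python
from bisect import bisect_right

# The 29 tier buckets: 50, then 100..900, 1K..9K, 10K..90K, and 100K.
BUCKETS = ([50] + [100 * i for i in range(1, 10)]
           + [1_000 * i for i in range(1, 10)]
           + [10_000 * i for i in range(1, 10)] + [100_000])


def detect_bimodal(monthly_sold, min_samples=30, min_peak=3, min_gap=2):
    """Return the (low, high) peak bucket values, or None if not bimodal."""
    if len(monthly_sold) < min_samples:
        return None
    counts = [0] * len(BUCKETS)
    for v in monthly_sold:
        # Largest bucket <= v; anything under 50 falls into the first bucket.
        counts[max(bisect_right(BUCKETS, v) - 1, 0)] += 1
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i + 1 < len(counts) else 0
        if c >= min_peak and c >= left and c >= right:
            peaks.append(i)
    if len(peaks) < 2:
        return None
    # Two strongest peaks by count, then ordered low to high.
    lo, hi = sorted(sorted(peaks, key=lambda i: -counts[i])[:2])
    if hi - lo < min_gap:
        return None
    valley = min(counts[lo + 1:hi])
    # ASSUMPTION: the valley must dip to at most half of the smaller peak.
    if valley > 0.5 * min(counts[lo], counts[hi]):
        return None
    return BUCKETS[lo], BUCKETS[hi]
```

A cell with twenty products at 100, twenty at 1,000, and a handful in between returns `(100, 1000)`, the pair that would appear in a "two tiers (X vs Y)" note.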
5d. Floor share, two-tier flag.
Products selling between roughly 25 and 75 per month all report as 50 (the badge minimum). A cell where many products sit at this floor has a structurally compressed distribution and a misleadingly low typical reading.
For each cell we compute floor_share = fraction of samples whose monthlySold equals exactly 50, then assign:
- Hard floor (red badge, "most barely badged"): floor_share ≥ 0.50. Most products at the badge minimum, so the true worst case is below what we can measure.
- Soft floor (amber badge, displays percentage): 0.30 ≤ floor_share < 0.50. Notable share at floor, typical reading is pulled down.
- No badge: below 0.30.
| Marketplace | Cells (total) | Hard floor | Soft floor |
|---|---|---|---|
| US | 6,541 | 2,520 (38.5%) | 1,438 (22.0%) |
| CA | 4,345 | 3,072 (70.7%) | 527 (12.1%) |
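The floor-share rule is simple enough to sketch directly. The function name `floor_flag` is hypothetical; the thresholds are the ones given above.

```python
def floor_flag(monthly_sold, floor=50):
    """Return (floor_share, flag) where flag is 'hard', 'soft', or None."""
    share = sum(1 for v in monthly_sold if v == floor) / len(monthly_sold)
    if share >= 0.50:
        return share, "hard"   # most products sit at the badge minimum
    if share >= 0.30:
        return share, "soft"   # notable share at the floor
    return share, None
```

A cell with five of ten products at exactly 50 is hard-floor, three of ten is soft-floor, and one of ten earns no flag.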
The Canadian market has a markedly higher floor concentration: over 70% of (category by cell) tuples are dominated by 50/mo products. We surface this honestly rather than smoothing it over. CA cells reading "100-500/mo" on Strong confidence are still meaningful; cells at the badge minimum tell you the slot is structurally below the measurement floor.
One vocabulary change ships with v3: in the tier map and evidence table, hard-floor cells are now labeled "50+/mo" instead of "~50/mo". The old label asserted a number we now know is a censoring artifact; the new one is open-ended on purpose. Legend wording: badge floor, true rate unresolved, size it with the Demand Read (§8), not the badge.
5e. Per-cell confidence chips.
Confidence is driven by n_unique, the unique-product count after the rollup filter:
Cells flagged Thin are excluded from tier classification entirely. They read as "No data" in the workbook rather than pretending to a number we can't support.
The "diverges" finding.
Subcategory data is the moat because subcategories often behave very differently from their parent roots.
For each cell we compute subcat_vs_root_x = cell.sold_p50 ÷ root_baseline, where root_baseline is the median across all cells in the root's subtree of their unfiltered sold_p50 values.
A cell diverges iff subcat_vs_root_x ≥ 2.0 OR ≤ 0.5.
The "Diverges" column in each workbook tallies a category's diverging cells. A high count means BSR by drops behaviour in that category drifts materially from the root baseline, so subcategory-specific data matters more there than a generic root-level estimate.
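As a minimal sketch of the divergence test (the function name and argument layout are hypothetical; the formula and thresholds are the ones defined above):

```python
from statistics import median


def divergence(cell_sold_p50, subtree_sold_p50s):
    """Return (subcat_vs_root_x, diverges) for one cell."""
    # root_baseline: median of sold_p50 across every cell in the root's subtree.
    root_baseline = median(subtree_sold_p50s)
    x = cell_sold_p50 / root_baseline
    return x, (x >= 2.0 or x <= 0.5)
```

With a root baseline of 200, a cell at 400 gives a ratio of 2.0 and diverges, while a cell at 250 gives 1.25 and does not.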
Figure: subcategories that diverge from their root, by root category.
Per-category sales multipliers.
Two cross-category multipliers live in the workbook on dedicated sheets: sales per rank drop, and sales per added review. Both are computed per category, not extrapolated from any single rule.
7a. Sales per Drop.
For each category, we compute the p25 / p50 / p75 of monthlySold ÷ drops30 across qualifying samples (drops30 ≥ 5 AND monthlySold ≥ 100). Categories with fewer than 30 qualifying samples are excluded: the ratio percentiles would be too noisy.
Survivors: 259 categories on US, 182 on CA, 441 combined.
The Sales per Drop sheet gives a single per-category multiplier ("each 30-day rank drop in this category typically represents N sales"). Spread (p75 ÷ p25) flags wide categories where the median should be used with caveat. In v3 this rate also drives the Demand Read's rank-velocity signal (§8); wide-spread categories cap that read's Signal at Medium.
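The per-category multiplier can be sketched as below. The function name is hypothetical, samples are assumed to be `(monthlySold, drops30)` pairs, and the choice of the "inclusive" quantile method is an assumption, since the text does not say which interpolation the workbook uses.

```python
from statistics import quantiles


def sales_per_drop(samples, min_drops=5, min_sold=100, min_n=30):
    """p25/p50/p75 of monthlySold / drops30 for one category, or None if too thin."""
    ratios = [sold / drops for sold, drops in samples
              if drops >= min_drops and sold >= min_sold]
    if len(ratios) < min_n:
        return None  # fewer than 30 qualifying samples: category is excluded
    # ASSUMPTION: inclusive quartile interpolation.
    p25, p50, p75 = quantiles(ratios, n=4, method="inclusive")
    return {"p25": p25, "p50": p50, "p75": p75, "spread": p75 / p25}
```

The rank-velocity read is then the product's drop count times the category's `p50`, for example 36 drops at roughly 9 sales per drop gives about 320 a month.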
7b. Sales per Review.
Same approach using monthlySold ÷ reviewsAdded30, with qualifying rule reviewsAdded30 ≥ 3 and same n ≥ 30 minimum. The per-category p50 powers the review-velocity signal in the Demand Read (§8): type in a review count, the sheet converts it to an estimated sales count using the category's measured rate and cross-checks it against rank velocity.
Categories without enough qualifying data fall back to a default 5% review rate (~20 sales per review). The irony is intentional: when the data is too thin to support a category-specific number, the workbook honestly defaults to the conventional heuristic rather than inventing a calibrated-looking one. Cells affected are still Thin-flagged.
Implied review rate per category, in percent, = 100 ÷ p50 (a p50 of 20 sales per review implies a 5% review rate). Cumulative review count is never used: it is pooled across variation families on Amazon's product page and varies wildly by category, which would mislead more than inform.
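The review-side arithmetic, including the fallback, can be sketched in a few lines. Function and constant names are hypothetical; the 20-sales-per-review fallback and the 100 ÷ p50 conversion come from the text above.

```python
FALLBACK_SALES_PER_REVIEW = 20.0  # the conventional 5% review rate


def review_velocity_estimate(reviews_added_30, category_p50=None):
    """Return (estimated monthly units, rate_is_measured)."""
    measured = category_p50 is not None
    rate = category_p50 if measured else FALLBACK_SALES_PER_REVIEW
    return reviews_added_30 * rate, measured


def implied_review_rate_pct(sales_per_review):
    """Percent of buyers who leave a review, implied by a sales-per-review rate."""
    return 100.0 / sales_per_review
```

Twelve added reviews in a category measured at 14 sales per review gives 168 units a month; with no measured rate, the same twelve reviews fall back to 240 and the read is marked as unmeasured.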
The Demand Read.
New in v3: every category sheet leads with a Demand Read, one estimated range of monthly units built by triangulating three signals (the sampled slot, rank velocity, and review velocity) over the same sample, with a Signal strength chip, a worst-case line, and the arithmetic shown. It exists to solve the one problem the raw sample could not: Amazon's sales badge is censored at 50 per month, so the busiest slots in the workbook used to read "~50/mo" exactly when the products in them were obviously selling far more.
8a. The problem: the badge floor censors the answer.
Amazon's "bought in past month" badge is the only sales figure Amazon publishes, and it has a floor. Anything selling from roughly nothing up to about 50 a month reports as "50". The badge is a fact, but it is a censored fact.
That censoring concentrates exactly where lookups matter. In a BSR-by-drops cell where most sampled products sit at the badge minimum, the cell's median collapses to 50, and the v2 workbook reported "~50/mo" for slots containing obvious top sellers. The tool lost discrimination: it could not tell a near-the-top product from a true dud, because both rendered as 50.
This is worst in Canada. 70.7% of CA cells are hard-floor (most of their sample at the badge minimum) versus 38.5% in the US (see §5d). The CA badge has very little resolution on its own.
v2 handled this honestly but passively: it flagged floor-dominated cells and told you not to decide on them. v3 does better. The same product that hides behind the badge floor still leaves two footprints the badge cannot censor: its sales rank keeps dropping every time a unit sells, and its review count keeps growing. The Demand Read uses those.
8b. The three signals.
For the product you are evaluating, you type three numbers straight off a Keepa chart: its BSR, its 30-day sales-rank drop count, and optionally its reviews added in the last 30 days. The dropdowns are gone; inputs are all numeric. From these the sheet computes three estimates of monthly units (the two velocity reads are textbook ratio estimation with an auxiliary variable[9]):
| Signal | Computed as | Badge can censor it? | Role |
|---|---|---|---|
| Sampled slot (lookup) | Median monthly sold among sampled products in the same BSR-by-drops cell of this category | Yes: in a hard-floor cell this number is an artifact of the badge, not a measurement | Anchor on clean cells, evidence elsewhere |
| Rank velocity | Drops × this category's measured sales-per-drop rate (§7a) | No: drops are counted from the rank history | The workhorse, and the only signal we can back-test (§8f) |
| Review velocity | Reviews added in 30 days × this category's measured sales-per-review rate (§7b) | No | Weaker and noisier, so it serves as the cross-check rather than the driver |
8c. How they combine: gates and a geometric average.
Each signal first passes a usability gate. The gates are deliberately simple on-or-off rules, not tuned weights:
- The lookup is ignored when its cell is hard-floor (half or more of the sample at the badge minimum), or has fewer than 5 unique products, or has no data. It counts at half strength on soft-floor cells (30 to 50% at the minimum). Otherwise it anchors the read.
- Rank velocity counts when the category has a measured sales-per-drop rate (30 or more qualifying products) and the product shows at least 5 drops.
- Review velocity counts when at least 3 reviews were added; at half strength when the category's rate is a fallback rather than measured. It is discarded entirely when the review count looks like a variation-family rollup (50 or more reviews added AND more than 5 times the category's typical review velocity). When that screen trips, the sheet says so.
The surviving signals are combined as a weighted geometric mean (an average of logarithms). Two reasons, both standard statistics rather than invention:
- Sales volumes are multiplicative quantities with long right tails. For numbers like that the geometric mean is the correct center; an ordinary average would let one high signal drag the estimate up.[6]
- Decades of forecast-combination research show that simple, equal-style weights beat cleverly tuned ones out of sample.[7][8] We cannot measure each signal's precision (there is no ground truth below the badge floor), so we do not pretend to.
The floor behavior falls out on its own, with no special-casing. In a hard-floor cell the lookup's gate is zero, so the velocity signals carry the estimate, and it lifts off 50 exactly when the drops and reviews justify it. A genuinely slow product has few drops and few reviews, so its velocity estimates are small and the read stays near 50. A clean cell keeps the lookup as the anchor and nothing changes.
Two final guards, both conservative:
- The estimate is never allowed below the cell's observed median (de-censoring only ever raises a floor, it never wishes one away).
- A rank ceiling: no estimate may exceed the most we have ever observed sell at that rank in that category (the maximum visible badge value among rollup-filtered samples in the category-by-BSR-band). Rank is the one signal a variation rollup cannot inflate, so it is the contamination-proof upper bound.
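The gates, the weighted geometric mean, and the two guards can be sketched together. This is an illustration under assumptions, not the sheet's formulas: all function and parameter names are hypothetical (including `typical_reviews` for the category's typical review velocity), and the order in which the two guards are applied is assumed.

```python
import math


def lookup_weight(floor_share, n_unique, has_data=True):
    """Gate for the sampled-slot lookup: 0, 0.5 (soft floor), or 1."""
    if not has_data or n_unique < 5 or floor_share >= 0.50:
        return 0.0
    return 0.5 if floor_share >= 0.30 else 1.0


def rank_weight(rate_is_measured, drops):
    """Gate for rank velocity: needs a measured category rate and 5+ drops."""
    return 1.0 if rate_is_measured and drops >= 5 else 0.0


def review_weight(reviews_added, rate_is_measured, typical_reviews):
    """Gate for review velocity, including the variation-rollup screen."""
    if reviews_added < 3:
        return 0.0
    if reviews_added >= 50 and reviews_added > 5 * typical_reviews:
        return 0.0  # looks like a variation-family rollup
    return 1.0 if rate_is_measured else 0.5


def combine(signals, cell_median=None, rank_ceiling=None):
    """Weighted geometric mean of (value, weight) pairs, then the two guards."""
    live = [(v, w) for v, w in signals if w > 0 and v > 0]
    if not live:
        return None
    total_w = sum(w for _, w in live)
    est = math.exp(sum(w * math.log(v) for v, w in live) / total_w)
    # ASSUMPTION: the floor guard is applied first, then the rank ceiling.
    if cell_median is not None:
        est = max(est, cell_median)
    if rank_ceiling is not None:
        est = min(est, rank_ceiling)
    return est
```

In a hard-floor cell the lookup weight is zero, so a rank-velocity read of 324 and a review-velocity read of 168 combine to their geometric mean, about 233, rather than collapsing to 50.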
8d. The Signal chip: what corroboration means here.
Each Demand Read carries a Signal label: High, Medium, or Low. This is a different concept from the sample-size chips (Strong / Limited / Thin sample, §5e) that grade how much data sits under each evidence cell. Signal grades whether independent signals corroborate this particular read.
Only one pair is genuinely independent: rank velocity versus review velocity. They come from different Keepa fields tracking different physical traces of a sale. The lookup is not independent of rank velocity (the cell is defined by the same drop count), so lookup-velocity agreement earns nothing. Corroboration is counted only between independent signals; correlated signals do not get to vote twice.
- High: rank velocity and review velocity are both live and land within 2x of each other, and the category's sales-per-drop rate is not wide-spread. Two independent measurements agree.
- Medium: one trustworthy signal carries the read (typically rank velocity alone), or the pair is live but 2x to 4x apart.
- Low: only the weakest configurations: a lone review-velocity read, a lone lookup, or signals spread more than 4x apart.
- No data: nothing usable. The sheet says so instead of inventing a number.
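One way to read the four grades as code is sketched below. The function name is hypothetical and several edge cases are assumptions: "within 2x" is treated as inclusive, a lookup paired only with review velocity is graded Low, and the wide-spread test is passed in as a boolean because its threshold is not given here.

```python
def signal_chip(rank_est=None, review_est=None, lookup_live=False, wide_spread=False):
    """Grade a Demand Read as High, Medium, Low, or No data."""
    if rank_est and review_est:
        ratio = max(rank_est, review_est) / min(rank_est, review_est)
        if ratio <= 2:
            # Agreement, but a wide-spread category caps the grade at Medium.
            return "Medium" if wide_spread else "High"
        return "Medium" if ratio <= 4 else "Low"
    if rank_est:
        return "Medium"      # rank velocity alone carries the read
    if review_est or lookup_live:
        return "Low"         # ASSUMPTION: lookup plus reviews without rank is Low
    return "No data"
```

Note that `lookup_live` never raises the grade: agreement between the lookup and rank velocity earns nothing, matching the independence argument above.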
8e. What gets displayed: a range, a worst case, and the arithmetic.
The Demand Read never prints a single point. It prints a range whose width is set by the Signal (tighter at High, wider at Medium), both ends snapped down to a coarse display grid (50, 100, 150, 200, 300, 500, 700, 1,000, 1,500, 2,000). Snapping always rounds in the conservative direction.
Display rules, all locked:
- The 2K+ cap. Nothing anywhere in the Demand Read displays above 2,000 a month; higher values render as the open bucket "2K+ /mo". A sourcing decision saturates around one to two thousand units a month; resolution above that adds risk, not information. The cap also structurally bounds the damage of any contaminated input: the worst possible failure is a wrong "2K+", never a wrong "52,000".
- Low prints a lower bound only. At Low signal there is no range and no upper number, just "sells above the badge floor, treat as ~N+ /mo and verify". A printed 20x-wide range invites anchoring on its top, which is the exact over-buy failure the product exists to prevent. We print only the bound the data can defend.
- The worst case is always shown. Under the hero range sits the conservative floor figure ("worst case supported by the data"), visually primary. Size the buy against it.
- The why-line shows the arithmetic. Every read explains itself in one line, for example: "36 drops x ~9 sales/drop (category rate) = ~320/mo; reviews: 12 x 14 = ~170/mo (agrees)". When the review screen tripped, or the rank ceiling clamped the number, the line says that too. You are the last filter; the sheet gives you what you need to overrule it.
- An AGREE marker appears only when the two velocity reads actually converge. It is suppressed otherwise.
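The snapping and capping rules can be sketched as follows. Names and output strings are hypothetical, the range endpoints are taken as inputs because the width per Signal grade is not specified here, and clamping values under 50 up to 50 is an assumption.

```python
from bisect import bisect_right

GRID = [50, 100, 150, 200, 300, 500, 700, 1_000, 1_500, 2_000]


def snap_down(x):
    """Largest grid value <= x (ASSUMPTION: values under 50 clamp to 50)."""
    return GRID[max(bisect_right(GRID, x) - 1, 0)]


def label(x):
    """Grid label for one number, with the open 2K+ bucket above 2,000."""
    return "2K+" if x > GRID[-1] else f"{snap_down(x):,}"


def display(low, high, signal):
    """Render a read: a range at High or Medium, a lower bound only at Low."""
    lo = label(low)
    if lo == "2K+":
        return "2K+ /mo"
    if signal == "Low":
        return f"~{lo}+ /mo"
    return f"{lo} to {label(high)} /mo"
```

Because both ends snap down, a raw range of 233 to 410 displays as "200 to 300 /mo", and any input above 2,000 can do no worse than print "2K+".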
8f. What we can and cannot claim.
There is no ground truth below the badge floor. The badge is the only sales figure that exists, and it is the censored one, so the cases where the Demand Read matters most are exactly the cases no one can verify without buying inventory and selling it. The Demand Read is therefore a velocity-implied estimate, not a verified sales figure. The deliverable is restored discrimination and internal coherence, not proven accuracy.
The one external check available is the above-floor back-test: on products whose badge shows a real number (above 50), does rank velocity predict it? It does, out of sample. Tracking real sales above the floor is the evidence the same arithmetic extrapolates below it; it is evidence, not proof. The numbers below apply to products whose true badge value is visible:
| Check (above-floor products, out of sample) | US | CA |
|---|---|---|
| Binned estimate-to-actual ratio (rank velocity), centered at | 0.98 | 0.82 |
| Unbinned, display-faithful read, centered at (slightly high, hence every display rule rounds down) | 1.21 | 1.09 |
| Displayed range contains the real figure | 80.5% | 88.1% |
| Read misleads upward (whole range above the truth) | 7.5% | 3.7% |
| Hard-floor cells that now lift off the 50 floor when velocity justifies it | 89.8% | 80.0% |
Known failure mode, disclosed on-sheet: listings whose drops and reviews accrue at the variation-family level while the badge is per-variation can read high. The review screen, the rank ceiling, and the 2K+ cap each exist to bound this; they bound it, they do not eliminate it.
Direction-of-error policy: a false low costs a missed opportunity; a false high causes the over-buy. Every tie-break (gate, snap, cap, ceiling, Low's lower-bound-only rule) is chosen to make false highs the rarer error.
Quality safeguards.
Three guards keep the workbook honest: a tier classification that requires sample support, two category-level filters that decide which sheets ship, and a byte-reproducibility check on every build.
9a. Tier classification.
The cell's typical sales value (sold_p50, post-trim, post-rollup-filter) maps to one of seven tiers, but only when confidence is Strong or Limited. The "50+/mo" label is the v3 hard-floor relabel (§5d): the old "~50/mo" asserted a number that is a censoring artifact on those cells, so the label is now open-ended on purpose.
| Tier | sold_p50 | Reading |
|---|---|---|
| 5K+/mo | ≥ 5,000 | Highest velocity |
| 1K-5K/mo | 1,000 to 4,999 | Strong velocity |
| 500-1K/mo | 500 to 999 | Mid velocity |
| 100-500/mo | 51 to 499 | Low velocity |
| 50+/mo | exactly 50, hard floor (floor_share ≥ 0.50) | Badge floor, true rate unresolved; size it with the Demand Read (§8) |
| ~50/mo | exactly 50, below the hard-floor threshold | Below measurable |
| No data | confidence is Thin | Excluded |
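The tier table reads naturally as a cascade of thresholds. A minimal sketch, with a hypothetical function name and the confidence passed as a plain string:

```python
def tier(sold_p50, confidence, floor_share):
    """Map a cell's post-trim sold_p50 to its tier label."""
    if confidence == "Thin":
        return "No data"          # Thin cells are never classified
    if sold_p50 >= 5_000:
        return "5K+/mo"
    if sold_p50 >= 1_000:
        return "1K-5K/mo"
    if sold_p50 >= 500:
        return "500-1K/mo"
    if sold_p50 > 50:
        return "100-500/mo"
    # At the badge minimum: open-ended label on hard-floor cells.
    return "50+/mo" if floor_share >= 0.50 else "~50/mo"
```

A Strong cell with a median of 50 and 70% of its sample at the floor reads "50+/mo"; the same median on a Thin cell reads "No data".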
9b. Category-level filters.
Two filters decide which categories earn a dedicated sheet:
- Cell-coverage floor. A category must have at least 10 cells at Strong or Limited confidence. Below that, the category sheet would be more gaps than data.
- Hub-name filter. Amazon's category tree contains depth-1 navigation scaffolding (Categories, Departments, Products, Subjects, Styles): these are not real categories, they are internal navigation hubs. We drop them by name. US drops 15 hubs (231 category sheets ship). CA drops 2 (140 ship).
9c. Final composition.
- 231 US category sheets + 1 Overview + 1 Sales per Drop.
- 140 CA category sheets + 1 Overview + 1 Sales per Drop.
- Each sheet contains: three numeric inputs (BSR, 30-day drops, optional reviews added; no dropdowns), the Demand Read hero with its Signal chip, worst-case line, and why-line (§8), the demoted evidence block beneath it ("what sampled products at this slot showed": typical / range / best with sample size), a tier-map grid, and a full Evidence table showing every observed cell.
9d. Notes column vocabulary.
The Notes column on each cell uses a fixed vocabulary so a workbook user can decode any caveat at a glance, and so AI agents lifting workbook screenshots can map notes back to triggers:
| Note | Trigger |
|---|---|
| too few products (n=N) | confidence is Thin sample |
| limited (n=N) | confidence is Limited sample |
| most barely badged | floor_share ≥ 0.50 (hard floor) |
| ~X% at floor | 0.30 ≤ floor_share < 0.50 (soft floor) |
| ↗ Nx root | diverges with ratio ≥ 1.0 |
| ↘ Nx less than root | diverges with ratio < 1.0 |
| two tiers (X vs Y) | bimodal cell, X and Y are peak bucket centres |
| N parents removed | n_rollups_filtered ≥ 2 |
9e. Reproducibility and determinism.
The pipeline (stage 1 → 2 → 3a) is fully deterministic on a fixed input. The Excel output is byte-reproducible: re-running the workbook builder on the same parquets produces an identical file. Verified by 201 automated tests including a byte-equality parity check against a locked oracle workbook. Re-runs cannot drift; only a fresh quarterly sample can change a number.
Two findings worth citing.
Finding 1
"1 Keepa drop = 1 sale" is wrong in 441 of 441 categories with a computed multiplier.
The simplest rule of thumb in OA/RA, "30 drops in a month means 30 sales," holds in zero categories. Per-category p50 multipliers vary widely, with the spread (p75 ÷ p25) often above 5×. Three US roots spanning the multiplier distribution:
- Grocery & Gourmet Food: ~23.7x
- Home & Kitchen: ~8.8x
- Electronics: ~5.5x
Full per-category multipliers ship in the workbook's Sales per Drop sheet.
Finding 2
"1 review = 20 sales" is wrong in 58% of US categories (154 of 267).
Another widely shared heuristic, often attributed to FBA YouTube. In 154 of 267 US categories with a computed per-category rate (58%), the measured p50 sits outside [10, 40], more than 2× off from the conventional 20 in either direction. The median US category sees roughly 9 sales per review-added-per-month (p50 = 9.1), about half the heuristic's value. Extremes range from Bedding at ~1.5 sales per review (the heuristic overshoots by ~13×) to Produce at ~133 sales per review (the heuristic undershoots by ~7×). The workbook replaces the heuristic with a measured per-category rate on every sheet.
What this is NOT.
Not a forecast for a specific product.
Cells describe what badged products as a population did at a slot. A single product is one sample in a distribution.
No coverage of unbadged products.
Products selling under 50/month are structurally invisible (roughly 60% of the Amazon marketplace). Cells flagged hard-floor have at least 50% of products at the badge minimum, so the true worst case sits below what we can measure.
Not real-time.
Data refreshes quarterly. Use this for category-level structural sourcing decisions, not for tracking week-to-week shifts on individual listings.
Not a substitute for product-specific Keepa charts.
This tells you what a slot looks like across the badged population. Keepa tells you what a specific ASIN has done.
The Demand Read is velocity-implied, not verified.
It is an estimate triangulated from rank velocity and review velocity, not a sales figure anyone measured. The above-floor back-test (§8f) is evidence the arithmetic extrapolates below the floor; it is evidence, not proof.
No ground truth below the badge floor.
The badge is the only sales figure that exists, and it is censored at 50. The cases where the Demand Read matters most are exactly the cases no one can verify without buying inventory and selling it.
Variation families can read high.
Listings whose drops and reviews accrue at the variation-family level while the badge is per-variation can read high. The review screen, the rank ceiling, and the 2K+ cap bound this failure mode; they do not eliminate it.
Errors are steered downward on purpose.
A false low costs a missed opportunity; a false high causes the over-buy. Every tie-break (gate, snap, cap, ceiling, Low's lower-bound-only rule) is chosen to make false highs the rarer error.
A walked-through lookup.
References and data sources.
- [1] "all sales rank based sales estimators are inherently inaccurate" (thread, r/FulfillmentByAmazon)
- [2] "Like asking a bird how fast the wind is going" (thread, r/FulfillmentByAmazon)
- [3]
- [4] Keepa documentation, rank drops methodology (doc, keepa.com)
- [5]
- [6]
- [7]
- [8]
- [9]
How to cite this workbook.
For analysts, consultants, and agencies referencing this snapshot in client decks, use the citation block below. The Dataset schema embedded on this page makes the same metadata machine-readable for AI agents.
fbasalesestimator.com. "Amazon US/CA Sales Lookup, May 2026 Snapshot." Retrieved 2026-06-25. https://fbasalesestimator.com/methodology
If this kind of rigor is what you have been looking for, the workbook is $79.
One-time purchase, no subscription, no account.