About Slow Boil — How We Test Kitchen Gear

Our approach to kitchen gear

A kitchen gear resource where the reviewer actually cooks dinner.

Most kitchen gear writing falls into the gap between two kinds of reviewer: someone who held a product for fifteen minutes and someone who measured its thermal mass. Slow Boil is what happens when neither of those is sufficient, and you have eight weeks and a real kitchen to find out for yourself.

“40% of what we test never gets recommended. That number earns the other 60%.”

The problem wasn’t a lack of information. It was that the information was untethered from cooking.

The kitchen gear review space had a format problem. Sites would publish enthusiastic prose about a pan’s “ergonomic design” and “even heat distribution” without mentioning what they actually cooked in it, for how long, or at what temperature. The professional publications reviewed as if every reader had a combi oven and a kitchen scale calibrated to 0.1 grams. The YouTube reviewers unboxed things and called them beautiful, then moved on. The result was that someone trying to decide between three chef’s knives at similar price points — an entirely reasonable question — had to read six reviews that all said the same things in different adjectives, and came away no better informed than before.

“We needed a site that talked about kitchen gear the way cooks talk about kitchen gear — in the context of what was actually on the stove.”

The first six months were mostly purchasing mistakes. A mandoline that was genuinely impressive until the locking mechanism failed on the third use. A carbon steel pan that was recommended by three credible sources and warped on a gas range at medium heat — a detail no one had mentioned because, it turned out, none of them had cooked on gas. A chef’s knife that was widely praised for its balance and edge retention and that turned out to be difficult to sharpen at home without a fairly specific stone angle — again, absent from every review because knife sharpening apparently doesn’t come up when you’re reviewing knives. The failure rate was startling. The consistent explanation was that almost nothing had been tested long enough, or in conditions that matched how people actually cook.

What Slow Boil settled on, after about a year of figuring out what this needed to be, is simple but not easy: every product gets cooked with for a minimum of eight weeks before anything is written, every review is revisited at three months and six months, and we don’t publish recommendations for products we wouldn’t replace if ours broke tomorrow. That last rule eliminates about 40% of everything we test. The 40% matters. It’s not a footnote; it’s the reason the remaining 60% means something.

What we do every week, and what we’re trying to change.

Mission — right now

We cook with gear for eight weeks, then tell the truth about it.

Every week, at least one product is in active testing. That means it’s being used in real meals — not unboxed, not used once for photographs, not stored to be used “soon.” When a review is ready, it includes what failed, what surprised us, whether the price held up against the performance, and whether we’d buy it again knowing what we know now. What doesn’t get recommended stays documented. Four products that once made the list have been removed after long-term problems surfaced. Those updates are noted publicly on the original review pages.

Vision — the change we want to make

A future where kitchen gear marketing has to compete with honest long-term results.

What we’re working toward is a version of this space where “tested by someone who cooks regularly” is the minimum standard for publication — not a differentiator. Where a pan manufacturer knows their product will be evaluated on whether it’s still worth using after a year, not whether it looks good in a press release photograph. That shift doesn’t require a large site. It requires a consistent one that readers trust enough to cite when they’re trying to convince a friend to buy the right thing instead of the impressive one.

What guides every test, every review, every published word.

Season Before You Judge

Cast iron doesn’t reveal itself in a single use. Neither does most serious cookware. Our minimum testing window is eight weeks because first impressions — even careful ones — miss almost everything that matters in a kitchen tool. The carbon steel pan that felt rough on day one and became our default by week six is the rule, not the exception. We wait.

The Rejection Rate Is the Standard

We don’t minimize the 40% of products we don’t recommend. We lead with it. That rejection rate is what makes a recommendation meaningful — and it’s what separates a resource from a catalog. When we say a knife is worth buying, it means it passed a standard that eliminated three other knives in the same price category.

Cook the Hard Thing

We deliberately use products in conditions that stress them — acidic braises in carbon steel, frozen proteins in non-stick, bread dough on scales not marketed for baking. A knife that performs beautifully on soft vegetables should also be tested on butternut squash. If it fails a reasonable use case, that belongs in the review, not in a footnote.

Own Money, Own Opinion

Every product on this site was purchased at retail price with our own money. No gifted items, no “review units,” no advance copies from brands. This is not a virtue we claim — it’s an operational requirement. The moment a reviewer is financially insulated from the cost of being wrong, the review is about something other than the kitchen.

The Three-Month Check

Publishing a recommendation is not the end of the process. We return to every product at three months and six months after publication. Performance changes. Coatings degrade. Handles loosen. Four products we once recommended were removed after follow-up testing. Those updates live on the original review pages permanently — because the fact of removal is as important as the original recommendation.

Name the Competitor

Every product we test is evaluated against at least one alternative at a similar price point. A recommendation without comparison is incomplete. Saying “the Victorinox 8-inch is a better everyday knife than the Global G-2 for most home cooks” is more honest and more useful than saying “the Victorinox 8-inch is excellent.” We make the comparison explicit.

Four steps. None of them are fast.

Sourcing — how the product gets chosen

Categories are selected based on where the information gap is most damaging: places where price ranges are wide, marketing language is thick, and the difference between a good purchase and a bad one plays out over months rather than minutes. Within a category, products are chosen by reading what cooks are actually asking about — on forums, in recipe comments, in the questions people search before they buy. The specific product ordered is always the version the buyer would receive, at the price they’d pay on a normal Tuesday. No manufacturer samples. No pre-production versions. No discount codes offered in exchange for coverage.

Testing — what “eight weeks of cooking” actually means

The product gets used in real cooking — weeknight dinners, weekend projects, the kinds of recipes that stress different properties of a tool at different heat levels. A skillet gets used for searing, braising, oven finishing, and then left out for a week to see how maintenance feels after the novelty has faded. Notes are taken after sessions where something went wrong, because how a product behaves when you use it incorrectly is part of what you’re evaluating. The handle that gets dangerously hot on medium heat. The blade that rolls rather than chips when it meets a hard surface. The scale that drifts after 60 minutes of continuous use. None of these come up in the first week.

Comparison — what we mean when we say “better”

Every tested product is evaluated against at least one competitor at a comparable price. “Better” in this context means better for a home cook cooking four or five times a week — not better for a professional kitchen, not better on a spec sheet. The comparison is cooked-in, not theoretical. When we say the $45 carbon steel pan performs comparably to the $180 version for most uses, that claim comes from cooking with both, not from measuring gauge thickness. When the expensive version is worth it, we name the case plainly: what specifically it does better, in what cooking scenario, and whether most home cooks will ever reach that scenario.

Publication and follow-up — when the review isn’t over

A review is published only when there’s something honest to say — not when a deadline demands it. The 40% of products that don’t earn a recommendation get documented in our “40% Rule” feature rather than quietly ignored: what it was, what failed, what we bought instead. Published recommendations are revisited at three months and six months. If something changes — coating failure, handle degradation, a price increase that changes the value equation — the review gets updated and dated. Four products have been removed from the list since launch. Those removal notes are part of the public record. We’d rather have fewer recommendations that hold up than more that don’t.

Read the list that didn’t make it to the site.

The recommendations are public. What’s harder to find is the other side: the products that failed at week three, the knives we quietly stopped using by month two, the pan that warped in conditions it should have handled easily. The “40% Rule” feature is where that record lives — unfiltered, with specific product names and specific failure modes.

40%

Of products tested have never appeared as a recommendation on this site. That number is the standard everything else is measured against.