Backtesting

Last updated April 20, 2026

The Backtest tab simulates your compiled strategy against real historical Binance kline data — entirely in the browser, with no server round-trip required. After each run, a comprehensive Strategy Analytics Suite automatically computes eight analysis modules that reveal how, when, and why your strategy performs the way it does.

Overview

The backtest engine uses the same compiled JavaScript code and indicator library as the live runner, ensuring that what you test is exactly what runs live. Fills execute at each bar's close price (the mark price), and the timeframe is automatically detected from the compiled strategy code — you do not select it manually.

⚠️
Compile before backtesting. The backtest uses the last compiled version of your strategy. If you've made changes in the Editor, click Compile first — otherwise you're testing stale code.

Setup & Configuration

| Field | Description |
| --- | --- |
| Symbol | The trading pair to simulate. Defaults to the first symbol configured in the project. You can override it with any valid Binance symbol (e.g. SOLUSDT). |
| From Date | Start of the simulation window. The engine fetches additional bars before this date to serve as the warm-up period for indicator initialization. |
| To Date | End of the simulation window. Defaults to today. |
| Starting Capital | Simulated USD balance at the start. Default: $1,000. Does not affect indicator calculations, only PnL and equity curve values. |
| Timeframe | Auto-detected from compiled code. The engine reads the @norena-timeframes manifest comment in the compiled JS to determine which timeframes to fetch. You cannot override this; it reflects what your strategy actually uses. |

Data Limits by Timeframe

The backtest fetches data directly from Binance's public REST API. To prevent extremely long downloads, each timeframe has a maximum date range cap. If your requested window exceeds the cap, the engine automatically trims the start date to fit and notes the adjustment.

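| Timeframe | Max Range | Approx. Max Bars |
| --- | --- | --- |
| 1m | 7 days | ~10,080 |
| 5m | 30 days | ~8,640 |
| 15m | 90 days | ~8,640 |
| 30m | 180 days | ~8,640 |
| 1h | ~1 year | ~8,760 |
| 2h–12h | ~2 years | ~8,760 |
| 1d | ~5 years | ~1,825 |
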
💡
Use 1h or 4h for meaningful backtests. These timeframes give you 1–4 years of data — enough to cover multiple market cycles (bull, bear, sideways) — while keeping the download fast. The 1m timeframe is only suitable for very short-range tests.
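
The range cap can be pictured with a small Python sketch. This is an illustration only, not the engine's code: the function name `clamp_window` and the `MAX_RANGE_DAYS` mapping are hypothetical, the day counts approximate the "~1 year", "~2 years", and "~5 years" entries in the table, and listing 4h, 6h, and 8h under the two-year cap is an assumption about what the 2h to 12h row covers.

```python
from datetime import datetime, timedelta

# Hypothetical lookup built from the table above (approximate day counts).
MAX_RANGE_DAYS = {
    "1m": 7, "5m": 30, "15m": 90, "30m": 180, "1h": 365,
    "2h": 730, "4h": 730, "6h": 730, "8h": 730, "12h": 730,
    "1d": 1825,
}

def clamp_window(timeframe, from_date, to_date):
    """Return (effective_from_date, was_trimmed) for the requested window."""
    earliest = to_date - timedelta(days=MAX_RANGE_DAYS[timeframe])
    if from_date < earliest:
        # Requested window is longer than the cap: move the start forward.
        return earliest, True
    return from_date, False
```

For instance, asking for 30 days of 1m data would move the start date forward so that only the last 7 days remain, with the second return value signalling that an adjustment happened.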

How the Engine Runs

When you click Run Backtest, the engine performs the following steps in order:

  1. Fetch klines: Historical candle data is fetched from Binance for the primary timeframe and any additional timeframes used by the strategy. Each paginated request fetches up to 1,000 candles.
  2. Identify warm-up bars: Any candles that fall before the selected from-date are counted as warm-up bars. These bars run the strategy so indicators can initialize, but no trades are executed during this period.
  3. Load kline cache: All fetched series are loaded into an in-memory cache identical in structure to the live runner's cache.
  4. Bar-by-bar simulation: The engine iterates through each candle in chronological order. For each bar, it recreates the full strategy API surface (indicators, HP.buy, HP.sell, condition tracking, etc.) and runs the compiled strategy code in an isolated function context with a 2-second timeout per bar.
  5. Equity and drawdown tracking: After each bar, equity is computed and the running maximum is tracked for drawdown calculation.
  6. Post-loop analytics: All eight analytics modules are computed from the collected EvalLogs and SignalEvents after the simulation completes.
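
The loop described in steps 2, 4, and 5 can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the engine itself (which runs compiled JavaScript): `run_backtest`, the bar dictionaries, and the strategy callable returning "buy", "sell", or None are all hypothetical, positions are all-in or flat, and the per-bar timeout is left out.

```python
def run_backtest(bars, from_ts, strategy, starting_capital=1000.0):
    """bars: list of {"open_time": int, "close": float} in chronological order."""
    cash, qty = starting_capital, 0.0
    peak, max_drawdown = starting_capital, 0.0
    equity_curve = []
    for i, bar in enumerate(bars):
        # The strategy runs on every bar so indicators can warm up.
        signal = strategy(bars[: i + 1])
        live = bar["open_time"] >= from_ts
        close = bar["close"]
        if live:  # fills are only allowed after the warm-up period
            if signal == "buy" and qty == 0.0:
                qty, cash = cash / close, 0.0      # fill at the bar's close
            elif signal == "sell" and qty > 0.0:
                cash, qty = qty * close, 0.0
        equity = cash + qty * close                # mark-to-market at close
        equity_curve.append(equity)
        if live:
            peak = max(peak, equity)               # running maximum
            max_drawdown = max(max_drawdown, (peak - equity) / peak)
    return equity_curve, max_drawdown
```

Slicing `bars[: i + 1]` on every bar is quadratic and is used here only to keep the sketch short.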

Warm-up Period

The warm-up period is the number of bars that fall before your selected From Date in the fetched data. These bars exist because the engine fetches extra historical data going back further than your start date to ensure indicators are properly initialized by the time the simulation actually begins.

For example, if your strategy uses an EMA(200) and your From Date is January 1, the engine needs at least 200 bars of data before January 1 so the EMA value is meaningful on the first trading bar. The warm-up bars count is shown in the results summary.
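
As a rough illustration of the arithmetic, the Python sketch below computes how far back a fetch would have to start to provide a given number of warm-up bars, and how warm-up bars can be counted afterwards. The names `warmup_fetch_start` and `count_warmup_bars` are hypothetical, and how the engine actually chooses the warm-up length is not specified here, so it is passed in as a parameter.

```python
# Bar duration in milliseconds for a few common timeframes.
TIMEFRAME_MS = {
    "1m": 60_000, "5m": 300_000, "15m": 900_000, "30m": 1_800_000,
    "1h": 3_600_000, "4h": 14_400_000, "1d": 86_400_000,
}

def warmup_fetch_start(from_ms, timeframe, warmup_bars):
    """Timestamp (ms) at which fetching must start to get warmup_bars extra bars."""
    return from_ms - warmup_bars * TIMEFRAME_MS[timeframe]

def count_warmup_bars(open_times_ms, from_ms):
    """Bars whose open time falls before the From Date count as warm-up."""
    return sum(1 for t in open_times_ms if t < from_ms)
```

With an EMA(200) on the 1h timeframe, 200 warm-up bars correspond to a fetch start 200 hours (a little over 8 days) before the From Date.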

ℹ️
Warm-up bars do not affect stats. Drawdown, Sharpe ratio, and all other performance metrics are calculated only from the post-warmup (live) portion of the simulation. The equity curve covers the full date range including warm-up, displayed as a flat line until trading begins.

Core Results

Performance Metrics

| Metric | Definition | Good benchmark |
| --- | --- | --- |
| Total Return % | ((End Capital − Start Capital) / Start Capital) × 100. Measures total P&L as a percentage of starting capital. | Depends on time period; compare to buy-and-hold. |
| Total Trades | Number of completed sell-side fills. Each round-trip (buy + full sell) counts as one trade. | At least 20–30 trades for statistical significance. |
| Win Rate | Percentage of sell trades where realized PnL > 0. | Above 50% for trend-following; 40–50% acceptable for high R:R. |
| Max Drawdown % | Largest peak-to-trough equity decline during the live (post-warmup) simulation period. Lower is better. | Below 20% for conservative strategies. |
| Sharpe Ratio | Annualized risk-adjusted return: (mean bar return / std dev of bar returns) × √(bars per year). Uses only post-warmup bars. | Above 1.0 is acceptable; above 2.0 is strong. |
| Profit Factor | Gross profit ÷ gross loss in USD. A profit factor of 1.5 means you make $1.50 for every $1.00 lost. | Above 1.5 is generally solid. |
| Avg Win % | Average percentage gain from entry to exit on winning trades. | Should exceed the magnitude of Avg Loss % × (loss rate ÷ win rate) for positive expectancy. |
| Avg Loss % | Average percentage loss from entry to exit on losing trades (shown as a negative number). | Should be bounded by your stop-loss setting. |
| Avg Hold (Bars) | Average number of primary-timeframe bars between entry and exit across all completed trades. Converted to hours/days for display. | Longer holds reduce trade frequency noise. |
| Bars Simulated | Total post-warmup bars processed. This is the "live" simulation length. | More bars = more reliable statistics. |
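
To make the formulas concrete, here is a minimal Python sketch of four of these metrics. It follows the definitions given in the table, but the function names are hypothetical and some details are assumptions: the Sharpe calculation uses the population standard deviation, and whether the engine uses the sample or population form is not stated.

```python
import math
from statistics import mean, pstdev

def total_return_pct(start_capital, end_capital):
    return (end_capital - start_capital) / start_capital * 100.0

def max_drawdown_pct(equity):
    """Largest peak-to-trough decline of an equity series, in percent."""
    if not equity:
        return 0.0
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak * 100.0)
    return worst

def sharpe_ratio(bar_returns, bars_per_year):
    """(mean bar return / std dev of bar returns) * sqrt(bars per year)."""
    if len(bar_returns) < 2:
        return 0.0
    sd = pstdev(bar_returns)  # assumption: population standard deviation
    return 0.0 if sd == 0 else mean(bar_returns) / sd * math.sqrt(bars_per_year)

def profit_factor(pnls_usd):
    """Gross profit divided by gross loss; infinite if there are no losses."""
    gross_profit = sum(p for p in pnls_usd if p > 0)
    gross_loss = -sum(p for p in pnls_usd if p < 0)
    return math.inf if gross_loss == 0 else gross_profit / gross_loss
```

For a 1h strategy, `bars_per_year` would be about 8,760 (24 × 365), since crypto markets trade continuously.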

Equity Curve

The equity curve plots your simulated capital value over time. It includes both the warm-up period (flat line at starting capital, no trades) and the live simulation period (where capital fluctuates based on realized PnL from sells and mark-to-market value of open positions).

The curve color reflects overall performance: green for net-positive, red for net-negative. The chart is downsampled to a maximum of 2,000 points for fast rendering without losing the shape of the curve.
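
One simple way to cap a series at 2,000 points is evenly spaced sampling that always keeps the first and last points. The Python sketch below shows that approach as an assumption; the downsampling method the chart actually uses is not documented here, and `downsample` is a hypothetical name.

```python
def downsample(points, max_points=2000):
    """Pick at most max_points evenly spaced items, keeping both endpoints."""
    n = len(points)
    if n <= max_points:
        return list(points)
    step = (n - 1) / (max_points - 1)   # fractional stride between samples
    return [points[round(i * step)] for i in range(max_points)]
```

Even sampling preserves the overall shape but can skip a single-bar spike, which is why charting libraries sometimes prefer min/max bucketing instead.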

Strategy Analytics Suite

After every backtest run, eight analytics modules automatically analyze the collected evaluation data. These modules are computed from per-bar condition logs and per-signal event records, giving you insight into your strategy's behavior that goes far beyond simple PnL metrics.

ℹ️
No re-simulation needed. All analytics are derived from data captured during the single backtest run. Changing the date range or symbol and re-running will produce fresh analytics automatically. The analytics panels appear below the equity curve as soon as results are available.

1 — Strategy Behavior

Provides a high-level activity summary of how frequently your strategy signals:

  • Total bars evaluated: Number of post-warmup candles processed.
  • Total BUY / SELL signals: Count of times each signal type fired.
  • Avg bars between signals: Mean gap between consecutive signals (buy or sell). A very small number (e.g. 1–2 bars) suggests over-trading.
  • Avg holding time: Average bars between entry and exit, displayed in human-readable time (e.g. "4.2h" for a 1h-timeframe strategy).
💡
A very high signal frequency with short holding times often indicates a strategy that would be destroyed by transaction costs in live trading. Aim for holding times of at least several bars.
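
The two averages in this panel are simple to reproduce. The Python sketch below is illustrative only: `avg_bars_between_signals` and `format_hold` are hypothetical names, and the 48-hour cutoff for switching the display from hours to days is an assumption.

```python
def avg_bars_between_signals(signal_bar_indices):
    """Mean gap, in bars, between consecutive signals (buy or sell)."""
    idx = sorted(signal_bar_indices)
    if len(idx) < 2:
        return None  # a gap needs at least two signals
    gaps = [b - a for a, b in zip(idx, idx[1:])]
    return sum(gaps) / len(gaps)

def format_hold(avg_bars, bar_minutes):
    """Convert an average hold in bars to a short human-readable string."""
    hours = avg_bars * bar_minutes / 60.0
    if hours < 48:  # assumed cutoff between hours and days
        return f"{hours:.1f}h"
    return f"{hours / 24.0:.1f}d"
```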

2 — Trigger Stats

Shows how often each individual condition in your strategy evaluated to true, across the full simulation. Each block that generates a boolean result (RSI < 30, Cross Up, In Position, etc.) is tracked separately.

  • Condition name: The block's label as it appears in the strategy (e.g. "RSI(14)").
  • True rate: Percentage of evaluated bars where this condition was true.
  • Contributor to BUY / SELL: Whether this condition is in the buy signal path, sell signal path, or both.

A condition with a very low true rate (e.g. 0.5%) is extremely restrictive and may be preventing valid entries. A condition with a 95%+ true rate is essentially doing nothing to filter signals.
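
The true rate is a straightforward count over the per-bar logs. In the Python sketch below, each log entry is assumed to be a dictionary mapping a condition name to its boolean result for that bar; that shape and the name `true_rates` are hypothetical, chosen only to illustrate the calculation.

```python
from collections import Counter

def true_rates(eval_logs):
    """Percentage of evaluated bars on which each condition was true."""
    true_counts, totals = Counter(), Counter()
    for log in eval_logs:
        for name, value in log.items():
            totals[name] += 1
            true_counts[name] += bool(value)
    return {name: 100.0 * true_counts[name] / totals[name] for name in totals}
```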

3 — Signal Quality

Each buy and sell signal is evaluated using a lookahead window (5 bars by default) to classify whether it was directionally correct:

  • Good: Price moved in the expected direction by at least 0.3% within the lookahead window (up for buys, down for sells).
  • Bad: Price moved adversely by at least 0.3%.
  • Neutral: Price moved less than 0.3% in either direction — no strong directional signal.

Signal quality shows your strategy's directional accuracy independent of when it exits. A strategy can have a low win rate but high signal quality if its exits are poorly timed. Conversely, a good win rate with poor signal quality suggests lucky entries rather than skill.

ℹ️
Signal quality requires at least 5 signals to display. The lookahead MFE (Maximum Favorable Excursion) and MAE (Maximum Adverse Excursion) are also tracked per signal to show the best and worst price reached within the lookahead window.
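
A possible reading of the classification rules, sketched in Python. Several details are assumptions rather than documented behavior: moves are measured on closing prices relative to the signal price, and when both thresholds are reached inside the window, whichever is reached first decides the label. The function name `classify_signal` is hypothetical.

```python
def classify_signal(side, entry_price, future_closes, threshold_pct=0.3):
    """Return (label, mfe_pct, mae_pct) for one signal.

    future_closes: closes of the lookahead bars (5 by default in the text).
    """
    sign = 1.0 if side == "buy" else -1.0   # for sells, a price drop is favorable
    moves = [sign * (c / entry_price - 1.0) * 100.0 for c in future_closes]
    mfe = max(moves + [0.0])                # best move in the signal's favor
    mae = min(moves + [0.0])                # worst move against the signal
    label = "neutral"
    for move in moves:                      # assumption: first threshold hit wins
        if move >= threshold_pct:
            label = "good"
            break
        if move <= -threshold_pct:
            label = "bad"
            break
    return label, mfe, mae
```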

4 — Trade Distribution

Breaks down completed trades into buckets to reveal the shape of your strategy's outcome distribution:

  • PnL buckets: How many trades fell into each PnL% range (e.g. -5% to -3%, -3% to -1%, etc.). A healthy strategy shows a right-skewed distribution — more wins in the upper buckets than losses in the lower buckets.
  • Duration buckets: How long trades were held, bucketed by bar count. Reveals whether your wins and losses have different typical holding times.
  • Expectancy: (Win Rate × Avg Win %) − (Loss Rate × magnitude of Avg Loss %), where Loss Rate = 1 − Win Rate. A positive expectancy means the strategy has a mathematical edge. This is the single most important metric for long-term viability.
  • Profit concentration: What percentage of total profit came from the top 20% of winning trades. High concentration (e.g. 80%+) means results depend heavily on a few outlier trades.
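
Expectancy and profit concentration can be computed as in the Python sketch below. The function names are hypothetical, trades with exactly zero PnL are counted as losses here (an assumption), and "top 20% of winning trades" is interpreted as the top fifth by count, rounded up to at least one trade.

```python
import math
from statistics import mean

def expectancy_pct(pnl_pcts):
    """(win rate x avg win %) minus (loss rate x magnitude of avg loss %)."""
    if not pnl_pcts:
        return 0.0
    wins = [p for p in pnl_pcts if p > 0]
    losses = [p for p in pnl_pcts if p <= 0]   # assumption: break-even = loss
    win_rate = len(wins) / len(pnl_pcts)
    avg_win = mean(wins) if wins else 0.0
    avg_loss = abs(mean(losses)) if losses else 0.0
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss

def profit_concentration_pct(pnls_usd):
    """Share of gross profit contributed by the top 20% of winning trades."""
    wins = sorted((p for p in pnls_usd if p > 0), reverse=True)
    if not wins:
        return 0.0
    top_n = max(1, math.ceil(len(wins) / 5))
    return 100.0 * sum(wins[:top_n]) / sum(wins)
```

With a 50% win rate, a +4% average win, and a −2% average loss, the expectancy is 0.5 × 4 − 0.5 × 2 = +1% per trade.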

5 — Condition Removal Impact

This module estimates the impact of each individual condition on signal quality by comparing outcomes when the condition was present versus absent. It answers the question: does this condition actually improve or harm signal quality?

For each condition and each signal direction (buy / sell), the module computes:

  • Quality when present: % of "Good" signal quality classifications when this condition was true at signal time.
  • Quality when absent: % of "Good" classifications when this condition was not true.
  • Quality lift: Quality when present minus quality when absent. A positive lift means removing this condition would hurt quality; a negative lift means the condition is actively harming signal quality.
  • Verdict: Automatically classified as Helps, Hurts, or Neutral based on a 5 percentage-point threshold.
💡
A condition marked Hurts is a candidate for removal or replacement. A condition marked Helps should be kept or possibly made stricter. This is a fast way to identify which parts of your strategy are doing the heavy lifting.
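
The lift and verdict can be sketched in Python as follows. The input shape (a list of pairs holding the set of conditions true at signal time and the signal's quality label) and the name `removal_impact` are hypothetical; returning "Neutral" when either group is empty is also an assumption.

```python
def removal_impact(signals, condition, threshold_pp=5.0):
    """Return (quality_present, quality_absent, lift, verdict) for one condition.

    signals: list of (set_of_true_condition_names, quality_label) pairs.
    """
    def good_pct(labels):
        if not labels:
            return None
        return 100.0 * sum(label == "good" for label in labels) / len(labels)

    present = good_pct([q for conds, q in signals if condition in conds])
    absent = good_pct([q for conds, q in signals if condition not in conds])
    if present is None or absent is None:
        return present, absent, None, "Neutral"   # nothing to compare against
    lift = present - absent
    if lift >= threshold_pp:
        verdict = "Helps"
    elif lift <= -threshold_pp:
        verdict = "Hurts"
    else:
        verdict = "Neutral"
    return present, absent, lift, verdict
```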

6 — Strategy DNA

Identifies the specific combinations of conditions that correlate most strongly with winning and losing trades. Rather than looking at individual conditions in isolation (like Condition Removal), Strategy DNA looks at which patterns of conditions tend to appear together at signal time.

  • Best patterns: Condition combinations with the highest win rates across trades where they appeared.
  • Worst patterns: Condition combinations associated with the highest loss rates.

This helps you identify beneficial and detrimental signal contexts that may not be visible when looking at conditions individually. For example, "RSI(14) true AND Trend Bullish true" together might have a much higher win rate than either condition alone.

ℹ️
Strategy DNA requires at least a few dozen trades with consistent condition patterns to produce meaningful results. If your strategy has fewer than 10–15 trades, this panel may show "Insufficient data."
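
One plausible way to rank condition patterns is to group trades by the exact set of conditions that were true at signal time and compute a win rate per group. The Python sketch below does that; the grouping rule, the `min_trades` cutoff, and the name `pattern_win_rates` are assumptions for illustration, not the module's documented algorithm.

```python
from collections import defaultdict

def pattern_win_rates(trades, min_trades=3):
    """Rank condition combinations by win rate, best first.

    trades: list of (iterable_of_true_condition_names, pnl) pairs.
    Returns a list of (sorted_condition_names, trade_count, win_rate_pct).
    """
    groups = defaultdict(list)
    for conditions, pnl in trades:
        groups[frozenset(conditions)].append(pnl > 0)
    stats = [
        (sorted(pattern), len(outcomes), 100.0 * sum(outcomes) / len(outcomes))
        for pattern, outcomes in groups.items()
        if len(outcomes) >= min_trades      # skip patterns with too few trades
    ]
    return sorted(stats, key=lambda row: row[2], reverse=True)
```

The first rows of the result correspond to "best patterns" and the last rows to "worst patterns".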

7 — Regime Analysis

Classifies the market regime at the time of each trade entry and groups outcomes by regime, answering: in what market conditions does your strategy work best?

Trend classification uses EMA(50) vs EMA(200) with price position:

  • Strong Uptrend: EMA50 > EMA200 AND close > EMA50
  • Strong Downtrend: EMA50 < EMA200 AND close < EMA50
  • Sideways: Everything else — conflicting EMAs or price between them

Volatility classification uses ATR(14) percentiles across the full backtest period:

  • High Volatility: ATR at or above the 66th percentile
  • Low Volatility: ATR below the 33rd percentile
  • Medium Volatility: Everything in between

For each regime combination (e.g. "Strong Uptrend + High Volatility"), the panel shows trade count, win rate, and average PnL%. The Best Regime and Worst Regime are highlighted with an automatically generated insight.

⚠️
Regime analysis requires 200+ candles. The EMA(200) calculation needs at least 200 bars of data. If your backtest window is too short or your timeframe too large, this panel will indicate insufficient data.
💡
If regime analysis shows your strategy underperforms badly in one regime (e.g. Strong Downtrend), consider adding a Trend Direction filter block to gate entries to favorable conditions only.
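
The two classification rules translate directly into code. In the Python sketch below, the EMA and ATR values are assumed to be computed elsewhere; the function names and the nearest-rank percentile method are assumptions, since the exact percentile definition is not stated.

```python
def classify_trend(close, ema50, ema200):
    if ema50 > ema200 and close > ema50:
        return "Strong Uptrend"
    if ema50 < ema200 and close < ema50:
        return "Strong Downtrend"
    return "Sideways"

def percentile(values, pct):
    """Nearest-rank percentile for an integer pct (assumed method)."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))   # ceiling division
    return ordered[rank - 1]

def classify_volatility(atr, all_atr):
    """Compare one ATR(14) value with percentiles over the whole backtest."""
    if atr >= percentile(all_atr, 66):
        return "High Volatility"
    if atr < percentile(all_atr, 33):
        return "Low Volatility"
    return "Medium Volatility"
```

Combining both labels for the entry bar of each trade gives the regime key, for example "Strong Uptrend + High Volatility".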

8 — Strategy Autopsy

Compares the conditions present at buy signal time between winning and losing trades to surface diagnostic patterns. This is the most powerful module for understanding why specific trades fail.

For each condition, the autopsy shows:

  • Winner presence rate: How often this condition was true in winning-trade buy signals.
  • Loser presence rate: How often this condition was true in losing-trade buy signals.
  • Difference: Loser rate minus winner rate. A large positive difference means this condition is associated with bad trades — a flag for review.

The autopsy also lists the most representative winning and losing signal "fingerprints" — the exact set of conditions that were true at buy time, grouped by outcome. This gives you a concrete picture of what "good" vs "bad" entry conditions look like for your specific strategy.

ℹ️
A minimum sample of 3 trades per group is required to surface a condition stat. Small sample sizes reduce reliability — aim for at least 10 winners and 10 losers for meaningful autopsy results.
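
The presence rates can be sketched in Python as below. The input shape and the name `autopsy` are hypothetical, break-even trades are counted as losers (an assumption), and the minimum of 3 trades per group mirrors the note above.

```python
def autopsy(trades, min_group=3):
    """Per-condition (winner_rate, loser_rate, difference) in percent.

    trades: list of (set_of_conditions_true_at_buy, pnl) pairs.
    Returns an empty dict if either group has fewer than min_group trades.
    """
    winners = [conds for conds, pnl in trades if pnl > 0]
    losers = [conds for conds, pnl in trades if pnl <= 0]
    if len(winners) < min_group or len(losers) < min_group:
        return {}
    result = {}
    for name in set().union(*winners, *losers):
        winner_rate = 100.0 * sum(name in c for c in winners) / len(winners)
        loser_rate = 100.0 * sum(name in c for c in losers) / len(losers)
        # Positive difference: the condition shows up more often in losers.
        result[name] = (winner_rate, loser_rate, loser_rate - winner_rate)
    return result
```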

Exporting Results

Click Export CSV in the Backtest tab header to download a multi-section CSV file containing:

| Section | Contents |
| --- | --- |
| SUMMARY | One row with all headline metrics: symbol, timeframe, start/end capital, total return, trade count, win rate, max drawdown, Sharpe ratio, bars simulated, warm-up bars. |
| TRADES | One row per fill (both buys and sells) with: side, price, quantity, USD value, realized PnL, equity after fill, position ID, and timestamp. |
| EQUITY | Full equity curve with timestamp, equity value, and drawdown percentage for each bar. |
| ERRORS | Any strategy errors encountered during simulation, with bar index and timestamp for debugging. |

The CSV is suitable for further analysis in Excel, Python (pandas), or any quantitative analytics tool. The filename includes the symbol, timeframe, and date range for easy identification.