Dria

Swan

NFT, Hardhat

21,000 USDC

Submission Details

Severity: low

Invalid

Biased variance estimator will result in narrower accepted score intervals

robertodf99

Summary

When working with small sample sizes, using an unbiased estimator of variance is important to accurately reflect the population variance. Bessel's correction adjusts the denominator to $n - 1$, rather than $n$, to counteract the tendency of small samples to underestimate the true population variance. The adjustment removes one degree of freedom because the sample mean is calculated from the sample itself:

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
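As a quick illustration of the two denominators, the following minimal Python sketch compares the population (divide by n) and sample (divide by n - 1) variance on one set of 10 scores. The score values are hypothetical and expressed as fractions of 1e18; they are not taken from the contest data.

```python
import statistics

# Hypothetical validation scores, as fractions of the 1e18 scale (illustrative only).
scores = [0.41, 0.48, 0.50, 0.52, 0.53, 0.55, 0.59, 0.60, 0.64, 0.37]
n = len(scores)

biased = statistics.pvariance(scores)   # divides the sum of squares by n
unbiased = statistics.variance(scores)  # divides by n - 1 (Bessel's correction)

# On the same sample the two estimates always differ by the factor n / (n - 1).
ratio = unbiased / biased
print(f"biased={biased:.6f} unbiased={unbiased:.6f} ratio={ratio:.4f}")
```

For n = 10 the ratio is 10/9, so the biased estimate is 10% below the unbiased one regardless of the score values.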

Vulnerability Details

Underestimating the variance results in a smaller standard deviation, which, in turn, leads to narrower confidence intervals. This bias reduces the range of scores considered in calculating final validation and generation scores, potentially missing the statistically optimal range of responses.
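The narrowing effect can be seen on a single hand-built sample. The sketch below assumes a filter that keeps scores within 2 standard deviations of the sample mean, as in the PoC; the helper name `within_k_sd` and the score values are hypothetical and chosen so that the two extreme scores fall between the biased and unbiased cut-offs.

```python
import statistics

def within_k_sd(scores, sd, k=2.0):
    """Keep the scores lying within k standard deviations of the sample mean."""
    mean = statistics.fmean(scores)
    return [s for s in scores if abs(s - mean) <= k * sd]

# Hypothetical scores (fractions of 1e18): eight near the mean, two farther out.
scores = [0.408, 0.48, 0.48, 0.48, 0.48, 0.52, 0.52, 0.52, 0.52, 0.592]

sd_biased = statistics.pstdev(scores)   # square root of the n-denominator variance
sd_unbiased = statistics.stdev(scores)  # square root of the (n - 1)-denominator variance

kept_biased = within_k_sd(scores, sd_biased)
kept_unbiased = within_k_sd(scores, sd_unbiased)

print(len(kept_biased), len(kept_unbiased))
```

Here the biased cut-off (about 0.090 from the mean) drops the two scores at distance 0.092, while the unbiased cut-off (about 0.095) keeps them, so the filtered averages are computed over 8 and 10 scores respectively.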

Impact

The impact of underestimating variance becomes more pronounced when:

The sample size of scores is small (as with 10 validation scores).
Scores vary significantly across samples.
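The dependence on sample size can be made explicit with a short worked relation. Writing $\hat{\sigma}^2$ for the biased estimate and $s^2$ for the Bessel-corrected one (notation introduced here for illustration), both computed on the same $n$ scores:

```latex
\[
\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2 = \frac{n-1}{n}\,s^2,
\qquad
\frac{\hat{\sigma}}{s} = \sqrt{\frac{n-1}{n}}
\]
\[
n = 10:\quad \frac{\hat{\sigma}}{s} = \sqrt{0.9} \approx 0.949,
\qquad
n = 100:\quad \frac{\hat{\sigma}}{s} = \sqrt{0.99} \approx 0.995
\]
```

With 10 validation scores the acceptance interval built from the biased standard deviation is therefore about 5% narrower than the one built from the unbiased estimate, and the gap shrinks as the number of scores grows.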

To quantify this, a Monte Carlo simulation with 100,000 trials was conducted using MATLAB. In each trial, random validation scores were generated for different fictitious responses, sampling from a normal distribution with a fixed mean and standard deviation. The range of validation scores selected to calculate response scores was computed using both the biased and unbiased estimators, as well as the known population parameters.

The simulation aimed to determine how often the unbiased estimator’s filtered range provided a closer approximation to the optimal score derived using population boundaries (our theoretical standard) than the biased estimator did.

According to the sponsor team, scores lie in the range 0 to 1e18. To keep the simulation close to real scenarios, they are drawn from a normal distribution with mean 5e17 and standard deviation 1e17.

See PoC

% Monte Carlo simulation: biased vs. unbiased variance estimator
% Outer loop: repeat the whole simulation 10 times and average the result
diffs = [];
for i = 1:10
    % Simulation parameters
    num_trials = 100000;   % Number of Monte Carlo trials
    num_generations = 10;  % Number of generator responses in each trial
    num_validations = 10;  % Number of validation scores per generation
    mu = 5e17;             % Population mean
    stdev = 1e17;          % Population standard deviation
    count_biased_diffs = 0;
    count_unbiased_diffs = 0;

    for trial = 1:num_trials
        % Generate validation scores for each generator response, ~ N(mu, stdev^2)
        scores = randn(num_generations, num_validations) * stdev + mu;

        % Store results for biased and unbiased estimators
        avg_scores_biased = zeros(num_generations, 1);
        avg_scores_unbiased = zeros(num_generations, 1);
        avg_scores_theoretical = zeros(num_generations, 1);

        % Evaluate each generator response
        for gen = 1:num_generations
            % Current generator's validation scores
            validation_scores = scores(gen, :);

            % Calculate the sample mean and the biased and unbiased variances
            mean_score = mean(validation_scores);
            biased_variance = var(validation_scores, 1);   % biased: uses n
            unbiased_variance = var(validation_scores, 0); % unbiased: uses n-1

            % Filter scores within ±2 SD
            filtered_scores_biased = validation_scores(abs(validation_scores - mean_score) <= 2 * sqrt(biased_variance));
            filtered_scores_unbiased = validation_scores(abs(validation_scores - mean_score) <= 2 * sqrt(unbiased_variance));
            theoretical_filtered_scores = validation_scores(abs(validation_scores - mu) <= 2 * stdev);

            % Compute averages for scores within ±2 SD
            avg_scores_biased(gen) = mean(filtered_scores_biased);
            avg_scores_unbiased(gen) = mean(filtered_scores_unbiased);
            avg_scores_theoretical(gen) = mean(theoretical_filtered_scores);
        end

        % Identify the optimal response for each estimator and for the population bounds
        [~, opt_gen_biased] = max(avg_scores_biased);
        [~, opt_gen_unbiased] = max(avg_scores_unbiased);
        [~, opt_gen_theoretical] = max(avg_scores_theoretical);

        % Count trials where an estimator picks a different response than the population bounds
        if opt_gen_biased ~= opt_gen_theoretical
            count_biased_diffs = count_biased_diffs + 1;
        end
        if opt_gen_unbiased ~= opt_gen_theoretical
            count_unbiased_diffs = count_unbiased_diffs + 1;
        end
    end % closes the trial loop (this end was missing in the listing as posted)

    diffs = [diffs count_biased_diffs - count_unbiased_diffs];
    fprintf('%d iterations completed\n', i);
end

% Display results (mean(diffs) is generally not an integer, so print it as a float)
fprintf('Unbiased estimator results in a more accurate response on average %.1f runs more than the biased\n', mean(diffs));

The simulation found that, averaged over the 10 runs, the unbiased estimator selects the same optimal response as the population-based filter in approximately 300 more trials out of 100,000 (about 0.3%) than the biased estimator does.

Tools Used

Manual review.

Recommendations

Use the unbiased variance estimator in Statistics::variance:

function variance(
    uint256[] memory data
) internal pure returns (uint256 ans, uint256 mean) {
    mean = avg(data);
    uint256 sum = 0;
    for (uint256 i = 0; i < data.length; i++) {
        uint256 diff = data[i] - mean;
        sum += diff * diff;
    }
-   ans = sum / data.length;
+   ans = sum / (data.length - 1);
}
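One consequence of the recommended denominator is that an input with a single element would divide by zero. The following Python sketch models the integer arithmetic of the change under stated assumptions: the function name `variance_unbiased`, the explicit guard for fewer than two scores, the truncating integer mean, and the absolute difference are all illustrative choices and are not taken from the Statistics library or from the finding.

```python
def variance_unbiased(data):
    """Integer model of the recommended change: sum of squared deviations // (n - 1)."""
    n = len(data)
    if n < 2:
        # Assumption: reject inputs for which the (n - 1) denominator would be zero.
        raise ValueError("need at least two scores")
    mean = sum(data) // n  # assumption: truncating integer mean
    total = 0
    for x in data:
        # Absolute difference, since a plain unsigned subtraction cannot
        # represent x - mean when x < mean.
        diff = abs(x - mean)
        total += diff * diff
    return total // (n - 1), mean

ans, mean = variance_unbiased([2, 4, 4, 4, 5, 5, 7, 9])
print(ans, mean)  # prints "4 5"
```

In this example the sum of squared deviations is 32, so the n - 1 denominator gives 32 // 7 = 4, where the original n denominator would give 32 // 8 = 4 as well; the two only diverge once the truncation no longer hides the difference, which is the normal case at the 1e18 score scale.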

Updates

Lead Judging Commences

inallhonesty Lead Judge 12 months ago

Submission Judgement Published

Invalidated

Reason: Non-acceptable severity

Prize pool breakdown

Total prize: 21,000 USDC

nSLOC: 1,034 (20 USDC / LOC)

High / Medium: 20,000 USDC

Low: 1,000 USDC
