Dria

Swan

NFTHardhat

21,000 USDC

View results

Previous Next

Submission Details

Severity: high

Valid

Statistics.sol's `variance` doesn't handle cases in which individual scores will be less than mean

inh3l

Summary

The variance function upon which stdDev is based incorrectly assumes that the mean of the scores will always be greater than or equal to the individual scores which is not always the case.

Vulnerability Details

The arithmetic mean is the sum of a dataset of values divided by the count of values in the dataset. It provides the "central" tendency of the elements in the dataset and always lies between the smallest and largest elements. As a result, cases can occur in which the mean is greater than certain elements in the set.

This property isn't taken into consideration instead when calculating variance for the stddev function.

We can see that the stddev function first attempt to calculate the variance of the data.

function stddev(uint256[] memory data) internal pure returns (uint256 ans, uint256 mean) {

@> (uint256 _variance, uint256 _mean) = variance(data);

mean = _mean;

ans = sqrt(_variance);

}

And when variance is calculated, just like the formula requires, the mean is deducted from each indivual member of the data array. The parameter is set as diff, uint256.

function variance(uint256[] memory data) internal pure returns (uint256 ans, uint256 mean) {

mean = avg(data);

uint256 sum = 0;

for (uint256 i = 0; i < data.length; i++) {

@> uint256 diff = data[i] - mean;

sum += diff * diff;

}

ans = sum / data.length;

}

However, as established the mean lies in between the elements of the array, and in a lot of cases, will actually be greater than the element. But since the variance function doesn't account this, the line uint256 diff = data[i] - mean; will mostly always revert due to undeflow error. As a result, the queries to stddev will be dossed.

This is important as to validate a taskId's request, the finalizeValidation function is reqiured to compute the standard deviation and mean of the scores and generation scores to be validated.

function finalizeValidation(uint256 taskId) private {

TaskRequest storage task = requests[taskId];

// compute score for each generation

for (uint256 g_i = 0; g_i < task.parameters.numGenerations; g_i++) {

// get the scores for this generation, i.e. the g_i-th element of each validation

uint256[] memory scores = new uint256[]();

for (uint256 v_i = 0; v_i < task.parameters.numValidations; v_i++) {

scores[v_i] = validations[taskId][v_i].scores[g_i];

}

// compute the mean and standard deviation

@> (uint256 _stddev, uint256 _mean) = Statistics.stddev(scores);

// compute the score for this generation as the "inner-mean"

// and send rewards to validators that are within the range

uint256 innerSum = 0;

uint256 innerCount = 0;

for (uint256 v_i = 0; v_i < task.parameters.numValidations; ++v_i) {

uint256 score = scores[v_i];

if ((score >= _mean - _stddev) && (score <= _mean + _stddev)) {

innerSum += score;

innerCount++;

// send validation fee to the validator

_increaseAllowance(validations[taskId][v_i].validator, task.validatorFee);

}

// set score for this generation as the average of inner scores

uint256 inner_score = innerCount == 0 ? 0 : innerSum / innerCount;

responses[taskId][g_i].score = inner_score;

}

// now, we have the scores for each generation

// compute stddev for these and pick the ones above a threshold

uint256[] memory generationScores = new uint256[]();

for (uint256 g_i = 0; g_i < task.parameters.numGenerations; g_i++) {

generationScores[g_i] = responses[taskId][g_i].score;

}

// compute the mean and standard deviation

@> (uint256 stddev, uint256 mean) = Statistics.stddev(generationScores);

for (uint256 g_i = 0; g_i < task.parameters.numGenerations; g_i++) {

// ignore lower outliers

if (generationScores[g_i] >= mean - generationDeviationFactor * stddev) {

_increaseAllowance(responses[taskId][g_i].responder, task.generatorFee);

}

Impact

The function would mostly always revert due to underflow error, dossing request validations which is a very important protocol process.

Tools Used

Manual Review

Recommendations

Modify the variance function to check first for which is greater between data[i] and mean, then deduct the lesser from bigger.
Something like this.

function variance(uint256[] memory data) internal pure returns (uint256 ans, uint256 mean) {

mean = avg(data);

uint256 sum = 0;

for (uint256 i = 0; i < data.length; i++) {

- uint256 diff = data[i] - mean;

+ uint256 diff;

+ if (data[i] => mean) {

+ diff = data[i] - mean;

+ }

+ else {

+ diff = mean - data[i];

+ }

+ sum += diff * diff;

+ }

ans = sum / data.length;

}

Updates

Lead Judging Commences

inallhonesty Lead Judge 8 months ago

Submission Judgement Published

Validated

Assigned finding tags:

Underflow in computing variance

Prize pool breakdown

Total prize

21,000 USDC

nSLOC

1,034

20 USDC / LOC

High Medium

20,000 USDC

Low

1,000 USDC

Live

The contest is live. Earn rewards by submitting a finding.

Judging

Submissions are being carefully reviewed by our judges.

View all submissions

Appeals

This is your time to appeal against judgements on your submissions.

Appeals Review

Appeals are being carefully reviewed by our judges.

Rewards Distribution

The contest is complete and the rewards are being distributed.

View results

Support

FAQs

Can't find an answer? Chat with us on Discord, Twitter or Linkedin.