Submission Details
Severity: medium
Valid

Unrestricted validation score range for validators in `LLMOracleCoordinator::validate`.

Summary

The `validate` function lacks restrictions on the range of scores that validators can submit. This omission allows validators to submit arbitrary scores, which can heavily skew the mean and standard deviation calculations. As a result, reward distribution becomes biased, disproportionately benefiting validators who submit outlier scores.

Description

Within the `validate` function, validators can submit an array of scores without any restriction on the maximum allowable value. These scores are then used to compute the mean and standard deviation via `Statistics.stddev` in the `finalizeValidation` function. In the next step, the "inner mean" is calculated from only those scores within one standard deviation of the mean, rewarding the validators who fall within this range; see [https://github.com/Cyfrin/2024-10-swan-dria/blob/main/contracts/llm/LLMOracleCoordinator.sol#L343-L348].

Without a maximum score constraint, the validation process is susceptible to statistics skewed by outlier scores. Validators submitting extremely high values inflate both the mean and the standard deviation, which distorts the inner mean calculation and the global threshold. This bias risks excluding validators with normal scores from rewards and creates the potential for reward distribution manipulation.

Within `finalizeValidation`, the following section performs the filtering based on `_mean` and `_stddev`:

```solidity
for (uint256 v_i = 0; v_i < task.parameters.numValidations; ++v_i) {
    uint256 score = scores[v_i];
@>  if ((score >= _mean - _stddev) && (score <= _mean + _stddev)) {
        innerSum += score;
        innerCount++;
        _increaseAllowance(validations[taskId][v_i].validator, task.validatorFee);
    }
}
```

This block includes only scores within one standard deviation of the mean (`_mean ± _stddev`) in the "inner mean" calculation. Scores outside that window are discarded, and their validators do not receive the validator fee.
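
To make the filter concrete, here is a minimal off-chain TypeScript sketch (a hypothetical helper, not part of the codebase) that mirrors the loop above, using the skewed statistics derived in the Exploit Scenario below (mean 79, stddev 62):

```ts
// Hypothetical mirror of the finalizeValidation filter above, for illustration only.
// Returns the indices of validators whose score lies in [mean - stddev, mean + stddev]
// and who would therefore receive the validator fee.
function rewardedValidators(scores: number[], mean: number, stddev: number): number[] {
  const lower = mean - stddev;
  const upper = mean + stddev;
  return scores
    .map((score, i) => ({ score, i }))
    .filter(({ score }) => score >= lower && score <= upper)
    .map(({ i }) => i);
}

// With the scores and statistics from the Exploit Scenario below:
console.log(rewardedValidators([130, 3, 140, 4, 120], 79, 62)); // [0, 2, 4]
```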

Exploit Scenario

Suppose a subset of validators submits arbitrarily high scores, for instance in the range of 100 to 150, while the majority of validators submit scores in a "normal" range, such as 0 to 5. The presence of high scores inflates both `_mean` and `_stddev`. For example, let's assume the `scores[]` array in `LLMOracleCoordinator::finalizeValidation` (see [https://github.com/Cyfrin/2024-10-swan-dria/blob/main/contracts/llm/LLMOracleCoordinator.sol#L331]) contains 5 values, e.g. [130, 3, 140, 4, 120]. The arbitrarily high scores drive up `_mean` significantly, and `_stddev` increases along with it, reflecting the variance they introduce. For this array, both `_mean` and `_stddev` can be quickly calculated using the fixed version of the Statistics library in Remix. Here are the results:

```
decoded input {
  "uint256[] data": [
    "130",
    "3",
    "140",
    "4",
    "120"
  ]
}
decoded output {
  "0": "uint256: ans 62",
  "1": "uint256: mean 79"
}
```

This example illustrates that, due to the inflated `_mean` and `_stddev`, the range [`_mean - _stddev`, `_mean + _stddev`] becomes [17, 141] and no longer covers the lower "normal" scores (those in the range 0-5). Only scores close to the inflated mean satisfy the condition `(score >= _mean - _stddev) && (score <= _mean + _stddev)`. As a result, validators with normal scores (0-5) are effectively excluded from validator rewards because their scores fall outside the skewed range. This behavior enables dishonest validators to effectively eliminate the competition, leaving only the inflated scores rewarded.
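
The same numbers can be reproduced off-chain. The sketch below is my approximation of the library's uint256 arithmetic (floor division throughout, population variance, integer square root); it matches the Remix output above for this input:

```ts
// Integer-arithmetic sketch of the mean/stddev computation (an approximation of
// what the Statistics library does with uint256 math; not the library itself).
function intStats(data: bigint[]): { mean: bigint; stddev: bigint } {
  const n = BigInt(data.length);
  const mean = data.reduce((a, b) => a + b, 0n) / n; // floor of the average
  // floor of the population variance, using |x - mean| to stay non-negative
  const variance =
    data.reduce((acc, x) => {
      const d = x > mean ? x - mean : mean - x;
      return acc + d * d;
    }, 0n) / n;
  // integer square root by simple search (fine for small demo values)
  let s = 0n;
  while ((s + 1n) * (s + 1n) <= variance) s++;
  return { mean, stddev: s };
}

const { mean, stddev } = intStats([130n, 3n, 140n, 4n, 120n]);
console.log(mean, stddev);                 // 79n 62n, matching the Remix output
console.log(mean - stddev, mean + stddev); // 17n 141n -> scores 3 and 4 fall outside
```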

PoC

For the proof of concept we use score values similar to those in LLMOracleCoordinator.test.ts. Paste the test case below into LLMOracleCoordinator.test.ts and extend the `this.beforeAll(async function () {...})` setup section with additional validators, like this:

```ts
// Add validators to the setup
this.beforeAll(async function () {
  // assign roles; "full" = oracle that can do both generation & validation
  const [deployer, dum, req1, gen1, gen2, gen3, gen4, gen5, val1, val2, val3, val4, val5] =
    await ethers.getSigners();
  dria = deployer;
  requester = req1;
  dummy = dum;
  generators = [gen1, gen2, gen3, gen4, gen5];
  validators = [val1, val2, val3, val4, val5];
  // ...
});

// ...

// Test unfair reward distribution
describe("reward distribution", function () {
  const [numGenerations, numValidations] = [1, 5];
  const scores = [
    parseEther("100"), // high score
    parseEther("0.8"),
    parseEther("120"), // high score
    parseEther("0.4"),
    parseEther("120") // high score
  ];

  let generatorAllowancesBefore: bigint[];
  let validatorAllowancesBefore: bigint[];

  this.beforeAll(async () => {
    taskId++;
    generatorAllowancesBefore = await Promise.all(
      generators.map((g) => token.allowance(coordinatorAddress, g.address))
    );
    validatorAllowancesBefore = await Promise.all(
      validators.map((v) => token.allowance(coordinatorAddress, v.address))
    );
  });

  it("should make a request", async function () {
    await safeRequest(coordinator, token, requester, taskId, input, models, {
      difficulty,
      numGenerations,
      numValidations,
    });
  });

  it("should respond to each generation", async function () {
    const availableGenerators = generators.length;
    const generationsToRespond = Math.min(numGenerations, availableGenerators);
    expect(availableGenerators).to.be.at.least(generationsToRespond);
    for (let i = 0; i < generationsToRespond; i++) {
      await safeRespond(coordinator, generators[i], output, metadata, taskId, BigInt(i));
    }
  });

  it("should validate with varied scores, finalize validation, and distribute rewards correctly", async function () {
    const requestBefore = await coordinator.requests(taskId);
    console.log(`Request status before validation: ${requestBefore.status}`);

    // Check the initial status to ensure the task is ready for validation
    const initialStatus = BigInt(TaskStatus.PendingValidation);
    expect(requestBefore.status).to.equal(initialStatus, "Task is not in PendingValidation state initially.");

    for (let i = 0; i < numValidations; i++) {
      console.log(`Validating with validator at index ${i} with address: ${validators[i].address}`);
      console.log(`Score being used: ${scores[i].toString()}, Task ID: ${taskId}`);

      const currentStatus = BigInt((await coordinator.requests(taskId)).status);
      if (currentStatus !== initialStatus) {
        console.error(`Aborting: Unexpected task status ${currentStatus} before validation at index ${i}`);
        break;
      }

      try {
        await safeValidate(coordinator, validators[i], [scores[i]], metadata, taskId, BigInt(i));
        console.log(`Validation succeeded for validator at index ${i}`);
      } catch (error: any) {
        console.error(`Validation failed for validator at index ${i}: ${error.message}`);
      }
    }

    // Final status check to confirm the task is completed
    const finalRequest = await coordinator.requests(taskId);
    console.log(`Request status after all validations: ${finalRequest.status}`);
    expect(finalRequest.status).to.equal(
      BigInt(TaskStatus.Completed),
      "Task did not reach Completed status after all validations"
    );

    // Check validators' reward allowances after validations
    const validatorAllowancesAfter = await Promise.all(
      validators.map((v) => token.allowance(coordinatorAddress, v.address))
    );

    // Expected outcome: validators 0, 2, and 4 receive rewards; 1 and 3 do not
    const expectedRewards = [0, 2, 4]; // indices expected to receive rewards
    const noRewardValidators = [1, 3]; // indices expected to receive nothing

    for (let i = 0; i < numValidations; i++) {
      const rewardDifference = validatorAllowancesAfter[i] - validatorAllowancesBefore[i];
      console.log(`Validator ${i} reward: ${rewardDifference.toString()}`);
      if (expectedRewards.includes(i)) {
        // These validators should receive a reward
        expect(rewardDifference).to.be.gt(0n, `Validator ${i} was expected to receive a reward but got none`);
        console.log(`Validator ${i} received reward as expected.`);
      } else if (noRewardValidators.includes(i)) {
        // These validators should not receive a reward
        expect(rewardDifference).to.equal(0n, `Validator ${i} was not expected to receive a reward but did`);
        console.log(`Validator ${i} correctly received no reward.`);
      }
    }
  });
});
```

The test can be run with `yarn test ./test/LLMOracleCoordinator.test.ts`. Here are the logs:

```
Validator 0 reward: 2400000000000000
Validator 0 received reward as expected.
Validator 1 reward: 0
Validator 1 correctly received no reward.
Validator 2 reward: 2400000000000000
Validator 2 received reward as expected.
Validator 3 reward: 0
Validator 3 correctly received no reward.
Validator 4 reward: 2400000000000000
Validator 4 received reward as expected.
```

As the logs show, the validators with scores of 0.8 and 0.4 did not receive rewards because their scores sit in a different range than those of the other three validators.

Impact

Validators submitting outlier scores (e.g., in the range of 100–200) can disproportionately influence the validation mean and standard deviation, skewing the calculated range for validator reward eligibility. As a result, other, more representative scores (e.g., in the range of 0–5) are excluded from validator rewards. This manipulation can lead to honest validators being unfairly denied rewards.

Tools Used

Manual review, Remix, Hardhat

Recommendations

To prevent manipulation through extreme score outliers, consider introducing a configurable `maxScore` parameter within the `TaskRequest` struct to enforce a maximum allowable score. With such a limit, validators' scores are restricted to a reasonable threshold, preventing inflated values from disproportionately affecting the mean and standard deviation calculations. The parameter should be checked within the `validate` function to ensure all submitted scores stay within the acceptable range.

```diff
+ // Place this custom error at the top of the `LLMOracleCoordinator.sol` contract:
+ error ScoreOutOfRange(uint256 taskId, uint256 providedScore, uint256 maxScore);

  // ...

  function validate(uint256 taskId, uint256 nonce, uint256[] calldata scores, bytes calldata metadata)
      public
      onlyRegistered(LLMOracleKind.Validator)
      onlyAtStatus(taskId, TaskStatus.PendingValidation)
  {
      TaskRequest storage task = requests[taskId];

      // ensure there is a score for each generation
      if (scores.length != task.parameters.numGenerations) {
          revert InvalidValidation(taskId, msg.sender);
      }

+     // reject any score above the configured maximum
+     uint256 maxScore = task.maxScore;
+     for (uint256 i = 0; i < scores.length; i++) {
+         if (scores[i] > maxScore) {
+             revert ScoreOutOfRange(taskId, scores[i], maxScore);
+         }
+     }

      // ...
  }
```
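
With this guard in place, the outlier submissions from the PoC would revert instead of skewing the statistics. Below is a minimal sketch of how that could be asserted in the existing Hardhat suite, assuming the `ScoreOutOfRange` error above is deployed, the task's `maxScore` is below `parseEther("100")`, and a valid `nonce` has already been mined for `validators[0]` (the proof-of-work details are elided here):

```ts
// Hypothetical test, assuming the ScoreOutOfRange guard above is in place.
it("should reject out-of-range scores", async function () {
  await expect(
    coordinator.connect(validators[0]).validate(taskId, nonce, [parseEther("100")], metadata)
  ).to.be.revertedWithCustomError(coordinator, "ScoreOutOfRange");
});
```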
Updates

Lead Judging Commences

inallhonesty Lead Judge 12 months ago
Submission Judgement Published
Validated
Assigned finding tags:

Unbounded score values in `validate` function
