Dria

Swan
NFTHardhat
21,000 USDC
View results
Submission Details
Severity: high
Valid

finalizeValidation() incorrectly assumes that standard deviations is always less than mean

Summary

Due to an incorrect assumption about mean and standard deviation results finalizeValidation() can fail with underflow panic.

Vulnerability Details

finalizeValidation() has an if comparing scores with mean and standard deviation differences and sum:

...
uint256 score = scores[v_i];
if ((score >= _mean - _stddev) && (score <= _mean + _stddev)) { // <- assumes _mean >= _stddev in subtraction
innerSum += score;
innerCount++;
...

Here it assumes that _mean >= _stddev is always true, because otherwise score >= _mean - _stddev will underflow due to the subtraction. But that is exactly what can happen. In certain combinations of scores, standard deviation can easily exceed mean of the same elements.

Example data [1e14, 1e15, 1e16]:

  • Standard deviation: 4469899327725401

  • Mean: 3700000000000000

We can see that standard deviation, can greatly exceed mean, even with no min/max elements like 0 or 1e18. The calculation was also confirmed using wolframalpha - population standard deviation {1e14, 1e15, 1e16} and average {1e14, 1e15, 1e16}.

Because standard deviation can exceed mean it will cause a panic and halt the execution.

Impact

When this happens the last oracle will be unable to finalizeValidation() - loss of gas and the buyerAgent will not be able to get a getBestResponse due the task never reaching "Completed" status - loss of fees. And can result in listed assets being lost.

Tools Used

Manual review + hardhat tests.

This issue can't be tested if one of the previous issues isn't fixed variance underflow:

for (uint256 i = 0; i < data.length; i++) {
- uint256 diff = data[i] - mean; // <- bug
+ uint256 diff = data[i] >= mean ? data[i] - mean : mean - data[i]; // <- fix

After the fix run this modified LLMOracleCoordinator.test.ts:

import { expect } from "chai";
import { ethers } from "hardhat";
import type { ERC20, LLMOracleCoordinator, LLMOracleRegistry } from "../../typechain-types";
import type { HardhatEthersSigner } from "@nomicfoundation/hardhat-ethers/signers";
import { parseEther } from "ethers";
import { deployLLMFixture, deployTokenFixture } from "./../fixtures/deploy";
import { registerOracles, safeRequest, safeRespond, safeValidate } from "./../helpers";
import { OracleKind, TaskStatus } from "./../types/enums";
import { transferTokens } from "./../helpers";
/**
* Test scenario:
*
* There are 3 generators, only 2 responses required in total.
*
* - Generator #1 responds to the request
* - Generator #1 fails to respond again because it already responded
* - Generator #2 responds to the request
* - Generator #3 tries to respond (should fail because generation phase has ended)
*/
describe("LLMOracleCoordinator", function () {
let dria: HardhatEthersSigner;
let requester: HardhatEthersSigner;
let generators: HardhatEthersSigner[];
let validators: HardhatEthersSigner[];
let dummy: HardhatEthersSigner; // just an outsider
let coordinator: LLMOracleCoordinator;
let registry: LLMOracleRegistry;
let token: ERC20;
let coordinatorAddress: string;
let registryAddress: string;
let taskId = 0n; // this will be updated throughout the test
/// mock LLM input & output
const input = "0x" + Buffer.from("What is 2 + 2?").toString("hex");
const output = "0x" + Buffer.from("2 + 2 equals 4.").toString("hex");
const models = "0x" + Buffer.from("gpt-4o-mini").toString("hex");
const metadata = "0x"; // empty metadata
const difficulty = 2;
const SUPPLY = parseEther("1000");
const STAKES = {
generatorStakeAmount: parseEther("0.01"),
validatorStakeAmount: parseEther("0.01"),
};
const FEES = {
platformFee: parseEther("0.001"),
generationFee: parseEther("0.002"),
validationFee: parseEther("0.0003"),
};
this.beforeAll(async function () {
// assign roles, full = oracle that can do both generation & validation
const [deployer, dum, req1, gen1, gen2, gen3, val1, val2, val3] = await ethers.getSigners();
dria = deployer;
requester = req1;
dummy = dum;
generators = [gen1, gen2, gen3];
validators = [val1, val2, val3];
token = await deployTokenFixture(deployer, SUPPLY);
({ registry, coordinator } = await deployLLMFixture(dria, token, STAKES, FEES));
const requesterFunds = parseEther("1");
await transferTokens(token, [
[requester.address, requesterFunds],
// each oracle should have at least the stake amount
...generators.map<[string, bigint]>((oracle) => [oracle.address, STAKES.generatorStakeAmount]),
...validators.map<[string, bigint]>((oracle) => [oracle.address, STAKES.validatorStakeAmount]),
]);
registryAddress = await registry.getAddress();
coordinatorAddress = await coordinator.getAddress();
});
it("should register oracles", async function () {
await registerOracles(token, registry, generators, validators, STAKES);
});
// quick fix for BigInt serialization
(BigInt.prototype as any).toJSON = function () {
return this.toString();
};
describe("with validation", function () {
const [numGenerations, numValidations] = [2, 3];
this.beforeAll(async () => {
taskId++;
});
it("should make a request", async function () {
await safeRequest(coordinator, token, requester, taskId, input, models, {
difficulty,
numGenerations,
numValidations,
});
});
it("should respond (1/2 & 2/2)", async function () {
for (let i = 0; i < numGenerations; i++) {
await safeRespond(coordinator, generators[i], output, metadata, taskId, BigInt(i));
}
});
it("should validate (1/3) a generation only once", async function () {
const validator = validators[0];
await safeValidate(coordinator, validator, [parseEther("0.01"), parseEther("0.01")], metadata, taskId, 0n);
});
it("should validate (2/3)", async function () {
const validator = validators[1];
await safeValidate(coordinator, validator, [parseEther("0.001"), parseEther("0.001")], metadata, taskId, 1n);
const request = await coordinator.requests(taskId);
expect(request.status).to.equal(TaskStatus.Completed);
});
it("should validate (3/3)", async function () {
const validator = validators[2];
await safeValidate(coordinator, validator, [parseEther("0.0001"), parseEther("0.0001")], metadata, taskId, 2n);
const request = await coordinator.requests(taskId);
expect(request.status).to.equal(TaskStatus.Completed);
});
});
});

The result will be: reverted with panic code 0x11 (Arithmetic operation overflowed outside of an unchecked block)

and the stack trace:

at LLMOracleCoordinator.finalizeValidation (contracts/llm/LLMOracleCoordinator.sol:343)
at LLMOracleCoordinator.validate (contracts/llm/LLMOracleCoordinator.sol:305)
at <UnrecognizedContract>.<unknown> (0x5fc8d32690cc91d4c39d9d3abcbd16989f875707)
at EdrProviderWrapper.request (node\_modules/hardhat/src/internal/hardhat-network/provider/provider.ts:431:41)
at async HardhatEthersSigner.sendTransaction (node\_modules/@nomicfoundation/hardhat-ethers/src/signers.ts:125:18)
at async send (node\_modules/ethers/src.ts/contract/contract.ts:313:20)
at async Proxy.validate (node\_modules/ethers/src.ts/contract/contract.ts:352:16)

It starts at LLMOracleCoordinator.sol:343:

uint256 score = scores[v_i];
if ((score >= _mean - _stddev) && (score <= _mean + _stddev)) { // <- panic due to subtraction
innerSum += score;
innerCount++;

There is another a similar case on line LLMOracleCoordinator.sol:368. Which can become more prominent if generationDeviationFactor is increased above 1:

for (uint256 g_i = 0; g_i < task.parameters.numGenerations; g_i++) {
// ignore lower outliers
if (generationScores[g_i] >= mean - generationDeviationFactor * stddev) { // <- same problem
_increaseAllowance(responses[taskId][g_i].responder, task.generatorFee);
}
}

Recommendations

Either add an explicit limit to not go below zero or switch to using an alternative method to calculating average sample distance.

Updates

Lead Judging Commences

inallhonesty Lead Judge 9 months ago
Submission Judgement Published
Validated
Assigned finding tags:

Underflow in `LLMOracleCoordinator::validate`

Support

FAQs

Can't find an answer? Chat with us on Discord, Twitter or Linkedin.