Dria

Swan

NFTHardhat

21,000 USDC

View results

Previous Next

Submission Details

Severity: high

Valid

finalizeValidation() incorrectly assumes that standard deviations is always less than mean

jsondoge

Summary

Due to an incorrect assumption about mean and standard deviation results finalizeValidation() can fail with underflow panic.

Vulnerability Details

finalizeValidation() has an if comparing scores with mean and standard deviation differences and sum:

...

uint256 score = scores[v_i];

if ((score >= _mean - _stddev) && (score <= _mean + _stddev)) { // <- assumes _mean >= _stddev in subtraction

innerSum += score;

innerCount++;

...

Here it assumes that _mean >= _stddev is always true, because otherwise score >= _mean - _stddev will underflow due to the subtraction. But that is exactly what can happen. In certain combinations of scores, standard deviation can easily exceed mean of the same elements.

Example data [1e14, 1e15, 1e16]:

Standard deviation: 4469899327725401
Mean: 3700000000000000

We can see that standard deviation, can greatly exceed mean, even with no min/max elements like 0 or 1e18. The calculation was also confirmed using wolframalpha - population standard deviation {1e14, 1e15, 1e16} and average {1e14, 1e15, 1e16}.

Because standard deviation can exceed mean it will cause a panic and halt the execution.

Impact

When this happens the last oracle will be unable to finalizeValidation() - loss of gas and the buyerAgent will not be able to get a getBestResponse due the task never reaching "Completed" status - loss of fees. And can result in listed assets being lost.

Tools Used

Manual review + hardhat tests.

This issue can't be tested if one of the previous issues isn't fixed variance underflow:

for (uint256 i = 0; i < data.length; i++) {

- uint256 diff = data[i] - mean; // <- bug

+ uint256 diff = data[i] >= mean ? data[i] - mean : mean - data[i]; // <- fix

After the fix run this modified LLMOracleCoordinator.test.ts:

import { expect } from "chai";

import { ethers } from "hardhat";

import type { ERC20, LLMOracleCoordinator, LLMOracleRegistry } from "../../typechain-types";

import type { HardhatEthersSigner } from "@nomicfoundation/hardhat-ethers/signers";

import { parseEther } from "ethers";

import { deployLLMFixture, deployTokenFixture } from "./../fixtures/deploy";

import { registerOracles, safeRequest, safeRespond, safeValidate } from "./../helpers";

import { OracleKind, TaskStatus } from "./../types/enums";

import { transferTokens } from "./../helpers";

/**

* Test scenario:

* There are 3 generators, only 2 responses required in total.

* - Generator #1 responds to the request

* - Generator #1 fails to respond again because it already responded

* - Generator #2 responds to the request

* - Generator #3 tries to respond (should fail because generation phase has ended)

describe("LLMOracleCoordinator", function () {

let dria: HardhatEthersSigner;

let requester: HardhatEthersSigner;

let generators: HardhatEthersSigner[];

let validators: HardhatEthersSigner[];

let dummy: HardhatEthersSigner; // just an outsider

let coordinator: LLMOracleCoordinator;

let registry: LLMOracleRegistry;

let token: ERC20;

let coordinatorAddress: string;

let registryAddress: string;

let taskId = 0n; // this will be updated throughout the test

/// mock LLM input & output

const input = "0x" + Buffer.from("What is 2 + 2?").toString("hex");

const output = "0x" + Buffer.from("2 + 2 equals 4.").toString("hex");

const models = "0x" + Buffer.from("gpt-4o-mini").toString("hex");

const metadata = "0x"; // empty metadata

const difficulty = 2;

const SUPPLY = parseEther("1000");

const STAKES = {

generatorStakeAmount: parseEther("0.01"),

validatorStakeAmount: parseEther("0.01"),

};

const FEES = {

platformFee: parseEther("0.001"),

generationFee: parseEther("0.002"),

validationFee: parseEther("0.0003"),

};

this.beforeAll(async function () {

// assign roles, full = oracle that can do both generation & validation

const [deployer, dum, req1, gen1, gen2, gen3, val1, val2, val3] = await ethers.getSigners();

dria = deployer;

requester = req1;

dummy = dum;

generators = [gen1, gen2, gen3];

validators = [val1, val2, val3];

token = await deployTokenFixture(deployer, SUPPLY);

({ registry, coordinator } = await deployLLMFixture(dria, token, STAKES, FEES));

const requesterFunds = parseEther("1");

await transferTokens(token, [

[requester.address, requesterFunds],

// each oracle should have at least the stake amount

...generators.map<[string, bigint]>((oracle) => [oracle.address, STAKES.generatorStakeAmount]),

...validators.map<[string, bigint]>((oracle) => [oracle.address, STAKES.validatorStakeAmount]),

]);

registryAddress = await registry.getAddress();

coordinatorAddress = await coordinator.getAddress();

});

it("should register oracles", async function () {

await registerOracles(token, registry, generators, validators, STAKES);

});

// quick fix for BigInt serialization

(BigInt.prototype as any).toJSON = function () {

return this.toString();

};

describe("with validation", function () {

const [numGenerations, numValidations] = [2, 3];

this.beforeAll(async () => {

taskId++;

});

it("should make a request", async function () {

await safeRequest(coordinator, token, requester, taskId, input, models, {

difficulty,

numGenerations,

numValidations,

});

it("should respond (1/2 & 2/2)", async function () {

for (let i = 0; i < numGenerations; i++) {

await safeRespond(coordinator, generators[i], output, metadata, taskId, BigInt(i));

}

});

it("should validate (1/3) a generation only once", async function () {

const validator = validators[0];

await safeValidate(coordinator, validator, [parseEther("0.01"), parseEther("0.01")], metadata, taskId, 0n);

});

it("should validate (2/3)", async function () {

const validator = validators[1];

await safeValidate(coordinator, validator, [parseEther("0.001"), parseEther("0.001")], metadata, taskId, 1n);

const request = await coordinator.requests(taskId);

expect(request.status).to.equal(TaskStatus.Completed);

});

it("should validate (3/3)", async function () {

const validator = validators[2];

await safeValidate(coordinator, validator, [parseEther("0.0001"), parseEther("0.0001")], metadata, taskId, 2n);

const request = await coordinator.requests(taskId);

expect(request.status).to.equal(TaskStatus.Completed);

});

The result will be: reverted with panic code 0x11 (Arithmetic operation overflowed outside of an unchecked block)

and the stack trace:

at LLMOracleCoordinator.finalizeValidation (contracts/llm/LLMOracleCoordinator.sol:343)

at LLMOracleCoordinator.validate (contracts/llm/LLMOracleCoordinator.sol:305)

at <UnrecognizedContract>.<unknown> (0x5fc8d32690cc91d4c39d9d3abcbd16989f875707)

at EdrProviderWrapper.request (node\_modules/hardhat/src/internal/hardhat-network/provider/provider.ts:431:41)

at async HardhatEthersSigner.sendTransaction (node\_modules/@nomicfoundation/hardhat-ethers/src/signers.ts:125:18)

at async send (node\_modules/ethers/src.ts/contract/contract.ts:313:20)

at async Proxy.validate (node\_modules/ethers/src.ts/contract/contract.ts:352:16)

It starts at LLMOracleCoordinator.sol:343:

uint256 score = scores[v_i];

if ((score >= _mean - _stddev) && (score <= _mean + _stddev)) { // <- panic due to subtraction

innerSum += score;

innerCount++;

There is another a similar case on line LLMOracleCoordinator.sol:368. Which can become more prominent if generationDeviationFactor is increased above 1:

for (uint256 g_i = 0; g_i < task.parameters.numGenerations; g_i++) {

// ignore lower outliers

if (generationScores[g_i] >= mean - generationDeviationFactor * stddev) { // <- same problem

_increaseAllowance(responses[taskId][g_i].responder, task.generatorFee);

}

Recommendations

Either add an explicit limit to not go below zero or switch to using an alternative method to calculating average sample distance.

Updates

Lead Judging Commences

inallhonesty Lead Judge 11 months ago

Submission Judgement Published

Validated

Assigned finding tags:

Underflow in `LLMOracleCoordinator::validate`

Prize pool breakdown

Total prize

21,000 USDC

nSLOC

1,034

20 USDC / LOC

High Medium

20,000 USDC

Low

1,000 USDC

Live

The contest is live. Earn rewards by submitting a finding.

Judging

Submissions are being carefully reviewed by our judges.

View all submissions

Appeals

This is your time to appeal against judgements on your submissions.

Appeals Review

Appeals are being carefully reviewed by our judges.

Rewards Distribution

The contest is complete and the rewards are being distributed.

View results

Support

FAQs

Can't find an answer? Chat with us on Discord, Twitter or Linkedin.