In LLMOracleCoordinator.getBestResponse(), when there are no validations (or too few validations to be meaningful), the function arbitrarily returns the first response without any quality assessment:
LLMOracleCoordinator.sol#L400-L423
This is problematic in two scenarios:
When numValidations = 0: All responses will have a score of 0, so the first response is always selected.
When numValidations <= 2: The statistical calculations of mean and standard deviation become meaningless with such a small sample size. For example:
With 1 validation: There's no variance, making standard deviation calculations useless
With 2 validations: Any two scores will form a "valid" mean and standard deviation, even if they're completely incorrect
Statistical significance typically requires at least 3-5 samples for any meaningful analysis
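Both scenarios can be reproduced in a small off-chain model. The Python sketch below is an illustration, not the contract's code: pick_best and summarize are hypothetical helpers. It assumes that selection scans for a strictly higher score and that validator scores are summarized with a population standard deviation.

```python
from statistics import mean, pstdev


def pick_best(scores):
    """Return the index of the highest score; ties keep the earliest index."""
    if not scores:
        raise ValueError("no responses to choose from")
    best = 0
    for i in range(1, len(scores)):
        # Strict comparison (assumed): an equal score never replaces the first entry.
        if scores[i] > scores[best]:
            best = i
    return best


def summarize(validation_scores):
    """Mean and population standard deviation of the validator scores."""
    return mean(validation_scores), pstdev(validation_scores)


# numValidations = 0: every response keeps the default score of 0,
# so the scan never moves past the first response.
no_validation_pick = pick_best([0, 0, 0])

# One validation: a single sample has no spread, so the deviation is always 0.
one_sample = summarize([5])

# Two validations: each score lies exactly one deviation from the mean,
# so a filter of the assumed form |score - mean| <= stddev rejects neither.
two_scores = [1, 9]
two_samples = summarize(two_scores)
both_pass = all(abs(s - two_samples[0]) <= two_samples[1] for s in two_scores)
```

In this model the pick is index 0 whenever all scores tie, and two validators who disagree as strongly as 1 versus 9 still both fall inside one standard deviation of their mean.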
Users receive arbitrary responses that may not be the best quality
The first responder is unfairly advantaged in no-validation scenarios
The protocol's quality assurance mechanism is bypassed
Users paying for multiple generations don't get the intended benefit of selecting the best response
The statistical scoring system becomes trivially gameable with 2 or fewer validations
Allow users to select their preferred response when the validation count is <= 2, or add a minimum validation threshold for automated selection.
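A minimal sketch of how the two options could fit together, again as an off-chain Python model and not the contract's implementation. The names Response, MIN_VALIDATIONS, InsufficientValidations, get_best_response and select_response are hypothetical, and the threshold of 3 is an assumption taken from the lower end of the 3-5 sample range mentioned above.

```python
from dataclasses import dataclass

MIN_VALIDATIONS = 3  # hypothetical threshold, not a value from the contract


class InsufficientValidations(Exception):
    """Raised when automated selection is refused for lack of validations."""


@dataclass
class Response:
    responder: str
    score: int
    output: str


def get_best_response(responses, num_validations, min_validations=MIN_VALIDATIONS):
    """Automated selection, allowed only at or above the validation threshold."""
    if not responses:
        raise ValueError("task has no responses")
    if num_validations < min_validations:
        # Refuse to fall back silently to the first response.
        raise InsufficientValidations(
            f"{num_validations} validations, need at least {min_validations}"
        )
    # max() returns the first maximal element, so ties still resolve by order.
    return max(responses, key=lambda r: r.score)


def select_response(responses, index):
    """Manual selection: the requester names the response they prefer."""
    if not 0 <= index < len(responses):
        raise IndexError("response index out of range")
    return responses[index]
```

A caller would try get_best_response first and, on InsufficientValidations, present all responses to the user and call select_response with their choice. On-chain, the same guard would take the form of a revert inside getBestResponse().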