
AI News: OpenAI Launches New Benchmark To Tackle AI Factuality


Renowned artificial intelligence (AI) firm OpenAI has introduced SimpleQA, a factuality benchmark. According to OpenAI’s description, the benchmark measures the ability of language models to answer short, fact-seeking questions. It marks another attempt by the AI giant to restore trust in its flagship products.

SimpleQA Challenges Frontier Models

A general problem faced by AI platforms is training models to provide responses that are factually correct.

Today's models sometimes produce false outputs or give answers that are not supported by evidence, a problem generally referred to as “hallucination.” Consequently, users gravitate toward the few models that provide more accurate responses with fewer hallucinations.

In response, OpenAI developed the SimpleQA benchmark to measure the factuality of language models. As the firm noted, measuring factuality is challenging, so SimpleQA focuses on short, fact-seeking queries. This design narrows the scope of the benchmark, but it also makes measuring factuality much more tractable.

The team behind the benchmark aimed for high correctness, diversity, and a good researcher UX. Unlike earlier benchmarks such as TriviaQA, which has become saturated, OpenAI’s SimpleQA was built to challenge frontier models, including GPT-4o, which currently scores less than 40%. While building the dataset, the team ensured that each question met certain criteria.

“As a final verification of quality, we had a third AI trainer answer a random sample of 1,000 questions from the dataset. We found that the third AI trainer’s answer matched the original agreed answers 94.4% of the time, with a 5.6% disagreement rate,” the ChatGPT maker wrote.
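The arithmetic behind that verification step can be sketched in a few lines of Python. This is only an illustration: the function name `agreement_rate`, the placeholder answers, and the case-insensitive exact-match comparison are assumptions made here, not OpenAI's actual procedure, which the quote does not describe in detail.

```python
def agreement_rate(reference_answers, reviewer_answers):
    """Return the fraction of reviewer answers that match the reference answers.

    Matching is a case-insensitive exact comparison, which is an assumption
    made for this sketch rather than the method OpenAI used.
    """
    if len(reference_answers) != len(reviewer_answers):
        raise ValueError("answer lists must have the same length")
    if not reference_answers:
        raise ValueError("answer lists must not be empty")
    matches = sum(
        ref.strip().lower() == rev.strip().lower()
        for ref, rev in zip(reference_answers, reviewer_answers)
    )
    return matches / len(reference_answers)


# Hypothetical sample of 1,000 questions in which 944 reviewer answers match.
reference = ["answer"] * 1000
reviewer = ["answer"] * 944 + ["different"] * 56

rate = agreement_rate(reference, reviewer)
print(f"agreement: {rate:.1%}, disagreement: {1 - rate:.1%}")
```

With 944 matches out of 1,000 sampled questions, the two rates correspond to the 94.4% and 5.6% figures in the quote.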

OpenAI’s Valuation Surges to $157 Bln

At the beginning of October, the AI firm saw its valuation reach $157 billion after it secured $6.6 billion in funding from investors. These investors include Thrive Capital, which led the round, Microsoft Corporation, and AI giant NVIDIA. The ascent of the Sam Altman-led firm hinges on its plans to bolster its position in frontier AI research.

A week after raising the funds, the firm revealed that it is opening new offices in the United States, Europe, and Asia, marking another major step in its global expansion.

The offices will be located in NYC, Seattle, Paris, Brussels, and Singapore, adding to those already in San Francisco, London, Dublin, and Tokyo. The decision to launch SimpleQA marks a product expansion push that followed the spike in OpenAI’s valuation.

The post AI News: OpenAI Launches New Benchmark To Tackle AI Factuality appeared first on CoinGape.







Copyright © 2024 Oldamericanbroker.com