Trader News
AI

Is AGI Here? Not Even Close, New AI Benchmark Suggests

By News Room, Mar 26, 2026 3:54 pm EDT

In brief

  • ARC-AGI-3 exposes a huge gap between AGI claims and reality, with leading AI models scoring below 1% while humans achieve perfect performance.
  • The benchmark tests true generalization, requiring agents to explore, plan, and learn from scratch in unknown environments rather than recall trained patterns.
  • Despite industry hype, current AI systems remain far from AGI, lacking the reasoning and adaptability that even young humans show naturally.

Nvidia CEO Jensen Huang went on Lex Fridman’s podcast recently and said, plainly, “I think we’ve achieved AGI.” Two days later, the most rigorous test in AI research dropped its latest artificial general intelligence benchmark, and every frontier model scored below 1%.

The ARC Prize Foundation released ARC-AGI-3 today, and the results are brutal. Google’s Gemini 3.1 Pro led the pack at 0.37%. OpenAI’s GPT-5.4 came in at 0.26%. Anthropic’s Claude Opus 4.6 managed 0.25%, while xAI’s Grok-4.20 scored exactly zero. Humans, meanwhile, solved 100% of environments.

This isn’t a trivia quiz, a coding exam, or even a set of ultra-hard PhD-level questions. ARC-AGI-3 is entirely different from anything the AI industry has faced before.

The benchmark was built by François Chollet and Mike Knoop’s foundation, which set up an in-house game studio and created 135 original interactive environments from scratch. The idea is to drop an AI agent into an unfamiliar game-like world with no instructions, no stated goals, and no description of the rules. The agent has to explore, figure out what it’s supposed to do, form a plan, and execute it.

If that sounds like something any five-year-old can do, you’re starting to understand the problem. If you want to see whether you are better than AI, you can play the same games featured in the test. We tried one; it was strange at first, but after a few seconds it is easy to get the hang of it.

It is also the clearest illustration of what the “G” in AGI stands for. When you generalize, you are able to create new knowledge (how a strange game works) without being trained on it beforehand.

Previous versions of ARC tested static visual puzzles: show a pattern, predict the next one. They were hard at first. Then the labs threw compute and training at them until the benchmarks were effectively dead. ARC-AGI-1, introduced in 2019, fell to test-time training and reasoning models. ARC-AGI-2 lasted about a year before Gemini 3.1 Pro hit 77.1%. The labs are very good at saturating benchmarks they can train against.

Version 3 was designed specifically to prevent that. With 110 of the 135 environments kept private (55 semi-private for API testing, 55 fully locked for competition), there’s no dataset to memorize. You can’t brute-force your way through novel game logic you’ve never seen.

Scoring isn’t pass/fail either. ARC-AGI-3 uses what the foundation calls RHAE, or Relative Human Action Efficiency. The baseline is the second-best, first-run human performance. An AI that takes 10 times as many actions as a human scores 1% for that level, not 10%. The formula squares the penalty for inefficiency. Wandering around, backtracking, and guessing your way to an answer gets punished hard.
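
The squared penalty is easy to check with a few lines of Python. This is only a sketch of the per-level arithmetic described above, not the foundation’s scoring code: the function name `level_score` is hypothetical, the cap at 100% for agents that beat the human baseline is an assumption, and how per-level scores are combined into a final number is not covered here.

```python
def level_score(human_actions: int, ai_actions: int) -> float:
    """Per-level score as a squared action-efficiency ratio.

    human_actions: action count of the human baseline for this level.
    ai_actions: action count the AI agent needed for the same level.
    Capping at 1.0 is an assumption; the article does not say what
    happens when an agent needs fewer actions than the baseline.
    """
    if human_actions <= 0 or ai_actions <= 0:
        raise ValueError("action counts must be positive")
    efficiency = min(1.0, human_actions / ai_actions)
    return efficiency ** 2  # squaring turns a 10x overshoot into 1%, not 10%


# An agent that needs 10 times as many actions as the baseline.
print(f"{level_score(20, 200):.2%}")  # 1.00%
# An agent that needs twice as many actions.
print(f"{level_score(20, 40):.2%}")   # 25.00%
```

Under this reading, even a modest amount of wandering is costly: twice the baseline’s actions already cuts the level score to a quarter.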

The best AI agent in the month-long developer preview scored 12.58%. Frontier LLMs tested through the official API, with no custom tooling, couldn’t crack 1%. Ordinary humans solved all 135 environments with no prior training and no instructions. If that’s the bar, then the current crop of models isn’t clearing it.

There is one real methodological debate here. ARC’s report says a Duke-built custom harness pushed Claude Opus 4.6 from 0.25% to 97.1% on a single environment variant called TR87. That doesn’t mean Claude scored 97.1% on ARC-AGI-3 overall; its official benchmark score remained 0.25%, but the jump is still worth noting.

The official benchmark feeds agents JSON, not visuals. That’s either a methodological flaw or a demonstration that today’s models are better at processing human-friendly information than raw structured data. Chollet’s foundation has acknowledged the debate, but isn’t changing the format.
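
To make the JSON-versus-visuals distinction concrete, here is a small Python sketch. The payload is entirely hypothetical: the field names `state` and `grid`, the integer cell codes, and the glyph mapping are invented for illustration and are not the ARC-AGI-3 API format. It only shows how the same frame looks as raw structured data and as a human-friendly rendering.

```python
import json

# Hypothetical frame payload. Field names and cell codes are invented
# for illustration; they are NOT taken from the real ARC-AGI-3 API.
raw_frame = '{"state": "IN_PROGRESS", "grid": [[0, 0, 1], [0, 2, 1], [0, 0, 1]]}'

# Assumed mapping from cell codes to characters a person can scan quickly.
GLYPHS = {0: ".", 1: "#", 2: "@"}


def render(frame_json: str) -> str:
    """Turn the JSON grid into rows of characters, one line per row."""
    frame = json.loads(frame_json)
    return "\n".join(
        "".join(GLYPHS.get(cell, "?") for cell in row)
        for row in frame["grid"]
    )


print(raw_frame)          # what an agent would receive in this sketch
print(render(raw_frame))  # the same frame, rendered for a human reader
```

A custom harness of the kind mentioned above could, in principle, sit between the API and the model and perform a translation like this, which is why the input format is part of the debate.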

“Frame content perception and API format are not limiting factors for frontier model performance on ARC-AGI-3,” the paper reads. In other words, the authors appear to reject the idea that models fail because they “can’t see” the tasks properly, arguing instead that perception is already sufficient and that the real gap lies in reasoning and generalization.

The AGI reality check arrived during a week when the hype machine was running at full speed. Besides Huang’s comment, Arm named its new data center chip the “AGI CPU.” OpenAI’s Sam Altman has said they have “basically built AGI,” and Microsoft is already marketing a lab focused on building ASI, or artificial superintelligence, the stage that follows once AGI is achieved. The term is being stretched until it means whatever is commercially convenient, it seems.

Chollet’s position is simpler. If an ordinary human with no instructions can do it, and your system can’t, then you don’t have AGI; you have a very expensive autocomplete that needs a lot of help.

ARC Prize 2026 is offering $2 million across three competition tracks, all hosted on Kaggle. Every winning solution must be open-sourced. The clock is running, and right now, the machines aren’t even close.

Copyright © 2026. TraderNews. All Rights Reserved.