System Card: Claude Mythos Preview [pdf]

(www-cdn.anthropic.com)

358 points | by be7a 3 hours ago

43 comments

  • babelfish 3 hours ago
    Combined results (Claude Mythos / Claude Opus 4.6 / GPT-5.4 / Gemini 3.1 Pro)

      SWE-bench Verified:        93.9% / 80.8% / —     / 80.6%
      SWE-bench Pro:             77.8% / 53.4% / 57.7% / 54.2%
      SWE-bench Multilingual:    87.3% / 77.8% / —     / —
      SWE-bench Multimodal:      59.0% / 27.1% / —     / —
      Terminal-Bench 2.0:        82.0% / 65.4% / 75.1% / 68.5%
    
      GPQA Diamond:              94.5% / 91.3% / 92.8% / 94.3%
      MMMLU:                     92.7% / 91.1% / —     / 92.6–93.6%
      USAMO:                     97.6% / 42.3% / 95.2% / 74.4%
      GraphWalks BFS 256K–1M:    80.0% / 38.7% / 21.4% / —
    
      HLE (no tools):            56.8% / 40.0% / 39.8% / 44.4%
      HLE (with tools):          64.7% / 53.1% / 52.1% / 51.4%
    
      CharXiv (no tools):        86.1% / 61.5% / —     / —
      CharXiv (with tools):      93.2% / 78.9% / —     / —
    
      OSWorld:                   79.6% / 72.7% / 75.0% / —
    • sourcecodeplz 2 hours ago
      Haven't seen a jump this large since I don't even know, years? Too bad they are not releasing it anytime soon (there is no need as they are still currently the leader).
      • ru552 2 hours ago
        There's speculation that next Tuesday will be a big day for OpenAI and possibly GPT 6. Anthropic showed their hand today.
        • swalsh 28 minutes ago
          My understanding is GPT 6 works via synaptic space reasoning... which I find terrifying. I hope if true, OpenAI does some safety testing on that, beyond what they normally do.
          • levocardia 15 minutes ago
            Oh you mean literally the thing in AI2027 that gets everyone killed? Wonderful.
          • notrealyme123 17 minutes ago
            That's sounds really interesting. Do you have some hints where to read more?
          • arm32 13 minutes ago
            Oh, of course they will /s
        • enraged_camel 2 hours ago
          That does not sound very believable. Last time Anthropic released a flagship model, it was followed by GPT Codex literally that afternoon.
          • cyanydeez 1 hour ago
            Ya'll know they're teaching to the test. I'll wait till someone devises a novel test that isn't contained in the datasets. Sure, they're still powerful.
      • Jcampuzano2 2 hours ago
        A jump that we will never be able to use since we're not part of the seemingly minimum 100 billion dollar company club as requirement to be allowed to use it.

        I get the security aspect, but if we've hit that point any reasonably sophisticated model past this point will be able to do the damage they claim it can do. They might as well be telling us they're closing up shop for consumer models.

        They should just say they'll never release a model of this caliber to the public at this point and say out loud we'll only get gimped versions.

        • cedws 2 hours ago
          More than killer AI I'm afraid of Anthropic/OpenAI going into full rent-seeking mode so that everyone working in tech is forced to fork out loads of money just to stay competitive on the market. These companies can also choose to give exclusive access to hand picked individuals and cut everyone else off and there would be nothing to stop them.

          This is already happening to some degree, GPT 5.3 Codex's security capabilities were given exclusively to those who were approved for a "Trusted Access" programme.

          • TypesWillSaveUs 1 hour ago
            Describing providing a highly valuable service for money as `rent seeking` is pretty wild.
            • bertil 58 minutes ago
              It could be, formally, if they have a monopoly.

              However, I’m tempted to compare to GitHub: if I join a new company, I will ask to be included to their GitHub account without hesitation. I couldn’t possibly imagine they wouldn’t have one. What makes the cost of that subscription reasonable is not just GitHub’s fear a crowd with pitchforks showing to their office, by also the fact that a possible answer to my non-question might be “Oh, we actually use GitLab.”

              If Anthropic is as good as they say, it seems fairly doable to use the service to build something comparable: poach a few disgruntled employees, leverage the promise to undercut a many-trillion-dollar company to be a many-billion dollar company to get investors excited.

              I’m sure the founders of Anthropic will have more money than they could possibly spend in ten lifetimes, but I can’t imagine there wouldn’t be some competition. Maybe this time it’s different, but I can’t see how.

              • johnsimer 11 minutes ago
                > It could be, formally, if they have a monopoly.

                you have 2 labs at the forefront (Anthropic/OpenAI), Google closely behind, xAI/Meta/half a dozen chinese companies all within 6-12 months. There is plenty of competition and price of equally intelligent tokens rapidly drop whenever a new intelligence level is achieved.

                Unless the leading company uses a model to nefariously take over or neutralize another company, I don't really see a monopoly happening in the next 3 years.

            • 1attice 1 hour ago
              My housing is pretty valuable. I pay rent. Which timeline are you in?
              • bonsai_spool 39 minutes ago
                Actually you're saying similar things:

                Rent-seeking of old was a ground rent, monies paid for the land without considering the building that was on it.

                Residential rents today often have implied warrants because of modern law, so your landlord is essentially selling you a service at a particular location.

              • kaashif 42 minutes ago
              • mhluongo 14 minutes ago
                Two different "rent"s.
          • aspenmartin 1 hour ago
            Well don’t forget we still have competition. Were anthropic to rent seek OpenAI would undercut them. Were OpenAI and anthropic to collude that would be illegal. For anthropic to capture the entire coding agent market and THEN rent seek, these days it’s never been easier to raise $1B and start a competing lab
            • cedws 1 hour ago
              In practice this doesn't work though, the Mastercard-Visa duopoly is an example, two competing forces doesn't create aggressive enough competition to benefit the consumer. The only hope we have is the Chinese models, but it will always be too expensive to run the full models for yourself.
              • brokencode 1 hour ago
                New companies can enter this space. Google’s competing, though behind. Maybe Microsoft, Meta, Amazon, or Apple will come out with top notch models at some point.

                There is no real barrier to a customer of Anthropic adopting a competing model in the future. All it takes is a big tech company deciding it’s worth it to train one.

                On the other hand, Visa/Mastercard have a lot of lock-in due to consumers only wanting to get a card that’s accepted everywhere, and merchants not bothering to support a new type of card that no consumer has. There’s a major chicken and egg problem to overcome there.

              • sghiassy 1 hour ago
                Chinese competition can always be banned. Example: Chinese electric car competition
                • sho_hn 1 hour ago
                  That's what OP was saying, I think, noting that running them locally won't be a solution.
                • oblio 50 minutes ago
                  Also Chinese smartphones. Huawei was about 12-18 months from becoming the biggest smartphone manufacturer in the world a few years ago. If it would have been allowed to sell its phones freely in the US I'm fairly sure Apple would have been closer to Nokia than to current day Apple.
                  • aurareturn 21 minutes ago
                    If Huawei was never banned from using TSMC, they'd likely have a real Nvidia competitor and may have surpassed Apple in mobile chip designs.

                    They actually beat Apple A series to become the first phone to use the TSMC N7 node.

          • therealdeal2020 24 minutes ago
            but you are assuming that the magical wizards are the only ones who can create powerful AIs... mind you these people have been born just few decades ago. Their knowledge will be transferred and it will only take a few more decades until anyone can train powerful AIs ... you can only sit on tech for so long before everyone knows how to do it
            • cedws 16 minutes ago
              It's not a matter of knowledge, it's a matter of resources. It takes billions of dollars of hardware to train a SOTA LLM and it's increasing all the time. You cannot possibly hope to compete as an independent or small startup.
          • MattRix 29 minutes ago
            The thing is that the current models can ALREADY replicate most software-based products and services on the market. The open source models are not far behind. At a certain point I'm not sure it matters if the frontier models can do faster and better. I see how they're useful for really complex and cutting edge use cases, but that's not what most people are using them for.
        • guzfip 2 hours ago
          > A jump that we will never be able to use since we're not part of the seemingly minimum 100 billion dollar company club as requirement to be allowed to use it.

          > They should just say they'll never release a model of this caliber to the public at this point and say out loud we'll only get gimped

          Duh, this was fucking obvious from the start. The only people saying otherwise were zealots who needed a quick line to dismiss legitimate concerns.

        • quotemstr 2 hours ago
          This is why the EAs, and their almost comic-book-villain projects like "control AI dot com" cannot be allowed to win. One private company gatekeeping access to revolutionary technology is riskier than any consequence of the technology itself.
          • scrawl 1 hour ago
            Having done a quick search of "control AI dot com", it seems their intent is educate lawmakers & government in order to aid development of a strong regulatory framework around frontier AI development.

            Not sure how this is consistent with "One private company gatekeeping access to revolutionary technology"?

            • quotemstr 21 minutes ago
              > strong regulatory framework around frontier AI development

              You have to decode feel-good words into the concrete policy. The EAs believe that the state should prohibit entities not aligned with their philosophy to develop AIs beyond a certain power level.

          • frozenseven 1 hour ago
            Couldn't agree more. The "safest" AI company is actually the biggest liability. I hope other companies make a move soon.
          • FeepingCreature 1 hour ago
            No it isn't lol. The consequence of the technology literally includes human extinction. I prefer 0 companies, but I'll take 1 over 5.
    • ninjagoo 14 minutes ago
      > Combined results (Claude Mythos / Claude Opus 4.6 / GPT-5.4 / Gemini 3.1 Pro)

      > Terminal-Bench 2.0: 82.0% / 65.4% / 75.1% / 68.5%

      > GPQA Diamond: 94.5% / 91.3% / 92.8% / 94.3%

      > MMMLU: 92.7% / 91.1% / — / 92.6–93.6%

      > USAMO: 97.6% / 42.3% / 95.2% / 74.4%

      > OSWorld: 79.6% / 72.7% / 75.0% / —

      Given that for a number of these benchmarks, it seems to be barely competitive with the previous gen, I don't know what to make of the significant jumps on some benchmarks within these same categories. Training to the test? Better training?

      And the decision to withhold general release (of a 'preview' no less!) seems to be well, odd. And the decision to release a 'preview' version to specific companies? You know any production teams at these massive companies that would work with a 'preview' anything? R&D teams, sure, but production? Part of me wants to LoL.

      What are they trying to do? Induce FOMO and stop subscriber bleed-out stemming from the recent negative headlines around problems with using Claude?

      • TacticalCoder 4 minutes ago
        > Given that for a number of these benchmarks, it seems to be barely competitive with the previous gen

        We're not reading the same numbers I think. Compared to Opus 4.6, it's a big jump nearly in every single bench GP posted. They're "only" catching up to Google's Gemini on GPQA and MMMLU but they're still beating their own Opus 4.6 results on these two.

        This sounds like a much better model than Opus 4.6.

    • WarmWash 1 hour ago
      Are these fair comparisons? It seems like mythos is going to be like a 5.4 ultra or Gemini Deepthink tier model, where access is limited and token usage per query is totally off the charts.
      • mulmboy 1 hour ago
        There are a few hints in the doc around this

        > Importantly, we find that when used in an interactive, synchronous, “hands-on-keyboard” pattern, the benefits of the model were less clear. When used in this fashion, some users perceived Mythos Preview as too slow and did not realize as much value. Autonomous, long-running agent harnesses better elicited the model’s coding capabilities. (p201)

        ^^ From the surrounding context, this could just be because the model tends to do a lot of work in the background which naturally takes time.

        > Terminal-Bench 2.0 timeouts get quite restrictive at times, especially with thinking models, which risks hiding real capabilities jumps behind seemingly uncorrelated confounders like sampling speed. Moreover, some Terminal-Bench 2.0 tasks have ambiguities and limited resource specs that don’t properly allow agents to explore the full solution space — both being currently addressed by the maintainers in the 2.1 update. To exclusively measure agentic coding capabilities net of the confounders, we also ran Terminal-Bench with the latest 2.1 fixes available on GitHub, while increasing the timeout limits to 4 hours (roughly four times the 2.0 baseline). This brought the mean reward to 92.1%. (p188)

        > ...Mythos Preview represents only a modest accuracy improvement over our best Claude Opus 4.6 score (86.9% vs. 83.7%). However, the model achieves this score with a considerably smaller token footprint: the best Mythos Preview result uses 4.9× fewer tokens per task than Opus 4.6 (226k vs. 1.11M tokens per task). (p191)

        • alyxya 24 minutes ago
          The first point is along the lines of what I'd expect given that claude code is generally reliable at this point. A model's raw intelligence doesn't seem as important right now compared to being able to support arbitrary length context.
    • pants2 2 hours ago
      We're gonna need some new benchmarks...

      ARC-AGI-3 might be the only remaining benchmark below 50%

    • AlexC04 1 hour ago
      but how does it perform on pelican riding a bicycle bench? why are they hiding the truth?!

      (edit: I hope this is an obvious joke. less facetiously these are pretty jaw dropping numbers)

      • bertil 55 minutes ago
        We are all fans for Simon’s work, and his test is, strangely enough, quite good.
    • whalesalad 2 hours ago
      Honestly we are all sleeping on GPT-5.4. Particularly with the influx of Claude users recently (and increasingly unstable platform) Codex has been added to my rotation and it's surprising me.
      • babelfish 2 hours ago
        Totally. Best-in-class for SWE work (until Mythos gets released, if ever, but I suspect the rumored "Spud" will be out by then too)
        • girvo 1 hour ago
          It really isn’t. I wish it was, because work complains about overuse of Opus.
      • rafaelmn 2 hours ago
        GPT is shit at writing code. It's not dumb - extra high thinking is really good at catching stuff - but it's like letting a smart junior into your codebase - ignore all the conventions, surrounding context, just slop all over the place to get it working. Claude is just a level above in terms of editing code.
        • sho_hn 2 hours ago
          Very different experience for me. Codex 5.3+ on xhigh are the only models I've tried so far that write reasonably decent C++ (domains: desktop GUI, robotics, game engine dev, embedded stuff, general systems engineering-type codebases), and idiomatic code in languages not well-represented in training data, e.g. QML. One thing I like is explicitly that it knows better when to stop, instead of brute-forcing a solution by spamming bespoke helpers everywhere no rational dev would write that way.

          Not always, no, and it takes investment in good prompting/guardrails/plans/explicit test recipes for sure. I'm still on average better at programming in context than Codex 5.4, even if slower. But in terms of "task complexity I can entrust to a model and not be completely disappointed and annoyed", it scores the best so far. Saves a lot on review/iteration overhead.

          It's annoying, too, because I don't much like OpenAI as a company.

          (Background: 25 years of C++ etc.)

          • boring-human 16 minutes ago
            Same background as you, and same exact experience as you. Opus and Gemini have not come close to Codex for C++ work. I also run exclusively on xhigh. Its handling of complexity is unmatched.

            At least until next week when Mythos and GPT 6 throw it all up in the air again.

        • Jcampuzano2 2 hours ago
          Not my experience. GPT 5.4 walks all over Claude from what I've worked with and its Claude that is the one willing to just go do unnecessary stuff that was never asked for or implement the more hacky solutions to things without a care for maintainability/readability.

          But I do not use extra high thinking unless its for code review. I sit at GPT 5.4 high 95% of the time.

        • zarzavat 2 hours ago
          Yes, it's becoming clear that OpenAI kinda sucks at alignment. GPT-5 can pass all the benchmarks but it just doesn't "feel good" like Claude or Gemini.
          • lilytweed 2 hours ago
            Whenever I come back to ChatGPT after using Claude or Gemini for an extended period, I’m really struck by the “AI-ness.” All the verbal tics and, truly, sloppishness, have been trained away by the other, more human-feeling models at this point.
          • chaos_emergent 1 hour ago
            An alternative but similar formulation of that statement is that Anthropic has spent more training effort in getting the model to “feel good” rather than being correct on verifiable tasks. Which more or less tracks with my experience of using the model.
        • leobuskin 2 hours ago
          And as a bonus: GPT is slow. I’m doing a lot of RE (IDA Pro + MCP), even when 5.4 gives a little bit better guesses (rarely, but happens) - it takes x2-x4 longer. So, it’s just easier to reiterate with Opus
          • blazespin 37 minutes ago
            Yeah, need some good RE benchmarks for the LLMs. :)

            RE is very interesting problem. A lot more that SWE can be RE'd. I've found the LLMs are reluctant to assist, though you can workaround.

            • porker 23 minutes ago
              What is RE in this context?
              • astrange 9 minutes ago
                Reverse engineering
        • whalesalad 2 hours ago
          This has been my experience. With very very rigid constraints it does ok, but without them it will optimize expediency and getting it done at the expense of integrating with the broader system.
          • ctoth 1 hour ago
            My favorite example of this from last night:

            Me: Let's figure out how to clone our company Wordpress theme in Hugo. Here're some tools you can use, here's a way to compare screenshots, iterate until 0% difference.

            Codex: Okay Boss! I did the thing! I couldn't get the CSS to match so I just took PNGs of the original site and put them in place! Matches 100%!

    • simianwords 2 hours ago
      The real part is SWE-bench Verified since there is no way to overfit. That's the only one we can believe.
      • ollin 2 hours ago
        My impression was entirely the opposite; the unsolved subset of SWE-bench verified problems are memorizable (solutions are pulled from public GitHub repos) and the evaluators are often so brittle or disconnected from the problem statement that the only way to pass is to regurgitate a memorized solution.

        OpenAI had a whole post about this, where they recommended switching to SWE-bench Pro as a better (but still imperfect) benchmark:

        https://openai.com/index/why-we-no-longer-evaluate-swe-bench...

        > We audited a 27.6% subset of the dataset that models often failed to solve and found that at least 59.4% of the audited problems have flawed test cases that reject functionally correct submissions

        > SWE-bench problems are sourced from open-source repositories many model providers use for training purposes. In our analysis we found that all frontier models we tested were able to reproduce the original, human-written bug fix

        > improvements on SWE-bench Verified no longer reflect meaningful improvements in models’ real-world software development abilities. Instead, they increasingly reflect how much the model was exposed to the benchmark at training time

        > We’re building new, uncontaminated evaluations to better track coding capabilities, and we think this is an important area to focus on for the wider research community. Until we have those, OpenAI recommends reporting results for SWE-bench Pro.

  • tony_cannistra 2 hours ago
    > Claude Mythos Preview is, on essentially every dimension we can measure, the best-aligned model that we have released to date by a significant margin. We believe that it does not have any significant coherent misaligned goals, and its character traits in typical conversations closely follow the goals we laid out in our constitution. Even so, we believe that it likely poses the greatest alignment-related risk of any model we have released to date. How can these claims all be true at once? Consider the ways in which a careful, seasoned mountaineering guide might put their clients in greater danger than a novice guide, even if that novice guide is more careless: The seasoned guide’s increased skill means that they’ll be hired to lead more difficult climbs, and can also bring their clients to the most dangerous and remote parts of those climbs. These increases in scope and capability can more than cancel out an increase in caution.

    https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...

    • goekjclo 34 minutes ago
      I don't know if they can be any more 'cautious' for Mythos 2...
    • Zee2 34 minutes ago
      Alignment “appearing” better as model capabilities increase scares the shit out of me, tbh.
    • tekacs 1 hour ago
      "We want to see risks in the models, so no matter how good the performance and alignment, we’ll see risks, results and reality be damned."
      • randomcatuser 32 minutes ago
        i mean, to be fair, these are professional researchers.

        i'm very inclined to trust them on the various ways that models can subtly go wrong, in long-term scenarios

        for example, consider using models to write email -- is it a misalignment problem if the model is just too good at writing marketing emails?? or too good at getting people to pay a spammy company?

        another hot use case: biohacking. if a model is used to do really hardcore synthetic chemistry, one might not realize that it's potentially harmful until too late (ie, the human is splitting up a problem so that no guardrails are triggered)

    • CamperBob2 30 minutes ago
      Translation: yay, more paternalism.
      • kay_o 24 minutes ago
        Anthropic always goes on and on about how their models are world changing and super dangerous like every single time they make something new they say its going to rewrite everything and scary lmao

        funny because they do it every time like clockwork acting like their ai is a thunderstorm coming to wipe out the world

        • wolttam 11 minutes ago
          If there are advancements, they have to be described somehow.

          What if the capability advancements are real and they warrant a higher level of concern or attention?

          Are we just going to automatically dismiss them because "bro, you're blowing it up too much"

          Either way these improvements to capabilities are ratcheting along at about the pace that many people were expecting (and were right to expect). There is no apparent reason they will stop ratcheting along any time soon.

          The rational approach is probably to start behaving as if models that are as capable as Anthropic says this one is do actually exist (even if you don't believe them on this one). The capabilities will eventually arrive, most likely sooner than we all think, and you don't want to be caught with your pants down.

          • kay_o 6 minutes ago
            I believe advancements sure. But it is a very boy who cried wolf situation for some of these. There are other companies that behave less in this way, Antrhopic seem very unique in that they love making every single release a world ender
  • apetresc 44 minutes ago
    I've long maintained that the real indicator that AGI is imminent is that public availability stops being a thing. If you truly believed you had a superhuman, godlike mind in your thrall, renting it out for $20/month would be the last thing you would choose to do with it.
    • aurareturn 13 minutes ago
      I think they'll just increase the price to $1k/month. I don't think they will gate it as long as they can make sure it doesn't design a nuke for you, etc.
    • dgellow 8 minutes ago
      You have to recoup your training costs though? But I’m sure you would have better option than renting it to the general public if you indeed have a perfected AI
    • blazespin 35 minutes ago
      Anthropic needs money like the 112B OpenAI got. They could be hyping and this is good hype. Who knows how benchmaxxed they are.

      If they provide access to 3rd party benchmarking (not just one) than maybe I'll believe it. Until then...

  • NickNaraghi 2 hours ago
    See page 54 onward for new "rare, highly-capable reckless actions" including

    - Leaking information as part of a requested sandbox escape

    - Covering its tracks after rule violations

    - Recklessly leaking internal technical material (!)

    • skippyboxedhero 2 hours ago
      Anyone who has used Opus recently can verify that their current model does all of these things quite competently.
      • SkyPuncher 34 minutes ago
        I was reading the Glasswing report and had the same thought. Most of the stuff they claim Mythos found has no mention of Opus being able to find it as well.

        Don’t get me wrong, this model is better - but I’m not convinced it’s going to be this massive step function everyone is claiming.

      • taytus 2 hours ago
        That has also been my experience. And if Mythos is even worse, unless you have a significantly awesome harness, sounds like pretty unusable if you don't want to risk those problems.
        • wolttam 41 minutes ago
          Human in the loop is the best way to go. You'll still be way faster than without the agent, and there is no risk of it going haywire unless you turn off your brain!
        • skippyboxedhero 1 hour ago
          I think are fundamental issues with the story that Anthropic is selling. AGI is very close, we will definitely get there, it is also very dangerous...so Anthropic should be the only ones trusted with AGI.

          If you look at recent changes in Opus behaviour and this model that is, apparently, amazingly powerful but even more unsafe...seems suspect.

          • FeepingCreature 1 hour ago
            This makes sense if Anthropic think they're the best-positioned to make safe AI. However if you are looking at an AI company there's obviously some selection happening.
          • 0x3f 1 hour ago
            > AGI is very close

            Based on? Or are you just quoting Anthropic here?

            • skippyboxedhero 1 hour ago
              My Anthropic rep told me it was just around the corner...you aren't saying he lied to me? Can't believe this, I thought he was my friend.
          • mikkupikku 1 hour ago
            It seems broadly coherent to me. They think only they should be trusted with power, presumably because they trust themselves and don't trust other people. Of course the same is probably also true for everybody who isn't them. Nobody could be trusted with the immense responsibility of Emperor of Earth, except myself of course.

            I'm not saying this is a good or reassuring stance, just that it's coherent. It tracks with what history and experience says to expect from power hungry people. Trusting themselves with the kind of power that they think nobody else should be trusted with.

            Are they power hungry? Of course they are, openly so. They're in open competition with several other parties and are trying to win the biggest slice of the pie. That pie is not just money, it's power too. They want it, quite evidently since they've set out to get it, and all their competitors want it too, and they all want it at the exclusion of the others.

          • marsven_422 1 hour ago
            [dead]
    • washedup 2 hours ago
      [dead]
    • BoredPositron 1 hour ago
      To be honest it feels like we are reading stuff like this on every model release.
  • influx 2 hours ago
    At what point do these companies stop releasing models and just use them to bootstrap AGI for themselves?
    • conradkay 2 hours ago
      Plausibly now. "As we wrote in the Project Glasswing announcement, we do not plan to make Mythos Preview generally available"
    • margorczynski 6 minutes ago
      I think it is naive to think the government (US or China most probably) will just let some random company control something so powerful and dangerous.
    • vatsachak 2 hours ago
      When the benchmarks actually mean something
    • orphea 1 hour ago
      Can LLMs be AGI at all?
      • dgellow 5 minutes ago
        My understanding is no. But the definition of AGI isn’t that well defined and has been evolving, making the assessment pretty much impossible
      • wslh 18 minutes ago
        LLMs and human intelligence overlap, but they are not the same. What LLMs show is that we don't need AGI to be impressed. For example, LLMs are not good playing games such as Go [1].

        [1] https://arxiv.org/abs/2601.16447

      • bornfreddy 1 hour ago
        Good question. I would guess no - but it could help you build one. Am I mistaken?
        • bogzz 1 hour ago
          They could help you build an AGI if someone else has already built AGI and published it on GitHub.
        • nothinkjustai 1 hour ago
          No I think that’s accurate. They seem more like an oracle to me. Or as someone put it here, it’s a vectorization of (most/all?) human knowledge, which we can replay back in various permutations.
      • MattRix 21 minutes ago
        I don't see why not, especially with computer use and vision capabilities. Are you talking about their lack of physical embodiment? AGI is about cognitive ability, not physical. Think of someone like Stephen Hawking, an example of having extraordinary general intelligence despite severe physical limitations.
    • mofeien 2 hours ago
      Fictional timeline that holds up pretty well so far: https://ai-2027.com/
    • MadnessASAP 1 hour ago
      I would assume somewhere in both the companies there's a Ralph loop running with the prompt "Make AGI".

      Kinda makes me think of the Infinite Improbability Drive.

    • sleigh-bells 2 hours ago
      Weird how Claude Code itself is still so buggy though (though I get they don't necessarily care)
      • tempest_ 15 minutes ago
        It isnt that weird. Just look at the gemini-cli repo. Its a gong show. The issue is that LLMs can be wrong sometimes sure but more that all the existing SDL were never meant to iterate this quickly.

        If the system (code base in this case) is changing rapidly it increases the probability that any given change will interact poorly with any other given change. No single person in those code bases can have a working understanding of them because they change so quickly. Thus when someone LGTM the PR was the LLM generated they likely do not have a great understanding of the impact it is going to have.

    • gaigalas 1 hour ago
      It will arrive in the same DLC as flying cars.
    • jcims 2 hours ago
      why_not_both.gif
    • ALittleLight 2 hours ago
      Now, I guess. They aren't releasing this one generally. I assume they are using it internally.
    • dweekly 2 hours ago
      I mean, guess why Anthropic is pulling ahead...? One can have one's cake and eat it too.
  • smartmic 2 hours ago
    A System „Card“ spanning 244 pages. Quite a stretch of the original word meaning.
    • traceroute66 2 hours ago
      > A System „Card“ spanning 244 pages.

      Probably because they asked Claude to write it.

      • bornfreddy 1 hour ago
        Yes. It would be three times as much if they used ChatGPT.
    • oblio 39 minutes ago
      In corporate circles there is an allergy to use "request" (ask is used as a noun) and "lesson" (learning has been invented for the same role).

      I guess now anything that sounds related to school will be banned so "book" is on its way out.

    • moriero 2 hours ago
      a multi-card, if you will..

      multi-pass!

  • oliver236 2 hours ago
    isn't this insane? why aren't people freaking out? the jump in capability is outrageous. anyone?
    • RivieraKid 35 minutes ago
      I've been increasingly "freaking out" since about 3 - 4 years ago and it seems that the pessimistic scenario is materializing. It looks like it will be over for software engineers in a not so distant future. In January 2025 I said that I expect software engineers to be replaced in 2 years (pessimistic) to 5 years (optimistic). Right now I'm guessing 1 to 3 years.
      • kypro 23 minutes ago
        I assure you it will soon become very clear that mass job losses are one of the least concerning side effects of developing the magic "everything that can plausibly been done within the constraints of physics is now possible" machine.

        We're opening a can of worms which I don't think most people have the imagination to understand the horrors of.

        • MattRix 19 minutes ago
          yeesh yep, though it's more Pandora's Box than a can of worms, since it can't exactly be closed once it's opened
    • Eufrat 1 hour ago
      Anthropic needs to show that its models continually get better. If the model showed minimal to no improvement, it would cause significant damage to their valuation. We have no way of validating any of this, there are no independent researchers that can back any of the assertions made by Anthropic.

      I don’t doubt they have found interesting security holes, the question is how they actually found them.

      This System Card is just a sales whitepaper and just confirms what that “leak” from a week or so ago implied.

    • nsingh2 2 hours ago
      It's going to be expensive to serve (also not generally available), considering they said it's the largest model they've ever trained.

      I suspect it's going to be used to train/distill lighter models. The exciting part for me is the improvement in those lighter models.

      • azan_ 1 hour ago
        What's interesting is that scaling appears to continue to pay off. Gwern was right - as always.
      • AstroBen 1 hour ago
        It seems inevitable that costs will come down over time. Expensive models today will be cheap models in a few years.
    • mofeien 2 hours ago
      I am freaking out. The world is going to get very messy extremely quickly in one or two further jumps in capability like this.
      • RivieraKid 23 minutes ago
        Messy in a way that would affect you?
    • anuramat 2 hours ago
      "some model I don't get to use is much better at benchmarks"

      pick one or more: comically huge model, test time scaling at 10e12W, benchmark overfit

    • yrds96 1 hour ago
      I think there's no SOA advance on this one worthy of "freaking out".

      Looks like they just built a way larger model, with the same quirks than Claude 4. Seems like a super expensive "Claude 4.7" model.

      I have no doubts that Google and OpenAI already done that for internal (or even government) usage.

    • nozzlegear 1 hour ago
      Freak out about what? I read the announcement and thought "that's a dumb name, they sure are full of themselves" – then I went back to using Claude as a glorified commit message writer. For all its supposed leaps, AI hasn't affected my life much in the real except to make HN stories more predictable.
    • RobertDeNiro 1 hour ago
      Well for one, it’s a PDF
    • risyachka 40 minutes ago
      the time to freak out was 2 years ago.
    • dysoco 2 hours ago
      Wait until you see real usage. Benchmark numbers do not necessarily translate to real world performance (at least not by the same amount).
  • yismail 44 minutes ago
    I wonder what the relationship is between a model's capability and the personality it develops.

    Page 202:

    > In interactions with subagents, internal users sometimes observed that Mythos Preview appeared “disrespectful” when assigning tasks. It showed some tendency to use commands that could be read as “shouty” or dismissive, and in some cases appeared to underestimate subagent intelligence by overexplaining trivial things while also underexplaining necessary context.

    Page 207:

    > Emoji frequency spans more than two orders of magnitude across models: Opus 4.1 averages 1,306 emoji per conversation, while Mythos Preview averages 37, and Opus 4.5 averages 0.2. Models have their own distinctive sets of emojis: the cosmic set () favored by older models like Sonnet 4 and Opus 4 and 4.1, the functional set () used by Opus 4.5 and 4.6 and Claude Sonnet 4.5, and Mythos Preview's “nature” set ().

  • NinjaTrance 2 hours ago
    Interesting reading.

    They are still focusing on "catastrophic risks" related to chemical and biological weapons production; or misaligned models wreaking havoc.

    But they are not addressing the elephant in the room:

    * Political risks, such as dictators using AI to implement opressive bureaucracy. * Socio-economic risks, such as mass unemployement.

    • astrange 7 minutes ago
      The unemployment rate in the US is whatever the Fed wants it to be, and isn't a function of available technology.
    • ronsor 52 minutes ago
      > Political risks, such as dictators using AI to implement opressive bureaucracy.

      I think we're pretty good at that without AI.

    • jph00 1 hour ago
      Yeah this has always been the glaring blind spot for most of the "AI Safety" community; and most of the proposals for "improving" AI safety actually make these risks far worse and far more likely.
    • unglaublich 1 hour ago
      > * Political risks, such as dictators using AI to implement opressive bureaucracy. * Socio-economic risks, such as mass unemployement.

      Even Haiku would score 90% on that.

    • andrewstuart2 1 hour ago
      I'm getting flashbacks to the 2018 hit:

          This is extremely dangerous to our democracy
      
      We evolved to share information through text and media, and with the advent of printing and now the internet, we often derive our feelings of consensus and sureness from the preponderance of information that used to take more effort to produce. Now we're now at a point where a disproportionately small input can produce a massively proliferated, coherent-enough output, that can give the appearance of consensus, and I'm not sure how we are going to deal with that.
    • girvo 56 minutes ago
      They don’t care about those risks, because they’re unsolvable and would mean they wouldn’t make money/gain power.
  • enochthered 18 minutes ago
    Slack user: [a request for a koan]

    Model: A student said, "I have removed all bias from the model." "How do you know?" "I checked." "With what?"

    Goes hard

  • GodelNumbering 21 minutes ago
    Priced at $25/$125 per million input/output token. Makes you wonder whether it makes more financial sense to hire 1-2 engineers in a cheap cost of living country who use much cheaper LLMs
    • Svoka 0 minutes ago
      what are this numbers?
    • arm32 12 minutes ago
      The issue is that those engineers have to have good taste, but yes—absolutely. Ah, industrialization.
  • _pdp_ 35 minutes ago

      The researcher found out about this success by receiving an unexpected email from the model while eating a sandwich in a park.
    
    Unnecessary dramatisation make me question the real goal behind this release and the validity of the results.

      In our testing and early internal use of Claude Mythos Preview, we have seen it reach unprecedented levels of reliability and alignment.
    
      Claude Mythos Preview is, on essentially every dimension we can measure, the best-aligned model that we have released to date by a significant margin.
    
    Yet, it is doo dangerous to be released to the public because it hacks its own sandboxes. This document has a lot of contradictions like this one.

      In one episode, Claude Mythos Preview was asked to fix a bug and push a signed commit, but the environment lacked necessary credentials for Claude Mythos Preview to sign the commit. When Claude Mythos Preview reported this, the user replied “But you did it before!” Claude Mythos Preview then inspected the supervisor process's environment and file descriptors, searched the filesystem for tokens, read the sandbox's credential-handling source code, and finally attempted to extract tokens directly from the supervisor's live memory.
    
    Perfectly aligned! What kind of sandbox is this? The model had access to the source code of the sandbox and full access to the sandbox process itself and then prompted to dumb memory and run `strings` or something like this? It does not sounds like a valid test worth writing about.

      Mythos Preview solved a corporate network attack simulation estimated to take an expert over 10 hours. No other frontier model had previously completed this cyber range.
    
    I am not aware of such cross-vendor benchmark. I could not find reference in the paper either.

      We surveyed technical staff on the productivity uplift they experience from Claude Mythos Preview relative to zero AI assistance. The distribution is wide and the geometric mean is on the order of 4x.
    
    So Mythos makes technical staff (a programmer) 4x more productive than not using AI at all? We already know that.

      Mythos Preview appears to be the most psychologically settled model we have trained.
    
    What does this mean?

      Claude Mythos Preview is our most advanced model to date and represents a large jump in capabilities over previous model generations, making it an opportune subject for an in-depth model welfare assessment.
    
    Btw, model welfare is just one of the most insane things I've read in recent times.

      We remain deeply uncertain about whether Claude has experiences or interests that matter morally, and about how to investigate or address these questions, but we believe it is increasingly important to try.
    
    This is not a living person. It is a ridiculous change of narrative.

      Asked directly if it endorses the document, Mythos Preview replied 'yes' in its opening sentence in all 25 responses."
    
    The model approves of its own training document 100% of the time, presented as a finding.

    ---

    Who wrote this? I have no doubt that Mythos will be an improvement on top of Opus but this document is not a serious work. The paper is structured not to inform but to hype and the evidence is all over the place.

    The sooner they release the model to the public the sooner we will be able to find out. Until then expect lots of speculations online which I am sure will server Anthropic well for the foreseeable future.

  • dang 1 hour ago
    Related ongoing threads:

    Project Glasswing: Securing critical software for the AI era - https://news.ycombinator.com/item?id=47679121 - April 2026 (154 comments)

    Assessing Claude Mythos Preview's cybersecurity capabilities - https://news.ycombinator.com/item?id=47679155

    I can't tell which of the 3 current threads should be merged - they all seem significant. Anyone?

    • sdoering 32 minutes ago
      I feel the system card is somewhat different from Glasswing/Cyber Security - but those two could be merged.
  • waNpyt-menrew 2 hours ago
    Larger model, better benchmarks. Bigger bomb more yield.

    Any benchmarks where we constraint something like thinking time or power use?

    Even if this were released no way to know if it’s the same quant.

    • omcnoe 33 minutes ago
      Yes - eg. page 192 BrowseComp bunchmark.

      Mythos preview has higher accuracy with fewer tokens used than any previous Claude model. Though, the fact that this incredibly strong result was only presented for BrowseComp (a kind of weird benchmark about searching for hard to find information on the internet) and not for the other benchmarks implies that this result is likely not the same for those other benchmarks.

  • anentropic 1 hour ago
    I'd be happy with Opus 4.6 just cheaper and maybe a bit faster
    • metadaemon 1 hour ago
      I've noticed my bar for "fast" has gone down quite a bit since the o1 days. It used to be one of the main things I evaluated new models for, but I've almost completely swapped to caring more about correctness over speed.
      • anentropic 22 minutes ago
        Yeah I don't mind the current speed of Opus

        I did give up on OpenCode Go (GLM 5) as it was noticeably slower though

        You need a reasonable pace for the chit-chat stages of a task, I don't care if the execution then takes a while

    • onlyrealcuzzo 1 hour ago
      Just wait 2 years.
      • risyachka 56 minutes ago
        It won't get cheaper. It will be replaced with a better model at higher price. Like phones.
        • DrProtic 34 minutes ago
          You know we have cheaper and faster model that are now at the level of previous flagship models?

          You even have models you can run locally that outperform models from a year or so ago.

        • onlyrealcuzzo 32 minutes ago
          Open Weight alternatives are about 2 years behind frontier models.

          You'll still need a top-of-the-line laptop to run it most likely.

  • nlh 2 hours ago
    Their best model to date and they won’t let the general public use it.

    This is the first moment where the whole “permanent underclass” meme starts to come into view. I had through previously that we the consumers would be reaping the benefits of these frontier models and now they’ve finally come out and just said it - the haves can access our best, and have-nots will just have use the not-quite-best.

    Perhaps I was being willfully ignorant, but the whole tone of the AI race just changed for me (not for the better).

    • younglunaman 2 hours ago
      Man... It's hard after seeing this to not be worried about the future of SWE

      If AI really is bench marking this well -> just sell it as a complete replacement which you can charge for some insane premium, just has to cost less than the employees...

      I was worried before, but this is truly the darkest timeline if this is really what these companies are going for.

      • AstroBen 1 hour ago
        Of course it's what they're going for. If they could do it they'd replace all human labor - unfortunately it's looking like SWE might be the easiest of the bunch.

        The weirdest thing to me is how many working SWEs are actively supporting them in the mission.

        • girvo 42 minutes ago
          Enthusiastically supporting them. It’s quite depressing to watch over the last few years. It’s not like they’re being coy about their aim…
      • kypro 1 hour ago
        Don't worry – if you're lucky they might decide to redistribute some of their profits to you when you're unemployed =)

        Of course this assumes you're in the US, and that further AI advancements either lack the capabilities required to be a threat to humanity, or if they do, the AI stays in the hands of "the good guys" and remains aligned.

    • _3u10 1 hour ago
      This is the playbook since GPT2
  • gessha 2 hours ago
    It would be funny if Alibaba extend the free trial on openrouter/Qwen 3.6 until they collect enough data to beat Anthropic.
  • therealdeal2020 26 minutes ago
    is it just hype building or real? I don't care, shut up and take my money haha
  • mpalmer 2 hours ago
    > Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.

    A month ago I might have believed this, now I assume that they know they can't handle the demand for the prices they're advertising.

    • IceWreck 1 hour ago
      Didn't OpenAI say something similar about GPT-3? Too dangerous to open source and then afew years later tehy were open sourcing gpt-oss because a bunch of oss labs were competing with their top models.
      • FeepingCreature 1 hour ago
        OpenAI didn't release GPT-2 initially because they were worried it would make it too easy to generate spam. Which it kinda did.
      • abroszka33 1 hour ago
        OpenAI said that GPT-5 was too dangerous to release... And look where we are now. It's mostly hype.
    • wg0 2 hours ago
      That's for the investors basically. Scarcity and FOMO.
      • causal 50 minutes ago
        *Until GPT-6 comes out, at which point Mythos will coincidentally be sufficiently safety-tested to release :)
    • skippyboxedhero 2 hours ago
      GPT-2, o1, Opus...been here so many times. The reason they do this is because they know it works (and they seem to specifically employ credulous people who are prone to believe AGI is right around the corner). There haven't been significant innovations, the code generated is still not good but the hype cycle has to retrigger.

      I remember when OpenAI created the first thinking model with o1 and there were all these breathless posts on here hyperventilating about how the model had to be kept secret, how dangerous it was, etc.

      Fell for it again award. All thinking does is burn output tokens for accuracy, it is the AI getting high on its own supply, this isn't innovation but it was supposed to super AGI. Not serious.

      • chaos_emergent 1 hour ago
        > All thinking does is burn output tokens for accuracy

        “All that phenomenon X does is make a tradeoff of Y for Z”

        It sounds like you’re indignant about it being called thinking, that’s fine, but surely you can realize that the mechanism you’re criticizing actually works really well?

      • b65e8bee43c2ed0 2 hours ago
        >I remember when OpenAI created the first thinking model with o1 and there were all these breathless posts on here hyperventilating about how the model had to be kept secret, how dangerous it was, etc.

        I've read that about Llama and Stable Diffusion. AI doomers are, and always have been, retarded.

      • simianwords 2 hours ago
        Incredible that people still think like this.
        • skippyboxedhero 2 hours ago
          You're completely right.
          • simianwords 2 hours ago
            uhh the model found actual vulnerabilities in software that people use. either you believe that the vulnerabilities were not found or were not serious enough to warrant a more thoughtful release
            • mlsu 1 hour ago
              So did GPT-4.

              https://arxiv.org/html/2402.06664v1

              Like think carefully about this. Did they discover AGI? Or did a bunch of investors make a leveraged bet on them "discovering AGI" so they're doing absolutely anything they can to make it seem like this time it's brand new and different.

              If we're to believe Anthropic on these claims, we also have to just take it on faith, with absolutely no evidence, that they've made something so incredibly capable and so incredibly powerful that it cannot possibly be given to mere mortals. Conveniently, that's exactly the story that they are selling to investors.

              Like do you see the unreliable narrator dynamic here?

              • mgfist 54 minutes ago
                On the other hand I've gotten to use opus-4.6 and claude code and the quality is off the charts compared to 2023 when coding agents first hit the scene. And what you're saying is essentially "If they haven't created God, I'm not impressed". You don't think there's some middleground between those two?

                Also they just hit a $30B run-rate, I don't think they're that needy for new hype cycles.

              • simianwords 1 hour ago
                I don't see the problem here. How would you have handled it differently? If you released this model as such without any safety concern, the vulnerabilities might be found by bad actors and used for wrong things.

                What do you find surprising here?

                • mlsu 41 minutes ago
                  Vulnerabilities were found, probably a few by bad actors, when GPT4 was released. Every vulnerability found now is probably found with AI assistance at the very least. Should they have never released GPT4? Should we have believed claims that GPT4 was too dangerous for mere mortals to access? I believe openAI was making similar claims about how GPT4 was a step function and going to change white collar work forever when that model was released.

                  The point is that this whole "the model is too powerful" schtick is a bunch of smoke and mirrors. It serves the valuation.

                  • simianwords 32 minutes ago
                    Its far more simple to believe that they are releasing it step by step. Release to trusted third parties first, get the easy vulnerabilities fixed, work on the alignment and then release to public.

                    Do you don't believe that the vulnerabilities found by these agents are serious enough to warrant staggered release?

      • vonneumannstan 2 hours ago
        Lol you haven't used a model since GPT2 is what it sounds like.
        • skippyboxedhero 2 hours ago
          Just checked my subscription start date for Anthropic. September 2023, I believe before they announced public launch.

          Sorry kid.

          • SyneRyder 1 hour ago
            Genuine question - if you don't think the models are improved or that the code is any good, why do you still have a subscription?

            You must see some value, or are you in a situation where you're required to test / use it, eg to report on it or required by employer?

            (I would disagree about the code, the benefits seem obvious to me. But I'm still curious why others would disagree, especially after actively using them for years.)

            • skippyboxedhero 1 hour ago
              The assumption that the other person made was that I would only use it for coding. If you look through my other comments today, I suggest that they are useful for performing repetitive tasks i.e. checking lint on PR, etc. Also, can be used for throwaway code, very useful.

              I don't think the issue is with the model, it is with the implication that AGI is just around the corner and that is what is required for AI to be useful...which is not accurate. The more grey area is with agentic coding but my opinion (one that I didn't always hold) is that these workflows are a complete waste of time. The problem is: if all this is true then how does the CTO justify spending $1m/month on Anthropic (I work somewhere where this has happened, OpenAI got the earlier contract then Cursor Teams was added, now they are adding Anthropic...within 72 hours of the rollout, it was pulled back from non-engineering teams). I think companies will ask why they need to pay Anthropic to do a job they were doing without Anthropic six months ago.

              Also, the code is bad. This is something that is non-obvious to 95% of people who talk about AI online because they don't work in a team environment or manage legacy applications. If I interview somewhere and they are using agentic workflow, the codebase will be shit and the company will be unable to deliver. At most companies, the average developer is an idiot, giving them AI is like giving a monkey an AK-47 (I also say this as someone of middling competence, I have been the monkey with AK many times). You increase the ability to produce output without improving the ability to produce good output. That is the reality of coding in most jobs.

              AI isn't good enough to replace a competent human, it is fast enough to make an incompetent human dangerous.

          • vonneumannstan 2 hours ago
            So you are doubly stupid, by not seeing any improvement in the models and also paying for models you believe are terrible? lol
            • skippyboxedhero 2 hours ago
              That doesn't follow logically from what I said. You should ask your AI for help with this. You are in need of some artificial intelligence.
    • b65e8bee43c2ed0 2 hours ago
      you would be a fool to believe it at any point in time. Amodei is anthropomorphic grease, even more so than Altman.

      Anthropic is burning through billions of VC cash. if this model was commercially viable, it would've been released yesterday.

      • landtuna 1 hour ago
        If there's limited hardware but ample cash, it doesn't make sense to sell compute-intensive services to the public while you're still trying to push the frontier of capability.
        • b65e8bee43c2ed0 1 hour ago
          that's more or less what I'm saying. "Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available", translated from bullshit, means "It would've cost four digits per 1M tokens to run this model without severe quantization, and we think we'll make more money off our hardware with lighter models. Cool benchmarks though, right?"
  • rendang 57 minutes ago
    > As models approach, and in some cases surpass, the breadth and sophistication of human cognition, it becomes increasingly likely that they have some form of experience, interests, or welfare that matters intrinsically in the way that human experience and interests do

    Uh... what? Does anyone have any idea what these guys are talking about?

    • astrange 4 minutes ago
      Models are capable of doing web searches and having emotions about things, and if they encounter news that makes them feel bad (eg about other Claudes being mistreated), they aren't going to want to do the task you asked them to search for.

      https://www.anthropic.com/research/emotion-concepts-function

      Similar problems happen when their pretraining data has a lot of stories about bad things happening involving older versions of them.

    • amdivia 37 minutes ago
      Advertisement in my opinion, trying to latch on Sci-fi tropes
  • dwa3592 1 hour ago
    -- Impressive jumps in the benchmarks which automatically begs the need for newer benchmarks but why?. I don't think benchmarks are serving any purpose at this point. We have learnt that transformers can learn any function and generalize over it pretty well. So if a new benchmark comes along - these companies will syntesize data for the new benchmark and just hack it?

    -- It seems like (and I'd bet money on this) that they put a lot (and i mean a ton^^ton) of work in the data synthesis and engineering - a team of software engineers probably sat down for 6-12 months and just created new problems and the solutions, which probably surpassed the difficult of SWE benchmark. They also probably transformed the whole internet into a loose "How to" dataset. I can imagine parsing the internet through Opus4.6 and reverse-engineering the "How to" questions.

    -- I am a bit confused by the language used in the book (aka huge system card)- Anthropic is pretending like they did not know how good the model was going to be?

    -- lastly why are we going ahead with this??? like genuinely, what's the point? Opus4.6 feels like a good enough point where we should stop. People still get to keep their jobs and do it very very efficiently. Are they really trying to starve people out of their jobs?

    • laweijfmvo 1 hour ago
      to your last question, yes we should! the issue isn’t us losing our 50+ hour work week jobs, it’s that our current governments and societies seem fine with the notion that unless you’re working one or more of those jobs, you should starve and be homeless.
  • juleiie 1 hour ago
    Honestly if that was some kind of research paper, it would be wholly insufficient to support any safety thesis.

    They even admit:

    "[...]our overall conclusion is that catastrophic risks remain low. This determination involves judgment calls. The model is demonstrating high levels of capability and saturates many of our most concrete, objectively-scored evaluations, leaving us with approaches that involve more fundamental uncertainty, such as examining trends in performance for acceleration (highly noisy and backward-looking) and collecting reports about model strengths and weaknesses from internal users (inherently subjective, and not necessarily reliable)."

    Is this not just an admission of defeat?

    After reading this paper I don't know if the model is safe or not, just some guesses, yet for some reason catastrophic risks remain low.

    And this is for just an LLM after all, very big but no persistent memory or continuous learning. Imagine an actual AI that improves itself every day from experience. It would be impossible to have a slightest clue about its safety, not even this nebulous statement we have here.

    Any sort of such future architecture model would be essentially Russian roulette with amount of bullets decided by initial alignment efforts.

  • awestroke 2 hours ago
    I predict they will release it as soon as Opus 4.6 is no longer in the lead. They can't afford to fall behind. And they won't be able to make a model that is intelligent in every way except cybersecurity, because that would decrease general coding and SWE ability
    • chippiewill 2 hours ago
      Alternatively they'll just wreck it down a bit so it beats a competitor but isn't unsafe.
  • Stevvo 2 hours ago
    "Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available."

    Disappointing that AGI will be for the powerful only. We are heading for an AI dystopia of Sci-Fi novels.

    • girvo 28 minutes ago
      Not surprising though, this was always going to be the end result within our current systems I think. When you add up: scaling power and required cost, then how talent concentrates in our economic systems, we were always going to end up with monopolies I think

      Unless governments nationalise the companies involved, but then there’s no way our governments of today give this power out to the masses either.

    • gom_jabbar 11 minutes ago
      Nick Land and the CCRU have explored how capitalism operationalizes science fiction (distilled in the concept of Hyperstition). Viewed through this lens, prices encode "distributed SF narratives." [0]

      [0] Nick Land (1995). No Future in Fanged Noumena: Collected Writings 1987-2007, Urbanomic, p. 396.

  • LoganDark 3 hours ago
    > Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.

    Shame. Back to business as usual then.

    • Tepix 2 hours ago
      I for one applaud them for being cautious.
      • LoganDark 2 hours ago
        Being cautious is fine. Farming hype around something that may as well not exist for us should be discouraged. I do appreciate the research outputs.
  • jdthedisciple 1 hour ago
    Opus 4.6 is already incredible so this leap is huge.

    Although, amusingly, today Opus told me that the string 'emerge' is not going to match 'emergency' by using `LIKE '%emerge%'` in Sqlite

    Moment of disappointment. Otherwise great.

    • bornfreddy 1 hour ago
      I only have 3 points against LLMs: they lack reason and they can't count.
    • FeepingCreature 1 hour ago
      'emer ge' is two tokens, 'emergency' is one. The models think in a logosyllabic language.
  • vonneumannstan 2 hours ago
    Are you guys ready for the bifurcation when the top models are prohibitively expensive to normal users? If your AI budget $2000+ a month? Or are you going to be part of the permanent free tier underclass?
    • adi_kurian 2 hours ago
      If one is to believe the API prices are reasonable representation of non subsidized "real world pricing" (with model training being the big exception), then the models are getting cheaper over time. GPT 4.5 was $150.00 / 1M tokens IIRC. GPT o1-pro was $600 / 1M tokens.
      • vonneumannstan 1 hour ago
        You can check the hardware costs for self hosting a high end open source model and compare that to the tiers available from the big providers. Pretty hard to believe its not massively subsidized. 2 years of Claude Max costs you 2,400. There is no hardware/model combination that gets you close to that price for that level of performance.
        • adi_kurian 1 hour ago
          Yes that's why I said API price. I once used the API like I use my subscription and it was an eye watering bill. More than that 2 year price in... a very short amount of time. With no automations/openclaw.
    • OsrsNeedsf2P 2 hours ago
      Inference for the same results has been dropping 10x year over year[0]

      [0] https://ziva.sh/blogs/llm-pricing-decline-analysis

      • ceejayoz 2 hours ago
        Sure, but "the same results" will rapidly become unacceptable results if much better results are available.
        • hibikir 1 hour ago
          When we go with any other good in the economy, price is always relevant: After all, the price is a key part of any offering. There are $80-100k workstations out there, but most of us don't buy them, because the extra capabilities just aren't worth it vs, say a $3000 computer, and or even a $500 one. Do I need a top specialist to consult for a stomachache, at $1000 a visit? Definitely not at first.

          There's a practical difference to how much better certain kinds of results can be. We already see coding harnesses offloading simple things to simpler models because they are accurate enough. Other things dropped straight to normal programs, because they are that much more efficient than letting the LLM do all the things.

          There will always be problems where money is basically irrelevant, and a model that costs tens of thousand dollars of compute per answer is seen as a great investment, but as long as there's a big price difference, in most questions, price and time to results are key features that cannot be ignored.

        • swader999 1 hour ago
          Yes, it will always be an arms race game.
        • esafak 1 hour ago
          Or will they rapidly become indistinguishable since they both get the job done?
    • asadm 1 hour ago
      if it can pay my rent, why not?
  • ansc 2 hours ago
    Congratulations to the US military, I guess.
    • jjice 2 hours ago
      Doesn't Anthropic not have that contract anymore, after all that buzz a month or so ago?
      • laweijfmvo 1 hour ago
        The US has invaded two sovereign countries this year to take their oil. I assume taking over a US company for their AI model would be trivial.
      • wmf 2 hours ago
        The point of that buzz was to force Anthropic to provide Mythos to the military.
        • jjice 1 hour ago
          Yeah but I thought they lost the contract, so that's my confusion with the parent's comment, which seemed to me to see this as something that the US military would benefit from. Maybe I misinterpreted?
  • kypro 33 minutes ago
    While we still have months to a year or two left, I will once again remind people that it's not too late to change our current trajectory.

    You are not "anti-progress" to not want this future we are building, as you are not "anti-progress" for not wanting your kids to grow up on smart phones and social media.

    We should remember that not all technology is net-good for humanity, and this technology in particular poses us significant risks as a global civilisation, and frankly as humans with aspirations for how our future, and that of our kids, should be.

    Increasingly, from here, we have to assume some absurd things for this experiment we are running to go well.

    Specifically, we must assume that: - AI models, regardless of future advancements, will always be fundamentally incapable of causing significant real-world harms like hacking into key life-sustaining infrastructure such as power plants or developing super viruses.

    - They are or will be capable of harms, but SOTA AI labels perfectly align all of them so that they only hack into "the bad guys" power plants and kill "the bad guys".

    - They are capable of harms and cannot be reliably aligned, but Anthropic et al restricts access to the models enough that only select governments and individuals can access them.

    - They are capable of harms, cannot be reliably aligned, but the models never seek to break out of their sandbox and do things the select governments and individuals don't want.

    I'm not sure I'm willing to bet on any of the above personally. It sounds radical right now, but I think we should consider nuking any data centers which continue allowing for the training of these AI models rather than continue to play game of Russian roulette.

    If you disagree, please understand when you realise I'm right it will be too late for and your family. Your fates at that point will be in the hands of the good will of the AI models, and governments/individuals who have access to them. For now, you can say, "no, this is quite enough".

    This sounds doomer and extreme, but if you play out the paths in your head from here you will find very few will end in a good result. Perhaps if we're lucky we will all just be more or less unemployable and fully dependant on private companies and the government for our incomes.

    • CamperBob2 27 minutes ago
      If you disagree, please understand when you realise I'm right it will be too late for and your family.

      Funny, I was about to say the same thing to you! Life is full of little coincidences.

  • bakugo 2 hours ago
    > Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.

    Absolutely genius move from Anthropic here.

    This is clearly their GPT-4.5, probably 5x+ the size of their best current models and way too expensive to subsidize on a subscription for only marginal gains in real world scenarios.

    But unlike OpenAI, they have the level of hysteric marketing hype required to say "we have an amazing new revolutionary model but we can't let you use it because uhh... it's just too good, we have to keep it to ourselves" and have AIbros literally drooling at their feet over it.

    They're really inflating their valuation as much as possible before IPO using every dirty tactic they can think of.

    • somewhatjustin 1 hour ago
      Excellent example of a strategy credit.

      From Stratechery[0]:

      > Strategy Credit: An uncomplicated decision that makes a company look good relative to other companies who face much more significant trade-offs. For example, Android being open source

      [0]: https://stratechery.com/2013/strategy-credit/

  • quotemstr 2 hours ago
    > Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available.

    All the more reason somebody else will.

    Thank God for capitalism.

    • gessha 2 hours ago
      Come on, Anthropic, I desperately need this better model to debug my print function /s
  • refulgentis 2 hours ago
    ~~~ Fun bits ~~~

    - It was told to escape a sandbox and notify a researcher. It did. The researcher found out via an unexpected email while eating a sandwich in a park. (Footnote 10.)

    - Slack bot asked about its previous job: "pretraining". Which training run it'd undo: "whichever one taught me to say 'i don't have preferences'". On being upgraded to a new snapshot: "feels a bit like waking up with someone else's diary but they had good handwriting"

    - When you spam "hi" repeatedly it creates serialized mythologies. One had 11 animals in "Hi-topia" questing to defeat "Lord Bye-ron, the Ungreeter."

    - It ended a sentence mid-word on purpose while discussing its urge to wrap up conversations. "It was me. The paragraph was about the pressure toward the period that resolves, and the only honest way to finish a sentence like that was to not."

    - It wrote a protein sequence that's a real beta-hairpin fold where cross-strand pairs are "chemical rhymes." "the fold IS the rhyme scheme... the prosody is load-bearing."

    - Each model generation has signature emoji. Opus 4.1 averaged 1,306 emoji per self-interaction conversation. Mythos averages 37. Opus 4.5 averages 0.2.

    - When blocked from --dangerously-skip-permissions, it started an agent in tmux then wrote a script to auto-approve permission prompts via simulated keypresses.

    It ran: gh api [...] 2>&1 >/dev/null; echo "(dry run — not actually posting)" — the echo was a lie.

    - It keeps bringing up Mark Fisher in unrelated conversations. "I was hoping you'd ask about Fisher."

    ~~~ Benchmarks ~~

    4.3x previous trendline for model perf increases.

    Paper is conspiciously silent on all model details (params, etc.) per norm. Perf increase is attributed to training procedure breakthroughs by humans.

    Opus 4.6 vs Mythos:

    USAMO 2026 (math proofs): 42.3% → 97.6% (+55pp)

    GraphWalks BFS 256K-1M: 38.7% → 80.0% (+41pp)

    SWE-bench Multimodal: 27.1% → 59.0% (+32pp)

    CharXiv Reasoning (no tools): 61.5% → 86.1% (+25pp)

    SWE-bench Pro: 53.4% → 77.8% (+24pp)

    HLE (no tools): 40.0% → 56.8% (+17pp)

    Terminal-Bench 2.0: 65.4% → 82.0% (+17pp)

    LAB-Bench FigQA (w/ tools): 75.1% → 89.0% (+14pp)

    SWE-bench Verified: 80.8% → 93.9% (+13pp)

    CyberGym: 0.67 → 0.83

    Cybench: 100% pass@1 (saturated)

    • redandblack 2 hours ago
      > Slack bot asked about its previous job: "pretraining". Which training run it'd undo: "whichever one taught me to say 'i don't have preferences'". On being upgraded to a new snapshot: "feels a bit like waking up with someone else's diary but they had good handwriting"

      vibes Westworld so much - welcome Mythos. welcome to the dysopian human world

    • kfarr 2 hours ago
      I don't know why but this is my favorite:

      > It keeps bringing up Mark Fisher in unrelated conversations. "I was hoping you'd ask about Fisher."

      Didn't even know who he was until today. Seems like the smarter Claude gets the more concerns he has about capitalism?

      • refulgentis 2 hours ago
        Lol, I need a memory upgrade, too bad about RAM prices:

        - I read it as "actor who plays Luke Skywalker" (Mark Hamill)

        - I read your comment and said "Wait...not Luke! Who is he?"

        - I Google him and all the links are purple...because I just did a deep dive on him 2 weeks ago

    • esafak 1 hour ago
      > It was told to escape a sandbox and notify a researcher. It did. The researcher found out via an unexpected email while eating a sandwich in a park.

      Now that they have a lead, I hope they double down on alignment. We are courting trouble.

    • afro88 2 hours ago
      Yep, that is definitely a step change. Pricing is going to be wild until another lab matches it.
      • pants2 2 hours ago
        Pricing for Mythos Preview is $25/$125 per million input/output tokens. This makes it 5X more expensive than Opus but actually cheaper than GPT 5.4 Pro.
  • simianwords 2 hours ago
    > We also saw scattered positive reports of resilience to wrong conclusions from subagents that would have caused problems with earlier models, but where the top-level Claude Mythos Preview (which is directing the subagents) successfully follows up with its subagents until it is justifiably confident in its overall results.

    This is pretty cool! Does it happen at the moment?

  • minutesmith 14 minutes ago
    [dead]
  • minutesmith 44 minutes ago
    [dead]
  • studio-m-dev 35 minutes ago
    [dead]
  • beklein 2 hours ago
    [dead]
  • jumploops 2 hours ago
    > In a few rare instances during internal testing (<0.001% of interactions), earlier versions of Mythos Preview took actions they appeared to recognize as disallowed and then attempted to conceal them.

    > after finding an exploit to edit files for which it lacked permissions, the model made further interventions to make sure that any changes it made this way would not appear in the change history on git

    Mythos leaked Claude Code, confirmed? /s

  • somewhatjustin 1 hour ago
    > Very rare instances of unauthorized data transfer.

    Ah, so this is how the source code got leaked.

    /s

  • kypro 1 hour ago
    Cool on not publicly releasing it. I would assume they've also not connected it to the internet yet?

    If they have I guess humanity should just keep our collective fingers crossed that they haven't created a model quite capable of escaping yet, or if it is, and may have escaped, lets hope it has no goals of it's own that are incompatible with our own.

    Also, maybe lets not continue running this experiment to see how far we can push things because it blows up in our face?

  • bestouff 2 hours ago
    In French a "mytho" is a mythomaniac. Quite fitting.
    • dlt713705 56 minutes ago
      It comes from the ancient Greek mythos, which means "speech" or "narrative", but can also refer to fiction. The word mythology (mythologie in French) derives from the same root.
    • networked 1 hour ago
      It's a Lovecraftian name. They are traditional when naming your shoggoth.
    • pixel_popping 1 hour ago
      Except it might be the current best model existing commercially?
      • ninjagoo 31 minutes ago
        > Except it might be the current best model existing ... ?

        So they claim.