map::operator[] should be nodiscard

(quuxplusone.github.io)

62 points | by jandeboevrie 5 days ago

12 comments

  • nialv7 9 hours ago
    C++ operator [] is poorly designed: index, and index+assignment should be two different operators, and indexing alone should never insert new entries into the map.

    Languages like D [0] or Rust [1] get this right.

    [0]: https://dlang.org/spec/operatoroverloading.html#index_assign...

    [1]: https://doc.rust-lang.org/std/ops/trait.IndexMut.html

    • gpderetta 8 hours ago
      You could (and I would) make the opposite statement: upsert should be the default operator and if you want lookup only or insert only you call different operators.

      I find it annoying that I often have to reach to defaultdict in Python to get this behavior.

      • tialaramex 6 hours ago
        C++ could offer the entry API here, so you can get back a type representing the result of finding where this key would go, and then either it has a key+value pair you can mutate if you want, or it has a blank state allowing you to write a new key+value pair if that's what you want, without redoing the potentially expensive find operation to figure out where to put the new/updated pair
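std::map has no entry API today, but a rough approximation of the "search once, then update or insert" idea is the `lower_bound` plus hinted-insert idiom. The sketch below is only an illustration; the helper name `add_to_counter` is made up for this example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: add `delta` to the counter stored under `key`,
// creating the entry if it is missing, with a single tree search.
int& add_to_counter(std::map<std::string, int>& counts,
                    const std::string& key, int delta) {
    // lower_bound returns the first element whose key is not less than `key`,
    // which is also the position where `key` would be inserted.
    auto pos = counts.lower_bound(key);
    if (pos == counts.end() || counts.key_comp()(key, pos->first)) {
        // Key is absent: pass `pos` as a hint so the map does not search again.
        pos = counts.emplace_hint(pos, key, 0);
        assert(pos->first == key);
    }
    pos->second += delta;
    return pos->second;
}
```

Calling `add_to_counter(m, "a", 2)` twice on an empty map leaves one entry; the second call finds the existing node and only updates it.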
      • hmry 6 hours ago
        I certainly use defaultdict often in Python too, but not more often than the regular dict. Maybe 90% dict and 10% defaultdict. So from my POV lookup only should definitely be the default.
    • Tempest1981 3 hours ago
      iirc, there are 5 ways to put something into a std::map

        operator[], insert(), emplace(), try_emplace(), insert_or_assign()
      
      And 3 of them (insert(), emplace(), try_emplace()) don't overwrite an existing value.

      Lots of people are surprised that insert() can fail. And even more surprised that a RHS [] inserts a default value. I'm not a fan of APIs that surprise.
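A small sketch of how the five calls listed above behave when the key already exists. The function name `write_over_existing` is hypothetical; `try_emplace` and `insert_or_assign` assume a C++17 compiler.

```cpp
#include <cassert>
#include <map>
#include <string>

// Starts from {"k": 1}, tries to store 2 under "k" with the named method,
// and returns whatever value ends up in the map.
int write_over_existing(const std::string& method) {
    std::map<std::string, int> m{{"k", 1}};
    if (method == "operator[]")            m["k"] = 2;                  // assigns through the reference
    else if (method == "insert")           m.insert({"k", 2});          // no-op if the key exists
    else if (method == "emplace")          m.emplace("k", 2);           // no-op if the key exists
    else if (method == "try_emplace")      m.try_emplace("k", 2);       // no-op if the key exists
    else if (method == "insert_or_assign") m.insert_or_assign("k", 2);  // overwrites
    assert(m.count("k") == 1);
    return m.at("k");
}
```

insert(), emplace() and try_emplace() report the failed insertion through the `bool` in their returned pair, which is easy to miss if the return value is ignored.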

  • junon 13 hours ago
    For the Rust inclined, [[nodiscard]] is #[must_use], if you were confused.

    Anyway, this article illustrates a great reason why C++ is a beautiful mess. You can do almost anything with it, and that comes at a cost. It's the polar opposite ethos of "there should be one clear way to do something" and this sort of thing reminds me why I have replaced all of my systems language needs with Rust at this point, despite having a very long love/hate relationship with both C and C++.

    Totally agree it should be marked as nodiscard, and the reasoning for not doing so is a good example of why other languages are taking over.

    • m-schuetz 9 hours ago
      I'm not a fan of nodiscard because it's applied way too freely, even if the return value is not relevant. E.g. WebGPU/WGSL initially made atomics nodiscard simply because they return a value, but half the algorithms that use atomics only do so for the atomic write, without needing the return value. But due to nodiscard you had to make a useless assignment to an unused variable.
    • bayesnet 12 hours ago
      It’s also worth noting that in rust you don’t need to be as worried about marking a function #[must_use] if there is a valid reason some of the time to discard the value. One can just assign like so `let _ = must_use_fn()` which discards the value and silences the warning. I think this makes the intent more clear than casting to void as TFA discusses.
      • on_the_train 12 hours ago
        There is in c++, too (std::ignore). Not sure why the author decided to go with the ancient void cast
        • vitus 9 hours ago
          std::ignore's behavior outside of use with std::tie is not specified in any finalized standard.

          https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p29... aims to address that, but that won't be included until C++26 (which also includes _ as a sibling commenter mentions).
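For comparison, a sketch of the ways to discard a [[nodiscard]] result that come up in this subthread. `bump` is a made-up function; the `std::ignore` line is accepted by mainstream standard libraries in practice, but as noted above it is only formally specified from C++26 on.

```cpp
#include <cassert>
#include <tuple>   // std::ignore

// Hypothetical function whose result callers are normally expected to use.
[[nodiscard]] int bump(int& counter) { return ++counter; }

int discard_examples() {
    int counter = 0;
    // bump(counter);                  // compilers typically warn here
    (void)bump(counter);               // C-style cast to void
    static_cast<void>(bump(counter));  // same thing, spelled the C++ way
    std::ignore = bump(counter);       // assignment to std::ignore
    assert(counter == 3);              // all three calls still ran
    return counter;
}
```

In every variant the call is still evaluated; only the diagnostic is silenced.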

        • aw1621107 10 hours ago
          I believe C++26 now allows _ as a placeholder name [0]:

          > We propose that when _ is used as the identifier for the declaration of a variable, non static class member variable, lambda capture or structured binding, the introduced name is implicitly given the [[maybe_unused]] attribute.

          > In contexts where the grammar expects a pattern matching pattern, _ represents the wildcard pattern.

          Some of the finer details (e.g., effect on lifetime and whether a _ variable can be used) differ, though.

          [0]: https://github.com/cplusplus/papers/issues/878

          • tialaramex 9 hours ago
            Specifically, Rust's _ is not a variable, it is a pattern that matches anything and so let _ = isn't an assignment it's specifically the explicit choice not to assign this value. If we wrote a "dummy" variable the compiler is forbidden from dropping the value early, that "dummy" is alive until it leaves scope, whereas if we never assigned the value it's dropped immediately.

            In modern Rust you don't need the let here because you're allowed to do the pattern match anywhere, and as I said _ is simply a pattern that matches anything. So we could omit the let keyword, but people don't.

    • pjmlp 12 hours ago
      I disagree with the conclusion, other languages are taking over because they have the advantage of not having 40 years of production code history, and those adopting them don't care about existing code.

      You will find similar examples in Python, Java, C#, ..., which is why not everyone is so keen on jumping to new language versions.

    • the_mitsuhiko 12 hours ago
      Interestingly Index::index is also usually not marked as `#[must_use]` in Rust either.
      • junon 12 hours ago
        I don't believe you can mark trait methods with #[must_use] - it has to be on the implementation. Not near a compiler to check at the moment.

        In the case of e.g. Vec, it returns a reference, which by itself is side-effect free, so the compiler will always optimize it. I do agree that it should still be marked as such though. I'd be curious the reasons why it's not.

        • steveklabnik 10 hours ago
          This is just my take, but I think historically the Rust team was hesitant to over-mark things #[must_use] because they didn't want to introduce warning fatigue.

          I think there's a reasonable position to take that it was/is too conservative, and also one that it's fine.

        • the_mitsuhiko 11 hours ago
          But it's also not marked at the implementation for HashMap's Index impl for instance.
          • tialaramex 8 hours ago
            This didn't seem like a footgun to me, hats["Jim"]; will panic if, in fact "Jim" isn't one of the keys, but what did the hypothetical author expect to happen when they write this? HashMap doesn't implement IndexMut so hats["Jim"] = 26; won't even compile.
  • reactordev 14 hours ago
    My pet peeve with c++ is exactly this. Either it’s not wise to call release, or it is (under circumstances) and yet the developer has no idea whether their scenario applies (tip: it doesn’t, 90% of the time).

    The stdlib is so bloated with these “Looks good, but wait” logic bombs.

    I wish someone would just draw a line in the sand and say “No, from here on out, this is how this works and there are no other scenarios that need a workaround”. Other systems languages are taking off (besides the expressiveness or memory safety bandwagon) because there are clear instructions in the docs on what each thing does, with examples of how to use it properly.

    Most c++ codebases I’ve seen the last 10 years are decent (a few are superb) and I get that there’s old code out there but at what point do we let old dogs die?

    • GuB-42 12 hours ago
      C++ has always been a "kitchen sink" language, it is used in many different ways and drawing any line may alienate an entire industry.

      > This is why other systems languages are taking off

      Great! It is not a competition. If you think that Rust is a better choice, use Rust, don't make C++ into Rust. Or maybe try Carbon, it looks like it is the language you want. But if you have some old dogs you want to keep alive, then use C++, that's what it is for.

      • reactordev 12 hours ago
        I get it, I do. There’s a lot of old code out there. My point wasn’t that old dogs are bad. My point was about changing how we care for them.

        If you have old code that you want to compile, use -std=c++98 or whatever to peg it to that. Leave the rest of us alone to introduce more modern ways of doing things. I’d even be happy to see removal of things.

    • pjmlp 12 hours ago
      > This is why other systems languages are taking off

      For the time being they are still being written with C++ infrastructure though.

      It would be great if those wannabe C++ replacements were fully bootstrapped.

      • reactordev 12 hours ago
        Go compiles go, not sure what you mean by wannabe c++.

        There’s a frontend to gcc for go and working on rust. Is it the use of gcc you dislike? You’re going to have to explain some more.

        We’re stuck on ASM/ELF. We’re stuck on C of some kind. Maybe in the future LLMs can help us write low-level / high expressiveness code but until we get rid of 1970s “personal computer” decisions in silicon, we’re stuck with it.

        • throwaway17_17 11 hours ago
          What is the proposed replacement for ASM (in particular) and C in the context of the bootstrapping process? Then why lump ELF (unless you don’t mean the executable format) in with the low level language?

          Historically, pjmlp has pushed very strongly for the view that languages attempting to take the place of C and C++ at the infrastructure layer cannot claim to have supplanted those two until their own compiler and related infrastructure no longer depend on C++ (LLVM in particular). I tend to sympathize with this view: it is really hard to take seriously a language’s claim to have supplanted C++ and to be the only fit-for-use language going forward when it depends on millions of lines of the language it disparages.

          As a counter however, it’s rather difficult to expect a language to overcome thousands of person-years of work on a compiler like LLVM and tens of thousands of person-years on Linux. The newer languages should be able to make an articulate case that throwing away so much work is not a viable approach and just use what exists now but keep the new languages on all greenfield projects.

          • steveklabnik 10 hours ago
            As another counter, Rust has never claimed to have "supplanted" C++. So holding it to that standard is holding it to a non-goal for itself in the first place.
            • throwaway17_17 10 hours ago
              tl;dr - Rust wants to be a foundational language of the computing stack and holding it to the standard of being bootstrapped, instead of relying on another language, is a reasonable critique.

              Clearly Rust the language and the associated organization would not make that claim. Particularly where supplanted is a past tense verb and indicates that it is a completed project.

              However, despite overblown complaints about the RESF, the community both in commentary and in practice has been extremely vocal that any language that does not have Rust’s memory safety model is not suitable for any new project or further use in existing projects. And while the RIIR meme is for the most part a message board strawman, again, the community surrounding Rust is busy reimplementing coreutils in Linux, putting Rust in the kernel, and rewriting the userland executables that most Linux workflows are based around (ripgrep being the most successful in this group).

              It is clear that Rust, the community of users (if not the language as an independent entity) clearly wants to supplant both C and C++ at all levels of the computing stack. The push for Rust in the Linux kernel is enough evidence to support the concept at the most pervasive level.

              Continuous references to widespread adoption and endorsements by ‘big tech’ are used to frame Rust as the only viable option going forward. Blog posts, Reddit threads, and board comments all routinely take the stance that memory-safety (as defined by Rust) is ‘table stakes’ for any development occurring in current year.

              It feels disingenuous to pretend that Rust is not trying to become the industry standard language in the way C and C++ is today and has been for multiple decades. And given that aim, I think talking about Rust, as the name for both the language and its community of users and supporters, is working to supplant C and C++.

              Given all that, I find it fair to discuss the fact that, while busy trying to maneuver itself into every space in the tech industry (from embedded all the way up to the front end for web apps) and finding some success in doing so, Rust is still reliant on C++ infrastructure, particularly for compilation. I was responding to a pair of comments about the desire to see languages that want to be the bedrock of the computing stack bootstrapped. I think Rust absolutely wants to be such a bedrock language and as such, I don’t think wanting it to be bootstrapped and not reliant on the C++ it wants to replace is an unreasonable standard to hold the language to.

              • steveklabnik 8 hours ago
                "Rust is a good language and we should use it to write software" is not the same thing as "lol C++ sucks and nobody should use it for anything ever."

                Engineering is all about tradeoffs. Responding to perceived zealotry with more, but different, zealotry makes it harder to have actual discussions.

                LLVM is best in class at what it does. Until someone else decides to make something like LLVM in Rust, it's not realistic to use something else. That's just engineering. The choices here directly refute these sorts of zealotry claims, that is, it's not incoherence in what's being done, it's that you are attributing something to a large group of people who have a wide variety of beliefs. Overall, people are more pragmatic than you're giving them credit for, that's why rustc uses LLVM.

              • LegionMammal978 8 hours ago
                Is offering an alternative to LLVM not precisely one of the purposes of the rustc_codegen_cranelift backend [0]? It still doesn't have 100% feature parity, but I believe it's able to fully bootstrap the compiler at this point. Writing a rustc backend isn't trivial, but it isn't as impossible as you make it out to be.

                [0] https://github.com/rust-lang/rustc_codegen_cranelift

                • throwaway17_17 8 hours ago
                  I’m not sure what I wrote to give the impression that Rust was unable to write a compiler, let alone implied it was impossible. Rust is certainly full featured enough to write a very well performing compiler. I find my comment more an indictment, and viewed uncharitably an accusation of hypocrisy, of the language org’s oversight that they are so heavily invested in LLVM (but if I was leveling such an accusation it would not be just because it’s a C++ project)

                  My comment was focused on the fact that Rust is not using a Rust compiler and therefore is relying on deep and complex C++ infrastructure while working to supplant the same at the lowest levels of the computing stack.

                  I was also commenting, up the thread, in a chain of comments about a perceived shortcoming of Rust’s implementation (i.e. it’s not being bootstrapped) and why some people view that as a negative.

                  • tialaramex 7 hours ago
                    All of the front-end is in fact pure Rust, I know that because I am one of the huge number of authors. The backend, thus the code generation and many optimisations of the sort AoCO is about is LLVM.

                    We absolutely know that if Rust didn't offer LLVM you'd see C++ people saying "Rust doesn't even have a proper optimiser". So now what you're asking for isn't just "a Rust backend" which exists as others have discussed, but a Rust alternative to LLVM, multiple targets, lots of high quality optimisations, etc. presumably as a drop-in or close approximation since today people have LLVM and are in production with the resulting code.

                    • reactordev 4 hours ago
                      Ignore them. Keep going. Debates like these are “Since you said X, Y can’t be true” kind of debates. As long as you have access to be able to do assembly, you should be able to do this. I say you because this is way out of my wheelhouse. I just want a cleaner, less mine-field laden, OO language that compiles to machine code. That’s it. We can stick a feather in this until this time a decade from now when we complain about it again.
        • pjmlp 11 hours ago
          Go is not a wannabe C++ replacement.

          It could be, but its designers aren't keen on modern language design.

          First it needs to fulfill more use cases than Docker and Kubernetes ecosystem.

          And while TinyGO and TamaGo exist, they require custom runtimes, and Assembly tricks that C++ supports at the language level, or even Rust does better than Go.

          It is better than using Oberon-07 minimalist design, though.

          • reactordev 7 hours ago
            I agree 100%. Just pointing out there’s efforts in this area for better or worse.

            My rant is really about sensible defaults that should enforce security and standards (stdlib after all) instead of having to juggle archaic edge cases from hardware of 30 years ago or adding more keyword sugar to your signature to make it through.

            Go is fun to write though.

          • throwaway17_17 9 hours ago
            Can you expand on Oberon-07 minimalism in the context of bootstrapping or working at the lowest level of abstractions?

            Your posts about the Wirth tradition languages and their implementations are typically well founded and I haven’t read much on this aspect. If you just have a reference you’d suggest that would be more than enough (if you don’t want to take time explaining what has been written elsewhere).

      • SJC_Hacker 12 hours ago
        > It would be great if those wannabe C++ replacements were fully bootstrapped

        This would require (re)writing the OS in the replacement language

        Also need assembler to be taken seriously, which Rust can’t do last I checked

        • throwaway17_17 11 hours ago
          Can you explain the assembler bit? Are you talking about the handling of inline assembly? I thought Rust allowed that in unsafe code.
          • steveklabnik 10 hours ago
            It does! And it's in the language proper, unlike being a compiler extension like it is in C.
            • tialaramex 10 hours ago
              And Rust also has a good story for all the accompanying baggage such as naked assembler functions, whereas even with your extension in C or C++ there may just be a shrug emoji, or some blog posts because hey it's not part of the language.
        • pjmlp 11 hours ago
          Nah, they could start by not depending on LLVM/GCC and do their whole compiler back to back.
          • lenkite 10 hours ago
            Isn't that what Rust is attempting with Cranelift ? They had making Cranelift backend "production-ready" for development use as a goal for 2026. I am guessing it will be a few years beyond that before it is made available for general production-ready use.

            I think Zig might possibly beat Rust's timeline here for a "No C/C++" toolchain. That is if its lead doesn't burn himself out.

  • fn-mote 14 hours ago
    Title could be “ugly C++ idioms prevent map::operator[] from being [[nodiscard]]”.

    Many of the uses are in Google’s codebase.

    Overall very technical; interesting if you are a library writer or maybe if you care about long term improvements in your C++ legacy codebase.

  • dundarious 10 hours ago
    There is no need to compromise in order to support pre-C++17, you don't need `try_emplace` when the value is like bool and hence doesn't benefit from move semantics -- plain old `insert` is exactly equivalent, and has existed since std::map's inception.
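A sketch of that point, assuming the goal is simply to make sure a key is present with a default `bool` value without writing a bare `m[key];` statement. The helper name `ensure_key` is hypothetical, and the body sticks to calls that predate C++11.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical helper: make sure `key` is present, defaulting to false.
// Returns true if the key was newly inserted, false if it already existed.
bool ensure_key(std::map<int, bool>& flags, int key) {
    // insert() leaves an existing entry untouched, just like try_emplace();
    // for a cheap value type such as bool there is nothing to gain from moving.
    bool inserted = flags.insert(std::make_pair(key, false)).second;
    assert(flags.count(key) == 1);
    return inserted;
}
```

Because the result of insert() is actually used here, the statement would not trip a discarded-value warning either.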
  • themafia 13 hours ago
    Ah, and because this is C++, the standard map having typed template parameters, which could be a non pointer, they're forced to make operator[] have this semantic:

    Returns a reference to the value that is mapped to a key equivalent to key or x respectively, performing an insertion if such key does not already exist.

    Which is a bit of a surprise coming from mostly C and Go.
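To make the quoted semantic concrete, here is a minimal sketch contrasting a read through operator[] with a lookup through find(). Both function names are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Reads `key` through operator[] and reports the map size afterwards.
std::size_t size_after_subscript_read(std::map<std::string, int>& m,
                                      const std::string& key) {
    int value = m[key];          // inserts {key, 0} if the key was absent
    (void)value;                 // the read itself is not interesting here
    assert(m.count(key) == 1);   // the key is present now, either way
    return m.size();
}

// Lookup that never inserts; unlike operator[], find() also works on a const map.
bool contains_key(const std::map<std::string, int>& m, const std::string& key) {
    return m.find(key) != m.end();
}
```

This is also why operator[] is not available on a `const std::map`: it may need to modify the container.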

    • ahartmetz 10 hours ago
      This "create a default-constructed value just so you can return a reference" logic is pretty terrible tbh. For insert: first create a default-constructed value, then assign to it. For retrieval: in the not found case, (permanently) insert a default-constructed value into the map. Need to return a valid reference!

      Qt containers do it better: upsert with insert() and retrieve with value(), which, in the not found case, will return a default-constructed value (or a caller-supplied value) but without inserting it into the map.
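Something close to that retrieval behavior can be written on top of std::map in a few lines. This is a sketch, not Qt's implementation; `value_or` is a made-up name, not a standard library function.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical free function: return the mapped value for `key`, or `fallback`
// (a value-initialized mapped_type by default) without touching the map.
template <class Map>
typename Map::mapped_type value_or(
    const Map& m,
    const typename Map::key_type& key,
    const typename Map::mapped_type& fallback = typename Map::mapped_type()) {
    auto it = m.find(key);
    return it != m.end() ? it->second : fallback;
}
```

The result is returned by value, so there is no need to keep a default-constructed element alive inside the container just to have something to reference.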

  • rwmj 13 hours ago
    https://en.cppreference.com/w/cpp/language/attributes/nodisc... .. in case anyone else was wondering. It seems to mean the compiler should warn if you ignore the result except by an explicit cast to void.
    • larusso 13 hours ago
      Thanks. I was wondering what this means practically. So they rolled this back because Google who compile with warnings as errors can’t fix these lines? Must be great to be Google and on all these boards. I on the other hand have to deal constantly with breaking changes left and right because someone decided, among them Google (16KB page tables anyone), that going forward stuff works differently.
      • compiler-guy 8 hours ago
        Google could easily change these lines. The question is, should it?

        One thing about Google living so close to head with its libc++ is that it encounters the issues downstream users will encounter, just long before everyone else. It saw this development within a day or two of the PR getting merged.

        The idiom is unfortunately common in C++ codebases around the world so this was a good predictor that many other users will be broken. It isn’t necessarily erroneous, unlike many of the other no-discard additions made in this patch series.

        So the question becomes, “Are the false positives worth the true positives?” Not just for Google, but for the entire user base.

        It is reasonable to disagree on this, and often library writers up to date on the latest and greatest miss the issues this sort of change will cause.

  • j1elo 12 hours ago
    C++ could try to approach the "stability without stagnation" model.

    Add an opt-in compiler flag --edition='26' which, when used, applies the breaking changes defined for C++26. Then users like Google or others who have been (ab)using some features for their side effects can decide to stay on the older versions.

    • pjmlp 12 hours ago
      It already exists, --std=c++26.
  • ivanjermakov 12 hours ago
    Another default that Zig got right: every non-void result must be handled.

    https://github.com/ziglang/zig/issues/219

    • leecommamichael 6 hours ago
      This is one of those preferences that will never fail to split the room. I appreciate both routes depending on the domain, but I do have a preference. As a games and UI app developer, I find required-handling-by-default adds too much friction and disrupts my flow. Rust and Zig (through different means) create friction like this in an effort to make low-level code "easier", but only if the code is "correct" according to the language. As a dev that spent a lot of time with Swift, and loving the language's ability to express APIs, I came to appreciate compilers with the quality that "if the code runs, it's probably correct," and yet my preference did not land on Rust or Zig, but with Odin. I sat down with Odin 5 or so years ago and it felt like the friction was exactly where I wanted it to be for the software I write.
    • kyralis 11 hours ago
      I would not agree. There are plenty of times I've written and used functions that have informational return values that are beneficial in certain cases and unnecessary in most. This is why most languages have chosen annotations to allow for the function author to indicate intent.
    • drnick1 4 hours ago
      It’s a terrible idea, what if you aren’t interested in the return value and can’t rewrite the function because it is in a third party library? Assigning that to a variable that won’t be used is terribly inefficient and inelegant.
    • m-schuetz 9 hours ago
      Terrible for atomic functions where you are often only interested in atomically writing something, but not interested in the return value they also provide.
  • jesse__ 4 hours ago
    lol .. add one more reason to the overflowing fountain of reasons to not use anything in std
  • dooglius 13 hours ago
    `try_emplace` is not a huge improvement since it overloads the existing keyword "try" to mean something pretty different. Should be `emplace_if_absent`/`insert_if_absent` but changing the API of stdlib would require going through a huge formal process