• 18107@aussie.zone
    2 months ago

    In this case the limit was entirely arbitrary.

    The programmers were told to pick a limit and they liked 256. There are issues with having a large number of people in a group, but it wasn’t a hardware limit for this particular case.

  • ch00f@lemmy.world
    2 months ago

    I’m typing this on a 64-bit device. Why anyone would limit something to an 8-bit number in 2025 is really odd.

    • magic_lobster_party@fedia.io
      2 months ago

      It’s for their servers. I guess it might be for cache optimization. For performance reasons, they want to fit as much as possible in the cache. One extra byte can throw the memory alignment off, which causes wasted space in the cache.

      Just my guess. There might be other reasons.
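To make the alignment guess concrete, here is a minimal sketch using Python's standard `struct` module. The record layout (a one-byte count followed by a four-byte id) is made up for illustration and says nothing about WhatsApp's real data structures; the amount of native padding also depends on the platform.

```python
import struct

# Hypothetical record: a 1-byte member count followed by a 4-byte id.
# "=" means standard sizes with no padding; "@" means native sizes and alignment.
packed_size = struct.calcsize("=BI")   # 1 + 4 bytes, no padding
native_size = struct.calcsize("@BI")   # C-style layout, may insert padding bytes

print(f"packed: {packed_size} bytes, native: {native_size} bytes")
print(f"padding added by alignment: {native_size - packed_size} bytes")
```

On common 64-bit platforms the native layout pads the single byte out so the four-byte field stays aligned, which is the kind of wasted space the comment is guessing at.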

      • ch00f@lemmy.world
        2 months ago

        A single username will use up more memory than an 8-bit limitation to the number of users will save.

    • MudMan@fedia.io
      2 months ago

      On a device with many gigabytes of RAM and probably terabytes of storage.

      I guess when you have billions of users, and presumably tens or hundreds of billions of instances of a thing living in your server, every bit adds up? I don’t even know where to start doing the napkin math for something like that.

      • boonhet@sopuli.xyz
        2 months ago

        100 billion messages per day and over half of them in groups apparently. It’s a lot, but 3 bytes per message is still not a lot of data. I’d guess they pack the metadata as tight as possible.

    • Cousin Mose@lemmy.hogru.ch
      2 months ago

      I get what you’re saying but I don’t like this line of thinking. In the tech industry there is far too much bloat that we just accept due to cheap memory and storage.

      • Justin@lemmy.jlh.name
        2 months ago

        There are much better algorithmic and datatype optimizations to be made than designing your app around saving 3 bytes that most runtimes probably represent as a long long anyways.

        • Cousin Mose@lemmy.hogru.ch
          2 months ago

          True but more generally things like Electron apps, not precompiling classes in interpreted languages’ Docker images, looping through millions of records without plucking only the data you need, etc seem to be widespread and shrugged off.

          While writing code you can get in the habit of doing things efficiently and long-term the cost savings pile up. Obviously caring about only this one specific case will hardly accomplish much on its own.

    • Honytawk@feddit.nl
      2 months ago

      Whatsapp has 2 billion users.

      The difference is 16 billion bits (2 GB) compared to 128 billion bits (16 GB), so about 14 GB saved, and that is just for the number.

      When working with big sizes, memory optimization is key.
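Working those numbers through as a quick sketch: it assumes one fixed-width field per user, the 2 billion user figure quoted above, and decimal gigabytes. The function name is just for illustration.

```python
USERS = 2_000_000_000  # user count quoted in the comment above

def total_gigabytes(bits_per_user: int, users: int = USERS) -> float:
    """Total storage in decimal gigabytes for one fixed-width field per user."""
    return bits_per_user * users / 8 / 1e9

small = total_gigabytes(8)    # one byte per user
large = total_gigabytes(64)   # eight bytes per user

print(f"8-bit field:  {small:.0f} GB")
print(f"64-bit field: {large:.0f} GB")
print(f"difference:   {large - small:.0f} GB")
```

So the 64-bit version costs 16 GB in total, and the saving from using a single byte is 14 GB.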

  • spongebue@lemmy.world
    2 months ago

    So, I get that 256 is a power of 2. But we’re not running 8-bit servers or whatever here (and yes, I understand that’s not what 8-bit generally refers to). Is there some kind of technical limitation I’m not thinking of where 257 would be any more difficult to implement, or really is it just that 256 has a special place in someone’s heart because it’s a power of 2?

    • jaaake@lemmy.world
      2 months ago

      The issue isn’t storing each individual ID, it’s all of the networking operations that are done and total things that are stored/cached per user in each chat. All of those things are handled and stored as efficiently as possible. Sure they could set it to any number, but 256 is a nice round one when considering everything that is happening and the use cases involved. They have user research data and probably see that 128 is too close to a group size that happens with some regularity, but group sizes very rarely get close to 256, and 512 is right out.

    • mEEGal@lemmy.world
      2 months ago

      when writing somewhat low-level code, you always make assumptions about things. in this case, they chose to manage 256 entries in some array; the bound used to be lower.

      but implicitly there’s a tradeoff, probably memory / CPU utilisation in the server.

      it’s always about the tradeoff between what the users want, what is easier for you to maintain, what your infrastructure can provide, etc.

    • SparroHawc@lemmy.zip
      2 months ago

      There’s often a lot of fun cheats you can use - bitwise operators, etc - if your numbers are small powers of two.

      Also it’s easier to organize memory, if you’re doing funky memory management tricks, if the memory you’re allocating fits nicely into the blocks available to you which are always in powers of two.

      They’re not necessarily great reasons if you’re using a language with sufficient abstraction, but it’s still easier in most instances to use powers of two anyways if you’re getting into the guts of things.
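A few of those power-of-two cheats, sketched in Python. The helper names are invented for this example; the tricks themselves are standard bit manipulation and only hold when the size really is a power of two.

```python
GROUP_LIMIT = 256  # a power of two: 2**8

def is_power_of_two(n: int) -> bool:
    """A power of two has exactly one bit set, so n & (n - 1) clears it to zero."""
    return n > 0 and (n & (n - 1)) == 0

def wrap_index(i: int, size: int = GROUP_LIMIT) -> int:
    """Same result as i % size, but only valid when size is a power of two."""
    return i & (size - 1)

def block_number(offset: int) -> int:
    """Same result as offset // 256, using a shift by log2(256) = 8."""
    return offset >> 8

print(is_power_of_two(256), is_power_of_two(257))
print(wrap_index(300), 300 % 256)
print(block_number(1000), 1000 // 256)
```

In a high-level language the compiler or runtime will often do this for you, which is why these are conveniences rather than hard requirements.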

    • AbsolutelyNotAVelociraptor@sh.itjust.works
      2 months ago

      Because 256 is exactly the number of values one byte can hold. If you want to add a 257th member, you need a whole second byte just for that one person. That’s a waste of memory, unless you want to go to the 64k barrier of users per chat.

      • Zagorath@aussie.zone
        2 months ago

        Except that they’re almost certainly just using int, which is almost certainly at least 32 bits.

        256 is chosen because the people writing the code are programmers. And just like regular people like multiples of 10, programmers like powers of 2. They feel like nice round numbers.

        • verstra@programming.dev
          2 months ago

          Well, no. They are not certainly using int, they might be using a more efficient data type.

          This might be for legacy reasons or it might be intentional because it might actually matter a lot. If I make up an example, chat_participant_id is definitely stored with each message and probably also in some index, so you can search the messages. Multiply this over all chats on WhatsApp, even the ones with only two people in, and the difference between u8 and u16 might matter a lot.

          But I understand how a TypeScript or Java dev could think that the difference between 1 and 4 bytes is negligible.

          • Zagorath@aussie.zone
            2 months ago

            They are not certainly using int

            Probably why I said “almost certainly”. And I stand by that. We’re not talking about chat_participant_id, we’re talking about GROUP_CHAT_LIMIT, probably a constant somewhere. And we’re talking about a value that would require a 9-bit unsigned int to store it, at a minimum (and therefore at least a 16-bit integer in sizes that actually exist for types). Unless it’s 8-bit and interprets a 0 as 256, which is highly unorthodox and would require bespoke coding basically all over instead of a basic num <= GROUP_CHAT_LIMIT.
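The bit-width claim is easy to check. This is only a sketch of the arithmetic; `GROUP_CHAT_LIMIT` is the hypothetical constant named in the comment, not something known to exist in WhatsApp's code.

```python
GROUP_CHAT_LIMIT = 256  # hypothetical constant from the comment above

print((255).bit_length())             # 8: the largest value a byte can hold
print(GROUP_CHAT_LIMIT.bit_length())  # 9: one bit too many for a byte

# Truncating to 8 bits, as a uint8 would, wraps 256 around to 0.
as_uint8 = GROUP_CHAT_LIMIT & 0xFF
print(as_uint8)

# The smallest standard unsigned width that can hold the value.
fits = next(w for w in (8, 16, 32, 64) if GROUP_CHAT_LIMIT < 2 ** w)
print(fits)
```

A byte can hold 256 distinct values (0 through 255), but the number 256 itself needs a ninth bit, which is the distinction being argued over here.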

            • Passerby6497@lemmy.world
              2 months ago

              And we’re talking about a value that would require a 9-bit unsigned int to store it, at a minimum (and therefore at least a 16-bit integer in sizes that actually exist for types). Unless it’s 8-bit and interprets a 0 as 256, which is highly unorthodox and would require bespoke coding basically all over instead of a basic num <= GROUP_CHAT_LIMIT.

              I think you’re just very confused friend, or misunderstanding how binary counting works, because why in the 9 hells would they be using 9 bits (512 possible values) to store 8 bits (256 possible members) of data?

              I think you’re confusing indexing (0-255) with counting (0-256), and mistakenly including a negation state (counting 0, which would be a null state for the variable) in your conception of the process. Because yes, index 255 is in fact count 256 and 0 would actually be 1. Index = count -1

              • Zagorath@aussie.zone
                2 months ago

                I’m imagining something like this:

                def add_member(group, user):
                    # Admit a new member only while the group is below the cap.
                    if len(group.members) < GROUP_CHAT_LIMIT:
                        ...

                If GROUP_CHAT_LIMIT is 8 bits, this does not work.

                • Passerby6497@lemmy.world
                  2 months ago

                  So add a +1 like you would for any index to count comparison?

                  I guess I’m failing to see how this doesn’t work as long as you properly handle the comparison logic. Maybe you can explain how this doesn’t work…

            • boonhet@sopuli.xyz
              2 months ago

              Orrrr they have a u8 chat_participant_id of some kind and a binary data format for message passing. The GROUP_CHAT_LIMIT const may have a bigger data type, but they may very well be trying to conserve 3 bytes per message. Ids can easily start at 0.

              150 gigs of bandwidth saved per day doesn’t seem like a whole lot at their scale, but if they archive all the metadata, that’s over 50 terabytes a year saved on storage - multiplied by how many copies they have of their data. Still not a lot tbh, but if they also conserve data in every other place they can, they could be saving petabytes per year in storage.

              Still weird because then they’d have to reuse ids when people leave, otherwise you could join and leave 255 times to disable a group lol
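The napkin math behind those figures, as a sketch. The inputs are the thread's own numbers (100 billion messages a day, about half in groups, 3 bytes saved per group message); none of them are verified here.

```python
MESSAGES_PER_DAY = 100_000_000_000  # figure quoted earlier in the thread
GROUP_SHARE = 0.5                   # "over half" are group messages, so a lower bound
BYTES_SAVED_PER_MESSAGE = 3         # a u8 id instead of a 4-byte id

bytes_per_day = int(MESSAGES_PER_DAY * GROUP_SHARE) * BYTES_SAVED_PER_MESSAGE
gb_per_day = bytes_per_day / 1e9
tb_per_year = bytes_per_day * 365 / 1e12

print(f"{gb_per_day:.0f} GB per day")
print(f"{tb_per_year:.2f} TB per year, per stored copy")
```

That gives 150 GB a day and a bit under 55 TB a year for each copy of the data that is kept.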

          • MyBrainHurts@lemmy.ca
            2 months ago

            But I understand how a TypeScript or Java dev could think that the difference between 1 and 4 bytes is negligible.

            Shots fired.

            • ByteJunk@lemmy.world
              2 months ago

              Fair point, but still better than wasting a nuclear power plant worth of electricity to solve math homework with an LLM

            • jaybone@lemmy.zip
              2 months ago

              All these tough guys think you can’t bit shift in Java, never worked on a project with more than two people. Many such cases.

        • Lodespawn@aussie.zone
          2 months ago

          It’ll have to do with packet headers, 8 bits is a lot for an instant message packet header.

        • jaybone@lemmy.zip
          2 months ago

          It’s not that they “like it”. It’s ultimately a hardware limitation. Of course we can have 64 bit integers, or however many bits. It’s an appealing optimization.

        • ViatorOmnium@piefed.social
          2 months ago

          For high volume wire formats using uint8 instead of uint32 can make a huge difference when considering the big picture. Not everyone is working on bootcamp level software.
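A rough illustration of the wire-format point, using Python's `struct` module. The header layout (a 64-bit message id plus the sender's position in the group) is hypothetical and not WhatsApp's actual protocol.

```python
import struct

# Hypothetical message header: a 64-bit message id plus the sender's
# position within the group. "<" selects little-endian with no padding.
NARROW = struct.Struct("<QB")  # participant index as uint8
WIDE = struct.Struct("<QI")    # participant index as uint32

header = NARROW.pack(123456789, 255)
msg_id, participant = NARROW.unpack(header)

print(NARROW.size, WIDE.size)   # bytes per header: 9 versus 12
print(msg_id, participant)
```

Three bytes out of twelve is a 25% smaller header in this made-up layout. The trade-off is that the narrow format cannot represent an index above 255 at all: `NARROW.pack(1, 256)` raises `struct.error`.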

      • spongebue@lemmy.world
        2 months ago

        If each user is assigned a number as to where they’re placed in the group, I guess. But what happens when people are added and removed? If #145 leaves a full group, does #146 and beyond get decremented to make room for the new #256? (or #255 if zero-indexed). It just doesn’t seem like something you’d actually see in code not designed by a first semester CS student.

        Also, more importantly, memory is cheap AF now 🤷‍♂️
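On the renumbering question: nobody has to be decremented if freed numbers are simply handed to the next person who joins. Here is a minimal sketch of that idea; `SlotAllocator` and its methods are invented for this example and are only one possible design.

```python
import heapq

class SlotAllocator:
    """Hands out small integer slots and reuses the ones that are freed."""

    def __init__(self, capacity: int = 256):
        self._free = list(range(capacity))  # a sorted list is already a valid min-heap
        self._slot_of = {}                  # user -> slot

    def join(self, user: str) -> int:
        if user in self._slot_of:
            return self._slot_of[user]
        if not self._free:
            raise OverflowError("group is full")
        slot = heapq.heappop(self._free)    # lowest free slot
        self._slot_of[user] = slot
        return slot

    def leave(self, user: str) -> None:
        slot = self._slot_of.pop(user)
        heapq.heappush(self._free, slot)    # make the slot available again


group = SlotAllocator()
for n in range(256):
    group.join(f"user{n}")
group.leave("user145")
print(group.join("newcomer"))  # takes over the slot user145 vacated
```

Everyone else keeps their number, and repeatedly joining and leaving cannot exhaust the id space because slots are recycled.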

        • SandmanXC@lemmy.world
          2 months ago

          While I completely agree with the sentiment, snorting too much “memory is cheap AF” could lead to terminal cases of Electron.

        • ViatorOmnium@piefed.social
          2 months ago

          Memory and network stop being cheap AF when you multiply it by a billion users. And Whatsapp is a mobile app that’s expected to work on the crappiest of networks and connections.

        • morphballganon@lemmynsfw.com
          2 months ago

          There would be no need to decrement later people because they’re definitely referred to using pointers. You’d just need to update the previous person’s pointer to the new next person.

    • Ekky@sopuli.xyz
      2 months ago

      Jup, lots of people are talking 64-bit architecture and RAM optimization, whereas the number in question most likely is related to IPv4 packets, which were made for (and to my knowledge still use) octets/8-bit blocks.

  • Chozo@fedia.io
    2 months ago

    Source.

    This isn’t a “tech article”, it’s an article about tech. This is a normie article from a normie news outlet for normie readers.

    Also from the article:

    A previous version of this article said it was “not clear why WhatsApp settled on the oddly specific number.” A number of readers have since noted that 256 is one of the most important numbers in computing, since it refers to the number of variations that can be represented by eight switches that have two positions - eight bits, or a byte. This has now been changed. Thanks for the tweets. DB

    • AlexanderTheDead@lemmy.world
      2 months ago

      It doesn’t really matter that it’s a “normie article for normie readers”. Writing articles is journalism. Not knowing 256 offhand? Permissible. Being a journalist who wrote an article and didn’t even do the bare bones of research? You’re still a bad journalist, and as callous as it is, you should lose your job and livelihood. Bad journalism is too dangerous to just let it fester like this.

      • Echo Dot@feddit.uk
        2 months ago

        The newspaper he was writing for is a major publication; he absolutely could have asked someone.

        The problem here is the newspaper didn’t care enough about the article to put anyone on it who is even remotely familiar with technology. They probably thought of it as just some throwaway piece to fill out a bit of space. Which to be fair it would have been had it not been for that comment.

    • wuzzlewoggle@feddit.org
      2 months ago

      One of the most important numbers? I’d argue the most important number in computing is either 1 or 0…

    • Mark with a Z@suppo.fi
      2 months ago

      That weird ass explanation with switches and “one of the most important numbers” still sounds absolutely clueless.

      • wabasso@lemmy.ca
        2 months ago

        I liked the switches analogy! Generally about binary though; I agree it doesn’t connect back to the number of users application.

        And yeah most important number…sounds like they were quoting an LLM.

    • AFK BRB Chocolate (CA version)@lemmy.ca
      2 months ago

      That quote really is the problematic part. The part about switches is fine - it’s an attempt to explain tech to a “normie.” But for a tech writer to ever say it’s not clear why they settled on 256 is worse than embarrassing. They had to be corrected by tweets.

      Anyone who’s ever had an intro to computers class has had a computing professional explain computers using simple language and analogies. That’s the way this kind of thing should work. It sounds like this author has no more clue about computing than the target audience, which isn’t going to work out well for the reader.

    • sp3ctr4l@lemmy.dbzer0.com
      2 months ago

      It used to be common for uh, writers, journalists, to have at least basic familiarity with what they’re writing or reporting on.

      It’s not like this is journalistic malpractice, spreading lies, fabricating a quote, supporting a bs narrative by being very selective with context and such…

      … but it is pretty embarrassing.

      People seem to constantly confuse ‘i use computer technology’ with ‘i understand how computer technology works’.

      Like uh, Gen Z and A are the most digital, online generations yet… but many of them can’t type on a keyboard, have no idea what a file/folder structure is.

      • AnarchistArtificer@slrpnk.net
        2 months ago

        I think you’re highlighting two different problems here.

        I agree that Gen Z and younger are, on average, far worse at basic computer skills than many seem to assume. It makes me reflect on my tech-learning throughout my childhood, as a Millennial. I think that part of it is that many erroneously assume that because Gen Z has grown up online, that this will lead to proficiency, but the kind of tech they’ve been exposed to is largely walled gardens and oversimplified UIs. That assumption of proficiency leads to scenarios where their lack of skill is only discovered when they enter college, or the workplace. I am astounded at the prospect of people not even knowing the difference between “Cut and Paste” and “Copy and Paste”. It’s grim.

        The poor quality of journalism may be linked to this, but I think it’s larger than that. It seems like it’s not a great time to be a journalist at the moment (my writer friends tell me that increasingly, the only work they’re able to find is copy-editing AI shit). Private equity is fucking up so much of the world — journalism included. Polygon is an example of an outlet that was apparently sustainably profitable, before it was sold and experienced mass lay-offs; an individual company’s success doesn’t matter to the big conglomerate that owns it. I know that other journalistic companies have fallen to the same fate too.

        It also seems that tech journalism ends up being especially shit. I didn’t start noticing it properly until I watched this podcast episode from “Tech Won’t Save Us”. The TL;DW of it is that tech journalists like Kara Swisher like to pretend that they speak truth to power, and fire hard-hitting questions at big tech people, when that’s patently bullshit and it’s clear that they only get the access that they do by playing softball with the powerful. We can’t blame a few individuals for the entirety of the tech journalism problem, but I reckon it’s a big part of it when so many of the established, big names in this space don’t seem interested in actually doing tech journalism (and smaller names who want to ask journalistically interesting questions don’t get platforms or access to ask those questions).

        Our information ecosystem is not in a great place. I’ve found it tremendously beneficial to curate the news and information I’m exposed to (praise be RSS), but that has been a gradual process of actively working to notice good journalism in the world and build up my mental “rolodex” of people whose perspectives I trust to be worthwhile (even if I don’t necessarily agree with said perspectives). However, this is an area that I care deeply about, and thus it feels worthwhile to spend that energy to curate my infosphere. Most people won’t have the inclination or energy to do this work, which is unfortunate.

    • deltapi@lemmy.world
      2 months ago

      No, you can’t have a group of zero, so the counter doesn’t need to waste a position counting zero.

      • seejur@lemmy.world
        2 months ago

        You also cannot have a group of 1, so by that logic it should be either 255 or 257. 256 is oddly specific (or the code was made by an intern).

      • HereIAm@lemmy.world
        2 months ago

        If you ever create a system where the number of users is “group.members - 1” everywhere in the code, I’d be very disappointed in you and deny that PR.

        On another note; I doubt WhatsApp are so concerned with performance they are actually limiting the number of group members by the data type.

        • BillBurBaggins@lemmy.world
          2 months ago

          But it wouldn’t be like that though would it. It would be public group.members() and the u8 would be private.

          If all the millions of groups are saved on a central database then making the size a u8 isn’t really that weird
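Something like the following is one way to read that design: the stored byte holds the count minus one, and callers only ever see the real count. This is a sketch under that assumption; `GroupSize` and its members are hypothetical names, not anything from WhatsApp.

```python
class GroupSize:
    """Stores a member count of 1..256 in a single byte as (count - 1)."""

    MAX_MEMBERS = 256  # the constant itself is an ordinary int, wider than 8 bits

    def __init__(self, count: int = 1):
        self._raw = bytearray(1)  # the private one-byte field
        self.count = count

    @property
    def count(self) -> int:
        return self._raw[0] + 1

    @count.setter
    def count(self, value: int) -> None:
        if not 1 <= value <= self.MAX_MEMBERS:
            raise ValueError("count must be between 1 and 256")
        self._raw[0] = value - 1

    def can_add_member(self) -> bool:
        return self.count < self.MAX_MEMBERS


size = GroupSize(256)
print(size.count, size._raw[0], size.can_add_member())
```

The offset lives in exactly one place, so the rest of the code never writes `- 1` or `+ 1` itself. Note that the limit constant still needs more than one byte; only the stored field is squeezed into a `u8`.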

          • HereIAm@lemmy.world
            2 months ago

            I hadn’t thought about it on their server side tbf. But the more I think about it, maybe there are other compounding reasons to keep group sizes small, such as the rapidly (roughly quadratically) growing number of links in a growing network and such. But that is all beyond my knowledge area.
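For a sense of scale on the links point: if every member can reach every other member, the number of distinct pairs is n(n - 1)/2, while a single message still has to be delivered n - 1 times. A small sketch, with function names made up for the example:

```python
def pairwise_links(members: int) -> int:
    """Number of distinct member pairs: n * (n - 1) / 2."""
    return members * (members - 1) // 2

def deliveries_per_message(members: int) -> int:
    """One copy for every member except the sender."""
    return members - 1

for n in (16, 64, 256, 1024):
    print(n, pairwise_links(n), deliveries_per_message(n))
```

Going from 256 to 1024 members multiplies the pair count by about 16, which is one plausible reason to cap group size regardless of data types. Whether pairwise links actually matter depends on how the service routes messages, which this thread does not establish.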

  • xeekei@lemmy.zip
    2 months ago

    You know you’re a tech nerd when 256 sounds more even than 250 or 300. 😅

  • BilboBargains@lemmy.world
    2 months ago

    I remember being puzzled by this and many other numbers that kept cropping up. 32, 64, 128, 256, 1024, 2048… Why do programmers and electronic engineers hate round numbers? The other set of numbers that was mysterious was timber and sheet materials. They cut them to 1220 x 2440mm and thicknesses of 18 and 25mm. Are programmers and the timber merchants part of some diabolical conspiracy?

      • BilboBargains@lemmy.world
        2 months ago

        Much later in my career I came to appreciate the beauty of this system and the link with hexadecimal. I had to debug a network transmitted CRC that was endian flipped and in that process learned that in the Galois Field of two, 1+1=0 which feels delightfully nonsensical to a luddite.
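Both ideas fit in a few lines of Python. This is just an illustration using the standard `zlib.crc32`; the CRC and byte layout from the original debugging story are not known, so the input here is arbitrary.

```python
import zlib

def gf2_add(a: int, b: int) -> int:
    """Addition in GF(2): ordinary addition modulo 2, which is the same as XOR."""
    return a ^ b

print(gf2_add(1, 1))  # 1 + 1 wraps to 0 in GF(2)

# A CRC-32 value serialised in both byte orders.
crc = zlib.crc32(b"hello")
big = crc.to_bytes(4, "big")
little = crc.to_bytes(4, "little")
print(big.hex(), little.hex())

# Reading the bytes with the wrong byte order gives a different number.
misread = int.from_bytes(big, "little")
print(hex(crc), hex(misread))
```

CRC arithmetic is polynomial arithmetic over GF(2), which is why XOR shows up everywhere in CRC implementations.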

    • Worx@lemmynsfw.com
      2 months ago

      32, 64, 128 etc. are all round numbers, counting in binary. They are powers of two. Since computers work in binary, they make logical sense.

      1220mm is 4ft (and 2440mm is 8ft), and 18 and 25mm are roughly three-quarters of an inch and an inch respectively.
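Both kinds of "round" can be checked quickly. A small sketch; the only constant used is the exact definition of an inch as 25.4 mm.

```python
MM_PER_INCH = 25.4  # exact by definition

# Powers of two are a 1 followed by zeros in binary, like 10, 100, 1000 in decimal.
for n in (32, 64, 128, 256):
    print(n, bin(n))

# Imperial sizes behind the metric sheet dimensions.
sheet_mm = (4 * 12 * MM_PER_INCH, 8 * 12 * MM_PER_INCH)   # 4 ft x 8 ft
thickness_mm = (0.75 * MM_PER_INCH, 1.0 * MM_PER_INCH)    # 3/4 in and 1 in
print(sheet_mm, thickness_mm)
```

The metric figures are rounded conversions: 4 ft is about 1219 mm, three-quarters of an inch is about 19 mm, and an inch is 25.4 mm.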

      • Scubus@sh.itjust.works
        2 months ago

        They were making a joke. That being said, I’m not familiar with lumber or imperial<->metric conversions so their second point was lost on me, so thanks.

      • jj4211@lemmy.world
        2 months ago

        Pretty much this…

        Once upon a time, sure, you might have used an 8 bit char to store an array index and incur a 256 limit for actual reasons…

        But nowadays, you do it because 256 is a “cool techy limit”. Developers are almost all dealing with at least 32 bit values, and the actual constraints driving smaller values generally have nothing to do with some power of two limitation.

  • rarbg@lemmy.zip
    2 months ago

    A previous version of this article said it was “not clear why WhatsApp settled on the oddly specific number.” A number of readers have since noted that 256 is one of the most important numbers in computing, since it refers to the number of variations that can be represented by eight switches that have two positions - eight bits, or a byte.

    Lol, weird way to say that 256 is a power of two, and computers operate in base two.

    • JackbyDev@programming.dev
      2 months ago

      It’s a pretty succinct explanation that links what it is to something most people have heard of (a byte).

    • InternetCitizen2@lemmy.world
      2 months ago

      Tbf saying it that way brings a visual metaphor. Simply giving it as a mathematical definition would leave it feeling just as arbitrary.

    • thisisnotgoingwell@programming.dev
      2 months ago

      It used to be a way bigger deal when computers were very memory scarce. If you needed to, say, represent 1024 values, you’d use 10 bits, which in practice means 2 bytes. The remaining 6 bits could be used to store other related information like flags, but more often than not they would be wasted (unused bits that still have to be stored as 0s).
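Here is roughly what that packing looks like, as a sketch: a 10-bit value and 6 flag bits sharing one 16-bit word. The layout (flags in the high bits, big-endian bytes) and the function names are arbitrary choices for the example.

```python
VALUE_BITS = 10                       # enough for 1024 distinct values
VALUE_MASK = (1 << VALUE_BITS) - 1    # 0b11_1111_1111
FLAG_BITS = 16 - VALUE_BITS           # the 6 bits left over in two bytes

def pack(value: int, flags: int) -> bytes:
    """Combine a 10-bit value and 6 flag bits into one 16-bit word."""
    if not 0 <= value <= VALUE_MASK:
        raise ValueError("value needs more than 10 bits")
    if not 0 <= flags < (1 << FLAG_BITS):
        raise ValueError("flags need more than 6 bits")
    return ((flags << VALUE_BITS) | value).to_bytes(2, "big")

def unpack(word: bytes) -> tuple[int, int]:
    """Split the 16-bit word back into (value, flags)."""
    raw = int.from_bytes(word, "big")
    return raw & VALUE_MASK, raw >> VALUE_BITS

word = pack(1023, 0b101001)
print(word.hex(), unpack(word))
```

If the flags are not needed, those six bits are simply stored as zeros, which is the waste the comment describes.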

      These numbers are pretty arbitrary nowadays but they still show up a lot in computing. They didn’t choose 256 so they could represent it in a byte, the real reason is probably that groups larger than 256 can’t realistically be managed by users.

      That’s my 2¢ anyways.