• VintageGenious@sh.itjust.works · +77/-15 · 19 hours ago

Because you’re using it wrong. It’s good for generative text and chains of thought, not symbolic calculation like math or linguistics.

    • Grandwolf319@sh.itjust.works · +15 · 16 hours ago

      “Because you’re using it wrong.”

      No, I think you mean to say it’s because you’re using it for the wrong use case.

      Well, this tool has been marketed as if it would handle such use cases.

      I don’t think I’ve actually seen any AI marketing that was honest about what it can do.

      I personally think image recognition is the best use case as it pretty much does what it promises.

    • Prandom_returns@lemm.ee · +1/-1 · 8 hours ago

      So for something you can’t objectively evaluate? Looking at Apple’s garbage generator, LLMs aren’t even good at summarising.

      • lime!@feddit.nu · +13 · edited · 17 hours ago

        I’m still not entirely sold on them, but since I’m currently using one that the company subscribes to, I can give a quick opinion:

        I had an idea for a code snippet that could save me some headache (a mock for primitives in Lua, to be specific), but I foresaw some issues with commutativity (i.e., how to make sure that a + b == b + a). So I asked about this, and the LLM created some boilerplate to test the code. I’ve been chatting with it for about half an hour and testing the code it produces, and had it expand the idea to all possible metamethods available on primitive types, together with about 50 test cases with descriptive assertions. I’ve now run into an issue where the __eq metamethod isn’t firing correctly when one of the operands is a primitive rather than a mock, and after having the LLM link me to the relevant part of the docs, that seems to be a feature of the language rather than a bug.

        So in 30 minutes I’ve gone from a loose idea to a well-documented proof of concept to a roadblock that can’t really be overcome: a complete exploration and feasibility study, fully tested, in less than an hour.

      • L3s@lemmy.world · +28/-8 · edited · 19 hours ago

        Writing customer/company-wide emails is a good example. “Make this sound better: we’re aware of the outage at Site A, we are working as quick as possible to get things back online”

        Dumbing down technical information is another: “word this so a non-technical person can understand: our DHCP scope filled up and there were no more addresses available for Site A, which caused the temporary outage for some users.”

        Another is feeding it an article and asking for a summary; https://hackingne.ws/ does that for its Bsky posts.

        Coding is another good example: “write me a Python script that moves all files in /mydir to /newdir.”
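
        For illustration, a minimal sketch of the kind of script that prompt tends to produce (the /mydir and /newdir paths are just the placeholders from the prompt above, not real directories):

        ```python
        import shutil
        from pathlib import Path

        src = Path("/mydir")
        dst = Path("/newdir")
        dst.mkdir(parents=True, exist_ok=True)  # make sure the target directory exists

        # move every regular file (not subdirectories) from /mydir into /newdir
        for f in src.iterdir():
            if f.is_file():
                shutil.move(str(f), str(dst / f.name))
        ```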

        Asking it to summarize a theory or protocol: “explain to me why RIP was replaced with RIPv2, and what problems people have had with RIPv2 since.”

          • L3s@lemmy.world · +1/-1 · edited · 12 hours ago

            My experience has been very different, though I sometimes have to add to what it summarized. The Bsky account mentioned is a good example: most of the posts are summarized very well, but every now and then there will be one that isn’t as accurate.

        • snooggums@lemmy.world · +5/-5 · edited · 18 hours ago

          The dumbed-down text is basically as long as the prompt. Plus you have to double-check it to make sure it didn’t write “outrage” instead of “outage”, just like you would if you had written it yourself.

          How do you know the answer about why RIP was replaced with RIPv2 is accurate, and not just a load of bullshit like putting glue on pizza?

          Are you really saving time?

          • L3s@lemmy.world · +5 · edited · 18 hours ago

            Yes, I’m saving time. As I mentioned in my other comment:

            Yeah, normally my “Make this sound better” or “summarize this for me” is a longer wall of text that I want to simplify; I was trying to keep my examples short.

            And

            and helps correct my shitty grammar at times.

            And

            Hallucinations are a thing, so validating what it spits out is definitely needed.

            • snooggums@lemmy.world · +3/-8 · 18 hours ago

              How do you validate the accuracy of what it spits out?

              Why don’t you skip the AI and just use the thing you use to validate the AI output?

              • L3s@lemmy.world · +3/-2 · 17 hours ago

                Most of what I’m asking it about are things I already have a general idea of, and AI is good at giving short explanations of complex things. So typically it’s easy to spot a hallucination, and the pieces I don’t already know are easy to Google and verify.

                Basically, I can get a shorter response that gives the same outcome and just validate the small pieces, which saves a lot of time (I no longer have to read a 100-page white paper; instead I read a few paragraphs and then verify the small bits).

            • snooggums@lemmy.world · +3/-4 · edited · 18 hours ago

              If the amount of time it takes to create the prompt is the same as it would have taken to write the dumbed-down text, then the only time you saved was by not learning how to write dumbed-down text. Plus you need to know what dumbed-down text should look like to judge whether the output is dumbed down but still accurate.

      • chaosCruiser@futurology.today · +5/-1 · edited · 17 hours ago

        “Here’s a bit of code that’s supposed to do stuff. I got this error message. Any ideas what could cause this error and how to fix it? Also, add this new feature to the code.”

        That works reasonably well as long as you have some idea of how to write the code yourself. GPT can do it in a few seconds; debugging the result might take 5–10 minutes, but that’s still faster than my best effort. Besides, GPT is also fairly fluent in many functions I have never used before. My approach would be clunky and convoluted, while the code generated by GPT is a lot shorter.

        If you’re already familiar with the code you’re working on, GPT’s code will look convoluted by comparison. In that case, you can ask GPT for the rough alpha version and do the debugging and refining yourself in a few minutes.

        • Windex007@lemmy.world · +4/-5 · 17 hours ago

          That makes sense as long as you’re not writing code that needs to know how to do something as complex as …checks original post… count.

          • TimeSquirrel@kbin.melroy.org · +2 · 16 hours ago

            It can do that just fine, because it has seen enough examples of working code. It can’t directly count correctly, sure, but it can write “i++;”, incrementing a variable by one in a loop and returning the result. The computer running the generated program is going to be doing the counting.
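
            To make that concrete, here is a rough sketch of the sort of loop an LLM will happily generate for a counting task (the count_char helper and the example word are made up purely for illustration); the actual counting happens when the program runs, not inside the model:

            ```python
            def count_char(text: str, target: str) -> int:
                # the loop does the counting at run time;
                # the model only has to produce the code, not the answer
                count = 0
                for ch in text:
                    if ch == target:
                        count += 1  # the "i++;" step from the comment above
                return count

            print(count_char("strawberry", "r"))  # prints 3
            ```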

      • The Hobbyist@lemmy.zip · +6 · 19 hours ago

        One thing I find useful is turning installation/setup instructions into Ansible roles and tasks. If you’re unfamiliar, Ansible is a tool for automated configuration of large-scale server infrastructure. In my case I only manage two servers, but it’s still useful to parse instructions and convert them to Ansible, and it helps me learn and understand Ansible at the same time.

        Here is an example of the kind of instructions I find interesting: how to set up Docker on Alpine Linux: https://wiki.alpinelinux.org/wiki/Docker

        Results are actually quite good even for smaller 14B self-hosted models like the distilled versions of DeepSeek, though I’m sure there are other usable models too.

        I also find it helpful for assisting with programming (both to execute and to learn).

        I would not rely on it for factual information, but it usually does a decent job of pointing you in the right direction. Another use I have is help with spell-checking in a foreign language.

      • chiisana@lemmy.chiisana.net · +8/-5 · 19 hours ago

        Ask it for a second opinion on medical conditions.

        Sounds insane, but they are leaps and bounds better than blindly Googling and self-diagnosing every condition under the sun when the symptoms only vaguely match.

        Once the LLM helps you narrow in on a couple of possible conditions based on the symptoms, then you can dig deeper into those specific ones, learn more about them, and have a slightly more informed conversation with your medical practitioner.

        They’re not a replacement for your actual doctor, but they can help you learn and have better discussions with your actual doctor.

        • Wogi@lemmy.world · +6/-7 · 19 hours ago

          So can WebMD. We didn’t need AI for that. Googling symptoms is a great way to just be dehydrated and suddenly think you’re in kidney failure.

          • chiisana@lemmy.chiisana.net · +5/-3 · 18 hours ago

            We didn’t stop trying to make faster, safer, and more fuel-efficient cars after the Model T, even though it could get us from point A to point B just fine. We didn’t stop pushing for digital access to published content, even though we have physical libraries. Just because something satisfies a use case doesn’t mean we should stop advancing technology.

            • snooggums@lemmy.world · +3/-1 · 18 hours ago

              AI is slower, less efficient, and less accurate than the older search algorithms.

            • Wogi@lemmy.world · +1/-1 · 18 hours ago

              We also didn’t make the Model T suggest replacing the engine when the oil light comes on. Cars, as it happens, aren’t that great at self-diagnosis, despite that technology being far simpler and further along than generative models are. I don’t trust the model to tell me what temperature to bake a cake at; I’m sure as hell not going to trust it with medical information. Googling symptoms was risky at best before. It’s a horror show now.

      • Voyajer@lemmy.world · +20/-1 · edited · 19 hours ago

        This, but actually. Don’t use an LLM to do things LLMs are known to be bad at. That said, the various companies would do well to list out specifically what their tools are bad at, so that using them doesn’t require background knowledge in the first place; not unlike needing to somehow know that one corner of those old iPhones was an antenna and that you shouldn’t bridge it.

        • sugar_in_your_tea@sh.itjust.works · +4/-1 · 18 hours ago

          Yup, the problem with that iPhone (4?) wasn’t that it sucked, but that it had limitations. You could just put a case on it and the problem went away.

          LLMs are pretty good at a number of tasks, and they’re also pretty bad at a number of tasks. They’re pretty good at summarizing, but don’t trust the summary to be accurate, just to give you a decent idea of what something is about. They’re pretty good at generating code, just don’t trust the code to be perfect.

          You wouldn’t use a chainsaw to build a table, but it’s pretty good at making big things into small things, and cleaning up the details later with a more refined tool is the way to go.

          • snooggums@lemmy.world · +4/-3 · 18 hours ago

            “They’re pretty good at summarizing, but don’t trust the summary to be accurate, just to give you a decent idea of what something is about.”

            That is called being terrible at summarizing.

            • sugar_in_your_tea@sh.itjust.works · +5/-2 · 17 hours ago

              That depends on how you use it. If you need the information from an article but don’t want to read it, I agree, an LLM is probably the wrong tool. If you have several articles and want to decide which one has the information you need, an LLM is a pretty good option.

      • TheGrandNagus@lemmy.world · +10 · 18 hours ago

        I think there’s a fundamental difference between someone saying “you’re holding your phone wrong, of course you’re not getting a signal” to millions of people and someone saying “LLMs aren’t good at the task you’re asking them to perform, but they are good for XYZ.”

        If someone is using a hammer to cut down a tree, they’re going to have a bad time. A hammer is not a useful tool for that job.