If this is the way to superintelligence, it remains a bizarre one. “This is back to a million monkeys typing for a million years generating the works of Shakespeare,” Emily Bender told me. But OpenAI’s technology effectively crunches those years down to seconds. A company blog boasts that an o1 model scored better than most humans on a recent coding test that allowed participants to submit 50 possible solutions to each problem—but only when o1 was allowed 10,000 submissions instead. No human could come up with that many possibilities in a reasonable length of time, which is exactly the point. To OpenAI, unlimited time and resources are an advantage that its hardware-grounded models have over biology. Not even two weeks after the launch of the o1 preview, the start-up presented plans to build data centers that would each require the power generated by approximately five large nuclear reactors, enough for almost 3 million homes.

https://archive.is/xUJMG

  • Kongar@lemmy.dbzer0.com · 45 upvotes · 14 days ago

    I’ve been playing around with AI a lot lately for work purposes. A neat trick that LLM vendors like OpenAI have pushed onto the scene is the ability for a large language model to “answer questions” on a dataset of files. This is done by building a RAG (retrieval-augmented generation) agent. It’s neat, but I’ve come to two conclusions after about a year of screwing around.

    1. It’s pretty good with words, for example summarizing multiple documents. But it’s still pretty terrible at data. For example, scanning through an Excel or CSV log export and asking it to perform a calculation: “based on this badge data, how many people and who is in the building right now”. It would be super helpful to get answers to those types of questions, but I haven’t found any tool or combination of models that can do it accurately even most of the time. I think this is exactly what happened to Spotify Wrapped this year: instead of doing the data analysis, they tried to have an LLM/RAG agent do it, and it’s hallucinating.
    2. These models can be run locally and just about as fast. Yeah, it takes some nerd power to set these up now, but it’s only a short matter of time before it’s as simple as installing a program. I can’t imagine how companies like OpenAI are going to survive.
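
    For comparison, the badge question in point 1 is a few lines of ordinary, deterministic code. This is only a minimal sketch: the column names (`timestamp`, `name`, `direction`), the in/out convention, and the sample rows are all assumptions made up for illustration, not any real badge system's export format.

```python
import csv
import io


def people_in_building(csv_text):
    """Return the sorted names whose most recent badge event is an entry."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # ISO 8601 timestamps sort correctly as plain strings.
    rows.sort(key=lambda row: row["timestamp"])
    last_event = {}
    for row in rows:
        # Later rows overwrite earlier ones, so only the latest event survives.
        last_event[row["name"]] = row["direction"].strip().lower()
    return sorted(name for name, direction in last_event.items() if direction == "in")


# Hypothetical sample export, invented for this example.
SAMPLE = """timestamp,name,direction
2024-12-01T08:00:00,Alice,in
2024-12-01T08:05:00,Bob,in
2024-12-01T12:00:00,Alice,out
2024-12-01T12:30:00,Carol,in
2024-12-01T13:00:00,Alice,in
2024-12-01T17:00:00,Bob,out
"""

inside = people_in_building(SAMPLE)
print(len(inside), inside)
```

    Unlike an LLM reading the file, this gives the same answer every time for the same input, which is the property the badge question needs.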
    • 9488fcea02a9@sh.itjust.works · 44 upvotes · 14 days ago

      This is exactly how we use LLMs at work… The LLM is trained on our work data so it can answer questions about meeting notes from 5 years ago or something. There are a few genuinely helpful use cases like this amongst a sea of hype and mania. I wish Lemmy would understand this instead of having just a blanket policy of hate on everything AI.

      The Spotify thing is so stupid… There is simply no use case here for AI. Just spit back some numbers from my listening history like in the past. No need to have AI commentary and hallucination.

      The even more infuriating part of all this is that I can think of ways that AI/ML (not necessarily LLMs) could actually be really useful for Spotify. Like tagging genres, styles, instruments, etc… “Spotify, find me all songs by X with Y instrument in them…”

      • conciselyverbose@sh.itjust.works · 42 upvotes · edited · 14 days ago

        The problem is that the actual use cases (which are still incredibly unreliable) don’t justify even 1% of the investment or energy usage the market is spending on them. (Also, as you mentioned, there are actual approaches that are useful that aren’t LLMs that are being starved by the stupid attempt at a magic bullet.)

        It’s hard to be positive about a simple, moderately useful technology when every person making money from it is lying through their teeth.

      • HubertManne@moist.catsweat.com · 4 upvotes · 14 days ago

        To me, this is what it’s useful for. There’s so much reinventing the wheel at places, but if the proper information could be found quickly enough, we could use a wheel we already have.