• shagie@programming.dev
      4 upvotes · 1 year ago

      I find it difficult to give much weight to the argument that “generating an LLM based on Stack Overflow content without attribution is wrong” when people are knowingly and intentionally violating the CC BY-SA license in their own code.

      • davehtaylor@beehaw.org
        5 upvotes · 1 year ago

        Two wrongs don’t make a right.

        It’s also totally fucking different when someone on SO asks for help with their homework or with an nginx server on their home network, versus when some tech firm decides to scrape 15 years’ worth of information created by countless people and then spit it back out, pretending it’s some novel solution.

        As I said in my original comment, I’m no fan of SO. But neither the behavior of the site nor that of the people who lurk and copy justifies what LLMs are doing.

        • shagie@programming.dev
          2 upvotes · 1 year ago

          We should pursue license violations of openly licensed material with equal effort, no matter what the source. Ignoring some violations while preaching fire and brimstone about others weakens both the argument and the license on which it is founded.

          When you are trying to enforce a license, it becomes harder to prosecute if the other side can say “you are doing exactly what you accuse us of doing.”

          While two wrongs don’t make a right, two wrongs will substantially complicate prosecuting just one of them.

          I am not arguing about the morality of one or the other… or how insignificant one of them is in comparison to the other.

          My issue with just pointing to the LLM is about the integrity and enforceability of open source licenses.