Sure, then whoever uses it to extract that text is infringing. If I memorize a copyrighted text, my brain is not an infringement; but if I publicly perform a recitation of that text, that act is infringing.
Really the precedent of search engines (and card catalogs and concordances before them) should yield the same result. Building an index of information about copyrighted works is librarianship; running off new copies of those works is infringement.
On the other hand, AI transparency is also an interesting problem. It may be that one day we can look at a set of neural network weights – or a human brain! – and say “these patterns here are where this system memorized Ginsberg’s ‘Kaddish’.” I hope we will not conclude that brains must be lobotomized to remove copyrighted memorized texts.
If we treat the model like a brain that “memorizes” copyrighted text and generates new text based on it, your statement is valid. However, this would also preclude any copyright claims on the model’s output, as the act of memorization isn’t a work. Only a work can infringe on other works, and whether the output of models should be defined as a “work” is still under heavy debate. Even if it is defined as a work, can a model hold copyright while not being a legal person? Who should bear the liability then? What if the output is modified by an editor? This rabbit hole goes deep.
I think that actually was ruled on a few months ago. No, the model cannot hold copyright, nor can the person who commissioned the model to create the work. I think where things are still a bit grey (someone correct me if I’m wrong) is when a person creates a work with the assistance of AI, so that it’s a mix of human- and AI-generated content.