This is what happens to your brain when you read computer code

V H@lemmy.stad.social · 3 months ago

The age matters less than the power-dynamics of her being his nanny.

V H@lemmy.stad.social · 11 months ago

Quick iteration is definitely the big thing. (The eye is fun because it’s so “badly designed” - we’re stuck in a local maximum that just happens to be “good enough” for us to not overcome the big glaring problems)

And yes, if all the inputs are corrupted, the output will likely be too. But 1) they won’t all be, and as long as there’s a good mix, that will “teach” the network over time that the differences between a “corrupted cat” and an “uncorrupted cat” are irrelevant, because both will have most of the same labels associated with them. 2) These tools work by introducing corruption that humans aren’t meant to notice, so if the output has the same kind of corruption it doesn’t matter. It only matters to the extent the network “miscorrupts” the output in ways we do notice, enough that training it out becomes a cost drag.

But you can improve on that pretty much with feedback: Train a small network to recognize corruption, and then feed corrupted images back in as negative examples to teach it that those specific things are particularly bad.
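To make that feedback loop concrete, here is a rough toy sketch in plain Python. Everything in it is an assumption for illustration: the "images" are 1-D sine waves, `corrupt` is a made-up stand-in for a cloaking tool (real ones are designed to be far subtler), and a one-feature threshold stands in for the small detector network. It only shows the shape of the loop: fit a detector on labelled samples, then use it to pick out corrupted outputs to feed back as negative examples.

```python
import math
import random


def clean_signal(n, rng):
    """A smooth 1-D toy 'image': one sine period with a random phase."""
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return [math.sin(2.0 * math.pi * i / n + phase) for i in range(n)]


def corrupt(signal, eps=0.05):
    """Hypothetical cloaking tool: adds a small alternating-sign perturbation."""
    return [v + (eps if i % 2 == 0 else -eps) for i, v in enumerate(signal)]


def roughness(signal):
    """Log of the mean squared second difference (a crude high-frequency energy)."""
    d2 = [signal[i - 1] - 2.0 * signal[i] + signal[i + 1]
          for i in range(1, len(signal) - 1)]
    return math.log(sum(d * d for d in d2) / len(d2))


def fit_detector(clean, corrupted):
    """Stand-in for the small network: a threshold halfway between class means."""
    mean_clean = sum(map(roughness, clean)) / len(clean)
    mean_bad = sum(map(roughness, corrupted)) / len(corrupted)
    threshold = (mean_clean + mean_bad) / 2.0
    return lambda signal: roughness(signal) > threshold


def collect_negatives(outputs, is_corrupted):
    """Keep the outputs the detector flags, to reuse as negative examples."""
    return [s for s in outputs if is_corrupted(s)]


if __name__ == "__main__":
    rng = random.Random(0)
    labelled_clean = [clean_signal(64, rng) for _ in range(20)]
    labelled_bad = [corrupt(s) for s in labelled_clean]
    detector = fit_detector(labelled_clean, labelled_bad)

    fresh = [clean_signal(64, rng) for _ in range(10)]
    outputs = fresh + [corrupt(s) for s in fresh]
    negatives = collect_negatives(outputs, detector)
    print(f"flagged {len(negatives)} of {len(outputs)} outputs as negative examples")
```

In a real pipeline the detector would be a learned network over actual images, and the flagged samples would go back into training with a label or loss term marking them as undesirable.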

Picking up and labelling small sample sets of types of corruption humans will notice is pretty much the worst case realistic effect these tools will end up having. But each such countermeasure will contribute to training sets that make further corruption progressively harder. Ultimately these tools are strictly limited because they can’t introduce anything that makes the images uglier to humans, and so you “just” need to teach the models more about the limits of human vision, and in the long run that will benefit the models in any case.

V H@lemmy.stad.social · 1 year ago

It doesn’t need to “develop its own style”. That’s the point. The more examples of these adversarial images are in the training set, the better it will learn to disregard the adversarial modifications, and still learn the same style. As much as you might want to stop it from learning a given style, as long as the style can be seen, it can be copied - both by humans and AIs.

V H@lemmy.stad.social · 1 year ago

As long as people aren’t ready for it, then it doesn’t solve the immediate problem that needs to be solved today.

V H@lemmy.stad.social · 1 year ago

I live in the UK and I don’t drink beer because I generally, ironically, think beer overall tastes like piss, and yet I still know Tsingtao. It has fairly substantial market recognition in a lot of countries.

V H@lemmy.stad.social · 1 year ago

You wouldn’t want to. If you just feed it to the models, then if there are enough of these images to matter the model will learn to ignore the differences. You very specifically don’t want to prevent the model from learning to overcome these things, exactly because if you do you’re stuck with workarounds like that forever, but if you don’t the model will just become more robust to noisy data like this.

V H@lemmy.stad.social · 1 year ago

An AI model will “notice them” but ignore them if trained on enough copies with them to learn that they’re not significant.

V H@lemmy.stad.social · edit-2 1 year ago

Yes: Train on more images processed by this.

In other words: If the tool becomes popular it will be self-defeating by producing a large corpus of images teaching future models to ignore the noise it introduces.
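As a toy illustration of that self-defeating effect (not a model of any real tool), here is a sketch in plain Python. The assumptions are all mine: each sample has a "visible" feature that humans would go by and a "brittle" feature that a naively trained model latches onto, and the hypothetical `cloak` function flips only the brittle one. A logistic regression trained on clean data alone is fooled by cloaked inputs; the same model trained on a mix of clean and cloaked data learns to ignore the brittle feature.

```python
import math
import random


def make_data(n, rng):
    """Feature 0: noisy visible cue. Feature 1: strong but brittle cue."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        s = 2 * label - 1  # map label {0, 1} to sign {-1, +1}
        X.append([0.5 * s + rng.gauss(0.0, 0.15), 1.0 * s + rng.gauss(0.0, 0.05)])
        y.append(label)
    return X, y


def cloak(X):
    """Hypothetical cloaking tool: flips the brittle feature, leaves the visible one."""
    return [[x0, -x1] for x0, x1 in X]


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(X, y, steps=300, lr=0.5):
    """Plain batch gradient descent on the logistic loss."""
    w, b, n = [0.0, 0.0], 0.0, len(y)
    for _ in range(steps):
        g0 = g1 = gb = 0.0
        for (x0, x1), label in zip(X, y):
            err = sigmoid(w[0] * x0 + w[1] * x1 + b) - label
            g0 += err * x0
            g1 += err * x1
            gb += err
        w[0] -= lr * g0 / n
        w[1] -= lr * g1 / n
        b -= lr * gb / n
    return w, b


def accuracy(model, X, y):
    w, b = model
    hits = sum((w[0] * x0 + w[1] * x1 + b > 0) == (label == 1)
               for (x0, x1), label in zip(X, y))
    return hits / len(y)


if __name__ == "__main__":
    rng = random.Random(0)
    X_train, y_train = make_data(300, rng)
    X_extra, y_extra = make_data(300, rng)  # images that users ran through the tool
    X_test, y_test = make_data(300, rng)

    clean_model = train_logreg(X_train, y_train)
    mixed_model = train_logreg(X_train + cloak(X_extra), y_train + y_extra)

    for name, model in (("clean-only", clean_model), ("mixed", mixed_model)):
        print(name,
              "clean acc:", accuracy(model, X_test, y_test),
              "cloaked acc:", accuracy(model, cloak(X_test), y_test))
```

The point of the toy is only that the cloaked samples themselves are what teach the second model which feature not to trust; nothing about the model or training loop had to change.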

There are likely easier “quick fixes” while waiting for new models, but this is the general fix that will work against almost any adversarial attack like this.

There might be theoretical attacks that’d be somewhat more difficult to overcome, to the extent of requiring tweaks to the models. But humans demonstrably can translate text to images in a way that overcomes any adversarial method they don’t notice, so there will inherently always be a way to beat such methods.

V H@lemmy.stad.social · 1 year ago

That’s hilarious, given that if these tools become remotely popular the users of the tools will provide enough adversarial data for the training to overcome them all by itself, so there’s little reason for anyone with access to A100s to bother trying - they’ll either be a minor nuisance used by a tiny number of people, or be self-defeating.

V H@lemmy.stad.social · edit-2 1 year ago

To me, that’s not an argument for regulating AI, though, because most regulation we can come up with will benefit those with deep enough pockets to buy themselves out of the problem, while solving nothing.

E.g. as I’ve pointed out in other debates like this, Getty Images has a market cap of <$2bn. OpenAI may have had a valuation in the $90bn range. Google, MS, Adobe all also have share prices that would trivially allow them to purchase someone like Getty to get ownership of a large training set of photos. Adobe already has rights to a huge selection via their own stock service.

Bertelsmann owns Penguin Random House and a range of other publishing subsidiaries. Its market cap is around 15 billion Euro. Also well within price for a large AI contender to buy to be able to insert clauses about AI rights. (You think authors will refuse to accept that? All but the top sellers will generally be unable to afford to turn down a publishing deal, especially if it’s sugar-coated enough, but the publishers also sit on a shit-ton of works where the source text is out-of-copyright but they own the rights to the translations outright as works-for-hire)

That’s before considering simply hiring a bunch of writers and artists to produce data for hire.

So any regulation you put in place to limit the use of copyrighted works only creates a “tax” effectively.

E.g. OpenAI might not be able to copy artist X’s images, but they’ll be able to hire artist Y on the cheap to churn out art in artist X’s style for hire, and then train on that. They might not be able to use author Z’s work, but they can hire a bunch of hungry writers (published books sell ca. 200 copies on average; the average full-time author in the UK earns below minimum wage from their writing) as a content farm.

The net result for most creators will be the same.

Ever wonder why Sam Altman of OpenAI has been lobbying about the dangers of AI? This is why. And it’s just the start. As soon as these companies have enough capital to buy themselves access to data, regulations preventing training on copyrighted data will be them pulling up the drawbridge and making it cost-prohibitive for people to build open, publicly accessible models in ways that can be legally used.

And in doing so they’ll effectively get to charge an “AI tax” on everyone else.

If we’re going to protect artists, we’d be far better off finding other ways of compensating them for the effects, not least because it will actually provide them some protection.

V H@lemmy.stad.social · edit-2 1 year ago

You can see the difference in the process in the results, for example in how some generated pictures will contain something like a signature in the corner

If you were to train human children on an endless series of pictures with signatures in the corner, do you seriously think they’d not emulate signatures in the corner?

If you think that, you haven’t seen many children’s drawings, because children also often pick up that it’s normal to put something in the corner, despite the fact that to children pictures with signatures are a tiny proportion of visual input.

Or how it is at least possible to get the model to output something extremely close to the training data

People also mimic. We often explicitly learn to mimic - e.g. I have my son’s art folder right here, full of examples of him being explicitly taught to make direct copies as a means to learn technique.

We just don’t have very good memory. This is an argument for a difference in ability to retain and reproduce inputs, not an argument for a difference in methods.

And again, this is a strawman. It doesn’t even begin to try to answer the questions I asked, or the one raised by the person you first responded to.

That at least proves that the process is quite different to the process of human learning.

Neither of those really suggests that at all (that diffusion is different from how humans learn to generalize images is likely true, but what you’ve described does not provide even the start of any evidence of that), but again that is a strawman.

There was no claim they work the same. The question raised was how the way they’re trained is different from how a human learns styles.

V H@lemmy.stad.social · 1 year ago

Society is built to distribute wealth, so that everyone can live a decent life.

As a goal, I admire it, but if you intend this as a description of how things are it’d be boundlessly naive.

V H@lemmy.stad.social · 1 year ago

Human brains clearly work differently than AI, how is this even a question?

It’s not all that clear that those differences are qualitatively meaningful, but that is irrelevant to the question they asked, so this is entirely a strawman.

Why does the difference in how AI and the brain learn make training AI with art different to a person studying art styles? Both learn to generalise features that allow them to reproduce them. Both can do so without copying specific source material.

The term “learning” in machine learning is mainly a metaphor.

How does the way they learn differ from how humans learn? They generalise. They form “world models” of how information relates. They extrapolate.

Also, laws are written with a practical purpose in mind - they are not some universal, purely philosophical construct and never have been.

This is the only uncontroversial part of your answer. The main reason why courts will treat human and AI actions differently is simply that they are not human. It will for the foreseeable future have little to do with whether the processes are similar enough to how humans do it.

V H@lemmy.stad.social · 1 year ago

They don’t even need to detect them - once they are common enough in training datasets the training process will “just” learn that the noise they introduce is not a feature relevant to the desired output. If there are enough images like that it might eventually generate images with the same features.

V H@lemmy.stad.social · 1 year ago

Trying to detect poisoned images is the wrong approach. Include them in the training set and the training process itself will eventually correct for it.

I think if you build more robust features

Diffusion approaches etc. do not involve any conscious “building” of features in the first place. The features are trained by training the net to match images with text features correctly, and then “just” repeatedly predict how to denoise an image to get closer to a match with the text features. If the input includes poisoned images, so what? It’s no different than e.g. compression artifacts, or noise.

These tools all target models trained without any images processed by the tool in the training set, with at most fine-tuning. All they show is that models trained without having seen many images using that particular tool will struggle.

But in reality, the massive problem with this is that we’d expect any such tool that becomes widespread to be self-defeating, in that it becomes a source for images that will work their way into the models at a sufficient volume that the model will learn them. In doing so it will make the models more robust against noise and artifacts, and so make the job harder for the next generation of these tools.

In other words, these tools basically act like a manual adversarial training source, and in the long run the main benefit coming out of them will be that they’ll prod and probe at failure modes of the models and help remove them.

V H@lemmy.stad.social · 1 year ago

I’m just very tickled at how much it backfired - Lewis turned outright anti-Catholic. If I’d been a religious man I might have tried to read something into that (but I’m not, so).

V H@lemmy.stad.social · 1 year ago

Yes, she is free to be a giant asshole with a persecution complex. And we are free to call her one.

V H@lemmy.stad.social · 1 year ago

The funny thing is we can blame Tolkien for that. It was Tolkien who got Lewis to convert, though he became a Protestant while Tolkien was a Catholic, and hilariously Tolkien found Lewis’ use of Christian symbolism too overdone and lacking in subtlety.

V H@lemmy.stad.social · 1 year ago

I’ve never read the books, but I did enjoy the movies, and it’s really disappointing. I have the DVDs, so I guess I could still watch those knowing it won’t signal any continued demand the way streaming them would, but still.

V H@lemmy.stad.social · edit-2 1 year ago

That is a reason for arguing that people don’t always make smart choices. It is however not an argument for claiming how people vote does not show what their preference is at the time of voting, which is what is relevant here.

It’s perfectly fine to argue you think it’s stupid of people to want to read about Musk, but the votes clearly show they do in fact want to.