‘Cyberpunk 2077’ Used AI For A Dead Voice Actor’s Performance, With Permission

stopthatgirl7@kbin.social · 1 year ago


kromem@lemmy.world · 1 year ago

In general, lay audiences have a very weird relationship to AI as a topic, likely due in large part to decades of what is effectively propaganda from sci-fi anthropomorphizing it as an ‘other’ to serve as a threat for human protagonists.

The reality of how this all is going to play out is that low skill work like walla walla filler for audioscapes will be AI generated instead of library sourced, which is going to sound better and not really make much difference to labor markets.

Mid-range performances like NPCs will be AI generated from libraries where the actual voice actors creating that voice will be paid out residuals for the duration of voice generation, and you’ll likely have better performances than most side quest NPCs get today by the next generation of consoles.

High end performance for key characters will still be custom hires directed for the specific role, likely with additional contract terms for extended generation for things like Bethesda’s radiant questlines.

The latter is going to be the biggest hurdle to figure out terms for. What would be ideal for the player is a near-infinite variety of branching questlines in an open world, all fully voiced. But if each branch were considered its own X hours of generation under contract, that wouldn’t be feasible, and it would ultimately price human actors out of the market down the road in favor of fully artificial alternatives. So it will probably be something like X hours of parallel generation (i.e. infinite variety, but if a playthrough only contains an additional 200 hours’ worth, it’s priced at 200 hours of generation).

But as can be seen in the article, it’s not as simple as waving a hand and having AI voice lines - this was work done on top of a different actor’s performance to bring the voice in line with the original performer.

And given there’s still going to be a few years as the tech improves with significant overlap of needing to work with actors to get performances right, this is all going to get managed in acceptable ways.

You don’t see people losing their minds over improved facial animation rigs taking away mocap sessions from actors, even though that’s a reality of improved tech. But it doesn’t have the scary ‘AI’ in the name (even though the tech is generally going to lean more and more on machine learning), so it flies under the radar.

Ultimately, being able to take a static voice performance into dynamic extended content is going to be one of the best things to ever happen to video games, and given how much of that is going to rely on human performance and union buy-in, I wouldn’t even be surprised if the eventual leading product offering ends up owned and operated by the trade unions or a number of the actors themselves.

TheHarpyEagle@lemmy.world · 1 year ago

“the actual voice actors creating that voice will be paid out residuals for the duration of voice generation”

In a perfect world, sure. In reality, we’ve seen that paying residuals is something companies won’t do if they can possibly help it. It’s one of the very issues being fought over in the strike negotiations right now.

“And given there’s still going to be a few years as the tech improves with significant overlap of needing to work with actors to get performances right, this is all going to get managed in acceptable ways.”

I admire your optimism, but I can’t share it. We’re so poorly prepared to deal with job losses associated with AI and automation in general, and I don’t see any movement on that front. If you’re relying on unions to get it done, know that they already have an uphill battle getting things they should’ve had years ago, let alone future protections against a rapidly changing market.

kromem@lemmy.world · 1 year ago

The thing is, for the next 3-5 years, talent holds all the cards here.

It will eventually flip such that the 20 of the 80/20 of a key character performance can be fully automated, but that’s years away.

Until then, studios that want high quality AI generated performances are going to need to be working intimately with the talent that can produce the baseline to scale out from.

And the whole job loss thing is honestly overblown. Of course there are going to be companies chasing short-term profits at the cost of long-term consequences, but the vast majority of those attempts are going to continue to blow up in their faces.

In reality, rather than labor demand staying constant as supply increases with synthetic labor, what’s going to happen is that labor demand spikes rapidly as supply increases.

You won’t see a game with a 40 person writing team reducing staff to 4 people to produce the same scope of game. You’ll see a 40 person writing team working to create a generative AI pipeline that enables an order of magnitude increase in world detail and scope.

It’s almost painful playing games these days seeing the gap between what’s here today and what’s right around the bend. Things like Cyberpunk 2077 having incredibly detailed world assets, dialogue, and interiors in the places the main and side quests take you, but NPCs on the sidewalk that all say the same things in voices that don’t even match the models, and most buildings off the beaten path being inaccessible.

Think of just how much work went into RDR2 having each NPC have unique-ish responses to a single round of dialogue from the PC, and how much of a difference that made to immersion while still being only surface deep.

Rather than “let’s fire everyone and keep making games the same way we did before,” staffing is going to continue to be around the same, but you’ll see indie teams able to produce games closer to what bigger studios make today, and big studios making games that would have been unthinkable just a few years ago.

The bar adjusts as technology advances. It doesn’t remain the same.

Yes, large companies are always going to try to pinch every penny. But the difference between a full voice synthesis performance over the next few years that skirts unions and a performer-integrated generation platform that’s tailored to the specific characters being represented is going to be night and day, and audiences/reviewers aren’t going to react well to cut corners as long as flagship examples of it being done right are being released too.

The fearmongering is being blown out of proportion, and at a certain point it actually becomes counterproductive. If too many people within a given sector buy into the BS and simply become obtusely contrarian to progress rather than adaptive, you’ll see them shoot themselves in the foot the same way the RIAA/MPAA did years ago, fighting tooth and nail to prevent digital distribution and leaving the door open for 3rd parties to seize the future rather than building and adapting into the future themselves (which, in retrospect, would have been the most profitable approach).