# Scene dialog lines

Audioware allows you to use custom audio in Quest Scene dialog lines, which can be freely mixed with vanilla ones.
## Preamble
A scene, a.k.a. `scnSceneResource`, contains multiple nodes, a.k.a. `scnSectionNode`, chained together to create, for example, a dialogue between V and NPCs, gestures, animations, or effect sequences.
Thanks to the WolvenKit Editor, you can already visualize Quests, including their Phases and Scenes.
A Scene Section node can contain a single dialog line event, a.k.a. `scnDialogLineEvent`, or multiple successive ones.
The same concept applies to conversations over the holocall and during braindances, although those use different scene nodes that won't be covered here.
## Creating a dialogue
This documentation won't go over the basics of `.quest`, `.questphase` and `.scene`, as the REDmodding Wiki already does.
Aside from nodes, a dialog line itself requires roughly 4 components:
- actors
- audio asset(s) for the voice
- subtitle(s) to display on-screen
- lipsync animation(s) to be performed by the actor
Implementing the first three is relatively easy, while lipsync requires additional consideration, as explained below.
Consider first that, for most of the dialog lines in the game, you probably don't need any lipsync.
If you think about it for a second, most of V's dialogues happen in first-person perspective (fpp).
So unless V stands near a reflective surface or a mirror, or sits inside a car (where third-person perspective, or tpp, can be activated), you probably don't need to worry about lipsync.
The same goes for a large portion of NPCs: devices, drones, droids, cyborgs and mechs do not have lips.
Scavs with digital masks, and humanoids with clothing or cyberware covering their lips, also fall into this category.
Since the topic is hairy, a sample `.scene` (courtesy of MrBill) is provided in the repo; the following documentation is directly based on it so that you can follow along.
### Actor
In order to voice a dialogue you must first define the actors.
At a minimum, there's a speaker (the one who talks) and an addressee (the one being talked to).
The `actors` field in the `scnSceneResource` defines all the actor(s), except for the player's actor(s), which can be found under `playerActors`.
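For orientation, here's a heavily simplified sketch of what those two fields look like when a scene is opened as JSON in WolvenKit. Apart from `actors` and `playerActors` themselves, the field names and values below are illustrative only, not the exact `scnSceneResource` layout; refer to the sample `.scene` in the repo for the real thing:

```jsonc
// illustrative sketch: not the exact scnSceneResource schema
{
  "actors": [
    // one entry per non-player actor, e.g. Panam
    { "actorId": 1, "voiceTag": "panam" }
  ],
  "playerActors": [
    // V is defined separately from the other actors
    { "actorId": 0, "voiceTag": "v" }
  ]
}
```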
### Voice
For example, here's what adding a custom dialog line audio for Panam looks like in Audioware:
```yaml
# resources\r6\audioware\dialogtest\dialogs.yaml
version: 1.0.0
dialogs:
  13159259729229609924:
    en-us: ./en-us/vo/panam_f_b69f1500c3d57bc4.Mp3
```
In this snippet, the RUID `13159259729229609924` is actually the decimal representation of the hexadecimal `b69f1500c3d57bc4`. It is used to establish relationships between resources: this is how the RED engine knows which subtitle to pick for a given voice, which lipsync anim to play, etc.
In the RED engine, the same naming convention is usually observed for audio assets: `{actor}_{f,m,i}_{hex}.wem`, e.g. `panam_f_b69f1500c3d57bc4.wem`, where `f` stands for female, `m` for male, and `i` is used when the notion of gender is irrelevant.
The file path at the bottom indicates to Audioware where to load the audio asset from for a specific language, here `en-us`.
### Subtitle
Audioware does not handle subtitles for this feature, because the RED engine already does it perfectly.
#### subtitles entries
First, add subtitle entries:
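As a rough sketch, here's what an entry boils down to once the subtitles `.json` is opened in WolvenKit (the surrounding resource wrapper is omitted and the subtitle text is a placeholder; refer to the sample in the repo for the exact layout):

```jsonc
// simplified sketch: wrapper omitted, text is a placeholder
{
  "entries": [
    {
      // same string ID as the one used in dialogs.yaml
      "stringId": "13159259729229609924",
      // one subtitle variant per gender
      "femaleVariant": "Some custom line for Panam.",
      "maleVariant": "Some custom line for Panam."
    }
  ]
}
```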
Here you can notice the `stringId` `13159259729229609924` is the same as the one defined for the Voice.
Each entry is an (optionally genderized) subtitle for a given localization string ID (a.k.a. `locStringId`, or simply `stringId`).
#### subtitles map
Then map the subtitles:
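The exact layout is best copied from the sample in the repo; conceptually, an entry boils down to something like this (field names are illustrative, not the exact schema):

```jsonc
// illustrative sketch: field names are not the exact schema
{
  "entries": [
    {
      // a subtitles group name...
      "groupName": "dialogtest",
      // ...pointing to the subtitles file defined above
      "subtitleFile": "mod\\audiowaredialogtest\\localization\\en-us\\subtitles\\dialogtest.json"
    }
  ]
}
```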
It links the actual subtitles file to a subtitles group.
#### subtitles localization
Finally, don't forget to register the subtitles map in the `.xl` manifest:
```yaml
# resources\audiowaredialogtest.archive.xl
# ...
localization:
  # ...
  subtitles:
    en-us: mod\audiowaredialogtest\localization\en-us\subtitles\dialogtest_subtitlemap.json
```
It links to the map for a given language.
### Lipsync
At the time of writing, creating custom lipsync animations is not yet possible.
That being said, you can apply the trick that the movie industry has been using for over 50 years when dubbing movies into foreign languages: find a lipsync animation that roughly matches both the duration and the flow of the speech, and use it.
It's not perfect, but when carefully picked it usually does the trick.
You will usually be tempted to create the custom audio for your sentence first, then search for a corresponding lipsync animation in SoundDB.
Although it might seem counter-intuitive at first, the reverse workflow generally yields a better outcome: find an interesting sentence which roughly matches what you want your character to say, then generate the closest possible audio to match the lipsync!
#### lipsync animation
First, define the `.anims`: it contains the raw lipsync animations and the rig for the 3D model.
#### lipsync map
Then, define the `.lipmap`: it links the actors, by their voice tag, to the lipsync animations and a scene for a given language.
The scene path is identified by its FNV1A64 hash, which can be generated with WolvenKit's Hash Tool.
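Here again, copy the exact layout from the sample in the repo; conceptually it amounts to something like this (field names and values are illustrative, not the exact `.lipmap` schema):

```jsonc
// illustrative sketch: field names and values are placeholders
{
  "sceneEntries": [
    {
      // FNV1A64 hash of the scene path, generated with WolvenKit's Hash Tool
      "scenePathHash": "1234567890123456789",
      "actorMappings": [
        // the actor's voice tag, pointing to the .anims holding the lipsync animations
        { "voiceTag": "panam", "anims": "mod\\audiowaredialogtest\\lipsync\\dialogtest.anims" }
      ]
    }
  ]
}
```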
#### lipsync map localization
Don't forget to register the `.lipmap` in the `.xl` manifest:
```yaml
# resources\audiowaredialogtest.archive.xl
# ...
localization:
  # ...
  lipmaps:
    en-us: mod\audiowaredialogtest\localization\en-us\lipsync\dialogtest.lipmap
```
#### voiceover map
By default, the RED engine won't play lipsync at all if there's no associated `.wem`(s).
This is what you can define here: it links the localization string ID to the audio asset for each gender.
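As a simplified sketch, a `voiceovermap.json` entry looks roughly like this (the WolvenKit wrapper is omitted and the paths are placeholders; double-check the field names against the sample in the repo):

```jsonc
// simplified sketch: wrapper omitted, paths are placeholders
{
  "entries": [
    {
      "stringId": "13159259729229609924",
      // one audio asset per gender; here, the silent .wem described below
      "femaleResPath": "mod\\audiowaredialogtest\\vo\\silent_4900ms.wem",
      "maleResPath": "mod\\audiowaredialogtest\\vo\\silent_4900ms.wem"
    }
  ]
}
```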
#### silent .wem
Another peculiarity of the RED engine is that it will start playing a lipsync anim as long as a `.wem` exists, but if its duration does not match the lipsync anim's, it will abruptly stop it before reaching its end.
This is why, at the time of writing, creating silent `.wem`s with matching durations is a necessary evil.
The most common way to do this is with Wwise.
#### install Wwise
Go to their download section; note that you will also need to create an account.
#### create new project
You need to create a project to handle the conversion from `.wav` to `.wem`.
You can use the default suggested settings, but don't forget to set the `Conversion Settings` to `Vorbis Quality High`.
#### convert silent audio
Let's imagine you need an 8s-long silent `.wem` as a placeholder for your lipsync anim.
- create an empty `Sound SFX Container` under the `Default Work Unit` in the `Actor-Mixer Hierarchy`
- rename the container to e.g. `8000ms`
- import a silent `.wav`: for convenience you can find a 1h-long `.wav` in the repo, courtesy of DBK, that you can simply trim to the appropriate duration
  - don't forget to use `Sound SFX`
  - if you need multiple silent `.wem`s of different durations, it is fine to reuse the same asset
- then trim the audio, e.g. to `8` seconds
- before converting, make sure the `Conversion Settings` for the container are set to `Vorbis Quality High`
- convert the trimmed `.wav` to `.wem`
- copy the converted `.wem` from the Wwise cache (`.cache` folder by default): each asset is identified by its hexadecimal name, so it's important to select the appropriate one, usually the latest
- finally, paste it in your WolvenKit `archive`
#### voiceover map localization
Don't forget to register the `.json` in the `.xl` manifest:
```yaml
# resources\audiowaredialogtest.archive.xl
# ...
localization:
  # ...
  vomaps:
    en-us: mod\audiowaredialogtest\localization\en-us\vo\voiceovermap.json
```
#### string id variants map
Last but not least, there's an additional file which contains the durations (or lengths) of the string IDs' voiceovers.
If you browse vanilla files, you can find it under e.g. `base\localization\en-us\stringidvariantlengthsreport.json`.
At the time of writing, ArchiveXL does not support creating custom ones just yet, so here's how you can do it with Codeware instead:
```swift
class ATStringidVariantLengthsReportService extends ScriptableService {

  private cb func OnLoad() {
    // listen for the vanilla report resource being loaded...
    GameInstance
      .GetCallbackSystem()
      .RegisterCallback(n"Resource/PostLoad", this, n"OnPostLoad")
      .AddTarget(
        ResourceTarget.Path(r"base\\localization\\en-us\\stringidvariantlengthsreport.json")
      );
  }

  private cb func OnPostLoad(event: ref<ResourceEvent>) {
    let resource: ref<JsonResource> = event.GetResource() as JsonResource;
    let map: ref<locVoiceoverLengthMap> = resource.root as locVoiceoverLengthMap;
    // ...then append the duration (in seconds) of the custom line, for each gender
    let panamline1: locVoLengthEntry;
    panamline1.stringId = HashToCRUID(13159259729229609924ul);
    panamline1.femaleLength = 4.9;
    panamline1.maleLength = 4.9;
    ArrayPush(map.entries, panamline1);
  }
}
```
## Showcase
And here's the result (thanks to MrBill for the `.scene`), comparing:
| sentence | audio | lipsync |
|---|---|---|
| vanilla | vanilla wem | vanilla |
| vanilla | audioware mp3 | vanilla |
| modded | audioware mp3 | vanilla |
## Overview
Ok, this was a mouthful!
Here's a quick diagram summarizing the resources and their dependencies.
Big tokens of appreciation to MisterChedda, MrBill, DBK and Dedra, without whom this feature would probably never have been finished!