5 Introduction to Audio Editing


As with editing visual assets, editing the soundtrack of your production is not only about removing or improving poor audio assets; it is also about creating a soundtrack that helps you to tell a story or accurately represent an event.

Editing systems offer the ability to add various layers or ‘tracks’ of sound, which can be blended or mixed to create a more complete atmosphere or mood for a programme (sometimes entirely replacing the sound associated with the video), so that images can be cut together without a jarring soundtrack.

Audio editing can also be used to place objects or characters within space according to their position relative to the point of view of the pictures. In simple terms, sounds generated by distant objects should be less distinct than sounds generated by closer objects. For example, a wide shot of a group of people around the bride and groom would sound odd if the only audio came from the groom’s radio mic. Similarly, a close-up interview recorded on a telephoto lens should not use the on-camera mic, as the viewer expects the sound to be just as ‘close-up’ as the image.

Providing your viewer with the correct audio landscape will help you to set the scene and tell your story. In documentary-style productions, editing the audio is the first part of the process, as the narration will often have to be assembled from many different takes to create a naturally flowing story. The picture edit will then require cutaways to ‘hide the joins’.

Most programmes will contain at least two layers of sound, namely ‘feature’ and ‘background’. The feature sound is the main audio track, which gives your viewer the main body of information. This might be a narration or dialogue from characters, but could equally be the sound of cars racing around a track. Background sound is the general ambient sound of the scene, either recorded alongside the feature sound or added separately from a wild track recording. In addition to feature and background sound, atmospheric sounds can also be added to or mixed into your soundtrack. These include music and sound effects, and are added either to create mood or to enhance the realism of the soundtrack.

5.1 Audio Cutting


As with visual asset editing, most of your audio edits will consist of straight cuts, and there are a few considerations to take into account to ensure that these edit points are not noticeable or jarring.

Unless you are trying to create a specific effect, the general level of sound on adjoining clips should be roughly equal and have the same tonal qualities. It’s also important not to cut in or out at a point where a feature sound is either starting or finishing. For instance, cutting short a feature character in the middle of saying something, or cutting mid-toll of the bells being rung outside a church at a wedding, will be jarring for the viewer.
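The idea of keeping levels roughly equal across a cut can be sketched numerically. The following is a minimal illustration, not a description of how any particular editing system works: it assumes audio is held as a list of float samples between -1.0 and 1.0, and the function names `rms_db` and `gain_to_match` are hypothetical. It only addresses overall level; tonal quality would need equalisation, which is not shown.

```python
import math

def rms_db(samples):
    """Return the RMS level of float samples (-1.0..1.0) in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        raise ValueError("clip is silent, so it has no meaningful level")
    return 10 * math.log10(mean_square)

def gain_to_match(outgoing, incoming):
    """Linear gain to apply to `incoming` so its RMS level equals that of `outgoing`."""
    diff_db = rms_db(outgoing) - rms_db(incoming)
    return 10 ** (diff_db / 20)

# Hypothetical clips: the incoming clip was recorded at half the amplitude.
outgoing = [0.5, -0.5] * 100
incoming = [0.25, -0.25] * 100
gain = gain_to_match(outgoing, incoming)
matched = [s * gain for s in incoming]
```

In practice you would judge the match by ear and on the meters as well, since two clips with the same RMS level can still sound different.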

5.2 Audio Fades & Cross Dissolves


An Audio Fade is similar to a visual fade, in that it’s a gradual increase or decrease of sound level and will often correspond with a fade in the picture information. A gradual fade from mute to full volume represents a subtle introduction to a scene or opening of a programme, whereas a fade from full volume to mute signifies an ending to a scene or programme. A fade to mute followed directly by a fade back up to full volume (normally accompanied by corresponding visual fades) usually signifies a change in time.
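A fade can be thought of as a gain that ramps between mute (0) and full volume (1) over a number of samples. The sketch below assumes a simple linear ramp and audio held as a list of float samples; the function names `fade_in` and `fade_out` are hypothetical, and real editing systems typically offer other curve shapes as well.

```python
def fade_in(samples, fade_len):
    """Ramp the first `fade_len` samples linearly from mute towards full volume."""
    n = min(fade_len, len(samples))
    out = list(samples)
    for i in range(n):
        out[i] = samples[i] * (i / n)  # gain rises from 0 and reaches 1 at sample n
    return out

def fade_out(samples, fade_len):
    """Ramp the last `fade_len` samples linearly from full volume down to mute."""
    n = min(fade_len, len(samples))
    out = list(samples)
    start = len(samples) - n
    for i in range(n):
        out[start + i] = samples[start + i] * (1 - (i + 1) / n)  # last sample is muted
    return out

# A constant full-level signal makes the gain ramp directly visible.
opening = fade_in([1.0] * 6, 4)
ending = fade_out([1.0] * 6, 4)
```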

A Cross Dissolve is a gradual mix between two soundtracks. As the incoming track volume increases, the outgoing track volume will decrease. Again, this will often correspond with a dissolve in the picture information.
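The opposing volume changes described above can be sketched as a weighted sum over the overlapping region. This is a minimal illustration assuming a linear dissolve and audio held as lists of float samples; the function name `cross_dissolve` and its `overlap` parameter are hypothetical.

```python
def cross_dissolve(outgoing, incoming, overlap):
    """Join two clips, mixing the last `overlap` samples of `outgoing`
    with the first `overlap` samples of `incoming`."""
    if overlap <= 0 or overlap > min(len(outgoing), len(incoming)):
        raise ValueError("overlap must be between 1 and the shorter clip length")
    head = list(outgoing[:-overlap])   # outgoing clip before the dissolve
    tail = list(incoming[overlap:])    # incoming clip after the dissolve
    start = len(outgoing) - overlap
    mixed = []
    for i in range(overlap):
        t = (i + 1) / (overlap + 1)    # position through the dissolve, between 0 and 1
        mixed.append(outgoing[start + i] * (1 - t) + incoming[i] * t)
    return head + mixed + tail

# Dissolving from a full-level clip into silence shows the outgoing gain falling.
result = cross_dissolve([1.0] * 4, [0.0] * 4, 3)
```

Because the two gains always add up to 1, a sound common to both clips keeps a constant level through a linear dissolve. With unrelated material, a linear dissolve can sound as if it dips slightly in the middle, which is why other curve shapes are also used.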

5.3 Split Audio Edits


Whilst most audio edits will correspond with a visual edit, the videographer can split or stagger the audio edit from the visuals for effect. This is achieved by introducing the audio from the upcoming shot whilst still retaining the picture from the current shot, or vice versa.

Split Audio Edits are also referred to as ‘J-Cuts’ and ‘L-Cuts’. In a J-Cut the audio from the upcoming shot starts before its picture; in an L-Cut the audio from the current shot carries on over the picture of the next shot. The names are a reference to the process of editing celluloid film, whereby the end of the film would be physically cut in either an ‘L’ or ‘J’ shape (the audio track being located on the lower section of celluloid, and the shape of the cut either extending or cutting short the audio). The convention survives today because the end of each such edit appears L-shaped or J-shaped on many NLE timelines.

This technique is especially useful for compressing time within a sequence, or for signifying a transition in time or location. For example, in a wedding video the videographer could take the viewer from the preparations at the bride’s house to the guests arriving at the church by introducing the sound of the bells tolling at the church just before the end of the last shot of the bride at home.

Similarly, when cutting a conversation or interview sequence together, hearing the first word or two of the next interviewee before you cut to a shot of them can create a natural sense of pace. This mirrors what happens when we listen to a group of people: we hear someone talk, then we take a ‘mental blink’ as we turn to look at them, and finally they’re in vision. To add suspense or emotional weight, cutting to a listener who is about to speak whilst we can still hear the previous speaker tells our viewers that the listener’s reaction is important.
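One way to picture a split edit is as an edit point with two times: one where the picture changes and one where the sound changes. The sketch below is a hypothetical model, not the data format of any particular NLE; the class name `EditPoint` and its fields are assumptions, and the timings are invented for illustration. It uses the usual meaning of the terms: in a J-Cut the new audio arrives before the new picture, and in an L-Cut the old audio carries on after the picture has changed.

```python
from dataclasses import dataclass

@dataclass
class EditPoint:
    """One edit on the timeline; both times are in seconds (hypothetical model)."""
    video_cut: float  # when the picture changes to the next shot
    audio_cut: float  # when the sound changes to the next shot

    def kind(self) -> str:
        if self.audio_cut < self.video_cut:
            return "J-cut"         # next shot's sound leads its picture
        if self.audio_cut > self.video_cut:
            return "L-cut"         # current shot's sound runs over the next picture
        return "straight cut"      # sound and picture change together

    def offset(self) -> float:
        """Seconds by which the sound change leads (negative) or trails (positive) the picture."""
        return self.audio_cut - self.video_cut

# Wedding example: the church bells are heard 1.5 seconds before
# the picture leaves the bride's house (invented timings).
bells = EditPoint(video_cut=60.0, audio_cut=58.5)
print(bells.kind(), bells.offset())
```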

5.4 Mixing


The process of ‘Mixing’ a soundtrack is essentially adding various audio assets to the programme and adjusting their levels to create a coherent and balanced soundtrack. More details on the mixing, monitoring, graphic equalisation and sweetening of the soundtrack can be found in the Audio chapter (section 8. Audio in Post Production).
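At its simplest, adjusting levels and combining assets amounts to scaling each track by a gain and summing the results. The sketch below assumes tracks held as lists of float samples between -1.0 and 1.0 and gains given in decibels; the function name `mix` is hypothetical, and the hard limit at the end is only a crude guard against overload, not a substitute for setting sensible levels.

```python
def mix(tracks, gains_db):
    """Sum several tracks, applying a gain in dB to each one.

    Shorter tracks are treated as silent once they run out.
    """
    if not tracks or len(tracks) != len(gains_db):
        raise ValueError("need at least one track and exactly one gain per track")
    length = max(len(t) for t in tracks)
    out = [0.0] * length
    for track, gain_db in zip(tracks, gains_db):
        gain = 10 ** (gain_db / 20)  # convert decibels to a linear multiplier
        for i, sample in enumerate(track):
            out[i] += sample * gain
    # Hard-limit the sum to the valid sample range (a crude safety net).
    return [max(-1.0, min(1.0, s)) for s in out]

# Feature sound at full level with the background 20 dB lower (invented values).
feature = [0.5, 0.5]
background = [0.5, 0.5, 0.5]
soundtrack = mix([feature, background], [0.0, -20.0])
```

Lowering the background by 20 dB scales it to one tenth of its amplitude, so it sits well underneath the feature sound rather than competing with it.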


©2025 copyright IoV
