C.C. Hogan

How to do Punch and Roll Recording


Punch and Roll is a method of recording that entails rolling back a short way into a recording, playing and then punching into record at a set point.

As a method, it has existed ever since the early days of tape recording and became the standard way of recording long narrative pieces and singers on multitrack tape machines when the Sync head was invented. (Sync was simply a way of making the record head of a tape machine work as a play head. When you hit "record" it became a record head again. Don't worry about it!)

On Digital Audio Workstations it is even easier, depending on the system.

For a video demo of punch and roll, scroll down...

Why do Punch and Roll?

There are two simple reasons:

Firstly, it will reduce your editing. If you are winding back and punching in, then when you come to edit you are only cleaning up the recording, not wading through takes that are stretched out one after another.

Secondly, because you are picking up by hearing what came before, you are more likely to match your read and your re-record will be much, much better.

Before punch and roll, producers would record take after take. This did have some advantages as they had several takes to choose from. Punching in on analogue tape is destructive: you are erasing the previous take and so losing that version. This might not be so good. But with professional digital systems, punching in is non-destructive and the previous takes are kept, stacked up UNDER the recording, if you can imagine that. So, if you need to search through your takes for a better read, they are there, somewhere!

Which systems should I use?

Some DAWs do punch and roll natively, and with some it is a messy workaround. To be honest, if you are intending to do narration as anything apart from just your one book, invest in a DAW that does. So, Cubase, Studio One, ProTools, Logic, Reaper and so on. Many of these do a cheaper cut-down version or even a free version which is enough for narration. Cubase Artist is particularly good and benefits from the powerhouse audio engine of Cubase Pro (which I use), but is much cheaper.

What is important, from my point of view, is working in a system where punching into record is non-destructive. As a professional engineer, I see using punch and roll on a system that erases the bit of the previous take as going back to the dark ages of recording on tape. Why would anyone want to do that?

On Cubase, for instance (and most work like this), each time you record, a new file is created. These files are shown as a series of blocks or events along the time line. 

A highlighted block in Cubase

Each time you go into record at the same place, another file is created, but if you do this a lot, you might not be able to see them. The hashing on the image above is where one file is overlapping another, but only the most recent one can be heard - the one on top, as it were. (Think layers in Photoshop).

However, the old takes are still there unless you deleted them or pressed undo to undo the recording. In Cubase, there is something called lanes which shows in more detail what is going on. Other DAWs have very similar systems.

The lanes of a track

Now it becomes much clearer and you can see exactly what is happening. (Note, Cubase also has a comprehensive versioning system which is another way of working - not needed for narrations, possibly).

Every time I hit record, a new file is created. Because they inevitably overlap, this creates a new lane. However, this is only a visual representation. Once I edit that lot above, get it nice and clean with crossfades between edits if needed, then I can tell it to "clean up lanes", and it will all be shown on about three lanes.

(Note: I do not display the lanes when editing, just the main track, and editing is blindingly fast with Cubase. I only use the lanes if I want to dig through the layers to find an alternative take.)
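If it helps to picture the layers, here is a minimal sketch in Python. It is only a toy model of the idea, not how Cubase or any other DAW stores things internally, and the names (Take, takes_at, audible_at, the take labels) are all made up for illustration. Takes are kept in the order they were recorded, the most recent one covering a moment is the one you hear, and the older ones are still there to dig through.

```python
from dataclasses import dataclass


@dataclass
class Take:
    name: str     # hypothetical label for the recorded file
    start: float  # where the take begins on the timeline, in seconds
    end: float    # where the take ends on the timeline, in seconds


def takes_at(takes, t):
    """Return every take covering time t, most recent first.

    `takes` is assumed to be in recording order, so the last match
    is the one sitting on top.
    """
    covering = [tk for tk in takes if tk.start <= t < tk.end]
    return list(reversed(covering))


def audible_at(takes, t):
    """Return the take heard at time t, or None if nothing was recorded there."""
    stack = takes_at(takes, t)
    return stack[0] if stack else None


session = [
    Take("take_01", 0.0, 60.0),
    Take("take_02", 42.0, 75.0),  # punched in at 42 seconds
    Take("take_03", 42.0, 90.0),  # punched in again at the same place
]

print(audible_at(session, 10.0).name)                # take_01
print([tk.name for tk in takes_at(session, 50.0)])   # ['take_03', 'take_02', 'take_01']
```

Nothing is ever removed from the list when you punch in, which is the whole point of non-destructive recording.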

What about dealing with all these hundreds of files?

The audio pool in Cubase

Oh, wow, there are a lot of them. A few hundred on a long chapter. Did I really punch in that often? Apparently.

To be honest, don't worry about it. Forget it is even happening. When working on this kind of professional system, it is all about the final mix. Once you have done all your editing, sorted out the EQ and the loudness, then you will press "export" and your edit will be exported to one single file ready for submission.  The original is left wonderfully intact and you can go back and play with it as much as you like.
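For the curious, the idea of exporting lots of overlapping files to one single file can be sketched like this in Python. It is a deliberately crude illustration under my own assumptions: each take is just a start position and a short list of sample values, the function name flatten is hypothetical, and a real export also handles crossfades, EQ, loudness and so on.

```python
def flatten(takes, length):
    """Build one list of samples in which later takes cover earlier ones.

    Each take is a (start_index, samples) pair. The takes themselves
    are never modified, so the original session stays intact.
    """
    out = [0.0] * length          # silence wherever nothing was recorded
    for start, samples in takes:  # recording order: oldest first
        for i, sample in enumerate(samples):
            if 0 <= start + i < length:
                out[start + i] = sample
    return out


takes = [
    (0, [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]),  # the original read
    (3, [0.2, 0.2, 0.2, 0.2]),            # a punch-in starting at sample 3
]

exported = flatten(takes, 8)
print(exported)  # [0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.2, 0.0]
```

The exported list is a new thing; the two takes that went into it are untouched, so you can go back and change your mind.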

How do I do punch and roll on my DAW?

This varies from system to system, but they do have certain things in common.

  • Punch-in point
  • Pre-Roll

The punch-in point is a marker that tells the system "I want to go into record here." You set the punch-in point then press play. When the timeline reaches that point, it will go into record on the armed track.  On some systems that is all you need do.

Pre-Roll is a separate setting that sets how many seconds before the punch-in point you hear.  Then when you press play, the system rewinds that amount of time before the punch-in point and goes into play, flipping into record at the marker.

The good thing about Pre-Roll is that it is an automatic, get-on-with-it sort of thing. The bad thing is that depending on why and how you are punching in, you might want to be listening to just a couple of seconds or you might want to be listening to a whole paragraph first.

You can also set a punch-out point.  This is where the system drops back out of record.  Useful for replacing a few words in the middle of something, but not always needed.  After all, this is non-destructive.  The original is still underneath.
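The relationship between pre-roll, punch-in and punch-out can be written down as a small Python sketch. This is only a model of the logic described above, not the behaviour of any particular DAW, and the function names (playback_start, transport_state) and the example times are my own inventions.

```python
def playback_start(punch_in, pre_roll):
    """Where playback begins: pre_roll seconds before the punch-in point.

    Clamped at zero so we never try to start before the timeline begins.
    """
    return max(0.0, punch_in - pre_roll)


def transport_state(t, punch_in, punch_out=None):
    """Return "play" or "record" for timeline position t, in seconds.

    With no punch-out point, recording carries on until you press stop.
    """
    if t < punch_in:
        return "play"
    if punch_out is not None and t >= punch_out:
        return "play"
    return "record"


# Punch in at 42 seconds with a 5 second pre-roll.
print(playback_start(punch_in=42.0, pre_roll=5.0))   # 37.0
print(transport_state(40.0, punch_in=42.0))          # play
print(transport_state(43.0, punch_in=42.0))          # record

# Same again, but dropping out of record at 48 seconds.
print(transport_state(50.0, punch_in=42.0, punch_out=48.0))  # play
```

Changing pre_roll is the only thing that alters how much of the earlier read you hear before going into record.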

Manual Punch in

There is another way of working. Having read several discussions about this, I suspect this is not the way many work, but I am not sure why.

This is terribly technical (not), and it goes like this:

  • You make some silly mistake.
  • You press stop.
  • You press rewind for a few seconds.
  • You press play.
  • You listen to the good bit and when you get to the mistake....
  • You press record.
  • You speak.

Difficult, huh?

Look, the point is that if you are recording a narration, and you want it to be really, really good, then when you finish recording, you will want to work through everything, editing as you go, and look at every gap, every breath, every beat of your heart, messing with them and adjusting them to get them "just so"; rerecording if needed. (Also dead easy on this type of system, you simply punch in and correct the mistake).

When we worked in the studio, we tried to record in such a way so that when we got to the end and pressed stop, that was it - no editing; finished. But, we weren't on our own. We had a top-grade professional voice over in the booth, we had a me-shaped person behind the desk who had recorded thousands (not a few hundred) of sessions and a producer (and even a production assistant and client) all of whom were keeping notes and shouting when anything went wrong. By the time we reached the end of the recording, we were pretty certain everything was okay. Even then, it all got checked back.

But when it is just you, you cannot afford to be so over-confident. I really know what I am doing from the production point of view, more than most, but I still work my way bit by bit through everything if I am also the voice over. During the recording, I am busy speaking; I am going to miss stuff.

So, cleaning up where you have punched in is no big deal. It takes seconds to do. So you might as well work manually and not worry about the punch in being perfect from the techy point of view. Remember, this is non-destructive, so if you are a bit out, it is so easily repaired.

Here is a video demo of me using manual punch and roll.

So, I can learn the techy bit, what about the me bit?

Punching in requires you to start speaking and match what went before in a fluid, seamless way. Some people, even experienced voices, find this bit the hardest, but trust me, it really is not difficult at all.

There are two things going on here.

Firstly, think of this as a conversation with yourself, or even anyone else. When you are talking, you instinctively pick up when the other person stops without thinking about it. It is something we do very naturally.

Secondly, think about a song. If you were singing and you wanted to pick up where you left off, you would sing along with yourself first and just keep going. Again, this is all instinctive stuff. It does not need some special training.

So, when you punch in, rather than just try and leap in and get it right, talk along with yourself during the pre-roll (either programmed or manual) and then just carry on going when you hit record.

Unlike with singers and musicians where we are working to a music track and will often punch in at a really strange place (whether the singer knows it or not), the chances are you will be punching in somewhere very logical, at the beginning of a sentence or at an important comma.

So, the pressure to come in hard and fast just isn't there. 

If you are programming a punch-in point, it is normal to choose a point just before a breath, not after. So when you start speaking, you will be starting with a breath. Remember, this is non-destructive, so you can clean it up later - I often use the breath from the original recording. It sounds more natural and the timing is better.

If you are talking along with yourself, the chances are that you will take a breath automatically, so it will all match easily.

And if you have marked up your script for recording properly, then it becomes easier still. You should be able to match what you did with ease.

I know some people are nervous about this, but honestly, just try it and you will see how easy it is.

After a while, you will probably not talk along with yourself, and your pre-rolls will become shorter and shorter, especially if you are using the manual method I recommend for narration when you are also the editor. If I make a mistake, I hit rewind and will go almost straight into record as I hear the last syllable of the last word. It is so quick, that I haven't lost momentum and will match perfectly.

Of course, like everything, practice helps.

And what better way to practice than when recording an audiobook for real? 

Again, this is all non-destructive if you use the right DAW for the job; there is a limit to how wrong you can go.

So, what's stopping you?




