Adventures in Matroska

I’ve recently re-encoded a whole load of my MPEG-2 video into MPEG-4 (using H.264). I thought I’d put it all in MKV containers. Sadly, this turns out not to be the case: HandBrake’s presets explicitly set the output container, mostly to MP4. So I’m now left with a large number of videos in a format that I didn’t intend. What’s worse, I can’t easily skip forward and back in the files I do have, using either my Popcorn Hour or any of the video players on my Linux desktop. Here’s what I did to fix this sorry state of affairs.

Indexing

First thing: what’s going on with not being able to skip around in the videos? Even if I encode a DVD with the “chapter markers” option set in HandBrake, I can’t use the “Next Chapter” and “Previous Chapter” buttons in the Popcorn Hour.

It seems that the Matroska container format doesn’t require a position index to be written, so HandBrake simply doesn’t bother. Even though it writes the chapter markers, the index needed to seek to those places doesn’t exist. Fortunately, this seems to be a well-known issue, and is mentioned in several places on the HandBrake forums. Unfortunately, the answers are somewhat lacking in detail: “run it through mkvmerge”. Gee, thanks. What command line options do I use?

After reading through the (clear and well-written) docs for mkvmerge, I decided that actually I didn’t need any options:

$ mv Babylon-5.S1-Ep01.mkv Babylon-5.S1-Ep01.tmp
$ mkvmerge -o Babylon-5.S1-Ep01.mkv Babylon-5.S1-Ep01.tmp

Takes about 30 seconds to process, and I get all my chapter markers back. Success!
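
With more than a handful of files, that two-step shuffle is worth wrapping in a function. What follows is only a sketch: remux_in_place is a name I've made up, it assumes the file name ends in .mkv, and it treats any non-zero exit from mkvmerge as a failure (mkvmerge also exits non-zero when it merely has warnings, so this errs on the side of keeping the original).

```shell
# Remux one Matroska file in place so that mkvmerge rewrites it with an index.
# remux_in_place is a hypothetical helper name, not part of mkvtoolnix.
remux_in_place() {
    src=$1
    tmp=${src%.mkv}.tmp

    mv -- "$src" "$tmp" || return 1
    if mkvmerge -o "$src" "$tmp"; then
        rm -- "$tmp"          # remux succeeded: drop the original copy
    else
        mv -- "$tmp" "$src"   # anything else: put the original back
        return 1
    fi
}

# Usage: remux every MKV in the current directory.
# for f in *.mkv; do remux_in_place "$f"; done
```

Restoring the original on any failure means a half-written output never replaces a file that at least played.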

Conversion

Now, what about my other, larger problem? All these MP4 containers that should be Matroska…

Well, as it turns out, mkvmerge will in theory handle that, too. Exactly the same command line. Just feed it an MP4 file, and it spits out the MKV for you. Sadly, for me that doesn’t quite work. I get a playable file, but the result is unwatchable: there are a lot of dropped frames in it. Back to the drawing board.

There’s a pair of packages called mpeg4ip-utils and mpeg4ip-server, which contain tools for manipulating MP4 containers. You can list the Elementary Streams in an MP4 file:

$ mp4info Ashes-to-Ashes.S2-Ep01.mp4
mp4info version 1.6
Ashes-to-Ashes.S2-Ep01.mp4:
Track   Type    Info
1   video   H264 Main@3, 3515.440 secs, 1500 kbps, 720x576 @ 25.000000 fps
2   audio   MPEG-4 AAC LC, 3515.477 secs, 160 kbps, 48000 Hz
3   text
 Tool: HandBrake svnexported 2009012901

You can also extract individual ESes from it:

$ mp4creator -extract=1 Ashes-to-Ashes.S2-Ep01.mp4 Ashes-to-Ashes.S2-Ep01.track.1
$ mp4creator -extract=2 Ashes-to-Ashes.S2-Ep01.mp4 Ashes-to-Ashes.S2-Ep01.track.2

At this point, mkvmerge comes back in:

$ mkvmerge -o Ashes-to-Ashes.S2-Ep01.mkv Ashes-to-Ashes.S2-Ep01.track.*

That seems to work properly.
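
The three commands above can be rolled into one function for batch use. This is a sketch under assumptions: mp4_to_mkv is a made-up name, the mp4creator and mkvmerge invocations are copied from the commands above rather than checked against other versions, and it assumes track 1 is video and track 2 is audio, as in the mp4info listing. Check each file with mp4info first if in doubt.

```shell
# Convert one MP4 to Matroska by extracting tracks 1 and 2 with mp4creator
# and muxing them with mkvmerge. mp4_to_mkv is a hypothetical helper name.
# Assumes track 1 is video and track 2 is audio, as in the listing above.
mp4_to_mkv() {
    base=${1%.mp4}

    mp4creator -extract=1 "$1" "$base.track.1" || return 1
    mp4creator -extract=2 "$1" "$base.track.2" || return 1

    # The extracted track files are left in place for later re-muxing.
    mkvmerge -o "$base.mkv" "$base.track.1" "$base.track.2"
}

# Usage: convert every MP4 in the current directory.
# for f in *.mp4; do mp4_to_mkv "$f"; done
```

The text track (track 3 in the listing) is deliberately ignored here, just as it is in the manual commands.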

More indexing

But hang on… These videos I recorded off the TV don’t have any kind of chapter markers. I’d like to be able to skip back and forward in them sometimes, too. Can I automate the process of adding chapter markers every, say, 5 minutes?

mkvmerge, again, has a --chapters option that will take either an XML input file, or a simple text file listing chapter times and names. Let’s try that:

$ cat >chapters
CHAPTER01=00:00:00.000
CHAPTER01NAME=Chapter 1
CHAPTER02=00:05:00.000
CHAPTER02NAME=Chapter 2
CHAPTER03=00:10:00.000
CHAPTER03NAME=Chapter 3
[...]
^D
$ mv Ashes-to-Ashes.S2-Ep01.mkv Ashes-to-Ashes.S2-Ep01.tmp
$ mkvmerge -o Ashes-to-Ashes.S2-Ep01.mkv --chapters chapters Ashes-to-Ashes.S2-Ep01.track.*
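
Typing that list by hand gets old quickly, so here is a minimal sketch that generates the same simple-format file at a fixed interval. The name make_chapters is made up, and both arguments are assumed to be whole seconds.

```shell
# Print a simple-format chapter list with one chapter every INTERVAL seconds.
# make_chapters is a hypothetical helper; usage: make_chapters DURATION INTERVAL
make_chapters() {
    duration=$1
    interval=$2
    start=0
    n=1
    while [ "$start" -lt "$duration" ]; do
        printf 'CHAPTER%02d=%02d:%02d:%02d.000\n' \
            "$n" $((start / 3600)) $((start % 3600 / 60)) $((start % 60))
        printf 'CHAPTER%02dNAME=Chapter %d\n' "$n" "$n"
        start=$((start + interval))
        n=$((n + 1))
    done
}

# 3515 s is the running time mp4info reported earlier, rounded down.
# make_chapters 3515 300 >chapters
```

For a 3515-second file at 300-second spacing this writes 12 chapters, the last one starting at 00:55:00.000.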

Well, muxing with that chapter file is kind of a success. I can skip around the chapters all right, but mplayer complains bitterly about B-frames not having reference frames, and when I do skip, I get several seconds of unwatchable delta mush before the next I-frame appears and it all clears up. What I need to do is find out where all the I-frames are.

Fortunately, mpeg4ip has most of the solution to that, too:

$ mp4videoinfo Ashes-to-Ashes.S2-Ep01.mp4 | grep IDR-I
sampleId      1, size 54936 time 0(0) SEI IDR-I
sampleId    251, size 37721 time 480000(10000) IDR-I
sampleId    501, size 34168 time 960000(20000) IDR-I
sampleId    751, size 34252 time 1440000(30000) IDR-I
sampleId    910, size 28287 time 1745280(36360) IDR-I
sampleId   1002, size  9066 time 1921920(40040) IDR-I
sampleId   1100, size 10122 time 2110080(43960) IDR-I
[...]
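
A quick cross-check on the bracketed numbers in that listing: mp4info reported 25 fps, so each frame lasts 40 ms, and a frame's offset in milliseconds should be (sampleId - 1) × 40. A small sketch of that arithmetic follows; frame_ms is a made-up helper, and it assumes a constant, whole-number frame rate.

```shell
# Millisecond offset of a 1-based sample number at a constant frame rate.
# frame_ms is a hypothetical helper; usage: frame_ms SAMPLE_ID FPS
frame_ms() {
    echo $(( ($1 - 1) * 1000 / $2 ))
}

frame_ms 251 25    # prints 10000, matching sampleId 251 in the listing
frame_ms 910 25    # prints 36360, matching sampleId 910 in the listing
```

Both results agree with the figures in brackets on the corresponding lines.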

The figures in brackets on each line are the time offsets in milliseconds for that I-frame. We can use those to pick the first I-frame at or after each 5-minute mark in the file, and construct a chapter list from that:

# Get the I-frame positions
mp4videoinfo Ashes-to-Ashes.S2-Ep01.mp4 \
    | grep IDR-I \
    | cut -d'(' -f2 \
    | cut -d')' -f1 >cues

interval=300000  # 5 minutes in milliseconds
mark=0
chapter=0
: >chapters  # start with an empty chapter file

# Convert that to the simple chapter file format
while read -r cue; do
    if [ "$cue" -ge "$mark" ]; then
        chapter=$(($chapter+1))

        hours=$(($cue/3600000))
        cue=$(($cue%3600000))

        minutes=$(($cue/60000))
        cue=$(($cue%60000))

        seconds=$(($cue/1000))
        cue=$(($cue%1000))

        millis=$cue

        printf 'CHAPTER%02d=%02d:%02d:%02d.%03d\n' $chapter $hours $minutes $seconds $millis >>chapters
        printf 'CHAPTER%02dNAME=Chapter %2d\n' $chapter $chapter >>chapters
        mark=$(($mark+$interval))
    fi
done <cues

All done. The resulting chapters file can be passed to mkvmerge with --chapters, exactly as before.