FilmApart

20060707

Batch Transcode DV to OGG Theora

This script was necessary to replace all the spaces in the file names before the next script could transcode all the DV media in the directory.
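#!/usr/bin/perl -w
# nospace /this/dir /that/dir /those/too
# Replace every character outside a-zA-Z0-9_.- in file and directory names with "_".

use File::Find;
use strict;
die "usage: nospace dir[s]\n" unless @ARGV;

my %ext;    # collision counter per "dir/name", used as the next numeric suffix

# finddepth visits a directory's contents before the directory itself,
# so renaming a directory cannot break the traversal into it.
finddepth(\&remspaces, @ARGV);

sub remspaces {
    return if ($_ eq '.');
    return if ($_ eq '..');
    (my $new = $_) =~ tr/a-zA-Z0-9_.-/_/c;
    # True when a different existing file already has the cleaned name.
    my $duplicate = ($new ne $_ and -e $new);
    my $try = $new;

    $ext{"$File::Find::dir/$try"}++ if $duplicate;

    # On a collision, insert _N before the first dot (or at the end of the name).
    while (my $count = $ext{"$File::Find::dir/$new"}++) {
        (my $with_num = $new) =~ s/(?=\.|$)/_$count/;
        $new = $with_num, last if not -e $with_num;
    }

    $ext{"$File::Find::dir/$try"}-- if $duplicate;

    rename $_ => $new
        or warn "can't rename $_ to $new: $!";
}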
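To see what the tr/// substitution in the script above would do before letting it rename anything, a dry run helps. This is only a rough sketch in bash, not part of the original script: sanitize_name and preview_renames are hypothetical helper names, and the sketch does not reproduce the _N collision numbering.

```shell
#!/bin/bash
# sanitize_name: hypothetical helper, not part of the nospace script.
# Replaces every byte outside a-z, A-Z, 0-9, "_", "." and "-" with "_".
sanitize_name() {
    printf '%s' "$1" | LC_ALL=C tr -c 'a-zA-Z0-9_.-' '_'
}

# preview_renames: print "old -> new" for each entry of a directory
# whose name would change. Nothing is renamed.
preview_renames() {
    local dir=${1:-.} path old new
    for path in "$dir"/*; do
        [ -e "$path" ] || continue      # empty directory: the glob stays literal
        old=${path##*/}
        new=$(sanitize_name "$old")
        [ "$old" = "$new" ] || printf '%s -> %s\n' "$old" "$new"
    done
}
```

Running preview_renames in the footage directory lists only the names that contain a space or another character outside the allowed set.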

And this will output the OGG files to a folder on my desktop when run from the directory containing all the DV media.
#!/bin/bash
#==========================================
# process every file in current directory
# $(ls) splits on whitespace, so the file
# names must not contain spaces
#==========================================
for FName in $(ls); do
    ffmpeg2theora -f dv -v 10 -a 10 --optimize \
        --artist "Chris Hastings" --title "Alacridad $FName" \
        --organization "http://alacridad.org" --copyright "© 2006 Chris Hastings" \
        --license " True http://creativecommons.org/licenses/by-nc-sa/2.5/ This MovingImage is licensed to the public under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. " \
        -o "/home/papyromancer/Desktop/AlacridadOGG/$FName.ogg" "$FName"
done
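For comparison, here is a sketch of how the loop above could be written with a glob and quoted variables, so that file names with spaces would survive without being renamed first. It is an untested alternative, not what I ran: ogg_name, transcode_all, OUT_DIR and DRY_RUN are names made up for this example, the metadata options are left out for brevity, and unlike the original it writes clip.ogg rather than clip.dv.ogg.

```shell
#!/bin/bash
# Assumed output directory (same folder the original script writes to).
OUT_DIR=${OUT_DIR:-$HOME/Desktop/AlacridadOGG}

# ogg_name: hypothetical helper that strips the directory part
# and swaps a trailing .dv for .ogg.
ogg_name() {
    local base=${1##*/}
    printf '%s.ogg' "${base%.dv}"
}

# transcode_all: encode every *.dv file in the current directory.
# With DRY_RUN=1 the commands are printed instead of executed.
transcode_all() {
    local f out
    for f in ./*.dv; do
        [ -e "$f" ] || continue         # no matches: the glob stays literal
        out="$OUT_DIR/$(ogg_name "$f")"
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf 'ffmpeg2theora -f dv -v 10 -a 10 --optimize -o %s %s\n' "$out" "$f"
        else
            mkdir -p "$OUT_DIR"
            ffmpeg2theora -f dv -v 10 -a 10 --optimize -o "$out" "$f"
        fi
    done
}
```

Calling DRY_RUN=1 transcode_all from the footage directory prints one command per DV file, which is a cheap way to check the output paths before committing to days of encoding.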

Processing 28.9GB of video began at 2PM CDT on July 7. I'm going to see how long it takes, but my guess, worked out in the Octave session below, is 5 days 17 hours. I'll report back on Tuesday. Then I'll have to run it again to see if enabling MMX extensions as detailed here will make any difference. Maybe the upload to the archive will be faster: it should only be around 7GB of footage, which can then be rendered frame-accurately in Cinelerra with an EDL applicable to either the DV or Ogg files.
octave:9> 840/28900
ans = 0.029066
octave:10> 4/.029
ans = 137.93
octave:11> 137.93/24.
ans = 5.7471
octave:12> .7471*24
ans = 17.930
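
The Octave session above reads as a linear extrapolation: roughly 840MB of the 28900MB processed after 4 hours. That reading of the numbers is an assumption, since the units are not spelled out. Under that assumption the same estimate can be sketched in bash with awk; eta is a hypothetical helper name.

```shell
#!/bin/bash
# eta: hypothetical helper.
# Arguments: MB processed so far, MB in total, hours elapsed so far.
# Assumes the encoder keeps a constant rate for the whole run.
eta() {
    awk -v processed="$1" -v total="$2" -v hours="$3" 'BEGIN {
        total_hours = hours * total / processed
        days = int(total_hours / 24)
        printf "%d days %d hours\n", days, int(total_hours - days * 24)
    }'
}

eta 840 28900 4    # prints "5 days 17 hours"
```

Without the intermediate rounding to .029 the total comes to about 137.6 hours rather than 137.93, which still truncates to 5 days 17 hours.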


Transcoding DV to OGG Theora

ffmpeg2theora was simple enough to install on Ubuntu (dapper) with the command:
sudo apt-get install ffmpeg2theora

A 6-second, 21.9MB DV clip became a 1.4MB OGG:
papyromancer@Humboldt:~$ ffmpeg2theora /home/papyromancer/Desktop/footage.dv
Input #0, dv, from '/home/papyromancer/Desktop/footage.dv':
Duration: 00:00:06.3, start: 0.000000, bitrate: 28771 kb/s
Stream #0.0: Video: dvvideo, yuv411p, 720x480, 29.97 fps, 25000 kb/s
Stream #0.1: Audio: pcm_s16le, 32000 Hz, stereo, 1024 kb/s
Stream #0.2: Audio: pcm_s16le, 32000 Hz, stereo, 1024 kb/s
Pixel Aspect Ratio: 1.21/1 Frame Aspect Ratio: 1.82/1
Resize: 720x480
0:00:06.37 audio: 64kbps video: 1723kbps

Everything was automatically detected, even the aspect ratio. The video looks crisp and clean even with the massive compression. For slightly higher quality I used a preset by passing the -p command-line flag and wound up with a 2.4MB file.
papyromancer@Humboldt:~$ ffmpeg2theora -p pro /home/papyromancer/Desktop/footage.dv

ffmpeg2theora has lots of options listed in the man page. The next command takes advantage of many of these: Creative Commons licensing (by stripping newline markers from the XML template), Copyright, Title, and Artist metadata. Encoding audio and video at the highest quality with optimised motion vector search took much longer than the previous examples, and the resulting file remained small at 5.1MB.
papyromancer@Humboldt:~$ ffmpeg2theora -v 10 -a 10 --optimize --artist "Papyromancer" --title "Footage" --organization "http://FilmApart.org" --copyright "© 2006 Papyromancer" --license " True http://creativecommons.org/licenses/by-nc-sa/2.5/ This MovingImage is licensed to the public under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. " /home/papyromancer/Desktop/footage.dv
Input #0, dv, from '/home/papyromancer/Desktop/footage.dv':
Duration: 00:00:06.3, start: 0.000000, bitrate: 28771 kb/s
Stream #0.0: Video: dvvideo, yuv411p, 720x480, 29.97 fps, 25000 kb/s
Stream #0.1: Audio: pcm_s16le, 32000 Hz, stereo, 1024 kb/s
Stream #0.2: Audio: pcm_s16le, 32000 Hz, stereo, 1024 kb/s
Pixel Aspect Ratio: 1.21/1 Frame Aspect Ratio: 1.82/1
Resize: 720x480
0:00:06.37 audio: 371kbps video: 6397kbps
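
The bitrates on the last line of each ffmpeg2theora run can be checked against the file sizes. This is a back-of-the-envelope sketch that assumes 1 kbps is 1000 bits per second, 1MB is 1024*1024 bytes, and that Ogg container overhead can be ignored; est_mb is a hypothetical helper name.

```shell
#!/bin/bash
# est_mb: hypothetical helper.
# Arguments: audio kbps, video kbps, duration in seconds.
# Prints the estimated stream size in MB (1MB = 1024*1024 bytes here).
est_mb() {
    awk -v audio="$1" -v video="$2" -v secs="$3" 'BEGIN {
        bytes = (audio + video) * 1000 * secs / 8
        printf "%.1f\n", bytes / (1024 * 1024)
    }'
}

est_mb 64 1723 6.37     # default settings, prints 1.4
est_mb 371 6397 6.37    # -v 10 -a 10 --optimize, prints 5.1
```

Both results line up with the 1.4MB and 5.1MB files reported above, so the reported bitrates account for essentially all of the file size.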

Media Annotation

This morning I was browsing the mail archives of the Open Media Coalition and I stumbled upon Annodex, a current project of the Google Summer of Code.

The software allows a Python-CGI enabled Apache webserver to access OGG Theora video files via a time URI. The time URIs may be listed in a file using Continuous Media Markup Language (CMML) and then streamed from the server as a single contiguous media file.

The software is still in development, and the Linux version of their Firefox plugin is at the pre-alpha stage. The software allows simple cuts, but online editing is not easy yet: the plugin does not allow ins and outs to be set within the browser, so edits must be performed from the command line.

Hopefully, after the Summer of Code has ended, Wikipedia will be hosting user-edited video.