I was trying to watch a film, but the subtitles I had were 1 second behind the actors' lines.
Not content with finding other subtitles on the web, I opened up the Python interpreter and loaded the file into lines.
A little code followed:
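    import datetime
    import re

    # Matches lines that start with a timestamp such as 00:01:23
    subtime = re.compile(r'^\d\d:\d\d:\d\d')

    with open('subs.srt') as inp:
        lines = inp.read().splitlines()

    with open('out.srt', 'w') as outp:
        for line in lines:
            if subtime.findall(line):
                # Year, month and day are dummies; only hh:mm:ss matters
                time = datetime.datetime(1, 1, 1, *map(int, line[:8].split(':')))
                time += datetime.timedelta(seconds=1)
                outp.write('%02d:%02d:%02d%s\n' % (
                    time.hour, time.minute, time.second, line[8:]))
            else:
                # Not a timing line: copy it through unchanged
                outp.write(line + '\n')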
It was just a few lines of code, but it shows off many of the Python features I love most. Text processing is always a cinch.
Explaining the code
The format of the subtitles was:
[blank line]
ID
hh:mm:ss,ms: [text]
This explains why I had to check whether the regex's findall returned a match. The regex, which the code refers to as subtime, was ^\d\d:\d\d:\d\d.
When this regex matched a line with a subtitle time on it, I read the time, updated it and wrote it back out. Otherwise, I just copied the line verbatim to the output file.
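To see that check on its own, here is a small sketch. The sample lines are made up for illustration, following the format described above.

```python
import re

# The regex from the post, compiled under the same name the script uses.
subtime = re.compile(r'^\d\d:\d\d:\d\d')

# Made-up sample lines: a blank line, an ID and a timing line.
samples = ['', '12', '00:01:23,500: Hello there']

# findall returns a list of matches; an empty list is falsy.
is_time_line = [bool(subtime.findall(s)) for s in samples]
print(is_time_line)  # [False, False, True]
```

Only the timing line passes, so the blank lines and IDs fall through to the verbatim copy.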
I simply cut the line using slice syntax. [:8] and [8:] got me the first eight characters of the line (the hh:mm:ss part), and everything from the ninth character onwards, respectively.
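A quick sketch of the two slices, using a made-up timing line:

```python
# A made-up timing line in the format described above.
line = '00:01:23,500: Hello there'

head = line[:8]  # first eight characters: the hh:mm:ss part
tail = line[8:]  # everything after that, starting at the comma

print(head)  # 00:01:23
print(tail)  # ,500: Hello there
```

Since the two slices meet at the same index, head + tail always gives back the original line.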
I used the first eight characters of the line, split on the colon character, as arguments to the datetime.datetime constructor, in true functional fashion. I had to map a call to int over the pieces to turn all these number strings into integers.
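Broken into steps, that one-liner looks roughly like this. The input string is made up, and the names head and parts exist only for this illustration.

```python
import datetime

head = '00:01:23'  # made-up hh:mm:ss prefix of a timing line

# split(':') gives ['00', '01', '23']; map(int, ...) turns them into integers.
parts = list(map(int, head.split(':')))

# Year, month and day are dummies (1, 1, 1); the * unpacks the list
# into the hour, minute and second arguments.
time = datetime.datetime(1, 1, 1, *parts)
print(time.hour, time.minute, time.second)  # 0 1 23
```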
To update the seconds correctly, I had to create an instance of datetime.timedelta with seconds set to 1 (which was my estimate of how far off the time was), and add it to the time I got from the split string.
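The point of going through datetime rather than bumping the seconds by hand is that the addition carries over into the minutes. A small sketch with a made-up time right at the edge of a minute:

```python
import datetime

# Made-up time: 00:01:59 on a dummy date.
time = datetime.datetime(1, 1, 1, 0, 1, 59)
time += datetime.timedelta(seconds=1)

print(time.hour, time.minute, time.second)  # 0 2 0
```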
Having forgotten how to do date formatting, I just used string formatting against time.hour, time.minute and time.second, and joined in the rest of the string, [8:], in the same operation.
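For the record, the date formatting I had forgotten would probably have looked something like the sketch below. This is not what I ran at the time; it assumes Python 3 and uses a made-up timing line.

```python
import datetime

line = '00:01:59,500: Hello there'  # made-up timing line
time = datetime.datetime(1, 1, 1, *map(int, line[:8].split(':')))
time += datetime.timedelta(seconds=1)

# %H, %M and %S are the zero-padded hour, minute and second.
shifted = time.strftime('%H:%M:%S') + line[8:]
print(shifted)  # 00:02:00,500: Hello there
```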
It was quite fun, but my friends eventually grew impatient, so in the end no film was watched.