Jul. 8th, 2013

miriam_e: from my drawing MoonGirl (Default)
I love watching TED talk videos, but I am getting more and more deaf, so I am finding it harder to catch everything that is said, especially during shots that don't show the speaker's mouth. The TED talk download pages have transcripts embedded that include timing information, so I set about working out how to extract that information and create srt format subtitles from it.

In the hope that I might be able to save myself some work, I first looked around on the net to see if it had already been done. It has, but all the examples that I found are really quite unsatisfactory, generally requiring a trip to another site to get the subtitle file. There is a Python script that nearly does what I want, but I couldn't get it to work, as it required some libraries that I didn't have and a search didn't show me how to add them to Python. (As an aside, my original enthusiasm for Python has begun to fade. It has become increasingly bloated, unnecessarily complex and unwieldy, and new variants keep breaking old programs.)

I wanted something that I could simply point at a TED-talk page and have the program do the rest: it automatically downloads the video, renames it to something more informative, then creates the .srt subtitle file with the same name so that playing the video will automatically display the subtitles.

It turned out to be easy to write, and it uses only commands that are available on all Linux distributions. I use lots of comments so that I can come back months later and still work out how it operates, so hopefully you should be able to follow it. The comments give it the appearance of being bigger than it really is. I often broke commands up over multiple lines for the sake of clarity, commenting each, so that sometimes 1 line becomes about 20. It really is a simple program. Let me know if you come up with some improvements.
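
As a rough illustration of the subtitle-writing step (a sketch only, not the script itself), here is how a time in milliseconds can be turned into an srt timestamp and wrapped into numbered subtitle entries. The function names ms_to_srt and srt_cue, the start|duration|text input layout, and the sample captions are all made up for the example; the timing data embedded in the TED pages may well be laid out differently.

```shell
#!/bin/sh
# Hypothetical helper: convert milliseconds to an srt timestamp (HH:MM:SS,mmm).
ms_to_srt() {
    ms=$1
    printf '%02d:%02d:%02d,%03d' \
        $(( ms / 3600000 )) \
        $(( ms / 60000 % 60 )) \
        $(( ms / 1000 % 60 )) \
        $(( ms % 1000 ))
}

# Hypothetical helper: print one srt entry.
# Arguments: sequence number, start (ms), duration (ms), caption text.
# An entry is: number, "start --> end", the text, then a blank line.
srt_cue() {
    printf '%d\n%s --> %s\n%s\n\n' \
        "$1" "$(ms_to_srt "$2")" "$(ms_to_srt $(( $2 + $3 )))" "$4"
}

# Invented sample data, one caption per line: start|duration|text
n=0
while IFS='|' read -r start dur text; do
    n=$(( n + 1 ))
    srt_cue "$n" "$start" "$dur" "$text"
done <<'EOF'
0|2500|Hello, and welcome.
2500|61500|This caption is invented for the example.
EOF
```

Redirecting the output of the loop to a file that has the same base name as the video, with .srt in place of the video extension, is what lets most players pick the subtitles up automatically.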
