All the solutions I’m seeing are third-party services where I would have to upload my videos to get them transcribed.
Maybe Whishper would be suitable?
Okay yeah, I spun up a Docker instance and this is cool as fuck. It seems to be exactly what OP is looking for. This is cool enough to be a post on its own tbh. It would be perfect in a ytdl workflow, since you can transcribe a video just by linking it. I’ve been holding off on adding YouTube to my Jellyfin setup for just this sort of tool. I hope they add the GPU-accelerated faster-whisper models soon.
Luckily I still had the project in my history! Glad it was useful.