12.09.2021, 17:37
12.09.2021, 17:38
12.09.2021, 17:39
Those are not command-line tools, and they are not small either.
-> forget it, I'll simply drop DVD input support when I drop mplayer/mencoder on MacOS.
Cu Selur
12.09.2021, 17:40
https://www.videohelp.com/software/Subtitle-Workshop-XE
(12.09.2021, 17:39) Selur wrote: -> forget it, I'll simply drop DVD input support when I drop mplayer/mencoder on MacOS.
Cu Selur
That's bad news. The problem with telecined DVDs remuxed to MKV, which are read as 24 fps progressive instead of 29.970 fps telecined interlaced, still exists, and the only way to open those DVDs correctly is to open them as DVD input in Hybrid.
There is the SUBtools app, which can do the extraction and also supports bitmap-to-text OCR conversion: https://emmgunn.com/wp/subtools-home/
but it is a low-cost commercial app. UPDATE: it can't extract from VOB.
Here is also the Tesseract OCR project page, just in case you change your mind about an image-to-text subtitle tool in Hybrid... But it seems to work very slowly.
https://tesseract-ocr.github.io/tessdoc/...ersion-302
May be useful for Blu-rays too...
12.09.2021, 18:45
MPlayer is currently used for:
a. normal preview (will probably be unnecessary somewhere next year)
b. DVD handling (analyse DVD, extract title/pgcs/angles, extract audio/video/subtitles/chapters), sadly no real alternative there.
MPlayer has been abandoned for years.
mpv basically dropped most of the command-line interface and functionality regarding muxing/extracting and turned into just a player, so that is also no alternative.
FFmpeg does not have DVD input support, and for years nobody has cared about it (https://trac.ffmpeg.org/ticket/3280).
HandBrake might be the only active cross-platform tool which supports DVD input (no, Hybrid will not start using HandBrake).
So, yes: when I drop MPlayer, since it gets harder and harder to build and needs more and more workarounds to keep it from being used when dealing with newer formats, DVD support will also be dropped. With a bit of luck I will have found the time to write some code to use mpv or Vapoursynth as the normal/base preview tool; otherwise the normal preview will also be unavailable.
Cu Selur
12.09.2021, 18:59
Yep, things are complicated... I agree about the normal preview: it is nearly useless in most cases.
According to this thread https://www.reddit.com/r/ffmpeg/comments...bsub_subs/ it appears that Subler also uses Tesseract OCR. So I checked it, and it works very fast. It takes just a few seconds to convert a SUB to an SRT file.
Subler can't read VOB, but it can read MKV input with SUB/IDX files and can export SRT.
12.09.2021, 19:08
Regarding OCR: it's better to use a dedicated application for OCR, since OCR usually makes tons of mistakes which need manual fixing.
Adding OCR itself to Hybrid wouldn't be that hard using Vapoursynth (http://www.vapoursynth.com/doc/plugins/ocr.html), one would:
a. extract the subtitle
b. load subtitle in Vapoursynth onto a black/unicolor clip
c. use ocr.Recognize
no magic.
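As a rough illustration of what would still be left after step c: the recognized text arrives per frame, so it has to be merged into timed cues before it is of any use as an SRT. The sketch below is not Hybrid code. It assumes the per-frame strings have already been collected into a plain Python list (for instance from the frame property that ocr.Recognize attaches, see the plugin documentation linked above), and the helper names format_timestamp and frames_to_srt are made up for this example.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def frames_to_srt(frame_texts, fps):
    """Merge runs of identical, non-empty per-frame strings into SRT cues.

    frame_texts: one recognized string per frame ("" where nothing is shown).
    fps: frame rate of the clip (int, float or fractions.Fraction).
    """
    cues = []
    start = None
    current = ""
    # The trailing empty string flushes a cue that is still open at the end.
    for index, text in enumerate(list(frame_texts) + [""]):
        text = text.strip()
        if text == current:
            continue
        if current:
            # The cue ends at the first frame that no longer shows its text.
            cues.append((start, index, current))
        start, current = index, text

    blocks = []
    for number, (first, end, text) in enumerate(cues, start=1):
        begin = format_timestamp(first / fps)
        finish = format_timestamp(end / fps)
        blocks.append(f"{number}\n{begin} --> {finish}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = ["", "", "Hello", "Hello", "Hello", "", "Bye", "Bye"]
    print(frames_to_srt(demo, 25))
```

For an NTSC DVD one would pass fractions.Fraction(30000, 1001) as fps. Note that this only merges frames whose text is exactly identical; real OCR output tends to flicker between slightly different readings of the same subtitle, which is part of why the manual clean-up is the hard part.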
But after that the user would need to be able to see and fix the subtitles, which would be way more complicated (grammar and spell checking for any number of languages, etc.). Hybrid is not a subtitle editor and I have no plans to implement one in it.
So, I know how to add OCR to Hybrid, but I won't add it for the time being, since there are tons of other things that are more important: additional features, fixes, etc.
Cu Selur
12.09.2021, 19:20
Yes, if it needs additional manual correction, it doesn't fit well with the Hybrid concept. It is strange that image-to-text engines still produce mistakes. Robots are so well trained that they can recognize any complicated text in captchas, but still can't read DVD subtitles without mistakes.
Tested a few more DVDs and can't see any mistakes in the converted subtitles yet. Sure, I didn't test full-length movies, but at first look at random places there is no missing or incorrect text at all. So I guess it may all depend on the OCR engine.
Recognition speed may also differ, depending on program design and optimization.
Probably sooner or later I will convert all those crappy DVD subtitles from my collection to text format.
Anyway, even if it won't happen in Hybrid, I can currently recommend Subler with the Tesseract OCR language data modules as a free and stable tool to convert IDX/SUB DVD subtitles to SRT.
12.09.2021, 20:44
It seems that OCR just doesn't like it when normal text is mixed with italic text. So unfortunately it is not 100% problem-free...
This:
Became this:
12.09.2021, 20:50
It all depends on the OCR engine, its settings, font databases, etc.
-> it's usually not that simple.