README
WebVTT parser and segmenter
Parse WebVTT files, segments and generates HLS playlists for them.
Usage
For a WebVTT file:
WEBVTT
00:00:00.000 --> 00:00:01.000
Hello world!
00:00:30.000 --> 00:00:31.000 align:start line:0%
This is a subtitle
00:01:00.000 --> 00:01:01.000
Foo
00:01:50.000 --> 00:01:51.000
Bar
We can parse, segment and create HLS playlists:
const webvtt = require('node-webvtt');
const segmentDuration = 10; // default to 10
const startOffset = 0; // Starting MPEG TS offset to be used in timestamp map, default 900000
const parsed = webvtt.parse(input);
const segmented = webvtt.parse(input, segmentDuration);
const playlist = webvtt.hls.hlsSegmentPlaylist(input, segmentDuration);
const segments = webvtt.hls.hlsSegment(input, segmentDuration, startOffset);
Parsing
Parses the WebVTT file and returns an object with valid === true
if parsed correctly and an array of cues parsed.
Each cue can have:
identifier
- Id, if any of the cuestart
- Start time of cue in secondsend
- End time of cue in secondstext
- Text of the subtitlestyles
- If any of the cue
If the WebVTT file is invalid, the parser will throw a ParserError
exception. So for safety, calls to parse
should be in try catch
.
For the above example we'd get:
{
"valid":true,
"cues":[
{
"identifier":"",
"start":0,
"end":1,
"text":"Hello world!",
"styles":""
},
{
"identifier":"",
"start":30,
"end":31,
"text":"This is a subtitle",
"styles":"align:start line:0%"
},
{
"identifier":"",
"start":60,
"end":61,
"text":"Foo",
"styles":""
},
{
"identifier":"",
"start":110,
"end":111,
"text":"Bar",
"styles":""
}
]
}
Segmenting
Segments a subtitle according to how it should be segmented for HLS subtitles.
- Does a one pass of the cues for segmenting, this might have been a good idea or bad, only time will tell
- The One and Only Source of Truth is Apple's
mediasubtitlesegmenter
CLI
For the above example:
[
{ duration: 10, cues: [ [Object] ] },
{ duration: 30, cues: [ [Object] ] },
{ duration: 30, cues: [ [Object] ] },
{ duration: 41, cues: [ [Object] ] }
]
HLS playlist
Creates a subtitle playlist. For the above:
#EXTM3U
#EXT-X-TARGETDURATION:41
#EXT-X-VERSION:3
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-PLAYLIST-TYPE:VOD
#EXTINF:10.00000,
0.vtt
#EXTINF:30.00000,
1.vtt
#EXTINF:30.00000,
2.vtt
#EXTINF:41.00000,
3.vtt
#EXT-X-ENDLIST
HLS Segments
Creates a list of HLS segments for the subtitles, returning an array of them with filename
and content
.
[
{
"filename":"0.vtt",
"content":"WEBVTT\nX-TIMESTAMP-MAP=MPEGTS:900000,LOCAL:00:00:00.000\n\n00:00:00.000 --> 00:00:01.000\nHello world!\n"
},
{
"filename":"1.vtt",
"content":"WEBVTT\nX-TIMESTAMP-MAP=MPEGTS:900000,LOCAL:00:00:00.000\n\n00:00:30.000 --> 00:00:31.000 align:start line:0%\nThis is a subtitle\n"
},
{
"filename":"2.vtt",
"content":"WEBVTT\nX-TIMESTAMP-MAP=MPEGTS:900000,LOCAL:00:00:00.000\n\n00:01:00.000 --> 00:01:01.000\nFoo\n"
},
{
"filename":"3.vtt",
"content":"WEBVTT\nX-TIMESTAMP-MAP=MPEGTS:900000,LOCAL:00:00:00.000\n\n00:01:50.000 --> 00:01:51.000\nBar\n"
}
]
CLI
For segmenting a WebVTT file quickly, you can use the included CLI tool:
$ ./webvtt-segment.js -v --target-duration 10 -o ./subs subs.vtt
% ./webvtt-segment.js --help
Usage: webvtt-segment [options] <webvtt file>
Options:
-h, --help output usage information
-V, --version output the version number
-t, --target-duration [duration] Target duration for each segment in secods, defaults to 10
-o, --output-directory [dir] Output directory for segments and playlist
-v, --verbose Chatty output
-s, --silent No output
Development
This has been written with TDD so we've got a good coverage of the features.
npm install
npm test
mocha -w
<write failing test>
<write passing code>
<lather, rinse, repeat>
TODO
- Remove
valid
from parsing result, having a result means it's valid - Add more options to control output
- Better parsing
- Support more subtitles formats (at least SRT, maybe SSA/ASS)
- Iron out segmenting bugs with real playlists
- Refactor the mess that is the segmenter (yay, unit tests!)
- Nicer interface, don't be parsing again and again
- Do something to make the cli tool more accessible
- Code coverage reporting
References
- Anne van Kesteren's WebVTT validator
- WebVTT Ruby parser and segmenter
mediasubtitlesegmenter
from Apple- WebVTT: The Web Video Text Tracks Format