NHK Plus, NHKプラス
WEBVTT
21:15:03.927 --> 21:15:04.679
[CS][CS]
21:15:04.679 --> 21:15:12.054
[CS][CS][SWF_7][SDF_840_480][SDP_58_29][SHS_4][SVS_24][SSM_36_36][APS_7_2][WHF][COL_4][COL_5_1]♬〜
21:15:12.054 --> 21:15:15.425
[CS][CS][SWF_7][SDF_840_480][SDP_58_29][SHS_4][SVS_24][SSM_36_36][APS_5_2][CNF][COL_4][COL_5_1]うわ〜[MSZ] [NSZ]すごい![MSZ][APS_6_5][NSZ]うわっ[MSZ] [NSZ]ぶつかっちゃいそうですね!
21:15:15.425 --> 21:15:18.828
[CS][CS][SWF_7][SDF_840_480][SDP_58_29][SHS_4][SVS_24][SSM_36_36][MSZ][APS_6_17][NSZ][CNF][COL_4][COL_5_1]いや〜[MSZ]。
21:15:18.828 --> 21:15:22.500
[CS][CS][SWF_7][SDF_840_480][SDP_58_29][SHS_4][SVS_24][SSM_36_36][APS_6_5][CNF][COL_4][COL_5_1]ここは川ではなくって[MSZ][APS_7_11][NSZ]海ということですか?
21:15:22.500 --> 21:15:26.735
[CS][CS][SWF_7][SDF_840_480][SDP_58_29][SHS_4][SVS_24][SSM_36_36][APS_3_2][YLF][COL_4][COL_5_1]そうです[MSZ]。 [NSZ]今日はですね[MSZ][APS_4_5][NSZ]海を旅するサケに迫っていきます[MSZ]。
21:15:26.735 --> 21:15:29.171
[CS][CS][SWF_7][SDF_840_480][SDP_58_29][SHS_4][SVS_24][SSM_36_36][APS_7_3][CNF][COL_4][COL_5_1]海の中の姿って見たことないですね[MSZ]。
21:15:29.171 --> 21:15:31.840
[CS][CS][SWF_7][SDF_840_480][SDP_58_29][SHS_4][SVS_24][SSM_36_36][SSZ][APS_8_4][NSZ][YLF][COL_4][COL_5_1]ですよね[MSZ]。[SSZ][APS_10_20][NSZ][COL_0][CNF]はい[MSZ]。
21:15:31.840 --> 21:15:39.618
[CS][CS][SWF_7][SDF_840_480][SDP_58_29][SHS_4][SVS_24][SSM_36_36][SSZ][APS_6_9][NSZ][GRF][COL_4][COL_5_1]川で生まれたサケの稚魚は[SSZ][APS_8_10][NSZ]春になると川を下り➡
21:15:39.618 --> 21:15:43.855
[CS][CS][SWF_7][SDF_840_480][SDP_58_29][SHS_4][SVS_24][SSM_36_36][SSZ][APS_12_11][NSZ][GRF][COL_4][COL_5_1]大海原へ旅立ちます[MSZ]。
21:15:43.855 --> 21:15:49.555
[CS][CS][SWF_7][SDF_840_480][SDP_58_29][SHS_4][SVS_24][SSM_36_36][APS_6_9][GRF][COL_4][COL_5_1]実に2万[MSZ]km[NSZ]もの[MSZ][APS_7_19][NSZ]大回遊の始まりです![TIME_57][CS]
21:15:49.555 --> 21:15:51.865
[CS][CS]
these timestamps are completely wrong for some reason
this is taken from サイエンスZERO 2万kmの旅!“サケの大回遊”に迫る
they display on the site like this:
The font is Hiragino TV Sans Rd S, you can pinch it from the dev
tools quite easily
Interestingly they also have an option to display the subs below
the video, which also disables a lot of the formatting
i don't think i've ever seen that before
looking at the raw line, we can see that it is composed of
text+tags, quite similar to ASS in fact
[CS][CS][SWF_7][SDF_840_480][SDP_58_29][SHS_4][SVS_24][SSM_36_36][APS_6_9][GRF][COL_4][COL_5_1]実に2万[MSZ]km[NSZ]もの[MSZ][APS_7_19][NSZ]大回遊の始まりです![TIME_57][CS]
but there are no styles with defaults to be reused here, everything
is re-specified every time, because >broadcast
the tags are listed in the ARIB STD-B24 standard, which defines
the binary subtitle format used in actual broadcasts
https://www.arib.or.jp/english/std_tr/broadcasting/std-b24.html
"Fascicle 1"
starting from Table 7-15 on page 89
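to make the text+tags structure concrete, here's a rough sketch that splits a raw line into tag and text tokens. the name `tokenize` and the token shape are my own invention for illustration, and it assumes every tag is a `[NAME_arg_arg...]` group with no nested brackets, which holds for the samples above but isn't something i've checked against the standard:

```python
import re

# One bracketed tag such as [CS], [APS_6_9] or [SDF_840_480].
# Assumption: tags never contain a literal "]" inside them.
TAG_RE = re.compile(r"\[([^\]]+)\]")


def tokenize(line):
    """Split a raw cue line into ("tag", name, args) and ("text", string) tuples."""
    tokens = []
    pos = 0
    for m in TAG_RE.finditer(line):
        # Any characters between the previous tag and this one are plain text.
        if m.start() > pos:
            tokens.append(("text", line[pos:m.start()]))
        # "APS_6_9" -> name "APS", args ["6", "9"]; args are kept as strings.
        name, *args = m.group(1).split("_")
        tokens.append(("tag", name, args))
        pos = m.end()
    # Trailing text after the last tag, if any.
    if pos < len(line):
        tokens.append(("text", line[pos:]))
    return tokens


if __name__ == "__main__":
    for token in tokenize("[CS][APS_6_9][GRF]実に2万[MSZ]km"):
        print(token)
```

splitting on `_` is enough for the tags shown here; the DRCS payload looks like standard base64, which doesn't use `_`, so it stays in one piece as the last argument.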
\[APS_\d+_\d+\] -> \\N (regex)
(position change -> newline)
[WHF] -> {\c&HFFFFFF&}
[YLF] -> {\c&H00FFFF&}
[CNF] -> {\c&HFFFF00&}
[GRF] -> {\c&H00FF00&}
(white, yellow, cyan, and green coloured text respectively)
find [DRCS_ tags and replace each with the corresponding character
from nhk's lookup table (needs jp ip/vpn)
[CS][CS][SWF_7][SDF_840_480][SDP_58_29][SHS_4][SVS_24][SSM_36_36][MSZ][APS_4_5][WHF][COL_4][COL_5_1]<[NSZ]フルカラーでよみがえった[SSZ][APS_10_7]てづか おさむ[APS_12_6][NSZ]手[DRCS_36_36_2_AADwAAAAAAAAAADwAAAAAAAAAADwD/////8AAADwD/////8AAADwDwAAAA8AAADwDwAAAA8AAADwDwAAAA8AAADwDwAAAA8AAADwAP////AAAADwAP////AAAP//8AAP8AAAAP//8AAP8AAAAADwAADwDwDwAADwAADwDwDwAADwAA8PDw8AAADwAA8PDw8AAADwD/AA8PAAAADwD/AA8PAAAADwAAAP8PAAAADwAAAP8PAAAADwAPDw8PAAAADwAPDw8PAAAADw8A8A8PAAAADw8A8A8PAAAAD/D/Dw8A8AAAD/D/Dw8A8AAP8AAAAP8A8AAP8AAAAP8A8AAAAAAADw8A8AAAAAAADw8A8AAAAAAP8A8ADwAAAAAP8A8ADwAAAA/wAA8AAAAAAA/wAA8AAAAAAAAAD/AAAAAAAAAAD/AAAA]治虫の名作をお楽しみ下さい[MSZ]>
find that mess in the lookup table
[CS][CS][SWF_7][SDF_840_480][SDP_58_29][SHS_4][SVS_24][SSM_36_36][MSZ][APS_4_5][WHF][COL_4][COL_5_1]<[NSZ]フルカラーでよみがえった[SSZ][APS_10_7]てづか おさむ[APS_12_6][NSZ]手塚治虫の名作をお楽しみ下さい[MSZ]>
\[[^]]+\] -> empty string (regex)
(strip all other tags as we don't want to deal with them)
^\\N -> empty string (regex)
(remove leading \N left over from the initial position tag)
idk why they would do it like this
i can only imagine that ISDB is so deeply embedded in their
processes that they weren't able to let go of it
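for reference, the find/replace steps listed above can be strung together roughly like this. it's a minimal sketch, not a full converter: `arib_to_ass` and `drcs_table` are names i made up, and the table is assumed to be a plain dict keyed by the entire `[DRCS_...]` tag text, which is not necessarily how nhk's lookup table is laid out. DRCS substitution has to run before the catch-all strip, otherwise the glyph is simply lost.

```python
import re

# Colour tags -> ASS override tags. ASS colours are written &HBBGGRR&.
COLOURS = {
    "[WHF]": r"{\c&HFFFFFF&}",  # white
    "[YLF]": r"{\c&H00FFFF&}",  # yellow
    "[CNF]": r"{\c&HFFFF00&}",  # cyan
    "[GRF]": r"{\c&H00FF00&}",  # green
}

DRCS_RE = re.compile(r"\[DRCS_[^\]]+\]")
APS_RE = re.compile(r"\[APS_\d+_\d+\]")
ANY_TAG_RE = re.compile(r"\[[^\]]+\]")  # same meaning as \[[^]]+\]
LEADING_NEWLINE_RE = re.compile(r"^\\N")


def arib_to_ass(line, drcs_table=None):
    """Apply the replacement steps from this post to one raw cue line."""
    # Hypothetical table shape: {"[DRCS_...full tag...]": "character"}.
    drcs_table = drcs_table or {}

    # 1. DRCS: swap known glyph tags for real characters.
    #    Unknown ones are left alone here and removed in step 4.
    line = DRCS_RE.sub(lambda m: drcs_table.get(m.group(0), m.group(0)), line)

    # 2. Position change -> ASS hard line break (a literal backslash + N).
    line = APS_RE.sub(lambda m: "\\N", line)

    # 3. Foreground colour tags -> ASS colour overrides.
    for tag, override in COLOURS.items():
        line = line.replace(tag, override)

    # 4. Strip every remaining tag.
    line = ANY_TAG_RE.sub("", line)

    # 5. Drop the leading \N left over from the initial position tag.
    return LEADING_NEWLINE_RE.sub("", line)


if __name__ == "__main__":
    raw = "[CS][APS_6_9][GRF][COL_4]実に2万[MSZ]km[NSZ]もの[MSZ][APS_7_19][NSZ]大回遊"
    print(arib_to_ass(raw))
```

one limitation: step 5 only removes a `\N` at the very start of the string, so a line whose first position tag comes after some visible text keeps that break.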
initially i thought of writing my own script to fully convert it
to ASS for rendering
but it occurs to me that since it maps directly to the ISDB binary
format, it may be better to convert it to a bitstream like in the
actual broadcasts.
And then you can take advantage of the many ARIB ->
something-else converters that already exist
-garret1317, April 2025