Here’s a minimal example of my problem:
$ echo '<video><source src="filename.mp4" type="video/mp4"></video>' | pandoc -f html -t html
> (empty output)
It seems that the problem arises in the parsing stage. If I remove the from type (the input format), pandoc happily passes the input through, only formatting it nicely. That might have been good enough, except that I really need pandoc to parse the contents and include them when building the document tree, so that it is aware of the necessary styling and such.
I tried this in their online sandbox as well, and see the following messages:
<video controls><source src="filename.mp4" type="video/mp4"></video>
---
> Skipped '<video controls>' at input line 1 column 1
> Skipped '<source src="filename.mp4" type="video/mp4">' at input line 1 column 17
> Skipped '</video>' at input line 1 column 61
(empty output)
So, basically, why is this tag being skipped?
What have I tried? I have tried variations on the input, like putting the video tag inside a paragraph and other things, but it always disappears.
I have also been fiddling with various flags, like --self-contained or --embed-resources, but I don’t really know what they’re trying to accomplish and they didn’t work anyway. The final pandoc command in my Makefile (the one currently swallowing the video tags) has the --standalone flag, but that seems beside the point here.
2 Answers
While this does not respond to the question "WHY", it does present an alternative that may help others encountering a similar problem: USE img instead of video.
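A minimal sketch of this workaround, reusing the file name from the question. It assumes a pandoc version whose HTML5 writer turns images that point to video files into video elements, as described in this answer; the exact markup it emits may differ between versions.

```shell
# Same file name as in the question, but referenced through <img>
# instead of <video>.
input='<img src="filename.mp4">'

# Only run the conversion when pandoc is actually installed.
if command -v pandoc >/dev/null 2>&1; then
  # Read HTML, write HTML5; the writer is expected to emit a video
  # element here rather than a plain image.
  printf '%s\n' "$input" | pandoc -f html -t html5
fi
```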
Curiously, pandoc does the following when converting html -> html5: it detects that the document is showing a video file, replaces img with video, and adds a nice fallback link for browsers that don't support video, but does not remove the tag (yet).
This helps me, since my full pipeline is markdown -> html -> html: the first conversion only preserves whichever tags were in the Markdown code, and the final one does the img -> video conversion. However, if for some wild reason you need to convert from and to HTML twice, then I do not know what to tell you.
First, for the why: videos are not part of pandoc’s internal document representation, so it is not entirely clear how they should be handled. Adding a video as an image is reasonable, and you could raise a feature request for this.
As an alternative to the nice <img> workaround mentioned above, one could also enable the raw_html format extension. This will ensure that unknown elements are simply passed through unchanged.
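As a sketch of how the extension might be enabled: pandoc format names accept +extension suffixes, so the reader can be given as html+raw_html. The input string is taken from the question; the variable names are only illustrative, and the exact formatting of the output is not shown because it can vary between pandoc versions.

```shell
# Input taken from the question's sandbox example.
input='<video controls><source src="filename.mp4" type="video/mp4"></video>'

# HTML reader with the raw_html extension enabled, so elements that have
# no counterpart in pandoc's document model are kept as raw HTML.
from_format='html+raw_html'

# Only run the conversion when pandoc is actually installed.
if command -v pandoc >/dev/null 2>&1; then
  printf '%s\n' "$input" | pandoc -f "$from_format" -t html
fi
```

With this reader setting, the video element should show up in the output instead of being reported as skipped.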