Here’s a minimal example of my problem:

$ echo '<video><source src="filename.mp4" type="video/mp4"></video>' \
    | pandoc -f html -t html
> (empty output)

The problem seems to arise at the parsing stage. If I drop the input format option (-f html), pandoc happily passes the input through, only formatting it nicely. That might have been good enough, except I really need pandoc to parse the contents and include them when building the document tree, so that it is aware of the necessary styling and such.

I tried this in their online sandbox as well, and see the following messages:

<video controls><source src="filename.mp4" type="video/mp4"></video>
---
> Skipped '<video controls>' at input line 1 column 1
> Skipped '<source src="filename.mp4" type="video/mp4">' at input line 1 column 17
> Skipped '</video>' at input line 1 column 61
(empty output)

So, basically, why is this tag being skipped?

What have I tried? I have tried variations on the input, like putting the video tag inside a paragraph and other things, but it always disappears.

I have also been fiddling with various flags, like --self-contained or --embed-resources, but I don’t really know what they’re meant to accomplish and they didn’t work anyway. The final pandoc command in my Makefile (the one currently swallowing the video tags) has the --standalone flag, but that seems beside the point here.

2 Answers


  1. Chosen as BEST ANSWER

    While this does not answer the "why", it does offer an alternative that may help others with a similar problem: use img instead of video.

    Curiously, pandoc does the following, when converting html -> html5:

    <img src="filename.mp4"></img>
    
    <!-- converts to -->
    
    <video src="filename.mp4" controls="1" controls=""><a
    href="filename.mp4">Video</a></video>
    

    It detects that the image source is a video file, replaces img with video, and adds a fallback link for browsers that don't support video, but does not remove the tag (yet).

    This helps me, since my full pipeline is markdown -> html -> html: the first conversion only preserves whichever tags were in the Markdown source, and the final one does the img -> video conversion. However, if for some wild reason you need to convert from and to HTML twice, then I do not know what to tell you:

    $ echo '<img src="file.mp4"></img>' \
        | pandoc -f html -t html5 \
        | pandoc -f html -t html5
    > <a href="file.mp4">Video</a>
    

  2. First, the why: videos are not part of pandoc’s internal document representation, so it is not entirely clear how they should be handled. Adding a video as an image is reasonable, and you could raise a feature request for this.

    As an alternative to the nice <img> workaround mentioned above, one could also enable the raw_html format extension:

    pandoc -f html+raw_html -t html ...
    

    This will ensure that unknown elements are simply passed through unchanged.
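
As a rough sketch of the raw_html suggestion in the second answer, applied to the input from the question: the helper name `convert_keep_raw` is hypothetical and only for illustration, and the exact formatting of the output may differ between pandoc versions.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not part of pandoc):
# reads HTML on stdin and writes HTML on stdout, with the raw_html
# extension enabled on the reader so that elements pandoc cannot
# represent are kept as raw HTML instead of being skipped.
convert_keep_raw() {
    pandoc -f html+raw_html -t html
}

# Run the example only if pandoc is actually installed.
if command -v pandoc >/dev/null 2>&1; then
    printf '%s\n' '<video controls><source src="filename.mp4" type="video/mp4"></video>' \
        | convert_keep_raw
else
    echo 'pandoc not found on PATH' >&2
fi
```

Note that elements passed through this way are opaque to pandoc: they are carried along as raw HTML rather than parsed into the document tree, so this may not be enough if later processing needs to inspect the video element itself.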
