In my RMarkdown I have strings like "(i.e. tell" and "Dr. Charles" where the spaces following the dots show up as U+00A0 in the HTML. This displays fine in some browsers but as strange question mark characters in others.

I know that U+00A0 is the non-breaking space character. Why is RMarkdown inserting it into the HTML after a "." and not in other places?

Thanks!

2 Answers

  1. This appears to be done by Pandoc when processing the Markdown file that RMarkdown produces. I don’t think there’s an option to disable this. The only workaround I can think of is to process the output file to change the non-breaking space back to a normal one.
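
    A rough sketch of that post-processing step, assuming the nbsp appears in the HTML either as the raw UTF-8 character or as the &nbsp; / &#160; entities. The strip_nbsp helper and the file names in the usage note are made up for illustration, and it replaces every non-breaking space, including any you inserted on purpose:

    ```shell
    #!/bin/sh
    # U+00A0 (non-breaking space) is the two bytes 0xC2 0xA0 in UTF-8.
    nbsp="$(printf '\302\240')"

    # Hypothetical helper: read HTML on stdin and write it to stdout with
    # every non-breaking space (raw or as an entity) turned into a plain space.
    strip_nbsp() {
      LC_ALL=C sed -e "s/${nbsp}/ /g" -e 's/&nbsp;/ /g' -e 's/&#160;/ /g'
    }

    # Small demo: the nbsp after "Dr." and "i.e." become ordinary spaces.
    printf 'Dr.\302\240Charles and (i.e.&nbsp;tell\n' | strip_nbsp
    ```

    It could then be run as strip_nbsp < report.html > report_clean.html after knitting.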

  2. Using a non-breaking space after the dot of an abbreviation helps to distinguish it from a sentence-ending period. Pandoc has a built-in list of abbreviations that it treats this way; you can view it with pandoc --print-default-data-file=abbreviations.

    There are two ways to prevent this (listed below), but the main problem seems to be the browser rendering. This will typically happen if the browser is using the wrong encoding. R Markdown files are UTF-8 encoded, and the HTML file should contain a line like <meta charset="utf-8"> to make sure that there are no mishaps. Maybe you’re using a custom template that’s missing that line?
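
    One quick way to check for that line is to grep the generated HTML. This is only a sketch: declares_utf8 is a made-up helper name, and the pattern just looks for a meta tag mentioning charset=utf-8, so it will not catch an encoding set through HTTP headers:

    ```shell
    #!/bin/sh
    # Hypothetical helper: succeeds if the HTML on stdin contains a <meta> tag
    # that declares a UTF-8 charset (case-insensitive, quotes optional).
    declares_utf8() {
      grep -Eiq "<meta[^>]*charset=[\"']?utf-8"
    }

    # Small demo on an inline snippet instead of a real file.
    if printf '<head><meta charset="utf-8"></head>\n' | declares_utf8; then
      echo "UTF-8 charset declared"
    else
      echo "no UTF-8 charset declaration found"
    fi
    ```

    On a real document this would be declares_utf8 < report.html, where report.html stands for whatever file was knitted.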

    There are two options to disable those nbsp characters:

    1. Escape the dot with a backslash, e.g. Dr\. Charles.

    2. Make sure that pandoc is called with the --abbreviations=<EMPTY_FILE> parameter, where <EMPTY_FILE> should be just that, an empty file. On Mac or Linux one can use /dev/null for that; on Windows I believe nul should work, but I'm not sure. When in doubt, just create a new empty file and use that.
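
    As a sketch of the second option when calling pandoc directly, using a freshly created empty file rather than /dev/null so it does not depend on the platform. The variable name is arbitrary, and the pandoc call is skipped if pandoc is not on the PATH:

    ```shell
    #!/bin/sh
    # mktemp creates a new, empty file; with no abbreviations listed in it,
    # pandoc has nothing to match against.
    empty_abbrev="$(mktemp)"

    # Convert a one-line Markdown sample to HTML using the empty list.
    if command -v pandoc >/dev/null 2>&1; then
      printf 'Dr. Charles\n' |
        pandoc --from=markdown --to=html --abbreviations="$empty_abbrev"
    fi
    ```

    From R Markdown, the same flag can presumably be handed to pandoc through the pandoc_args option of the output format in the YAML header.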

    However, the nbsp after abbreviations is good typographic style, and removing those might make the text more difficult to read.
