In my RMarkdown document I have strings like "(i.e. tell" and "Dr. Charles", where the spaces following the dots show up as U+00A0 in the HTML. This displays fine in some browsers but as strange question mark characters in others.
I know that U+00A0 is the non-breaking space character. Why is RMarkdown inserting it into the HTML after a "." and not in other places?
Thanks!
2 Answers
This appears to be done by Pandoc when processing the Markdown file that RMarkdown produces. I don’t think there’s an option to disable this. The only workaround I can think of is to process the output file to change the non-breaking space back to a normal one.
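The post-processing workaround mentioned above could look something like the following. This is only a minimal sketch, assuming the output file is UTF-8 encoded and contains the literal U+00A0 character rather than an entity such as `&nbsp;`; the function names and the file name in the usage note are made up for illustration.

```python
from pathlib import Path

NBSP = "\u00a0"  # the non-breaking space character


def replace_nbsp(html: str) -> str:
    """Return html with every U+00A0 character replaced by a plain space."""
    return html.replace(NBSP, " ")


def clean_file(path: str) -> None:
    """Rewrite the file at path in place, reading and writing it as UTF-8."""
    target = Path(path)
    cleaned = replace_nbsp(target.read_text(encoding="utf-8"))
    target.write_text(cleaned, encoding="utf-8")
```

Calling `clean_file("report.html")` (a hypothetical file name) after knitting would rewrite the output. Note that this replaces every non-breaking space in the file, including any that were inserted on purpose.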
Using a non-breaking space after the dot of an abbreviation helps to distinguish it from a sentence-ending period. Pandoc has a built-in list of abbreviations, which you can view with
pandoc --print-default-data-file=abbreviations
There are two ways to prevent this (listed below), but the main problem seems to be the browser rendering. This typically happens when the browser uses the wrong encoding. R Markdown files are UTF-8 encoded, and the HTML file should contain a line like
<meta charset="utf-8">
to make sure that there are no mishaps. Maybe you’re using a custom template that’s missing that line?
There are two options to disable those nbsp characters:
1. Escape the dot with a backslash, e.g.
Dr\. Charles
2. Make sure that pandoc is called with the
--abbreviations=<EMPTY_FILE>
parameter, where <EMPTY_FILE> should be just that, an empty file. On Mac or Linux one can use /dev/null for that. I'm not sure about Windows, but I believe nul should work. When in doubt, just create a new empty file and use that.
However, the nbsp after abbreviations is good typographic style, and removing it might make the text more difficult to read.
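For the second option, here is a rough sketch of how the pandoc call could be assembled when pandoc is run directly from a script. The function name and the file names are hypothetical. Python's `os.devnull` happens to resolve to `/dev/null` on Mac and Linux and to `nul` on Windows, which matches the guess above, though I have not verified that pandoc accepts `nul` on Windows.

```python
import os


def pandoc_args_without_abbreviations(source: str, output: str) -> list:
    """Build a pandoc command line that loads an empty abbreviations list."""
    # os.devnull is "/dev/null" on POSIX systems and "nul" on Windows,
    # so pandoc reads an abbreviations file with no entries.
    return ["pandoc", "--abbreviations=" + os.devnull, source, "-o", output]
```

The resulting list can be passed to `subprocess.run(..., check=True)`, for example with `"report.md"` and `"report.html"` as placeholder file names. When knitting from R Markdown, the same `--abbreviations=...` flag would presumably go into the `pandoc_args` setting of the output format instead.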