Whitespace in image paths


#1

Given:

![text](C:/Users/Mike/Pictures/Screenshots/Screenshot (1).png)

The space before (1) prevents this from being recognized as an image tag (CommonMark.Net engine). Tried URL encoding (%20) but that didn’t work. Are spaces not allowed in path names?


#2

I think a space is not valid in a URL, and should be replaced by %20.


#3

% is a legal file name character in Windows.

Are implementations supposed to URL-decode all paths?


#4

I was able to make it work using the following:

![text](file:C:/Users/Mike/Pictures/Screenshots/Screenshot%20(1).png)

#5

What if you wanted to specify a relative path, and so cannot start with file:? How would you put a space in the link?


#6

+++ Mike_Ward [Mar 28 15 16:58 ]:

Given:

![text](C:/Users/Mike/Pictures/Screenshots/Screenshot (1).png)

The space before (1) prevents this from being recognized as an image tag (CommonMark.Net engine). Tried URL encoding (%20) but that didn’t work. Are spaces not allowed in path names?

See http://spec.commonmark.org/0.18/#link-destination

You can include spaces if you put the whole thing inside pointy brackets
<C:/Users/Mike/Pictures/Screenshots/Screenshot (1).png>.

Or you could use an entity &#32;, though that’s uglier.
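
A minimal sketch of the two workarounds above, in Python. The helper name `image_markdown` is made up for this illustration and is not part of any CommonMark library; it only builds the Markdown source text and does not parse or render anything.

```python
def image_markdown(alt, path, use_entity=False):
    """Build an image tag whose destination may contain spaces.

    Hypothetical helper, for illustration only.
    """
    if " " not in path:
        # No spaces: the plain destination form is enough.
        return f"![{alt}]({path})"
    if use_entity:
        # Replace each space with the numeric character reference &#32;.
        return f"![{alt}]({path.replace(' ', '&#32;')})"
    # Otherwise wrap the whole destination in pointy brackets.
    return f"![{alt}](<{path}>)"


print(image_markdown("text", "C:/Users/Mike/Pictures/Screenshots/Screenshot (1).png"))
print(image_markdown("text", "Screenshot (1).png", use_entity=True))
```

Whether a given engine accepts the pointy-bracket form depends on which spec version it implements.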


#7

Good tip about pointy brackets. Using

![Screenshot (7)](<C:/Users/Mike/Pictures/Screenshots/Screenshot (7).png>)

produces

<p><img src="C:/Users/Mike/Pictures/Screenshots/Screenshot%20(7).png" alt="Screenshot (7)" /></p>

Which does not display in a browser.

![Screenshot (7)](<file:C:/Users/Mike/Pictures/Screenshots/Screenshot (7).png>)

does work. I like the pointy bracket syntax vs. URL encoding but it still appears file: is required when drive letters are specified.
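
For anyone generating such links from a script, here is a rough Python sketch that turns a drive-letter path into a file: URL. The function name `windows_path_to_file_url` is hypothetical, and the `file:///C:/...` form it emits is an assumption about the conventional spelling; the shorter `file:C:/...` form above was reported to work as well.

```python
from urllib.parse import quote


def windows_path_to_file_url(path):
    """Turn a drive-letter path such as 'C:/dir/name.png' into a file: URL.

    Hypothetical helper, for illustration only.
    """
    # Normalise backslashes so both spellings of the path are accepted.
    posix = path.replace("\\", "/")
    # Keep '/', the drive colon and parentheses; percent-encode spaces etc.
    return "file:///" + quote(posix, safe="/:()")


print(windows_path_to_file_url(
    "C:/Users/Mike/Pictures/Screenshots/Screenshot (7).png"))
# file:///C:/Users/Mike/Pictures/Screenshots/Screenshot%20(7).png
```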

P.S. I realize this is a corner case since relative paths are the norm.


#8

Why should it not? file://this/is/relative vs file:///this/is/absolute.


#9

No, both of those are absolute urls. If you have a protocol in the url, it’s absolute.

(More generally, a file/http/https url consists of a protocol, a domain, a path, a query, and a hash. You can omit any set of these starting from the beginning; including one means the following ones are automatically included, even if you’re “including” an empty value for them.)
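
One way to see this is to split the two URLs from the previous post with Python's standard `urllib.parse.urlsplit`. This is only an illustration of how a generic URL parser reads them, not a statement about what any particular browser does.

```python
from urllib.parse import urlsplit

for url in ("file://this/is/relative", "file:///this/is/absolute"):
    parts = urlsplit(url)
    # netloc is the "domain" component; path is whatever follows it.
    print(url, "->", repr(parts.netloc), repr(parts.path))

# file://this/is/relative -> 'this' '/is/relative'
# file:///this/is/absolute -> '' '/this/is/absolute'
```

In the first URL, "this" is read as the host rather than as the first path segment, so neither form is relative.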

Windows drive letters are extra weird wrt file urls, which is why you can’t start a link with them. You can definitely use relative urls for any part of the path below the drive letter, though.

As @PhilippeG said, spaces aren’t valid in URLs. Browsers allow them in a lot of circumstances, but CommonMark doesn’t recognize them as part of the url grammar for anything but angle-bracket links, as @jgm says. You can percent-encode the space, as you discovered, or hex-encode it; both will work. Bare percentages should be escaped in a url (as %25, or a hex entity), but again, browsers allow it as long as it doesn’t look like a hex escape (isn’t followed by two hex digits).
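
A short Python sketch of the percent-encoding described above, using the standard `urllib.parse` module. The wrapper name `encode_destination` and the sample file names are invented for the example; leaving parentheses unencoded is a choice made here for readability, not a requirement.

```python
from urllib.parse import quote, unquote


def encode_destination(path):
    """Percent-encode a path for use as a link destination (illustrative)."""
    # Keep '/', ':' and parentheses readable; encode spaces, '%', and so on.
    return quote(path, safe="/:()")


print(encode_destination("Screenshots/Screenshot (1).png"))
# Screenshots/Screenshot%20(1).png

# A bare percent sign in the file name becomes %25.
print(encode_destination("reports/100% done.png"))
# reports/100%25%20done.png

# Decoding reverses the encoding and recovers the original name.
print(unquote("reports/100%25%20done.png"))
# reports/100% done.png
```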


#10

I’m bumping this topic because I just got surprised at the change in the spec in 0.24. Pointy brackets are no longer allowed in urls, and therefore image paths.

http://spec.commonmark.org/0.24/#example-455

And from this discussion: Issues we MUST resolve before 1.0 release [8 remaining]

Inconsistent handling of spaces in links

<http://example.com/hey nice link>

There’s a very well understood way to encode spaces into links, our old pal %20 and spaces in links are some bad mojo anyway that we should not be encouraging. We should not allow spaces in links.

The issue for me is that now it’s much more difficult to add a reference to an image file on disk. It’s very common to have image names that have spaces in them (even macOS names its screenshots “Screen Shot 2017-08-05 at 11.07.11 AM.png”, for example).

And while we could ask the user to replace each space with a ‘%20’, that seems excessive and unfriendly. I thought enclosing the url in pointy brackets was a workable, if obscure, way of solving the problem (I would have preferred to enclose the url in quotation marks, because I think that is more intuitive, but maybe that conflicted with other situations).

So… I’m advocating for pointy brackets to return for urls. I agree percent encoding is better in general, but I also think it’s unfair to force a user to do it themselves when writing a plain text document.


#11

Perhaps your editor should encode those spaces for you? It seems your context is referring to local files on disk, not hyperlinks. That’s a very narrow use case that doesn’t reflect the wider internet.


#12

I agree, if you’re dragging/dropping links/images into a markdown aware editor, it should encode those. But since the whole purpose of Markdown/CommonMark is to be able to write using any text editor, you can’t rely on the editor doing it for you.

I would also argue that the usage of Markdown for documents that are not intended to live on the internet is important, from writing your next book and including image/figures, to journaling, taking business notes, etc.

Again, I’m not crazy about the pointy bracket syntax, but I think it provides a valid solution to those who want to use a plaintext editor and reference a local file.


#13

The iA Writer Figure syntax supports spaces in file names, e.g.

/My image.png

It does this by requiring that each file reference is placed on a new line. That doesn’t solve this problem with the regular image syntax of course, but for easy dragging and dropping, enabling the Figure extension could be a solution for users who don’t want to mess around with encoding or learning special angle bracket rules.


#14

It’s a good point that, when we’re referring to an image file on the file system, using %20 is quite unnatural.

This might be something to reconsider. (Feel free to put up an issue.)


#15

#16

@jgm I’ve opened https://github.com/commonmark/CommonMark/issues/503 regarding this issue.


#17

@chrisalley I’m not actually crazy about the iA Writer syntax in this case. Using the / to start the image name seems ambiguous: is it starting from the root of my filesystem, or some other app-defined root? What about relative paths, such as imgs/some.jpg or ../imgs/some.jpg? Though I may have missed how that is dealt with in iA Writer.


#18

In iA Writer, images are located in the same folder as the document, because hierarchical directory structures are confusing. iA Writer creates a copy of the image in the same folder when the image is dragged into the editor. The way I see it, in advanced cases where the user needs more control over the image’s location, the user could revert back to the original Markdown image syntax or use HTML.


#19

I think it makes sense in your implementation, but that seems very specific to iA Writer. It doesn’t seem that portable, and, at least to me, the leading / was a little confusing.

I’ve just started working with a large corpus of markdown documents. They standardized on having an img directory to store any images. So they reference the image as img/photo.jp. It keeps the images from being mixed in with the list of documents, so it’s much easier to find something when you’re looking on your filesystem or git repo. At least for this use case, I find it works extremely well.

In any case, I would still like to see a consistent way of handling spaces in titles and urls, particularly using quotes, like "some photo.jpg" or "Photo Caption and Title"


#20

With content block syntax, you could embed an image located in a subfolder like this:

/img/photo.jpg

Where the Markdown document containing this syntax is located in the same folder as the img folder. So the root, defined by /, represents the folder that the Markdown document is located in, not the root of the file system.

What you can’t do is reference files higher up in the file tree - only files adjacent or beneath the document in the hierarchy. I like this constraint because it keeps everything related to the document in one folder, making the complete document easily portable by copying the folder. Although I can see how this constraint wouldn’t work for every scenario, it is perhaps useful in enough scenarios to warrant an extension that uses this less complex syntax.
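
The resolution rule described in this post could be sketched roughly as follows in Python. This is not iA Writer's actual implementation; the function name `resolve_content_block` and the example folder `/home/me/book` are assumptions made purely for illustration.

```python
from pathlib import PurePosixPath


def resolve_content_block(document_dir, reference):
    """Resolve a '/'-rooted reference against the document's own folder.

    Hypothetical sketch of the rule described above.
    """
    # The leading '/' means "the document's folder", not the filesystem root.
    relative = PurePosixPath(reference.lstrip("/"))
    if ".." in relative.parts:
        # Files higher up in the tree cannot be referenced.
        raise ValueError("references above the document folder are not allowed")
    return PurePosixPath(document_dir) / relative


print(resolve_content_block("/home/me/book", "/img/photo.jpg"))
# /home/me/book/img/photo.jpg
print(resolve_content_block("/home/me/book", "/My image.png"))
# /home/me/book/My image.png
```

Because each reference sits on its own line, the space in "My image.png" needs no encoding here.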