The full-width issue relates to the list of stop characters for inlines. It's an implementation detail that makes this plugin annoying to build, but my new text post-processor helps a bit (though if you wanted bold or italic ruby text you would be stuck).
Whether you use furigana (the Japanese use case for ruby) or not is really dependent on text type and audience. If you were inclined to use furigana at all, though, 爨 is most definitely a kanji you would use it on, as it’s both hyōgai (non-standard) and very complicated. For example, you’d write [炊爨]【すいさん】 or [爨]【かし】ぐ to make things readable.
I’ve never seen anyone use any formatting within Ruby text, and both Japanese and Chinese traditionally lack both bold and italic, so it most definitely isn’t too important.
I don’t think there’s a reason to explicitly forbid it, though. Since formatting is available in regular CJK text (あいう), it might as well be valid within the base text of a Ruby tag. I can also imagine a scenario where someone would want to emphasize a certain part of pronunciation ("She actually says [寂]【さ**み**】しい in this case"), in which case formatting in the actual Ruby text would be useful. None of those would be particularly common, I don’t think, but it could become a minor thing people would stumble over every now and then.
The one thing that might get weird is ending a formatting block within a Ruby block. For example, *Italic outside[italic base]【italic ruby* normal ruby】. Either formatting could be disallowed within Ruby tags, or this could result in something like:
I suppose it could also introduce a new “context” where new formatting tags can be started but earlier ones can’t be completed… I’m sure you’re more experienced with that kind of thing than I am, though – it’s bound to have come up with other elements before. Stick to what CommonMark usually does, I guess.
In the Discourse context the main reason is that I cannot make this an inline rule; it would have to be a post-process rule that walks through text nodes. Characters such as ]【 are skipped in inlines, so you would get no formatting inside these tags.
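To make the "post-process rule that walks through text nodes" idea concrete, here is a minimal sketch in Python. It is not the actual Discourse plugin: the `(kind, content)` token list, the `RUBY` pattern and both function names are assumptions made up for illustration. It only shows why inline formatting inside the tag ends up as literal text.

```python
import re
from html import escape

# Hypothetical pattern for the [base]【ruby】 syntax discussed in this thread.
RUBY = re.compile(r"\[([^\[\]]+)\]【([^【】]+)】")


def ruby_to_html(text):
    """Replace every [base]【ruby】 in a plain-text run with <ruby> markup."""
    out = []
    pos = 0
    for m in RUBY.finditer(text):
        out.append(escape(text[pos:m.start()]))
        out.append(
            f"<ruby>{escape(m.group(1))}<rt>{escape(m.group(2))}</rt></ruby>"
        )
        pos = m.end()
    out.append(escape(text[pos:]))
    return "".join(out)


def post_process(tokens):
    """Walk (kind, content) pairs and rewrite only the 'text' tokens.

    The token shape is an assumption; a real parser has its own token type.
    """
    result = []
    for kind, content in tokens:
        if kind == "text":
            result.append(("html", ruby_to_html(content)))
        else:
            result.append((kind, content))
    return result
```

Because the rule only ever sees finished text, something like `[寂]【さ**み**】しい` keeps the asterisks literally inside `<rt>`, which is the limitation described above.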
I’m mainly afraid that sites/software/services will implement CommonMark 1.0 and be done with it, with most never migrating to later versions as the first “complete” version is “good enough”. In that case, features like this would be relegated to relative obscurity, which is a terribly sad thought.
【】 is commonly used for titles or categories (e.g. 【速報】(news)) in Japanese text, so I feel 【】 for ruby is a bad idea. I don't know why Japanese StackExchange chose it. I would seriously like to ask StackOverflow's developers about this decision.
Aozora Bunko also has a ruby syntax for plain text: 《》 (description and example). This is literally the biggest example of ruby syntax in use in the world. Originally, 《》 was used by the Books for the Blind Association in Japan, and Aozora Bunko follows it. If you say “let’s use 【】 for ruby”, then I will also say “why not 《》? ;-)”. Aozora Bunko doesn’t adopt Markdown, though.
【】 and 《》 are hard to type on normal keyboards, and even a bit hard with a Japanese IME. Currently all Markdown syntax is composed of single-byte characters that are easy to type on an English keyboard. I hope this de facto standard will be kept.
I believe ruby is not only for Japanese text. It is a form of expression available to text in any language, and someday someone may put it to great use in some other language. I mean we don’t need to tie the syntax to a specific language like Japanese.
Did I stop the discussion?
Sorry if it was offensive.
I only gave the above opinion as one person in Japan.
Of course, it is just my personal opinion, and I’m not attached to it.
I hope that the discussion will resume again.
If we were strictly implementing a friendlier way to mark CJK pronunciations, then any of the syntaxes proposed by @tom-n would work.
Taiwanese Bopomofo also adopts a convention like 國語辭典(ㄍㄨㄛˊ ㄩˇ ㄘˊ ㄉㄧㄢˇ), so a syntax that counts N space-separated components in parentheses and backtracks to apply them to the N preceding kanji characters (\p{Han} in regexp) would be decent. The only problem is the potential complexity this introduces to the parser.
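As a rough illustration of the counting-and-backtracking idea, here is a minimal Python sketch. Python's built-in `re` module does not support `\p{Han}`, so the character class below is an assumption that only covers the basic CJK Unified Ideographs block; the `annotate` name and the fallback behaviour (leave the text untouched when there are more readings than characters) are also my own choices, not part of any proposal.

```python
import re
from html import escape

# Approximation of \p{Han}: the basic CJK Unified Ideographs block only.
HAN = r"\u4e00-\u9fff"
PATTERN = re.compile(rf"([{HAN}]+)\(([^()]+)\)")


def annotate(text):
    """Attach N space-separated readings to the N Han characters before '('."""

    def repl(m):
        base = m.group(1)
        readings = m.group(2).split()
        n = len(readings)
        # Assumed fallback: leave the text as-is if the counts cannot line up.
        if n == 0 or n > len(base):
            return m.group(0)
        # Backtrack: only the last n characters receive a reading.
        head, tail = base[:-n], base[-n:]
        pairs = "".join(
            f"{ch}<rt>{escape(r)}</rt>" for ch, r in zip(tail, readings)
        )
        return f"{head}<ruby>{pairs}</ruby>"

    return PATTERN.sub(repl, text)
```

Even this small version has to decide what to do with mismatched counts and ordinary parentheticals after kanji, which hints at the parser complexity mentioned above.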
On the other hand, we could stick to a language-neutral syntax, given that people are using ruby text more expressively nowadays. Good examples are marking a katakana word with its original-language spelling (アクセラレータ), or supplementing aboriginal place names with their literal meaning (Iranmeylek), neither of which necessarily involves kanji.
Something like {漢字}(かん-じ) might be a good general syntax, and we may suggest a full-width standard alternative that is up to individual implementations to adopt.
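A minimal Python sketch of how {漢字}(かん-じ) could be turned into HTML follows. The meaning of the hyphen is my assumption (one reading per base character when the counts match, otherwise a single reading for the whole base), and `render` is a made-up name; nothing here comes from an existing implementation.

```python
import re
from html import escape

# {base}(reading), with no nesting of braces or parentheses.
PATTERN = re.compile(r"\{([^{}]+)\}\(([^()]+)\)")


def render(text):
    """Convert {base}(reading) to <ruby>; text outside matches is left as-is."""

    def repl(m):
        base, reading = m.group(1), m.group(2)
        parts = reading.split("-")
        if len(parts) > 1 and len(parts) == len(base):
            # Assumed rule: hyphens split the reading per base character.
            body = "".join(
                f"{escape(ch)}<rt>{escape(p)}</rt>" for ch, p in zip(base, parts)
            )
        else:
            body = f"{escape(base)}<rt>{escape(reading)}</rt>"
        return f"<ruby>{body}</ruby>"

    return PATTERN.sub(repl, text)
```

The same function handles non-kanji bases such as `{アクセラレータ}(accelerator)`, which is the language-neutral case argued for above.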
I hope that we can make the syntax for ruby annotation as semantic and general as possible.
Actually, I’m contemplating using ruby annotation for annotating classic texts, to give a better reading experience where there are many inline annotations or commentary. For example,
1 From Paul, an ·apostle [messenger] of Christ Jesus. ·I am an apostle because that is what God
wanted [L …by the will of God].
To ·God’s holy people [T the saints] living in Ephesus[a] [C a prominent city in the Roman province
of Asia, present-day western Turkey; Acts 19], ·believers in [or who are faithful to] Christ Jesus:
I’m looking for a Markdown implementation that can convert the bracketed annotations in the above text into ruby annotations, similar to the examples above.
I think that for my interest, the syntax {漢字}(かん-じ) would be great!
Please point me to any potential leads. Thanks in advance!
An interesting feature of the HTML markup is the <rp> tag, designed to add parentheses or other fallback rendering for old browsers that don’t understand the ruby tags.
In a similar spirit, whatever syntax extension you invent for Markdown, many Markdown converters will not understand it and will likely dump the syntax as-is to the output. Is that a consideration? Which, if any, of these syntaxes are acceptable fallbacks? For example, the caret in syntaxes like “[図書館]^(としょかん)”, while it has a nice geeky meaning, probably looks silly if dumped as-is to the output?
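For reference, this is roughly what the <rp> fallback buys you on the HTML side. The Python sketch below is only an illustration: `ruby_html` and `strip_tags` are made-up helper names, and `strip_tags` is a crude stand-in for a renderer that ignores tags it does not know and keeps their text.

```python
import re
from html import escape


def ruby_html(base, reading):
    """Build a <ruby> element with <rp> parentheses as fallback."""
    return (
        f"<ruby>{escape(base)}"
        f"<rp>(</rp><rt>{escape(reading)}</rt><rp>)</rp></ruby>"
    )


def strip_tags(html):
    """Crude stand-in for a renderer that drops unknown tags but keeps text."""
    return re.sub(r"<[^>]+>", "", html)


markup = ruby_html("図書館", "としょかん")
print(markup)
print(strip_tags(markup))  # the text a ruby-unaware renderer would show
```

A Markdown syntax that already looks like base(reading) when left unconverted would degrade the same way as the stripped output here.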
I published a ruby library that supports ruby element in Markdown.
The syntax I chose is
[漢字(かんじ)]
Reasons being:
It looks like Markdown
You do not need to type any special brackets such as 【】
You can annotate each character
Linking ruby text works the same way as a normal Markdown link
An example
[漢字(かんじ)](https://jisho.org/search/漢字)
# Annotate each character
[漢(かん)][字(じ)]
# Link separately
[漢(かん)](https://jisho.org/search/漢)[字(じ)](https://jisho.org/search/字)
# Link together
[[漢(かん)][字(じ)]](https://jisho.org/search/漢字)
You can see how the above renders here. I have used this to write 150 posts that involve ruby markup, and it works fine.
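For readers who want to see how the [漢字(かんじ)] forms in this post could map to HTML, here is a minimal Python sketch. It is not the published library's code and may differ from its actual output; the regular expressions and the `render` name are assumptions for illustration, and only the four forms shown in the post are handled.

```python
import re
from html import escape

# One ruby unit: [base(reading)], with no nested brackets or parentheses.
RUBY = r"\[[^\[\]()]+\([^()]+\)\]"
RUBY_PARTS = re.compile(r"\[([^\[\]()]+)\(([^()]+)\)\]")
# Either a bracketed group of units followed by (url), or a single unit
# optionally followed by (url).
TOKEN = re.compile(
    rf"\[((?:{RUBY})+)\]\(([^()\s]+)\)|({RUBY})(?:\(([^()\s]+)\))?"
)


def _ruby(src):
    """Turn each [base(reading)] unit in src into a <ruby> element."""
    return RUBY_PARTS.sub(
        lambda m: f"<ruby>{escape(m.group(1))}<rt>{escape(m.group(2))}</rt></ruby>",
        src,
    )


def render(text):
    def repl(m):
        inner = m.group(1) or m.group(3)
        url = m.group(2) or m.group(4)
        html = _ruby(inner)
        return f'<a href="{escape(url)}">{html}</a>' if url else html

    return TOKEN.sub(repl, text)
```

An ordinary link such as `[text](url)` does not match either pattern, so it would be left for the normal Markdown link rule.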
Curious what @jgm thinks here. This issue is very foreign to me as a non-Japanese speaker, but it feels like it would be incredibly common for the 120 or so million people who would find this very handy.