Remove/reverse Hard wrapping in MD output

#1

Does anyone know of a way to remove hard wrapping within paragraphs in an MD file?
Background: we receive English MD files to translate into Japanese. The creators of the MD files use various authoring tools, some of which add line breaks within sentences at certain line lengths. Some people can turn this off in the tools they use and others can’t, and we don’t really have control over everyone creating the content.
We are trying to work out how to reverse/remove this hard wrapping within paragraphs in the MD files. We could do it manually, but it would take far too long. We have tried different regexes, but they become too complex when we try not to remove the line breaks that need to stay.
The closest I have come is opening the attached MD in Typora and saving it as HTML (without styles), opening the HTML in a browser (the browser seems to parse the file correctly and ignore the line breaks), and then pasting the HTML back into a new Typora page. However, the table from the attached file gets messed up, and Typora outputs a different flavour of MD.

Can anyone think of any other creative ideas to help me remove all these unnecessary line breaks in the MD files? There may be a very simple solution that we just don’t know about as I am not an MD

Sample file
https://drive.google.com/file/d/1-mHxzyidcXsQ0wzOaKHKkM88hOf4SXv9/view?usp=sharing

#2

If you don’t mind having incidental changes
introduced to the markdown (e.g. different
bullet list markers, changing link styles,
header styles), you could try

pandoc -f gfm -t gfm --wrap=none

assuming what you want is CommonMark with GitHub
extensions.

I’m working on a Haskell commonmark parsing library
that will be usable for creating linters and
reformatters, but not there yet.

#3

Thanks jgm,
Your answer sounds good but has gone straight over my head as I am a complete beginner with no MD experience. I am just trying to solve a translation problem so I will need to get help somehow with what you just answered to see what the output would look like.

#4

  1. Get pandoc: https://pandoc.org/installing.html
  2. Orient yourself with the “getting started” instructions
  3. When you’re comfortable, try the command I suggested above. For example, if your markdown file is file.md, you could do
    pandoc -f gfm -t gfm --wrap=none file.md -o newfile.md
    
    in the terminal. This says “convert gfm to gfm, with unwrapped lines.” It will create a new file without wrapped lines. However, as I noted, there may be other differences too, perhaps more than you want. It’s something you can try.
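
Since the original question involves many incoming files rather than one, the same command could be applied to a whole folder. The following is only a minimal sketch, assuming pandoc is installed and on the PATH; the function names `pandoc_command` and `unwrap_tree` and the folder names in the usage note are hypothetical, not part of pandoc. It uses only the flags already shown above.

```python
import subprocess
from pathlib import Path


def pandoc_command(src: Path, dst: Path) -> list[str]:
    """Build the gfm-to-gfm, no-wrap pandoc call from the post above for one file."""
    return ["pandoc", "-f", "gfm", "-t", "gfm", "--wrap=none", str(src), "-o", str(dst)]


def unwrap_tree(in_dir: Path, out_dir: Path, dry_run: bool = False) -> list[list[str]]:
    """Run pandoc on every .md file under in_dir, mirroring the layout in out_dir.

    Assumption: out_dir lives outside in_dir, so converted files are never
    picked up as input on a later run. With dry_run=True nothing is executed
    or written; the commands are only returned for inspection.
    """
    commands = []
    for src in sorted(in_dir.rglob("*.md")):
        dst = out_dir / src.relative_to(in_dir)
        cmd = pandoc_command(src, dst)
        commands.append(cmd)
        if not dry_run:
            # Create the matching subfolder, then stop on the first pandoc error.
            dst.parent.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return commands
```

For example, `unwrap_tree(Path("english_md"), Path("english_md_unwrapped"))` would leave the originals untouched and write the unwrapped copies to a separate folder, which makes it easier to compare the two and spot the incidental changes mentioned earlier.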