Detox Content Production
Recently, in a Knowledge Manager Community meeting, someone raised the question “How can I convert Word files to Markdown without going crazy?” A lively debate followed, and everybody had something to report about nested tables, unconvertible drawings, and the like. With the rise of OneNote, things got even worse. Once again, Microsoft successfully applied its old strategy of sugar-coating proprietary offerings (functionally often not the best) so that corporate decision makers are tempted to grab and buy without thinking about the consequences.
Coming from a technical background with the most complex document production scenario of all, software code creation, I always feel pain when I see these corporate knowledge debris fields. How many people are busy converting Word to Excel, then to PowerPoint, and back? How many silos exist where there is no way of recycling written content or knowledge pieces except by copying, pasting, cleaning up the mess and starting anew? How often have you cursed the document with nice tables that didn’t adapt to the screen size of your mobile? How much training budget have you spent teaching people thousands of Office functions they will most likely never need and most likely abuse heavily? How many IT projects have you launched to do “Text Mining” in old document dump sites on the central file server?
The quick ones notice that something is wrong and start looking for alternatives. And so Markdown begins to leave the shadows of the nerds’ netherworld, creeping through grassroots initiatives into business environments, and suddenly you have to convert Word to Markdown because it has become a strange but vital part of your text production pipeline.
Trying to convert complexity into simple pieces doesn’t scale, and you always lose something on the way. Or even worse, someone invents a tool that adds special Markdown extensions, producing unreadable pseudo-Markdown, in order to sell a post-processor that regenerates the original look and feel of the office document. If you are on that track, let me tell you: you got it all wrong.
Say “Mark-Down” slowly and think about it. What does that mean, anyway? The inventor chose the name deliberately to make a point against the complex mark-up languages that define current documents. HTML, XML, and the like are text-based languages too. They can all manage semi-structured content, but they are very costly to learn, to manage and to process. Markdown, by contrast, is the result of thoughtful reflection on what is really needed (at least, following Pareto, in the usual 80% of cases).
Several threads of thought are worth following here:
- Unstructured data is still data, not pixel garbage.
- Be prepared to manage the content stream, not so much the artifact.
- Who needs extensive content decoration anyway?
This post was written spontaneously in a free slot between two video meetings. I will follow up on these threads in separate articles. Stay tuned :)