``
title:                  IDK Markup goals
last_modification_date: 2024/10/13
creation_date:          2024/10/13
summary:                Since the dawn of computing around 1970, text markups have emerged naturally, yet today only a few of them are widely used. Because computing is still a young discipline, these markups are not perfect and lack some features. In the coming years we will surely find better ones that replace them entirely; IDK is an attempt to be one of them.
output_path:            "/html/idk/idk_goals.html"
category:               "idk;"
``

Why another markup?
  As someone who makes notes all the time, I kept trying to annotate PDF books while reading them.
  We already have several markups out there that are used very heavily; Markdown and XML are the most popular ones. XML is data oriented but is inefficient in several ways.
  Markdown is a presentational markup, having a lot of what I would want, but it is not a semantic one. Its primary limitation is that it is tied to the HTML markup, which is far from perfect and now (with HTML5) serves other purposes than representing text. If you check "https://spec.commonmark.org/"(Commonmark specs), which is a better specification, it does exactly the same. Unlinking HTML from the markup removes a lot of constraints and permits better data representation and links inside a text.
  The goal of the IDK markup is to have the best of both worlds: a markup that balances between a plain text format and data representation.
  
How it started
  Being someone who writes up everything I find interesting on my smartphone, computer, or paper, I really need to be able to search for specific notes when I am working or reading on a subject. Sometimes a note is written directly by myself, but sometimes it comes from a source I am reading. In that case I need an unobtrusive way to quickly pick the text and move on, to avoid cutting the reading flow as much as possible.
  
  Text extraction software is clunky or slow, or both
    I have tried several programs for that.
      Digital book data extraction:
        a. __Web Browsers__
         . __SumatraPDF__
         . __Calibre__
         . __Adobe Acrobat__
         . __Foxit__
        some more ...
        
      None of them supports text highlighting properly: the selection is clunky, the data extraction is not well designed, or they lack it completely. For example, Microsoft Edge dropped these features recently. They fail to provide them because these programs are built for a different use, which made me think that a specialized one is needed (but not present on the market).
        
      Note taking:
        a. __Obsidian__
         . __Logseq__
      They use a subset of Markdown, which is a great markup for adding semantics to a text, but it lacks some features I would want.
      It is not as readable as it could be for my taste. Because these applications provide visuals and a user experience on top, it is OK to use Markdown for that, but Markdown is a very bad markup for data representation. They work around that by leveraging the software, but it still means that the whole application is built around the markup, which makes it hard to implement innovative features.
      A highly wanted feature would be the ability to highlight a chunk of text from a digital file and add details to it, which we could link to other notes and make searchable; they don't have it. It means that if we want to add text taken from a source, we must write all the details ourselves. That wouldn't be the end of the world if we didn't lose a very important piece of data: its _context_.
      Any written text has a specific context: its date of creation telling us the state of the world when it was written, the whole text in which the chunk appeared, and much more.
      The data is somewhat there, though: when we read an interesting text, some pieces of it are more important than others, and if we only extract those and leave the rest unsaved, a lot of information is lost. Maybe it makes sense on the day we write it, but read the text back in a few years and the context is gone completely. It is not rare to read something you've written and have no clue why you wrote it. This happens a lot with quick notes, but still, being able to dive back deep enough into the subject is sometimes key.
           
    They don't fit
      We can see that they are not the tools for my use case, which means that if I don't create one myself, there is little chance I will ever find one that fits. Because this is very important to me and I can't live without it, I need to build one.
    
    Why?
      If you want a piece of software that fits your needs exactly, sometimes the only way to get it is to build it yourself.
    
  How to fix it:
    1. Seamless UX.
     . Performant.
     
What would we want as a text markup?
  1. Semantic
  2. Presentational
  3. Programmable
  4. Automatic data extraction
  5. Meta capacity
  
Data oriented
  The markup must provide ways to search for and mark text as data: numbers, tokens, plain text. All of them should have ways to encode fine details.
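  To make this concrete, here is a minimal sketch of what automatic data extraction could look like. The `{kind: value}` inline annotation used below is purely hypothetical and is not the actual IDK syntax:

```python
import re

# Hypothetical inline annotation syntax (NOT the actual IDK syntax):
# {num: 42}, {token: idk}, {text: hello world}
ANNOTATION = re.compile(r"\{(num|token|text):\s*([^}]*)\}")

def extract_data(source: str) -> list[tuple[str, object]]:
    """Collect every annotated value, converting numbers to float."""
    out = []
    for kind, raw in ANNOTATION.findall(source):
        value = float(raw) if kind == "num" else raw.strip()
        out.append((kind, value))
    return out

sample = "The ship left in {num: 1492} carrying {token: spices}."
print(extract_data(sample))  # [('num', 1492.0), ('token', 'spices')]
```

  Once values are typed this way, the same marked text stays readable as prose while remaining searchable as data.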
  
Simple writing
  The markup should be usable on the first try, and added features should not break that. If someone new to it has a hard time understanding why the way they wrote some text fails, or gets lots of errors while writing simple text, there is a high probability that the markup should change.
  
While allowing complex ones
  It should make it possible to encode complex writing and relations between parts of texts.

A breeze to read
  It should keep the basic writing features very easy to read; for comparison, it should be a bit more readable than the Markdown file format.

Good enough error/warning logs  
  Having very detailed logs is not the goal yet, but the parser must catch errors with a precise location and avoid Java-like stack traces. Parsing should recover from as many errors as it can, helping writers not think too much about their writing.
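  As a sketch of what "precise location, no stack trace" could mean in practice, here is a hypothetical diagnostic record; the names and the message are illustrative, not IDK's actual API:

```python
from dataclasses import dataclass

# Hypothetical diagnostic record; names are illustrative, not IDK's actual API.
@dataclass
class Diagnostic:
    severity: str   # "error" or "warning"
    line: int       # 1-based line number
    column: int     # 1-based column number
    message: str

    def render(self, filename: str) -> str:
        # One compact line per problem, readable at a glance.
        return f"{filename}:{self.line}:{self.column}: {self.severity}: {self.message}"

d = Diagnostic("error", 12, 7, "unclosed emphasis marker '__'")
print(d.render("notes.idk"))
# notes.idk:12:7: error: unclosed emphasis marker '__'
```

  The `file:line:column` shape also lets code editors jump straight to the offending spot.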
  
Having two faces
  It should be a tool with two personalities: one side permits very easy writing, the other is highly capable of transforming the text and representing its meaning.
    
Highly performant
  No compromise should be made on performance: the parser should be able to parse hundreds of files and check their links in a very short time. The purpose is to be able to use it for computation; as a bonus, it will allow a responsive UX.

Simple to reproduce
  One thing holds for every file format's adoption: it must be easy to implement. There are several ways to achieve that: provide tools that are easy to use and integrate, and be open. The markup should do all of them.

Roadmap
  1. Experimenting with the actual markup design.
  2. Fix the current specification (bugs, imperfect design).
  3. Add new features and specs to the markup.
  4. Finish the HTML/javascript conversion (IDK to HTML and HTML to IDK).
  5. Create tools for code editor integration.
  6. Create standalone software with a UI, which will use this markup with enhanced writing and permit data visualization (graphs of data, saving chunks of text from PDFs and giving them details, and more).
  
Having metawriting
  Making this possible will permit writers to alter the markup for their own needs. The way I see this is to have a standard set of rules for the markup, then permit anyone to add features that never break the standard. Maybe a file that specifies additional markup rules. Never allowing the standard itself to be modified is very important, though.
  
The majority of features are available everywhere
  The majority of IDK's possibilities are available everywhere it makes sense. That is, a table cell must have formatting capacity, footnote references, and more. So keep in mind that if you see a feature listed below, there is a high probability it is available for your exotic use case.

HTML conversion
    Be compliant with the standard
      IDK must produce HTML that is compliant with the spec; the output is checked regularly to verify that it is.
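      One cheap piece of that regular check could be a tag-balance smoke test on the generated HTML, sketched below with Python's standard `html.parser`. This is my own illustration, not IDK's actual test suite, and it is far from a full conformance validator:

```python
from html.parser import HTMLParser

# Void elements never take a closing tag in HTML.
VOID = {"br", "hr", "img", "meta", "link", "input"}

class TagBalanceChecker(HTMLParser):
    """Rough smoke test: every opened tag must be closed, in order.
    NOT a conformance check; a real validator is still needed."""
    def __init__(self):
        super().__init__()
        self.stack, self.errors = [], []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if not self.stack or self.stack.pop() != tag:
            self.errors.append(f"unexpected </{tag}>")

def check(html: str) -> list[str]:
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    return checker.errors + [f"unclosed <{t}>" for t in checker.stack]

print(check("<p>ok</p>"))        # []
print(check("<p><em>oops</p>"))  # ['unexpected </p>', 'unclosed <p>']
```

      Catching a mismatched tag this way is fast enough to run on every conversion; full spec compliance still needs a dedicated validator.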