#+TITLE: Literate Programming in Org
#+AUTHOR: Howard Abrams
#+EMAIL: howard.abrams@gmail.com
#+DATE: 2016 Oct 06
#+DESCRIPTION: This file is used as the basis for the Info documentation
#+OPTIONS: ':t toc:t author:t email:t
#+LANGUAGE: en
#+MACRO: version 1.0
#+MACRO: updated last updated 1 August 2024
#+TEXINFO_FILENAME: lp-in-org.info
#+TEXINFO_HEADER: @syncodeindex pg cp
#+TEXINFO_HEADER: @syncodeindex vr cp
# NOTE: To create/view info, execute the following with `C-x C-e':
# (progn (find-file (org-texinfo-export-to-info)) (Info-mode) (Info-top-node))
This is a book on /Literate Programming in Emacs/ using Org Mode.
Use literate programming as a /style/ to aid in discovery, exploration and clarity of code.
#+begin_quote
Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.
—Donald Knuth in "Literate Programming", The Computer Journal 27 (1984), p. 97. (Reprinted in /Literate Programming/, 1992, p. 99.)
#+end_quote
* Introduction
In a computer program (no matter what the computer language), we write code, as a first class citizen, without ornamentation, but we /comment/ the code with some sort of marker, e.g. a pair of symbols to signify the start and end, like =/*= and =*/=, or a single symbol, like =#= or =//=, to highlight the rest of the line.
Literate programming is a style of coding that inverts this paradigm: what would normally be the comments becomes the focus, and the code is the ornamentation. When Donald Knuth originally proposed the idea in 1984, text editing was still in its infancy, and writing LP was clunky. However, with modern editors like Emacs (can I really claim, with a straight face, that Emacs is modern?), literate programming in Org files can be smooth.
We assume the reader of this book is fairly proficient with [[info:Emacs][Emacs]] keybindings and has at least a passing familiarity with [[info:Org][editing Org Mode files]], but we don't assume you've grokked the [[info:org#Working with Source Code][literate programming features]] of Org.
As you probably know, Org is large, and the features for writing, evaluating and connecting blocks of source code in a document are extensive, so documenting them all is a daunting task. This book attempts to both guide and inspire a programmer to enjoy coding in a /literate way/.
** Background
Donald Knuth invented Literate Programming in the 1980s in an attempt to emphasize communication.
Playing with the idea that a "program" shouldn't be only computer instructions, but more like /literature/, he called his approach [[http://en.wikipedia.org/wiki/Literate_programming][literate programming]].
In his 1984 essay "Literate Programming", republished in CSLI, 1992, pg. 99, he [[http://www.literateprogramming.com][wrote]]:
#+begin_quote
I believe that the time is ripe for significantly better documentation of programs, and that we can best achieve this by considering programs to be works of literature. Hence, my title: "Literate Programming."
The practitioner of literate programming can be regarded as an essayist, whose main concern is with exposition and excellence of style. Such an author, with thesaurus in hand, chooses the names of variables carefully and explains what each variable means. He or she strives for a program that is comprehensible because its concepts have been introduced in an order that is best for human understanding, using a mixture of formal and informal methods that reinforce each other.
#+end_quote
Wanting programs to be written for human understanding, with the order based on the logic of the problem and not constrained by deficiencies in the programming language, we create a /literate programming document/ that generates a document for people *and* the source code files.
The idea is to invert /code/ peppered with /comments/ to /prose/ interjected with /code/.
Originally, a pre-processing program would then write the code blocks out into a source code file (called /tangling/) and create a published document of both the prose and the code formatted for reading (called /weaving/).
What happened to his concept, and why don't we program this way?
After introducing the concept in a white paper, he expanded the idea by publishing an example of how the source code would be written in [[http://en.wikipedia.org/wiki/Jon_Bentley][Jon Bentley]]'s “Programming Pearls” column [Communications of the ACM 29, 5 (May 1986), 364-369].
[[http://en.wikipedia.org/wiki/Douglas_McIlroy][Doug McIlroy]] added a rebuttal where he boiled Knuth's code into a single (now famous) shell command:
#+begin_src bash
tr -cs A-Za-z '\n' |
tr A-Z a-z |
sort |
uniq -c |
sort -rn |
sed ${1}q
#+end_src
McIlroy invented the shell pipe as well as many of those command line tools. He's quoted as saying:
#+begin_quote
A wise engineering solution would produce—or better, exploit—reusable parts.
#+end_quote
His example proved his point.
Perhaps this process was a bit too much writing for most engineers, who view code comments as unnecessary, over-sized baggage requiring maintenance.
Isn't our goal to write /readable code/?
While the resulting source code /tangled/ from a literate programming document may look the same as a source file coded directly, this idea did not significantly change our industry.
Some projects that can extract API documentation from the comments of the source code could be viewed as a /step/ toward literate programming:
- [[http://www.oracle.com/technetwork/java/javase/documentation/index-jsp-135444.html][Javadoc]] for Java
- [[https://www.sphinx-doc.org][Sphinx]] for Python
- [[http://www.stack.nl/~dimitri/doxygen/][Doxygen]] for C and other languages
[[https://wiki.haskell.org/Literate_programming][Haskell]] has a partial implementation built into the compiler so that it doesn't require a special comment syntax or an external macro system.
In most of the systems listed above, the code, not the logic, drives the presentation order. For instance, many languages require imports, variable definitions and functions to be declared before use, and one can't imagine literature beginning in such a way. Knuth's original "WEB" program allowed a code block to refer to (include) another code block, allowing the author to describe the code in whatever order made the most sense. This ended the debate about top-down vs. bottom-up.
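Org offers a comparable mechanism through /noweb/ references. As a minimal sketch (the block name =greeting= and the file name =greet.el= are made up for illustration), the first block below presents the main idea and pulls in a helper that is only defined later in the document:
#+begin_example
,#+begin_src emacs-lisp :noweb yes :tangle greet.el
  <<greeting>>
  (greet "World")
,#+end_src

,#+name: greeting
,#+begin_src emacs-lisp
  (defun greet (name)
    "Return a greeting for NAME."
    (format "Hello, %s!" name))
,#+end_src
#+end_example
With the =:noweb yes= header argument, Org replaces =<<greeting>>= with the body of the named block when evaluating or tangling, so ~C-c C-v t~ (=org-babel-tangle=) should write both pieces to =greet.el= in the order the language needs, while the prose keeps the order the reader needs.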
Knuth's original /literate programming/ approach had minimal editor support, as he only wrote the WEB program as a pre-processor to create (/weave/) the documentation and write (/tangle/) the source code.
From my perspective, literate programming can only be useful with help from an editor. For example, many scientists use [[http://ipython.org/notebook.html][iPython's notebook]], now expanded as the [[https://jupyter.org/][Jupyter Project]]. However, unlike iPython's storage of the files in JSON format, I think a literate file should be readable text. As [[http://transcriptvids.com/v/oJTwQvgfgMM.html][Carsten Dominik]], the creator of Org, put it:
#+begin_quote
“In the third millennium, does it still make sense to work with text files? Text files are the only truly portable format for files. The data will never get lost.”
#+end_quote
An Org file, with its readable syntax and amazing support from Emacs, gives a programmer a good environment to discover, explore and clarify complex code in a literate way.
*Further Reading:*
- [[http://orgmode.org/worg/org-contrib/babel/how-to-use-Org-Babel-for-R.html][Introduction to org-mode's Babel Project]] for teaching Emacs to do Literate Programming
- [[http://orgmode.org/worg/org-contrib/babel/intro.html][Reference material for the Babel Project]]
- [[http://www.leancrew.com/all-this/2011/12/more-shell-less-egg/][More Shell, Less Egg]] is a good historical essay on this subject
- [[http://www.hazyblue.me/2014/02/where-have-all-the-literate-programs-gone/][Where have all the Literate Programmers gone?]]
** Advantages of LP
Some of the advantages of literate programming for your code include:
- Clarification of your thoughts about complicated situations
- Better documentation for your source code
- Great for team communication for issues and problems
- Inter-language facility for using the /best tool for the job/ (for instance, querying a database and then manipulating it with a general purpose language)
The advantages of literate programming in Org are the advantages of Org itself:
- Text formatting, like emphasized text and lists
- Org's /organizational/ features, like embedded heading sections marking subtrees
- Task management, like Agendas, embedded /with your code/
- Note-oriented REPL for investigating new libraries and APIs
I made this last point as part of my essays on [[https://howardism.org/Technical/Emacs/literate-devops.html][Literate Devops]]. Briefly, REPLs can be a wonderful approach to discovering features of libraries and modules, as one types expressions, and sees the results. You can view a shell running in a terminal as a REPL.
A problem arises when the programmer needs to return to the results of past commands and expressions in the transient environment of a terminal.
With LP in Org, you can still type and evaluate an expression, but Emacs embeds the output (the P in REPL) back into your file buffer. As an added bonus, you can /name/ the results, and use them as input variables to other blocks of code (and these code blocks can be written in a different computer language).
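As a small sketch of that idea (the name =radius= and the variable =r= are hypothetical, and both blocks use Emacs Lisp only to keep the example self-contained), naming a block labels its results, and a later block can take that value as input through the =:var= header argument:
#+begin_example
,#+name: radius
,#+begin_src emacs-lisp
  (+ 1 2)
,#+end_src

,#+RESULTS: radius
: 3

,#+begin_src emacs-lisp :var r=radius
  (* 2 r)
,#+end_src

,#+RESULTS:
: 6
#+end_example
The second block never mentions how =radius= was computed, so the first block could just as well be written in another language that Org knows how to evaluate.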
However, if you are reading this book, you probably see the advantages, so let's begin a short journey to master this tool yourself.
* Getting Started
Since Emacs comes with Org, and Org comes with the ability to write literate programs, if you have a running Emacs instance, you can begin your journey by opening a file with an extension of =.org= (or any text file with =org-mode= enabled). This guide assumes basic familiarity with both Org and Emacs.
Since Emacs comes with Lisp, this *Getting Started* guide will use that language for our examples. In subsequent chapters, we will describe how to use different languages.
** Create a File
Create or open an Org file, and type the following:
#+begin_example
,#+begin_src emacs-lisp
"Hello World"
,#+end_src
#+end_example
Next, type ~C-c C-c~ (Control-c twice) and Emacs asks if you want to evaluate this code. To see the /results/ of evaluating that expression inserted back into your buffer after the marker, =RESULTS=, type =yes=.
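For reference, here is roughly what the buffer should look like afterwards, assuming a default Org configuration (the exact indentation may differ with your settings):
#+begin_example
,#+begin_src emacs-lisp
  "Hello World"
,#+end_src

,#+RESULTS:
: Hello World
#+end_example
The line beginning with a colon is the value of the expression. Evaluating the block again replaces that result instead of appending another one.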
While a classic, that is not a very good example. Let's try again with the following code block:
#+begin_example
,#+begin_src emacs-lisp
(truncate (* (sin .438) 100))
,#+end_src
#+end_example
Now type ~C-c C-c~ again. Notice the answer to the Great Question of Life, the Universe, and Everything appears as the =RESULTS= of evaluating your amazing Lisp code.
That, my friend, is the beginning of your adventure.
A few points. First, you typed a lot of stuff to see a number or string. We'll start to file away such roughness in your workflow. This book contains a lot of tips, and you'll see that programming literately can be just as fast as regular programming.
Second, the part =emacs-lisp= is the language or subsystem to call for evaluation of a code block. We'll show how you can use your favorite language, or even systems to generate images, call web services, and update tables in a database.
** Creating src Blocks Quickly
You don't have to type the entire text for src blocks, as Org comes with [[info:org#Structure Templates][Structure Templates]]. Type ~C-c C-,~ and a buffer appears allowing you to type ~s~ to have the bulk of the =src= code block inserted into your buffer.
Another approach is to use =org-tempo=, a template expansion feature. To kick-start this feature, press ~M-:~ (that is, ~M-S-;~, which runs =eval-expression=) and type the following:
#+begin_src emacs-lisp
(require 'org-tempo)
#+end_src
Note: If you are reading this in an Emacs buffer, you can also place your cursor at the end of that parenthesized s-expression and type ~C-x C-e~ to evaluate it.
At this point, you can begin a line with =<s= and hit ~TAB~ to have a src block expanded, with the cursor left at the end of the first line, allowing you to type =emacs-lisp=.
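If you would rather not repeat these steps in every session, something like the following in your init file may help. Treat it as a sketch: the =el= key is a hypothetical shortcut of our choosing, and it assumes a version of Org recent enough to ship =org-tempo= (9.2 or later).
#+begin_src emacs-lisp
  (with-eval-after-load 'org
    ;; Hypothetical shortcut: <el followed by TAB inserts an emacs-lisp block,
    ;; so the language no longer needs to be typed by hand.
    (add-to-list 'org-structure-template-alist '("el" . "src emacs-lisp"))
    ;; Load the <s TAB style of expansion whenever Org itself loads.
    (require 'org-tempo))
#+end_src
Since the new entry lives in =org-structure-template-alist=, it should also show up in the ~C-c C-,~ menu.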
This is Emacs, so you probably have your favorite template expansion system, like [[https://elpa.gnu.org/packages/doc/tempel/tempel.html][TempEL]] or [[https://www.emacswiki.org/emacs/Yasnippet][Yasnippet]], and any system that can generate your text works fine. The magic isn't hidden in markers, but shines plainly in the text itself.
* Working with Python
* Calling out to the Shell
Can we do Bash, Fish and PowerShell?
* Creating Illustrations
** Graphviz
** PlantUML
** Pikchr