From Stephen Wolfram, a nice phrase to describe the sorts of thing you can create using tools like Jupyter notebooks, Rmd and Mathematica notebooks: computational essays that complements the “computational narrative” phrase that is also used to describe such documents.
Wolfram’s recent blog post What Is a Computational Essay?, part essay, part computational essay, is primarily a pitch for using Mathematica notebooks and the Wolfram Language. (The Wolfram Language provides computational support plus access to a “fact engine” database that ca be used to pull factual information into the coding environment.)
But it also describes nicely some of the generic features of other “generative document” media (Jupyter notebooks, Rmd/knitr) and how to start using them.
There are basically three kinds of things [in a computational essay]. First, ordinary text (here in English). Second, computer input. And third, computer output. And the crucial point is that these three kinds of these all work together to express what’s being communicated.
In Mathematica, the view is something like this:
In Jupyter notebooks:
In its raw form, an RStudio Rmd document source looks something like this:
A computational essay is in effect an intellectual story told through a collaboration between a human author and a computer. …
The ordinary text gives context and motivation. The computer input gives a precise specification of what’s being talked about. And then the computer output delivers facts and results, often in graphical form. It’s a powerful form of exposition that combines computational thinking on the part of the human author with computational knowledge and computational processing from the computer.
When we originally drafted the OU/FutureLearn course Learn to Code for Data Analysis (also available on OpenLearn), we wrote the explanatory text – delivered as HTML but including static code fragments and code outputs – as a notebook, and then ‘ran” the notebook to generate static HTML (or markdown) that provided the static course content. These notebooks were complemented by actual notebooks that students could work with interactively themselves.
(Actually, we prototyped authoring both the static text, and the elements to be used in the student notebooks, in a single document, from which the static HTML and “live” notebook documents could be generated: Authoring Multiple Docs from a Single IPython Notebook. )
Whilst the notion of the computational essay as a form is really powerful, I think the added distinction between between generative and generated documents is also useful. For example, a raw Rmd document of Jupyter notebook is a generative document that can be used to create a document containing text, code, and the output generated from executing the code. A generated document is an HTML, Word, or PDF export from an executed generative document.
Note that the generating code can be omitted from the generated output document, leaving just the text and code generated outputs. Code cells can also be collapsed so the code itself is hidden from view but still available for inspection at any time:
Notebooks also allow “reverse closing” of cells—allowing an output cell to be immediately visible, even though the input cell that generated it is initially closed. This kind of hiding of code should generally be avoided in the body of a computational essay, but it’s sometimes useful at the beginning or end of an essay, either to give an indication of what’s coming, or to include something more advanced where you don’t want to go through in detail how it’s made.
Even if notebooks are not used interactively, they can be used to create correct static texts where outputs that are supposed to relate to some fragment of code in the main text actually do so because they are created by the code, rather than being cut and pasted from some other environment.
However, making the generative – as well as generated – documents available means readers can learn by doing, as well as reading:
One feature of the Wolfram Language is that—like with human languages—it’s typically easier to read than to write. And that means that a good way for people to learn what they need to be able to write computational essays is for them first to read a bunch of essays. Perhaps then they can start to modify those essays. Or they can start creating “notes essays”, based on code generated in livecoding or other classroom sessions.
In terms of our own learnings to date about how to use notebooks most effectively as part of a teaching communication (i.e. as learning materials), Wolfram seems to have come to many similar conclusions. For example, try to limit the amount of code in any particular code cell:
In a typical computational essay, each piece of input will usually be quite short (often not more than a line or two). But the point is that such input can communicate a high-level computational thought, in a form that can readily be understood both by the computer and by a human reading the essay.
So what can go wrong? Well, like English prose, can be unnecessarily complicated, and hard to understand. In a good computational essay, both the ordinary text, and the code, should be as simple and clean as possible. I try to enforce this for myself by saying that each piece of input should be at most one or perhaps two lines long—and that the caption for the input should always be just one line long. If I’m trying to do something where the core of it (perhaps excluding things like display options) takes more than a line of code, then I break it up, explaining each line separately.
It can also be useful to “preview” the output of a particular operation that populates a variable for use in the following expression to help the reader understand what sort of thing that expression is evaluating:
Another important principle as far as I’m concerned is: be explicit. Don’t have some variable that, say, implicitly stores a list of words. Actually show at least part of the list, so people can explicitly see what it’s like.
In many respects, the computational narrative format forces you to construct an argument in a particular way: if a piece of code operates on a particular thing, you need to access, or create, the thing before you can operate on it.
[A]nother thing that helps is that the nature of a computational essay is that it must have a “computational narrative”—a sequence of pieces of code that the computer can execute to do what’s being discussed in the essay. And while one might be able to write an ordinary essay that doesn’t make much sense but still sounds good, one can’t ultimately do something like that in a computational essay. Because in the end the code is the code, and actually has to run and do things.
One of the arguments I’ve been trying to develop in an attempt to persuade some of my colleagues to consider the use of notebooks to support teaching is the notebook nature of them. Several years ago, one of the en vogue ideas being pushed in our learning design discussions was to try to find ways of supporting and encouraging the use of “learning diaries”, where students could reflect on their learning, recording not only things they’d learned but also ways they’d come to learn them. Slightly later, portfolio style assessment became “a thing” to consider.
Wolfram notes something similar from way back when…
The idea of students producing computational essays is something new for modern times, made possible by a whole stack of current technology. But there’s a curious resonance with something from the distant past. You see, if you’d learned a subject like math in the US a couple of hundred years ago, a big thing you’d have done is to create a so-called ciphering book—in which over the course of several years you carefully wrote out the solutions to a range of problems, mixing explanations with calculations. And the idea then was that you kept your ciphering book for the rest of your life, referring to it whenever you needed to solve problems like the ones it included.
Well, now, with computational essays you can do very much the same thing. The problems you can address are vastly more sophisticated and wide-ranging than you could reach with hand calculation. But like with ciphering books, you can write computational essays so they’ll be useful to you in the future—though now you won’t have to imitate calculations by hand; instead you’ll just edit your computational essay notebook and immediately rerun the Wolfram Language inputs in it.
One of the advantages that notebooks have over some other environments in which students learn to code is that structure of the notebook can encourage you to develop a solution to a problem whilst retaining your earlier working.
The earlier working is where you can engage in the minutiae of trying to figure out how to apply particular programming concepts, creating small, playful, test examples of the sort of the thing you need to use in the task you have actually been set. (I think of this as a “trial driven” software approach rather than a “test driven* one; in a trial, you play with a bit of code in the margins to check that it does the sort of thing you want, or expect, it to do before using it in the main flow of a coding task.)
One of the advantages for students using notebooks is that they can doodle with code fragments to try things out, and keep a record of the history of their own learning, as well as producing working bits of code that might be used for formative or summative assessment, for example.
Another advantage is that by creating notebooks, which may include recorded fragments of dead ends when trying to solve a particular problem, is that you can refer back to them. And reuse what you learned, or discovered how to do, in them.
And this is one of the great general features of computational essays. When students write them, they’re in effect creating a custom library of computational tools for themselves—that they’ll be in a position to immediately use at any time in the future. It’s far too common for students to write notes in a class, then never refer to them again. Yes, they might run across some situation where the notes would be helpful. But it’s often hard to motivate going back and reading the notes—not least because that’s only the beginning; there’s still the matter of implementing whatever’s in the notes.
Looking at many of the notebooks students have created from scratch to support assessment activities in TM351, it’s evident that many of them are not using them other than as an interactive code editor with history. The documents contain code cells and outputs, with little if any commentary (what comments there are are often just simple inline code comments in a code cell). They are barely computational narratives, let alone computational essays; they’re more of a computational scratchpad containing small code fragments, without context.
This possibly reflects the prior history in terms of code education that students have received, working “out of context” in an interactive Python command line editor, or a traditional IDE, where the idea is to produce standalone files containing complete programmes or applications. Not pieces of code, written a line at a time, in a narrative form, with example output to show the development of a computational argument.
(One argument I’ve heard made against notebooks is that they aren’t appropriate as an environment for writing “real programmes” or “applications”. But that’s not strictly true: Jupyter notebooks can be used to define and run microservices/APIs as well as GUI driven applications.)
However, if you start to see computational narratives as a form of narrative documentation that can be used to support a form of literate programming, then once again the notebook format can come in to its own, and draw on styling more common in a text document editor than a programming environment.
(By default, Jupyter notebooks expect you to write text content in markdown or markdown+HTML, but WYSIWYG editors can be added as an extension.)
Use the structured nature of notebooks. Break up computational essays with section headings, again helping to make them easy to skim. I follow the style of having a “caption line” before each input. Don’t worry if this somewhat repeats what a paragraph of text has said; consider the caption something that someone who’s just “looking at the pictures” might read to understand what a picture is of, before they actually dive into the full textual narrative.
As well as allowing you to create documents in which the content is generated interactively – code cells can be changed and re-run, for example – it is also possible to embed interactive components in both generative and generated documents.
On the one hand, it’s quite possible to generate and embed an interactive map or interactive chart that supports popups or zooming in a generated HTML output document.
On the other, Mathematica and Jupyter both support the dynamic creation of interactive widget controls in generative documents that give you control over code elements in the document, such as sliders to change numerical parameters or list boxes to select categorical text items. (In the R world, there is support for embedded shiny apps in Rmd documents.)
These can be useful when creating narratives that encourage exploration (for example, in the sense of explorable explantations, though I seem to recall Michael Blastland expressing concern several years ago about how ineffective interactives could be in data journalism stories.
The technology of Wolfram Notebooks makes it straightforward to put in interactive elements, like Manipulate, [
interactivein Jupyter notebooks] into computational essays. And sometimes this is very helpful, and perhaps even essential. But interactive elements shouldn’t be overused. Because whenever there’s an element that requires interaction, this reduces the ability to skim the essay.”
I’ve also thought previously that interactive functions are a useful way of motivating the use of functions in general when teaching introductory programming. For example, An Alternative Way of Motivating the Use of Functions?.
One of the issues in trying to set up student notebooks is how to handle boilerplate code that is required before the student can create, or run, the code you actually want them to explore. In TM351, we preload notebooks with various packages and bits of magic; in my own tinkerings, I’m starting to try to package stuff up so that it can be imported into a notebook in a single line.
Sometimes there’s a fair amount of data—or code—that’s needed to set up a particular computational essay. The cloud is very useful for handling this. Just deploy the data (or code) to the Wolfram Cloud, and set appropriate permissions so it can automatically be read whenever the code in your essay is executed.
As far as opportunities for making increasing use of notebooks as a kind of technology goes, I came to a similar conclusion some time ago to Stephen Wolfram when he writes:
[I]t’s only very recently that I’ve realized just how central computational essays can be to both the way people learn, and the way they communicate facts and ideas. Professionals of the future will routinely deliver results and reports as computational essays. Educators will routinely explain concepts using computational essays. Students will routinely produce computational essays as homework for their classes.
Regarding his final conclusion, I’m a little bit more circumspect:
The modern world of the web has brought us a few new formats for communication—like blogs, and social media, and things like Wikipedia. But all of these still follow the basic concept of text + pictures that’s existed since the beginning of the age of literacy. With computational essays we finally have something new.
R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more…