7 - How the book works

7.1 - A 300-page waterslide for human attention

Now we have seen how human attention works, and the importance of pattern-recognition as an instinctive human trait.

We have some understanding of how this instinctive trait led to the evolution of writing and reading systems that goes back some 22,000 years.

We have seen how books alter our state of consciousness, how they capture and hold our attention.

We have glimpsed some of the factors at work in making reading easy or hard, and some of the care and attention to detail that goes into the typography of the book in order to make it invisible.

Now we can conduct a scientific analysis of the book and how the technology actually works.

It is hard to recognize technology at work in the book, since there are no flashing LEDs, no wheels, knobs or switches. There are no moving parts to this technology, because its magic is to stay still but to move us. Yet it is basically a 300-page waterslide for human attention.

7.2 - The logical structure of the book

We’ll start our analysis of the book at the micro level.

The atomic component of the book is the letter. Formation of the letters is important, since these have to be easily recognizable (Tinker, Taylor, Nell, Tschichold, etc.).

Although typefaces vary in the shape of their letters, creative differences are only possible within fairly narrow boundaries. No matter how creative the type designer wishes to be, a lower-case “a” always has to be recognizable as such, or it is unusable. (This very fact is central to the US Government’s refusal to allow copyrighting of typefaces. Other Governments, e.g. Germany, take a different view and recognize the creativity involved in typefaces.)

Design of letters is not merely about the shapes of the letters, but about how those shapes work together to form words. It is also about the visual balance between the counters (the spaces inside characters) and the spaces outside. Some of this is the domain of the type designer, who has to consider how all the letters of her face work in combination with each other when creating them. Some of it is the province of the typesetter (human or computer), who has to define a letterspacing that makes the words as effortless to recognize as possible. (For more details, see Fred Smeijers’ excellent book, “Counterpunch”.)

Typographers pay extremely close attention to letterspacing, since it is a core part of their work. Letterspacing should be as tight as possible without causing characters to collide. Typographic research documents that as letterspacing gets wider, word recognition becomes progressively harder. Letterspacing should remain constant throughout a book.

There is plenty of hard data from typographic research documenting how serif typefaces (e.g. Times New Roman, Palatino) work better for longer-duration reading tasks than sanserif faces (e.g. Helvetica, Arial).

The serifs fulfil two tasks. First, they aid in visually tying the individual letters together into word “gestalts”, making those units easier to recognize. Second, their direction helps to lead the eye along the horizontal path that makes for effortless recognition of successive “patterns” or words.

Researchers have failed to understand the significance of letter spacing and its effect on building easily-recognizable units of meaning. For example, one study carried out on letter-spacing attempted to gauge its effect by measuring subjects’ ability to recognize “pseudo-words” (i.e. groups of letters with no meaning) under varying letter-spacing conditions.

This study concluded that letter-spacing had no discernible effect on pseudo-word recognition.

The researchers were exactly right – and totally wrong at the same time. It is the fact that words have meaning, and that our brain recognizes units of meaning, that makes letterspacing important. The easier we make the brain’s task of recognizing units of meaning, the easier the document is to read. The study completely misses the point.

7.3 - Words and Lines

The next important piece of the technology is inter-word spacing. Research shows that this should be perceptibly wider than the inter-character spacing, to help point out the end of one word-recognition and the start of the next one. There is an optimum value for each type size. While actual spacing may vary slightly from this optimum, it can do so only within fairly tight limits, or it will interfere with the “flow” of the eye across the line of text.

Inter-word spacing should also be more or less constant throughout the whole of a book. This allows the reader to find and maintain his own “harmonic gait” or relaxed reading speed throughout the whole text.

The next level up in the technology is how this arrangement of words and spaces between them is assembled into lines. Research shows that line-length is important; an optimum value is around 66 letters and spaces per line at normal reading distance and in normal type sizes of 10-12 point. It also shows that ideally, line length should be constant, “cueing” the reader’s eye by keeping the start and end of each line in the same horizontal position. This way of setting text is called “justified”.
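
The arithmetic behind the 66-character figure can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the average character width (expressed as a fraction of the type size) is an assumed value that differs from typeface to typeface.

```python
def measure_in_points(chars_per_line=66, type_size_pt=11.0, avg_char_em=0.45):
    """Estimate the line width (the "measure") in points.

    avg_char_em is an ASSUMED average width of one character, spaces
    included, as a fraction of the type size. It is not a value taken
    from the text and must be measured for a real typeface.
    """
    return chars_per_line * type_size_pt * avg_char_em


def chars_that_fit(measure_pt, type_size_pt, avg_char_em=0.45):
    """Inverse estimate: roughly how many characters fit in a measure."""
    return int(measure_pt // (type_size_pt * avg_char_em))


if __name__ == "__main__":
    for size in (10.0, 11.0, 12.0):
        width = measure_in_points(type_size_pt=size)
        print(f"{size:>4} pt type: about {width:.0f} pt wide")
```

The point of the sketch is the relationship, not the numbers: once the character count per line is fixed, type size and line width can no longer be chosen independently.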

This brings up another set of factors. If line length is constant AND spacing between words cannot vary by more than a small amount, then the only way to achieve this is to hyphenate words. Research shows that piecing together the two parts of a word hyphenated across successive lines costs the reader less effort than coping with variations in word spacing.
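
The constraint can be made concrete with a small sketch. It assumes, purely for simplicity, that every character is one unit wide and that the ideal word space is one unit with a tolerance of 25 percent; the function name and these values are hypothetical. When the function returns None, the line cannot be justified within the limits, and a different break (usually a hyphenation) is needed.

```python
def justify_gap(words, measure, char_width=1.0, ideal_gap=1.0, tolerance=0.25):
    """Return the inter-word gap that stretches `words` to `measure`.

    Returns None when that gap would fall outside
    ideal_gap * (1 +/- tolerance), i.e. when justifying this line would
    need word spacing that varies too much. Widths are ASSUMED to be
    uniform per character, which real typefaces are not.
    """
    if len(words) < 2:
        return None  # a single word has no gaps to adjust
    text_width = sum(len(w) for w in words) * char_width
    gap = (measure - text_width) / (len(words) - 1)
    low = ideal_gap * (1 - tolerance)
    high = ideal_gap * (1 + tolerance)
    return gap if low <= gap <= high else None


if __name__ == "__main__":
    line = ["the", "quick", "brown", "fox"]
    print(justify_gap(line, measure=19))  # fits with the ideal gap
    print(justify_gap(line, measure=25))  # gaps would be far too wide
```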

However, if words are to be hyphenated it must be done meaningfully, on the basis of syllables or etymology; the only way to do this is by using a language-specific dictionary that stores acceptable hyphenation points for words in that language. Algorithmic hyphenation is a very poor substitute; incorrectly-hyphenated words are blockers to the smooth flow of effortless attention.

It may be possible to save having to ship a hyphenation dictionary with every language by encoding “soft-hyphens”, or hyphenation opportunities, in electronic book or other computer-read text. Such soft hyphenation would be carried out by authoring tools.
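
Unicode already provides a character for this purpose, the soft hyphen (U+00AD). As a minimal sketch of how reading software might use soft hyphens embedded by an authoring tool, the hypothetical helpers below list the break opportunities in a word and pick the longest first part that fits in the room left on a line. Room is counted in characters here, which is a simplification.

```python
SOFT_HYPHEN = "\u00ad"  # Unicode SOFT HYPHEN, invisible unless a break occurs


def break_points(word):
    """Return (head, tail) pairs, one per soft-hyphen opportunity.

    head ends with a visible hyphen; tail keeps its remaining soft
    hyphens so it could be broken again if needed.
    """
    parts = word.split(SOFT_HYPHEN)
    pairs = []
    for i in range(1, len(parts)):
        head = "".join(parts[:i]) + "-"
        tail = SOFT_HYPHEN.join(parts[i:])
        pairs.append((head, tail))
    return pairs


def best_break(word, room):
    """Pick the longest head (hyphen included) that fits in `room` characters."""
    fitting = [pair for pair in break_points(word) if len(pair[0]) <= room]
    return fitting[-1] if fitting else None


if __name__ == "__main__":
    word = "ty\u00adpog\u00adra\u00adphy"
    print(best_break(word, room=6))
```

The reading device needs no dictionary in this scheme; all the language-specific knowledge was spent once, when the text was authored.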

If words are to be hyphenated and spacing adjusted microscopically, then this must be done at paragraph, not line, level. In some specific cases hyphenation is unacceptable, e.g. when a hyphenated part of a word lands as the only text left in a new line at the top of a page. In this case, decisions have to be made interactively about which is the “least worst” case: e.g. adding an extra line elsewhere in the text (for instance at the start of a chapter, or on the previous page), or reducing inter-word spacing in that paragraph below the desired threshold value.
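
To see why the paragraph, not the line, has to be the unit of work, consider the sketch below. It is a deliberately simplified dynamic-programming line breaker, in the spirit of "total-fit" breaking but not an implementation of any particular product's engine: it ignores hyphenation and stretchable spaces, counts widths in characters, and simply minimizes the sum of squared leftover space over all lines except the last. The function name is hypothetical.

```python
def break_paragraph(words, width):
    """Break `words` into lines of at most `width` characters.

    All breaks are chosen together so that the total of squared leftover
    space (last line excluded) is as small as possible. A word longer
    than `width` is placed on a line of its own.
    """
    n = len(words)
    best = [float("inf")] * (n + 1)  # best[i]: minimal cost of setting words[i:]
    nxt = [n] * (n + 1)              # nxt[i]: where the line starting at i ends
    best[n] = 0
    for i in range(n - 1, -1, -1):
        length = -1  # -1 cancels the space counted before the first word
        for j in range(i + 1, n + 1):
            length += len(words[j - 1]) + 1
            if length > width and j > i + 1:
                break  # line is full; a lone overlong word is still allowed
            cost = 0 if j == n else (width - length) ** 2
            if cost + best[j] < best[i]:
                best[i] = cost + best[j]
                nxt[i] = j
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines


if __name__ == "__main__":
    for line in break_paragraph("aaa bb cc ddddd".split(), 6):
        print(repr(line))
```

A line-by-line approach would fill the first line with "aaa bb" and leave "cc" stranded on the second. Looking at the whole paragraph, the sketch accepts a slightly short first line in exchange for a much more even second one.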

So line length, word spacing, number of lines and hyphenation are inter-related variables.

Distance between successive lines of text is another important factor. It should be constant throughout a book. In traditional typesetting, this is called “leading”, from the strips of metal that were put between lines of set metal type to control spacing.

Size of type, line length and leading are inter-related variables.

7.4 - Top-down analysis

Once we understand the method of assembling individual lines, it is worth changing our perspective on the book, and analyzing it from a “top-down” viewpoint.

Page size is important; “portrait” orientation gives more lines, of a better length, than landscape. It is ideally suited to the way we read; in fact, it evolved from basic reading principles. It is NOT an artifact left over from the past that needs to be left behind as we evolve reading technology; it is a result of the way we read, not a cause.

The page acts as a focal plane for the eyes. We use the high-acuity areas of the fovea and parafovea to read the text. The text area works best when proportional to the page size, creating a perspective that keeps the reader’s attention from wandering away from the text. The margins thus created also help the reader to unconsciously define the “field of recognition”, i.e. “this is the area of attention, where word recognition takes place”.

Understanding that reading is built upon the survival trait of pattern recognition reveals another way in which this “focal plane” works. Peripheral vision in animals and humans does not have the same focal resolution as high-acuity foveal and parafoveal vision. It is designed not to focus on pattern recognition, but to detect movement. Peripheral vision is used to detect threat or prey. It is the background detection system that tells us when and where to focus attention.

In reading, peripheral vision can remain in its “background/watchful” state, leaving us free to focus attention on the content.

Margins delineate the boundary between “attention” or focus and “background watchfulness”.

At the top of the page, the eye begins on the first word. If more than a single word is contained in the fixation, constant inter-word spacing defines where one recognition ends and the next begins.

Now the attention (having been trained in the reading process) moves to the next line, repeating the same process for the number of lines on the page.

This flow is repeated from page to page.

7.5 - Visual Cues

At every stage, there are visual “cues” to help us. These cues are constant. Analyzing book typography from this perspective reveals a large number of possible variables that are given constant values (for a single instance of a book):

  • Letter shapes
  • Letter spacing
  • Word shapes (includes kerning, ligatures etc.)
  • Inter-word spacing (varies, but only within a very tight range. Also contributes to lack of distracting rivers of white in text).
  • Left start position, right end position for line (defined by margins)
  • Line length
  • Interlinear spacing (leading)
  • Start of paragraph (indent)
  • Start of page (top margin)
  • End of page (bottom margin)
  • Start of Chapter (heading + additional space)

The effect of these constant values is that we can settle into a comfortable (because unconscious) reading pattern or gait.
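
One way to picture "constant for a single instance of a book" in software is an immutable settings record, created once per book and never changed while the book is being set. The sketch below is hypothetical: the class, its field names and the example values are ours, chosen only to mirror the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: values cannot be changed after creation
class BookSettings:
    """Hypothetical record of values held constant for one book (in points)."""
    typeface: str
    type_size: float
    letter_spacing: float
    word_space: float
    word_space_tolerance: float  # allowed deviation, as a fraction of word_space
    measure: float               # line length
    leading: float               # interlinear spacing
    paragraph_indent: float
    top_margin: float
    bottom_margin: float

    def word_space_ok(self, gap):
        """True if a set inter-word gap stays inside the tight allowed range."""
        return abs(gap - self.word_space) <= self.word_space * self.word_space_tolerance


if __name__ == "__main__":
    # Example values are illustrative assumptions, not recommendations.
    book = BookSettings("Palatino", 11.0, 0.0, 3.0, 0.25,
                        330.0, 14.0, 11.0, 54.0, 72.0)
    print(book.word_space_ok(3.5), book.word_space_ok(4.0))
```

Only the inter-word space is given a tolerance, matching the one entry in the list that is allowed to vary at all.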

7.6 - Disrupting the flow

Anything inside the text which disrupts this regular pattern has the effect of making an automatic, unconscious process become a conscious one: character shapes that are hard to identify, bad letterspacing that makes it an effort to recognize words, large variations in the spacing between words, and so on.

Tinker identifies all of these variables, and asserts that not only do they all have to be tuned to work together to make reading effortless, but that if only two or three are set sub-optimally, it can completely destroy reading efficiency.

It can thus be seen that the technology of a book is a complex engine to capture and hold human attention by directly hooking our innate pattern-recognizing behavior.

The book is a technology for Optimizing Serial Pattern Recognition. We get on this 300-page waterslide at Word One, and it is so designed that our attention slides from word to word, from line to line, and from page to page until we reach the last word. Of course, it is not quite that simple: we stop reading when other distractions, such as hunger, get in the way and rise to take priority. We also sometimes regress, perhaps to read a passage we did not fully understand, or enjoyed so much we want to go back and savor. But the natural dynamic is for a serial flow from start to finish.

There are other types of reading: encyclopaedias, reference books and so on, which we read in different ways. But they use the same basic mechanism to keep us in the passage which we are reading.

This technology is at least as complex as an internal combustion engine. And like an internal combustion engine, it takes only one or two variables wrongly “tuned” to make the whole engine vibrate with a dramatic loss in efficiency.

Precise attention to apparently insignificant tiny details such as word and letter-spacing is not disproportionate “fussiness”. It is by these and all the other details that the true power of setting up serial pattern recognition is achieved, making the recognition apparently effortless for the reader.

7.7 - Underlying mathematics

There is an underlying mathematics to OSPREY, the optimizing of serial pattern recognition described above, that can be captured in software code.

This mathematics is largely already known. Desktop publishing applications such as Adobe PageMaker, QuarkXPress and Microsoft Publisher do this today, provided they are driven with the correct parameters. Purists may gasp at the placement of Publisher in the same context as “professional” publishing applications. But in reality Microsoft Publisher 98, with its Quill pagination engine and underlying Line Services line-breaking engine, stands up extremely well in comparison.

All three of these examples, and the many other similar software packages on the market, suffer from basic shortcomings in relation to displaying text on screen for electronic books.

The first and most important of these is that they were all designed to produce print. As such, their screen display of text must adhere strictly to WYSIWYG. What their users really care about is what they will see in print. Screen text then has to be an exact representation of printer text. To achieve this, printer font metrics are used; this results in distortion of the screen display. Word and letter-spacing, and even the shapes of the characters themselves, are altered in order to make the lines match the printer output.

All defaults and adjustment of spacing are driven by these printer metrics: ideal screen parameters would be different.

All of these applications provide a framework of controls that allow the experienced user to achieve good results. None of them is today capable of taking text and automatically setting it to the right type size, measure, leading, page size, margins etc. for the eBook. However, the underlying code is perfectly capable of being tuned to do the job.

It is perfectly possible to program a set of “harmonic tunings” for the OSPREY variables and have code such as Quill and Line Services format incoming book text automatically.
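
As a rough sketch of what such a “tuning” might look like, the hypothetical function below derives type size, measure, leading and lines per page from nothing more than the dimensions of the page or screen area. It does not call Quill or Line Services. The 66-character line and the 10-12 point range come from the discussion above; the average character width, the leading ratio and the margin fraction are assumptions made for illustration only.

```python
def tune_for_page(page_width_pt, page_height_pt,
                  chars_per_line=66, avg_char_em=0.45,
                  leading_ratio=1.2, margin_fraction=0.125):
    """Derive a consistent set of text settings for one page size.

    avg_char_em, leading_ratio and margin_fraction are ASSUMED values;
    a real tuning would be established per typeface and per device.
    """
    # Text block: the page minus equal margins on both sides.
    measure = page_width_pt * (1 - 2 * margin_fraction)
    # Type size that puts the target character count on one line.
    type_size = measure / (chars_per_line * avg_char_em)
    # Keep type inside the normal 10-12 point range ...
    type_size = max(10.0, min(12.0, type_size))
    # ... and narrow the measure if the page is wider than the text needs.
    measure = min(measure, chars_per_line * avg_char_em * type_size)
    leading = type_size * leading_ratio
    text_height = page_height_pt * (1 - 2 * margin_fraction)
    return {
        "type_size": type_size,
        "measure": measure,
        "leading": leading,
        "lines_per_page": int(text_height // leading),
    }


if __name__ == "__main__":
    for name, value in tune_for_page(432, 648).items():
        print(f"{name}: {value:.1f}")
```

Every value in the result follows from the others, which is the sense in which the variables are tuned together and not set one at a time.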
