Since the invention of writing up until very recently, documents have always been tangible objects, consisting of text, drawings or formulas inscribed or applied onto a surface: clay tablets, stone slabs, papyrus, vellum or parchment, and finally paper. How the document was formed by an author was irrelevant to the reader, as it could only be read after being inscribed onto a tangible surface.
Computer technology has radically changed the means of preparing and distributing documents, but many readers still consider the electronically stored version irrelevant by comparison with the paper document. The concept that "the only real document is paper" is what I call the paper paradigm. A major paradigm shift in Kuhnian terms is required to accept that knowledge is now primarily maintained, retrieved and delivered in electronic (electromagnetic) formats. An increasing fraction of all World 3 knowledge is either being created from the outset as electronic virtual documents or is rapidly being converted into such electronic formats. Today, paper documents are at best snapshots of the contents of what may be continually evolving electronic virtual containers for knowledge. Compared to ponderous paper, the virtual containers can be endlessly duplicated, distributed and retrieved at nearly the speed of light for essentially no cost.
Based on my personal background in scientific and technical writing, most of my discussion in this work focuses on literacy in the form of creating, retrieving and using textual documents. However, literacy in geometry, algebra and mathematics evolved in parallel with textual literacy, and these forms of expressed knowledge have been equally profoundly affected by the development of printing and computer processing (Hobart and Schiffman 1998). The impact of printing on literacy was discussed in Episode 1. In this episode I will focus on how the microelectronics technology described in Episode 2 has radically altered what is signified by universal personal literacy.
Working in a paper paradigm, all of the author's cognition was done in the personal World 2. Once the knowledge was distilled it was transformed into words and committed to paper, perhaps with a pen or pencil or typewriter, or via dictation to another human who would transcribe the words to paper with pen, pencil or typewriter. The ponderousness of the process limited the number of drafts and required a prolonged cycle time from one edit to the next, and an even longer cycle time if work was sent to be printed.
Electronic word processing and desktop publishing (DTP) systems extended typewriter and typesetting technology, initially by providing a way to store and correct the content of documents before committing them to paper. From the typists' or printers' points of view, the new technologies enabled "letter perfect" paper documents or plates for printing to be produced without needing to rework the product after its contents were made tangible. Only when document files began to be distributed electronically via a network or floppy disks was anything new added to the paper paradigm.
Xerox's Palo Alto Research Centre played a major role in developing the architecture of a networked word processing environment108. Cringely (1996) and Hiltzik (1999) provide highly readable accounts of the personalities and business issues involved in introducing the new electronic technology. Although neither author writes from a specifically Kuhnian point of view, it is obvious from both histories that Xerox senior management failed to grasp the paradigmatic differences between paper and electronic documents, and hence the world-shattering revolutionary potential of the technology their own people had developed. Had they exploited their inventions, Xerox could easily have become a giant exceeding the size of Microsoft and IBM combined.
The development of merge printing in the mid to late '70s linked early word processors to databases of customer details in order to produce personalised form letters. This added an element of automation to the process of typing documents. However, until the beginning of the '80s, word processing was limited to the environment of large corporations that could afford the mainframes and networks supporting the word processing environment (Eisenberg 1992). Personal computers changed the equation radically.
WordStar109, launched in June 1979 (Polsson, 1994–1999), was the first mass selling word processing system, and was one of three "killer" applications that led to an explosive growth in the number of personal computers being used as productivity tools by knowledge workers (Byte Magazine 1995). I purchased my first personal computer (a CP/M system) in 1981 for use as a word processor because it ran WordStar.
Just as printing enabled paper documents to be distributed physically to many readers, the electronic storage of word-processed files allows the instant retrieval/copying of "electronic paper" on demand for reading anywhere in the world at the speed of light. Access to personal computers for reading electronic paper is still not as “universal” as literacy is for reading paper documents. Nevertheless, the explosive spread of personal computer–based authoring and retrieval systems certainly qualifies as a quantitative revolution in the nature of documents by comparison to the paper paradigm. However, as explained in the following paragraphs, word processing on its own has not caused any major revolution in cognitive processes.
The second of three “killer applications” providing a quantitative extension of human cognitive capabilities was the electronic spreadsheet, able to calculate in seconds the rows and columns of a table that would take hours to complete by hand. Some mainframe spreadsheets had been developed by the early ‘60s, but these were only available to large corporations and government bodies (Mattessich, ????).
VisiCalc, introduced in October 1979 (Polsson, 1995–2000), soon after WordStar, also helped turn the personal computer into a real productivity tool. Lotus 1–2–3, introduced in 1983, provided graphing functions in addition to column and row calculations, to enhance productivity still more.110
Spreadsheets extend human cognition by greatly improving speed and accuracy in analyzing columnar data. Assisted by computerized spreadsheets, humans can complete calculations that would otherwise be completely impractical because of the number of individual mathematical manipulations required and the inherent fallibility of human brains for mathematical processing. However, as with word processing systems, the applications did not fundamentally change the basic paradigm of columns and rows of data on one or several large sheets of paper.
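The kind of columnar arithmetic a spreadsheet automates can be sketched in a few lines of Python; the table of figures here is invented purely for illustration:

```python
# A minimal sketch of spreadsheet-style row and column arithmetic.
# The "sales" figures are hypothetical, for illustration only.
rows = {
    "Q1": [120.0, 95.5, 210.0],
    "Q2": [130.0, 88.0, 198.5],
}

# Row totals: one aggregate value per row (e.g., per quarter).
row_totals = {name: sum(values) for name, values in rows.items()}

# Column totals: sum down each column across all rows.
col_totals = [sum(col) for col in zip(*rows.values())]

print(row_totals)  # {'Q1': 425.5, 'Q2': 416.5}
print(col_totals)  # [250.0, 183.5, 408.5]
```

The point is not the trivial arithmetic but its instant, error-free recalculation whenever any cell changes, which is exactly what was impractical by hand.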
During the 1960’s and 70’s, the need to improve data management drove large businesses and government organizations to install mainframe computers. In this environment, data was still controlled by Management Information Systems (MIS) departments, often tied to computer systems leased from IBM. Early database systems were basically little more than collections of two–dimensional tables, whereby the contents of specific rows of information in each table could be rapidly accessed by indexes to one or more columns of the table. This effectively turned data into information by providing contextual connections. For example, details of a customer’s account might be indexed by customer name.
The development of relational database111 (RDB) concepts in the 1970s112, allowing row contents of two or more tables to be indexed via relationships established via “join” tables, enabled much more efficient data structures to be developed through the process of "normalization". An RDB has no need to repeat information (such as a customer name and address) across different kinds of tables containing information relating to a particular customer. Customer contact details would be maintained in one table, transaction details in a second table, product pricing details in another, and payment details in a fourth table. Each customer is identified by a unique CustomerID, each order by an OrderID, each product by a ProductID, and each payment by a PaymentID. A single order of several products would be described by several entries in a "join" table indexed by the OrderID, relating the CustomerID to the OrderID on one side, and to each of the products included in the order on the other side. The join table would probably include details of the quantity delivered against each product line item, because this information would be unique to a particular order by a particular customer. Another join table would contain entries for payments received, linking customer details via the CustomerID, order details via the OrderID and payment details via the PaymentID. Report generators would then have access to all details required to print invoices without the requirement to maintain the complete set of customer and product details for each individual order. High level programming and query languages made it comparatively easy to develop processes to manipulate the data contained in the RDB.
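The normalized schema described above can be sketched with SQLite from Python's standard library. The identifier names follow the text (CustomerID, OrderID, ProductID); the customer and product data, and the OrderLine table name, are invented for illustration:

```python
import sqlite3

# A sketch of the normalized customer/order/product schema described
# in the text. All data values are invented for illustration.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT, Address TEXT);
CREATE TABLE Product  (ProductID  INTEGER PRIMARY KEY, Name TEXT, Price REAL);
CREATE TABLE "Order"  (OrderID    INTEGER PRIMARY KEY,
                       CustomerID INTEGER REFERENCES Customer(CustomerID));
-- The "join" table: one row per product line item on an order.
CREATE TABLE OrderLine (OrderID   INTEGER REFERENCES "Order"(OrderID),
                        ProductID INTEGER REFERENCES Product(ProductID),
                        Quantity  INTEGER);
""")
db.execute("INSERT INTO Customer VALUES (1, 'Acme Pty Ltd', '1 Example St')")
db.executemany("INSERT INTO Product VALUES (?, ?, ?)",
               [(10, 'Widget', 2.50), (11, 'Sprocket', 4.00)])
db.execute('INSERT INTO "Order" VALUES (100, 1)')
db.executemany("INSERT INTO OrderLine VALUES (?, ?, ?)",
               [(100, 10, 3), (100, 11, 2)])

# An invoice query reassembles every needed detail without duplication.
invoice = db.execute("""
    SELECT c.Name, p.Name, l.Quantity, l.Quantity * p.Price
    FROM OrderLine l
    JOIN "Order" o  ON o.OrderID    = l.OrderID
    JOIN Customer c ON c.CustomerID = o.CustomerID
    JOIN Product p  ON p.ProductID  = l.ProductID
    ORDER BY p.ProductID
""").fetchall()
print(invoice)
```

Because the join table carries only identifiers and quantities, the customer's name and each product's price are stored exactly once, yet the report query recovers all details required to print the invoice.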
Database Management Systems (DBMS), which allowed end users to develop their own database applications, were the third kind of killer application driving the spread of personal computing. The Vulcan database program for microcomputers was launched in August 1979. Ashton–Tate later marketed an upgraded version of the product as dBase II in 1981 (Polsson, 1995–2000). This combined relational table structures with a high level programming language that simplified the building of applications to help collect and process the tabulated data. Oracle Corp., founded in 1977, is now a dominant supplier of RDB technologies.
Relational DBMS applications with their associated high level (i.e., 4th generation) programming languages greatly extended the clerical paradigm of keeping and processing records. Even small businesses could implement applications that enabled them to compete successfully with the very large commercial and government organizations that could afford large clerical staffs.
As will be discussed further below, DBMS tools also began to provide qualitatively new ways for managing data and simple kinds of information. In parallel with the development of the three killer applications discussed above, comparable computerized tools were developed for geometrical applications like engineering design/drawing and for working with algebraic formulas – first in mainframe environments (Hobart and Schiffman 1998) and then made available to the individual user. However, before beginning my discussion of qualitative revolutions, some additional comments on paper paradigms are warranted.
Moving data aggregation and document authoring and their associated delivery processes into the electronic environment is promoting a major shift in the document paradigm. As understood in the paper paradigm, documents, spreadsheets and data tables or card files are tangible objects (requiring slow and costly physical production, filing, sorting, indexing and distribution processes). In an electronic paradigm, the essence or content of the document or other information lives virtually in a computer memory or electronic or magnetic storage. Content can be instantaneously retrieved and displayed for viewing whenever and wherever required. Although revolutionary, the initial shift in technology from physical paper to electronic container is primarily quantitative. In this shift, the essential idea that the document or other information object is a discrete container for knowledge has not changed in the minds of many who manage the objects.
However, the paradigm shift from paper to electronic documents is far from complete, and has probably been impeded by Microsoft's dominance of the personal computing market. As will be seen below, as electronic media becomes dominant, Microsoft's dominance will probably wane, and even the concept of a document may become obsolete.
Today, the paradigm of electronic paper is supported by increasingly complex and cumbersome applications providing the ability to represent on–screen the exact appearance of paper outputs (WYSIWYG – what you see is what you get). The only reason they work at all is that computer processing power and speed have increased faster than the demands made by these huge applications. Microsoft and Apple competed for many years to provide graphical user interfaces (GUIs) to replicate the appearance of paper, to the point that the MS Windows 2000 operating environment grew to be one of the largest software applications ever created113. The contest has also been particularly fierce in the office software arena between Microsoft Office and WordPerfect Office, which have been competing for many years attempting to be all things for all people. The story of this competition is well documented because of the long–running antitrust suit against Microsoft114.
I am no friend of Microsoft, because Microsoft's popular DOS and Word products supplanted technically superior products I used as productivity tools in my own cognitive toolkit, such as CP/M and WordPerfect. However, as Cringely (1996) and friends of Microsoft115 note, Microsoft did not create the circumstances that enabled its dominance of the market for operating systems and productivity tools. These authorities argue that Microsoft achieved dominance through being well poised to exploit the opportunity created by the popularity of the IBM PC hardware and by creating products that catered to customers' desires to work in a paper paradigm even though the underlying environment was electronic.
From my own experience, I would argue that Microsoft's success depended heavily on the power of the paper paradigm to influence corporate purchasing decisions. No matter how efficiently information is organized and managed, or how much epistemic quality has been added, a person cannot use information stored in an electromagnetic environment until it is transformed into a format a human can perceive. Because early personal computer memory and processing resources were slow and expensive, the pioneering software applications (e.g., WordStar) managed data very efficiently, with a few lines of code producing character-based displays. However, many humans, whose perceptions were based on experiences with paper documents, did not like the aesthetics of character–based displays. Application developers targeting large markets wanted to make their products more “user friendly”.
Xerox PARC's Alto and Star systems116 and Apple's Lisa and Macintosh systems established that operating environments and word processing systems providing WYSIWYG tools with bit–mapped graphical user interfaces (GUI) were much "friendlier" to user perceptions (Evans, et al. 1999; Myers 1998). Microsoft introduced early versions of the Windows GUI to the PC market in 1985, which achieved a comparatively small market share compared to DOS because there were few applications using Windows117. The release of Windows was followed by the Windows GUI version of Word in 1989 and then by Excel and PowerPoint. Led by MS Word, the now synergistic Windows packages achieved dominance in their respective application areas around 1992. Because Microsoft's "killer" applications all used a similar paper paradigm, they achieved market dominance over products that were arguably technically superior (and had been dominant in the past) but lagged in providing the paper–based GUI. (And, of course, Microsoft did not make it easy for its competitors to use the Windows operating environment.)
The other factor which helped Microsoft achieve and maintain its almost total market monopoly over other WYSIWYG/Windows–based desktop publishing and word processing applications is what Liebowitz and Margolis (1999), Evans and Leder (1999) and Evans et al. (1999) call the "network effect" or "network externality"118. Liebowitz and Margolis (1998) define the concept as follows:
Network externality has been defined as a change in the benefit, or surplus, that an agent derives from a good when the number of other agents consuming the same kind of good changes. As fax machines increase in popularity, for example, your fax machine becomes increasingly valuable since you will have greater use for it. This allows, in principle, the value received by consumers to be separated into two distinct parts. One component, which in our writings we have labelled the autarky value, is the value generated by the product even if there are no other users. The second component, which we have called synchronization value, is the additional value derived from being able to interact with other users of the product, and it is this latter value that is the essence of network effects.
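Liebowitz and Margolis's decomposition can be restated as a toy calculation. The linear growth of synchronization value assumed here (each additional user adding a fixed increment) is my own simplification for illustration, not part of their definition:

```python
# Toy model of the value decomposition quoted above:
#   total value = autarky value + synchronization value.
# The numbers and the linear per-user increment are illustrative
# assumptions only.
def total_value(n_other_users, autarky=10.0, per_user=2.0):
    synchronization = per_user * n_other_users
    return autarky + synchronization

# A fax machine with no one else to call retains its autarky value...
print(total_value(0))   # 10.0
# ...but grows far more valuable as the network of other users grows.
print(total_value(50))  # 110.0
```

Under any model of this shape, once one product's user base is large enough, its synchronization value swamps any autarky-value advantage a rival product may hold, which is the mechanism at issue in the following paragraphs.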
The works cited above claim that despite the network effect, a "better" technology can readily displace an entrenched product. From my understanding of paradigms, and from personal experience of the power paradigms have to influence software purchases in the organizations I have worked for, I would argue the opposite. Network externality is a major issue for businesses, especially given that compatibility of the systems interfacing World 3 knowledge with human communicators across the virtual network of trading partners is paramount.
Word processing applications are tools people and organizations use for communicating with other people and organizations. Spreadsheets and database applications also have important communication functions, but they are primarily personal productivity tools used for individual purposes within organizations. While the majority of organizations were still using paper as their primary medium for exchanging information, it made little difference which word processor was used to produce the paper – everyone could still read the paper. A number of competing word processor suppliers could survive in such a market. However, once a proprietary electronic format itself became the primary medium for communication, the network effect virtually assured a monopoly for the leading proprietary product used for that communication.
Although today’s word processors and spreadsheets still firmly use a paper document paradigm for their tangible output, most organizations and many individuals now distribute and access documents produced by these applications electronically for on–screen viewing rather than on physical paper. In this regard it is important to note that, to achieve the formatting result required by the paper paradigm, complex proprietary formatting codes are included within the electronic documents to ensure that the recipient's screen displays the document in as close to a paper format as possible. Despite the efforts of software developers to produce software able to convert content between one proprietary formatting code and another, the effective transfer/translation of electronic documents between different applications is fraught with difficulty. Assuming that it is even possible to translate the document at all119, it is practically impossible to edit documents produced in one brand of word processor in another brand of word processor.
Especially in large corporate or organizational networks where draft documents are frequently exchanged for review and editing, both parties in the exchange must either use compatible software from the same vendor, or revert to the slow and costly physical editing and transport of paper documents120. At least in the corporate environment, there is also a continuing requirement to refer to older (i.e., "legacy") documents – and it is far easier to do this electronically from a central repository than it is to maintain and locate the physical document. Thus once any word processing product is taken up by dominant users in an electronic communications network, the simple requirement for smaller users to communicate effectively with dominant organizations ensures that the majority product will drive out the competitors121, irrespective of any quality issues. The requirement to retain access to legacy documents ensures that it would be very costly for any large organization to switch to an incompatible application.
Only if the fundamental communication paradigm changes is there likely to be any economically justifiable reason to change the dominant information development and management applications. Such a fundamental change is now taking place.
"Structured authoring" is a technology for capturing text that no longer depends on proprietary word processing codes. Structured authoring applications combined with database management systems enable information to be recorded and managed in revolutionary new ways. As will be discussed in the following sections, these technologies are fundamentally changing the ways in which we capture, store, discover and retrieve knowledge. In time the concept of a document as a container for storing knowledge will become obsolete.
Recalling that “knowledge is appropriate information that is available when and where it is needed”, all texts written for reading by humans are structurally organized with semantic cues to help the reader comprehend, as knowledge, the information contained in the text. These cues include
recording of individual thoughts as sentences, in juxtaposition with other sentences,
use of white space to organize sentences into paragraphs and paragraphs into larger semantic structures such as chapters and sections,
plus additional apparatus such as emphases, and so on.
As discussed above, computerized word processing and typesetting systems offer users a rich array of formatting functions to provide the visual cues needed by humans. However, the proprietary procedural code placed in the text to control output formatting is too inconsistent to identify content for easy semantic recognition by computers.122
Although structured authoring systems' user interfaces often resemble those of word processing and desktop publishing systems, the structured authoring system tags syntactical and semantic content for parsing by computers123 rather than tagging formats for layout on paper. A document type definition (DTD) or schema for a class of documents defines the kinds of elements (which may be defined semantically - Berners-Lee et al. 2001; W3C 2002; Swartz 2002) allowed to occur in a conforming document, and establishes syntactical rules determining where each kind of element may be used in the sequential and hierarchical structure of a document of that type. In authoring structured text, blocks of text are wrapped by tags conforming to the rules of the DTD or schema to identify structural elements in the document's logical flow. When the document is output for printing or viewing, formats are applied to structural elements based on each element’s type and location in the logical structure of the document. Text formatting is completely separated from the authoring activity – and in fact may be performed by completely separate applications sourced from independent vendors. By contrast, word processor codes are entered as ad hoc formatting decisions relating to particular documents.
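What semantic tagging buys the computer can be sketched with Python's standard XML parser. The element names here (memo, heading, para) are invented for illustration; in practice a DTD or schema would define which elements may appear and where:

```python
import xml.etree.ElementTree as ET

# A sketch of semantically tagged structured text. The element names
# are invented for illustration only.
doc = """
<memo>
  <heading>Quarterly results</heading>
  <para>Sales rose in the second quarter.</para>
  <para>Costs were flat.</para>
</memo>
"""
root = ET.fromstring(doc)

# The markup records what each block of text *is*, not how it should
# look, so a program can address content by its structural role:
print(root.find("heading").text)  # Quarterly results
print(len(root.findall("para")))  # 2
```

Formatting (heading font, paragraph spacing) would be applied later by a separate process keyed to the element types, exactly as the paragraph above describes.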
Historically, the concept of structured text is a direct development from proprietary "markup" codes used by word processing and typesetting systems. However, as will be shown, a seemingly straightforward evolutionary change in the way electronic text is marked up and processed enables a major grade shift in the epistemic quality of what computer systems can do with the information. The key to this revolution is the Standard Generalised Markup Language (SGML), established as an international standard in 1986 (ISO8879).
“Markup” originally referred to handwritten editorial notes and codes on a manuscript or typescript telling a typesetter how to set text for the printed page. As typesetting was automated with molten metal (e.g., Linotype - Figure 13)124 and then electronic typesetting systems, editorial markup was turned into special codes125 setting off particular blocks of character data that the typesetting system would recognise as formatting instructions rather than as text to be set. Most word processors place formatting instructions into the text in the same way126.
Figure 13. Linotype machines used to cast hot metal into lines of type for printing. The machine on the left is a Model 48. The other is a Model 78. The left-hand figure shows the silvery colored furnace in the lower left quadrant supplying the molten lead-tin alloy used for casting the type bars. The flat trapezoidal trays at the top are "magazines" holding molds ("matrices") for the individual characters. When a key is pressed, the appropriate matrix is dropped into place in the "assembler box" used to form a line of type. Words are separated by wedge shaped space bands. When the line of type is complete it is automatically justified and then hot metal is cast against the matrices to form the line of type to be set up in a page. After the line is cast, the matrices are disassembled and lifted back to the top of the machine by a conveyor for automatic sorting and redistribution into the appropriate channels of the magazines. Photos courtesy of Dave's Linotype Website (Hughes 2001)
Two strands of development led from format markup to SGML and XML.
On the typesetting side, in the 1960's a peak typesetting industry body, the Graphic Communications Association (now known as the International Digital Enterprise Alliance), created a non–proprietary standard typesetting language called GenCode able to be understood by a wide range of typesetting systems. GenCode allowed publishers to maintain an integrated set of archives that could be reprinted by a wide range of typesetting systems and printers (Connolly, et al. 1997).
Beginning in 1970, IBM provided the other strand, with the development of General Markup Language (GML – Goldfarb 1996, 1997). GML formed the basis for IBM's 1978 Document Composition Facility127 that pioneered the core concepts of a formal Document Type Definition (DTD) and the semantic tagging of functional elements of text (i.e., "content markup").
In the early 1980s, the GenCode and GML communities joined to form the American National Standards Institute (ANSI) Committee on Computer Languages for the Processing of Text, which drafted the ISO specification 8879 (1986). Standard Generalised Markup Language (SGML) incorporated GML’s concepts of descriptive/functional markup, document type definitions, and the complete separation of standardised non–proprietary text markup from format processing instructions. Format processing was performed by proprietary output systems that applied formats based on element tags and their location in the overall structure of the document (Sperberg–McQueen and Burnard, 1994).
Specifically, SGML is a general purpose language for defining logical structures of different kinds of documents and for defining markup tags allowed to be used in those structures. Documents marked up in the SGML language have three main parts:
A declaration, defining to the computer (a) what characters may be used as tag delimiters and as characters of normal text, and (b) various parameters determining other syntactical details of the SGML language itself.
A DTD, defining to the computer what element tags are allowed in the document and structural rules determining where the defined tags may be used in the logical structure of the document.
An instance of the textual content (i.e., the output document), which contains elements of text marked up in conformance with the DTD.
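The three parts can be illustrated with XML, SGML's modern descendant, discussed below. Note that Python's standard library checks only that this sketch is well formed; it does not validate the instance against the DTD, and the document content is invented for illustration:

```python
import xml.dom.minidom as minidom

# Declaration, DTD (here as an internal subset) and instance in one
# self-defining container. Python's parser checks well-formedness
# only; a validating parser would also enforce the DTD's rules.
doc = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to   (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><to>Reader</to><body>An instance conforming to the DTD.</body></note>
"""
tree = minidom.parseString(doc)
print(tree.documentElement.tagName)  # note
```

As the following paragraph notes, in practice the declaration and DTD would usually live in a central repository and be referenced rather than shipped inside every document.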
Documents may be delivered as completely self–defining containers of information, with their declaration and DTD attached. However, in most cases, declarations and DTDs are held in central repositories and are only referred to as required to control authoring or formatting applications. Many DTDs may be defined using the same declaration, and many documents may be structured in conformance to the same DTD. In most applications, DTDs are published as open standards. Many parties exchanging the same types of documents may use these standards, irrespective of the software applications used. Major uses of SGML are in the production of technical documentation and automated assembly of journals and business advisory services. CALS (Computer aided Acquisition and Lifecycle Support) initiatives promoted by defense logistics and acquisition organizations in many countries greatly encouraged the spread of SGML128.
SGML was designed to be processed on mainframes where the ASCII characters of the tags were manually typed. Consequently, SGML has a number of optional provisions to minimize typing tags in locations where a powerful computer system could infer they must occur. Tag minimization helped those who only had simple line editors create SGML texts. However, the complexity of the processing needed to infer the missing tags made parsing, editing and print formatting programs substantially more complicated than needed and was counterproductive for developing personal computer applications. Nevertheless, authoring software was developed to read SGML DTDs and understand the syntactical rules they described129. To help the author think structurally, most such systems provide a palette or logical tree view (GUI) of the types of elements allowed at any specified point in the document. To ensure conformance to the associated DTD, authoring applications also parse the document structure as the text is typed and marked up, or at least identify and flag illegal structures when the document is saved.
Aside from SGML and GML (which is still used in some IBM 360/370 installations) there are two flavours of semantic markup language in common use today:
HTML is a tagging standard based on a single SGML DTD, first released to the world in 1993 (Sears, 1998). The DTD defines a standard format–oriented markup able to be understood by compliant Web browsers. However, HTML may be hand coded (i.e., using extensive tag minimisation) and many browsers do not enforce strict compliance to any version of the HTML DTD. Also, the dominant browser developers (i.e., Netscape and Microsoft) have each introduced their own proprietary tagging options on top of the established standard. Consequently, much of the HTML formatted text on the Web today does not strictly comply with any DTD (Sears, 1998)130. Because of these factors, although HTML is one of the key factors that enabled the amazing growth of the World Wide Web as a universal information exchange medium between people131, HTML does not facilitate the computer aided processing or retrieval of semantic content.
XML is a full featured version of SGML developed and optimised132 for use by personal computer applications and Web browsers. Compared to HTML, which serves primarily as a standard for formatting markup, XML is intended to tag content with semantic information for use on the Web and in exchanges between different information systems. Like HTML, XML texts can be authored and delivered as "well–formed"133 documents to the Web without an associated DTD, along with separately constructed style sheets of various kinds to control the formatting. However, as with SGML, texts can be authored and distributed under full DTD control to ensure standardisation between communication partners134.
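The difference can be sketched by marking up the same fact in both styles; the semantic element and attribute names below are invented for illustration:

```python
import xml.etree.ElementTree as ET

# The same fact marked up two ways. The HTML-style tags say only how
# the phrase should *look*; the XML tags (invented for this sketch)
# say what it *means*, so a program can retrieve content by meaning.
html_style = "<p>The price of the widget is <b>$2.50</b>.</p>"
xml_style = ("<product><name>widget</name>"
             "<price currency='USD'>2.50</price></product>")

product = ET.fromstring(xml_style)
print(product.find("name").text)              # widget
print(product.find("price").get("currency"))  # USD
```

A program confronted with the HTML version can only discover that something is bold; the XML version lets it answer the question "what does the widget cost, and in which currency?" without human interpretation.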
All of these markup languages have been established as non–proprietary international standards for marking up content. The intent of all is to allow content to be readily exchanged between the whole range of applications (editors, viewers, output formatters, indexing systems, etc.) and between different applications of each type conforming to the standard. How HTML has added value to knowledge is explored in a later section, where the growth of the Web is considered.