
Michael Hart and Project Gutenberg

March 8, 2013

I started graduate school in 1970 at a university with a multitude of time-sharing computer terminals for student use. It was the yesteryear equivalent of giving every student a laptop or tablet computer. This was an IBM System/360 implementation for which the time-sharing programming language was APL.

I always liked APL, probably because it was designed by a mathematician, Kenneth Iverson; but most of the other students were likely muttering the yesteryear equivalent of WTF?, which was just F![1] It was an interpreted language, with instant gratification; it had useful mathematical functions, such as sorting, built into single symbols; and it allowed processing of alphabetic characters.

One interesting feature of the system, unknown to most students, was the ability to send short text messages between users who were logged in at different terminals. Since I had both a student account and a research account, I decided to test this feature. It worked, and the ability to send a message across a room seemed magical. This gives credence to Arthur C. Clarke's law that any sufficiently advanced technology is indistinguishable from magic.

My message was likely limited to 128 characters, or some small power of two, so it was roughly equivalent to a tweet, and it was a tweet that could only reach a single person. At roughly the same time, July 4, 1971, Michael S. Hart was typing the US Declaration of Independence into a Xerox Sigma V mainframe computer at the Materials Research Laboratory of the University of Illinois. Michael Hart was born on this date in 1947.[2-6]

Hart got a copy of this text during a stop at a grocery store after a holiday fireworks display. Since his computer was linked to others through ARPANET, the predecessor to today's Internet, he had planned to email it to everyone with email access. He was dissuaded from sending what would have been the first spam message, so he instead posted a notice that the text was available for download.


Michael S. Hart, March 8, 1947 - September 6, 2011.

(Photo by Doug Bowman, Creative Commons licensed image from Flickr.)


We might not have heard about Michael Hart were it not for his continued production and posting of the electronic text of other public domain documents, such as the US Bill of Rights. All this early work was done by Hart himself, but volunteers soon joined him in his goal of putting as much free content as possible up on the fledgling Internet. Early works posted were the US Constitution, the King James Bible, and Alice's Adventures in Wonderland.

This effort, now called Project Gutenberg, is the repository of more than 40,000 books, some in languages other than English, with a projected total number of books that might reach a million by 2021, the project's fiftieth anniversary.[3] One explanation of the project name is that Johannes Gutenberg's printing press paved the way for books to reach beyond the elite classes.[6]

These electronic texts were first encoded in ASCII, the "plain-vanilla," universal text standard, but many are now also available in the common electronic book formats. Hart rightly chose ASCII, which doesn't depend on any proprietary file format. ASCII might lack text formatting features, but it contains the text in an always backwards-compatible rendition.[5]
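
The backwards-compatibility point can be illustrated with a short Python sketch. It assumes a small sample string (the opening words of the Declaration of Independence, chosen here only for illustration) and hypothetical helper names. It shows that ASCII-encoded bytes decode to the same text under ASCII, UTF-8, and Latin-1, since those later encodings keep ASCII as their first 128 code points.

```python
# Sample text chosen for illustration; any pure-ASCII string would do.
sample = "When in the Course of human events"


def is_plain_ascii(text: str) -> bool:
    """Return True if every character fits in 7-bit ASCII."""
    return all(ord(ch) < 128 for ch in text)


def decodes_identically(text: str, encodings=("ascii", "utf-8", "latin-1")) -> bool:
    """Encode as ASCII, then check each listed encoding decodes it unchanged."""
    raw = text.encode("ascii")  # raises UnicodeEncodeError if text is not ASCII
    return all(raw.decode(enc) == text for enc in encodings)


if __name__ == "__main__":
    print(is_plain_ascii(sample))
    print(decodes_identically(sample))
```

The trade-off is the one noted above: a file like this carries no formatting, but nearly any software can read it.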

Here's a short summary of Project Gutenberg's accomplishments.

Date        Milestone
1973        The United States Constitution
1974-1988   Collected Works of William Shakespeare
8/1989      King James Bible
1/1991      Alice's Adventures in Wonderland
7/1991      Peter Pan
1/1994      The Complete Works of William Shakespeare (eBook #100)
8/1997      Dante's Divine Comedy (eBook #1,000)
5/1999      Don Quixote (eBook #2,000)
4/2002      The Notebooks of Leonardo da Vinci (eBook #5,000)
10/2002     The Human Genome Project
10/2003     The Magna Carta (eBook #10,000)
10/2006     Twenty Thousand Leagues Under the Sea (audiobook, eBook #20,000)
6/2011      36,000 ebooks in many languages

One significant milestone occurred in October 2003, when the collection reached 10,000 books, having doubled from 5,000 in 18 months. This is essentially the "Moore's Law" point for Project Gutenberg. Here's a summary of the number of books published in the Project's history.


More than 40,000 eBooks are now available from Project Gutenberg, with a growth approximating an exponential function of time.

(Graph by the author, rendered using Gnumeric, from Project Gutenberg Statistics.)
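
The 18-month doubling can be turned into a rough growth rate. The following Python sketch assumes pure exponential growth, N(t) = N0 * 2^(t/T), with the doubling time T = 18 months taken from the eBook #5,000 (April 2002) and eBook #10,000 (October 2003) milestones. The function names are illustrative, and the real catalog only approximates this curve.

```python
import math

# Assumed doubling time: April 2002 (#5,000) to October 2003 (#10,000).
DOUBLING_MONTHS = 18


def annual_growth_factor(doubling_months: float) -> float:
    """Factor by which the collection grows in 12 months."""
    return 2 ** (12 / doubling_months)


def projected_count(start_count: float, months_elapsed: float,
                    doubling_months: float) -> float:
    """Collection size after months_elapsed, under pure exponential growth."""
    return start_count * 2 ** (months_elapsed / doubling_months)


def months_to_reach(start_count: float, target_count: float,
                    doubling_months: float) -> float:
    """Months needed to grow from start_count to target_count."""
    return doubling_months * math.log2(target_count / start_count)


if __name__ == "__main__":
    print(f"Annual growth factor: {annual_growth_factor(DOUBLING_MONTHS):.3f}")
    print(f"Months from 10,000 to 20,000: "
          f"{months_to_reach(10_000, 20_000, DOUBLING_MONTHS):.0f}")
```

An 18-month doubling time corresponds to growth of roughly 59% per year. At that pace the collection would have reached 20,000 books in April 2005; the milestone table puts eBook #20,000 in October 2006, so the doubling time lengthened after 2003.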


Michael Hart, who died at age 64 on September 6, 2011, was an interesting character, much like Richard Stallman. Just as Stallman advocates software freedom, Hart advocated an Information Age freedom of the press that would allow everyone free access to the world's intellectual corpus. Such a freedom has now become an essential human right.

Hart's facility in both the literary and computer worlds might have derived from his parents' professions. They were both instructors at the University of Illinois, where his mother, a World War II cryptanalyst, taught mathematics, and his father taught Shakespeare.[3] Hart grew up with computers; and, being just a few months older than I am, it's not surprising that he and I both had home-built CP/M computers.[5] Hart attended lectures at the university before he was a high school student.[3]

Hart lived simply, not even owning a mobile phone.[6] He did his own home and auto repairs, and built electronic equipment, including computers, from discarded components.[2] He was the author of two books that he donated to Project Gutenberg, "A Brief History of the Internet," and "Poems and Tales from Romania."[3] Project Gutenberg can only host donated or public domain items, and this highlights a recent problem with copyright, at least in the United States.

The Copyright Term Extension Act of 1998 effectively removed about a million books from the public domain by extending the copyright term by twenty years.[3] In the first forty years of Project Gutenberg, the average lifetime of a copyrighted work increased from thirty to nearly a hundred years. Librarians and publishers petitioned the US Supreme Court for redress in a test case, Eldred v. Ashcroft, but the Court declared the law constitutional on January 15, 2003.

Michael Hart expressed his vision succinctly in July 2011, a few months before his death:
"One thing about eBooks that most people haven't thought much is that eBooks are the very first thing that we're all able to have as much as we want other than air. Think about that for a moment and you realize we are in the right job." [2]

References:

  1. Although it looks like it, F! is not fifteen factorial.
  2. Gregory B. Newby, Michael Stern Hart (1947-2011), Project Gutenberg Web Site.
  3. William Grimes, "Michael Hart, a Pioneer of E-Books, Dies at 64," The New York Times, September 8, 2011.
  4. Shane Richmond, "Michael Hart, creator of the ebook, dies," Telegraph (UK), September 8, 2011.
  5. Michael Jon Jensen, "Michael Hart, 1947-2011, Defined the Landscape of Digital Publishing," The Chronicle of Higher Education, September 12, 2011.
  6. Michael Hart, The Economist, September 24, 2011.
  7. History of Project Gutenberg.
