Most people would answer that there can be unboundedly many. This is partially true: we can have a book containing a single “a”, a book with “aa”, a book with one trillion a’s, because unrestricted books can be arbitrarily long. But there is one thing that reduces this infinity to a finite number – page numbering.

(Quick note: in August 2010 Google calculated that there were 130 million books in the world. Certainly a finite number.)

First of all, based on some tests I made, one page can contain 50 lines of text, with 85 characters each on average; I’ll round that to 100. Most commonly there are 64 characters to choose from – 26 pairs of upper- and lowercase letters and 12 punctuation symbols. Page 0 contains the book’s title, but, by the axiom of extensionality, two books with the same story are considered equal even if their titles differ. So the 0^{th} page is irrelevant.

The page number is usually written on a line below the text. So, for example, as long as the page number fits in one line, we have 49 lines to fill with text. If the page number fills all 50 lines, there is no room left for text, so we can ignore such pages; and there can’t be any pages beyond that, as the page number alone would need more space than the page provides. So the page count is bounded.

All pages with numbers between 1 and 10^{100}-1 have at most 100 digits, so the number fits in one line. This gives exactly 10^{100}-1 such pages, each with room for 49 lines of text, or 4900 characters. Page numbers between 10^{100} and 10^{200}-1 fit in 2 lines, leaving space for 4800 characters each; that is 10^{200}-10^{100} pages, and so on. Summing the character counts over all pages with numbers less than 10^{4900} gives the length of the string of characters representing the longest possible book: \(\sum\limits_{i=0}^{49} (10^{100(i+1)}-10^{100i})\cdot (49-i)\cdot 100\) (the i=49 term is zero, since those pages hold no text). Call this number N. To see how many different strings of characters we can have in that much space we take \(64^N\), where 64 is the number of available characters. This number is larger than \(64^{10^{4900}}\), so it is far greater than a googolplex. Think how many monkeys we would need to type all of them!
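As a sanity check, the sum defining N can be evaluated exactly with Python's arbitrary-precision integers (a sketch using the rounded figures above: 100 characters per line, 50 lines per page, pages numbered below 10^{4900}):

```python
# Total character capacity N of the longest possible book.
# Pages whose number has between 100i+1 and 100(i+1) digits use
# i+1 lines for the number, leaving 49 - i lines of 100 characters.
N = sum((10**(100*(i+1)) - 10**(100*i)) * (49 - i) * 100 for i in range(50))

# N is dominated by the last nonzero term, 100 * 10**4900, so it has 4903 digits.
print(len(str(N)))

# 64**N > 10**N, and 10**N exceeds a googolplex (10**10**100) iff N > 10**100:
print(N > 10**100)  # True
```

Comparing N itself against 10**100 avoids ever materializing 64**N, which would be hopeless to compute directly.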

But is that all? What if we count books of every allowed page count separately? Summing over them gives a much larger answer!

First, how many characters can we put on the x^{th} page? We need to know how much space the page number takes. Here the formula for the number of digits of a number comes in handy: \(\lfloor \log_{10}(x)+1\rfloor \). We divide it by 100 and round up to get the number of lines it occupies: \(\lceil {{\lfloor \log_{10}(x)+1\rfloor} \over 100} \rceil\). The rest of the lines can be filled with text, giving the following character count on page x: \((50-\lceil {{\lfloor \log_{10}(x)+1\rfloor} \over 100} \rceil)\cdot 100\). Very roughly this equals \(5000-\log_{10}(x)\). Call it \(c(x)\). So every n-page book contains at most \(\sum\limits_{i=1}^n c(i)\) characters. To count the books this allows we take 64 to the power of this sum, or equivalently, by the sum–product rule for exponents, \(\prod\limits_{i=1}^n 64^{c(i)}\). Summing over all allowed book lengths gives the following count of all books when the total page count matters: \(\sum\limits_{n=1}^{10^{4900}-1} \prod\limits_{i=1}^n 64^{c(i)}\). I'm not really sure how big this number is, but I think it still doesn't exceed a googolduplex (I may be wrong).
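The per-page capacity \(c(x)\) translates directly into code (a sketch under the same assumptions: 50 lines of 100 characters, the page number written on its own lines; digit count is taken via `len(str(x))`, which equals \(\lfloor \log_{10}(x)\rfloor+1\) but avoids floating-point trouble for huge page numbers):

```python
from math import ceil

def c(x: int) -> int:
    """Characters of text that fit on page number x (0 if none)."""
    digits = len(str(x))               # = floor(log10(x)) + 1
    number_lines = ceil(digits / 100)  # lines the page number occupies
    return max(0, (50 - number_lines) * 100)

print(c(1))         # 4900: a 1-digit number still takes one full line
print(c(10**100))   # 4800: a 101-digit number needs two lines
print(c(10**4900))  # 0: a 4901-digit number fills all 50 lines
```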

Most of these books are very complex, by which I mean they look like scnislavebilwavcajeuiopersvaweivrbgh, except \(10^{1000}\) characters long. But we can't do anything about that, can we? (spoiler: we can)

Wolfram Alpha states that the average length of an English word is 5.1 characters. According to the Global Language Monitor there are almost 1,020,000 words in the English language. Recall that the longest books can have N characters. If these books were written in English, that averages out to \(N \over 6.1\) words (we must include spaces). Every word can be any of the roughly one million words used in English, which gives \(1020000^{N\over 6.1}\) books written in English. This, of course, isn't an improvement, as even the number of 5-character strings over a 26-letter alphabet is almost 12 million – far more than the dictionary offers.
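The closing comparison is easy to check numerically (assuming the figures quoted above: a 1,020,000-word vocabulary and 64 raw characters per position):

```python
five_letter_strings = 26 ** 5    # strings from lowercase letters alone
english_words = 1_020_000        # Global Language Monitor estimate
per_slot_strings = 64 ** 6.1     # raw strings filling one 6.1-character word slot

print(five_letter_strings)               # 11881376, "almost 12 million"
print(per_slot_strings > english_words)  # True: restricting to real words shrinks the count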

How about adding capitalization? Different sources give different values, but one site I found says the average sentence is about 17 words long. So, on average, every 17th word begins a sentence and is thus capitalized. But a capitalized word has exactly as many possibilities as a lowercase one, so this doesn't increase the count.

How about punctuation? I found no estimates, but I'd guess a punctuation symbol appears approximately every 8 words. So that's 7 words of length 6.1 (including the space) and one of length 7.1: 1020000 choices for each of the first 7, and 12240000 for the last one (the word times 12 punctuation symbols). So every "punctuation block" has one of \(1020000^7\cdot 12240000\approx 1.4\cdot 10^{49}\) possibilities. Each block has about 50 characters (49.8 to be exact), so the approximate number of all books made of punctuated sentences is around \((1.4\cdot 10^{49})^{N\over 49.8}\).
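The block arithmetic can be verified directly (with the same assumptions: one punctuation mark per 8 words, 12 punctuation symbols, 6.1 characters per word including the space):

```python
words = 1_020_000         # estimated English vocabulary
punct_words = words * 12  # a word followed by one of 12 punctuation marks

block_choices = words**7 * punct_words  # possibilities per 8-word block
block_length = 7 * 6.1 + 7.1            # characters per block, spaces included

print(f"{block_choices:.2e}")     # about 1.4e49
print(round(block_length, 1))     # 49.8
```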

These books make more sense than czdbaevivberyuvscbhjkvsuyzboysuaei, but most of them still read like salad marshmallow, divide indeed spotlight also chocolate. I don't think we can do much better. We could use the standard sentence length to decide where periods should go, or make use of numbers, or implement LaTeX, but computing that must be pretty hard...

Appendix: I finally found where I first heard of this problem – Cantor's Attic. As far as I know nobody has actually computed that number, but I may be wrong.