Indexing your book

You’ve just written a book and checked the proofs. Now it’s time to prepare the index. How do you go about it?

One option is for someone else to do it. There are some talented indexers available. This is easy for the author, but there’s one catch. Nearly always, you as the author know your material better than anyone. (If you don’t, maybe you’re a celebrity and didn’t write the book with your name listed as the author. In this case, you might learn something by doing the index.)

In my experience, the person most likely to rely on the indexes in my books is me! A few years after writing a book, I want to check a point or a name for something new I’m writing, but I can’t remember the details. So I turn to the index of one of my books.

Back to your book. Let’s assume you’ve decided to prepare the index yourself. How do you start? It’s worth looking at indexes in a variety of books, especially ones similar to yours. You can also read advice on preparing indexes; there is some good material available.

I once read that indexes are typically between 2% and 10% of the length of a book. You can aim for a minimal index, 2% or less, or a comprehensive one, closer to 10%. Sometimes the publisher will impose constraints, for example on the number of pages allowed. By using very small print, you can pack in more entries.

You as the author know your own style and your work habits, and it’s important to find an approach that suits you. What I’ll do here is describe a couple of the ways I’ve gone about indexing in case you might find a useful idea or two. I’ll use examples from my latest book, Official Channels; you can download it for free and see the index for yourself.

Here are the first few entries in the index.

academic exploitation, 81–83, 128–29, 133
acknowledgement practice. See plagiarism
activists, 59, 101–5, 187–89. See also political jiu-jitsu
Acton, Lord, 30, 116–17, 166

A page-by-page approach

Before word processing, indexing involved going through the text page by page, adding entries to a handwritten list. Word processing makes things easier. Here’s one way to proceed. Go through the text page by page. When you see a word that should be in the index, make an entry in your index list, in no particular place. If you see that the word is relevant for several pages, include those pages, but otherwise don’t worry about whether you’ve already included the word. When you get to the end of the book, put everything in alphabetical order. You’ll have to amalgamate entries with the same word. For example, after putting entries into alphabetical order, you might find:

Acton, Lord, 30, 116–17
Acton, Lord, 166

Just put them together to form

Acton, Lord, 30, 116–17, 166
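
If you are comfortable with a little programming, the sort-and-amalgamate step can be automated. What follows is only a minimal Python sketch, not a description of any particular tool. It assumes each line of the rough list is a heading followed by comma-separated page references, with no See or See also cross-references yet; the function names are made up for this illustration.

```python
import re
from collections import defaultdict

PAGE_REF = re.compile(r"\d+(–\d+)?")  # a page such as 30 or a range such as 116–17

def parse_entry(line):
    """Split 'Acton, Lord, 30, 116–17' into ('Acton, Lord', ['30', '116–17'])."""
    parts = [p.strip() for p in line.split(",")]
    split_at = len(parts)
    # Walk back from the end while the parts still look like page references.
    while split_at > 0 and PAGE_REF.fullmatch(parts[split_at - 1]):
        split_at -= 1
    return ", ".join(parts[:split_at]), parts[split_at:]

def first_page(ref):
    """Numeric sort key: the page on which a reference such as '116–17' starts."""
    return int(ref.split("–")[0])

def merge_entries(lines):
    """Combine lines that share a heading and return them in alphabetical order."""
    pages_by_heading = defaultdict(set)
    for line in lines:
        heading, refs = parse_entry(line)
        pages_by_heading[heading].update(refs)
    merged = []
    for heading in sorted(pages_by_heading, key=str.lower):
        refs = sorted(pages_by_heading[heading], key=first_page)
        merged.append(", ".join([heading] + refs))
    return merged

rough_list = ["Neary, Vince, 5–10", "Acton, Lord, 166", "Acton, Lord, 30, 116–17"]
print("\n".join(merge_entries(rough_list)))
```

The plain alphabetical sort used here is the simplest possible one, so the result still needs a check by eye.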

What a computer can’t do well

Assuming you have an electronic copy of your book as it will appear, you can use a computer program that automatically compiles a concordance, which lists every mention of every word. The problem is that the program has no knowledge of what your book is about, so it chooses words without any understanding. That means there’s still a lot of work to do. Eliminating words such as “the” from the list is easy. However, there are two other problems.

Let’s say the program lists Zambia in your index. Did you really discuss Zambia? If you said, “Every country from Albania to Zambia,” then Zambia is not a useful entry. Someone using the index would expect that you’ve said something specific about Zambia. Maybe you did, just not on this particular page.

Suppose the program gives a list of page numbers for “community.” You did discuss the role of the community in your book, but you also used the word in a generic sense, for example, “In this community …” A useful index will include only those pages where there’s substantive attention to the concept of community. This means that you need to check every instance where you used the word and eliminate the unhelpful instances.
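
To see why a concordance cannot make these judgements, it helps to look at how little one does. The Python sketch below is an illustration only: the stop-word list, the function name and the three sample pages are all invented for the example, and the input is assumed to be a mapping from page number to the text on that page.

```python
import re
from collections import defaultdict

# Illustrative stop-word list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "from", "this", "every"}

def concordance(pages):
    """Map each word to the sorted list of page numbers on which it appears.

    `pages` is a dict of page number -> page text (an assumed input format).
    """
    found = defaultdict(set)
    for number, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOPWORDS:
                found[word].add(number)
    return {word: sorted(numbers) for word, numbers in found.items()}

# Invented sample text, loosely modelled on the examples above.
pages = {
    12: "Every country from Albania to Zambia has some form of official channel.",
    40: "In this community, complaints went nowhere.",
    41: "The role of the community in supporting whistleblowers is crucial.",
}
index = concordance(pages)
print(index["zambia"])
print(index["community"])
```

The program duly reports page 12 for Zambia and pages 40 and 41 for community. Only a reader can tell that page 12 says nothing about Zambia and that page 40 uses “community” in a generic sense.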

Finding every use of a word is one thing. An index has added value when it includes relevant pages where you didn’t even use a word. Suppose you’re writing about torture. You might have some pages about sensory deprivation where you don’t use the word torture, but it’s useful to include those pages.

Some indexes stick to words found in the text but give little information about the connections between the words. This is where the author, or a highly knowledgeable indexer, can provide guidance, especially using See and See also.

bill of rights. See First Amendment

In my book, I do not discuss the US bill of rights, but do discuss one important part of it, the First Amendment to the US Constitution. Using See does not necessarily imply that the bill of rights is the same as the First Amendment; it just gives an indication of where to look for something relevant to the bill of rights.

courts, 13, 23, 86–87, 109–10, 156–57. See also defamation; First Amendment; law; official channels

See also points to related topics. If I’m trying to think of the First Amendment but can’t remember the name, maybe I’ll think of courts or the law. Under entries for “courts” and “law,” the First Amendment is listed after See also. Unlike the bill of rights, courts are something I actually discuss, so those page numbers are included.

A skim-and-check approach

After indexing quite a few of my books, I found a method that works well for me. There’s one important requirement: I have a pdf of the entire book. It’s most convenient if page 1 of the book is page 1 of the pdf.

I start by going through the book from page 1. Typically there are one to five entries on each page, though this can vary considerably. On page 5, for example, I discuss the whistleblowing case of Vince Neary, so I begin an entry for him.

Neary, Vince, 5

It’s more than a passing reference: I discuss Vince’s case for several pages. So I look forward to see how long this is.

Neary, Vince, 5–10


[Photo: Vince Neary]

There’s also an entry for State Rail, about which Vince blew the whistle, and “whistleblowing” as a general topic. I add these to my index file, in alphabetical order.

In the text, I say that Vince had come to Australia from England. Should I include “England” in the index? Perhaps, for a very comprehensive index, or maybe if I discuss other individuals from England. But in this book, I don’t discuss England as a country, so I don’t include it in the index.

Another issue: State Rail, for which I’ve created an entry, is a government organisation in the Australian state of New South Wales, commonly abbreviated NSW. Should I include an entry for NSW, with a See cross-reference from “New South Wales”? I know that later in the book I have lengthy treatments of two other NSW organisations. So it would be reasonable to include “NSW” in the index. However, I don’t actually say anything specifically about the state of NSW, for example the population, the government or the climate. Because of this, I decide not to include “NSW” in the index. This is the sort of decision that determines how long the index becomes.

There are numerous decisions of this sort in any index. Should a word be included? What cross-references should be listed? Making decisions requires mental effort. This is why indexing is not a mechanical process — or at least shouldn’t be a mechanical process, if the index is to be really useful. This is also why I don’t work on the index for long stretches of time. An hour per day is plenty. That way I keep fresh, and on the following day my mind has processed some of the issues I had confronted.

To keep everything on the screen, I use two columns and a small font. I keep adding entries and adding page numbers to existing entries until reaching the end of the book. Through this skim stage, I’m not too worried about being comprehensive. The main thing is to pick up all significant topics.

Next I glance through the index to pick up anomalies and start adding See and See also cross-references. Then I start through the index, searching the book pdf for each word or phrase. Proper names are the easiest. One of my entries is Lord Acton. I search the pdf for Acton, noting the pages where it appears. If I picked up every instance while going through the text, the pdf search simply confirms them. Sometimes, though, I find that I missed an instance or mistyped a page number.
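
The comparison between what a search finds and what the draft entry lists can also be sketched in code. This is a hedged illustration rather than a recommended tool: it assumes the text of each page has already been extracted from the pdf into a mapping from page number to text, and the function names and sample pages are invented. The one fiddly part is expanding abbreviated ranges such as 116–17.

```python
def expand_reference(ref):
    """Expand '116–17' to [116, 117]; a single page '30' becomes [30]."""
    if "–" not in ref:
        return [int(ref)]
    start, end = ref.split("–")
    # An abbreviated end such as '17' borrows its leading digits from the start.
    if len(end) < len(start):
        end = start[: len(start) - len(end)] + end
    return list(range(int(start), int(end) + 1))

def check_entry(term, refs, pages):
    """Compare the pages listed for an entry with the pages where `term` occurs.

    Returns (missed, suspect): pages containing the term that the entry lacks,
    and listed pages on which the term does not appear at all.
    """
    listed = {page for ref in refs for page in expand_reference(ref)}
    hits = {number for number, text in pages.items() if term.lower() in text.lower()}
    return sorted(hits - listed), sorted(listed - hits)

# Invented page texts; the draft entry below mistypes 166 as 167.
pages = {30: "... Acton ...", 116: "... Acton ...", 117: "... Acton ...",
         166: "... Acton ...", 167: "..."}
missed, suspect = check_entry("Acton", ["30", "116–17", "167"], pages)
print(missed, suspect)
```

A “suspect” page is not necessarily wrong. In the middle of a range, a page may continue a discussion without repeating the name, so the output is a list of places to look, not a list of errors.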

For some entries, I don’t want to list every mention in the text. Many of the case studies in my book are Australian, so when I search the pdf for “Australia” there are a lot of hits. If I listed every one, there would be so many pages that the entry would be useless. No one wants to look at 50 or 100 different pages to find what they’re looking for. So I only include those pages where Australia is discussed, not just mentioned. Also, I have considerable discussions about several Australian organisations, for example Whistleblowers Australia. I add “See also Whistleblowers Australia” to the entry and don’t include the pages for Whistleblowers Australia under “Australia” unless there’s a comment about Australia as a country. The result:

Australia, 19, 22–26, 37, 43–47, 77–78, 119, 168–69, 172–75, 179–81. See also ASIC; HCCC; ICAC; Whistleblowers Australia

Because this entry has a fairly long list of pages, it is more unwieldy than most other entries. But it’s still more helpful than if I had listed every page where the word Australia appears. As well, the word “Australia” is not part of the name of the HCCC or ICAC. These are organisations in Australia, so the See also reference goes beyond a simple cross-reference to the word “Australia.”

Next consider a more challenging entry, discussed earlier:

courts, 13, 23, 86–87, 109–10, 156–57. See also defamation; First Amendment; law; official channels

I searched the pdf for the word “court” and decided to list some but not all pages where the word appears. Sometimes in the text I listed several examples of official channels — “grievance procedures, ombudsmen, anti-corruption agencies, and courts.” This sort of reference to courts isn’t worth including in the index because I haven’t said anything much about courts. It’s only when there is some substantive comment about courts that I want to include page numbers.

Along the way, I thought about other areas where courts are regularly involved, leading to See also references to defamation and the First Amendment. Courts are a type of official channel, so there’s a See also reference to official channels. Then, I thought, courts are intimately bound up with the law. At that stage I didn’t even have an entry for law. So I searched the pdf for all mentions, going through the same winnowing process, leading to this:

law, 33, 200–1. See also courts; First Amendment; injustice; official channels; SLAPPs
     and crusades, 44
     defamation, 24–25, 176, 179
     and HCCC, 229
     and myth system, 37–38
     and operational code, 38, 46
     serving power, 33
     whistleblowing, 19, 22–26, 28, 42–43, 45–48

In this entry, I list pages where I discuss law in general at the outset (law, 33, 200–1) and then have sub-entries for when law is part of a discussion of specific topics. Note how these are in alphabetical order in a peculiar way, with the main word potentially either before or after “law”. For example, the first item on the list, “and crusades,” is connected as “law and crusades” whereas the second item, “defamation,” is connected as “defamation law.” The “and” is not taken into account in forming the alphabetical order.
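
If you sort sub-entries with a program, the rule of ignoring “and” has to be spelled out. A minimal Python sketch, with a made-up function name, might look like this:

```python
def subentry_key(subentry):
    """Sort key for a sub-entry that ignores a leading 'and'."""
    text = subentry.lower()
    if text.startswith("and "):
        text = text[len("and "):]
    return text

subentries = ["whistleblowing", "and myth system", "defamation", "and HCCC",
              "serving power", "and crusades", "and operational code"]
for name in sorted(subentries, key=subentry_key):
    print(name)
```

This reproduces the order of the sub-entries under “law” shown above.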

The final sub-entry in this list, “whistleblowing,” is connected to “law” as “whistleblowing law.” Technically, it would be more appropriate to refer to “whistleblower law.” However, elsewhere in the index I made a major entry for “whistleblowing,” and for the purposes of the index it seemed to me unnecessarily discriminating to have separate entries for “whistleblowing” and “whistleblower.” Perhaps on another day I might have chosen differently.

For this index, I laid out the complex entries using the format above. Another option is:

law, 33, 200–1; and crusades, 44; defamation, 24–25, 176, 179; and HCCC, 229; and myth system, 37–38; and operational code, 38, 46; serving power, 33; whistleblowing, 19, 22–26, 28, 42–43, 45–48. See also courts; First Amendment; injustice; official channels; SLAPPs

This format is more compact, and I’ve used it in the past. However, it is not quite as convenient to read.
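
If the entry is stored as data rather than as typed text, switching between the two layouts is mechanical. The Python sketch below is one possible way to do it; the function names and the way the entry is stored are assumptions made for the illustration.

```python
def indented(heading, pages, subentries, see_also):
    """Lay out an entry with one sub-entry per line."""
    lines = [f"{heading}, {', '.join(pages)}. See also {'; '.join(see_also)}"]
    lines += [f"     {name}, {', '.join(refs)}" for name, refs in subentries]
    return "\n".join(lines)

def run_in(heading, pages, subentries, see_also):
    """Lay out the same entry as a single compact paragraph."""
    parts = [f"{heading}, {', '.join(pages)}"]
    parts += [f"{name}, {', '.join(refs)}" for name, refs in subentries]
    return "; ".join(parts) + ". See also " + "; ".join(see_also)

law_subentries = [
    ("and crusades", ["44"]),
    ("defamation", ["24–25", "176", "179"]),
    ("and HCCC", ["229"]),
    ("and myth system", ["37–38"]),
    ("and operational code", ["38", "46"]),
    ("serving power", ["33"]),
    ("whistleblowing", ["19", "22–26", "28", "42–43", "45–48"]),
]
law_see_also = ["courts", "First Amendment", "injustice", "official channels", "SLAPPs"]

print(indented("law", ["33", "200–1"], law_subentries, law_see_also))
print(run_in("law", ["33", "200–1"], law_subentries, law_see_also))
```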

After completing a draft of the index, it is worthwhile looking through it all again, noting any obvious problems. It is definitely worth checking the alphabetical order. If you use a sort function, it may not result in an order that you want.

There are a few complications in arranging entries in alphabetical order. Consider these two entries:

Whistleblowers Australia, 2, 5, 9, 14–16, 19–20, 52–54, 236–37
Whistleblower’s Survival Guide, 19–20

I’ve ignored the apostrophe for the purposes of alphabetical order, but my sort function put the two entries in reverse order.

Then there are numbers:

Ferguson, Adele, 172–75
5th Pillar, 69
First Amendment, 175–81

I’ve included “5th Pillar” as if it were spelled “Fifth Pillar.” You might prefer to put numbers at the beginning, before letters.

For “#MeToo,” I ignored the #:

medical dominance, 225–27
#MeToo, 114–15
Milošević, Slobodan, 166–67

Then there are titles with definite articles:

political jiu-jitsu, 144–52. See also backfire
The Politics of Nonviolent Action, 145
power, 27–29

I could have written the book entry as Politics of Nonviolent Action, The, 145. There are rules for most of these sorts of issues. I usually follow the rules because they are designed to make things consistent and easy, but sometimes I use my own judgement. Given that I’m the one likely to use my index more than anyone else, I want it to be convenient for me.
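
The choices in the examples above (ignoring apostrophes and #, treating “5th” as “Fifth,” skipping a leading “The”) can be written down as a sort key, so that a program produces the order you want instead of its default. This Python sketch is only one way to encode my choices, not a statement of the official rules; the function name and the small lookup table for numerals are invented for the illustration.

```python
import re

# Hypothetical lookup for headings that begin with a numeral; extend as needed.
SPELLED_OUT = {"5th": "fifth"}

def sort_key(heading):
    """Key that alphabetises a heading the way the examples above are ordered."""
    text = heading.lower()
    text = re.sub(r"['’#]", "", text)           # ignore apostrophes and '#'
    text = re.sub(r"^(the|an|a)\s+", "", text)  # ignore a leading article
    first, _, rest = text.partition(" ")
    if first in SPELLED_OUT:                    # file '5th' as if spelled 'fifth'
        text = (SPELLED_OUT[first] + " " + rest).strip()
    return text

headings = [
    "Whistleblower’s Survival Guide", "Whistleblowers Australia",
    "First Amendment", "5th Pillar", "Ferguson, Adele",
    "Milošević, Slobodan", "#MeToo", "medical dominance",
    "power", "The Politics of Nonviolent Action", "political jiu-jitsu",
]
for heading in sorted(headings, key=sort_key):
    print(heading)
```

If you prefer numbers before letters, or titles inverted as “Politics of Nonviolent Action, The,” the key would need to change accordingly.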

Ideally, you should find someone to check your index. Spot checks would involve looking at random pages, noting words or topics, and seeing whether the index lists those pages under those words or topics. Though I can’t remember ever asking anyone to check my indexes, it’s a worthwhile precaution. A friend told me about a book by a well-known author for which the page numbers listed in the index were in disarray, with few of them correct. How could this happen? Imagine that you accidentally use a version of the text with the wrong page numbers (even just an extra paragraph added early in the book could cause subsequent pages to be changed), or that the publisher adds a foreword and renumbers all the subsequent pages. Not a pleasant thought.

When preparing an index, sometimes I wish that I could rewrite aspects of the book. The index alerts me to inconsistent uses of words, to words that are overused, to repetitions in the text, and to important concepts that I’ve not addressed. Preparing the index offers a perspective on what you’ve written that may be slightly different from what you gained from the writing and proofreading. If you gain insights from the index, write them down for later. It’s possible you’ll prepare a second edition of your book!

Is there a politics of indexing? In other words, does indexing reflect the exercise of power? Any book has a politics in this sense. It’s your way of making sense of something, and in doing this you make assumptions and give a partial perspective via the words you use and don’t use. The index reflects the book’s politics, namely its perspective, and sometimes highlights or accentuates it. Does your index include emotive words such as abuse or exploitation? Does it include contentious topics?

If there’s a book about the politics of indexing, it would be fascinating to look at its index.

Brian Martin
bmartin@uow.edu.au

Thanks to Anneleis Humphries and Jason MacLeod for valuable comments on drafts.