1. Short guide to creating an index

  1. Open IndexMakerS.
  2. Select the documents to be indexed: Click the + icon in the "Create Index" palette (or cmd-N) and select the PDF(s) to be indexed.
  3. .
  4. Click on "Create Index".
  5. Edit the raw index using stop word lists, substitutions and other filters, or select only specific target words.
  6. .
  7. Format the result: In the preview view you can format your index for PDF, print, HTML or XML output and export it via the File menu.

2. Indexing

2.1. Generate Index

To select a PDF document to be indexed, click the "+" icon. You can also select a directory, in which case all PDF documents within that directory are loaded.
To delete a document, select it and click on the "-" icon.
You can change the order of the documents using drag & drop.
For each source document you can specify the page range to be indexed. Enter the first and last page in the "From" and "To" columns. If you leave the fields empty, the entire document is indexed.
Double-click on the file name of a source document to see a preview of the document.

Page count
In order to assign page numbers correctly, IndexMaker must know which physical page of the PDF carries page number 1.
If the first document starts with page 1 or a higher page number, select the first option and enter that page number in the text box.
If the page count starts with page 1 on a later page, select the second option and enter the physical page of the PDF on which the page count starts. All pages before that page are indexed with Roman numerals.

If you index several documents at once, you can specify that the numbering should be continuous. Otherwise, it will start again with page number 1 for each document.

2.1.1. Physical and content page count

2.2. Preferences

You must make these settings before you create the index!

2.2.1. Preferences for indexing

Specify which characters are allowed when indexing and which characters serve as word separators. The default settings are usually sufficient and do not need to be changed.
However, you can add further characters if necessary. If you also want Greek, Hebrew or Cyrillic characters, Katakana or Sanskrit to be recognized, you can add them to the allowed characters by checking the corresponding checkbox.


You can specify that email addresses should not be split when indexing, even though they contain word separator characters (@ and period). The same can be specified for prices (11,99) and times (14:45).
Alternatively, you can define your own condition using regular expressions.
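
To illustrate the idea of keeping such tokens in one piece, here is a minimal sketch of a tokenizer that first looks for protected patterns and only then splits the remaining text into words. The regular expressions and the function name are assumptions made for this example; they are not the patterns IndexMaker uses internally.

```python
import re

# Hypothetical patterns for tokens that must not be split.
PROTECTED = re.compile(
    r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"   # email addresses
    r"|\d+,\d{2}"                     # prices such as 11,99
    r"|\d{1,2}:\d{2}"                 # times such as 14:45
)
WORD = re.compile(r"\w+")             # everything else: runs of word characters


def tokenize(text: str) -> list[str]:
    """Split text into words, keeping protected patterns intact."""
    tokens = []
    pos = 0
    for match in PROTECTED.finditer(text):
        # Ordinary words before the protected token.
        tokens.extend(WORD.findall(text[pos:match.start()]))
        tokens.append(match.group())
        pos = match.end()
    # Ordinary words after the last protected token.
    tokens.extend(WORD.findall(text[pos:]))
    return tokens


print(tokenize("Mail jane@example.com at 14:45, price 11,99."))
# ['Mail', 'jane@example.com', 'at', '14:45', 'price', '11,99']
```

A custom condition would correspond to adding one more alternative to the protected pattern.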

2.2.2. Observe upper and lower case

Indexing itself is always case-sensitive.
You can, however, specify whether upper and lower case are taken into account when your target and stop words are compared with the text of the source documents.

2.2.3. Cross-reference

If you place one term hierarchically below another, it no longer appears in its original position in the index but only below its hierarchical head term. You can, however, place a cross-reference in its original position, such as "see: ...". In the "Reference text" text box, enter the text with which your references should begin.

2.2.4. Edit Table

You can set the font size of the edit table here.

3. Edit Index

3.1. Index table

Column 1: Column 2: Column 3:

3.2. Filter index

In the filter panel you have many options to reduce the number of items in the index or change them.

3.2.1. Target Words

Instead of reducing and filtering the full text index step by step, you can alternatively define target words, i.e. specify a positive selection of words that should appear in the index.
Target words are collected in the list editor in the target word list and appear in the index list in green.

To specify a word as a target word you have the following options:

With the last two options, the word is automatically added to the target word list. Likewise, all its substitutions - if any - will be added to the target word list. In the filter palette you can see the current number of target words.


If you select "hide others" for the target words in the filter palette, only the entries that were defined as target words and found in the PDF will be shown.
If no target words have been defined yet or none of the target words were found in the index, the index is empty.

You can enter names of persons here in two forms: as "John Doe" or as "Doe, John". The entry appears in the index in the form you entered. If you specify "Doe, John", the text is also searched for "John Doe".
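
The handling of the two name forms can be pictured as follows. This is a minimal sketch under the assumption that an entry containing a comma is treated as "last name, first name"; the function name is hypothetical.

```python
def name_variants(entry: str) -> list[str]:
    """Return the phrases to search for, given a target word entry.

    "Doe, John" is searched for as entered and as "John Doe";
    an entry without a comma is searched for as entered.
    """
    if "," in entry:
        last, first = (part.strip() for part in entry.split(",", 1))
        return [entry, f"{first} {last}"]
    return [entry]


print(name_variants("Doe, John"))   # ['Doe, John', 'John Doe']
print(name_variants("John Doe"))    # ['John Doe']
```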

3.2.2. Stop Words

Stop words are words that carry so little meaning that they should not appear in the index, such as "in", "at", "with", "and", "the" or "that". They are grayed out in the index list and are not included in the preview or export.

To specify a word as a stop word you have the following options:

The word will then be automatically added to the stop word list. Likewise, all its substitutions - if any - will be added to the stop word list.

In the filter palette you can see the current number of stop words. You can prevent stop words from being displayed by checking the appropriate box in the Filter Palette.
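
As an illustration of how a stop word list reduces the index, the following sketch filters a small index and also shows the effect of the case sensitivity preference (see 2.2.2). The data structure (a dictionary from word to page list) and the function name are assumptions made for this example.

```python
def visible_entries(index: dict[str, list[int]], stop_words: list[str],
                    case_sensitive: bool = False) -> dict[str, list[int]]:
    """Return the index entries that are not stop words."""
    if case_sensitive:
        blocked = set(stop_words)
        return {word: pages for word, pages in index.items()
                if word not in blocked}
    # Case-insensitive comparison: fold both sides before comparing.
    blocked = {word.casefold() for word in stop_words}
    return {word: pages for word, pages in index.items()
            if word.casefold() not in blocked}


index = {"The": [1], "oak": [2, 5], "and": [3]}
print(visible_entries(index, ["the", "and"]))
# {'oak': [2, 5]}
print(visible_entries(index, ["the", "and"], case_sensitive=True))
# {'The': [1], 'oak': [2, 5]}
```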

3.2.3. Substitutions

The substitution list is used to replace a word with another word. This is useful if words are to be traced back to their basic form (e.g. trees -> tree). The replaced words are displayed in gray in the index list, since their addresses are added to those of their basic forms in the preview and during export.
Substitutions automatically become target words as well.

To assign a basic form to a word you have the following options:

The word is then automatically added to the substitution list. In the filter palette you can see the current number of substitutions.
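
The effect of a substitution on the exported index can be sketched like this: the page references of the replaced word are merged into those of its basic form. The data structures and the function name are assumptions for illustration only.

```python
def apply_substitutions(index: dict[str, list[int]],
                        substitutions: dict[str, str]) -> dict[str, list[int]]:
    """Merge the pages of substituted words into their basic forms."""
    merged: dict[str, set[int]] = {}
    for word, pages in index.items():
        base = substitutions.get(word, word)   # unchanged if no substitution
        merged.setdefault(base, set()).update(pages)
    return {word: sorted(pages) for word, pages in merged.items()}


index = {"tree": [3], "trees": [3, 7], "oak": [2]}
print(apply_substitutions(index, {"trees": "tree"}))
# {'tree': [3, 7], 'oak': [2]}
```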


3.2.4. Hierarchy

You can use this list to group terms under a generic term, for example if the terms oak, beech, maple, etc. should appear under the generic term tree. In the index, the sub-terms are displayed as follows: Oak (-> Tree). The hierarchy can contain several levels. To avoid circular references, sub-terms cannot be their own super-terms.

To make a word a hierarchical subheading of another you have the following options:

The word is then automatically added to the hierarchy list. In the filter palette you can see the current number of hierarchies.
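
The rule that a sub-term cannot be its own super-term can be checked by walking up the chain of generic terms before a new assignment is accepted. The following is a minimal sketch of such a check, assuming the hierarchy is stored as a mapping from each term to its generic term; it is not taken from IndexMaker itself.

```python
def would_create_cycle(parents: dict[str, str], child: str,
                       new_parent: str) -> bool:
    """Return True if placing child under new_parent closes a loop.

    Assumes the existing hierarchy in parents is free of cycles.
    """
    node = new_parent
    while node is not None:
        if node == child:
            return True
        node = parents.get(node)   # move one level up, None at the top
    return False


def set_parent(parents: dict[str, str], child: str, new_parent: str) -> None:
    """Place child under new_parent, rejecting circular references."""
    if would_create_cycle(parents, child, new_parent):
        raise ValueError(f"{child!r} cannot be placed under {new_parent!r}")
    parents[child] = new_parent


hierarchy: dict[str, str] = {}
set_parent(hierarchy, "oak", "tree")
set_parent(hierarchy, "tree", "plant")
print(would_create_cycle(hierarchy, "plant", "oak"))   # True
print(would_create_cycle(hierarchy, "beech", "tree"))  # False
```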


3.2.5. Other Filter

3.2.6. Font Filter

The font filter can be used to remove words from the index that occur in a specific font/size in the document. All fonts occurring in the document are listed. To remove words of a certain font from the index, the corresponding checkbox must be deactivated.
This can be helpful, for example, to exclude headings or footnotes, provided that they are set in a font or size that distinguishes them from the body text.
This function can be used with indexes created with version 5.1 or later.
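
Conceptually, the font filter keeps only those word occurrences whose font and size are still checked. The sketch below illustrates this with a hypothetical record type; the field names and font names are assumptions, not IndexMaker's internal format.

```python
from typing import NamedTuple


class Occurrence(NamedTuple):
    """One occurrence of a word in a source document (hypothetical)."""
    word: str
    page: int
    font: str
    size: float


def filter_by_font(occurrences: list[Occurrence],
                   enabled: set[tuple[str, float]]) -> list[Occurrence]:
    """Keep only occurrences whose (font, size) checkbox is active."""
    return [o for o in occurrences if (o.font, o.size) in enabled]


found = [
    Occurrence("Trees", 1, "Helvetica-Bold", 18.0),   # heading
    Occurrence("oak", 1, "Times-Roman", 10.0),        # body text
    Occurrence("ibid", 1, "Times-Roman", 8.0),        # footnote
]
# Only the body text font remains checked.
body_only = filter_by_font(found, {("Times-Roman", 10.0)})
print([o.word for o in body_only])   # ['oak']
```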

3.2.7. Location Format

The raw index initially contains redundant addresses (the locations where a word was found). You can determine how these addresses are formatted. Five formats are available for this purpose.
If you index several documents at the same time, you can display the file names in the addresses.
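
To give an idea of what removing redundancy from the addresses can mean, the following sketch drops duplicate page numbers and collapses consecutive pages into ranges. It shows one conceivable format only; the five formats offered by the program are not reproduced here, and the function name and separator are assumptions.

```python
def format_locations(pages: list[int], range_sep: str = "-") -> str:
    """Remove duplicate pages and collapse consecutive pages into ranges."""
    unique = sorted(set(pages))
    parts = []
    i = 0
    while i < len(unique):
        j = i
        # Extend the run while the next page follows directly.
        while j + 1 < len(unique) and unique[j + 1] == unique[j] + 1:
            j += 1
        if j == i:
            parts.append(str(unique[i]))
        else:
            parts.append(f"{unique[i]}{range_sep}{unique[j]}")
        i = j + 1
    return ", ".join(parts)


print(format_locations([3, 3, 4, 5, 9, 12, 12]))   # 3-5, 9, 12
```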

3.2.8. Spell check

If you check the "Only failed spell check" box in the filter palette, only the words that failed the spell check are displayed. This is useful for finding and correcting spelling errors.
You can disable the spell checker in the preferences.

3.2.9. Number filter

If you select "Do not index numbers" in the filter palette (see 3.2.0), you can exclude entries that contain numbers. With the gear icon you can define:
