1. Short guide to creating an index

Open IndexMaker.
Select the documents to be indexed: Click the + icon in the "Create Index" palette (or cmd-N) and select the PDF(s) to be indexed.
Click on "Create Index".
Edit the raw index using stop word lists, substitutions and other filters, or select only specific target words.
Format the result: In the preview view you can format your index for PDF, print, HTML or XML output and export it via the File menu.

2. Indexing

2.1. Generate Index

To select a PDF document to be indexed, click the "+" icon. You can also select a directory, in which case all PDF documents within it will be loaded.
To delete a document, select it and click on the "-" icon.
You can change the order of the documents using drag & drop.
For each source document you can specify the page range to be indexed. Enter the first and last page in the columns "From" and "To". If you leave the fields empty, the entire document will be indexed.
Double-click on the file name of a source document to see a preview of the document.

Page count
To assign page numbers correctly, IndexMaker needs to know which physical page of the PDF carries page number 1.
If the first document starts with page 1 or a higher page number, select the first option and enter that number in the text box.
If the page count starts with page number 1 on a later page, select the second option and enter the physical page of the PDF where the page count starts. All pages before that will be indexed with Roman numerals.

If you index several documents at once, you can specify that the numbering should be continuous. Otherwise, it will start again with page number 1 for each document.
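
The two page-count options can be pictured with a small Python sketch. It only illustrates the mapping described above and is not IndexMaker's actual code; the function names, the parameters and the use of lowercase Roman numerals are assumptions made for the example.

```python
def to_roman(n):
    """Convert a positive integer to a lowercase Roman numeral."""
    numerals = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
                (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
                (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]
    out = []
    for value, symbol in numerals:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def page_label(physical_page, first_counted_page=1, first_number=1):
    """Map a physical PDF page (1-based) to the page label used in the index.

    first_counted_page: physical page on which the page count starts (assumed name).
    first_number: page number carried by that page (assumed name).
    Pages before first_counted_page are labelled with Roman numerals.
    """
    if physical_page < first_counted_page:
        return to_roman(physical_page)
    return str(physical_page - first_counted_page + first_number)
```

With the first option and a document that starts on page 5, page_label(1, 1, 5) gives "5". With the second option and a page count that starts on physical page 5, page_label(3, 5) gives "iii" and page_label(7, 5) gives "3".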

2.1.1. Physical and content page count

2.2. Preferences

You must make these settings before you create the index!

2.2.1. Preferences for indexing

Specify which characters are allowed when indexing and which characters serve as word separators. Usually the basic settings are sufficient and do not need to be changed.
However, you can add additional characters if necessary. If you also want Greek, Hebrew or Cyrillic characters or Katakana or Sanskrit to be recognized, you can add them to the allowed characters by checking the checkbox.

You can specify that email addresses should not be separated when indexing, even though they contain word separator characters (@ and period). You can specify this for prices (11,99) or times (14:45) as well.
Or you can formulate your own condition using Regular Expressions.
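
As an illustration of such conditions, the following Python sketch keeps email addresses, prices and times together while splitting the remaining text into words. The patterns are deliberately simple assumptions written for Python's re module; the regular expression dialect that IndexMaker accepts may differ, so treat them as a starting point only.

```python
import re

# Hypothetical patterns for tokens that must not be split at separators.
EMAIL = r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"   # e.g. info@example.com
PRICE = r"\d+,\d{2}"                      # e.g. 11,99
TIME = r"\d{1,2}:\d{2}"                   # e.g. 14:45
WORD = r"\w+"                             # ordinary word

# The protected patterns come first so they win over the plain word pattern.
TOKEN = re.compile("|".join([EMAIL, PRICE, TIME, WORD]))


def tokenize(text):
    """Split text into index tokens, keeping protected tokens intact."""
    return TOKEN.findall(text)
```

For example, tokenize("Mail info@example.com at 14:45, price 11,99.") keeps "info@example.com", "14:45" and "11,99" as single tokens.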

2.2.2. Observe upper and lower case

Indexing is always case-sensitive.
You can specify whether or not to be case sensitive when comparing your target and stop words to the text of the source documents.

2.2.3. Cross-reference

If you place one term hierarchically below another, it will no longer be in its original position in the index but only below the hierarchical head term. You can, however, place a cross-reference in its original place, such as "see: ...". In the "reference text" field, enter the text you want your references to begin with.

2.2.4. Edit Table

You can set the font size of the edit table here.

3. Edit Index

3.1. Index table

Column 1:

This column informs you about the frequency of occurrence of a word.
Unfortunately, it is currently not possible to sort the index by frequency by clicking the table header. To learn how to sort an index by frequency, see the section on CSV export (5.2).

Column 2:

Words that appear red are not found in the dictionary (spell checker).
You can turn the spell checker on and off in the preferences.
Words that appear in green are target words. You can make an entry a target word by shift-clicking it. The word is then added to the target word list. In the filter palette, you can restrict the display to the target words.
Stop words are grayed out. An alt-click adds the word to the stop word list.
An arrow indicates the assignment of an entry to a basic form (e.g. cognitive -> cognitive). A ctrl-click assigns the current selection as the base form to the word you click on. The word is then added to the substitutions list.
If a word is hierarchically subordinated to another, this is shown as follows: "Oak (^ Tree)". A cmd-click assigns the current selection as the hierarchical head term to the word you click on. The word is then included in the hierarchy list.

Column 3:

Here you can see where this word occurs in the source documents.
You can specify the format of the addresses and, if you are indexing several documents at once, you can also have the file names displayed. See 3.2.7, "Location Format".

3.2. Filter index

In the filter panel you have many options to reduce the number of items in the index or change them.

3.2.1. Target Words

Instead of reducing and filtering the full text index step by step, you can alternatively define target words, i.e. specify a positive selection of words that should appear in the index.
Target words are collected in the list editor in the target word list and appear in the index list in green.

To specify a word as a target word you have the following options:

Enter a target word by hand into the target word list in the List Editor.
Select a word in the index window and assign it as a target word in the word palette.
Hold down the Shift key and click on the word in the index list.

With the last two options, the word is automatically added to the target word list. Likewise, all its substitutions - if any - will be added to the target word list. In the filter palette you can see the current number of target words.

If you select "hide others" for the target words in the filter palette, only the entries that were defined as target words and found in the PDF will be shown.
If no target words have been defined yet or none of the target words were found in the index, the index is empty.

You can enter names of persons here in two forms: As "John Doe" or as "Doe, John". Accordingly, the entry appears in the index. If you specify "Doe, John", "John Doe" will also be searched for in the text.
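
The handling of the two name forms can be sketched in Python as follows. This is only an illustration of the behaviour described above; the function name is hypothetical and the real matching in IndexMaker may be more elaborate.

```python
def name_variants(entry):
    """Return the text forms to search for a target word that is a name.

    An entry in the form "Last, First" is also searched as "First Last";
    any other entry is searched only as written.
    """
    if "," in entry:
        last, first = (part.strip() for part in entry.split(",", 1))
        return [entry, f"{first} {last}"]
    return [entry]
```

name_variants("Doe, John") yields both "Doe, John" and "John Doe", while name_variants("John Doe") yields only "John Doe".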

3.2.2. Stopwords

Stop words are words that carry so little meaning that they should not appear in the index, such as "in, at, with, and, the, that". They are grayed out in the index list and not included in the preview or export.

To specify a word as a stop word you have the following options:

Enter a stop word by hand into the stop word list in the list editor.
Select a word in the index window and designate it as a stop word in the word palette.
Click on the entry in the index list while holding down the alt key.
Select "Word as stop word" from the Edit menu to make the current selection a stop word.

The word will then be automatically added to the stop word list. Likewise, all its substitutions - if any - will be added to the stop word list.

In the filter palette you can see the current number of stop words. You can prevent stop words from being displayed by checking the appropriate box in the Filter Palette.

3.2.3. Substitutions

The substitution list is used to replace words with another word. This is useful, for example, if words are to be traced back to their basic form (e.g. trees -> tree). These words are displayed in gray in the index list, since their addresses are added to their basic forms in the preview or during export.
Substitutions automatically become target words as well.

To assign a basic form to a word you have the following options:

Enter the word by hand into the substitution list in the list editor.
Select a word in the index window and enter the base form in the word palette.
Select the base form in the index list and then click on the word to be substituted while holding down the ctrl key.

The word is then automatically added to the substitution list. In the filter palette you can see the current number of substitutions.

3.2.4. Hierarchy

You can use this list to group terms under a generic term, for example if the terms oak, beech, maple, etc. should appear under the generic term tree. In the index, the sub-terms are displayed as follows: Oak (-> Tree). The hierarchy can contain several levels. To avoid circular references, sub-terms cannot be their own super-terms.

To make a word a hierarchical subheading of another you have the following options:

Enter the term by hand into the hierarchy list in the list editor.
Select the word in the index window and assign a generic term to it in the word palette.
Select the sub-term in the index list and then click on the word you want to be the super-term while holding down the cmd key.

The word is then automatically added to the hierarchy list. In the filter palette you can see the current number of hierarchies.
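
The rule that a sub-term cannot be its own super-term, directly or through several levels, can be checked as in the following Python sketch. The dictionary layout and the function name are assumptions for the example and say nothing about how IndexMaker stores its hierarchy list.

```python
def creates_cycle(parents, term, new_parent):
    """Return True if assigning new_parent as super-term of term would create
    a circular reference.

    parents maps each sub-term to its current super-term (hypothetical layout).
    """
    current = new_parent
    seen = set()
    # Walk up the hierarchy from the proposed super-term.
    while current is not None and current not in seen:
        if current == term:
            return True
        seen.add(current)
        current = parents.get(current)
    return False
```

With parents = {"Oak": "Tree", "Tree": "Plant"}, placing "Plant" below "Oak" would be rejected, while placing "Beech" below "Tree" is fine.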

3.2.5. Other Filter

You can exclude numbers from the index. (see also 3.2.9)
You can restrict the index to words that start with an uppercase letter.
You can display all entries consistently in lower case (useful in English).
You can specify the minimum number of letters a word must have to appear in the index.
You can restrict the index to words that occur with a certain frequency,
or display only those words for which the spelling correction failed (see 3.2.8).

3.2.6. Font Filter

The font filter can be used to remove words from the index that occur in a specific font/size in the document. All fonts occurring in the document are listed. To remove words of a certain font from the index, the corresponding checkbox must be deactivated.
This can be helpful, for example, to exclude headings or footnotes, provided that they are set in a font that distinguishes them from the rest of the text.
This function can be used with indexes created from version 5.1 onwards.

3.2.7. Location Format

The raw index initially contains redundant addresses (the locations where a word was found). You can determine the formatting of the addresses. Five formats are available for this purpose. If you index several documents at the same time, you can also display the file names in the addresses.

1,1,2,2,... Each page number of each location is listed.
1,2,3,... Each page number appears only once, even if a term appears multiple times on the page.
1-10,12... Each page number appears only once, even if a term appears multiple times on the page. Consecutive pages are combined with a hyphen.
1f.,5ff.,... A single following page is summarized with f., several following pages with ff.
1f., 5-9,12,... A single following page is summarized with f., several following pages with a hyphen.
without - The found locations are hidden.
show filenames - The file names also appear before the page numbers. This is useful when indexing multiple documents at the same time.
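
The five formats can be made concrete with a short Python sketch. The style names ("all", "unique", "ranges", "ff", "mixed") and the function names are invented for this example, and the exact spacing of the output in IndexMaker may differ.

```python
STYLES = {"all", "unique", "ranges", "ff", "mixed"}


def runs(pages):
    """Group page numbers into (start, end) runs of consecutive pages."""
    result = []
    for p in sorted(set(pages)):
        if result and p == result[-1][1] + 1:
            result[-1] = (result[-1][0], p)
        else:
            result.append((p, p))
    return result


def format_locations(pages, style):
    """Format a list of page numbers in one of the five location formats."""
    if style not in STYLES:
        raise ValueError(f"unknown style: {style}")
    if style == "all":
        return ",".join(str(p) for p in sorted(pages))
    if style == "unique":
        return ",".join(str(p) for p in sorted(set(pages)))
    parts = []
    for start, end in runs(pages):
        following = end - start  # number of following pages in this run
        if following == 0:
            parts.append(str(start))
        elif style == "ranges":
            parts.append(f"{start}-{end}")
        elif following == 1:
            parts.append(f"{start}f.")       # "ff" and "mixed" agree here
        elif style == "ff":
            parts.append(f"{start}ff.")
        else:                                # "mixed"
            parts.append(f"{start}-{end}")
    return ",".join(parts)
```

For the pages 1, 1, 2, 5, 6, 7, 8, 9, 12 the five styles produce "1,1,2,5,6,7,8,9,12", "1,2,5,6,7,8,9,12", "1-2,5-9,12", "1f.,5ff.,12" and "1f.,5-9,12".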

3.2.8. Spell check

If you check the "Only failed spell check" box in the filter palette, only the words that failed the spell check will be displayed. This feature is useful for finding and correcting spelling errors.
You can disable the spell checker in the preferences.

3.2.9. Number filter

If you select "Do not index numbers" in the filter palette (see 3.2.5), you can exclude entries that contain numbers. With the gear icon you can define:

Whether pure numbers should be considered or not
Whether only the numbers should be deleted or the whole entry should be disregarded for entries that start with numbers
Whether only the numbers should be deleted or the whole entry should be disregarded for entries that end with numbers.
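
One possible reading of these three settings is sketched below in Python. The parameter names and the "strip"/"drop" values are assumptions for the example, and the precise behaviour of the filter in IndexMaker may differ in detail.

```python
import re


def filter_numbers(entry, drop_pure=True, leading="strip", trailing="strip"):
    """Apply the number filter to one entry.

    Returns the filtered entry, or None if the entry is discarded.
    leading / trailing: "strip" removes only the digits,
    "drop" discards the whole entry.
    """
    if re.fullmatch(r"\d+", entry):          # entry consists of digits only
        return None if drop_pure else entry
    if re.match(r"\d", entry):               # entry starts with digits
        if leading == "drop":
            return None
        entry = re.sub(r"^\d+", "", entry)
    if re.search(r"\d$", entry):             # entry ends with digits
        if trailing == "drop":
            return None
        entry = re.sub(r"\d+$", "", entry)
    return entry or None
```

With the default settings, "2024" is discarded, "2nd" becomes "nd" and "Windows95" becomes "Windows"; with trailing="drop", "Windows95" is discarded entirely.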

3.3. Reduce index

With the menu item "Edit > Reduce Index" you can reduce the size of the index. This permanently deletes all entries marked as stop words and all other filtered-out words, such as numbers, from the index. Deleted entries cannot be restored.
However, a smaller index is processed faster and saves memory.

3.4. General information

3.4.1. Hyphens and word separations

Because many PDF documents represent a hyphenated line break as the character string "- ", IndexMaker cannot distinguish between an enumeration ("in- or outbound") and hyphenation ("im- printed"). IndexMaker joins such words because hyphenation is much more common. The same problem arises when a hyphenated compound is broken at a line boundary, e.g. "Goethe-Symposium", where "Goethe-" is at the end of one line and "Symposium" at the beginning of the next.
The reason is that, unlike a typewriter, word processing programs do not use hard line breaks, so the text can reflow when the font size changes.
Hyphenation across pages cannot be recognized either, because there is no distinction between text and footnotes.
A hyphen within a compound is preserved: "Art-Fair" keeps its hyphen.
Unfortunately, the Unicode character U+00AD (soft hyphen) cannot be processed.
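
The ambiguity can be reproduced with a one-line rule in Python. This sketch only mimics the behaviour described above and is not taken from IndexMaker; the function name is hypothetical.

```python
import re


def join_hyphenated(text):
    """Join word parts separated by a hyphen followed by a space."""
    return re.sub(r"(\w)- (\w)", r"\1\2", text)
```

join_hyphenated("im- printed") gives "imprinted" as intended, but join_hyphenated("in- or outbound") gives "inor outbound", which shows why enumerations are damaged. "Art-Fair" has no space after the hyphen and is left unchanged.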

3.4.2. Index names

If you want to index full names, you can specify names of people in two forms in the connected words list: As "John Doe" or as "Doe, John". The entry in the index will appear accordingly. If you specify "Doe, John", "John Doe" will also be searched for in the text.

4. Preview

In preview mode you can format your index for printing or for export.

Select the components you want to output and their order. Depending on which components you select, the corresponding palettes appear below to specify the details.

You can also attach your index to the original document here and add links if necessary.

4.1. Text Formatting

You can format the finished index in different ways, such as:

Set a heading for the index
Set the font and alignment
Specify the font size separately for the heading, initials, entries, and addresses
Set the font and background colour of the initials
Specify a separator between word and addresses
Specify the indentation for the addresses
Determine the number of columns
Decide whether your annotations for the terms should be displayed and, if so, whether they should be set off from the entry and from the addresses with a line break

Keep in mind that bold and italic font styles are not available for every font.

Numbers in the index:

In the preview, the index entries are sorted alphanumerically (1,2,...9,10,11,...20...A,B,C...Z).
Leading zeros in front of numbers are ignored (002x -> 2).
You can also combine the numbers in the preview so that all numbers appear under the initial "0...". To do this, place a checkmark next to "Summarize numbers".
Page numbers can also be right-aligned if left-alignment or justification has been selected for the font.
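
The sort order described above (numbers before letters, leading zeros ignored) can be approximated with a Python sort key. This is a sketch under the assumption that letters are compared without regard to case; the function name is hypothetical.

```python
import re


def sort_key(entry):
    """Sort key: entries starting with digits first, by numeric value."""
    match = re.match(r"(\d+)(.*)", entry, re.DOTALL)
    if match:
        # int() drops leading zeros, so "002x" sorts as 2 followed by "x".
        return (0, int(match.group(1)), match.group(2).casefold())
    return (1, 0, entry.casefold())
```

sorted(["B", "10", "002x", "a", "9", "1"], key=sort_key) returns ["1", "002x", "9", "10", "a", "B"].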

Once you have made changes, click on "Apply" and the layout of the index will be recalculated.

4.1.1. Separator

If you leave the field for the separator empty, an entry will only be separated from its addresses by a space.
Alternatively, you can specify any character or string as a separator.

4.2. Glossary

Specify the title and settings for the glossary here.

4.3. Figures

Specify the title and settings for the list of illustrations here.

Note that entries only appear here if something has been entered in the text field in Edit mode.

4.4. Analysis

In the preview, under the Analysis tab, an evaluation of the word frequencies can be created. The list is grouped by frequency and can be sorted in ascending and descending order.

The further appearance of the analysis depends on the settings of the index.

Once you have made changes, click on "Apply" and the layout of the analysis will be recalculated.

4.4.1. Graphic

A graph can also be generated for evaluation.
The graph appears at the end of the evaluation. You can specify the axis labeling and the grid.

4.5. Pagination

For the PDF export you can specify here the page number with which the index should start. This makes it easier to attach the index to the source document later. The position of the page number can also be defined by specifying the distances from the outer and lower margins.
Select the checkbox if no page number should be displayed.
Also set the font size of the page numbers here.

4.6. Border Distance

You can set the distances between the text and the page edges.
To swap the inner and outer distances on even and odd pages, select the "Switch left/right pages" checkbox.

Once you have made changes, click on "Apply" and the layout will be recalculated.

4.6.1. Double Pages

4.6.2. Single Pages

4.7. HTML output

In the HTML output, an A-Z navigation can be included at the beginning of the document.

5. Save, load and export the index

You can open indexes that have already been edited and saved with "Open Index".

You can merge another index into the currently open one with the menu item "Add Index...". All settings (address formatting, maximum word length, etc.) will be retained from the first index.

When saving, all settings and lists are saved within the document. The file extension is .idxm.

If you later want to index other PDF documents with the same set of stop and target words, hierarchies and substitutions, select "Save lists without index". This will only save the lists you have created, which can then be reused.

5.1. Export

The index can be exported in numerous formats.

5.1. Text export

Specify here the format in which your index should be exported.
Specify which character should be inserted between word and annotation and between annotation and addresses, and how both should be indented.
Also specify what the indentation should look like for each hierarchical level.

5.2. CSV Export

With the CSV export, you can process the data of the index in a database.
The SQL command for creating a corresponding table could look like this:

CREATE TABLE words (
   word tinytext,
   frequency int(11) DEFAULT NULL,
   annotation text,
   addresses text,
   basicForm tinytext,
   stoppword tinytext,
   targetword tinytext,
   id int(11) unsigned NOT NULL AUTO_INCREMENT,
   PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

You can also use the CSV export to sort the index by the frequency of occurrence of the words. It is currently not possible to sort the index by frequency by clicking the table header, but you can export your index as a CSV file, import this file into Excel or Numbers and sort it there.

Drag the CSV file onto the Numbers program icon
Select "Organize" > "Show Sort Options"
Then select "Sort entire table" in the palette on the right margin and the second column where the frequencies are located.

Open MS Excel and load the CSV file via the menu "File" > "Import". Tab stops serve as separators.
Select the entire table by hovering over the column headers.
Then select "Data" > "Sort" from the menu and specify column B as the sort key.
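
If you prefer a script to a spreadsheet, the same sorting can be done with Python's csv module. The sketch below assumes, as described above, that tab stops serve as separators and that the frequency is in the second column; whether the exported file has a header row is not known here, so it is a parameter. The function name is hypothetical.

```python
import csv
import io


def sort_by_frequency(csv_text, has_header=False):
    """Sort the rows of a tab-separated index export by descending frequency."""
    rows = [row for row in csv.reader(io.StringIO(csv_text), delimiter="\t") if row]
    header, body = (rows[:1], rows[1:]) if has_header else ([], rows)
    # The frequency is assumed to be an integer in the second column.
    body.sort(key=lambda row: int(row[1]), reverse=True)
    return header + body
```

For the text "tree\t3\noak\t7\nbeech\t1\n" the function returns the rows for oak, tree and beech in that order. To sort a real export, read the file content into a string first and pass it to the function.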

5.3. MS Word export

You can export your index in Microsoft Word format.
Note that this does not preserve the column settings, but they are easy to reconstruct in Word.

6. List Of Figures

With IndexMaker you can easily create a list of figures.

To do this, select "List of Figures" in the Edit menu or press Option-Command-F. IndexMaker tries to recognize all images in the PDF and lists a small preview of each.
With the checkbox you can exclude individual entries. On the right side you can set the preview size or filter the images by size.

6.1. Add and delete lines

If an image is missing, you can add more lines or delete excess lines via the menu. Added lines can be assigned to a page number.

7. List Editor

The lists used to filter the raw index are an essential tool for editing the index. You can open the List Editor via the Edit menu, via the List Editor button in the Filter palette, or by clicking the List icon in the window header. You can use this editor to edit or create stop word, target word, substitution, and hierarchy lists at any time.

Words that were not considered during indexing are displayed in gray.

In the menu of the editor's header, select the list you want to see or edit.

Insert a word into a list -> with the plus icon
Delete a word from a list -> with the minus symbol (select word before)
Clear the current list -> in the popup menu
Add a list from a text file to the current list -> in the popup menu.
Note that the lists must be UTF-8 encoded.
Save the existing list in a separate text document -> in the popup menu
Add default sets of stop words and substitutions and the like to the current lists -> using the popup menu

You can also make all the words in the index target words or stop words, and then make a negative selection.

7.1. Create a target word list from spreadsheet

If you want to import a table in which first and last names are in separate columns into IndexMaker as a target word list, the two columns must first be combined into one. Then select the combined column (C in this case) and copy it.

Then open the TextEdit program and first convert the blank document to "plain text" under Format. Now paste the copied table column and save the document with the extension .txt. When saving, make sure that "Unicode (UTF-8)" is selected as the encoding.
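
As an alternative to the manual route via a spreadsheet and TextEdit, the list can be prepared with a few lines of Python. This sketch assumes a comma-separated table with the first name in the first column and the last name in the second, and it writes the names in the form "Doe, John"; the function name and the column order are assumptions for the example.

```python
import csv
import io


def names_to_target_words(table_text, delimiter=","):
    """Combine first-name and last-name columns into one name per line."""
    lines = []
    for row in csv.reader(io.StringIO(table_text), delimiter=delimiter):
        # Skip rows that do not contain both a first and a last name.
        if len(row) >= 2 and row[0].strip() and row[1].strip():
            lines.append(f"{row[1].strip()}, {row[0].strip()}")
    return "".join(line + "\n" for line in lines)
```

The returned text can then be saved as a .txt file with UTF-8 encoding, for example with pathlib.Path("targets.txt").write_text(result, encoding="utf-8"), and loaded in the List Editor.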

7.2. Create a target word list from word document

7.2.1.

7.2.2.

8. Word Panel

In the Word palette, you can affect the entry selected in the index as follows:

Define by which term a word should be replaced. For example, by the basic form (trees -> tree), synonyms (vacation -> vacation) or names (DaVinci -> DaVinci, Leonardo).
Set a generic term under which it should appear in the index. The term then appears in the index under the generic term hierarchically staggered.
If you select the reference option, the term appears not only under its generic term or its substitute, but also still in its normal place in the index with a reference to the generic term / substitute: "see..." You can change the reference text in the preferences. The references will be summarized accordingly.
You can add an annotation to each word, which can appear later in the index. This allows you to build a glossary, for example.
If you use substitutions or basic forms, enter the annotations there.
Specify whether the word should be a stop word. The word is then added to the stop word list or deleted from it.
Specify if it should be included in the target word list.
You can also delete the entry completely.
You can also duplicate the entry (see also 11.2 and 11.4).

9. Search Words / View Context

The search dialog "Context" can be called up via the window menu, via cmd-F or the search icon in the header of the window.
If you have previously selected a term in the raw index, the context search will open immediately with the corresponding found locations.

You can use it to search the index for terms. You will then receive a list with all references including the information in which document and on which page the term can be found.
With the eye symbol you can deactivate or activate each found location.
Next to it you see a short text excerpt with the occurrence of the term. With the slider you can determine the size of the text section.

You can also use a wildcard (*) for the search, which you can specify before or after the search term. Alternatively, you can search for phrases by entering multiple words.

9.1. Show original fonts

If you select the checkbox "show original font", the found locations are displayed in their original font.
This function can be used with indexes created from version 5.1 onwards.

9.2. Graphical display of found locations

In the lower part of the context search you can see graphically where in the document the current search term occurs. (Each line represents 100 pages)
This function can be used with indexes created from version 5.1 onwards.

10. Shortcuts

Click on an entry in the index

Double-click	Open context
Shift-Click	Target Word on/off	Entry in the index turns green
Option-Click	Stop Word on/off	Entry in the index turns gray
Ctrl-Click	Current selection becomes the base form of the clicked word	The base form is shown behind an arrow (->) and the word turns gray
Cmd-Click	Clicked word becomes the hierarchical subitem of the current selection	The umbrella term is shown in brackets after an arrow (->)

Shortcuts

ctrl-s	Make the current selection in the index list a stop word
ctrl-z	Make the current selection in the index list a target word

11. Problem solving

11.1. Not all names are found

11.1.1.

11.1.2.

11.1.3. Best Practice

The best solution here is to copy the apostrophe from the PDF, set it as a word separator in the preferences and create the index again.
Since there are a number of similar-looking apostrophes, this is the safest method.

11.2. Same names, different people

If an entry such as "Thomas Schmidt" is found in the index but refers to two different people with the same name, you can split the index entry.
To do this, select the entry in the raw index and duplicate it in the Word palette. Give the duplicate a unique name (e.g. "Thomas Schmidt (Hamburg)"). For the original entry, specify a substitution (e.g. "Thomas Schmidt (Munich)").
In the context search, you can now show or hide the corresponding occurrences for each of the entries.

11.3. Apparently missing addresses

11.4. Splitting entries

11.4.1. Procedure

1) Select "John" and create two duplicates in the word palette ("Duplicate item"). Name one duplicate "as preacher" and the other "as farmer".

2) Double-click on "as preacher" in the list. The context search opens. By clicking on the eye symbol, you can now remove the addresses that do not belong in this category. Do the same for "as farmer".

3) Select "as preacher" and enter "John" as an umbrella term in the word palette. Do the same for "as farmer".

Version 8