Run HIVE2

With the NER output, plain text, and a manifest in place, we are ready to process our materials in HIVE2 and generate the index terms.

  1. Run batch-indexer.py. A small GUI appears.
  2. Click "1. Select manifest.csv file" and, when prompted, navigate to the manifest file. Because the manifest is in the same folder as the plain-text and NER files, the script now has everything it needs to process the files.
  3. Click "2. Start indexing".
  4. HIVE2 will process each file individually, using the vocabularies and parameters specified in the manifest to generate index terms and links to online authority files for the entry.
  5. When indexing finishes, the script has created a new CSV file for each entry that stores the HIVE index terms. The output file name is the entry's base file name plus an _md suffix. The file is saved in the same batch folder as all of the other metadata-related files.
  6. The batch folder is now complete. It contains four files for each entry, plus one manifest file for the batch:

    Scope           Contents      File name
    each entry      full text     kp*.txt
    each entry      NER Topics    kp*a.csv
    each entry      NER Geo       kp*b.csv
    each entry      batch output  kp*_md.csv
    one per folder  manifest      manifest_*.csv
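
The file-name patterns listed above can be used to confirm that a batch folder really is complete after indexing. The following is a minimal sketch, not part of batch-indexer.py. It assumes that the * in each pattern stands for the same entry identifier, so an entry whose text file is kp0001.txt (a hypothetical name) would have kp0001a.csv, kp0001b.csv, and kp0001_md.csv beside it. The function names missing_files and has_manifest are illustrative only.

```python
from pathlib import Path


def missing_files(batch_dir):
    """Return {entry base name: [missing file names]} for incomplete entries.

    Assumption: every entry is identified by its full-text file kp*.txt,
    and the companion files share that base name.
    """
    batch = Path(batch_dir)
    problems = {}
    for text_file in sorted(batch.glob("kp*.txt")):
        base = text_file.stem  # e.g. "kp0001" (hypothetical entry ID)
        expected = [
            f"{base}a.csv",    # NER Topics
            f"{base}b.csv",    # NER Geo
            f"{base}_md.csv",  # batch output with the HIVE index terms
        ]
        missing = [name for name in expected if not (batch / name).exists()]
        if missing:
            problems[base] = missing
    return problems


def has_manifest(batch_dir):
    """Return True if the folder contains at least one manifest_*.csv file."""
    return any(Path(batch_dir).glob("manifest_*.csv"))
```

Calling missing_files on the batch folder returns an empty dictionary when every entry has all of its companion files. Any entry that still lacks its _md file was probably not processed and may need to be indexed again.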