Generating the Manifest

The manifest stores all of the parameters for HIVE.

HIVE2 needs to know which vocabularies to use for each file and which parameters apply to each vocabulary, such as the minimum word count. All of this information is stored in a CSV file that also lists the names of all the files to be processed. This file is called the “manifest” for HIVE. We create it with a Python script.

  1. Run auto-manifest.py on the batch folder.
    1. Provide the edition number (1 or 2 digits) of the batch folder when requested.
    2. Provide the letter name when requested.
  2. The script creates a manifest file and saves it to the batch folder. The file name is manifest_ followed by the letter name and the .csv extension. For example, for letter A:
    manifest_A.csv
  3. The batch folder now contains three files for each entry and a single manifest file for the batch:
    Scope           Contents     File name
    each entry      full text    kp*.txt
    each entry      NER Topics   kp*a.csv
    each entry      NER Geo      kp*b.csv
    one per folder  manifest     manifest_*.csv
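
Before handing the batch to HIVE, it can be useful to confirm that the folder really has the contents listed in step 3. The following is a minimal sketch of such a check, not part of auto-manifest.py. The function name check_batch_folder is hypothetical, and the sketch assumes that the three files for an entry share the stem of the full-text file (for example, kp001.txt goes with kp001a.csv and kp001b.csv), which the file name patterns above suggest but do not state outright.

```python
from pathlib import Path


def check_batch_folder(folder):
    """Return a list of problems found in a batch folder (empty if none).

    Hypothetical helper, not part of auto-manifest.py.
    """
    folder = Path(folder)
    problems = []

    # Assumption: an entry's NER files reuse the stem of its full-text file,
    # e.g. kp001.txt -> kp001a.csv (NER Topics) and kp001b.csv (NER Geo).
    for text_file in sorted(folder.glob("kp*.txt")):
        stem = text_file.stem
        for suffix in ("a.csv", "b.csv"):
            expected = folder / f"{stem}{suffix}"
            if not expected.is_file():
                problems.append(f"missing {expected.name}")

    # The batch folder should hold exactly one manifest file.
    manifests = sorted(folder.glob("manifest_*.csv"))
    if len(manifests) != 1:
        problems.append(f"expected 1 manifest file, found {len(manifests)}")

    return problems
```

Calling check_batch_folder on the batch folder path and getting back an empty list would mean that every full-text file has both of its NER files and that the folder holds a single manifest. The check looks only at file names, not at what the manifest contains.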