This topic describes how to set up and update the TF.IDF model with new training data.
/share/models/tfidf/en_abstracts.zip
The following procedure assumes that you have downloaded a corpus ZIP file and renamed it to en_abstracts.zip.
To update the TF.IDF model:
[2016/07/15 11:21:42 -0400] [Admin Server] Generating the tfidf model file using new model file...Success!
[2016/07/15 11:24:45 -0400] [Admin Server] Publishing the tfidf model file...
[2016/07/15 11:24:57 -0400] [Admin Server] Successfully published the model file.
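If you capture the command output in a script, one way to confirm that the update finished is to look for the final "Successfully published the model file" message. The following is a minimal sketch, not part of the product: the function name model_published and the variable sample_log are hypothetical, and the sample text is copied from the output shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of bdd-admin): returns 0 when the captured
# output contains the final success message, 1 otherwise.
model_published() {
  case "$1" in
    *"Successfully published the model file"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Sample output, copied from the messages shown in this topic.
sample_log='[2016/07/15 11:21:42 -0400] [Admin Server] Generating the tfidf model file using new model file...Success!
[2016/07/15 11:24:45 -0400] [Admin Server] Publishing the tfidf model file...
[2016/07/15 11:24:57 -0400] [Admin Server] Successfully published the model file.'

if model_published "$sample_log"; then
  echo "model published"
else
  echo "publish not confirmed"
fi
```

The check matches only the last message, so output that stops at "Publishing the tfidf model file..." is treated as not yet confirmed.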
The operation replaces the TF.IDF model's current JAR on the YARN worker nodes with the new one.
./bdd-admin.sh update-model tfidf
This reverts the TF.IDF model to the original, shipped version.