export chat history using element

created 2 functions to gen text and train
2022-08-16 01:27:53 -04:00 · 2022-08-16 01:25:38 -04:00
2 changed files with 29 additions and 1 deletion
--- a/README.md
+++ b/README.md
@@ -17,6 +17,8 @@ Client has started!
 ...
 ```

+If you do not want to wait for the bot to build its own dataset from new messages, you can export the existing chat history using [Element](https://element.io/blog/element-1-9-1-export-is-finally-here/). In this case, you will need to manually remove the timestamps from the exported text file.
+
 ## Setup

 The project is split into two parts: `index.js` and `textgen.py`. The `index.js` file contains the code that interacts with the user on Matrix and sends the text generated by the `textgen.py` file.
@@ -44,7 +46,7 @@ Before a bot can be used the fields in the `config.json` file must be populated

 ► user*         ⇢     Account's User ID.

-► file         ⇢     Path of file used for training the AI.
+► file         ⇢     Path of file used for training the AI (.txt file only).

 ► prefix       ⇢     Bot listens to commands that start with this prefix.
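
The README change above asks the user to strip timestamps from an exported chat log by hand. A minimal sketch of automating that step follows. The line format shown in the comment, the `TIMESTAMP_PREFIX` pattern, and the `strip_timestamps` and `clean_export` helpers are all assumptions for illustration and are not part of this repository; the actual layout of an Element export may differ, so the pattern would need adjusting.

```python
import re

# Hypothetical line format: "[2022-08-16 01:27:53] alice: hello there".
# The real Element export layout may differ; adjust the pattern to match it.
TIMESTAMP_PREFIX = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\]\s*")


def strip_timestamps(text: str, pattern: re.Pattern = TIMESTAMP_PREFIX) -> str:
    """Remove a leading timestamp from every line; other lines pass through."""
    return "\n".join(pattern.sub("", line) for line in text.splitlines())


def clean_export(src_path: str, dst_path: str) -> None:
    """Read an exported chat log and write a timestamp-free copy."""
    with open(src_path, encoding="utf-8") as src:
        cleaned = strip_timestamps(src.read())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(cleaned + "\n")
```

The cleaned file could then be referenced from the `file` field in `config.json`. Lines that do not start with a matching timestamp are left unchanged, so running the helper twice is harmless.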

--- a/textgen.py
+++ b/textgen.py
@@ -0,0 +1,26 @@
+from aitextgen.TokenDataset import TokenDataset
+from aitextgen.tokenizers import train_tokenizer
+from aitextgen.utils import GPT2ConfigCPU
+from aitextgen import aitextgen
+import json
+
+with open('config.json', 'r') as file:
+    json_object = json.load(file)
+
+file_name = json_object['file']
+
+def generate_message():
+    ai = aitextgen(model_folder="trained_model",
+                tokenizer_file="aitextgen.tokenizer.json")
+    return ai.generate_one()
+
+def train_ai():
+    train_tokenizer(file_name)
+    tokenizer_file = "aitextgen.tokenizer.json"
+    config = GPT2ConfigCPU()
+    ai = aitextgen(tokenizer_file=tokenizer_file, config=config)
+    data = TokenDataset(file_name, tokenizer_file=tokenizer_file, block_size=64)
+    ai.train(data, batch_size=8, num_steps=50000, generate_every=5000, save_every=5000)
+    print("AI has been trained!")
+
+print(generate_message())
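
`textgen.py` reads the training file path from `config.json`, and the README now states that only `.txt` files are supported. One way to surface a misconfigured path early is sketched below. The `load_training_file` helper is hypothetical and not part of this commit; it only assumes the `file` key that the script already reads.

```python
import json
from pathlib import Path


def load_training_file(config_path: str = "config.json") -> Path:
    """Return the training file path from the config, or raise if unusable."""
    with open(config_path, "r", encoding="utf-8") as fh:
        config = json.load(fh)

    path = Path(config["file"])
    # The README restricts the training data to plain .txt files.
    if path.suffix.lower() != ".txt":
        raise ValueError(f"'file' must point to a .txt file, got: {path}")
    if not path.is_file():
        raise FileNotFoundError(f"training file not found: {path}")
    return path
```

Calling this before training would fail fast with a readable message instead of an error from deep inside the tokenizer.
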
Author	SHA1	Message	Date
array-in-a-matrix	99b70ab839	export chat history using element	2022-08-16 01:27:53 -04:00
array-in-a-matrix	2fcda33cf8	created 2 functions to gen text and train	2022-08-16 01:25:38 -04:00