# text-gen-bot
A Matrix bot that uses a neural network to generate messages based on other users' messages. The first Matrix AI? [Featured on TWIM!](https://matrix.org/blog/2022/08/26/this-week-in-matrix-2022-08-26#dept-of-interesting-projects-%EF%B8%8F)
Note: The project is still under development, and some functionality is not fully implemented yet.
## Table of contents
- [Usage](#usage)
- [Commands](#commands)
- [Setup](#setup)
- [Configurations](#configurations)
- [Using existing messages](#using-existing-messages)
### Usage
First, install the required dependencies by following the [setup instructions](#setup). Then copy `example.config.json` and rename the copy to `config.json`. Replace the items in angle brackets with the respective values for the bot account (e.g. replace `<DOMAIN.TLD>` with the homeserver URL, such as `https://matrix.org` or `https://matrix.arrayinamatrix.xyz`). Each configurable field is explained in the [Configurations](#configurations) section of this document. To obtain an account's access token, follow the instructions [here](https://matrix.org/docs/guides/usage-of-matrix-bot-sdk#instantiation).
Once the config file has been populated with valid data, run `index.js`. Warning: the first run will be slow.
```sh
$ node index.js
...
<some warnings show up, ignore them>
...
Client has started!
...
```
List of directories and files created by the project and its dependencies:
- `aitextgen/`
- `node_modules/`
- `trained_model/`
- `aitextgen.tokenizer.json`
- `storage.json`
- `training-matrix.txt`
#### Commands
The bot supports the following commands:
```text
► speak ⇢ Generates a message in the room.
► prompt ⇢ Generates a message using a prompt.
```
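As a rough illustration of how a prefixed command such as `speak` or `prompt` might be recognised in a message body, here is a minimal sketch. The helper name `parseCommand` and the `!` prefix are hypothetical and are not taken from `index.js`; the real bot may parse commands differently.

```javascript
// Hypothetical helper (not from index.js): split a message body into a
// command name and the remaining argument text.
// `prefix` stands in for the `prefix` value from config.json.
function parseCommand(body, prefix) {
  // Ignore anything that is not a string or does not start with the prefix.
  if (typeof body !== "string" || !body.startsWith(prefix)) return null;

  const rest = body.slice(prefix.length).trim();
  if (rest === "") return null; // the prefix alone is not a command

  // The first word is the command name; everything after it is the argument.
  const [name, ...args] = rest.split(/\s+/);
  return { name: name.toLowerCase(), args: args.join(" ") };
}

// Example with an assumed "!" prefix:
// parseCommand("!prompt hello world", "!") returns { name: "prompt", args: "hello world" }
// parseCommand("hello", "!") returns null
```

Returning `null` for non-commands lets the caller treat ordinary messages as training data and only act on messages that parse successfully.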
To train the AI, run this command:
```sh
> python3 train.py
```
### Setup
The project is split into two parts: `index.js` and `textgen.py`. `index.js` contains the code that interacts with users on Matrix and sends the text generated by `textgen.py`.
Install [matrix-bot-sdk](https://github.com/turt2live/matrix-bot-sdk) and [python-shell](https://github.com/extrabacon/python-shell) (JS):
```sh
> pnpm add matrix-bot-sdk
> pnpm add python-shell
```
Install [aitextgen](https://github.com/minimaxir/aitextgen) (PY):
```sh
> pip3 install aitextgen
```
### Configurations
Before the bot can be used, the fields in `config.json` must be populated with valid information. Values in angle brackets (starred below) must be supplied before use.
```text
► homeserver* ⇢ URL of the bot's homeserver.
► token* ⇢ Account token used to sign in the bot.
► user* ⇢ Account's User ID.
► file ⇢ Path of file used for training the AI (.txt file only).
► prefix ⇢ Bot listens to commands that start with this prefix.
► frequency ⇢ How often the bot sends a message (keep high to prevent spam).
► retrain ⇢ The bot retrains itself after this many extra lines of messages are recorded in the text file.
```
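A small check like the one below can catch a config that still contains placeholders before the bot tries to sign in. This is only a sketch: the function name `findUnfilledFields` is hypothetical, and it assumes that unfilled values still contain an angle-bracket placeholder in the style of `<DOMAIN.TLD>`.

```javascript
// The three fields marked with * in the list above.
const REQUIRED_KEYS = ["homeserver", "token", "user"];

// Hypothetical helper (not part of the project): return the required keys
// that are missing, empty, or still hold an angle-bracket placeholder.
function findUnfilledFields(config) {
  return REQUIRED_KEYS.filter((key) => {
    const value = config[key];
    return (
      typeof value !== "string" ||
      value.trim() === "" ||
      /<[^<>]*>/.test(value) // ASSUMPTION: placeholders look like <DOMAIN.TLD>
    );
  });
}

// Example: only `homeserver` is still a placeholder here.
// findUnfilledFields({ homeserver: "<DOMAIN.TLD>", token: "abc", user: "@bot:matrix.org" })
// returns ["homeserver"]
```

An empty array means all required fields appear to be filled in; it does not verify that the values are actually valid for the homeserver.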
### Using existing messages
If you do not want to wait until the bot builds its own dataset from new messages, you can export a room's chat history using [Element](https://element.io/blog/element-1-9-1-export-is-finally-here/). In this case, you will need to manually remove the timestamps from the text file, then set the `file` field in the configuration file to the path of that file.
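
Removing the timestamps by hand gets tedious for large exports, so a short script may help. The sketch below assumes that every exported line starts with a bracketed timestamp such as `[2022-08-26 14:03:12] `; that layout is an assumption and may not match what Element actually produces, so adjust the pattern to your file. The function name `stripTimestamps` is hypothetical.

```javascript
// ASSUMPTION: each line begins with a bracketed timestamp,
// e.g. "[2022-08-26 14:03:12] hello". Change this pattern if your export differs.
const ASSUMED_TIMESTAMP = /^\[[^\]]*\]\s*/;

// Hypothetical helper: remove the leading timestamp from every line
// and drop lines that end up empty.
function stripTimestamps(text, pattern = ASSUMED_TIMESTAMP) {
  return text
    .split(/\r?\n/)
    .map((line) => line.replace(pattern, ""))
    .filter((line) => line.trim() !== "")
    .join("\n");
}

// Example:
// stripTimestamps("[2022-08-26 14:03:12] hello\n\n[2022-08-26 14:04:00] world")
// returns "hello\nworld"
```

The cleaned text can then be saved as a `.txt` file and referenced from the `file` field in `config.json`.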