# text-gen-bot

Matrix bot that generates messages based on the messages of other users, using a neural network. The first Matrix AI?

Note: The project is still being developed and some functionality is not fully implemented yet.

## Table of contents

- [Usage](#usage)
- [Setup](#setup)
- [Configurations](#configurations)
- [Using existing messages](#using-existing-messages)
### Usage
First, install the required dependencies by following the instructions [here](#setup). Then copy `example.config.json` and rename the copy to `config.json`. Replace the items in angle brackets with the corresponding values for the bot account (e.g. replace `<DOMAIN.TLD>` with the homeserver URL, such as `https://matrix.org` or `https://matrix.arrayinamatrix.xyz`). Each configurable field is explained in the [configurations](#configurations) section of this document. To obtain an account's token, follow the instructions [here](https://matrix.org/docs/guides/usage-of-matrix-bot-sdk#instantiation).

Once the config file has been populated with valid data, run `index.js` (warning: the first run will be slow):
```sh
$ node index.js
...
<some warnings show up, ignore them>
...
Client has started!
...
```
List of directories and files created by the project and its dependencies:
- `aitextgen/`
- `node_modules/`
- `trained_model/`
- `aitextgen.tokenizer.json`
- `storage.json`
- `training-matrix.txt`
#### Commands
The bot has several commands:
```text
► generate ⇢ Generates a message in the room.
► train ⇢ Trains the AI's mini GPT-2 model.
```
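
As a rough illustration of how a prefix-based command could be recognised, here is a minimal sketch. It is not taken from the project's `index.js`; the helper name `parseCommand` and the `!` prefix are assumptions made for this example, and only the two command names come from the list above.

```javascript
// Hypothetical helper, not the project's actual implementation.
// The two command names are the ones listed in this README.
const COMMANDS = ["generate", "train"];

// Returns the command name if `body` is a known command, otherwise null.
function parseCommand(body, prefix) {
  if (typeof body !== "string" || !body.startsWith(prefix)) {
    return null; // ordinary chat message, not a command
  }
  // Take the first whitespace-delimited word after the prefix.
  const [name = ""] = body.slice(prefix.length).trim().split(/\s+/);
  const command = name.toLowerCase();
  return COMMANDS.includes(command) ? command : null;
}

console.log(parseCommand("!generate", "!"));   // generate
console.log(parseCommand("!train now", "!"));  // train
console.log(parseCommand("hello there", "!")); // null
```

In a real bot the `prefix` argument would presumably come from the `prefix` field of `config.json`, and messages that are not commands would be candidates for the training file.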
### Setup
The project is split into two parts: `index.js` and `textgen.py`. The `index.js` file contains the code that interacts with users on Matrix and sends the text generated by `textgen.py`.

Install [matrix-bot-sdk](https://github.com/turt2live/matrix-bot-sdk) and [python-shell](https://github.com/extrabacon/python-shell) (JS):
```sh
$ pnpm add matrix-bot-sdk
$ pnpm add python-shell
```
Install [aitextgen](https://github.com/minimaxir/aitextgen) (PY):
```sh
$ pip3 install aitextgen
```
### Configurations
Before the bot can be used, the fields in `config.json` must be populated with valid information. Values in angle brackets (starred below) must be supplied before use.
```text
► homeserver* ⇢ URL of the bot's homeserver.
► token* ⇢ Account token used to sign in the bot.
► user* ⇢ Account's User ID.
► file ⇢ Path of file used for training the AI (.txt file only).
► prefix ⇢ Bot listens to commands that start with this prefix.
► frequency ⇢ How often the bot sends a message (keep high to prevent spam).
► retrain ⇢ The bot retrains itself after this many extra lines of messages are recorded in the text file.
```
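
Because a forgotten placeholder is an easy mistake to make, a small check before starting the bot may help. The sketch below assumes that `config.json` is a flat JSON object keyed by the field names listed above; the helper name `findConfigProblems` and the `<TOKEN>` placeholder are hypothetical and not part of the project.

```javascript
// Starred (required) fields from the list above.
// Assumption: config.json is a flat object using these names as keys.
const REQUIRED_KEYS = ["homeserver", "token", "user"];

// Matches any leftover template placeholder such as "<DOMAIN.TLD>".
const PLACEHOLDER = /<[^<>]+>/;

// Hypothetical helper: returns a list of human-readable problems.
function findConfigProblems(config) {
  const problems = [];
  for (const key of REQUIRED_KEYS) {
    const value = config[key];
    if (typeof value !== "string" || value.trim() === "") {
      problems.push(`${key} is missing`);
    } else if (PLACEHOLDER.test(value)) {
      problems.push(`${key} still contains a placeholder`);
    }
  }
  return problems;
}

const problems = findConfigProblems({
  homeserver: "https://matrix.org",
  token: "<TOKEN>", // hypothetical placeholder left unreplaced
  prefix: "!",
});
console.log(problems);
// [ 'token still contains a placeholder', 'user is missing' ]
```

With a real file, the same function could be called as `findConfigProblems(require("./config.json"))` and the bot started only when the returned list is empty.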
### Using existing messages
If you do not want to wait until the bot builds its own dataset from new messages, you can export the chat history of a room using [Element](https://element.io/blog/element-1-9-1-export-is-finally-here/). In this case, you will need to manually remove the timestamps from the exported text file, then set the `file` field in the configuration file to the path of that file.
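
Removing the timestamps by hand can be tedious for a long export, so a short script may be useful. The sketch below assumes that each exported line has the shape `<timestamp> - <rest of line>`; this shape is an assumption about the export, not something this README specifies, so check a few lines of your own file and adjust the separator or pattern if needed. The function name `stripTimestamps` is hypothetical.

```javascript
// Assumed line shape: "<timestamp> - <rest of line>". Adjust for your export.
const SEPARATOR = " - ";
const TIME_LIKE = /\d{1,2}:\d{2}/; // e.g. "12:50" somewhere in the timestamp

// Hypothetical helper: drops the leading timestamp from each line of `text`.
function stripTimestamps(text) {
  return text
    .split(/\r?\n/)
    .map((line) => {
      const cut = line.indexOf(SEPARATOR);
      if (cut === -1 || !TIME_LIKE.test(line.slice(0, cut))) {
        return line; // no recognisable timestamp, keep the line unchanged
      }
      return line.slice(cut + SEPARATOR.length);
    })
    .join("\n");
}

const sample = [
  "8/15/2022, 12:50:58 AM - alice: hello",
  "8/15/2022, 12:51:10 AM - bob: hi - how are you?",
  "a line without a timestamp",
].join("\n");

console.log(stripTimestamps(sample));
// alice: hello
// bob: hi - how are you?
// a line without a timestamp
```

To process a real export, the file contents could be read with `fs.readFileSync(path, "utf8")`, passed through `stripTimestamps`, and written to the path named by the `file` field.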