89 Commits

Author SHA1 Message Date
466a16b2b4 feat(data): add more 2021-05-16 13:03:17 -04:00
2b2dd18717 feat(data): add more 2021-05-15 18:15:23 -04:00
10c0a593e3 feat(data): add more 2021-05-12 14:58:43 -04:00
e8ae13a648 feat(data): add more 2021-05-10 12:23:42 -04:00
d8f32ff7b1 feat(data): add more 2021-05-06 10:04:39 -04:00
fcc309b5b3 feat(data): add more 2021-05-04 13:27:48 -04:00
4e2b3c0f0d feat(data): add more 2021-05-03 16:27:07 -04:00
58bd18cfbe feat(data): add more 2021-05-01 23:00:04 -04:00
afcd9d244a feat(data): add more 2021-04-30 15:41:44 -04:00
11176694b8 feat: remove defs, add some loc, add context menu 2021-04-27 20:16:07 -04:00
c23a47ff1c feat(data): add more 2021-04-25 19:34:15 -04:00
c2f6b71410 feat(data): add more 2021-04-25 11:43:06 -04:00
e550f83d92 feat(data): add more 2021-04-23 13:21:36 -04:00
a47c1c8aa4 feat(data): add more 2021-04-20 15:47:17 -04:00
9a682aa7db feat(data): add more 2021-04-18 19:26:00 -04:00
33cbdcf327 feat(data): really stress normal 2021-04-17 12:28:45 -04:00
67434ed07d feat(data): add more 2021-04-17 12:25:46 -04:00
7ee3550400 fix(trainer): ignore gg 2021-04-15 16:13:48 -04:00
7f31f8b825 feat(data): add more 2021-04-15 12:48:56 -04:00
1e70519b29 feat(data): add more 2021-04-14 14:40:16 -04:00
f14a2ede44 feat(data): add more 2021-04-12 14:30:06 -04:00
08d12c5a84 feat(data): add more 2021-04-11 07:46:55 -04:00
da37007270 feat: use imgui tables 2021-04-08 12:09:53 -04:00
8c0142ae62 refactor: update for api level 3 2021-04-05 14:45:04 -04:00
3992b502ad feat(data): add more 2021-04-02 15:11:56 -04:00
7e6ddceef5 feat: add category for static substitutes 2021-03-31 14:37:08 -04:00
02d5767c43 feat(data): add more data 2021-03-24 18:37:28 -04:00
12e6d55b20 feat(data): add more data 2021-03-22 18:23:22 -04:00
7c9143b211 feat(data): add more 2021-03-19 14:36:10 -04:00
5286cf4b00 feat(data): add more 2021-03-17 12:01:01 -04:00
21c8e86002 feat(data): add more 2021-03-14 20:35:05 -04:00
4685b6167f feat(data): add more data 2021-03-12 12:34:42 -05:00
6417d1b37b feat(data): add more data 2021-03-10 19:40:40 -05:00
d215ed6d60 feat(data): add more 2021-03-06 20:40:41 -05:00
ef0624efde feat(data): add more 2021-03-05 23:01:05 -05:00
20530c6290 feat(trainer): ignore "mounts" 2021-03-05 13:50:16 -05:00
7a8da6df1e feat(data): add more 2021-03-05 02:24:16 -05:00
f96ef4fc1b feat(data): add more 2021-03-04 20:06:26 -05:00
0af172e0a6 feat: stop trying to separate static sub messages 2021-03-04 16:48:11 -05:00
4148df1237 feat(data): add more data 2021-03-03 19:55:54 -05:00
891d4f5aae feat(data): add more data 2021-03-03 18:05:57 -05:00
4d35c0bac6 feat(data): add more data 2021-03-02 21:16:34 -05:00
1ef5c9f1b5 feat: add community ad filtering
Also add tooltip on filter hover with description.
2021-03-02 13:19:47 -05:00
fd256722a1 feat(data): add more 2021-03-02 12:44:14 -05:00
65558fa199 feat(data): ignore "blu" and add more 2021-02-26 12:07:19 -05:00
83e6b20333 feat(data): add more 2021-02-24 20:01:41 -05:00
819ac1b457 feat(data): add more 2021-02-21 15:50:17 -05:00
0dc0c2ef00 feat(data): add more data
Also pull out stop words into field.
2021-02-20 19:25:15 -05:00
c3df0a1f8e feat: add normalisation to pipeline
Add a step to the ML pipeline that normalises messages. This ensures
computed properties run on the raw data (which is actually partially
normalised by the compute context). Properties which rely on symbols
(e.g. "B>") no longer break because normalisation ran before they had
access to the input.
2021-02-17 21:45:09 -05:00
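
The ordering this commit describes can be sketched as follows. This is a minimal illustration, not the plugin's actual code: `has_buy_marker`, `normalise`, and `run_pipeline` are hypothetical names, and the normalisation rule (every non-word character becomes a space) is an assumption made for the example.

```python
import re


def has_buy_marker(raw: str) -> bool:
    # Hypothetical computed property: it relies on the symbol in "B>",
    # so it has to see the message before symbols are stripped.
    return "B>" in raw


def normalise(text: str) -> str:
    # Assumed normalisation: replace non-word, non-space characters with a
    # space, then collapse runs of whitespace.
    return " ".join(re.sub(r"[^\w\s]", " ", text).split())


def run_pipeline(raw: str) -> dict:
    # Step 1: computed properties run on the raw message.
    features = {"has_buy_marker": has_buy_marker(raw)}
    # Step 2: normalisation runs afterwards, as its own pipeline step.
    features["text"] = normalise(raw)
    return features


result = run_pipeline("B> small house")
```

If normalisation ran first, `has_buy_marker` would receive "B small house" and could never match.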
d00b3b0845 feat: better handle punctuation
Certain symbols are turned into one space so the model sees multiple
words instead of one. Previously "[RP]Hi" would turn into "RPHi" and
be its own token. Now it turns into "RP" and "Hi", counting as two
tokens. This change increased the model's accuracy.

Also make "18", "http", "https", and LGBT-related words into stop
words (meaning they're ignored). Each of these stop words made the
model more accurate and reduced unwanted bias.

Messages destined for ML are now normalised by the plugin in the same
way the model's training input is. This should bring the results
closer to what is expected.
2021-02-17 20:01:34 -05:00
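
The punctuation handling described in this commit might look roughly like the sketch below. It is an illustration under stated assumptions: the commit does not list which symbols are replaced, so the `SYMBOLS` set is a guess, and `normalise` and `tokenise` are hypothetical names. Only the stop words "18", "http", and "https" are taken from the commit message; the others it mentions are left out.

```python
import re

# Assumption: the real symbol set is not given in the commit message.
# Brackets and a few separators stand in for it here.
SYMBOLS = re.compile(r"[\[\]\(\)\{\}<>/\\|:]+")

# Stop words named in the commit message (the list there is longer).
STOP_WORDS = {"18", "http", "https"}


def normalise(message: str) -> str:
    # Turn each run of symbols into one space so that adjacent words are
    # separated, then collapse any repeated whitespace.
    spaced = SYMBOLS.sub(" ", message)
    return " ".join(spaced.split())


def tokenise(message: str) -> list:
    # Split the normalised message into tokens and drop stop words.
    return [t for t in normalise(message).split() if t.lower() not in STOP_WORDS]


tokens = tokenise("[RP]Hi")
```

With this sketch, "[RP]Hi" yields the two tokens "RP" and "Hi" instead of the single token "RPHi". Applying the same `normalise` function both when training and inside the plugin keeps the two inputs consistent, which is the point of the last paragraph of the commit message.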