Commit Graph

64 Commits

Author SHA1 Message Date
7e6ddceef5 feat: add category for static substitutes 2021-03-31 14:37:08 -04:00
02d5767c43 feat(data): add more data 2021-03-24 18:37:28 -04:00
12e6d55b20 feat(data): add more data 2021-03-22 18:23:22 -04:00
7c9143b211 feat(data): add more 2021-03-19 14:36:10 -04:00
5286cf4b00 feat(data): add more 2021-03-17 12:01:01 -04:00
21c8e86002 feat(data): add more 2021-03-14 20:35:05 -04:00
4685b6167f feat(data): add more data 2021-03-12 12:34:42 -05:00
6417d1b37b feat(data): add more data 2021-03-10 19:40:40 -05:00
d215ed6d60 feat(data): add more 2021-03-06 20:40:41 -05:00
ef0624efde feat(data): add more 2021-03-05 23:01:05 -05:00
20530c6290 feat(trainer): ignore "mounts" 2021-03-05 13:50:16 -05:00
7a8da6df1e feat(data): add more 2021-03-05 02:24:16 -05:00
f96ef4fc1b feat(data): add more 2021-03-04 20:06:26 -05:00
0af172e0a6 feat: stop trying to separate static sub messages 2021-03-04 16:48:11 -05:00
4148df1237 feat(data): add more data 2021-03-03 19:55:54 -05:00
891d4f5aae feat(data): add more data 2021-03-03 18:05:57 -05:00
4d35c0bac6 feat(data): add more data 2021-03-02 21:16:34 -05:00
1ef5c9f1b5 feat: add community ad filtering
Also add tooltip on filter hover with description.
2021-03-02 13:19:47 -05:00
fd256722a1 feat(data): add more 2021-03-02 12:44:14 -05:00
65558fa199 feat(data): ignore "blu" and add more 2021-02-26 12:07:19 -05:00
83e6b20333 feat(data): add more 2021-02-24 20:01:41 -05:00
819ac1b457 feat(data): add more 2021-02-21 15:50:17 -05:00
0dc0c2ef00 feat(data): add more data
Also pull out stop words into field.
2021-02-20 19:25:15 -05:00
c3df0a1f8e feat: add normalisation to pipeline
Add a message-normalisation step to the ML pipeline. Computed
properties now run on the raw data (which the compute context has
already partially normalised) before that step. Properties which rely
on symbols (e.g. "B>") therefore no longer break because normalisation
ran before they had access to the input.
2021-02-17 21:45:09 -05:00
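The ordering described in the commit above can be sketched as follows. This is only an illustration, not the repository's code: the names compute_properties, normalise, run_pipeline and has_buy_marker are hypothetical, and the normalisation rule (every non-word, non-space character becomes a space) is an assumption.

```python
import re
from typing import Dict, Tuple


def compute_properties(raw: str) -> Dict[str, bool]:
    # Hypothetical computed property: does the message contain the "B>" marker?
    return {"has_buy_marker": "B>" in raw}


def normalise(message: str) -> str:
    # Assumed rule: replace every non-word, non-space character with a space.
    return re.sub(r"[^\w\s]", " ", message)


def run_pipeline(raw: str) -> Tuple[Dict[str, bool], str]:
    # Properties see the raw input first; normalisation happens afterwards.
    props = compute_properties(raw)
    text = normalise(raw)
    return props, text
```

If normalise ran first, the ">" would already be gone and the symbol-based property could never match, which is the failure the commit message describes.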
d00b3b0845 feat: better handle punctuation
Certain symbols are turned into one space so the model sees multiple
words instead of one. Previously "[RP]Hi" would turn into "RPHi" and
be its own token. Now it turns into "RP" and "Hi", counting as two
tokens. This change increased the model's accuracy.

Also make "18", "http", "https", and LGBT-related words into stop
words (meaning they're ignored). Each of these stop words made the
model more accurate and reduced unwanted bias.

The plugin now normalises messages destined for ML in the same way
the model's input is normalised for training. This should bring the
results closer to what is expected.
2021-02-17 20:01:34 -05:00
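A minimal sketch of the tokenisation change described in the commit above. The exact set of symbols that become spaces is not given in the message, so SEPARATORS is an assumption, as are the function names. STOP_WORDS contains only the stop words the message names explicitly.

```python
import re
from typing import List

# Assumption: the commit does not list which symbols are turned into spaces.
SEPARATORS = re.compile(r"[\[\]()<>{}|/\\.,!?:;*~_-]")

# Stop words named in the commit message; the other words it mentions are not listed there.
STOP_WORDS = {"18", "http", "https"}


def normalise(message: str) -> str:
    # Each separator symbol becomes a single space, so adjacent words split apart.
    return SEPARATORS.sub(" ", message)


def tokenise(message: str) -> List[str]:
    # Split on whitespace and drop stop words (compared case-insensitively).
    return [t for t in normalise(message).split() if t.lower() not in STOP_WORDS]
```

With these assumptions, tokenise("[RP]Hi") yields the two tokens "RP" and "Hi" rather than the single token "RPHi".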
fcfe1bb727 feat(data): add more data 2021-02-17 18:58:34 -05:00
f874d8ac37 feat(data): add more 2021-02-16 12:15:06 -05:00
03fe8eecc2 feat(data): add more data 2021-02-14 15:40:56 -05:00
d921a8cfb0 feat(data): add more data 2021-02-12 11:56:54 -05:00
8eb0507041 feat(data): more data 2021-02-08 22:34:29 -05:00
fb9f5d9b94 feat(data): more data 2021-02-07 12:54:14 -05:00
6985eb2eee feat(data): add more data 2021-02-03 12:16:03 -05:00
Anna
53e0bc3309 feat(data): more 2021-02-01 00:19:51 -05:00
Anna
fdf0849ea6 feat(data): add more data 2021-01-30 20:51:14 -05:00
Anna
d7d6c53c75 chore: fix permissions 2021-01-30 16:10:59 -05:00
Anna
d1228e6bee feat(data): add more data 2021-01-30 16:04:37 -05:00
b2e719faa0 feat(data): add more 2021-01-29 22:44:51 -05:00
41e79cb2c9 feat(data): more 2021-01-29 15:01:44 -05:00
bfd6c1b8e2 feat(data): add more data 2021-01-24 18:30:21 -05:00
b0e3c442d1 feat(data): more 2021-01-23 19:57:03 -05:00
d8ccbc6844 feat(data): more data 2021-01-22 17:42:32 -05:00
245a83afe0 feat(data): more reports 2021-01-22 17:33:17 -05:00
75a75476c7 feat(data): more reports 2021-01-21 18:02:33 -05:00
1a2fa2eab4 feat(data): more reports 2021-01-18 22:19:35 -05:00
9931e334dc feat(data): add more 2021-01-16 13:20:18 -05:00
0e30924253 feat(data): add more data 2021-01-12 11:23:38 -05:00
a6b181bdf5 feat(data): add more reports 2021-01-11 10:31:33 -05:00
91fd57db0e feat(data): add more data 2021-01-06 19:21:03 -05:00
601ccffdc0 feat(data): add even more data 2021-01-05 18:02:03 -05:00
1ae0e0feb1 feat(data): add more data 2021-01-05 11:39:44 -05:00