43 Commits

Author SHA1 Message Date
3be14696c6
feat(data): add more 2021-02-21 15:50:17 -05:00
84644d2806
feat(data): add more data
Also pull out stop words into field.
2021-02-20 19:25:15 -05:00
b36377c16e
feat: add normalisation to pipeline
Add a message normalisation step to the ML pipeline. Running it as a
pipeline step ensures computed properties see the raw data (which the
compute context has already partially normalised). Properties which
rely on symbols (e.g. "B>") therefore no longer break because
normalisation ran before they had access to the input.
2021-02-17 21:45:09 -05:00
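
The ordering described in the commit above can be sketched roughly as follows. This is only an illustration, not the repository's code: `has_b_marker`, `normalise`, and `run_pipeline` are hypothetical names, and the stand-in normaliser is an assumption, since the log does not show the real one.

```python
import re


def has_b_marker(raw: str) -> bool:
    # Hypothetical computed property that relies on the symbol "B>".
    return "B>" in raw


def normalise(message: str) -> str:
    # Stand-in normaliser (assumption): lowercase the text and collapse
    # every run of non-alphanumeric characters into a single space.
    return re.sub(r"[^a-z0-9]+", " ", message.lower()).strip()


def run_pipeline(raw: str) -> dict:
    # Computed properties run first, on the input as received.
    properties = {"has_b_marker": has_b_marker(raw)}
    # Normalisation is a later step, so it cannot hide symbols from them.
    return {"text": normalise(raw), "properties": properties}
```

If normalisation ran first, the property would receive text with the ">" already removed and could never match, which appears to be the failure the commit addresses.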
c65fb94ad6
feat: better handle punctuation
Certain symbols are turned into one space so the model sees multiple
words instead of one. Previously "[RP]Hi" would turn into "RPHi" and
be its own token. Now it turns into "RP" and "Hi", counting as two
tokens. This change increased the model's accuracy.

Also make "18", "http", "https", and LGBT-related words into stop
words (meaning they're ignored). Each of these stop words made the
model more accurate and reduced unwanted bias.

Messages destined for ML are now normalised by the plugin in the same
way the model's input is normalised for training. This should bring
the results closer to what is expected.
2021-02-17 20:01:34 -05:00
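
A minimal sketch of the punctuation and stop-word handling this commit describes, under stated assumptions: the symbol set in `SYMBOLS` is hypothetical (the commit only says "certain symbols"), `tokenize` is an illustrative name, and only the stop words actually named in the message are included.

```python
import re

# Hypothetical symbol set; the commit does not list which symbols it replaces.
SYMBOLS = re.compile(r"[\[\]()<>:/.,!?]+")

# Stop words named in the commit message. The LGBT-related words are not
# listed there, so they are left out of this sketch.
STOP_WORDS = {"18", "http", "https"}


def tokenize(message: str) -> list[str]:
    # Replace each run of symbols with one space so adjacent words split apart.
    spaced = SYMBOLS.sub(" ", message)
    # Split on whitespace and drop stop words, which the model ignores.
    return [token for token in spaced.split() if token.lower() not in STOP_WORDS]


print(tokenize("[RP]Hi"))
```

With this symbol set, "[RP]Hi" yields the two tokens "RP" and "Hi" rather than the single token "RPHi".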
7f27a17f4f
feat(data): add more data 2021-02-17 18:58:34 -05:00
beb47b7eb4
feat(data): add more 2021-02-16 12:15:06 -05:00
4e182112b4
feat(data): add more data 2021-02-14 15:40:56 -05:00
c208210a64
feat(data): add more data 2021-02-12 11:56:54 -05:00
8313fa20ed
feat(data): more data 2021-02-08 22:34:29 -05:00
ba065bd0cf
feat(data): more data 2021-02-07 12:54:14 -05:00
7416023634
feat(data): add more data 2021-02-03 12:16:03 -05:00
Anna Clemens
09d4c8fe38
feat(data): more 2021-02-01 00:19:51 -05:00
Anna Clemens
4241421bed
feat(data): add more data 2021-01-30 20:51:14 -05:00
Anna Clemens
6ba4f5b1a5
chore: fix permissions 2021-01-30 16:10:59 -05:00
Anna Clemens
54987b9fbc
feat(data): add more data 2021-01-30 16:04:37 -05:00
96c0b27595
feat(data): add more 2021-01-29 22:44:51 -05:00
f00ae589bc
feat(data): more 2021-01-29 15:01:44 -05:00
51900a9b6b
feat(data): add more data 2021-01-24 18:30:21 -05:00
0d49cdf938
feat(data): more 2021-01-23 19:57:03 -05:00
874b2a8255
feat(data): more data 2021-01-22 17:42:32 -05:00
5ea5b01442
feat(data): more reports 2021-01-22 17:33:17 -05:00
d37fc28bbd
feat(data): more reports 2021-01-21 18:02:33 -05:00
ab02685ccb
feat(data): more reports 2021-01-18 22:19:35 -05:00
c15b6d18c4
feat(data): add more 2021-01-16 13:20:18 -05:00
6094dc1ec8
feat(data): add more data 2021-01-12 11:23:38 -05:00
c52c698929
feat(data): add more reports 2021-01-11 10:31:33 -05:00
b5c297219e
feat(data): add more data 2021-01-06 19:21:03 -05:00
d031926ccb
feat(data): add even more data 2021-01-05 18:02:03 -05:00
9b83e829bd
feat(data): add more data 2021-01-05 11:39:44 -05:00
3c2fa92cff
feat(data): more data 2021-01-03 19:35:33 -05:00
044a13ed3b
feat(data): more reports 2021-01-03 16:56:53 -05:00
b31c6bfc46
feat(data): more reports 2021-01-02 17:28:30 -05:00
1e33ba0487
feat(trainer): have trainer sort data automatically 2021-01-02 16:59:00 -05:00
7147d92468
feat(data): more reports 2021-01-02 16:53:47 -05:00
1577c292a6
feat(data): more reports 2021-01-02 13:52:02 -05:00
deac55d19b
feat(data): add more data 2021-01-02 13:09:00 -05:00
4229438ed5
chore(data): fix mode 2021-01-02 07:54:49 -05:00
0a575866fa
feat(data): add more reports 2021-01-02 07:31:09 -05:00
bbde5df7f8
feat(data): catch up on reports 2020-12-31 17:10:20 -05:00
49d84ee00e
feat(data): add more FC ads 2020-12-29 10:11:56 -05:00
9a8e301c74
chore(data): add more data 2020-12-28 22:39:10 -05:00
908133bdf8
refactor: put computation in interface
This basically undoes the benefits of the previous commit. May end up being reverted.
2020-12-28 21:48:31 -05:00
482cc23c7d
feat(trainer): add trainer to actual repo 2020-12-28 20:14:19 -05:00