84644d2806
feat(data): add more data
Also pull out stop words into a field.
2021-02-20 19:25:15 -05:00
b36377c16e
feat: add normalisation to pipeline
Add a step to the ML pipeline that normalises messages. This ensures
computed properties run on the raw data (which is actually partially
normalised by the compute context). Properties which rely on symbols
(e.g. "B>") would not work properly if normalisation happened before
they had access to the input.
2021-02-17 21:45:09 -05:00
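The ordering described in b36377c16e can be sketched roughly as follows. This is only an illustration, not the repository's code: `has_buy_marker`, `normalise`, and `run_pipeline` are hypothetical names, and the normalisation rule (lowercase, replace non-alphanumerics with a space) is an assumption.

```python
import re
from typing import Dict


def has_buy_marker(raw: str) -> bool:
    """Hypothetical computed property that relies on the symbol "B>"."""
    return "B>" in raw


def normalise(text: str) -> str:
    """Assumed normaliser: lowercase, turn runs of non-alphanumerics into a space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def run_pipeline(raw: str) -> Dict[str, object]:
    # Step 1: computed properties see the raw message, symbols included.
    properties = {"has_buy_marker": has_buy_marker(raw)}
    # Step 2: normalisation runs afterwards, as its own pipeline step.
    return {"properties": properties, "text": normalise(raw)}


result = run_pipeline("B> Cheap potions!")
```

If the two steps were swapped, `has_buy_marker` would receive text with the `>` already stripped and could never match.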
c65fb94ad6
feat: better handle punctuation
Certain symbols are turned into one space so the model sees multiple
words instead of one. Previously "[RP]Hi" would turn into "RPHi" and
be its own token. Now it turns into "RP" and "Hi", counting as two
tokens. This change increased the model's accuracy.
Also make "18", "http", "https", and LGBT-related words into stop
words (meaning they're ignored). Each of these stop words made the
model more accurate and reduced unwanted bias.
Messages destined for ML are now normalised by the plugin in the same
way the model's training input is. This should bring the results
closer to what is expected.
2021-02-17 20:01:34 -05:00
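A minimal sketch of the tokenisation change described in c65fb94ad6, under stated assumptions: the commit does not list which symbols are replaced, so the bracket set below is a guess based on its "[RP]Hi" example, and `tokenise`, `SYMBOLS`, and `STOP_WORDS` are hypothetical names. Only the stop words the commit names explicitly are included.

```python
import re
from typing import List

# Assumption: the exact symbol set is not given in the commit; brackets are
# used here because of the "[RP]Hi" example.
SYMBOLS = re.compile(r"[\[\](){}<>]")

# Stop words named in the commit message. The LGBT-related words are not
# listed there, so they are left out of this sketch.
STOP_WORDS = {"18", "http", "https"}


def tokenise(message: str) -> List[str]:
    # Replace each matched symbol with a space so adjacent words separate.
    spaced = SYMBOLS.sub(" ", message)
    # Split on whitespace and drop stop words so the model ignores them.
    return [token for token in spaced.split() if token.lower() not in STOP_WORDS]


tokens = tokenise("[RP]Hi")
```

Here `"[RP]Hi"` yields the two tokens `"RP"` and `"Hi"` instead of the single token `"RPHi"` that results from deleting the brackets outright.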
7f27a17f4f
feat(data): add more data
2021-02-17 18:58:34 -05:00
beb47b7eb4
feat(data): add more
2021-02-16 12:15:06 -05:00
4e182112b4
feat(data): add more data
2021-02-14 15:40:56 -05:00
c208210a64
feat(data): add more data
2021-02-12 11:56:54 -05:00
8313fa20ed
feat(data): more data
2021-02-08 22:34:29 -05:00
ba065bd0cf
feat(data): more data
2021-02-07 12:54:14 -05:00
7416023634
feat(data): add more data
2021-02-03 12:16:03 -05:00
Anna Clemens
09d4c8fe38
feat(data): more
2021-02-01 00:19:51 -05:00
Anna Clemens
4241421bed
feat(data): add more data
2021-01-30 20:51:14 -05:00
Anna Clemens
6ba4f5b1a5
chore: fix permissions
2021-01-30 16:10:59 -05:00
Anna Clemens
54987b9fbc
feat(data): add more data
2021-01-30 16:04:37 -05:00
96c0b27595
feat(data): add more
2021-01-29 22:44:51 -05:00
f00ae589bc
feat(data): more
2021-01-29 15:01:44 -05:00
51900a9b6b
feat(data): add more data
2021-01-24 18:30:21 -05:00
0d49cdf938
feat(data): more
2021-01-23 19:57:03 -05:00
874b2a8255
feat(data): more data
2021-01-22 17:42:32 -05:00
5ea5b01442
feat(data): more reports
2021-01-22 17:33:17 -05:00
d37fc28bbd
feat(data): more reports
2021-01-21 18:02:33 -05:00
ab02685ccb
feat(data): more reports
2021-01-18 22:19:35 -05:00
c15b6d18c4
feat(data): add more
2021-01-16 13:20:18 -05:00
6094dc1ec8
feat(data): add more data
2021-01-12 11:23:38 -05:00
c52c698929
feat(data): add more reports
2021-01-11 10:31:33 -05:00
b5c297219e
feat(data): add more data
2021-01-06 19:21:03 -05:00
d031926ccb
feat(data): add even more data
2021-01-05 18:02:03 -05:00
9b83e829bd
feat(data): add more data
2021-01-05 11:39:44 -05:00
3c2fa92cff
feat(data): more data
2021-01-03 19:35:33 -05:00
044a13ed3b
feat(data): more reports
2021-01-03 16:56:53 -05:00
b31c6bfc46
feat(data): more reports
2021-01-02 17:28:30 -05:00
1e33ba0487
feat(trainer): have trainer sort data automatically
2021-01-02 16:59:00 -05:00
7147d92468
feat(data): more reports
2021-01-02 16:53:47 -05:00
1577c292a6
feat(data): more reports
2021-01-02 13:52:02 -05:00
deac55d19b
feat(data): add more data
2021-01-02 13:09:00 -05:00
4229438ed5
chore(data): fix mode
2021-01-02 07:54:49 -05:00
0a575866fa
feat(data): add more reports
2021-01-02 07:31:09 -05:00
bbde5df7f8
feat(data): catch up on reports
2020-12-31 17:10:20 -05:00
49d84ee00e
feat(data): add more FC ads
2020-12-29 10:11:56 -05:00
9a8e301c74
chore(data): add more data
2020-12-28 22:39:10 -05:00
908133bdf8
refactor: put computation in interface
This basically undoes the benefits of the previous commit. May end up being reverted.
2020-12-28 21:48:31 -05:00
482cc23c7d
feat(trainer): add trainer to actual repo
2020-12-28 20:14:19 -05:00