Commit Graph

13 Commits

Author SHA1 Message Date
Anna 34030cb86d
fix: spacify before counting words 2021-05-03 16:24:57 -04:00
Anna 6db8a80894
feat: stop trying to separate static sub messages 2021-03-04 16:48:11 -05:00
Anna 1836b6dad7
feat(trainer): run on net5 and accept csv path
Hopefully will use this to automate model deployment.
2021-03-02 04:52:36 -05:00
Anna 35faec5fe6
chore: fix modes 2021-02-24 20:23:28 -05:00
Anna b36377c16e
feat: add normalisation to pipeline
Add a normalisation step to the ML pipeline. This ensures computed
properties run on the raw data (which is actually partially
normalised by the compute context). It prevents properties that rely
on symbols (e.g. "B>") from breaking when normalisation happens
before they have access to the input.
2021-02-17 21:45:09 -05:00
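
The ordering described in b36377c16e can be illustrated with a minimal Python sketch. It is not the project's code: `has_symbol_marker`, `normalise`, and `featurise` are hypothetical names, and the normalisation rule (lower-casing and replacing non-alphanumeric runs with a space) is an assumption chosen only to show why a symbol-based property has to see the text before normalisation does.

```python
import re


def has_symbol_marker(raw: str) -> bool:
    # Hypothetical computed property: depends on the "B>" symbol from the commit message.
    return "B>" in raw


def normalise(text: str) -> str:
    # Assumed normalisation: lower-case, turn non-alphanumeric runs into single spaces.
    return " ".join(re.sub(r"[^0-9a-z]+", " ", text.lower()).split())


def featurise(raw: str) -> dict:
    # Computed properties run first, on the raw message...
    features = {"has_symbol_marker": has_symbol_marker(raw)}
    # ...and normalisation runs afterwards, as a later pipeline step.
    features["text"] = normalise(raw)
    return features


if __name__ == "__main__":
    print(featurise("B> cheap gil"))
    # Normalising first would hide the symbol from the property:
    print(has_symbol_marker(normalise("B> cheap gil")))
```

Running the property on already-normalised text returns `False` in this sketch, which is the failure mode the commit message describes.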
Anna c65fb94ad6
feat: better handle punctuation
Certain symbols are turned into one space so the model sees multiple
words instead of one. Previously "[RP]Hi" would turn into "RPHi" and
be its own token. Now it turns into "RP" and "Hi", counting as two
tokens. This change increased the model's accuracy.

Also make "18", "http", "https", and LGBT-related words into stop
words (meaning they're ignored). Each of these stop words made the
model more accurate and reduced unwanted bias.

Messages destined for ML are now normalised by the plugin in the same
way the model's input is normalised for training. This should bring
the plugin's results closer to what is expected.
2021-02-17 20:01:34 -05:00
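
The tokenisation change described in c65fb94ad6 can be sketched roughly as follows, in Python. This is an illustration under assumptions, not the project's implementation: the commit does not say which symbols are replaced, so the `SYMBOLS` pattern is hypothetical, and `STOP_WORDS` contains only the stop words the message names explicitly.

```python
import re

# Hypothetical symbol set; the commit message does not enumerate the symbols.
SYMBOLS = re.compile(r"[\[\](){}<>|]")

# Only the stop words named in the commit message; the others are not listed there.
STOP_WORDS = {"18", "http", "https"}


def tokenize_old(message: str) -> list:
    # Earlier behaviour as described: symbols are dropped, gluing neighbours together.
    return SYMBOLS.sub("", message).split()


def tokenize_new(message: str) -> list:
    # Each symbol becomes one space, so neighbouring words become separate tokens.
    spaced = SYMBOLS.sub(" ", message)
    # Stop words are ignored entirely.
    return [word for word in spaced.split() if word.lower() not in STOP_WORDS]


if __name__ == "__main__":
    print(tokenize_old("[RP]Hi"))
    print(tokenize_new("[RP]Hi"))
```

With these assumptions, `tokenize_old("[RP]Hi")` yields the single token `RPHi`, while `tokenize_new("[RP]Hi")` yields `RP` and `Hi`, matching the example in the commit message.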
Anna Clemens 6ba4f5b1a5
chore: fix permissions 2021-01-30 16:10:59 -05:00
Anna Clemens 2229a0534a
feat: use separate process for classifying 2021-01-30 16:02:37 -05:00
Anna c07ed79775
fix: add word boundary checks 2021-01-11 10:31:18 -05:00
Anna 3316fc08d5
refactor: make compute private 2020-12-29 10:31:41 -05:00
Anna 96ef48f9db
refactor(trainer): use correct schema, though it shouldn't matter 2020-12-28 22:04:50 -05:00
Anna 908133bdf8
refactor: put computation in interface
This basically undoes the benefits of the previous commit. May end up being reverted.
2020-12-28 21:48:31 -05:00
Anna 1487863c19
fix: make plugin work on stock Dalamud
Use some horrible, cursed AppDomain shit to load dependencies that break on normal Dalamud in their own environment, then do classification there instead.
2020-12-23 03:52:19 -05:00