Post
104
Mo bigga' != Mo betta' with this geometric penta structure.
More purpose with more careful organization... now we're talking.
I'm going heavy into lexical cardinality today and preparing a full crystal structured geometry that is full wordnet capable. Anything that isn't can be formed at runtime.
Full lexicality will include unigrams, 2-6 ngram counts from wordnet with frequency weights, usage, and a multitude of other elements. Each will be crystallized specifically. If you have any suggestions to making this more robust I'm all ears.
I could go with google books or something bigger, but I'm sticking to wordnet because it won't take me weeks to process entirely.
Crystal geometry will be given rich versions that include the correct lexical and organizational subsets specific to the lexicality and frequency of use, as well as the proper ascii, wordnet, and unicode sets.
For wordnet-rich; Each definition will attribute towards the overall goal of the upcoming crystals so the system will represent that goal proportionately through multiple crystals and trajectory concatenated rather than full concatenation like the current vocabulary is doing. Additionally, the frequency tokens will decide the orthogonal trajectory more carefully.
For testing and quick prototype purposes;
We will need to train a Bert variant that can house some capability of rapid geometric crystal prediction through ngram feature similarity, sentence similarity, sentence classification, and a few other bert traits that bert-beatrix-2048 is capable of. I know Bert can handle this at least - however Bert can't house the entirety of meaning so it will be imperfect... even so it will be considerably faster than trying to query the whole dataset every time you want a character, or preparing a massive vocab for rapid testing and iteration. Ask bert.
Not to mention feature extraction for training rapid classification heads with geometric subsystems, which are notoriously fast at training.
More purpose with more careful organization... now we're talking.
I'm going heavy into lexical cardinality today and preparing a full crystal structured geometry that is full wordnet capable. Anything that isn't can be formed at runtime.
Full lexicality will include unigrams, 2-6 ngram counts from wordnet with frequency weights, usage, and a multitude of other elements. Each will be crystallized specifically. If you have any suggestions to making this more robust I'm all ears.
I could go with google books or something bigger, but I'm sticking to wordnet because it won't take me weeks to process entirely.
Crystal geometry will be given rich versions that include the correct lexical and organizational subsets specific to the lexicality and frequency of use, as well as the proper ascii, wordnet, and unicode sets.
For wordnet-rich; Each definition will attribute towards the overall goal of the upcoming crystals so the system will represent that goal proportionately through multiple crystals and trajectory concatenated rather than full concatenation like the current vocabulary is doing. Additionally, the frequency tokens will decide the orthogonal trajectory more carefully.
For testing and quick prototype purposes;
We will need to train a Bert variant that can house some capability of rapid geometric crystal prediction through ngram feature similarity, sentence similarity, sentence classification, and a few other bert traits that bert-beatrix-2048 is capable of. I know Bert can handle this at least - however Bert can't house the entirety of meaning so it will be imperfect... even so it will be considerably faster than trying to query the whole dataset every time you want a character, or preparing a massive vocab for rapid testing and iteration. Ask bert.
Not to mention feature extraction for training rapid classification heads with geometric subsystems, which are notoriously fast at training.