This paper proposes a fast method for text tendency classification. The method uses a category space model to describe how strongly each word tends toward each category, and classifies texts from the statistical characteristics of their words. Because tendency classification is complex, the paper also proposes a new secondary (two-stage) feature extraction method that jointly considers three statistics: word frequency, document frequency (the number of texts containing the word), and word distribution across categories. The first stage uses a combined feature extraction method to remove low-frequency words and noise words that are evenly distributed across the categories; the second stage removes words whose category tendency is unclear. Experiments show that the method achieves high classification performance and runs fast, which gives it practical value in information retrieval, information filtering, content security management, and related applications.

Keywords: keyword category weight; category space model; text tendency classification; secondary feature extraction
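
The two-stage extraction and the word-to-category weighting described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives no formulas, so the weight definition (a word's length-normalized occurrence rate per category, rescaled to sum to 1), the entropy test for "evenly distributed" words, the margin test for "unclear tendency", and every function name and threshold below are assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
import math


def category_weights(docs):
    """docs: list of (tokens, label) pairs.
    Returns ({word: {label: weight}}, term frequency, document frequency).
    ASSUMPTION: weight = per-category occurrence rate, normalized to sum to 1."""
    tf, df = Counter(), Counter()
    per_cat = defaultdict(Counter)   # word -> label -> occurrence count
    cat_sizes = Counter()            # label -> total tokens in that category
    for tokens, label in docs:
        tf.update(tokens)
        df.update(set(tokens))
        cat_sizes[label] += len(tokens)
        for t in tokens:
            per_cat[t][label] += 1
    labels = sorted(c for c, n in cat_sizes.items() if n > 0)
    weights = {}
    for w, counts in per_cat.items():
        rates = {c: counts[c] / cat_sizes[c] for c in labels}
        total = sum(rates.values())
        weights[w] = {c: r / total for c, r in rates.items()}
    return weights, tf, df


def select_features(weights, tf, df, min_tf=2, min_df=2,
                    max_entropy=0.9, min_margin=0.3):
    """Two-stage filter; all thresholds are hypothetical."""
    kept = {}
    for w, dist in weights.items():
        # Stage 1a: drop low-frequency words.
        if tf[w] < min_tf or df[w] < min_df:
            continue
        # Stage 1b: drop noise words spread evenly over the categories,
        # measured here by normalized entropy of the weight distribution.
        k = len(dist)
        entropy = -sum(p * math.log(p) for p in dist.values() if p > 0)
        if k > 1 and entropy / math.log(k) > max_entropy:
            continue
        # Stage 2: drop words whose top two category weights are too close.
        top = sorted(dist.values(), reverse=True)
        if k > 1 and top[0] - top[1] < min_margin:
            continue
        kept[w] = dist
    return kept


def classify(tokens, kept):
    """Sum the category weights of the retained words; return the best label."""
    scores = Counter()
    for t in tokens:
        for c, p in kept.get(t, {}).items():
            scores[c] += p
    return max(scores, key=scores.get) if scores else None


# Toy corpus, invented for this example.
docs = [
    ("great great plot the film".split(), "pos"),
    ("great acting the film".split(), "pos"),
    ("boring boring plot the film".split(), "neg"),
    ("boring acting the film".split(), "neg"),
]
weights, tf, df = category_weights(docs)
kept = select_features(weights, tf, df)
label = classify("a great film".split(), kept)
```

On this toy corpus only "great" and "boring" survive the filter, since every other word is split evenly between the two categories, and the sample text is labeled "pos". Classification is a single pass of dictionary lookups over the text, which is consistent with the speed the abstract claims, though the paper's actual scoring rule may differ.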