Monday, September 23, 2024

AI - Natural Language Processing - Stemming

Stemming is a text processing task in which you reduce words to their root, which is the core part of a word. For example, the words “helping” and “helper” share the root “help.” Stemming lets you focus on the basic meaning of a word rather than the details of how it is being used. Rule-based stemmers only approximate this: the stem they return is not always a dictionary word, as 'jesu' in the Porter Stemmer result below shows.
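As a quick illustration of the idea, here is a minimal sketch that assumes NLTK is installed and runs the Porter stemmer over a small word family. The names word_family and stems are only illustrative.

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

# A few inflected forms built on the same root.
word_family = ["helping", "helped", "helps", "helper"]

# Map each word to the stem the Porter algorithm returns for it.
stems = {word: stemmer.stem(word) for word in word_family}

for word, stem in stems.items():
    print(f"{word} -> {stem}")
```

Running it shows which forms Porter's suffix rules actually collapse. A stemmer follows fixed rules, so it may not reduce every related form to the same root.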


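This post tries 4 of the stemmers that ship with NLTK: Porter Stemmer, Snowball Stemmer, ARLSTem Stemmer, and ARLSTem2 Stemmer. Porter and Snowball (with the "english" setting) are built for English. ARLSTem and ARLSTem2 are light stemmers for Arabic, so they leave the English and Vietnamese sentences in the results below unchanged.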
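Because a stemmer only knows the rules of the languages it was written for, it can help to check language support before stemming. A minimal sketch, assuming NLTK is installed: SnowballStemmer.languages is the tuple of language names that the Snowball stemmer accepts, and is_supported is a hypothetical helper introduced here for illustration.

```python
from nltk.stem.snowball import SnowballStemmer


def is_supported(language: str) -> bool:
    """Return True if SnowballStemmer has an algorithm for this language."""
    return language.lower() in SnowballStemmer.languages


# Show every language name the Snowball stemmer accepts.
print(SnowballStemmer.languages)

print(is_supported("english"))
print(is_supported("vietnamese"))
```

If a language is not in the list, passing it to SnowballStemmer raises an error, so a check like this avoids a failure in the middle of a pipeline.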


Code:

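# word_tokenize needs NLTK's Punkt tokenizer data, and ignore_stopwords=True
# needs the stopwords corpus. One-time setup, if they are missing:
#   import nltk; nltk.download("punkt"); nltk.download("stopwords")
# (recent NLTK releases may ask for "punkt_tab" instead of "punkt")
from nltk.stem import PorterStemmer
from nltk.stem.snowball import SnowballStemmer
from nltk.stem.arlstem import ARLSTem
from nltk.stem.arlstem2 import ARLSTem2
from nltk.tokenize import word_tokenize

# English sentence, split into word and punctuation tokens
sentence_for_stemming = "Jesus answered, I am the way and the truth and the life. No one comes to the Father except through me."
words = word_tokenize(sentence_for_stemming)

# Porter: the classic English suffix-stripping stemmer
stemmer = PorterStemmer()
stemmed_words = [stemmer.stem(word) for word in words]
print("\n stemmed_words by PorterStemmer")
print(stemmed_words)

# Snowball in English mode; ignore_stopwords=True leaves stop words unstemmed
stemmer = SnowballStemmer("english", ignore_stopwords=True)
stemmed_words = [stemmer.stem(word) for word in words]
print("\n stemmed_words by SnowballStemmer")
print(stemmed_words)

# ARLSTem is an Arabic light stemmer, so English tokens pass through unchanged
stemmer = ARLSTem()
stemmed_words = [stemmer.stem(word) for word in words]
print("\n stemmed_words by ARLSTem stemmer")
print(stemmed_words)

# Vietnamese version of the same sentence
sentence_for_stemming = "Thầy là đường, là sự thật, và là sự sống. Không ai đến được với Cha mà không qua Thầy"
words = word_tokenize(sentence_for_stemming)

# ARLSTem2 is also an Arabic stemmer, so Vietnamese tokens pass through unchanged
stemmer = ARLSTem2()
stemmed_words = [stemmer.stem(word) for word in words]
print("\n stemmed_words by ARLSTem2 stemmer")
print(stemmed_words)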




Result of stemming by Porter Stemmer, Snowball Stemmer, ARLSTem Stemmer, and ARLSTem2 Stemmer:

 stemmed_words by PorterStemmer

['jesu', 'answer', ',', 'i', 'am', 'the', 'way', 'and', 'the', 'truth', 'and', 'the', 'life', '.', 'no', 'one', 'come', 'to', 'the', 'father', 'except', 'through', 'me', '.']


 stemmed_words by SnowballStemmer

['jesus', 'answer', ',', 'i', 'am', 'the', 'way', 'and', 'the', 'truth', 'and', 'the', 'life', '.', 'no', 'one', 'come', 'to', 'the', 'father', 'except', 'through', 'me', '.']


 stemmed_words by ARLSTem stemmer

['Jesus', 'answered', ',', 'I', 'am', 'the', 'way', 'and', 'the', 'truth', 'and', 'the', 'life', '.', 'No', 'one', 'comes', 'to', 'the', 'Father', 'except', 'through', 'me', '.']


 stemmed_words by ARLSTem2 stemmer

['Thầy', 'là', 'đường', ',', 'là', 'sự', 'thật', ',', 'và', 'là', 'sự', 'sống', '.', 'Không', 'ai', 'đến', 'được', 'với', 'Cha', 'mà', 'không', 'qua', 'Thầy']
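To see exactly where two stemmers disagree, their output lists can be compared position by position. The sketch below uses the Porter and Snowball results printed above, copied verbatim. The variable names are only illustrative.

```python
# Porter and Snowball outputs for the English sentence, as printed above.
porter = ['jesu', 'answer', ',', 'i', 'am', 'the', 'way', 'and', 'the', 'truth',
          'and', 'the', 'life', '.', 'no', 'one', 'come', 'to', 'the', 'father',
          'except', 'through', 'me', '.']
snowball = ['jesus', 'answer', ',', 'i', 'am', 'the', 'way', 'and', 'the', 'truth',
            'and', 'the', 'life', '.', 'no', 'one', 'come', 'to', 'the', 'father',
            'except', 'through', 'me', '.']

# Keep only the positions where the two stemmers returned different stems.
differences = [
    (index, porter_stem, snowball_stem)
    for index, (porter_stem, snowball_stem) in enumerate(zip(porter, snowball))
    if porter_stem != snowball_stem
]

print(differences)  # [(0, 'jesu', 'jesus')]
```

For this sentence the two stemmers differ on a single token: Porter trims "Jesus" to "jesu", while Snowball keeps "jesus".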










