MACHINE LEARNING DISCUSSION FORUM FOR TECHNICAL QUERIES

Discussion in 'Big Data and Analytics' started by Nishant Singh_3, Oct 10, 2018.

  1. Nishant Singh_3

    Nishant Singh_3 Well-Known Member
    Simplilearn Support

    Joined: Aug 1, 2018
    Messages: 107
    Likes Received: 2
    Hi Learners,

    This thread is for you to discuss queries and concepts related to the Machine Learning course.

    Happy Learning!

    Regards,
    Team Simplilearn
     
    #1
  2. _34254

    _34254 Member

    Joined: Jul 9, 2018
    Messages: 2
    Likes Received: 0
    Hi,
    My question is related to NLP. When I try to download an article from a web page, JavaScript code gets downloaded along with the article text, and it is not removed even after stopword removal. The code is below, in case someone can help.

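    # Code
    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    articleURL="https://www.washingtonpost.com/poli...id-thank-her-her-vote/?utm_term=.9b78180faf97"

    def GetText(url):
        # Download the page and parse it
        page=urlopen(url).read().decode('utf8','ignore')
        soup=BeautifulSoup(page,'lxml')
        # find() returns the <article> element itself, so encode() serializes its markup, embedded <script> tags included
        text=soup.find('article')
        return text.encode('ascii',errors='replace')

    Text=GetText(articleURL).decode()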
     
    #2
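
A possible way to approach the question above, offered as a sketch rather than a definitive fix: stopword removal only drops common words such as "the" or "is", so JavaScript tokens pass straight through it. The script content probably has to be removed at the HTML stage, before any text processing. The example below assumes BeautifulSoup is installed and uses the built-in `html.parser` backend instead of `lxml` to avoid an extra dependency. `SAMPLE_HTML` and `extract_article_text` are hypothetical names introduced for illustration, and the sample markup stands in for the downloaded page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page standing in for the downloaded article HTML.
SAMPLE_HTML = """
<html><body>
<article>
  <h1>Headline</h1>
  <script>var tracker = {id: 42};</script>
  <p>First paragraph.</p>
  <style>p { color: red; }</style>
  <p>Second paragraph.</p>
</article>
</body></html>
"""


def extract_article_text(html):
    """Return the visible text of the first <article>, without script/style content."""
    soup = BeautifulSoup(html, "html.parser")
    article = soup.find("article")
    if article is None:
        # No <article> element on the page: nothing to extract.
        return ""
    # Remove <script> and <style> elements, and everything inside them, from the tree.
    for tag in article.find_all(["script", "style"]):
        tag.decompose()
    # get_text() joins the remaining text nodes; strip=True trims each one
    # and skips nodes that are only whitespace.
    return article.get_text(separator=" ", strip=True)


clean_text = extract_article_text(SAMPLE_HTML)
print(clean_text)
```

In the posted code, the same idea would mean calling `decompose()` on the script and style tags of the result of `soup.find('article')` and returning `get_text()` instead of `encode()`. Whether this removes everything unwanted depends on how the real page is structured, which cannot be checked from the thread.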
