Explain how we can do parsing.

1 Answer

Parsing is the process of identifying and understanding the syntactic structure of a text by analyzing its individual elements. A simple way to do this is to read the text in fixed-size groups of consecutive words: one word at a time, then two at a time, then three, and so on.

When the machine reads the text one word at a time, each unit is a unigram.

When it reads two consecutive words at a time, each pair is a bigram.

When it reads three consecutive words at a time, each group is a trigram.
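The three definitions above can be sketched in plain Python as a sliding window over a list of words. This is only an illustration: `word_windows` is a hypothetical helper, not an nltk function, and splitting on whitespace is a simplifying assumption in place of a real tokenizer.

```python
# Hypothetical helper (not part of nltk): slide a fixed-size window over a word list.
def word_windows(words, size):
    # Each window starts at index i and covers `size` consecutive words.
    return [tuple(words[i:i + size]) for i in range(len(words) - size + 1)]

# Assumption: whitespace splitting is good enough for this sample sentence.
words = "Top 30 NLP interview questions and answers".split()

unigrams = word_windows(words, 1)  # one word per group
bigrams = word_windows(words, 2)   # two consecutive words per group
trigrams = word_windows(words, 3)  # three consecutive words per group

print(unigrams[:3])  # [('Top',), ('30',), ('NLP',)]
print(bigrams[:3])   # [('Top', '30'), ('30', 'NLP'), ('NLP', 'interview')]
print(trigrams[:3])  # [('Top', '30', 'NLP'), ('30', 'NLP', 'interview'), ('NLP', 'interview', 'questions')]
```

Note that consecutive windows overlap: each one starts exactly one word after the previous one.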

Look at the diagram below to understand unigrams, bigrams, and trigrams.

Now, let’s implement parsing with the help of the nltk package.

  import nltk

  text = "Top 30 NLP interview questions and answers"

We will now tokenize the text using word_tokenize.

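  # Requires NLTK's Punkt tokenizer data (e.g. nltk.download("punkt"); newer releases may ask for "punkt_tab")
  text_token = nltk.word_tokenize(text)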

Now, we will use the function for extracting unigrams, bigrams, and trigrams.

  list(nltk.ngrams(text_token, 1))  # a unigram is an n-gram with n = 1

Output:

  [('Top',), ('30',), ('NLP',), ('interview',), ('questions',), ('and',), ('answers',)]

  list(nltk.bigrams(text_token))

Output:

  [('Top', '30'), ('30', 'NLP'), ('NLP', 'interview'), ('interview', 'questions'), ('questions', 'and'), ('and', 'answers')]

  list(nltk.trigrams(text_token))

Output:

  [('Top', '30', 'NLP'), ('30', 'NLP', 'interview'), ('NLP', 'interview', 'questions'), ('interview', 'questions', 'and'), ('questions', 'and', 'answers')]

To extract n-grams of any size, we can use the function nltk.ngrams and pass the argument n, the number of consecutive words in each group.

  list(nltk.ngrams(text_token, n))
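To see how the choice of n affects the result, here is a minimal pure-Python sketch. It relies on the fact that a sequence of N tokens yields N - n + 1 n-grams, or none when n is larger than N. `ngrams_sketch` is a hypothetical stand-in written for illustration, not nltk's implementation, and the whitespace split is an assumption in place of word_tokenize.

```python
# Hypothetical stand-in for illustration only (not nltk's implementation).
def ngrams_sketch(tokens, n):
    # Build n staggered copies of the token list and zip them together;
    # zip stops at the shortest copy, so only complete groups are produced.
    return list(zip(*(tokens[i:] for i in range(n))))

# Assumption: whitespace splitting stands in for a real tokenizer here.
tokens = "Top 30 NLP interview questions and answers".split()

for n in (1, 2, 3, 4):
    grams = ngrams_sketch(tokens, n)
    # A sequence of N tokens has N - n + 1 n-grams (never fewer than 0).
    expected_count = max(len(tokens) - n + 1, 0)
    print(n, len(grams), expected_count)

four_grams = ngrams_sketch(tokens, 4)
print(four_grams[0])  # ('Top', '30', 'NLP', 'interview')
```

With the 7 tokens above, n = 4 gives 4 groups, and any n greater than 7 gives an empty list.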

...