Explain how we can do parsing.

Question

Explain how we can do parsing.

1 Answer

john ganales · Answer 1 · 2023-05-08T22:43:35+0000

Parsing is the process of identifying and understanding the syntactic structure of a text by analyzing its individual elements. A simple way to break a text into such elements is to read it in overlapping groups of n consecutive words, called n-grams: one word at a time, then two at a time, then three, and so on.

When the machine reads the text one word at a time, each word is a unigram.

When the text is read two words at a time, each pair of adjacent words is a bigram.

When the text is read three words at a time, each run of three adjacent words is a trigram.
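The sliding-window idea behind these three definitions can be sketched in plain Python. This is only an illustration under simple assumptions: `ngrams_sketch` is a hypothetical helper name, not an NLTK function, and the text is split on whitespace instead of being run through a real tokenizer.

```python
def ngrams_sketch(tokens, n):
    """Return every run of n consecutive tokens as a tuple (hypothetical helper)."""
    # zip over n shifted copies of the list, so each tuple is one sliding window
    return list(zip(*(tokens[i:] for i in range(n))))


# Assumption: a plain whitespace split stands in for a real tokenizer
tokens = "NLP interview questions".split()

unigrams = ngrams_sketch(tokens, 1)
bigrams = ngrams_sketch(tokens, 2)
trigrams = ngrams_sketch(tokens, 3)

print(unigrams)  # [('NLP',), ('interview',), ('questions',)]
print(bigrams)   # [('NLP', 'interview'), ('interview', 'questions')]
print(trigrams)  # [('NLP', 'interview', 'questions')]
```

Note that the windows overlap: each n-gram starts one token after the previous one, so a text of k tokens gives k - n + 1 n-grams.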

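For example, the phrase "NLP interview questions" gives three unigrams ("NLP", "interview", "questions"), two bigrams ("NLP interview", "interview questions"), and one trigram ("NLP interview questions").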

Now, let’s implement parsing with the help of the nltk package.

import nltk

text = "Top 30 NLP interview questions and answers"

We will now tokenize the text using word_tokenize, which needs NLTK's Punkt tokenizer data to be downloaded once with nltk.download.

text_token = nltk.word_tokenize(text)

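Now, we will use NLTK's functions to extract unigrams, bigrams, and trigrams from the tokens. NLTK has no unigrams function, so the unigrams come from nltk.ngrams with n set to 1. Each n-gram is returned as a tuple of tokens.

list(nltk.ngrams(text_token, 1))

Output:

[('Top',), ('30',), ('NLP',), ('interview',), ('questions',), ('and',), ('answers',)]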

list(nltk.bigrams(text_token))

Output:

[('Top', '30'), ('30', 'NLP'), ('NLP', 'interview'), ('interview', 'questions'), ('questions', 'and'), ('and', 'answers')]

list(nltk.trigrams(text_token))

Output:

[('Top', '30', 'NLP'), ('30', 'NLP', 'interview'), ('NLP', 'interview', 'questions'), ('interview', 'questions', 'and'), ('questions', 'and', 'answers')]

For extracting n-grams of any size, we can use the function nltk.ngrams and pass n, the number of consecutive tokens in each n-gram.

list(nltk.ngrams(text_token, n))
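As a concrete instance of the general call, here is a minimal sketch with n = 4. The value of n and the variable name `four_grams` are chosen only for illustration, and the sentence is split on whitespace so the snippet runs without downloading tokenizer data (for this sentence, which has no punctuation, that gives the same tokens as word_tokenize).

```python
import nltk

# Assumption: whitespace split instead of word_tokenize, to avoid the data download
tokens = "Top 30 NLP interview questions and answers".split()

n = 4  # illustrative value; any positive integer works
four_grams = list(nltk.ngrams(tokens, n))

# Print each 4-gram as a readable phrase
for gram in four_grams:
    print(" ".join(gram))

# A sequence of k tokens yields k - n + 1 n-grams (here 7 - 4 + 1 = 4)
print(len(four_grams))
```

nltk.ngrams returns a lazy iterator, which is why the result is wrapped in list() before it is printed or measured.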
