posTag

This is a part-of-speech tagger using a Hidden Markov Model, specifically, the Viterbi Algorithm, in Python. This project comes with two pickled dictionaries:

A.pickle: contains the tag transition probabilities

A[tag 1][tag 2] = P(tag 2|tag 1)

B.pickle: contains the word likelihood

B[tag][word] = P(word|tag)

This project also contains the files brown.test, which is a list of untagged sentences from the Brown Corpus and brown.test.answers, which is the tagged version of those sentences.

The Python file implements the Viterbi Algorithm + backtrace method on brown.test to find the most likely POS tags using the pickled dictionaries. It then compares the predicted POS tags with the actual answers in brown.test.answers, and calculates the accuracy, which ends up to be roughly 96.19%.

This project was originally created for my Introduction to Natural Language Processing class at San Jose State University.

Scott Ruddell sruddell09@gmail.com

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
A.pickle		A.pickle
B.pickle		B.pickle
README.md		README.md
brown.test		brown.test
brown.test.answers		brown.test.answers
posTag.py		posTag.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

posTag

About

Releases

Packages

Languages

sruddell09/posTag

Folders and files

Latest commit

History

Repository files navigation

posTag

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages