自然语言处理之词袋模型Bag_of_words

自然语言处理之词袋模型Bag_of_words文章目录读取训练数据BeautifulSoup处理获取词袋和向量预测结果使用随机森林分类器进行分类输出提交结果尝试使用xgb还是随机森林好用教程地址:https://www.kaggle.com/c/word2vec-nlp-tutorial/overview/part-1-for-beginners-bag-of-words读取训练数据训练数据的内容是2500条电影评论。impor…

大家好,又见面了,我是你们的朋友全栈君。

教程地址:


https://www.kaggle.com/c/word2vec-nlp-tutorial/overview/part-1-for-beginners-bag-of-words

读取训练数据

训练数据的内容是2500条电影评论。

import pandas as pd
train = pd.read_csv("./data/labeledTrainData.tsv", header=0, delimiter="\t", quoting=3)
train.head(3)
id sentiment review
0 “5814_8” 1 “With all this stuff going down at the moment …
1 “2381_9” 1 “\”The Classic War of the Worlds\” by Timothy …
2 “7759_3” 0 “The film starts with a manager (Nicholas Bell…
train.shape
(25000, 3)
example = train['review'][0]
example
'"With all this stuff going down at the moment with MJ i\'ve started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ\'s feeling towards the press and also the obvious message of drugs are bad m\'kay.<br /><br />Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.<br /><br />The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci\'s character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ\'s music.<br /><br />Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.<br /><br />Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ\'s bestest buddy in this movie is a girl! Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i\'ve gave this subject....hmmm well i don\'t know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."'

train当中的review项里面包含的数据是HTML类型,为了去除HTML标签,保存纯粹的评论,使用BeautifulSoup。

BeautifulSoup处理

from bs4 import BeautifulSoup
# 创建 beautifulsoup 对象
soup = BeautifulSoup(example)
#格式化输出内容
print(soup.prettify())
<html>
 <body>
  <p>
   "With all this stuff going down at the moment with MJ i've started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ's feeling towards the press and also the obvious message of drugs are bad m'kay.
   <br/>
   <br/>
   Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.
   <br/>
   <br/>
   The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci's character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ's music.
   <br/>
   <br/>
   Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.
   <br/>
   <br/>
   Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ's bestest buddy in this movie is a girl! Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i've gave this subject....hmmm well i don't know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."
  </p>
 </body>
</html>
# 查找各个标签
print(soup.title)
print(soup.head)
print(soup.a)
print(soup.p)
None
None
None
<p>"With all this stuff going down at the moment with MJ i've started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ's feeling towards the press and also the obvious message of drugs are bad m'kay.<br/><br/>Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.<br/><br/>The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci's character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ's music.<br/><br/>Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.<br/><br/>Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ's bestest buddy in this movie is a girl! Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i've gave this subject....hmmm well i don't know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."</p>
# 遍历孩子
for child in  soup.body.children:
    print (child)
<p>"With all this stuff going down at the moment with MJ i've started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ's feeling towards the press and also the obvious message of drugs are bad m'kay.<br/><br/>Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.<br/><br/>The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci's character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ's music.<br/><br/>Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.<br/><br/>Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ's bestest buddy in this movie is a girl! Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i've gave this subject....hmmm well i don't know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."</p>
# find_all是一个很神奇的函数,可以传入字符、列表、正则表达式、函数等等等。
print(soup.find_all('br'))
[<br/>, <br/>, <br/>, <br/>, <br/>, <br/>, <br/>, <br/>]
# 可以在soup.select里面直接使用css代码
print(soup.select('.p'))
[]
# 获取text
print(soup.get_text())
"With all this stuff going down at the moment with MJ i've started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ's feeling towards the press and also the obvious message of drugs are bad m'kay.Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci's character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ's music.Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ's bestest buddy in this movie is a girl! Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i've gave this subject....hmmm well i don't know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."
import nltk
import re
from nltk.corpus import stopwords
from nltk.stem.lancaster import LancasterStemmer
lancaster_stemmer = LancasterStemmer()
print (stopwords.words("english"))
['i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you', "you're", "you've", "you'll", "you'd", 'your', 'yours', 'yourself', 'yourselves', 'he', 'him', 'his', 'himself', 'she', "she's", 'her', 'hers', 'herself', 'it', "it's", 'its', 'itself', 'they', 'them', 'their', 'theirs', 'themselves', 'what', 'which', 'who', 'whom', 'this', 'that', "that'll", 'these', 'those', 'am', 'is', 'are', 'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had', 'having', 'do', 'does', 'did', 'doing', 'a', 'an', 'the', 'and', 'but', 'if', 'or', 'because', 'as', 'until', 'while', 'of', 'at', 'by', 'for', 'with', 'about', 'against', 'between', 'into', 'through', 'during', 'before', 'after', 'above', 'below', 'to', 'from', 'up', 'down', 'in', 'out', 'on', 'off', 'over', 'under', 'again', 'further', 'then', 'once', 'here', 'there', 'when', 'where', 'why', 'how', 'all', 'any', 'both', 'each', 'few', 'more', 'most', 'other', 'some', 'such', 'no', 'nor', 'not', 'only', 'own', 'same', 'so', 'than', 'too', 'very', 's', 't', 'can', 'will', 'just', 'don', "don't", 'should', "should've", 'now', 'd', 'll', 'm', 'o', 're', 've', 'y', 'ain', 'aren', "aren't", 'couldn', "couldn't", 'didn', "didn't", 'doesn', "doesn't", 'hadn', "hadn't", 'hasn', "hasn't", 'haven', "haven't", 'isn', "isn't", 'ma', 'mightn', "mightn't", 'mustn', "mustn't", 'needn', "needn't", 'shan', "shan't", 'shouldn', "shouldn't", 'wasn', "wasn't", 'weren', "weren't", 'won', "won't", 'wouldn', "wouldn't"]

获取词袋和向量

# 写一个处理函数
def review_to_words( raw_review ):
    review_text = BeautifulSoup(raw_review).get_text()
    # 去除标点和数字,仅保留强烈语气词
    letters_only = re.sub("[^a-zA-Z?!]", " ", review_text) 
    # 统一转换为小写字母
    words = letters_only.lower().split()                             
    # 由于set的搜索速度更快,所以把list转换成set
    stops = set(stopwords.words("english"))                  
    # 移除停止词,并且将词转为原形形式
    meaningful_words = [lancaster_stemmer.stem(w) for w in words if not w in stops]   
    # 返回标准语句
    return( " ".join( meaningful_words ))  
# 获取字符串列表
num_reviews = train["review"].size
clean_train_reviews = []
for i in range(num_reviews):
    clean_train_reviews.append( review_to_words( train["review"][i] ) )

uk edit show rath less extrav us vert person concern get new kitch perhap bedroom bathroom wond grat got us vert show everyth real tv instead mak improv hous occup could afford entir hous get rebuilt know show try show lousy welf system ex us beg hard enough receiv rath vulg produc plac tak plac particul sear also uncal rsther turn on famy depr are pot millionair would far bet help commun whol instead spend hundr thousand doll on hom build someth whol commun perhap plac diy pow tool borrow return along build mat everyon benefit want giv on person caus enorm res among rest loc commun stil liv run hous

在进行下一步之前,有必要介绍一下词袋模型,要将几个句子转化成向量,第一步是把它们包含的所有词不重复地装到一个袋子里,然后这几个句子就可以转换成和袋子里的词的数量一样长的向量,这个向量的每一个位置都对应着袋子里面某一个词在句子中出现的次数,如果没有出现就是0.

from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer(analyzer = "word", tokenizer = None, preprocessor = None, stop_words = None, max_features = 5000)
train_data_features = vectorizer.fit_transform(clean_train_reviews)
train_data_features = train_data_features.toarray()
print(train_data_features.shape)
(25000, 5000)

查看词袋里面装的具体内容

vocab = vectorizer.get_feature_names()
print(len(vocab))
print(vocab)
5000
['abandon', 'abbot', 'abc', 'abduc', 'abl', 'abomin', 'aborigin', 'abort', 'abound', 'about', 'abraham', 'abrupt', 'abs', 'absolv', 'absorb', 'absurd', 'abud', 'abund', 'abus', 'abysm', 'ac', 'academy', 'acc', 'acceiv', 'access', 'accid', 'acclaim', 'accompany', 'accompl', 'accord', 'account', 'accus', 'ach', 'achiev', 'acid', 'acknowledg', 'acquaint', 'acquir', 'across', 'act', 'actress', 'ad', 'adam', 'adapt', 'addict', 'addit', 'address', 'adel', 'adequ', 'adjust', 'admin', 'admir', 'admit', 'adolesc', 'adopt', 'adr', 'adult', 'adv', 'advers', 'advert', 'aesthet', 'af', 'affair', 'affect', 'affirm', 'affleck', 'afford', 'afr', 'afraid', 'afric', 'afterma', 'afternoon', 'afterward', 'ag', 'again', 'agend', 'aggress', 'ago', 'agon', 'agony', 'agr', 'agree', 'ah', 'ahead', 'aid', 'aim', 'aimless', 'air', 'airpl', 'airport', 'ak', 'akin', 'akshay', 'al', 'ala', 'alarm', 'alb', 'albeit', 'albert', 'alcohol', 'alec', 'alert', 'alex', 'alexand', 'alexandr', 'alfr', 'alic', 'alik', 'alison', 'all', 'alleg', 'alley', 'allow', 'allud', 'almost', 'alon', 'along', 'alongsid', 'alot', 'already', 'alright', 'also', 'alt', 'altern', 'although', 'altm', 'altogeth', 'alvin', 'alway', 'aly', 'am', 'amand', 'amaz', 'amazon', 'amb', 'ambigu', 'ambit', 'amby', 'americ', 'amidst', 'amitabh', 'among', 'amongst', 'amount', 'ampl', 'amrit', 'amus', 'amy', 'an', 'analys', 'anch', 'anct', 'and', 'anderson', 'andr', 'andre', 'andrew', 'andy', 'ang', 'angel', 'angl', 'angry', 'angst', 'angy', 'anil', 'anim', 'ann', 'annount', 'annoy', 'anny', 'anoth', 'answ', 'ant', 'antagon', 'antholog', 'anthony', 'anticip', 'anton', 'antonio', 'antonion', 'antwon', 'anxy', 'anybody', 'anyhow', 'anym', 'anyon', 'anyone', 'anyth', 'anytim', 'anyway', 'anywh', 'ap', 'apart', 'apocalypt', 'apolog', 'app', 'appal', 'appear', 'appl', 'applaud', 'apply', 'apprecy', 'approach', 'appropry', 'approv', 'approxim', 'april', 'apt', 'ar', 'arab', 'arc', 'arch', 'archaeolog', 'architect', 'are', 'area', 'argentin', 'argu', 'ariel', 'aristocr', 'arkin', 'arm', 'armstrong', 'army', 'arnold', 'around', 'arquet', 'arrang', 'arrest', 'arrog', 'arrow', 'art', 'arth', 'artic', 'artsy', 'artwork', 'arty', 'as', 'ash', 'asham', 'ashley', 'asid', 'ask', 'asleep', 'aspect', 'aspir', 'ass', 'assassin', 'assault', 'assembl', 'assert', 'asset', 'assign', 'assist', 'assocy', 'assort', 'assum', 'astair', 'aston', 'astound', 'astronaut', 'asyl', 'at', 'athlet', 'atl', 'atmosph', 'atroc', 'atrocy', 'attach', 'attack', 'attempt', 'attenborough', 'attend', 'attitud', 'attorney', 'attract', 'attribut', 'aud', 'audio', 'audit', 'audrey', 'audy', 'august', 'aunt', 'aur', 'aussy', 'aust', 'austin', 'austral', 'aut', 'auth', 'auto', 'autobiograph', 'autom', 'av', 'avail', 'aveng', 'avid', 'avoid', 'aw', 'await', 'awak', 'award', 'away', 'awesom', 'awhil', 'awkward', 'ax', 'aztec', 'bab', 'baby', 'babysit', 'bacal', 'bach', 'bachch', 'bachel', 'back', 'backdrop', 'background', 'backst', 'backward', 'bacon', 'bad', 'baddy', 'baffl', 'bag', 'bait', 'bak', 'baksh', 'bal', 'bald', 'baldwin', 'ballet', 'ban', 'band', 'bang', 'bank', 'bant', 'bar', 'barb', 'barbar', 'barbr', 'bargain', 'bark', 'barn', 'barney', 'baron', 'barrel', 'barry', 'barrym', 'bas', 'basebal', 'bash', 'basket', 'basketbal', 'bastard', 'bat', 'bath', 'bathroom', 'batm', 'battl', 'battlefield', 'bau', 'bay', 'bbc', 'be', 'beach', 'bean', 'bear', 'beard', 'beast', 'beat', 'beatl', 'beatty', 'beauty', 'beav', 'becam', 'beckham', 'beckins', 'becom', 'bed', 'bedroom', 'beer', 'beetl', 'befriend', 'beg', 'begin', 'begun', 'behav', 'behavio', 'behavy', 'behind', 'behold', 'being', 'bel', 'belg', 'believ', 'belong', 'belov', 'belt', 'belush', 'ben', 'bend', 'benea', 'benefit', 'bennet', 'bent', 'beowulf', 'bergm', 'berkeley', 'berlin', 'bernard', 'besid', 'best', 'bet', 'betray', 'better', 'betty', 'bev', 'bew', 'bewild', 'beyond', 'bias', 'bibl', 'big', 'biggest', 'bik', 'bikin', 'biko', 'bil', 'bimbo', 'bin', 'bind', 'bing', 'biograph', 'biop', 'bir', 'bird', 'birthday', 'bit', 'bitch', 'bittersweet', 'bizar', 'bla', 'black', 'blackmail', 'blad', 'blah', 'blair', 'blak', 'blam', 'bland', 'blank', 'blast', 'blat', 'blaz', 'bleak', 'blee', 'blend', 'bless', 'blew', 'blind', 'blink', 'bliss', 'blob', 'block', 'blockbust', 'blond', 'blood', 'bloody', 'bloom', 'blossom', 'blow', 'blown', 'blu', 'blunt', 'blur', 'bo', 'board', 'boast', 'boat', 'bob', 'bobby', 'body', 'bog', 'bogart', 'boggl', 'boil', 'bol', 'bold', 'bollywood', 'bomb', 'bon', 'bond', 'bonny', 'boo', 'boob', 'boog', 'book', 'boom', 'boost', 'boot', 'bor', 'bord', 'boredom', 'born', 'borrow', 'boss', 'boston', 'both', 'bottl', 'bottom', 'bought', 'bound', 'bount', 'bounty', 'bourn', 'bout', 'bow', 'bowl', 'box', 'boy', 'boyfriend', 'boyl', 'brad', 'brady', 'brain', 'brainless', 'branagh', 'branch', 'brand', 'brando', 'brat', 'brav', 'braveheart', 'bravo', 'brazil', 'brea', 'bread', 'break', 'breakdown', 'breakfast', 'breast', 'breath', 'breathtak', 'bree', 'brend', 'brent', 'bret', 'bri', 'brick', 'brid', 'bridg', 'bridget', 'brief', 'bright', 'bril', 'bring', 'brit', 'britain', 'bro', 'broad', 'broadcast', 'broadway', 'brok', 'bronson', 'bront', 'brood', 'brook', 'brooklyn', 'brosn', 'broth', 'brought', 'brow', 'brown', 'bruc', 'bruno', 'brush', 'brut', 'bry', 'bsg', 'btw', 'bubbl', 'buck', 'bucket', 'bud', 'buddy', 'budget', 'buff', 'buffalo', 'bug', 'build', 'built', 'bul', 'bulk', 'bullet', 'bum', 'bumbl', 'bump', 'bunch', 'bunny', 'bur', 'burd', 'burk', 'burn', 'burst', 'burt', 'burton', 'bury', 'bus', 'busey', 'bush', 'businessm', 'bust', 'busy', 'but', 'butch', 'butl', 'button', 'buy', 'buzz', 'bye', 'cab', 'cabin', 'cabl', 'caf', 'cag', 'cagney', 'cain', 'cak', 'cal', 'calc', 'calib', 'californ', 'calm', 'cam', 'cambod', 'camcord', 'cameo', 'camer', 'camera', 'cameram', 'cameron', 'camp', 'campaign', 'campbel', 'campy', 'can', 'canad', 'cancel', 'candid', 'candl', 'candy', 'cannib', 'cannon', 'cannot', 'cant', 'canyon', 'cap', 'capac', 'capit', 'capot', 'capt', 'captain', 'car', 'card', 'cardboard', 'carel', 'cares', 'caretak', 'carey', 'carl', 'carlito', 'carlo', 'carm', 'carn', 'carol', 'carolin', 'caron', 'carp', 'carradin', 'carrey', 'carry', 'cart', 'cartoon', 'cary', 'cas', 'casablanc', 'cash', 'casino', 'casp', 'cassavet', 'cassidy', 'cast', 'castl', 'cat', 'catalog', 'catastroph', 'catch', 'catchy', 'categ', 'catherin', 'cathol', 'cattl', 'caught', 'caus', 'caut', 'cav', 'cbs', 'cd', 'ceas', 'cecil', 'ceil', 'cel', 'celebr', 'celest', 'celluloid', 'cemetery', 'cens', 'cent', 'century', 'cerebr', 'ceremony', 'certain', 'cg', 'cgi', 'chain', 'chainsaw', 'chair', 'challeng', 'chamb', 'chamberlain', 'champ', 'chan', 'chang', 'channel', 'chant', 'chao', 'chaplin', 'chapt', 'char', 'charact', 'charg', 'charism', 'charl', 'charlot', 'charlton', 'charm', 'chas', 'chat', 'chavez', 'che', 'cheadl', 'cheap', 'cheaply', 'check', 'cheek', 'chees', 'cheesy', 'chem', 'cher', 'chess', 'chest', 'chew', 'chib', 'chicago', 'chick', 'chief', 'chil', 'child', 'childr', 'chin', 'chines', 'chip', 'cho', 'chocol', 'choir', 'chok', 'chong', 'choos', 'chop', 'choppy', 'chor', 'choreograph', 'chos', 'chris', 'christ', 'christian', 'christianity', 'christians', 'christie', 'christina', 'christine', 'christmas', 'christopher', 'christy', 'chronicles', 'chuck', 'chuckl', 'church', 'churn', 'cia', 'cigaret', 'cinderell', 'cindy', 'cinem', 'cinema', 'cinematograph', 'circ', 'circumst', 'cit', 'city', 'civil', 'cla', 'clad', 'claim', 'clair', 'clan', 'clar', 'clark', 'clash', 'class', 'classm', 'classy', 'claud', 'claustrophob', 'claw', 'clay', 'cle', 'clear', 'clerk', 'clev', 'cli', 'clich', 'click', 'cliff', 'cliffhang', 'clim', 'climact', 'climax', 'climb', 'clin', 'clint', 'clip', 'cliv', 'cloak', 'clock', 'clon', 'clooney', 'clos', 'closest', 'closet', 'closeup', 'cloth', 'cloud', 'clown', 'clu', 'club', 'clueless', 'clumsy', 'clunky', 'clut', 'co', 'coach', 'coast', 'coat', 'cod', 'cody', 'coff', 'coffin', 'coh', 'coher', 'coincid', 'cok', 'col', 'cold', 'colin', 'coll', 'collab', 'collaps', 'colleagu', 'collect', 'colleg', 'collet', 'collin', 'colm', 'colo', 'colon', 'colonel', 'colony', 'columb', 'columbo', 'com', 'comb', 'combin', 'comeback', 'comedy', 'comfort', 'command', 'commend', 'commerc', 'commit', 'common', 'commun', 'comp', 'company', 'comparison', 'compass', 'compel', 'compens', 'compet', 'competit', 'compl', 'complain', 'complaint', 'complet', 'complex', 'comply', 'compos', 'composit', 'compound', 'compr', 'comprehend', 'comprom', 'compuls', 'comput', 'con', 'conceit', 'conceiv', 'concern', 'concert', 'conclud', 'concoct', 'cond', 'condemn', 'condit', 'conduc', 'conf', 'confess', 'confid', 'confin', 'confirm', 'conflict', 'confront', 'confus', 'congrat', 'connect', 'connery', 'conqu', 'conrad', 'conscy', 'consequ', 'conserv', 'consid', 'consist', 'conspir', 'const', 'constitut', 'construct', 'consum', 'cont', 'contact', 'contain', 'contemp', 'contempl', 'contempt', 'contend', 'contest', 'context', 'contin', 'continu', 'contract', 'contradict', 'contrast', 'contribut', 'control', 'controvers', 'controversy', 'conv', 'conveny', 'convers', 'convert', 'convey', 'convict', 'convint', 'convolv', 'cook', 'cooky', 'cool', 'coop', 'cop', 'cor', 'corbet', 'corey', 'corm', 'corn', 'corny', 'corp', 'corps', 'correct', 'corrid', 'corrupt', 'cost', 'costum', 'couch', 'could', 'counsel', 'count', 'counterpart', 'countless', 'country', 'countrysid', 'county', 'coup', 'coupl', 'cour', 'cours', 'court', 'courtroom', 'cousin', 'cov', 'cow', 'coward', 'cowboy', 'cox', 'crack', 'craft', 'craig', 'cram', 'crap', 'crappy', 'crash', 'crav', 'crawford', 'crawl', 'craz', 'crazy', 'cre', 'cream', 'creasy', 'cred', 'credit', 'creek', 'creep', 'creepy', 'crew', 'cri', 'crim', 'crimin', 'cring', 'crippl', 'cris', 'crisp', 'crit', 'crocodil', 'crook', 'crop', 'crosby', 'cross', 'crow', 'crowd', 'crown', 'cru', 'cruc', 'crud', 'cruel', 'crush', 'cry', 'crypt', 'cryst', 'cub', 'cue', 'culmin', 'cult', 'cum', 'cunningham', 'cup', 'cur', 'curios', 'curs', 'curt', 'curtain', 'cury', 'cusack', 'cush', 'custom', 'cut', 'cyborg', 'cyc', 'cyn', 'cyph', 'da', 'dad', 'daddy', 'dahl', 'dahm', 'dai', 'daisy', 'dal', 'dalton', 'dam', 'damn', 'damon', 'dan', 'dandy', 'dang', 'daniel', 'danny', 'dant', 'dar', 'dark', 'darl', 'darn', 'dash', 'dat', 'daught', 'dav', 'david', 'davy', 'dawn', 'dawson', 'day', 'daylight', 'dazzl', 'de', 'dea', 'dead', 'deaf', 'deal', 'dealt', 'dean', 'deann', 'dear', 'death', 'deb', 'debby', 'debr', 'debt', 'debut', 'dec', 'decad', 'decapit', 'deceas', 'deceiv', 'decid', 'deck', 'decl', 'declin', 'ded', 'dee', 'deem', 'deep', 'deeply', 'deer', 'def', 'defend', 'defens', 'defin', 'definit', 'defy', 'deg', 'degr', 'degrad', 'del', 'delay', 'delet', 'delib', 'delicy', 'delight', 'deliry', 'delivery', 'delud', 'delv', 'dem', 'demand', 'demil', 'democr', 'demon', 'demonst', 'den', 'deniro', 'denou', 'dent', 'deny', 'denzel', 'dep', 'depart', 'depend', 'depict', 'deprav', 'depress', 'depth', 'deputy', 'der', 'derang', 'derek', 'des', 'desc', 'descend', 'describ', 'desert', 'deserv', 'design', 'desir', 'desp', 'despair', 'despit', 'destin', 'destiny', 'destroy', 'destruct', 'det', 'detach', 'detail', 'detect', 'determin', 'detery', 'detract', 'detroit', 'dev', 'devast', 'develop', 'devil', 'devo', 'devoid', 'devot', 'devy', 'dialog', 'diamond', 'dian', 'diary', 'dick', 'dict', 'did', 'die', 'died', 'diff', 'difficul', 'difficult', 'dig', 'digest', 'digit', 'dign', 'dil', 'dilemm', 'dim', 'dimend', 'dimin', 'din', 'dinosa', 'dir', 'direct', 'dirt', 'dirty', 'dis', 'disagr', 'disappear', 'disappoint', 'disast', 'disbeliev', 'disc', 'discern', 'disciplin', 'disco', 'discov', 'discovery', 'discuss', 'diseas', 'disgrac', 'disgu', 'disgust', 'dish', 'disjoint', 'dislik', 'dism', 'dismiss', 'disney', 'disord', 'dispatch', 'display', 'dispos', 'disregard', 'disrespect', 'dissolv', 'dist', 'distinct', 'distort', 'distract', 'distress', 'distribut', 'district', 'disturb', 'div', 'divers', 'divert', 'divid', 'divin', 'divorc', 'dixon', 'dj', 'do', 'doc', 'doct', 'docu', 'dodg', 'dog', 'dogm', 'dol', 'doll', 'dolph', 'dom', 'domest', 'domin', 'domino', 'don', 'donald', 'donn', 'dont', 'doo', 'doom', 'door', 'dor', 'dorothy', 'dos', 'dot', 'doubl', 'doubt', 'dougla', 'down', 'downey', 'downhil', 'download', 'downright', 'doyl', 'doz', 'dr', 'drab', 'dracul', 'draft', 'drag', 'dragon', 'drain', 'drak', 'dram', 'drama', 'draw', 'drawn', 'dre', 'dread', 'dream', 'dreamy', 'dreck', 'dress', 'drew', 'drift', 'dril', 'drink', 'drip', 'driv', 'drivel', 'dron', 'drop', 'drov', 'drown', 'drug', 'drum', 'drunk', 'dry', 'du', 'dub', 'duby', 'duck', 'dud', 'dudley', 'due', 'duel', 'duh', 'duk', 'dul', 'dumb', 'dumbest', 'dump', 'dun', 'duo', 'dur', 'dust', 'dustin', 'dutch', 'duty', 'duval', 'dvd', 'dvds', 'dwarf', 'dwel', 'dying', 'dyl', 'dynam', 'dysfunct', 'eag', 'eagl', 'ear', 'earl', 'earn', 'earnest', 'eas', 'east', 'eastern', 'eastwood', 'easy', 'eat', 'ebert', 'ecc', 'echo', 'econom', 'ed', 'eddy', 'edg', 'edgy', 'edi', 'edison', 'edit', 'educ', 'edward', 'edy', 'eery', 'effect', 'efficy', 'effort', 'effortless', 'eg', 'ego', 'egypt', 'eight', 'eighty', 'einstein', 'eith', 'el', 'elab', 'eld', 'elect', 'electron', 'eleg', 'eleph', 'elev', 'elimin', 'elit', 'elizabe', 'elliot', 'elm', 'els', 'else', 'elsewh', 'elud', 'elv', 'elvir', 'em', 'embark', 'embarrass', 'embody', 'embrac', 'emerg', 'emil', 'emm', 'emot', 'emp', 'empath', 'empathy', 'emphas', 'empir', 'employ', 'empty', 'emy', 'en', 'ench', 'enco', 'encount', 'end', 'endear', 'ending', 'endless', 'enemy', 'energet', 'energy', 'enforc', 'eng', 'engin', 'engl', 'england', 'engross', 'enh', 'enigm', 'enjoy', 'enl', 'enlight', 'enorm', 'enough', 'ens', 'ensembl', 'ensu', 'ent', 'enterpr', 'entertain', 'enthral', 'enthusiasm', 'enthusiast', 'entir', 'entitl', 'entry', 'environ', 'envy', 'ep', 'episod', 'epitom', 'eq', 'equ', 'equip', 'er', 'eras', 'erik', 'erot', 'errol', 'escap', 'esp', 'espec', 'esquir', 'ess', 'est', 'esth', 'estrang', 'et', 'etc', 'etern', 'eth', 'ethn', 'eug', 'europ', 'ev', 'evalu', 'evelyn', 'ever', 'every', 'everybody', 'everyday', 'everyon', 'everyth', 'everywh', 'evid', 'evil', 'evok', 'evolv', 'ex', 'exact', 'exag', 'examin', 'exampl', 'exceiv', 'excel', 'excess', 'exchang', 'excit', 'exclud', 'excrucy', 'excus', 'execut', 'exempl', 'exerc', 'exhaust', 'exhibit', 'exit', 'exorc', 'exot', 'expand', 'expect', 'expedit', 'expend', 'expens', 'expert', 'expery', 'expl', 'explain', 'explicit', 'explod', 'exploit', 'expos', 'exposit', 'express', 'exquisit', 'ext', 'extend', 'extery', 'extinct', 'extr', 'extra', 'extraordin', 'extrem', 'ey', 'eyebrow', 'eyr', 'fab', 'fabl', 'fabr', 'fac', 'facil', 'fact', 'fad', 'fai', 'fail', 'faint', 'fair', 'fairbank', 'fairy', 'faith', 'fak', 'fal', 'falk', 'fallon', 'fals', 'fam', 'famili', 'famy', 'fan', 'fant', 'fantast', 'fantasy', 'far', 'farc', 'farm', 'farrel', 'fart', 'fasc', 'fascin', 'fash', 'fassbind', 'fast', 'fat', 'fath', 'fault', 'fav', 'favo', 'favorit', 'favourit', 'fay', 'fbi', 'fear', 'feast', 'feat', 'fed', 'fee', 'feebl', 'feel', 'feet', 'feinston', 'fel', 'felix', 'fellin', 'fellow', 'felt', 'fem', 'femin', 'feminin', 'fent', 'fer', 'ferrel', 'fest', 'fet', 'fetch', 'fev', 'fi', 'fiant', 'fict', 'fido', 'field', 'fiend', 'fierc', 'fif', 'fifteen', 'fifty', 'fig', 'fight', 'fil', 'film', 'filmmak', 'filt', 'filthy', 'fin', 'find', 'finest', 'fing', 'finney', 'fir', 'firm', 'first', 'fish', 'fishburn', 'fist', 'fit', 'fiv', 'fix', 'flag', 'flair', 'flam', 'flash', 'flashback', 'flashy', 'flat', 'flav', 'flaw', 'flawless', 'fle', 'fleet', 'flem', 'flesh', 'fli', 'flick', 'flight', 'flimsy', 'flip', 'flirt', 'flo', 'flock', 'flood', 'flop', 'flor', 'florid', 'flow', 'fluff', 'fluid', 'fly', 'flyn', 'foc', 'focus', 'fog', 'foil', 'folk', 'follow', 'fond', 'fontain', 'food', 'fool', 'foot', 'footbal', 'for', 'forbid', 'forc', 'ford', 'foreign', 'foremost', 'forest', 'forev', 'forg', 'forget', 'forgot', 'form', 'formul', 'formula', 'forrest', 'fort', 'fortun', 'forty', 'forward', 'fost', 'fought', 'foul', 'found', 'four', 'fox', 'foxx', 'frag', 'fragil', 'frail', 'fram', 'franch', 'francisco', 'franco', 'frank', 'frankenstein', 'franklin', 'franky', 'frant', 'fraud', 'fre', 'freak', 'freaky', 'fred', 'freddy', 'freedom', 'freem', 'freez', 'french', 'frenzy', 'frequ', 'fresh', 'fri', 'friday', 'friend', 'fright', 'frog', 'from', 'front', 'fronty', 'frost', 'froz', 'fruit', 'frust', 'fry', 'fu', 'fuel', 'ful', 'fulc', 'fulfil', 'fun', 'funct', 'fund', 'funda', 'funniest', 'funny', 'furnit', 'furtherm', 'fury', 'fut', 'fuzzy', 'fx', 'gabl', 'gabriel', 'gadget', 'gag', 'gain', 'gal', 'galactic', 'galaxy', 'gallery', 'gam', 'gambl', 'gamer', 'gandh', 'gang', 'gangst', 'gap', 'gar', 'garb', 'garbo', 'gard', 'garland', 'garn', 'gary', 'gas', 'gasp', 'gat', 'gath', 'gav', 'gay', 'gaz', 'gear', 'geek', 'gem', 'gen', 'gend', 'genet', 'geni', 'genius', 'genr', 'gentl', 'gentlem', 'genuin', 'geny', 'georg', 'ger', 'gerard', 'germ', 'germany', 'gershwin', 'gest', 'get', 'ghetto', 'ghost', 'giallo', 'giant', 'gibson', 'gielgud', 'gift', 'gig', 'giggl', 'gil', 'gilbert', 'gilliam', 'gimmick', 'gin', 'ging', 'giovann', 'girl', 'girlfriend', 'giv', 'glad', 'glady', 'glam', 'glant', 'glar', 'glass', 'gle', 'glen', 'glimps', 'glob', 'gloom', 'glor', 'glory', 'glov', 'glow', 'glu', 'go', 'goal', 'goat', 'god', 'godard', 'godfath', 'godzill', 'goe', 'goer', 'going', 'gold', 'goldberg', 'goldbl', 'goldsworthy', 'goldy', 'gon', 'gonn', 'good', 'goodby', 'goodm', 'goody', 'goof', 'goofy', 'gor', 'gordon', 'gorg', 'gory', 'gosh', 'got', 'goth', 'gott', 'govern', 'govind', 'grab', 'grac', 'grad', 'gradu', 'graham', 'grainy', 'gram', 'grand', 'grandfath', 'grandm', 'grandmoth', 'grandp', 'grant', 'graph', 'grasp', 'grass', 'grat', 'gratuit', 'grav', 'graveyard', 'gray', 'grayson', 'gre', 'great', 'greatest', 'gree', 'greedy', 'greek', 'green', 'greet', 'greg', 'grew', 'grey', 'grief', 'griev', 'griffi', 'grim', 'grin', 'grinch', 'grind', 'grip', 'gritty', 'gro', 'gross', 'grotesqu', 'ground', 'group', 'grow', 'grown', 'grudg', 'gruesom', 'guar', 'guarantee', 'guard', 'guess', 'guest', 'guid', 'guil', 'guilt', 'guin', 'guine', 'guit', 'gum', 'gun', 'gundam', 'gunfight', 'gung', 'gut', 'guy', 'gwyne', 'gypo', 'gypsy', 'ha', 'habit', 'hack', 'hackm', 'hackney', 'hadley', 'hag', 'hail', 'hain', 'hair', 'hal', 'half', 'halfway', 'hallmark', 'halloween', 'hallucin', 'ham', 'hamilton', 'hamlet', 'hammy', 'han', 'hand', 'handicap', 'handl', 'handsom', 'hang', 'hank', 'hannah', 'hap', 'hapless', 'happy', 'har', 'harass', 'harb', 'hard', 'hardc', 'hardy', 'hark', 'harlow', 'harm', 'harmless', 'harold', 'harp', 'harriet', 'harrison', 'harrow', 'harry', 'harsh', 'hart', 'hartley', 'harvey', 'hat', 'hatch', 'haunt', 'havoc', 'hawk', 'hawn', 'hay', 'haywor', 'hbo', 'hea', 'head', 'headach', 'heal', 'healthy', 'heap', 'hear', 'heard', 'heart', 'heartbreak', 'heartfelt', 'heartwarm', 'heat', 'heav', 'heavy', 'heck', 'hect', 'heel', 'height', 'heist', 'hel', 'held', 'helicopt', 'hello', 'helm', 'helmet', 'help', 'helpless', 'henchm', 'henry', 'hent', 'hepburn', 'her', 'herbert', 'herd', 'here', 'herm', 'hero', 'heroin', 'hesit', 'heston', 'hey', 'hi', 'hick', 'hid', 'high', 'highest', 'highlight', 'highway', 'hil', 'him', 'hind', 'hint', 'hip', 'hippy', 'hir', 'hist', 'hit', 'hitch', 'hitchcock', 'hitl', 'hk', 'hmmm', 'ho', 'hoffm', 'hog', 'hokey', 'hol', 'hold', 'holiday', 'hollow', 'hollywood', 'holm', 'holocaust', 'holy', 'hom', 'homeless', 'homicid', 'homosex', 'hon', 'honest', 'honesty', 'hong', 'hono', 'hood', 'hook', 'hoop', 'hoot', 'hop', 'hopeless', 'hopkin', 'hor', 'horn', 'horny', 'horr', 'horrend', 'horrid', 'hors', 'hospit', 'host', 'hostel', 'hostil', 'hot', 'hotel', 'hound', 'hour', 'hous', 'household', 'housew', 'how', 'howard', 'howev', 'howl', 'http', 'hudson', 'hug', 'hugh', 'huh', 'hulk', 'hum', 'humbl', 'humo', 'humy', 'hundr', 'hung', 'hungry', 'hunt', 'hurry', 'hurt', 'husband', 'hustl', 'huston', 'hybrid', 'hyd', 'hyp', 'hypnot', 'hyst', 'ian', 'ic', 'icon', 'id', 'ide', 'idea', 'ident', 'ideolog', 'idiot', 'idol', 'ie', 'if', 'ign', 'ii', 'il', 'illeg', 'illog', 'illud', 'illust', 'im', 'imagery', 'imagin', 'imdb', 'imit', 'immedy', 'immens', 'immers', 'immigr', 'immort', 'imo', 'imp', 'impact', 'impecc', 'imperson', 'impl', 'implaus', 'imply', 'import', 'impos', 'imposs', 'impress', 'imprison', 'improb', 'improv', 'impuls', 'in', 'inacc', 'inadvert', 'inappropry', 'incap', 'incarn', 'incest', 'inch', 'incid', 'inclin', 'includ', 'incoh', 'incompet', 'incomprehens', 'inconsist', 'incorp', 'incorrect', 'increas', 'incred', 'ind', 'indee', 'independ', 'indian', 'indiff', 'individ', 'induc', 'indulg', 'indust', 'industry', 'indy', 'inept', 'inevit', 'inexpery', 'inexpl', 'inf', 'infam', 'infect', 'infery', 'infinit', 'inflict', 'influ', 'info', 'inform', 'ing', 'ingeny', 'ingredy', 'ingrid', 'inh', 'inhabit', 'inherit', 'init', 'inject', 'injury', 'injust', 'inm', 'innoc', 'innov', 'innuendo', 'ins', 'insec', 'insect', 'insert', 'insid', 'insight', 'insign', 'insipid', 'insist', 'insomn', 'inspect', 'inspir', 'inst', 'instal', 'instead', 'instinct', 'institut', 'instru', 'instruct', 'insult', 'int', 'intact', 'integr', 'intellect', 'intellig', 'intend', 'intens', 'interact', 'interest', 'interf', 'intern', 'internet', 'interpret', 'interrupt', 'intertwin', 'interv', 'interview', 'intery', 'intim', 'intol', 'intrigu', 'intro', 'introduc', 'intrud', 'inv', 'invad', 'invas', 'invest', 'investig', 'invis', 'invit', 'involv', 'iq', 'ir', 'iraq', 'ireland', 'iron', 'irony', 'irrelev', 'irrit', 'is', 'isabel', 'ish', 'islam', 'island', 'isol', 'israel', 'issu', 'it', 'ita', 'item', 'iturb', 'iv', 'jack', 'jacket', 'jackson', 'jacky', 'jacob', 'jacqu', 'jad', 'jaff', 'jag', 'jail', 'jak', 'jam', 'jamy', 'jan', 'jap', 'japanes', 'jar', 'jason', 'jaw', 'jay', 'jazz', 'jeal', 'jealousy', 'jean', 'jed', 'jeff', 'jeffrey', 'jen', 'jenn', 'jenny', 'jeremy', 'jerk', 'jerry', 'jersey', 'jes', 'jess', 'jessic', 'jet', 'jew', 'jewel', 'jil', 'jim', 'jimmy', 'joan', 'job', 'jock', 'jody', 'joe', 'joel', 'joey', 'johansson', 'john', 'johnny', 'johnson', 'join', 'joint', 'jok', 'joly', 'jon', 'jonath', 'jord', 'jos', 'joseph', 'josh', 'journ', 'journey', 'jov', 'joy', 'jr', 'juan', 'jud', 'judg', 'judy', 'juic', 'jul', 'juliet', 'july', 'jump', 'jun', 'jungl', 'junk', 'juny', 'jury', 'just', 'justin', 'juvenil', 'kan', 'kansa', 'kapo', 'kar', 'karl', 'karloff', 'kat', 'kathleen', 'kathryn', 'kathy', 'katy', 'kay', 'kaz', 'keaton', 'keen', 'keep', 'kei', 'kel', 'ken', 'kenne', 'kennedy', 'kent', 'kept', 'kevin', 'key', 'khan', 'kick', 'kid', 'kiddy', 'kidm', 'kidnap', 'kil', 'kim', 'kind', 'king', 'kingdom', 'kinnear', 'kirk', 'kiss', 'kit', 'kitch', 'kitty', 'klin', 'kne', 'knew', 'knif', 'knight', 'knightley', 'knock', 'know', 'knowledg', 'known', 'kolchak', 'kong', 'kor', 'kore', 'kri', 'kubrick', 'kudo', 'kum', 'kung', 'kurosaw', 'kurt', 'kyl', 'la', 'lab', 'label', 'lac', 'lack', 'lacklust', 'lad', 'lady', 'laid', 'lak', 'lam', 'lamb', 'lampoon', 'lan', 'land', 'landmark', 'landscap', 'lang', 'langu', 'lant', 'laput', 'lar', 'larg', 'larry', 'las', 'last', 'lat', 'latest', 'latin', 'latino', 'laugh', 'laught', 'launch', 'laur', 'laurel', 'laury', 'lav', 'law', 'lawr', 'lawy', 'lay', 'lazy', 'le', 'lead', 'leagu', 'lean', 'leap', 'learn', 'least', 'leath', 'leav', 'lect', 'led', 'lee', 'left', 'leg', 'legend', 'legitim', 'leigh', 'lemmon', 'len', 'lend', 'leng', 'lengthy', 'lennon', 'leo', 'leon', 'leonard', 'les', 'lesb', 'less', 'lesson', 'lest', 'let', 'leth', 'lev', 'level', 'lew', 'lex', 'li', 'liam', 'lib', 'liberty', 'libr', 'licens', 'lie', 'lif', 'life', 'lifeless', 'lifestyl', 'lifetim', 'lift', 'light', 'lightn', 'lik', 'likew', 'lil', 'lily', 'limb', 'limit', 'lin', 'lincoln', 'lind', 'lindsay', 'linear', 'ling', 'link', 'lion', 'lionel', 'lip', 'lis', 'list', 'lit', 'littl', 'liu', 'liv', 'liz', 'lizard', 'lloyd', 'load', 'loath', 'loc', 'lock', 'log', 'loi', 'lol', 'lon', 'london', 'long', 'longest', 'longor', 'look', 'loom', 'loop', 'loos', 'lor', 'lord', 'lorett', 'los', 'loss', 'lost', 'lot', 'lou', 'loud', 'lousy', 'lov', 'love', 'low', 'lowest', 'loy', 'loyal', 'luc', 'luca', 'lucil', 'luck', 'lucky', 'lucy', 'ludicr', 'lugos', 'lui', 'luk', 'luka', 'lumet', 'lun', 'lunch', 'lundgr', 'lung', 'lur', 'lurk', 'lush', 'lust', 'luth', 'luxury', 'lying', 'lynch', 'lyr', 'mabel', 'mac', 'macabr', 'macarth', 'macdonald', 'machin', 'macho', 'macy', 'mad', 'made', 'madm', 'madonn', 'mads', 'mae', 'maf', 'mag', 'magazin', 'maggy', 'magn', 'maid', 'mail', 'main', 'mainstream', 'maintain', 'maj', 'mak', 'makeup', 'mal', 'malon', 'mam', 'man', 'mand', 'mandy', 'mang', 'manhat', 'maniac', 'manifest', 'manip', 'mankind', 'manufact', 'many', 'map', 'mar', 'marc', 'march', 'margaret', 'margin', 'marilyn', 'marin', 'mario', 'mark', 'market', 'marl', 'marlon', 'marqu', 'marry', 'marsh', 'marshal', 'mart', 'marth', 'martin', 'marty', 'marvel', 'marx', 'mary', 'mask', 'masoch', 'mason', 'mass', 'massacr', 'mast', 'masterpiec', 'masterson', 'masturb', 'mat', 'match', 'mathieu', 'matrix', 'matthau', 'matthew', 'maureen', 'max', 'maxim', 'may', 'mayb', 'mayhem', 'mccarthy', 'mccoy', 'mclaglen', 'mcqueen', 'me', 'meadow', 'meal', 'mean', 'meand', 'meaningless', 'meant', 'meantim', 'meanwhil', 'meas', 'meat', 'mech', 'med', 'medicin', 'mediev', 'mediocr', 'medit', 'meek', 'meet', 'meg', 'mel', 'meliss', 'melodram', 'melody', 'melt', 'melvyn', 'mem', 'memb', 'men', 'menac', 'ment', 'mer', 'merc', 'merciless', 'mercy', 'merit', 'mermaid', 'merry', 'meryl', 'mesm', 'mess', 'messy', 'met', 'metaph', 'method', 'mex', 'mexico', 'mey', 'mgm', 'miam', 'mic', 'mich', 'michael', 'michel', 'mick', 'mickey', 'mid', 'middl', 'midget', 'midl', 'midnight', 'midst', 'might', 'mighty', 'miik', 'mik', 'mil', 'mild', 'mildr', 'milit', 'milk', 'millionair', 'milo', 'mim', 'min', 'mind', 'mindless', 'minim', 'minisery', 'minnell', 'minut', 'mir', 'mirac', 'mirand', 'mis', 'miscast', 'misery', 'misfit', 'misfortun', 'misguid', 'mislead', 'miss', 'missil', 'mist', 'mistak', 'mistress', 'misunderstand', 'misunderstood', 'mitch', 'mitchel', 'mix', 'mixt', 'miyazak', 'mm', 'mob', 'mobl', 'mobst', 'mock', 'mod', 'model', 'modern', 'modest', 'modesty', 'moe', 'mol', 'molest', 'mom', 'moment', 'mon', 'money', 'monit', 'monk', 'monkey', 'monolog', 'monoton', 'monst', 'mont', 'montan', 'month', 'monty', 'monu', 'mood', 'moody', 'moon', 'moor', 'mor', 'morbid', 'more', 'moreov', 'morg', 'mormon', 'morn', 'moron', 'mort', 'moss', 'most', 'mot', 'moth', 'motorcyc', 'mou', 'mount', 'mountain', 'mourn', 'mous', 'mouth', 'mov', 'movie', 'movies', 'movy', 'mr', 'mrs', 'ms', 'mst', 'mtv', 'much', 'muddl', 'mug', 'mult', 'multipl', 'mum', 'mummy', 'mund', 'muppet', 'murd', 'murky', 'murph', 'murray', 'mus', 'musc', 'muse', 'muslim', 'must', 'mut', 'mutil', 'myer', 'myrtl', 'myst', 'mystery', 'myth', 'mytholog', 'nad', 'nail', 'naiv', 'nak', 'nam', 'nant', 'nar', 'narrow', 'naschy', 'nasty', 'nat', 'nata', 'natal', 'nath', 'naughty', 'naus', 'navy', 'naz', 'nbc', 'nd', 'near', 'nearby', 'neat', 'necess', 'neck', 'ned', 'nee', 'needless', 'neg', 'neglect', 'neighb', 'neighbo', 'neil', 'neith', 'nelson', 'nemes', 'neo', 'nephew', 'nerd', 'nerv', 'net', 'netflix', 'network', 'neurot', 'neut', 'nev', 'nevertheless', 'new', 'newcom', 'newm', 'newspap', 'next', 'nic', 'nichola', 'nicholson', 'nick', 'nicol', 'nicola', 'niec', 'night', 'nightclub', 'nightm', 'nin', 'ninj', 'niro', 'niv', 'no', 'nobl', 'nobody', 'nod', 'noir', 'nois', 'nol', 'nolt', 'nomin', 'non', 'nonetheless', 'nonsens', 'nop', 'nor', 'norm', 'northam', 'northern', 'nos', 'nostalg', 'not', 'notch', 'noteworthy', 'noth', 'novak', 'novel', 'now', 'nowaday', 'nowh', 'nuant', 'nuclear', 'nud', 'num', 'numb', 'nun', 'nurs', 'nut', 'ny', 'nyc', 'object', 'oblig', 'obnoxy', 'obsc', 'observ', 'obsess', 'obstac', 'obtain', 'obvy', 'oc', 'occ', 'occas', 'occult', 'occup', 'occupy', 'occur', 'octob', 'od', 'odyssey', 'off', 'offb', 'offend', 'oft', 'oh', 'oil', 'ok', 'okay', 'ol', 'old', 'oldest', 'oliv', 'olivy', 'olymp', 'om', 'omin', 'omit', 'on', 'one', 'onlin', 'onto', 'op', 'oper', 'opin', 'oppon', 'opportun', 'oppos', 'opposit', 'oppress', 'opt', 'optim', 'or', 'orang', 'orchest', 'ord', 'ordin', 'org', 'orgy', 'origin', 'orl', 'orph', 'orson', 'ory', 'osc', 'oth', 'othello', 'otherw', 'otto', 'ought', 'out', 'outcom', 'outd', 'outdo', 'outfit', 'outland', 'outlaw', 'outlin', 'outright', 'outsid', 'outstand', 'ov', 'over', 'overact', 'overal', 'overblown', 'overboard', 'overcom', 'overdon', 'overlong', 'overlook', 'overshadow', 'overt', 'overwhelm', 'ow', 'owl', 'own', 'oz', 'pac', 'pacino', 'pack', 'pad', 'pag', 'paid', 'pain', 'paint', 'pair', 'pal', 'palac', 'palestin', 'palm', 'paltrow', 'pamel', 'pan', 'pant', 'pap', 'par', 'parad', 'parallel', 'paramount', 'parano', 'paranoid', 'park', 'parody', 'parrot', 'parson', 'part', 'particip', 'particul', 'partn', 'party', 'pass', 'passeng', 'past', 'pat', 'patch', 'path', 'pathet', 'patho', 'patric', 'patrick', 'patriot', 'patron', 'pattern', 'paty', 'pau', 'paul', 'paus', 'paxton', 'pay', 'paycheck', 'payoff', 'pc', 'peac', 'peak', 'pearl', 'peck', 'peculi', 'pedest', 'pee', 'peer', 'peg', 'pen', 'penelop', 'penguin', 'penny', 'peopl', 'people', 'pep', 'per', 'perc', 'perceiv', 'perfect', 'perform', 'perhap', 'peril', 'period', 'perm', 'permit', 'perpet', 'perry', 'person', 'perspect', 'persuad', 'pervers', 'pervert', 'pet', 'petty', 'pfeiff', 'pg', 'phantasm', 'phantom', 'phas', 'phenom', 'phenomenon', 'phil', 'philip', 'phillip', 'philosoph', 'phoenix', 'phon', 'phony', 'photo', 'photograph', 'phrase', 'phys', 'piano', 'pick', 'pickford', 'pict', 'pie', 'piec', 'pier', 'pierc', 'pig', 'pil', 'pilot', 'pin', 'pink', 'pion', 'pip', 'pir', 'pistol', 'pit', 'pitch', 'pity', 'pivot', 'pix', 'plac', 'place', 'plagu', 'plain', 'plan', 'planet', 'plant', 'plast', 'plat', 'platform', 'plaus', 'play', 'playboy', 'playwright', 'pleas', 'please', 'plenty', 'plight', 'plod', 'plot', 'plu', 'plug', 'plum', 'pocket', 'poe', 'poem', 'poet', 'poetry', 'poign', 'point', 'pointless', 'poison', 'pok', 'pokemon', 'pol', 'polansk', 'policem', 'policy', 'polit', 'pond', 'pool', 'poor', 'pop', 'popcorn', 'popul', 'porn', 'porno', 'pornograph', 'port', 'portrait', 'portray', 'pos', 'posey', 'posit', 'poss', 'possess', 'post', 'pot', 'pound', 'pour', 'poverty', 'pow', 'powel', 'pra', 'pract', 'prank', 'pray', 'pre', 'preach', 'preachy', 'prec', 'precy', 'pred', 'predecess', 'predict', 'pref', 'prefer', 'pregn', 'prejud', 'prem', 'premy', 'prep', 'prepost', 'prequel', 'pres', 'preserv', 'presid', 'press', 'preston', 'presum', 'pretend', 'pretenty', 'pretty', 'prev', 'prevail', 'preview', 'prevy', 'prey', 'pri', 'pric', 'priceless', 'prid', 'priest', 'prim', 'primit', 'princess', 'princip', 'principl', 'print', 'prison', 'priv', 'privileg', 'priz', 'pro', 'prob', 'problem', 'proc', 'process', 'proclaim', 'produc', 'prof', 'profess', 'profil', 'profit', 'profound', 'program', 'progress', 'project', 'prolog', 'prom', 'promin', 'promot', 'prompt', 'pronount', 'proof', 'prop', 'propagand', 'property', 'prophecy', 'prophet', 'proport', 'propos', 'prosecut', 'prospect', 'prostitut', 'protagon', 'protect', 'protest', 'proud', 'prov', 'provid', 'provoc', 'provok', 'ps', 'pseudo', 'psych', 'psycho', 'psycholog', 'psychopa', 'psychot', 'psychy', 'pub', 'publ', 'puerto', 'pul', 'pulp', 'pumba', 'pump', 'pun', 'punch', 'punk', 'puppet', 'puppy', 'pur', 'purchas', 'purpl', 'purpos', 'pursu', 'pursuit', 'push', 'put', 'puzzl', 'python', 'quaid', 'qual', 'quant', 'quart', 'quas', 'queen', 'quentin', 'quest', 'quick', 'quiet', 'quin', 'quintess', 'quirky', 'quit', 'quot', 'rabbit', 'rac', 'rachel', 'rack', 'rad', 'radio', 'rady', 'rag', 'raid', 'rail', 'rain', 'rainy', 'rais', 'raj', 'ralph', 'ram', 'rambl', 'rambo', 'ramon', 'ramp', 'ran', 'ranch', 'randolph', 'random', 'randy', 'rang', 'rank', 'rant', 'rao', 'rap', 'rapid', 'rapt', 'rar', 'rat', 'rath', 'ratso', 'rav', 'raw', 'ray', 'raymond', 'raz', 'rd', 'rea', 'reach', 'react', 'read', 'ready', 'real', 'really', 'realm', 'rear', 'reason', 'rebel', 'rec', 'recal', 'receiv', 'recit', 'reckless', 'recogn', 'recognit', 'recommend', 'record', 'recov', 'recr', 'recruit', 'recyc', 'red', 'redeem', 'redempt', 'redneck', 'reduc', 'redund', 'ree', 'reel', 'reev', 'ref', 'refer', 'reflect', 'refresh', 'refug', 'refus', 'reg', 'regain', 'regard', 'regardless', 'regim', 'regret', 'regul', 'rehash', 'rehears', 'reid', 'reign', 'reincarn', 'reinforc', 'reis', 'reject', 'rel', 'relax', 'releas', 'relentless', 'relev', 'reliev', 'relig', 'religy', 'reluct', 'rely', 'remad', 'remain', 'remak', 'remark', 'rememb', 'remind', 'reminisc', 'remot', 'remov', 'ren', 'renaiss', 'rend', 'rendit', 'rent', 'rep', 'repetit', 'replac', 'replay', 'reply', 'report', 'repr', 'repres', 'repress', 'republ', 'repuls', 'reput', 'request', 'requir', 'rerun', 'res', 'rescu', 'research', 'resembl', 'reserv', 'resid', 'resist', 'resolv', 'reson', 'resort', 'resourc', 'respect', 'respond', 'respons', 'rest', 'resta', 'restrain', 'restraint', 'restrict', 'result', 'resum', 'resurrect', 'ret', 'retain', 'retard', 'retir', 'retriev', 'retrospect', 'return', 'reun', 'reunit', 'rev', 'revel', 'reveng', 'revers', 'review', 'revisit', 'revolt', 'revolv', 'reward', 'rewrit', 'rex', 'reynold', 'rhym', 'rhythm', 'ric', 'rich', 'richard', 'richardson', 'rick', 'ricky', 'rid', 'riddl', 'ridic', 'riff', 'rifl', 'rig', 'right', 'ring', 'riot', 'rip', 'ripoff', 'ris', 'risk', 'rit', 'ritchy', 'riv', 'rivet', 'road', 'roam', 'roar', 'rob', 'robbery', 'robbin', 'robby', 'robert', 'robertson', 'robin', 'robinson', 'robot', 'rochest', 'rock', 'rocket', 'rocky', 'rod', 'rog', 'rohm', 'rol', 'rom', 'romeo', 'romero', 'romp', 'ron', 'ronald', 'ronny', 'roof', 'rooky', 'room', 'rooney', 'root', 'rop', 'ros', 'rosario', 'rosem', 'ross', 'rot', 'roth', 'rough', 'round', 'rous', 'rout', 'routin', 'row', 'rowland', 'roy', 'rub', 'ruby', 'rud', 'rug', 'ruin', 'rukh', 'rul', 'rum', 'run', 'runaway', 'rur', 'rush', 'russ', 'russel', 'ruth', 'ruthless', 'ryan', 'sabot', 'sabrin', 'sack', 'sacr', 'sad', 'saddl', 'saf', 'sag', 'said', 'sail', 'saint', 'sak', 'sal', 'salesm', 'salm', 'saloon', 'salt', 'salv', 'sam', 'samanth', 'sammo', 'samura', 'san', 'sand', 'sandl', 'sandr', 'sang', 'sant', 'sap', 'sappy', 'sar', 'sarah', 'sarandon', 'sarcasm', 'sarcast', 'sassy', 'sat', 'satir', 'satisfact', 'satisfy', 'saturday', 'sav', 'saw', 'say', 'scal', 'scan', 'scand', 'scar', 'scarc', 'scarecrow', 'scarfac', 'scariest', 'scarlet', 'scary', 'scat', 'scen', 'scenario', 'scenery', 'schedule', 'scheme', 'schlock', 'schneider', 'school', 'schools', 'sci', 'scif', 'scooby', 'scoop', 'scop', 'scor', 'scorses', 'scot', 'scotland', 'scratch', 'scream', 'screaming', 'screams', 'screen', 'screening', 'screenplay', 'screens', 'screenwriter', 'screenwriters', 'screw', 'screwball', 'screwed', 'script', 'scripted', 'scripting', 'scripts', 'scrooge', 'se', 'sea', 'seag', 'seal', 'sean', 'search', 'season', 'seat', 'sebast', 'sec', 'second', 'secret', 'sect', 'seduc', 'see', 'seedy', 'seek', 'seem', 'seen', 'seg', 'sel', 'seldom', 'select', 'self', 'sem', 'sen', 'send', 'sens', 'senseless', 'sensit', 'sent', 'sentinel', 'senty', 'seny', 'sep', 'septemb', 'sequ', 'sequel', 'ser', 'serb', 'serg', 'serv', 'sery', 'sess', 'set', 'settl', 'setup', 'sev', 'seventy', 'sew', 'sex', 'sexy', 'seymo', 'sf', 'sg', 'sgt', 'sh', 'shad', 'shadow', 'shaggy', 'shah', 'shahid', 'shak', 'shakespear', 'shaky', 'shal', 'shallow', 'sham', 'shameless', 'shangha', 'shap', 'shar', 'shark', 'sharon', 'sharp', 'shat', 'shav', 'shaw', 'she', 'shed', 'sheen', 'sheet', 'shel', 'shelf', 'shelley', 'shelt', 'shepard', 'shepherd', 'sheriff', 'shield', 'shift', 'shin', 'ship', 'shirley', 'shirt', 'sho', 'shock', 'shoddy', 'shoot', 'shootout', 'shop', 'shor', 'short', 'shortcom', 'shot', 'shotgun', 'should', 'shout', 'shov', 'show', 'showcas', 'showdown', 'shown', 'shut', 'shy', 'sibl', 'sick', 'sid', 'sidekick', 'sidewalk', 'sidney', 'sigh', 'sight', 'sign', 'sil', 'silv', 'sim', 'simil', 'simmon', 'simon', 'simpl', 'simply', 'simpson', 'simult', 'sin', 'sinatr', 'sing', 'singl', 'sink', 'sint', 'sir', 'sirk', 'sissy', 'sist', 'sit', 'sitcom', 'situ', 'six', 'sixteen', 'sixty', 'siz', 'skat', 'skept', 'sketch', 'ski', 'skil', 'skin', 'skinny', 'skip', 'skit', 'skul', 'sky', 'slack', 'slam', 'slap', 'slapstick', 'slash', 'slat', 'slaught', 'slav', 'slay', 'sleaz', 'sleazy', 'sleep', 'sleepwalk', 'slic', 'slick', 'slid', 'slight', 'slightest', 'slim', 'slimy', 'slip', 'slo', 'sloppy', 'slow', 'slug', 'slut', 'sly', 'smack', 'smal', 'smart', 'smash', 'smel', 'smi', 'smil', 'smok', 'smoo', 'smug', 'smuggl', 'snak', 'snap', 'snatch', 'sneak', 'snip', 'snl', 'snob', 'snow', 'snowm', 'snuff', 'so', 'soap', 'sob', 'soc', 'socc', 'socy', 'soderbergh', 'soft', 'sol', 'sold', 'soldy', 'solid', 'solo', 'solv', 'somebody', 'someday', 'somehow', 'someon', 'someth', 'sometim', 'somewh', 'son', 'sondr', 'song', 'sonny', 'soon', 'soph', 'soprano', 'sor', 'sorrow', 'sorry', 'sort', 'sou', 'sought', 'soul', 'sound', 'soundtrack', 'soup', 'sour', 'sourc', 'southern', 'soviet', 'sox', 'soyl', 'spac', 'spacey', 'spad', 'spaghett', 'spain', 'span', 'spar', 'spark', 'sparkl', 'spawn', 'speak', 'spear', 'spec', 'spect', 'spectac', 'spectacul', 'specy', 'spee', 'speech', 'spel', 'spend', 'spent', 'spi', 'spic', 'spid', 'spielberg', 'spik', 'spil', 'spin', 'spir', 'spirit', 'spit', 'splatter', 'splendid', 'split', 'spock', 'spoil', 'spok', 'spont', 'spoof', 'spooky', 'spoon', 'sport', 'spot', 'spotlight', 'spout', 'spread', 'spree', 'spring', 'springer', 'spy', 'squ', 'squad', 'squeez', 'st', 'stab', 'stabl', 'stack', 'stad', 'staff', 'stag', 'stair', 'stak', 'stal', 'stalk', 'stallon', 'stamp', 'stan', 'stand', 'standard', 'standout', 'stanley', 'stant', 'stanwyck', 'star', 'stardom', 'stardust', 'starg', 'stark', 'start', 'startl', 'starv', 'stat', 'statu', 'stay', 'ste', 'steady', 'steam', 'steel', 'stell', 'step', 'steph', 'stephany', 'stereotyp', 'sterl', 'stern', 'stev', 'stewart', 'stick', 'stiff', 'stil', 'stilt', 'stim', 'stink', 'stir', 'stock', 'stol', 'stomach', 'ston', 'stood', 'stoog', 'stop', 'stor', 'storm', 'story', 'storylin', 'storytel', 'straight', 'straightforward', 'stranded', 'strange', 'strangely', 'stranger', 'strangers', 'stream', 'streep', 'street', 'streets', 'streisand', 'strength', 'strengths', 'stress', 'stretch', 'stretched', 'strict', 'strictly', 'strike', 'strikes', 'striking', 'string', 'strings', 'strip', 'stroke', 'strong', 'stronger', 'strongest', 'strongly', 'struck', 'structure', 'struggle', 'struggles', 'struggling', 'stuart', 'stuck', 'stud', 'studio', 'study', 'stuff', 'stumbl', 'stun', 'stunt', 'stupid', 'styl', 'sub', 'subject', 'sublim', 'submarin', 'submit', 'subplot', 'subsequ', 'subst', 'substitut', 'subt', 'subtext', 'subtitl', 'subtl', 'suburb', 'subvert', 'subway', 'success', 'suck', 'sud', 'sue', 'suff', 'sufficy', 'sug', 'suggest', 'suicid', 'suit', 'sul', 'sum', 'summ', 'sun', 'sund', 'sunday', 'sung', 'sunk', 'sunny', 'sunr', 'sunset', 'sunshin', 'sup', 'superb', 'superbl', 'superf', 'superhero', 'superm', 'supern', 'superst', 'supery', 'supply', 'support', 'suppos', 'suppress', 'suprem', 'sur', 'surf', 'surfac', 'surgery', 'surpass', 'surpr', 'surrend', 'surround', 'surv', 'sus', 'susp', 'suspect', 'suspend', 'suspens', 'suspicy', 'sustain', 'sutherland', 'swallow', 'sway', 'swe', 'swear', 'swed', 'sweep', 'sweet', 'swept', 'swift', 'swim', 'swing', 'switch', 'sword', 'sydney', 'symbol', 'sympath', 'sympathet', 'sympathy', 'syndrom', 'synops', 'system', 'tabl', 'taboo', 'tack', 'tackl', 'tacky', 'tact', 'tad', 'tag', 'tail', 'tak', 'tal', 'talk', 'talky', 'tam', 'tang', 'tank', 'tap', 'tar', 'tarantino', 'target', 'tarz', 'task', 'tast', 'tasteless', 'tat', 'tattoo', 'taught', 'tax', 'tayl', 'tcm', 'tea', 'teach', 'team', 'tear', 'teas', 'tech', 'techn', 'technicol', 'technolog', 'ted', 'tedy', 'tee', 'teen', 'tel', 'televid', 'temp', 'templ', 'tempt', 'ten', 'tend', 'tens', 'tent', 'ter', 'term', 'termin', 'terr', 'territ', 'terry', 'test', 'testa', 'texa', 'text', 'th', 'thank', 'that', 'the', 'thelm', 'them', 'therapy', 'there', 'theref', 'thick', 'thief', 'thiev', 'thin', 'thing', 'think', 'thinking', 'third', 'thirty', 'this', 'tho', 'thoma', 'thompson', 'thorn', 'thorough', 'though', 'thought', 'thousand', 'thread', 'threat', 'threatened', 'threatening', 'threatens', 'three', 'threw', 'thrill', 'thrilled', 'thriller', 'thrillers', 'thrilling', 'thrills', 'throat', 'throughout', 'throw', 'throwing', 'thrown', 'throws', 'thru', 'thu', 'thug', 'thumb', 'thund', 'thunderbird', 'thurm', 'tick', 'ticket', 'tid', 'tie', 'tied', 'tierney', 'tig', 'tight', 'til', 'tim', 'timberlak', 'time', 'timeless', 'timmy', 'timon', 'timothy', 'tin', 'tiny', 'tip', 'tir', 'tiresom', 'tit', 'titl', 'toby', 'tod', 'today', 'toe', 'togeth', 'toilet', 'tok', 'tokyo', 'tol', 'told', 'tom', 'tomato', 'tomb', 'tome', 'tommy', 'tomorrow', 'ton', 'tongu', 'tonight', 'tony', 'too', 'took', 'tool', 'top', 'topless', 'tor', 'torch', 'torn', 'toronto', 'tort', 'toss', 'tot', 'touch', 'tough', 'tour', 'tow', 'toward', 'town', 'toy', 'trac', 'track', 'tracy', 'trad', 'trademark', 'tradit', 'traff', 'trag', 'tragedy', 'trail', 'train', 'trait', 'tramp', 'transcend', 'transf', 'transform', 'transit', 'transl', 'transmit', 'transp', 'transpl', 'transport', 'trap', 'trash', 'trashy', 'traum', 'trav', 'travel', 'travesty', 'tre', 'treas', 'trek', 'tremend', 'trend', 'tri', 'triangl', 'trib', 'tribut', 'trick', 'trig', 'trilog', 'trio', 'trip', 'tripl', 'trit', 'triumph', 'triv', 'trom', 'troop', 'troubl', 'tru', 'truck', 'trum', 'trust', 'truth', 'try', 'tub', 'tuck', 'tun', 'tunnel', 'turd', 'turk', 'turkey', 'turmoil', 'turn', 'turtl', 'tv', 'twelv', 'twenty', 'twic', 'twilight', 'twin', 'twist', 'two', 'tyl', 'typ', 'ug', 'ugh', 'uh', 'uk', 'ultim', 'ultimat', 'ultr', 'um', 'un', 'unansw', 'unattract', 'unaw', 'unbear', 'unbeliev', 'unc', 'uncanny', 'uncomfort', 'unconscy', 'unconv', 'unconvint', 'uncov', 'uncut', 'und', 'undead', 'undeny', 'under', 'underground', 'undermin', 'undernea', 'underst', 'understand', 'understood', 'undertak', 'underw', 'underwear', 'underworld', 'undoubt', 'uneasy', 'unev', 'unexpect', 'unexplain', 'unfair', 'unfold', 'unforg', 'unforget', 'unfortun', 'unfunny', 'unhappy', 'uniform', 'unimagin', 'uninspir', 'unint', 'uninterest', 'unit', 'univers', 'unknown', 'unleash', 'unless', 'unlik', 'unnecess', 'unorigin', 'unpleas', 'unpredict', 'unr', 'unravel', 'unrel', 'uns', 'unsatisfy', 'unseen', 'unsettl', 'unst', 'unsuspect', 'unsympathet', 'unus', 'unw', 'unwatch', 'unwil', 'up', 'upcom', 'upd', 'uplift', 'upon', 'upset', 'upsid', 'urb', 'urg', 'us', 'useless', 'ustinov', 'ut', 'util', 'uw', 'vac', 'vacu', 'vad', 'vagu', 'vain', 'val', 'valentin', 'valid', 'valley', 'valu', 'vampir', 'van', 'vaness', 'vanill', 'vant', 'vary', 'vast', 'vault', 'veg', 'vega', 'vehic', 'vein', 'ven', 'venezuel', 'veng', 'venom', 'vent', 'ver', 'verb', 'verdict', 'verg', 'verhoev', 'vers', 'versatil', 'vert', 'vet', 'vhs', 'via', 'vib', 'vibr', 'vic', 'vict', 'victim', 'victor', 'vicy', 'vid', 'video', 'vietnam', 'view', 'viewpoint', 'vigil', 'vignet', 'vil', 'villain', 'vint', 'viol', 'vir', 'virgin', 'virt', 'virtu', 'vis', 'viscont', 'visit', 'vit', 'viv', 'vivid', 'voc', 'voic', 'void', 'voight', 'volum', 'volunt', 'vomit', 'von', 'vonnegut', 'vot', 'voy', 'vs', 'vulg', 'vuln', 'wacky', 'wag', 'wagn', 'wait', 'waitress', 'wak', 'wal', 'walk', 'wallac', 'walsh', 'walt', 'wan', 'wand', 'wang', 'wann', 'wannab', 'want', 'war', 'ward', 'wardrob', 'warhol', 'warm', 'warn', 'warp', 'warry', 'was', 'wash', 'washington', 'wast', 'wat', 'watch', 'watson', 'wav', 'wax', 'way', 'wayn', 'weak', 'weakest', 'weal', 'wealthy', 'weapon', 'wear', 'weary', 'weath', 'weav', 'web', 'websit', 'wed', 'wee', 'week', 'weekend', 'weight', 'weird', 'wel', 'welcom', 'well', 'wendigo', 'wendy', 'went', 'werewolf', 'werewolv', 'wes', 'west', 'western', 'wet', 'whack', 'whal', 'what', 'whatev', 'whatsoev', 'wheel', 'wheelchair', 'whenev', 'wherea', 'wheth', 'whilst', 'whin', 'whiny', 'whip', 'whistl', 'whit', 'who', 'whoev', 'whol', 'wholesom', 'whoop', 'whor', 'whos', 'why', 'wick', 'wid', 'widescreen', 'widmark', 'widow', 'wield', 'wif', 'wig', 'wil', 'wild', 'william', 'wilson', 'win', 'winchest', 'wind', 'window', 'wing', 'wint', 'wip', 'wir', 'wis', 'wisdom', 'wish', 'wit', 'witch', 'witchcraft', 'within', 'without', 'witty', 'wiv', 'wizard', 'woe', 'wolf', 'wom', 'wond', 'wonderland', 'wong', 'wont', 'woo', 'wood', 'woody', 'wor', 'word', 'work', 'world', 'worm', 'worn', 'worry', 'wors', 'worst', 'worthless', 'worthwhil', 'worthy', 'would', 'wound', 'wow', 'wrap', 'wreck', 'wrench', 'wrestl', 'wretch', 'wright', 'writ', 'wrong', 'wrot', 'wtf', 'ww', 'wwe', 'wwi', 'www', 'ya', 'yank', 'yard', 'yawn', 'ye', 'yeah', 'year', 'yearn', 'years', 'yel', 'yellow', 'yep', 'yesterday', 'yet', 'yoka', 'york', 'you', 'young', 'youngest', 'youngst', 'youth', 'youtub', 'zan', 'zen', 'zero', 'zizek', 'zomb', 'zomby', 'zon', 'zoom', 'zorro']

预测结果

使用随机森林分类器进行分类

from sklearn.ensemble import RandomForestClassifier
forest = RandomForestClassifier(n_estimators = 100) 
forest = forest.fit(train_data_features, train["sentiment"])

输出提交结果

test = pd.read_csv("./data/testData.tsv", header=0, delimiter="\t",quoting=3 )
num_reviews = len(test["review"])
clean_test_reviews = [] 
for i in range(num_reviews):
clean_review = review_to_words( test["review"][i] )
clean_test_reviews.append( clean_review )
test_data_features = vectorizer.transform(clean_test_reviews)
test_data_features = test_data_features.toarray()
result = forest.predict(test_data_features)
output = pd.DataFrame( data={ 
"id":test["id"], "sentiment":result} )
output.to_csv( "Bag_of_Words_model.csv", index=False, quoting=3 )

尝试使用xgb

from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(train_data_features, train["sentiment"], test_size=0.2)
model = XGBClassifier()
eval_set = [(X_test, y_test)]
model.fit(X_train, y_train, early_stopping_rounds=10, eval_metric="logloss", eval_set=eval_set, verbose=True)
result = model.predict(test_data_features)
output = pd.DataFrame( data={ 
"id":test["id"], "sentiment":result} )
output.to_csv( "xgbBag_of_Words_model.csv", index=False, quoting=3 )
[0]	validation_0-logloss:0.676219
Will train until validation_0-logloss hasn't improved in 10 rounds.
[1]	validation_0-logloss:0.662174
[2]	validation_0-logloss:0.651057
[3]	validation_0-logloss:0.640874
[4]	validation_0-logloss:0.632644
[5]	validation_0-logloss:0.625076
[6]	validation_0-logloss:0.617619
[7]	validation_0-logloss:0.611682
[8]	validation_0-logloss:0.605249
[9]	validation_0-logloss:0.599587
[10]	validation_0-logloss:0.594965
[11]	validation_0-logloss:0.589799
[12]	validation_0-logloss:0.585117
[13]	validation_0-logloss:0.580564
[14]	validation_0-logloss:0.576377
[15]	validation_0-logloss:0.572584
[16]	validation_0-logloss:0.568511
[17]	validation_0-logloss:0.565177
[18]	validation_0-logloss:0.561793
[19]	validation_0-logloss:0.558281
[20]	validation_0-logloss:0.55503
[21]	validation_0-logloss:0.552451
[22]	validation_0-logloss:0.549323
[23]	validation_0-logloss:0.546664
[24]	validation_0-logloss:0.544006
[25]	validation_0-logloss:0.54108
[26]	validation_0-logloss:0.538433
[27]	validation_0-logloss:0.535872
[28]	validation_0-logloss:0.533465
[29]	validation_0-logloss:0.5312
[30]	validation_0-logloss:0.528723
[31]	validation_0-logloss:0.526622
[32]	validation_0-logloss:0.524268
[33]	validation_0-logloss:0.522295
[34]	validation_0-logloss:0.519956
[35]	validation_0-logloss:0.518042
[36]	validation_0-logloss:0.515848
[37]	validation_0-logloss:0.514131
[38]	validation_0-logloss:0.512278
[39]	validation_0-logloss:0.510431
[40]	validation_0-logloss:0.508723
[41]	validation_0-logloss:0.506938
[42]	validation_0-logloss:0.505074
[43]	validation_0-logloss:0.50362
[44]	validation_0-logloss:0.501969
[45]	validation_0-logloss:0.500489
[46]	validation_0-logloss:0.499067
[47]	validation_0-logloss:0.497414
[48]	validation_0-logloss:0.496192
[49]	validation_0-logloss:0.494645
[50]	validation_0-logloss:0.493216
[51]	validation_0-logloss:0.49187
[52]	validation_0-logloss:0.490369
[53]	validation_0-logloss:0.489028
[54]	validation_0-logloss:0.487349
[55]	validation_0-logloss:0.486212
[56]	validation_0-logloss:0.485081
[57]	validation_0-logloss:0.483909
[58]	validation_0-logloss:0.482761
[59]	validation_0-logloss:0.481767
[60]	validation_0-logloss:0.480625
[61]	validation_0-logloss:0.479329
[62]	validation_0-logloss:0.478402
[63]	validation_0-logloss:0.477328
[64]	validation_0-logloss:0.476377
[65]	validation_0-logloss:0.475029
[66]	validation_0-logloss:0.473751
[67]	validation_0-logloss:0.472692
[68]	validation_0-logloss:0.471596
[69]	validation_0-logloss:0.470421
[70]	validation_0-logloss:0.469413
[71]	validation_0-logloss:0.468299
[72]	validation_0-logloss:0.467431
[73]	validation_0-logloss:0.466318
[74]	validation_0-logloss:0.465558
[75]	validation_0-logloss:0.464642
[76]	validation_0-logloss:0.463728
[77]	validation_0-logloss:0.462841
[78]	validation_0-logloss:0.46207
[79]	validation_0-logloss:0.461132
[80]	validation_0-logloss:0.460134
[81]	validation_0-logloss:0.45898
[82]	validation_0-logloss:0.458173
[83]	validation_0-logloss:0.457472
[84]	validation_0-logloss:0.456591
[85]	validation_0-logloss:0.456256
[86]	validation_0-logloss:0.455629
[87]	validation_0-logloss:0.454958
[88]	validation_0-logloss:0.454081
[89]	validation_0-logloss:0.453485
[90]	validation_0-logloss:0.452779
[91]	validation_0-logloss:0.452121
[92]	validation_0-logloss:0.45126
[93]	validation_0-logloss:0.450549
[94]	validation_0-logloss:0.450048
[95]	validation_0-logloss:0.44925
[96]	validation_0-logloss:0.448478
[97]	validation_0-logloss:0.447839
[98]	validation_0-logloss:0.447183
[99]	validation_0-logloss:0.446421

还是随机森林好用

在这里插入图片描述

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 举报,一经查实,本站将立刻删除。

发布者:全栈程序员-用户IM,转载请注明出处:https://javaforall.cn/143707.html原文链接:https://javaforall.cn

【正版授权,激活自己账号】: Jetbrains全家桶Ide使用,1年售后保障,每天仅需1毛

【官方授权 正版激活】: 官方授权 正版激活 支持Jetbrains家族下所有IDE 使用个人JB账号...

(0)
blank

相关推荐

  • 查看tensorflow版本号

    查看tensorflow版本号https://blog.csdn.net/sinat_29957455/article/details/80636683

  • 清北学堂模拟赛d3t6 c

    清北学堂模拟赛d3t6 c分析:比较神奇的一道题.要把树变成环肯定要先变成链,然后把链给拼接成环.接下来考虑一个脑洞大开的树形dp:设f[i][0]表示i不与父节点相连的链数,f[i][1]表示i与父节点相连的链数,先考虑怎么

  • phpstorm 激活码 2021【在线注册码/序列号/破解码】

    phpstorm 激活码 2021【在线注册码/序列号/破解码】,https://javaforall.cn/100143.html。详细ieda激活码不妨到全栈程序员必看教程网一起来了解一下吧!

  • Maven根据Profiled读取不同配置文件

    Maven根据Profiled读取不同配置文件 前言在日常开发中,我们大多都会有开发环境(dev)、测试环境(test)、生产环境(prod),不同环境的参数肯定不一样,我们需要在打包的时候,不同环境打不同当包,如果手动改,一方面效率低,容易出错,而且每次打包都改动,非常麻烦,所以Maven给我们提供了profile的配置。 正文Mavenresourcesplugin支持明确声明&lt;directory&gt;指定…

  • 零散学习笔记(一)—-单相逆变电路设计

    零散学习笔记(一)—-单相逆变电路设计这几天帮别人设计然后画一个电路图,只设计电路图,没有具体实现功能。这题是一道电赛题,大家都知道设计一个电路简单,但是要具体实现功能可不是那么简单的。而本文章是最简单的一部分—电路部分,不涉及程序部分和调试。AC-DC电路设计电源输入电压为220V交流电压,在一般设计中只要是输入为220V交流电肯定需要将交流电转换成直流电压。一般有两个方法:模电方法:使用转换电路——整流(几个二极管组合起来把正负电压变成单向)–滤波(使波形平滑)–稳压(固定输出)–直流电压数电方法:使用DC-AC芯片进行转换(博

  • 迭代器和生成器

    迭代器可迭代的数据类型查看数据类型的所有方法可迭代协议:迭代器协议和可迭代对象判断一个数据类型是否是迭代器和可迭代对象:迭代器协议的原理1#基于迭代器协议2li=[1,2,3]

发表回复

您的电子邮箱地址不会被公开。

关注全栈程序员社区公众号