Natural Language Processing: The Bag-of-Words Model (Bag_of_words)

Tutorial link:


https://www.kaggle.com/c/word2vec-nlp-tutorial/overview/part-1-for-beginners-bag-of-words

Reading the training data

The training data consists of 25,000 movie reviews.

import pandas as pd
train = pd.read_csv("./data/labeledTrainData.tsv", header=0, delimiter="\t", quoting=3)
train.head(3)
         id  sentiment                                             review
0  "5814_8"          1  "With all this stuff going down at the moment ...
1  "2381_9"          1  "\"The Classic War of the Worlds\" by Timothy ...
2  "7759_3"          0  "The film starts with a manager (Nicholas Bell...
train.shape
(25000, 3)
example = train['review'][0]
example
'"With all this stuff going down at the moment with MJ i\'ve started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ\'s feeling towards the press and also the obvious message of drugs are bad m\'kay.<br /><br />Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.<br /><br />The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci\'s character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ\'s music.<br /><br />Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.<br /><br />Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ\'s bestest buddy in this movie is a girl! 
Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i\'ve gave this subject....hmmm well i don\'t know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."'

The review column of train contains HTML markup. To strip the HTML tags and keep only the review text, we use BeautifulSoup.

Processing with BeautifulSoup

from bs4 import BeautifulSoup
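# Create a BeautifulSoup object; naming the parser explicitly avoids bs4's
# "no parser was explicitly specified" warning ("lxml" matches the output below)
soup = BeautifulSoup(example, "lxml")
# Pretty-print the parsed document
print(soup.prettify())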
<html>
 <body>
  <p>
   "With all this stuff going down at the moment with MJ i've started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ's feeling towards the press and also the obvious message of drugs are bad m'kay.
   <br/>
   <br/>
   Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.
   <br/>
   <br/>
   The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci's character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ's music.
   <br/>
   <br/>
   Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.
   <br/>
   <br/>
   Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ's bestest buddy in this movie is a girl! Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i've gave this subject....hmmm well i don't know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."
  </p>
 </body>
</html>
# Look up individual tags (a tag that does not exist returns None)
print(soup.title)
print(soup.head)
print(soup.a)
print(soup.p)
None
None
None
<p>"With all this stuff going down at the moment with MJ i've started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ's feeling towards the press and also the obvious message of drugs are bad m'kay.<br/><br/>Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.<br/><br/>The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci's character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ's music.<br/><br/>Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.<br/><br/>Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ's bestest buddy in this movie is a girl! 
Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i've gave this subject....hmmm well i don't know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."</p>
# Iterate over the direct children of <body>
for child in soup.body.children:
    print(child)
<p>"With all this stuff going down at the moment with MJ i've started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ's feeling towards the press and also the obvious message of drugs are bad m'kay.<br/><br/>Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.<br/><br/>The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci's character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ's music.<br/><br/>Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.<br/><br/>Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ's bestest buddy in this movie is a girl! 
Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i've gave this subject....hmmm well i don't know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."</p>
# find_all is very flexible: it accepts a string, a list, a regular expression, a function, and more
print(soup.find_all('br'))
[<br/>, <br/>, <br/>, <br/>, <br/>, <br/>, <br/>, <br/>]
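# soup.select accepts CSS selectors directly. Note that '.p' selects elements with
# class "p", so it matches nothing here; use 'p' to select the <p> tags themselves
print(soup.select('.p'))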
[]
# Extract the text without any tags
print(soup.get_text())
"With all this stuff going down at the moment with MJ i've started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ's feeling towards the press and also the obvious message of drugs are bad m'kay.Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci's character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ's music.Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ's bestest buddy in this movie is a girl! Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? 
Well, with all the attention i've gave this subject....hmmm well i don't know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."
import nltk
import re
from nltk.corpus import stopwords
from nltk.stem.lancaster import LancasterStemmer
# The stop-word corpus must be downloaded once beforehand: nltk.download("stopwords")
lancaster_stemmer = LancasterStemmer()
print(stopwords.words("english"))
['i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you', "you're", "you've", "you'll", "you'd", 'your', 'yours', 'yourself', 'yourselves', 'he', 'him', 'his', 'himself', 'she', "she's", 'her', 'hers', 'herself', 'it', "it's", 'its', 'itself', 'they', 'them', 'their', 'theirs', 'themselves', 'what', 'which', 'who', 'whom', 'this', 'that', "that'll", 'these', 'those', 'am', 'is', 'are', 'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had', 'having', 'do', 'does', 'did', 'doing', 'a', 'an', 'the', 'and', 'but', 'if', 'or', 'because', 'as', 'until', 'while', 'of', 'at', 'by', 'for', 'with', 'about', 'against', 'between', 'into', 'through', 'during', 'before', 'after', 'above', 'below', 'to', 'from', 'up', 'down', 'in', 'out', 'on', 'off', 'over', 'under', 'again', 'further', 'then', 'once', 'here', 'there', 'when', 'where', 'why', 'how', 'all', 'any', 'both', 'each', 'few', 'more', 'most', 'other', 'some', 'such', 'no', 'nor', 'not', 'only', 'own', 'same', 'so', 'than', 'too', 'very', 's', 't', 'can', 'will', 'just', 'don', "don't", 'should', "should've", 'now', 'd', 'll', 'm', 'o', 're', 've', 'y', 'ain', 'aren', "aren't", 'couldn', "couldn't", 'didn', "didn't", 'doesn', "doesn't", 'hadn', "hadn't", 'hasn', "hasn't", 'haven', "haven't", 'isn', "isn't", 'ma', 'mightn', "mightn't", 'mustn', "mustn't", 'needn', "needn't", 'shan', "shan't", 'shouldn', "shouldn't", 'wasn', "wasn't", 'weren', "weren't", 'won', "won't", 'wouldn', "wouldn't"]

Building the bag of words and the vectors

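# A helper that cleans one raw review
def review_to_words(raw_review):
    # Strip the HTML markup (the parser is named explicitly to avoid bs4's warning)
    review_text = BeautifulSoup(raw_review, "lxml").get_text()
    # Remove punctuation and digits, keeping only ? and ! as markers of strong tone
    letters_only = re.sub("[^a-zA-Z?!]", " ", review_text)
    # Convert to lowercase and split into words
    words = letters_only.lower().split()
    # Membership tests are faster on a set than on a list, so convert the list to a set
    stops = set(stopwords.words("english"))
    # Remove stop words and reduce each remaining word to its stem
    meaningful_words = [lancaster_stemmer.stem(w) for w in words if w not in stops]
    # Return the cleaned words joined into a single string
    return " ".join(meaningful_words)
# Build the list of cleaned review strings
num_reviews = train["review"].size
clean_train_reviews = []
for i in range(num_reviews):
    clean_train_reviews.append(review_to_words(train["review"][i]))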

uk edit show rath less extrav us vert person concern get new kitch perhap bedroom bathroom wond grat got us vert show everyth real tv instead mak improv hous occup could afford entir hous get rebuilt know show try show lousy welf system ex us beg hard enough receiv rath vulg produc plac tak plac particul sear also uncal rsther turn on famy depr are pot millionair would far bet help commun whol instead spend hundr thousand doll on hom build someth whol commun perhap plac diy pow tool borrow return along build mat everyon benefit want giv on person caus enorm res among rest loc commun stil liv run hous
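
To see what the regular expression and the stop-word filter in review_to_words do, the cleaning steps can be traced on a short string. This is only a minimal sketch under stated assumptions: TOY_STOPS and clean_text are hypothetical names introduced for illustration, the tiny stop-word set stands in for NLTK's full English list, and stemming is left out so the example needs nothing beyond the standard library.

```python
import re

# Hypothetical mini stop-word set for illustration only;
# the article uses NLTK's full English stop-word list.
TOY_STOPS = {"this", "is", "a", "the", "and", "it", "i"}

def clean_text(text, stops=TOY_STOPS):
    """Keep letters plus ? and !, lowercase, split, and drop stop words."""
    # Every character that is not a letter, ? or ! becomes a space
    letters_only = re.sub("[^a-zA-Z?!]", " ", text)
    words = letters_only.lower().split()
    return " ".join(w for w in words if w not in stops)

cleaned = clean_text("This is a GREAT movie, 10/10! I loved it.")
print(cleaned)  # great movie ! loved
```

The "!" survives as a token of its own here only because the digits in front of it were replaced by spaces; when it directly follows a word (as in "girl!") it stays attached to that word.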

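Before moving on, it is worth introducing the bag-of-words model. To turn several sentences into vectors, the first step is to collect every distinct word they contain into a "bag" (the vocabulary). Each sentence can then be converted into a vector whose length equals the number of words in the bag: each position corresponds to one word in the bag and holds the number of times that word occurs in the sentence, or 0 if it does not occur.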
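
The idea can be sketched by hand in a few lines before handing the work to scikit-learn. The two sentences and the helper name to_vector are made up for illustration, and the vocabulary is sorted only so that the column order is stable.

```python
from collections import Counter

sentences = ["the cat sat", "the cat saw the dog"]

# Step 1: put every distinct word into the "bag" (sorted for a stable order)
vocab = sorted({word for s in sentences for word in s.split()})

# Step 2: turn each sentence into a vector of counts over the vocabulary
def to_vector(sentence, vocab):
    counts = Counter(sentence.split())
    # Counter returns 0 for words that do not occur in the sentence
    return [counts[word] for word in vocab]

vectors = [to_vector(s, vocab) for s in sentences]
print(vocab)    # ['cat', 'dog', 'sat', 'saw', 'the']
print(vectors)  # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```

Both vectors have length 5, the size of the bag, and the 2 in the second vector records that "the" occurs twice in that sentence.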

from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer(analyzer = "word", tokenizer = None, preprocessor = None, stop_words = None, max_features = 5000)
train_data_features = vectorizer.fit_transform(clean_train_reviews)
train_data_features = train_data_features.toarray()
print(train_data_features.shape)
(25000, 5000)
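
The second dimension is 5000 because max_features = 5000 caps the vocabulary at the most frequent terms across the whole corpus. A rough pure-Python sketch of that selection follows; top_k_vocab and the toy documents are hypothetical, and the alphabetical tie-breaking is an assumption of this sketch rather than a statement about scikit-learn's internals.

```python
from collections import Counter

docs = ["good film good cast", "bad film", "good plot bad end"]

def top_k_vocab(docs, k):
    """Return the k most frequent terms over all docs, sorted alphabetically."""
    totals = Counter(word for d in docs for word in d.split())
    # Rank by descending total count; ties are broken alphabetically (an assumption)
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return sorted(word for word, _ in ranked[:k])

vocab = top_k_vocab(docs, 3)
print(vocab)  # ['bad', 'film', 'good']
```

Words outside the kept vocabulary ("cast", "plot", "end" here) are simply ignored when the count vectors are built.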

Inspect what the bag actually contains

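# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
# (.tolist() converts the returned NumPy array to a plain Python list)
vocab = vectorizer.get_feature_names_out().tolist()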
print(len(vocab))
print(vocab)
5000
['abandon', 'abbot', 'abc', 'abduc', 'abl', 'abomin', 'aborigin', 'abort', 'abound', 'about', 'abraham', 'abrupt', 'abs', 'absolv', 'absorb', 'absurd', 'abud', 'abund', 'abus', 'abysm', 'ac', 'academy', 'acc', 'acceiv', 'access', 'accid', 'acclaim', 'accompany', 'accompl', 'accord', 'account', 'accus', 'ach', 'achiev', 'acid', 'acknowledg', 'acquaint', 'acquir', 'across', 'act', 'actress', 'ad', 'adam', 'adapt', 'addict', 'addit', 'address', 'adel', 'adequ', 'adjust', 'admin', 'admir', 'admit', 'adolesc', 'adopt', 'adr', 'adult', 'adv', 'advers', 'advert', 'aesthet', 'af', 'affair', 'affect', 'affirm', 'affleck', 'afford', 'afr', 'afraid', 'afric', 'afterma', 'afternoon', 'afterward', 'ag', 'again', 'agend', 'aggress', 'ago', 'agon', 'agony', 'agr', 'agree', 'ah', 'ahead', 'aid', 'aim', 'aimless', 'air', 'airpl', 'airport', 'ak', 'akin', 'akshay', 'al', 'ala', 'alarm', 'alb', 'albeit', 'albert', 'alcohol', 'alec', 'alert', 'alex', 'alexand', 'alexandr', 'alfr', 'alic', 'alik', 'alison', 'all', 'alleg', 'alley', 'allow', 'allud', 'almost', 'alon', 'along', 'alongsid', 'alot', 'already', 'alright', 'also', 'alt', 'altern', 'although', 'altm', 'altogeth', 'alvin', 'alway', 'aly', 'am', 'amand', 'amaz', 'amazon', 'amb', 'ambigu', 'ambit', 'amby', 'americ', 'amidst', 'amitabh', 'among', 'amongst', 'amount', 'ampl', 'amrit', 'amus', 'amy', 'an', 'analys', 'anch', 'anct', 'and', 'anderson', 'andr', 'andre', 'andrew', 'andy', 'ang', 'angel', 'angl', 'angry', 'angst', 'angy', 'anil', 'anim', 'ann', 'annount', 'annoy', 'anny', 'anoth', 'answ', 'ant', 'antagon', 'antholog', 'anthony', 'anticip', 'anton', 'antonio', 'antonion', 'antwon', 'anxy', 'anybody', 'anyhow', 'anym', 'anyon', 'anyone', 'anyth', 'anytim', 'anyway', 'anywh', 'ap', 'apart', 'apocalypt', 'apolog', 'app', 'appal', 'appear', 'appl', 'applaud', 'apply', 'apprecy', 'approach', 'appropry', 'approv', 'approxim', 'april', 'apt', 'ar', 'arab', 'arc', 'arch', 'archaeolog', 'architect', 'are', 'area', 'argentin', 
'argu', 'ariel', 'aristocr', 'arkin', 'arm', 'armstrong', 'army', 'arnold', 'around', 'arquet', 'arrang', 'arrest', 'arrog', 'arrow', 'art', 'arth', 'artic', 'artsy', 'artwork', 'arty', 'as', 'ash', 'asham', 'ashley', 'asid', 'ask', 'asleep', 'aspect', 'aspir', 'ass', 'assassin', 'assault', 'assembl', 'assert', 'asset', 'assign', 'assist', 'assocy', 'assort', 'assum', 'astair', 'aston', 'astound', 'astronaut', 'asyl', 'at', 'athlet', 'atl', 'atmosph', 'atroc', 'atrocy', 'attach', 'attack', 'attempt', 'attenborough', 'attend', 'attitud', 'attorney', 'attract', 'attribut', 'aud', 'audio', 'audit', 'audrey', 'audy', 'august', 'aunt', 'aur', 'aussy', 'aust', 'austin', 'austral', 'aut', 'auth', 'auto', 'autobiograph', 'autom', 'av', 'avail', 'aveng', 'avid', 'avoid', 'aw', 'await', 'awak', 'award', 'away', 'awesom', 'awhil', 'awkward', 'ax', 'aztec', 'bab', 'baby', 'babysit', 'bacal', 'bach', 'bachch', 'bachel', 'back', 'backdrop', 'background', 'backst', 'backward', 'bacon', 'bad', 'baddy', 'baffl', 'bag', 'bait', 'bak', 'baksh', 'bal', 'bald', 'baldwin', 'ballet', 'ban', 'band', 'bang', 'bank', 'bant', 'bar', 'barb', 'barbar', 'barbr', 'bargain', 'bark', 'barn', 'barney', 'baron', 'barrel', 'barry', 'barrym', 'bas', 'basebal', 'bash', 'basket', 'basketbal', 'bastard', 'bat', 'bath', 'bathroom', 'batm', 'battl', 'battlefield', 'bau', 'bay', 'bbc', 'be', 'beach', 'bean', 'bear', 'beard', 'beast', 'beat', 'beatl', 'beatty', 'beauty', 'beav', 'becam', 'beckham', 'beckins', 'becom', 'bed', 'bedroom', 'beer', 'beetl', 'befriend', 'beg', 'begin', 'begun', 'behav', 'behavio', 'behavy', 'behind', 'behold', 'being', 'bel', 'belg', 'believ', 'belong', 'belov', 'belt', 'belush', 'ben', 'bend', 'benea', 'benefit', 'bennet', 'bent', 'beowulf', 'bergm', 'berkeley', 'berlin', 'bernard', 'besid', 'best', 'bet', 'betray', 'better', 'betty', 'bev', 'bew', 'bewild', 'beyond', 'bias', 'bibl', 'big', 'biggest', 'bik', 'bikin', 'biko', 'bil', 'bimbo', 'bin', 'bind', 'bing', 'biograph', 
'biop', 'bir', 'bird', 'birthday', 'bit', 'bitch', 'bittersweet', 'bizar', 'bla', 'black', 'blackmail', 'blad', 'blah', 'blair', 'blak', 'blam', 'bland', 'blank', 'blast', 'blat', 'blaz', 'bleak', 'blee', 'blend', 'bless', 'blew', 'blind', 'blink', 'bliss', 'blob', 'block', 'blockbust', 'blond', 'blood', 'bloody', 'bloom', 'blossom', 'blow', 'blown', 'blu', 'blunt', 'blur', 'bo', 'board', 'boast', 'boat', 'bob', 'bobby', 'body', 'bog', 'bogart', 'boggl', 'boil', 'bol', 'bold', 'bollywood', 'bomb', 'bon', 'bond', 'bonny', 'boo', 'boob', 'boog', 'book', 'boom', 'boost', 'boot', 'bor', 'bord', 'boredom', 'born', 'borrow', 'boss', 'boston', 'both', 'bottl', 'bottom', 'bought', 'bound', 'bount', 'bounty', 'bourn', 'bout', 'bow', 'bowl', 'box', 'boy', 'boyfriend', 'boyl', 'brad', 'brady', 'brain', 'brainless', 'branagh', 'branch', 'brand', 'brando', 'brat', 'brav', 'braveheart', 'bravo', 'brazil', 'brea', 'bread', 'break', 'breakdown', 'breakfast', 'breast', 'breath', 'breathtak', 'bree', 'brend', 'brent', 'bret', 'bri', 'brick', 'brid', 'bridg', 'bridget', 'brief', 'bright', 'bril', 'bring', 'brit', 'britain', 'bro', 'broad', 'broadcast', 'broadway', 'brok', 'bronson', 'bront', 'brood', 'brook', 'brooklyn', 'brosn', 'broth', 'brought', 'brow', 'brown', 'bruc', 'bruno', 'brush', 'brut', 'bry', 'bsg', 'btw', 'bubbl', 'buck', 'bucket', 'bud', 'buddy', 'budget', 'buff', 'buffalo', 'bug', 'build', 'built', 'bul', 'bulk', 'bullet', 'bum', 'bumbl', 'bump', 'bunch', 'bunny', 'bur', 'burd', 'burk', 'burn', 'burst', 'burt', 'burton', 'bury', 'bus', 'busey', 'bush', 'businessm', 'bust', 'busy', 'but', 'butch', 'butl', 'button', 'buy', 'buzz', 'bye', 'cab', 'cabin', 'cabl', 'caf', 'cag', 'cagney', 'cain', 'cak', 'cal', 'calc', 'calib', 'californ', 'calm', 'cam', 'cambod', 'camcord', 'cameo', 'camer', 'camera', 'cameram', 'cameron', 'camp', 'campaign', 'campbel', 'campy', 'can', 'canad', 'cancel', 'candid', 'candl', 'candy', 'cannib', 'cannon', 'cannot', 'cant', 'canyon', 'cap', 
'capac', 'capit', 'capot', 'capt', 'captain', 'car', 'card', 'cardboard', 'carel', 'cares', 'caretak', 'carey', 'carl', 'carlito', 'carlo', 'carm', 'carn', 'carol', 'carolin', 'caron', 'carp', 'carradin', 'carrey', 'carry', 'cart', 'cartoon', 'cary', 'cas', 'casablanc', 'cash', 'casino', 'casp', 'cassavet', 'cassidy', 'cast', 'castl', 'cat', 'catalog', 'catastroph', 'catch', 'catchy', 'categ', 'catherin', 'cathol', 'cattl', 'caught', 'caus', 'caut', 'cav', 'cbs', 'cd', 'ceas', 'cecil', 'ceil', 'cel', 'celebr', 'celest', 'celluloid', 'cemetery', 'cens', 'cent', 'century', 'cerebr', 'ceremony', 'certain', 'cg', 'cgi', 'chain', 'chainsaw', 'chair', 'challeng', 'chamb', 'chamberlain', 'champ', 'chan', 'chang', 'channel', 'chant', 'chao', 'chaplin', 'chapt', 'char', 'charact', 'charg', 'charism', 'charl', 'charlot', 'charlton', 'charm', 'chas', 'chat', 'chavez', 'che', 'cheadl', 'cheap', 'cheaply', 'check', 'cheek', 'chees', 'cheesy', 'chem', 'cher', 'chess', 'chest', 'chew', 'chib', 'chicago', 'chick', 'chief', 'chil', 'child', 'childr', 'chin', 'chines', 'chip', 'cho', 'chocol', 'choir', 'chok', 'chong', 'choos', 'chop', 'choppy', 'chor', 'choreograph', 'chos', 'chris', 'christ', 'christian', 'christianity', 'christians', 'christie', 'christina', 'christine', 'christmas', 'christopher', 'christy', 'chronicles', 'chuck', 'chuckl', 'church', 'churn', 'cia', 'cigaret', 'cinderell', 'cindy', 'cinem', 'cinema', 'cinematograph', 'circ', 'circumst', 'cit', 'city', 'civil', 'cla', 'clad', 'claim', 'clair', 'clan', 'clar', 'clark', 'clash', 'class', 'classm', 'classy', 'claud', 'claustrophob', 'claw', 'clay', 'cle', 'clear', 'clerk', 'clev', 'cli', 'clich', 'click', 'cliff', 'cliffhang', 'clim', 'climact', 'climax', 'climb', 'clin', 'clint', 'clip', 'cliv', 'cloak', 'clock', 'clon', 'clooney', 'clos', 'closest', 'closet', 'closeup', 'cloth', 'cloud', 'clown', 'clu', 'club', 'clueless', 'clumsy', 'clunky', 'clut', 'co', 'coach', 'coast', 'coat', 'cod', 'cody', 'coff', 'coffin', 
'coh', 'coher', 'coincid', 'cok', 'col', 'cold', 'colin', 'coll', 'collab', 'collaps', 'colleagu', 'collect', 'colleg', 'collet', 'collin', 'colm', 'colo', 'colon', 'colonel', 'colony', 'columb', 'columbo', 'com', 'comb', 'combin', 'comeback', 'comedy', 'comfort', 'command', 'commend', 'commerc', 'commit', 'common', 'commun', 'comp', 'company', 'comparison', 'compass', 'compel', 'compens', 'compet', 'competit', 'compl', 'complain', 'complaint', 'complet', 'complex', 'comply', 'compos', 'composit', 'compound', 'compr', 'comprehend', 'comprom', 'compuls', 'comput', 'con', 'conceit', 'conceiv', 'concern', 'concert', 'conclud', 'concoct', 'cond', 'condemn', 'condit', 'conduc', 'conf', 'confess', 'confid', 'confin', 'confirm', 'conflict', 'confront', 'confus', 'congrat', 'connect', 'connery', 'conqu', 'conrad', 'conscy', 'consequ', 'conserv', 'consid', 'consist', 'conspir', 'const', 'constitut', 'construct', 'consum', 'cont', 'contact', 'contain', 'contemp', 'contempl', 'contempt', 'contend', 'contest', 'context', 'contin', 'continu', 'contract', 'contradict', 'contrast', 'contribut', 'control', 'controvers', 'controversy', 'conv', 'conveny', 'convers', 'convert', 'convey', 'convict', 'convint', 'convolv', 'cook', 'cooky', 'cool', 'coop', 'cop', 'cor', 'corbet', 'corey', 'corm', 'corn', 'corny', 'corp', 'corps', 'correct', 'corrid', 'corrupt', 'cost', 'costum', 'couch', 'could', 'counsel', 'count', 'counterpart', 'countless', 'country', 'countrysid', 'county', 'coup', 'coupl', 'cour', 'cours', 'court', 'courtroom', 'cousin', 'cov', 'cow', 'coward', 'cowboy', 'cox', 'crack', 'craft', 'craig', 'cram', 'crap', 'crappy', 'crash', 'crav', 'crawford', 'crawl', 'craz', 'crazy', 'cre', 'cream', 'creasy', 'cred', 'credit', 'creek', 'creep', 'creepy', 'crew', 'cri', 'crim', 'crimin', 'cring', 'crippl', 'cris', 'crisp', 'crit', 'crocodil', 'crook', 'crop', 'crosby', 'cross', 'crow', 'crowd', 'crown', 'cru', 'cruc', 'crud', 'cruel', 'crush', 'cry', 'crypt', 'cryst', 'cub', 'cue', 
'culmin', 'cult', 'cum', 'cunningham', 'cup', 'cur', 'curios', 'curs', 'curt', 'curtain', 'cury', 'cusack', 'cush', 'custom', 'cut', 'cyborg', 'cyc', 'cyn', 'cyph', 'da', 'dad', 'daddy', 'dahl', 'dahm', 'dai', 'daisy', 'dal', 'dalton', 'dam', 'damn', 'damon', 'dan', 'dandy', 'dang', 'daniel', 'danny', 'dant', 'dar', 'dark', 'darl', 'darn', 'dash', 'dat', 'daught', 'dav', 'david', 'davy', 'dawn', 'dawson', 'day', 'daylight', 'dazzl', 'de', 'dea', 'dead', 'deaf', 'deal', 'dealt', 'dean', 'deann', 'dear', 'death', 'deb', 'debby', 'debr', 'debt', 'debut', 'dec', 'decad', 'decapit', 'deceas', 'deceiv', 'decid', 'deck', 'decl', 'declin', 'ded', 'dee', 'deem', 'deep', 'deeply', 'deer', 'def', 'defend', 'defens', 'defin', 'definit', 'defy', 'deg', 'degr', 'degrad', 'del', 'delay', 'delet', 'delib', 'delicy', 'delight', 'deliry', 'delivery', 'delud', 'delv', 'dem', 'demand', 'demil', 'democr', 'demon', 'demonst', 'den', 'deniro', 'denou', 'dent', 'deny', 'denzel', 'dep', 'depart', 'depend', 'depict', 'deprav', 'depress', 'depth', 'deputy', 'der', 'derang', 'derek', 'des', 'desc', 'descend', 'describ', 'desert', 'deserv', 'design', 'desir', 'desp', 'despair', 'despit', 'destin', 'destiny', 'destroy', 'destruct', 'det', 'detach', 'detail', 'detect', 'determin', 'detery', 'detract', 'detroit', 'dev', 'devast', 'develop', 'devil', 'devo', 'devoid', 'devot', 'devy', 'dialog', 'diamond', 'dian', 'diary', 'dick', 'dict', 'did', 'die', 'died', 'diff', 'difficul', 'difficult', 'dig', 'digest', 'digit', 'dign', 'dil', 'dilemm', 'dim', 'dimend', 'dimin', 'din', 'dinosa', 'dir', 'direct', 'dirt', 'dirty', 'dis', 'disagr', 'disappear', 'disappoint', 'disast', 'disbeliev', 'disc', 'discern', 'disciplin', 'disco', 'discov', 'discovery', 'discuss', 'diseas', 'disgrac', 'disgu', 'disgust', 'dish', 'disjoint', 'dislik', 'dism', 'dismiss', 'disney', 'disord', 'dispatch', 'display', 'dispos', 'disregard', 'disrespect', 'dissolv', 'dist', 'distinct', 'distort', 'distract', 'distress', 
'distribut', 'district', 'disturb', 'div', 'divers', 'divert', 'divid', 'divin', 'divorc', 'dixon', 'dj', 'do', 'doc', 'doct', 'docu', 'dodg', 'dog', 'dogm', 'dol', 'doll', 'dolph', 'dom', 'domest', 'domin', 'domino', 'don', 'donald', 'donn', 'dont', 'doo', 'doom', 'door', 'dor', 'dorothy', 'dos', 'dot', 'doubl', 'doubt', 'dougla', 'down', 'downey', 'downhil', 'download', 'downright', 'doyl', 'doz', 'dr', 'drab', 'dracul', 'draft', 'drag', 'dragon', 'drain', 'drak', 'dram', 'drama', 'draw', 'drawn', 'dre', 'dread', 'dream', 'dreamy', 'dreck', 'dress', 'drew', 'drift', 'dril', 'drink', 'drip', 'driv', 'drivel', 'dron', 'drop', 'drov', 'drown', 'drug', 'drum', 'drunk', 'dry', 'du', 'dub', 'duby', 'duck', 'dud', 'dudley', 'due', 'duel', 'duh', 'duk', 'dul', 'dumb', 'dumbest', 'dump', 'dun', 'duo', 'dur', 'dust', 'dustin', 'dutch', 'duty', 'duval', 'dvd', 'dvds', 'dwarf', 'dwel', 'dying', 'dyl', 'dynam', 'dysfunct', 'eag', 'eagl', 'ear', 'earl', 'earn', 'earnest', 'eas', 'east', 'eastern', 'eastwood', 'easy', 'eat', 'ebert', 'ecc', 'echo', 'econom', 'ed', 'eddy', 'edg', 'edgy', 'edi', 'edison', 'edit', 'educ', 'edward', 'edy', 'eery', 'effect', 'efficy', 'effort', 'effortless', 'eg', 'ego', 'egypt', 'eight', 'eighty', 'einstein', 'eith', 'el', 'elab', 'eld', 'elect', 'electron', 'eleg', 'eleph', 'elev', 'elimin', 'elit', 'elizabe', 'elliot', 'elm', 'els', 'else', 'elsewh', 'elud', 'elv', 'elvir', 'em', 'embark', 'embarrass', 'embody', 'embrac', 'emerg', 'emil', 'emm', 'emot', 'emp', 'empath', 'empathy', 'emphas', 'empir', 'employ', 'empty', 'emy', 'en', 'ench', 'enco', 'encount', 'end', 'endear', 'ending', 'endless', 'enemy', 'energet', 'energy', 'enforc', 'eng', 'engin', 'engl', 'england', 'engross', 'enh', 'enigm', 'enjoy', 'enl', 'enlight', 'enorm', 'enough', 'ens', 'ensembl', 'ensu', 'ent', 'enterpr', 'entertain', 'enthral', 'enthusiasm', 'enthusiast', 'entir', 'entitl', 'entry', 'environ', 'envy', 'ep', 'episod', 'epitom', 'eq', 'equ', 'equip', 'er', 'eras', 
'erik', 'erot', 'errol', 'escap', 'esp', 'espec', 'esquir', 'ess', 'est', 'esth', 'estrang', 'et', 'etc', 'etern', 'eth', 'ethn', 'eug', 'europ', 'ev', 'evalu', 'evelyn', 'ever', 'every', 'everybody', 'everyday', 'everyon', 'everyth', 'everywh', 'evid', 'evil', 'evok', 'evolv', 'ex', 'exact', 'exag', 'examin', 'exampl', 'exceiv', 'excel', 'excess', 'exchang', 'excit', 'exclud', 'excrucy', 'excus', 'execut', 'exempl', 'exerc', 'exhaust', 'exhibit', 'exit', 'exorc', 'exot', 'expand', 'expect', 'expedit', 'expend', 'expens', 'expert', 'expery', 'expl', 'explain', 'explicit', 'explod', 'exploit', 'expos', 'exposit', 'express', 'exquisit', 'ext', 'extend', 'extery', 'extinct', 'extr', 'extra', 'extraordin', 'extrem', 'ey', 'eyebrow', 'eyr', 'fab', 'fabl', 'fabr', 'fac', 'facil', 'fact', 'fad', 'fai', 'fail', 'faint', 'fair', 'fairbank', 'fairy', 'faith', 'fak', 'fal', 'falk', 'fallon', 'fals', 'fam', 'famili', 'famy', 'fan', 'fant', 'fantast', 'fantasy', 'far', 'farc', 'farm', 'farrel', 'fart', 'fasc', 'fascin', 'fash', 'fassbind', 'fast', 'fat', 'fath', 'fault', 'fav', 'favo', 'favorit', 'favourit', 'fay', 'fbi', 'fear', 'feast', 'feat', 'fed', 'fee', 'feebl', 'feel', 'feet', 'feinston', 'fel', 'felix', 'fellin', 'fellow', 'felt', 'fem', 'femin', 'feminin', 'fent', 'fer', 'ferrel', 'fest', 'fet', 'fetch', 'fev', 'fi', 'fiant', 'fict', 'fido', 'field', 'fiend', 'fierc', 'fif', 'fifteen', 'fifty', 'fig', 'fight', 'fil', 'film', 'filmmak', 'filt', 'filthy', 'fin', 'find', 'finest', 'fing', 'finney', 'fir', 'firm', 'first', 'fish', 'fishburn', 'fist', 'fit', 'fiv', 'fix', 'flag', 'flair', 'flam', 'flash', 'flashback', 'flashy', 'flat', 'flav', 'flaw', 'flawless', 'fle', 'fleet', 'flem', 'flesh', 'fli', 'flick', 'flight', 'flimsy', 'flip', 'flirt', 'flo', 'flock', 'flood', 'flop', 'flor', 'florid', 'flow', 'fluff', 'fluid', 'fly', 'flyn', 'foc', 'focus', 'fog', 'foil', 'folk', 'follow', 'fond', 'fontain', 'food', 'fool', 'foot', 'footbal', 'for', 'forbid', 'forc', 'ford', 
'foreign', 'foremost', 'forest', 'forev', 'forg', 'forget', 'forgot', 'form', 'formul', 'formula', 'forrest', 'fort', 'fortun', 'forty', 'forward', 'fost', 'fought', 'foul', 'found', 'four', 'fox', 'foxx', 'frag', 'fragil', 'frail', 'fram', 'franch', 'francisco', 'franco', 'frank', 'frankenstein', 'franklin', 'franky', 'frant', 'fraud', 'fre', 'freak', 'freaky', 'fred', 'freddy', 'freedom', 'freem', 'freez', 'french', 'frenzy', 'frequ', 'fresh', 'fri', 'friday', 'friend', 'fright', 'frog', 'from', 'front', 'fronty', 'frost', 'froz', 'fruit', 'frust', 'fry', 'fu', 'fuel', 'ful', 'fulc', 'fulfil', 'fun', 'funct', 'fund', 'funda', 'funniest', 'funny', 'furnit', 'furtherm', 'fury', 'fut', 'fuzzy', 'fx', 'gabl', 'gabriel', 'gadget', 'gag', 'gain', 'gal', 'galactic', 'galaxy', 'gallery', 'gam', 'gambl', 'gamer', 'gandh', 'gang', 'gangst', 'gap', 'gar', 'garb', 'garbo', 'gard', 'garland', 'garn', 'gary', 'gas', 'gasp', 'gat', 'gath', 'gav', 'gay', 'gaz', 'gear', 'geek', 'gem', 'gen', 'gend', 'genet', 'geni', 'genius', 'genr', 'gentl', 'gentlem', 'genuin', 'geny', 'georg', 'ger', 'gerard', 'germ', 'germany', 'gershwin', 'gest', 'get', 'ghetto', 'ghost', 'giallo', 'giant', 'gibson', 'gielgud', 'gift', 'gig', 'giggl', 'gil', 'gilbert', 'gilliam', 'gimmick', 'gin', 'ging', 'giovann', 'girl', 'girlfriend', 'giv', 'glad', 'glady', 'glam', 'glant', 'glar', 'glass', 'gle', 'glen', 'glimps', 'glob', 'gloom', 'glor', 'glory', 'glov', 'glow', 'glu', 'go', 'goal', 'goat', 'god', 'godard', 'godfath', 'godzill', 'goe', 'goer', 'going', 'gold', 'goldberg', 'goldbl', 'goldsworthy', 'goldy', 'gon', 'gonn', 'good', 'goodby', 'goodm', 'goody', 'goof', 'goofy', 'gor', 'gordon', 'gorg', 'gory', 'gosh', 'got', 'goth', 'gott', 'govern', 'govind', 'grab', 'grac', 'grad', 'gradu', 'graham', 'grainy', 'gram', 'grand', 'grandfath', 'grandm', 'grandmoth', 'grandp', 'grant', 'graph', 'grasp', 'grass', 'grat', 'gratuit', 'grav', 'graveyard', 'gray', 'grayson', 'gre', 'great', 'greatest', 'gree', 
'greedy', 'greek', 'green', 'greet', 'greg', 'grew', 'grey', 'grief', 'griev', 'griffi', 'grim', 'grin', 'grinch', 'grind', 'grip', 'gritty', 'gro', 'gross', 'grotesqu', 'ground', 'group', 'grow', 'grown', 'grudg', 'gruesom', 'guar', 'guarantee', 'guard', 'guess', 'guest', 'guid', 'guil', 'guilt', 'guin', 'guine', 'guit', 'gum', 'gun', 'gundam', 'gunfight', 'gung', 'gut', 'guy', 'gwyne', 'gypo', 'gypsy', 'ha', 'habit', 'hack', 'hackm', 'hackney', 'hadley', 'hag', 'hail', 'hain', 'hair', 'hal', 'half', 'halfway', 'hallmark', 'halloween', 'hallucin', 'ham', 'hamilton', 'hamlet', 'hammy', 'han', 'hand', 'handicap', 'handl', 'handsom', 'hang', 'hank', 'hannah', 'hap', 'hapless', 'happy', 'har', 'harass', 'harb', 'hard', 'hardc', 'hardy', 'hark', 'harlow', 'harm', 'harmless', 'harold', 'harp', 'harriet', 'harrison', 'harrow', 'harry', 'harsh', 'hart', 'hartley', 'harvey', 'hat', 'hatch', 'haunt', 'havoc', 'hawk', 'hawn', 'hay', 'haywor', 'hbo', 'hea', 'head', 'headach', 'heal', 'healthy', 'heap', 'hear', 'heard', 'heart', 'heartbreak', 'heartfelt', 'heartwarm', 'heat', 'heav', 'heavy', 'heck', 'hect', 'heel', 'height', 'heist', 'hel', 'held', 'helicopt', 'hello', 'helm', 'helmet', 'help', 'helpless', 'henchm', 'henry', 'hent', 'hepburn', 'her', 'herbert', 'herd', 'here', 'herm', 'hero', 'heroin', 'hesit', 'heston', 'hey', 'hi', 'hick', 'hid', 'high', 'highest', 'highlight', 'highway', 'hil', 'him', 'hind', 'hint', 'hip', 'hippy', 'hir', 'hist', 'hit', 'hitch', 'hitchcock', 'hitl', 'hk', 'hmmm', 'ho', 'hoffm', 'hog', 'hokey', 'hol', 'hold', 'holiday', 'hollow', 'hollywood', 'holm', 'holocaust', 'holy', 'hom', 'homeless', 'homicid', 'homosex', 'hon', 'honest', 'honesty', 'hong', 'hono', 'hood', 'hook', 'hoop', 'hoot', 'hop', 'hopeless', 'hopkin', 'hor', 'horn', 'horny', 'horr', 'horrend', 'horrid', 'hors', 'hospit', 'host', 'hostel', 'hostil', 'hot', 'hotel', 'hound', 'hour', 'hous', 'household', 'housew', 'how', 'howard', 'howev', 'howl', 'http', 'hudson', 'hug', 'hugh', 
'huh', 'hulk', 'hum', 'humbl', 'humo', 'humy', 'hundr', 'hung', 'hungry', 'hunt', 'hurry', 'hurt', 'husband', 'hustl', 'huston', 'hybrid', 'hyd', 'hyp', 'hypnot', 'hyst', 'ian', 'ic', 'icon', 'id', 'ide', 'idea', 'ident', 'ideolog', 'idiot', 'idol', 'ie', 'if', 'ign', 'ii', 'il', 'illeg', 'illog', 'illud', 'illust', 'im', 'imagery', 'imagin', 'imdb', 'imit', 'immedy', 'immens', 'immers', 'immigr', 'immort', 'imo', 'imp', 'impact', 'impecc', 'imperson', 'impl', 'implaus', 'imply', 'import', 'impos', 'imposs', 'impress', 'imprison', 'improb', 'improv', 'impuls', 'in', 'inacc', 'inadvert', 'inappropry', 'incap', 'incarn', 'incest', 'inch', 'incid', 'inclin', 'includ', 'incoh', 'incompet', 'incomprehens', 'inconsist', 'incorp', 'incorrect', 'increas', 'incred', 'ind', 'indee', 'independ', 'indian', 'indiff', 'individ', 'induc', 'indulg', 'indust', 'industry', 'indy', 'inept', 'inevit', 'inexpery', 'inexpl', 'inf', 'infam', 'infect', 'infery', 'infinit', 'inflict', 'influ', 'info', 'inform', 'ing', 'ingeny', 'ingredy', 'ingrid', 'inh', 'inhabit', 'inherit', 'init', 'inject', 'injury', 'injust', 'inm', 'innoc', 'innov', 'innuendo', 'ins', 'insec', 'insect', 'insert', 'insid', 'insight', 'insign', 'insipid', 'insist', 'insomn', 'inspect', 'inspir', 'inst', 'instal', 'instead', 'instinct', 'institut', 'instru', 'instruct', 'insult', 'int', 'intact', 'integr', 'intellect', 'intellig', 'intend', 'intens', 'interact', 'interest', 'interf', 'intern', 'internet', 'interpret', 'interrupt', 'intertwin', 'interv', 'interview', 'intery', 'intim', 'intol', 'intrigu', 'intro', 'introduc', 'intrud', 'inv', 'invad', 'invas', 'invest', 'investig', 'invis', 'invit', 'involv', 'iq', 'ir', 'iraq', 'ireland', 'iron', 'irony', 'irrelev', 'irrit', 'is', 'isabel', 'ish', 'islam', 'island', 'isol', 'israel', 'issu', 'it', 'ita', 'item', 'iturb', 'iv', 'jack', 'jacket', 'jackson', 'jacky', 'jacob', 'jacqu', 'jad', 'jaff', 'jag', 'jail', 'jak', 'jam', 'jamy', 'jan', 'jap', 'japanes', 'jar', 
'jason', 'jaw', 'jay', 'jazz', 'jeal', 'jealousy', 'jean', 'jed', 'jeff', 'jeffrey', 'jen', 'jenn', 'jenny', 'jeremy', 'jerk', 'jerry', 'jersey', 'jes', 'jess', 'jessic', 'jet', 'jew', 'jewel', 'jil', 'jim', 'jimmy', 'joan', 'job', 'jock', 'jody', 'joe', 'joel', 'joey', 'johansson', 'john', 'johnny', 'johnson', 'join', 'joint', 'jok', 'joly', 'jon', 'jonath', 'jord', 'jos', 'joseph', 'josh', 'journ', 'journey', 'jov', 'joy', 'jr', 'juan', 'jud', 'judg', 'judy', 'juic', 'jul', 'juliet', 'july', 'jump', 'jun', 'jungl', 'junk', 'juny', 'jury', 'just', 'justin', 'juvenil', 'kan', 'kansa', 'kapo', 'kar', 'karl', 'karloff', 'kat', 'kathleen', 'kathryn', 'kathy', 'katy', 'kay', 'kaz', 'keaton', 'keen', 'keep', 'kei', 'kel', 'ken', 'kenne', 'kennedy', 'kent', 'kept', 'kevin', 'key', 'khan', 'kick', 'kid', 'kiddy', 'kidm', 'kidnap', 'kil', 'kim', 'kind', 'king', 'kingdom', 'kinnear', 'kirk', 'kiss', 'kit', 'kitch', 'kitty', 'klin', 'kne', 'knew', 'knif', 'knight', 'knightley', 'knock', 'know', 'knowledg', 'known', 'kolchak', 'kong', 'kor', 'kore', 'kri', 'kubrick', 'kudo', 'kum', 'kung', 'kurosaw', 'kurt', 'kyl', 'la', 'lab', 'label', 'lac', 'lack', 'lacklust', 'lad', 'lady', 'laid', 'lak', 'lam', 'lamb', 'lampoon', 'lan', 'land', 'landmark', 'landscap', 'lang', 'langu', 'lant', 'laput', 'lar', 'larg', 'larry', 'las', 'last', 'lat', 'latest', 'latin', 'latino', 'laugh', 'laught', 'launch', 'laur', 'laurel', 'laury', 'lav', 'law', 'lawr', 'lawy', 'lay', 'lazy', 'le', 'lead', 'leagu', 'lean', 'leap', 'learn', 'least', 'leath', 'leav', 'lect', 'led', 'lee', 'left', 'leg', 'legend', 'legitim', 'leigh', 'lemmon', 'len', 'lend', 'leng', 'lengthy', 'lennon', 'leo', 'leon', 'leonard', 'les', 'lesb', 'less', 'lesson', 'lest', 'let', 'leth', 'lev', 'level', 'lew', 'lex', 'li', 'liam', 'lib', 'liberty', 'libr', 'licens', 'lie', 'lif', 'life', 'lifeless', 'lifestyl', 'lifetim', 'lift', 'light', 'lightn', 'lik', 'likew', 'lil', 'lily', 'limb', 'limit', 'lin', 'lincoln', 'lind', 
'lindsay', 'linear', 'ling', 'link', 'lion', 'lionel', 'lip', 'lis', 'list', 'lit', 'littl', 'liu', 'liv', 'liz', 'lizard', 'lloyd', 'load', 'loath', 'loc', 'lock', 'log', 'loi', 'lol', 'lon', 'london', 'long', 'longest', 'longor', 'look', 'loom', 'loop', 'loos', 'lor', 'lord', 'lorett', 'los', 'loss', 'lost', 'lot', 'lou', 'loud', 'lousy', 'lov', 'love', 'low', 'lowest', 'loy', 'loyal', 'luc', 'luca', 'lucil', 'luck', 'lucky', 'lucy', 'ludicr', 'lugos', 'lui', 'luk', 'luka', 'lumet', 'lun', 'lunch', 'lundgr', 'lung', 'lur', 'lurk', 'lush', 'lust', 'luth', 'luxury', 'lying', 'lynch', 'lyr', 'mabel', 'mac', 'macabr', 'macarth', 'macdonald', 'machin', 'macho', 'macy', 'mad', 'made', 'madm', 'madonn', 'mads', 'mae', 'maf', 'mag', 'magazin', 'maggy', 'magn', 'maid', 'mail', 'main', 'mainstream', 'maintain', 'maj', 'mak', 'makeup', 'mal', 'malon', 'mam', 'man', 'mand', 'mandy', 'mang', 'manhat', 'maniac', 'manifest', 'manip', 'mankind', 'manufact', 'many', 'map', 'mar', 'marc', 'march', 'margaret', 'margin', 'marilyn', 'marin', 'mario', 'mark', 'market', 'marl', 'marlon', 'marqu', 'marry', 'marsh', 'marshal', 'mart', 'marth', 'martin', 'marty', 'marvel', 'marx', 'mary', 'mask', 'masoch', 'mason', 'mass', 'massacr', 'mast', 'masterpiec', 'masterson', 'masturb', 'mat', 'match', 'mathieu', 'matrix', 'matthau', 'matthew', 'maureen', 'max', 'maxim', 'may', 'mayb', 'mayhem', 'mccarthy', 'mccoy', 'mclaglen', 'mcqueen', 'me', 'meadow', 'meal', 'mean', 'meand', 'meaningless', 'meant', 'meantim', 'meanwhil', 'meas', 'meat', 'mech', 'med', 'medicin', 'mediev', 'mediocr', 'medit', 'meek', 'meet', 'meg', 'mel', 'meliss', 'melodram', 'melody', 'melt', 'melvyn', 'mem', 'memb', 'men', 'menac', 'ment', 'mer', 'merc', 'merciless', 'mercy', 'merit', 'mermaid', 'merry', 'meryl', 'mesm', 'mess', 'messy', 'met', 'metaph', 'method', 'mex', 'mexico', 'mey', 'mgm', 'miam', 'mic', 'mich', 'michael', 'michel', 'mick', 'mickey', 'mid', 'middl', 'midget', 'midl', 'midnight', 'midst', 'might', 
'mighty', 'miik', 'mik', 'mil', 'mild', 'mildr', 'milit', 'milk', 'millionair', 'milo', 'mim', 'min', 'mind', 'mindless', 'minim', 'minisery', 'minnell', 'minut', 'mir', 'mirac', 'mirand', 'mis', 'miscast', 'misery', 'misfit', 'misfortun', 'misguid', 'mislead', 'miss', 'missil', 'mist', 'mistak', 'mistress', 'misunderstand', 'misunderstood', 'mitch', 'mitchel', 'mix', 'mixt', 'miyazak', 'mm', 'mob', 'mobl', 'mobst', 'mock', 'mod', 'model', 'modern', 'modest', 'modesty', 'moe', 'mol', 'molest', 'mom', 'moment', 'mon', 'money', 'monit', 'monk', 'monkey', 'monolog', 'monoton', 'monst', 'mont', 'montan', 'month', 'monty', 'monu', 'mood', 'moody', 'moon', 'moor', 'mor', 'morbid', 'more', 'moreov', 'morg', 'mormon', 'morn', 'moron', 'mort', 'moss', 'most', 'mot', 'moth', 'motorcyc', 'mou', 'mount', 'mountain', 'mourn', 'mous', 'mouth', 'mov', 'movie', 'movies', 'movy', 'mr', 'mrs', 'ms', 'mst', 'mtv', 'much', 'muddl', 'mug', 'mult', 'multipl', 'mum', 'mummy', 'mund', 'muppet', 'murd', 'murky', 'murph', 'murray', 'mus', 'musc', 'muse', 'muslim', 'must', 'mut', 'mutil', 'myer', 'myrtl', 'myst', 'mystery', 'myth', 'mytholog', 'nad', 'nail', 'naiv', 'nak', 'nam', 'nant', 'nar', 'narrow', 'naschy', 'nasty', 'nat', 'nata', 'natal', 'nath', 'naughty', 'naus', 'navy', 'naz', 'nbc', 'nd', 'near', 'nearby', 'neat', 'necess', 'neck', 'ned', 'nee', 'needless', 'neg', 'neglect', 'neighb', 'neighbo', 'neil', 'neith', 'nelson', 'nemes', 'neo', 'nephew', 'nerd', 'nerv', 'net', 'netflix', 'network', 'neurot', 'neut', 'nev', 'nevertheless', 'new', 'newcom', 'newm', 'newspap', 'next', 'nic', 'nichola', 'nicholson', 'nick', 'nicol', 'nicola', 'niec', 'night', 'nightclub', 'nightm', 'nin', 'ninj', 'niro', 'niv', 'no', 'nobl', 'nobody', 'nod', 'noir', 'nois', 'nol', 'nolt', 'nomin', 'non', 'nonetheless', 'nonsens', 'nop', 'nor', 'norm', 'northam', 'northern', 'nos', 'nostalg', 'not', 'notch', 'noteworthy', 'noth', 'novak', 'novel', 'now', 'nowaday', 'nowh', 'nuant', 'nuclear', 'nud', 'num', 
'numb', 'nun', 'nurs', 'nut', 'ny', 'nyc', 'object', 'oblig', 'obnoxy', 'obsc', 'observ', 'obsess', 'obstac', 'obtain', 'obvy', 'oc', 'occ', 'occas', 'occult', 'occup', 'occupy', 'occur', 'octob', 'od', 'odyssey', 'off', 'offb', 'offend', 'oft', 'oh', 'oil', 'ok', 'okay', 'ol', 'old', 'oldest', 'oliv', 'olivy', 'olymp', 'om', 'omin', 'omit', 'on', 'one', 'onlin', 'onto', 'op', 'oper', 'opin', 'oppon', 'opportun', 'oppos', 'opposit', 'oppress', 'opt', 'optim', 'or', 'orang', 'orchest', 'ord', 'ordin', 'org', 'orgy', 'origin', 'orl', 'orph', 'orson', 'ory', 'osc', 'oth', 'othello', 'otherw', 'otto', 'ought', 'out', 'outcom', 'outd', 'outdo', 'outfit', 'outland', 'outlaw', 'outlin', 'outright', 'outsid', 'outstand', 'ov', 'over', 'overact', 'overal', 'overblown', 'overboard', 'overcom', 'overdon', 'overlong', 'overlook', 'overshadow', 'overt', 'overwhelm', 'ow', 'owl', 'own', 'oz', 'pac', 'pacino', 'pack', 'pad', 'pag', 'paid', 'pain', 'paint', 'pair', 'pal', 'palac', 'palestin', 'palm', 'paltrow', 'pamel', 'pan', 'pant', 'pap', 'par', 'parad', 'parallel', 'paramount', 'parano', 'paranoid', 'park', 'parody', 'parrot', 'parson', 'part', 'particip', 'particul', 'partn', 'party', 'pass', 'passeng', 'past', 'pat', 'patch', 'path', 'pathet', 'patho', 'patric', 'patrick', 'patriot', 'patron', 'pattern', 'paty', 'pau', 'paul', 'paus', 'paxton', 'pay', 'paycheck', 'payoff', 'pc', 'peac', 'peak', 'pearl', 'peck', 'peculi', 'pedest', 'pee', 'peer', 'peg', 'pen', 'penelop', 'penguin', 'penny', 'peopl', 'people', 'pep', 'per', 'perc', 'perceiv', 'perfect', 'perform', 'perhap', 'peril', 'period', 'perm', 'permit', 'perpet', 'perry', 'person', 'perspect', 'persuad', 'pervers', 'pervert', 'pet', 'petty', 'pfeiff', 'pg', 'phantasm', 'phantom', 'phas', 'phenom', 'phenomenon', 'phil', 'philip', 'phillip', 'philosoph', 'phoenix', 'phon', 'phony', 'photo', 'photograph', 'phrase', 'phys', 'piano', 'pick', 'pickford', 'pict', 'pie', 'piec', 'pier', 'pierc', 'pig', 'pil', 'pilot', 'pin', 
'pink', 'pion', 'pip', 'pir', 'pistol', 'pit', 'pitch', 'pity', 'pivot', 'pix', 'plac', 'place', 'plagu', 'plain', 'plan', 'planet', 'plant', 'plast', 'plat', 'platform', 'plaus', 'play', 'playboy', 'playwright', 'pleas', 'please', 'plenty', 'plight', 'plod', 'plot', 'plu', 'plug', 'plum', 'pocket', 'poe', 'poem', 'poet', 'poetry', 'poign', 'point', 'pointless', 'poison', 'pok', 'pokemon', 'pol', 'polansk', 'policem', 'policy', 'polit', 'pond', 'pool', 'poor', 'pop', 'popcorn', 'popul', 'porn', 'porno', 'pornograph', 'port', 'portrait', 'portray', 'pos', 'posey', 'posit', 'poss', 'possess', 'post', 'pot', 'pound', 'pour', 'poverty', 'pow', 'powel', 'pra', 'pract', 'prank', 'pray', 'pre', 'preach', 'preachy', 'prec', 'precy', 'pred', 'predecess', 'predict', 'pref', 'prefer', 'pregn', 'prejud', 'prem', 'premy', 'prep', 'prepost', 'prequel', 'pres', 'preserv', 'presid', 'press', 'preston', 'presum', 'pretend', 'pretenty', 'pretty', 'prev', 'prevail', 'preview', 'prevy', 'prey', 'pri', 'pric', 'priceless', 'prid', 'priest', 'prim', 'primit', 'princess', 'princip', 'principl', 'print', 'prison', 'priv', 'privileg', 'priz', 'pro', 'prob', 'problem', 'proc', 'process', 'proclaim', 'produc', 'prof', 'profess', 'profil', 'profit', 'profound', 'program', 'progress', 'project', 'prolog', 'prom', 'promin', 'promot', 'prompt', 'pronount', 'proof', 'prop', 'propagand', 'property', 'prophecy', 'prophet', 'proport', 'propos', 'prosecut', 'prospect', 'prostitut', 'protagon', 'protect', 'protest', 'proud', 'prov', 'provid', 'provoc', 'provok', 'ps', 'pseudo', 'psych', 'psycho', 'psycholog', 'psychopa', 'psychot', 'psychy', 'pub', 'publ', 'puerto', 'pul', 'pulp', 'pumba', 'pump', 'pun', 'punch', 'punk', 'puppet', 'puppy', 'pur', 'purchas', 'purpl', 'purpos', 'pursu', 'pursuit', 'push', 'put', 'puzzl', 'python', 'quaid', 'qual', 'quant', 'quart', 'quas', 'queen', 'quentin', 'quest', 'quick', 'quiet', 'quin', 'quintess', 'quirky', 'quit', 'quot', 'rabbit', 'rac', 'rachel', 'rack', 
'rad', 'radio', 'rady', 'rag', 'raid', 'rail', 'rain', 'rainy', 'rais', 'raj', 'ralph', 'ram', 'rambl', 'rambo', 'ramon', 'ramp', 'ran', 'ranch', 'randolph', 'random', 'randy', 'rang', 'rank', 'rant', 'rao', 'rap', 'rapid', 'rapt', 'rar', 'rat', 'rath', 'ratso', 'rav', 'raw', 'ray', 'raymond', 'raz', 'rd', 'rea', 'reach', 'react', 'read', 'ready', 'real', 'really', 'realm', 'rear', 'reason', 'rebel', 'rec', 'recal', 'receiv', 'recit', 'reckless', 'recogn', 'recognit', 'recommend', 'record', 'recov', 'recr', 'recruit', 'recyc', 'red', 'redeem', 'redempt', 'redneck', 'reduc', 'redund', 'ree', 'reel', 'reev', 'ref', 'refer', 'reflect', 'refresh', 'refug', 'refus', 'reg', 'regain', 'regard', 'regardless', 'regim', 'regret', 'regul', 'rehash', 'rehears', 'reid', 'reign', 'reincarn', 'reinforc', 'reis', 'reject', 'rel', 'relax', 'releas', 'relentless', 'relev', 'reliev', 'relig', 'religy', 'reluct', 'rely', 'remad', 'remain', 'remak', 'remark', 'rememb', 'remind', 'reminisc', 'remot', 'remov', 'ren', 'renaiss', 'rend', 'rendit', 'rent', 'rep', 'repetit', 'replac', 'replay', 'reply', 'report', 'repr', 'repres', 'repress', 'republ', 'repuls', 'reput', 'request', 'requir', 'rerun', 'res', 'rescu', 'research', 'resembl', 'reserv', 'resid', 'resist', 'resolv', 'reson', 'resort', 'resourc', 'respect', 'respond', 'respons', 'rest', 'resta', 'restrain', 'restraint', 'restrict', 'result', 'resum', 'resurrect', 'ret', 'retain', 'retard', 'retir', 'retriev', 'retrospect', 'return', 'reun', 'reunit', 'rev', 'revel', 'reveng', 'revers', 'review', 'revisit', 'revolt', 'revolv', 'reward', 'rewrit', 'rex', 'reynold', 'rhym', 'rhythm', 'ric', 'rich', 'richard', 'richardson', 'rick', 'ricky', 'rid', 'riddl', 'ridic', 'riff', 'rifl', 'rig', 'right', 'ring', 'riot', 'rip', 'ripoff', 'ris', 'risk', 'rit', 'ritchy', 'riv', 'rivet', 'road', 'roam', 'roar', 'rob', 'robbery', 'robbin', 'robby', 'robert', 'robertson', 'robin', 'robinson', 'robot', 'rochest', 'rock', 'rocket', 'rocky', 'rod', 
'rog', 'rohm', 'rol', 'rom', 'romeo', 'romero', 'romp', 'ron', 'ronald', 'ronny', 'roof', 'rooky', 'room', 'rooney', 'root', 'rop', 'ros', 'rosario', 'rosem', 'ross', 'rot', 'roth', 'rough', 'round', 'rous', 'rout', 'routin', 'row', 'rowland', 'roy', 'rub', 'ruby', 'rud', 'rug', 'ruin', 'rukh', 'rul', 'rum', 'run', 'runaway', 'rur', 'rush', 'russ', 'russel', 'ruth', 'ruthless', 'ryan', 'sabot', 'sabrin', 'sack', 'sacr', 'sad', 'saddl', 'saf', 'sag', 'said', 'sail', 'saint', 'sak', 'sal', 'salesm', 'salm', 'saloon', 'salt', 'salv', 'sam', 'samanth', 'sammo', 'samura', 'san', 'sand', 'sandl', 'sandr', 'sang', 'sant', 'sap', 'sappy', 'sar', 'sarah', 'sarandon', 'sarcasm', 'sarcast', 'sassy', 'sat', 'satir', 'satisfact', 'satisfy', 'saturday', 'sav', 'saw', 'say', 'scal', 'scan', 'scand', 'scar', 'scarc', 'scarecrow', 'scarfac', 'scariest', 'scarlet', 'scary', 'scat', 'scen', 'scenario', 'scenery', 'schedule', 'scheme', 'schlock', 'schneider', 'school', 'schools', 'sci', 'scif', 'scooby', 'scoop', 'scop', 'scor', 'scorses', 'scot', 'scotland', 'scratch', 'scream', 'screaming', 'screams', 'screen', 'screening', 'screenplay', 'screens', 'screenwriter', 'screenwriters', 'screw', 'screwball', 'screwed', 'script', 'scripted', 'scripting', 'scripts', 'scrooge', 'se', 'sea', 'seag', 'seal', 'sean', 'search', 'season', 'seat', 'sebast', 'sec', 'second', 'secret', 'sect', 'seduc', 'see', 'seedy', 'seek', 'seem', 'seen', 'seg', 'sel', 'seldom', 'select', 'self', 'sem', 'sen', 'send', 'sens', 'senseless', 'sensit', 'sent', 'sentinel', 'senty', 'seny', 'sep', 'septemb', 'sequ', 'sequel', 'ser', 'serb', 'serg', 'serv', 'sery', 'sess', 'set', 'settl', 'setup', 'sev', 'seventy', 'sew', 'sex', 'sexy', 'seymo', 'sf', 'sg', 'sgt', 'sh', 'shad', 'shadow', 'shaggy', 'shah', 'shahid', 'shak', 'shakespear', 'shaky', 'shal', 'shallow', 'sham', 'shameless', 'shangha', 'shap', 'shar', 'shark', 'sharon', 'sharp', 'shat', 'shav', 'shaw', 'she', 'shed', 'sheen', 'sheet', 'shel', 'shelf', 
'shelley', 'shelt', 'shepard', 'shepherd', 'sheriff', 'shield', 'shift', 'shin', 'ship', 'shirley', 'shirt', 'sho', 'shock', 'shoddy', 'shoot', 'shootout', 'shop', 'shor', 'short', 'shortcom', 'shot', 'shotgun', 'should', 'shout', 'shov', 'show', 'showcas', 'showdown', 'shown', 'shut', 'shy', 'sibl', 'sick', 'sid', 'sidekick', 'sidewalk', 'sidney', 'sigh', 'sight', 'sign', 'sil', 'silv', 'sim', 'simil', 'simmon', 'simon', 'simpl', 'simply', 'simpson', 'simult', 'sin', 'sinatr', 'sing', 'singl', 'sink', 'sint', 'sir', 'sirk', 'sissy', 'sist', 'sit', 'sitcom', 'situ', 'six', 'sixteen', 'sixty', 'siz', 'skat', 'skept', 'sketch', 'ski', 'skil', 'skin', 'skinny', 'skip', 'skit', 'skul', 'sky', 'slack', 'slam', 'slap', 'slapstick', 'slash', 'slat', 'slaught', 'slav', 'slay', 'sleaz', 'sleazy', 'sleep', 'sleepwalk', 'slic', 'slick', 'slid', 'slight', 'slightest', 'slim', 'slimy', 'slip', 'slo', 'sloppy', 'slow', 'slug', 'slut', 'sly', 'smack', 'smal', 'smart', 'smash', 'smel', 'smi', 'smil', 'smok', 'smoo', 'smug', 'smuggl', 'snak', 'snap', 'snatch', 'sneak', 'snip', 'snl', 'snob', 'snow', 'snowm', 'snuff', 'so', 'soap', 'sob', 'soc', 'socc', 'socy', 'soderbergh', 'soft', 'sol', 'sold', 'soldy', 'solid', 'solo', 'solv', 'somebody', 'someday', 'somehow', 'someon', 'someth', 'sometim', 'somewh', 'son', 'sondr', 'song', 'sonny', 'soon', 'soph', 'soprano', 'sor', 'sorrow', 'sorry', 'sort', 'sou', 'sought', 'soul', 'sound', 'soundtrack', 'soup', 'sour', 'sourc', 'southern', 'soviet', 'sox', 'soyl', 'spac', 'spacey', 'spad', 'spaghett', 'spain', 'span', 'spar', 'spark', 'sparkl', 'spawn', 'speak', 'spear', 'spec', 'spect', 'spectac', 'spectacul', 'specy', 'spee', 'speech', 'spel', 'spend', 'spent', 'spi', 'spic', 'spid', 'spielberg', 'spik', 'spil', 'spin', 'spir', 'spirit', 'spit', 'splatter', 'splendid', 'split', 'spock', 'spoil', 'spok', 'spont', 'spoof', 'spooky', 'spoon', 'sport', 'spot', 'spotlight', 'spout', 'spread', 'spree', 'spring', 'springer', 'spy', 'squ', 'squad', 
'squeez', 'st', 'stab', 'stabl', 'stack', 'stad', 'staff', 'stag', 'stair', 'stak', 'stal', 'stalk', 'stallon', 'stamp', 'stan', 'stand', 'standard', 'standout', 'stanley', 'stant', 'stanwyck', 'star', 'stardom', 'stardust', 'starg', 'stark', 'start', 'startl', 'starv', 'stat', 'statu', 'stay', 'ste', 'steady', 'steam', 'steel', 'stell', 'step', 'steph', 'stephany', 'stereotyp', 'sterl', 'stern', 'stev', 'stewart', 'stick', 'stiff', 'stil', 'stilt', 'stim', 'stink', 'stir', 'stock', 'stol', 'stomach', 'ston', 'stood', 'stoog', 'stop', 'stor', 'storm', 'story', 'storylin', 'storytel', 'straight', 'straightforward', 'stranded', 'strange', 'strangely', 'stranger', 'strangers', 'stream', 'streep', 'street', 'streets', 'streisand', 'strength', 'strengths', 'stress', 'stretch', 'stretched', 'strict', 'strictly', 'strike', 'strikes', 'striking', 'string', 'strings', 'strip', 'stroke', 'strong', 'stronger', 'strongest', 'strongly', 'struck', 'structure', 'struggle', 'struggles', 'struggling', 'stuart', 'stuck', 'stud', 'studio', 'study', 'stuff', 'stumbl', 'stun', 'stunt', 'stupid', 'styl', 'sub', 'subject', 'sublim', 'submarin', 'submit', 'subplot', 'subsequ', 'subst', 'substitut', 'subt', 'subtext', 'subtitl', 'subtl', 'suburb', 'subvert', 'subway', 'success', 'suck', 'sud', 'sue', 'suff', 'sufficy', 'sug', 'suggest', 'suicid', 'suit', 'sul', 'sum', 'summ', 'sun', 'sund', 'sunday', 'sung', 'sunk', 'sunny', 'sunr', 'sunset', 'sunshin', 'sup', 'superb', 'superbl', 'superf', 'superhero', 'superm', 'supern', 'superst', 'supery', 'supply', 'support', 'suppos', 'suppress', 'suprem', 'sur', 'surf', 'surfac', 'surgery', 'surpass', 'surpr', 'surrend', 'surround', 'surv', 'sus', 'susp', 'suspect', 'suspend', 'suspens', 'suspicy', 'sustain', 'sutherland', 'swallow', 'sway', 'swe', 'swear', 'swed', 'sweep', 'sweet', 'swept', 'swift', 'swim', 'swing', 'switch', 'sword', 'sydney', 'symbol', 'sympath', 'sympathet', 'sympathy', 'syndrom', 'synops', 'system', 'tabl', 'taboo', 'tack', 
'tackl', 'tacky', 'tact', 'tad', 'tag', 'tail', 'tak', 'tal', 'talk', 'talky', 'tam', 'tang', 'tank', 'tap', 'tar', 'tarantino', 'target', 'tarz', 'task', 'tast', 'tasteless', 'tat', 'tattoo', 'taught', 'tax', 'tayl', 'tcm', 'tea', 'teach', 'team', 'tear', 'teas', 'tech', 'techn', 'technicol', 'technolog', 'ted', 'tedy', 'tee', 'teen', 'tel', 'televid', 'temp', 'templ', 'tempt', 'ten', 'tend', 'tens', 'tent', 'ter', 'term', 'termin', 'terr', 'territ', 'terry', 'test', 'testa', 'texa', 'text', 'th', 'thank', 'that', 'the', 'thelm', 'them', 'therapy', 'there', 'theref', 'thick', 'thief', 'thiev', 'thin', 'thing', 'think', 'thinking', 'third', 'thirty', 'this', 'tho', 'thoma', 'thompson', 'thorn', 'thorough', 'though', 'thought', 'thousand', 'thread', 'threat', 'threatened', 'threatening', 'threatens', 'three', 'threw', 'thrill', 'thrilled', 'thriller', 'thrillers', 'thrilling', 'thrills', 'throat', 'throughout', 'throw', 'throwing', 'thrown', 'throws', 'thru', 'thu', 'thug', 'thumb', 'thund', 'thunderbird', 'thurm', 'tick', 'ticket', 'tid', 'tie', 'tied', 'tierney', 'tig', 'tight', 'til', 'tim', 'timberlak', 'time', 'timeless', 'timmy', 'timon', 'timothy', 'tin', 'tiny', 'tip', 'tir', 'tiresom', 'tit', 'titl', 'toby', 'tod', 'today', 'toe', 'togeth', 'toilet', 'tok', 'tokyo', 'tol', 'told', 'tom', 'tomato', 'tomb', 'tome', 'tommy', 'tomorrow', 'ton', 'tongu', 'tonight', 'tony', 'too', 'took', 'tool', 'top', 'topless', 'tor', 'torch', 'torn', 'toronto', 'tort', 'toss', 'tot', 'touch', 'tough', 'tour', 'tow', 'toward', 'town', 'toy', 'trac', 'track', 'tracy', 'trad', 'trademark', 'tradit', 'traff', 'trag', 'tragedy', 'trail', 'train', 'trait', 'tramp', 'transcend', 'transf', 'transform', 'transit', 'transl', 'transmit', 'transp', 'transpl', 'transport', 'trap', 'trash', 'trashy', 'traum', 'trav', 'travel', 'travesty', 'tre', 'treas', 'trek', 'tremend', 'trend', 'tri', 'triangl', 'trib', 'tribut', 'trick', 'trig', 'trilog', 'trio', 'trip', 'tripl', 'trit', 'triumph', 
'triv', 'trom', 'troop', 'troubl', 'tru', 'truck', 'trum', 'trust', 'truth', 'try', 'tub', 'tuck', 'tun', 'tunnel', 'turd', 'turk', 'turkey', 'turmoil', 'turn', 'turtl', 'tv', 'twelv', 'twenty', 'twic', 'twilight', 'twin', 'twist', 'two', 'tyl', 'typ', 'ug', 'ugh', 'uh', 'uk', 'ultim', 'ultimat', 'ultr', 'um', 'un', 'unansw', 'unattract', 'unaw', 'unbear', 'unbeliev', 'unc', 'uncanny', 'uncomfort', 'unconscy', 'unconv', 'unconvint', 'uncov', 'uncut', 'und', 'undead', 'undeny', 'under', 'underground', 'undermin', 'undernea', 'underst', 'understand', 'understood', 'undertak', 'underw', 'underwear', 'underworld', 'undoubt', 'uneasy', 'unev', 'unexpect', 'unexplain', 'unfair', 'unfold', 'unforg', 'unforget', 'unfortun', 'unfunny', 'unhappy', 'uniform', 'unimagin', 'uninspir', 'unint', 'uninterest', 'unit', 'univers', 'unknown', 'unleash', 'unless', 'unlik', 'unnecess', 'unorigin', 'unpleas', 'unpredict', 'unr', 'unravel', 'unrel', 'uns', 'unsatisfy', 'unseen', 'unsettl', 'unst', 'unsuspect', 'unsympathet', 'unus', 'unw', 'unwatch', 'unwil', 'up', 'upcom', 'upd', 'uplift', 'upon', 'upset', 'upsid', 'urb', 'urg', 'us', 'useless', 'ustinov', 'ut', 'util', 'uw', 'vac', 'vacu', 'vad', 'vagu', 'vain', 'val', 'valentin', 'valid', 'valley', 'valu', 'vampir', 'van', 'vaness', 'vanill', 'vant', 'vary', 'vast', 'vault', 'veg', 'vega', 'vehic', 'vein', 'ven', 'venezuel', 'veng', 'venom', 'vent', 'ver', 'verb', 'verdict', 'verg', 'verhoev', 'vers', 'versatil', 'vert', 'vet', 'vhs', 'via', 'vib', 'vibr', 'vic', 'vict', 'victim', 'victor', 'vicy', 'vid', 'video', 'vietnam', 'view', 'viewpoint', 'vigil', 'vignet', 'vil', 'villain', 'vint', 'viol', 'vir', 'virgin', 'virt', 'virtu', 'vis', 'viscont', 'visit', 'vit', 'viv', 'vivid', 'voc', 'voic', 'void', 'voight', 'volum', 'volunt', 'vomit', 'von', 'vonnegut', 'vot', 'voy', 'vs', 'vulg', 'vuln', 'wacky', 'wag', 'wagn', 'wait', 'waitress', 'wak', 'wal', 'walk', 'wallac', 'walsh', 'walt', 'wan', 'wand', 'wang', 'wann', 'wannab', 'want', 
'war', 'ward', 'wardrob', 'warhol', 'warm', 'warn', 'warp', 'warry', 'was', 'wash', 'washington', 'wast', 'wat', 'watch', 'watson', 'wav', 'wax', 'way', 'wayn', 'weak', 'weakest', 'weal', 'wealthy', 'weapon', 'wear', 'weary', 'weath', 'weav', 'web', 'websit', 'wed', 'wee', 'week', 'weekend', 'weight', 'weird', 'wel', 'welcom', 'well', 'wendigo', 'wendy', 'went', 'werewolf', 'werewolv', 'wes', 'west', 'western', 'wet', 'whack', 'whal', 'what', 'whatev', 'whatsoev', 'wheel', 'wheelchair', 'whenev', 'wherea', 'wheth', 'whilst', 'whin', 'whiny', 'whip', 'whistl', 'whit', 'who', 'whoev', 'whol', 'wholesom', 'whoop', 'whor', 'whos', 'why', 'wick', 'wid', 'widescreen', 'widmark', 'widow', 'wield', 'wif', 'wig', 'wil', 'wild', 'william', 'wilson', 'win', 'winchest', 'wind', 'window', 'wing', 'wint', 'wip', 'wir', 'wis', 'wisdom', 'wish', 'wit', 'witch', 'witchcraft', 'within', 'without', 'witty', 'wiv', 'wizard', 'woe', 'wolf', 'wom', 'wond', 'wonderland', 'wong', 'wont', 'woo', 'wood', 'woody', 'wor', 'word', 'work', 'world', 'worm', 'worn', 'worry', 'wors', 'worst', 'worthless', 'worthwhil', 'worthy', 'would', 'wound', 'wow', 'wrap', 'wreck', 'wrench', 'wrestl', 'wretch', 'wright', 'writ', 'wrong', 'wrot', 'wtf', 'ww', 'wwe', 'wwi', 'www', 'ya', 'yank', 'yard', 'yawn', 'ye', 'yeah', 'year', 'yearn', 'years', 'yel', 'yellow', 'yep', 'yesterday', 'yet', 'yoka', 'york', 'you', 'young', 'youngest', 'youngst', 'youth', 'youtub', 'zan', 'zen', 'zero', 'zizek', 'zomb', 'zomby', 'zon', 'zoom', 'zorro']

Predicting the results

Classifying with a random forest classifier

from sklearn.ensemble import RandomForestClassifier

# Train a random forest of 100 trees on the bag-of-words features
forest = RandomForestClassifier(n_estimators=100)
forest = forest.fit(train_data_features, train["sentiment"])

Writing the submission file

# Read the test set and clean each review the same way as the training data
test = pd.read_csv("./data/testData.tsv", header=0, delimiter="\t", quoting=3)
num_reviews = len(test["review"])
clean_test_reviews = []

for i in range(num_reviews):
    clean_review = review_to_words(test["review"][i])
    clean_test_reviews.append(clean_review)

# Use transform (not fit_transform) so the test features share the training vocabulary
test_data_features = vectorizer.transform(clean_test_reviews)
test_data_features = test_data_features.toarray()

# Predict sentiment labels and write them in the submission format
result = forest.predict(test_data_features)
output = pd.DataFrame(data={"id": test["id"], "sentiment": result})
output.to_csv("Bag_of_Words_model.csv", index=False, quoting=3)
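
Why the test reviews go through `transform` rather than a fresh fit can be seen in a small pure-Python sketch. This is not how `CountVectorizer` is implemented; the helper names `fit_vocabulary` and `transform` below are hypothetical, and tokenizing by whitespace is a simplifying assumption. It only illustrates that the column order is fixed by the training vocabulary and that words never seen in training are dropped.

```python
from collections import Counter


def fit_vocabulary(documents):
    """Collect the sorted set of tokens seen in the training documents."""
    return sorted({token for doc in documents for token in doc.split()})


def transform(documents, vocabulary):
    """Count vocabulary tokens per document; tokens outside the vocabulary are ignored."""
    rows = []
    for doc in documents:
        counts = Counter(doc.split())
        # A Counter returns 0 for tokens that do not occur in the document
        rows.append([counts[token] for token in vocabulary])
    return rows


# Toy data (made up for illustration only)
train_docs = ["good film good cast", "bad film"]
test_docs = ["good plot bad bad"]

vocabulary = fit_vocabulary(train_docs)
train_rows = transform(train_docs, vocabulary)
test_rows = transform(test_docs, vocabulary)

print(vocabulary)
print(train_rows)
print(test_rows)
```

Here "plot" appears only in the test document, so it contributes no column. The train and test matrices end up with the same width, which the classifier needs at prediction time.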

Trying XGBoost

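from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split

# Hold out 20% of the training data as a validation set for early stopping
X_train, X_test, y_train, y_test = train_test_split(train_data_features, train["sentiment"], test_size=0.2)
model = XGBClassifier()
eval_set = [(X_test, y_test)]
# Stop once validation logloss has not improved for 10 rounds.
# Note: passing early_stopping_rounds and eval_metric to fit() was deprecated in XGBoost 1.6
# and removed in 2.0; on newer releases, pass them to the XGBClassifier constructor instead.
model.fit(X_train, y_train, early_stopping_rounds=10, eval_metric="logloss", eval_set=eval_set, verbose=True)

# Predict on the Kaggle test set and write the submission file
result = model.predict(test_data_features)
output = pd.DataFrame(data={"id": test["id"], "sentiment": result})
output.to_csv("xgbBag_of_Words_model.csv", index=False, quoting=3)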
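
To help read the training log that follows, here is a minimal sketch of the binary logloss metric in plain Python. The function name `log_loss` and the toy labels and probabilities are assumptions for illustration; XGBoost computes the metric internally. A model that predicts 0.5 for every review scores ln 2 ≈ 0.693, so the first logged value of 0.676 is only slightly better than an uninformed guess.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip probabilities so log() never receives exactly 0 or 1
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Toy labels and predictions (made up for illustration only)
labels = [1, 0, 1, 0]
uninformed = log_loss(labels, [0.5, 0.5, 0.5, 0.5])
confident = log_loss(labels, [0.9, 0.1, 0.8, 0.2])

print(uninformed)
print(confident)
```

Lower is better: the metric falls as the predicted probabilities move toward the true labels, which is the trend the log shows round by round.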
[0]	validation_0-logloss:0.676219
Will train until validation_0-logloss hasn't improved in 10 rounds.
[1]	validation_0-logloss:0.662174
[2]	validation_0-logloss:0.651057
[3]	validation_0-logloss:0.640874
[4]	validation_0-logloss:0.632644
[5]	validation_0-logloss:0.625076
[6]	validation_0-logloss:0.617619
[7]	validation_0-logloss:0.611682
[8]	validation_0-logloss:0.605249
[9]	validation_0-logloss:0.599587
[10]	validation_0-logloss:0.594965
[11]	validation_0-logloss:0.589799
[12]	validation_0-logloss:0.585117
[13]	validation_0-logloss:0.580564
[14]	validation_0-logloss:0.576377
[15]	validation_0-logloss:0.572584
[16]	validation_0-logloss:0.568511
[17]	validation_0-logloss:0.565177
[18]	validation_0-logloss:0.561793
[19]	validation_0-logloss:0.558281
[20]	validation_0-logloss:0.55503
[21]	validation_0-logloss:0.552451
[22]	validation_0-logloss:0.549323
[23]	validation_0-logloss:0.546664
[24]	validation_0-logloss:0.544006
[25]	validation_0-logloss:0.54108
[26]	validation_0-logloss:0.538433
[27]	validation_0-logloss:0.535872
[28]	validation_0-logloss:0.533465
[29]	validation_0-logloss:0.5312
[30]	validation_0-logloss:0.528723
[31]	validation_0-logloss:0.526622
[32]	validation_0-logloss:0.524268
[33]	validation_0-logloss:0.522295
[34]	validation_0-logloss:0.519956
[35]	validation_0-logloss:0.518042
[36]	validation_0-logloss:0.515848
[37]	validation_0-logloss:0.514131
[38]	validation_0-logloss:0.512278
[39]	validation_0-logloss:0.510431
[40]	validation_0-logloss:0.508723
[41]	validation_0-logloss:0.506938
[42]	validation_0-logloss:0.505074
[43]	validation_0-logloss:0.50362
[44]	validation_0-logloss:0.501969
[45]	validation_0-logloss:0.500489
[46]	validation_0-logloss:0.499067
[47]	validation_0-logloss:0.497414
[48]	validation_0-logloss:0.496192
[49]	validation_0-logloss:0.494645
[50]	validation_0-logloss:0.493216
[51]	validation_0-logloss:0.49187
[52]	validation_0-logloss:0.490369
[53]	validation_0-logloss:0.489028
[54]	validation_0-logloss:0.487349
[55]	validation_0-logloss:0.486212
[56]	validation_0-logloss:0.485081
[57]	validation_0-logloss:0.483909
[58]	validation_0-logloss:0.482761
[59]	validation_0-logloss:0.481767
[60]	validation_0-logloss:0.480625
[61]	validation_0-logloss:0.479329
[62]	validation_0-logloss:0.478402
[63]	validation_0-logloss:0.477328
[64]	validation_0-logloss:0.476377
[65]	validation_0-logloss:0.475029
[66]	validation_0-logloss:0.473751
[67]	validation_0-logloss:0.472692
[68]	validation_0-logloss:0.471596
[69]	validation_0-logloss:0.470421
[70]	validation_0-logloss:0.469413
[71]	validation_0-logloss:0.468299
[72]	validation_0-logloss:0.467431
[73]	validation_0-logloss:0.466318
[74]	validation_0-logloss:0.465558
[75]	validation_0-logloss:0.464642
[76]	validation_0-logloss:0.463728
[77]	validation_0-logloss:0.462841
[78]	validation_0-logloss:0.46207
[79]	validation_0-logloss:0.461132
[80]	validation_0-logloss:0.460134
[81]	validation_0-logloss:0.45898
[82]	validation_0-logloss:0.458173
[83]	validation_0-logloss:0.457472
[84]	validation_0-logloss:0.456591
[85]	validation_0-logloss:0.456256
[86]	validation_0-logloss:0.455629
[87]	validation_0-logloss:0.454958
[88]	validation_0-logloss:0.454081
[89]	validation_0-logloss:0.453485
[90]	validation_0-logloss:0.452779
[91]	validation_0-logloss:0.452121
[92]	validation_0-logloss:0.45126
[93]	validation_0-logloss:0.450549
[94]	validation_0-logloss:0.450048
[95]	validation_0-logloss:0.44925
[96]	validation_0-logloss:0.448478
[97]	validation_0-logloss:0.447839
[98]	validation_0-logloss:0.447183
[99]	validation_0-logloss:0.446421
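
To put the logged numbers in context, here is a minimal sketch of how binary logloss is computed. It is a plain-Python illustration rather than XGBoost's own implementation; the function name binary_logloss and the sample labels and probabilities are made up for this example. A model that always predicts 0.5 scores ln 2 (about 0.693), which is close to where the log starts at round 0.

```python
import math


def binary_logloss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of 0/1 labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip so that log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical labels and predictions, for illustration only
labels = [1, 0, 1, 0]
uninformed = binary_logloss(labels, [0.5, 0.5, 0.5, 0.5])  # equals ln 2
confident = binary_logloss(labels, [0.9, 0.1, 0.8, 0.2])   # correct and confident

print(uninformed, confident)
```

Lower is better: confident, correct probabilities push the value toward 0, while confident wrong ones are penalized heavily.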

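The random forest still works better

The XGBoost run above ended at round 99 with the validation logloss still falling, so the 10-round early-stopping patience never ran out. With default settings the model may simply not have finished converging.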
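
The message "Will train until validation_0-logloss hasn't improved in 10 rounds" in the log describes a patience rule. The sketch below is a simplified model of that rule, not XGBoost's actual code; early_stopping_round is a hypothetical helper, and the plateau values are invented. The first list reuses the first five values from the log above.

```python
def early_stopping_round(losses, patience):
    """Return the round at which training would stop, or None if patience never runs out."""
    best = float("inf")
    best_round = -1
    for i, loss in enumerate(losses):
        if loss < best:
            # New best validation loss: remember it and reset the patience window
            best, best_round = loss, i
        elif i - best_round >= patience:
            # No improvement for `patience` consecutive rounds
            return i
    return None


# First five validation losses from the log above: always improving
improving = [0.676219, 0.662174, 0.651057, 0.640874, 0.632644]
# Hypothetical losses that stop improving after round 1
plateau = [0.50, 0.48, 0.49, 0.49, 0.50]

stop_improving = early_stopping_round(improving, patience=10)
stop_plateau = early_stopping_round(plateau, patience=3)
print(stop_improving, stop_plateau)
```

Because every round in the log improves on the previous one, a rule like this would never fire, and training stops only when the configured number of boosting rounds is used up.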


Published by 全栈程序员-用户IM. When reposting, please credit the source: https://javaforall.cn/143707.html Original link: https://javaforall.cn
