text-words

acronyms-defined-dict: a dictionary of defined acronyms from technical jargon
adjective-words-list: a list of various English adjective words
alfred-hitchcock-movies: all films produced by Alfred Hitchcock, from http://infolab.stanford.edu/pub/movies/Hitch.html
buildings-word-list: list of words related to buildings from http://domainsbot.com/Content/Data/terms/buildings.txt
curse-words-list: Warning! list of vulgar (i.e. "curse") words
english-top-1000: 1,000 most used words in the English language
english-words-various: various English words taken from some old ZIP files
etc-anonymizer-names: copy of Splunk's etc/anonymizer/names.txt
female-names-list: alphabetized and capitalized list of female names
first-names-list: list of first names often used by people
first20hours-google-20k: 20,000 words parsed from Google, from https://github.com/first20hours/google-10000-english/blob/master/20k.txt
fortune-global500-list: Fortune Global 500 companies list with rank, company name, and revenue, from http://fortune.com/global500/
geographic-stop-words: geographical stop words, i.e. words that are ignored during natural language processing
ieee-journal-names: a list of journals published by the IEEE
infosec-glossary-terms: glossary of information security terminology copied from RFC 4949: https://tools.ietf.org/html/rfc4949
jargon-common-words: common information technology jargon words
jargon-common-bases: common base words in information technology jargon
last-names-list: list of last names often used by people in the Americas
linux-words-dict: words taken from /usr/share/dict/words on a Linux install
longest-english-words: alphabetized list of English words that are longer than twenty letters
mrrobot-season3-subtitles: subtitles for season three of the Mr. Robot television series, from https://www.podnapisi.net/subtitles/search/mr-robot-2015/SOM?seasons=3
not-found-translations: "Not Found" translations on iBiblio
obama-nobel-speech: Barack Obama's 2009 Nobel Peace Prize award acceptance speech
objects-name-list: listing that contains names of various objects from http://domainsbot.com/Content/Data/terms/objects.txt
occupations-frequency-list: list of occupations sorted by numeric score (higher means more popular), from http://sunlight.s3.amazonaws.com/all_occupations.txt
one-hundred-thousand: the numbers 1 through 97935, one per line
reliable-passgen-wordlist: wordlist.txt from BURP
rogets-thesaurus-ebook: Roget's Thesaurus EBook from Project Gutenberg
secureblackbox-client-list: brief list of corporations with household names, from https://www.secureblackbox.com/company/clients.aspx
sfbay-companies-list: list of companies based in the San Francisco Bay Area
spike-proxy-allwords: dictionary distributed with ImmunitySec SPIKE Proxy dpkg
technical-manual-words: various forms of technical root words likely to be found in a manual from http://scrapmaker.com/data/wordlists/technology/TechnicalManualWords(1495).txt
tweet-word-ngrams: binary sequence and words parsed from tweets with cardinality
usgovt-manual-acronyms: U.S. Government Manual Commonly Used Agency Acronyms
word-cluster-ngrams: word clusters parsed from the text in millions of tweets
yo-momma-jokes: over one thousand crude "yo momma!" jokes
zoo-animal-list: an alphabetically sorted list of names for zoo-kept animals
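
The geographic-stop-words entry above describes stop words as words that are ignored during natural language processing. A minimal Python sketch of that filtering step follows. It assumes a list is stored as plain text with one entry per line, which the listing states only for one-hundred-thousand; the function names `load_word_list` and `remove_stop_words` are hypothetical and not part of this folder.

```python
from pathlib import Path
from typing import Iterable


def load_word_list(path: Path) -> list[str]:
    """Read a word list, assuming one entry per line; blank lines are skipped."""
    text = path.read_text(encoding="utf-8", errors="replace")
    return [line.strip() for line in text.splitlines() if line.strip()]


def remove_stop_words(tokens: Iterable[str], stop_words: Iterable[str]) -> list[str]:
    """Drop every token that appears in the stop-word list, ignoring case."""
    stop = {word.lower() for word in stop_words}
    return [token for token in tokens if token.lower() not in stop]


if __name__ == "__main__":
    # Inline sample data stands in for a real list file (illustrative only).
    sample_stop_words = ["north", "south", "street"]
    sample_tokens = ["North", "Beach", "Street", "Pier"]
    print(remove_stop_words(sample_tokens, sample_stop_words))
```

To use a list from this folder instead of the inline sample, pass the result of `load_word_list` on the relevant file as `stop_words`, after checking the file's actual name and layout.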