werdlists/text-words

      Folder  Name       Description of Contents
acronyms-defined-dict dictionary of acronyms defined from technical jargon
adjectives-abuse-list list of adjectives that [RobinAbuseBot](https://github.com/llamasoft/RobinAbuseBot "A random insult generating abuse bot for Reddit's Robin chat.") constructs insults with via https://github.com/llamasoft/RobinAbuseBot/blob/master/RobinAbuseBot.user.js
adjective-words-list list of various English adjective words
alfred-hitchcock-movies all films produced by Alfred Hitchcock http://infolab.stanford.edu/pub/movies/Hitch.html
bbs-subjects-list list of forum topics from an electronic bulletin board system
book-titles-list Loosely formatted list of book titles
buildings-word-list list of words related to buildings from http://domainsbot.com/Content/Data/terms/buildings.txt
business-noun-words a list of words that can be categorized as business-related nouns
common-english-words a brief listing of the most commonly used words in English (the top 30)
curse-words-list Warning! list of vulgar (i.e. "curse") words
elasticlunr-default-stop list of ElasticLunr stop words from http://elasticlunr.com/docs/stop_word_filter.js.html
english-connective-words list of "connective" English words--these are words that can often be dropped from simple search queries
english-top-1000 1,000 most used words in the English language
english-top-1500 1,500 most used words in the English language taken from htpwdScan
english-words-various various English words taken from some old ZIP files
espionage-techniques-list an alphabetized list of espionage techniques
etc-anonymizer-names copy of Splunk's etc/anonymizer/names.txt
female-names-list alphabetized and capitalized list of female names
first-names-list list of first names often used by people
first20hours-google-20k 20,000 words parsed from Google https://github.com/first20hours/google-10000-english/blob/master/20k.txt
fortune-global500-list Fortune Global 500 companies list with rank, company name, and revenue: http://fortune.com/global500/
geographic-stop-words geographical stop words, i.e. words that are ignored during natural language processing
indefinite-nouns-words a list of indefinite nouns--in other words, nouns that are not "proper" nouns and therefore do not need to be capitalized: https://gist.githubusercontent.com/gardner/25d36eea91523d5a30d3e5197c6cc2b3/raw/a42ac049336b388674ecd1f1f37dd2f0cbd02ae7/nouns.txt
international-address-list various samples of worldwide postal addresses
ieee-journal-names a list of journals published by the IEEE
infosec-glossary-terms glossary of information security terminology copied from RFC4949: https://tools.ietf.org/html/rfc4949
jargon-common-words common information technology jargon words
jargon-common-bases common base words in information technology jargon
keyword-ideas-generator list constructed from the words shown by keywordideasgenerator.com
last-names-list list of last names often used by people in the Americas
linux-words-dict words taken from /usr/share/dict/words on Linux install
longest-english-words alphabetized list of English words that are longer than twenty letters
mrrobot-season3-subtitles Subtitles for season three of the Mr. Robot television series https://www.podnapisi.net/subtitles/search/mr-robot-2015/SOM?seasons=3
multi-lingual-vernacular Popular vernacular from common languages along with associated numeric rating of each
not-found-translations "Not Found" translations on iBiblio
nouns-abuse-list.txt list of nouns that RobinAbuseBot constructs insults with via https://github.com/llamasoft/RobinAbuseBot/blob/master/RobinAbuseBot.user.js
obama-nobel-speech Barack Obama's 2009 Nobel Peace Prize award acceptance speech
objects-name-list listing that contains names of various objects from http://domainsbot.com/Content/Data/terms/objects.txt
occupations-frequency-list list of occupations sorted by numeric score--higher means more popular from http://sunlight.s3.amazonaws.com/all_occupations.txt
one-hundred-thousand the numbers 1-97935 with one on each line
phrack-acronyms-metalshopprivate Phrack
reliable-passgen-wordlist wordlist.txt from BURP
rogets-thesaurus-ebook Roget's Thesaurus EBook from Project Gutenberg
sdbf-count-1edit single character edits file packaged with Smart DNS Brute Forcer
sdbf-count-1w individual word frequency counts packaged with Smart DNS Brute Forcer
sdbf-count-2l double letter sequence frequency counts packaged with Smart DNS Brute Forcer
sdbf-count-2w double word sequence frequency counts packaged with Smart DNS Brute Forcer
sdbf-count-3l triple letter sequence frequency counts packaged with Smart DNS Brute Forcer
sdbf-count-big big word frequency counts packaged with Smart DNS Brute Forcer
search-stop-words commonly used words that a search engine will be programmed to ignore
secureblackbox-client-list brief list of corporations with household names https://www.secureblackbox.com/company/clients.aspx
security-words-dictionary dictionary of some real, but mostly made-up, security words created manually by yours truly
sfbay-companies-list List of companies based in the San Francisco Bay Area
sierrasoftworks-bender-quotes list of Bender quotes via https://raw.githubusercontent.com/SierraSoftworks/bender/master/configs/quotes.json
spike-proxy-allwords dictionary distributed with ImmunitySec SPIKE Proxy dpkg
technical-manual-words various forms of technical root words likely to be found in a manual from http://scrapmaker.com/data/wordlists/technology/TechnicalManualWords(1495).txt
tweet-word-ngrams binary sequence and words parsed from tweets with cardinality
unicode-words-list list of strings with some containing Unicode characters via https://wing.comp.nus.edu.sg/~forecite/services/keyphrase/lib/x/Porter/output.txt
usenet-name-strings strings parsed from USENET names
usgovt-manual-acronyms U.S. Government Manual Commonly Used Agency Acronyms
various-vocabulary-sorted Various English words in lowercase and sorted
word-cluster-ngrams word clusters parsed from the text in millions of tweets
worker-death-cases sentences sorted by increasing length, each of which details a unique case of worker death
worker-death-summaries2017 Occupational Safety and Health Administration Archive Reports of Fatalities and Catastrophes
yo-momma-jokes over one thousand very crude "yo momma!" jokes
zoo-animal-list an alphabetically sorted list of names for zoo-kept animals
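
Several of the folders above (search-stop-words, english-connective-words, geographic-stop-words) hold words that are meant to be dropped from search queries. A minimal sketch of how such a list might be used follows. It assumes each list is a plain-text file with one entry per line, which this listing does not confirm for every folder; the file path in the usage comment and the function names `load_word_list` and `drop_stop_words` are hypothetical and not part of this repository.

```python
from pathlib import Path
from typing import Iterable


def load_word_list(path: Path) -> set[str]:
    """Read a one-entry-per-line file into a lowercase set, skipping blank lines.

    Assumption: the file is plain text with one word per line.
    """
    with path.open(encoding="utf-8", errors="replace") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def drop_stop_words(query: str, stop_words: Iterable[str]) -> list[str]:
    """Return the whitespace-separated tokens of `query` that are not stop words.

    Matching is case-insensitive; the original casing of kept tokens is preserved.
    """
    stop = {word.lower() for word in stop_words}
    return [token for token in query.split() if token.lower() not in stop]


# Usage with a hypothetical path (the actual file name inside the folder may differ):
#   stop_words = load_word_list(Path("text-words/search-stop-words/stop-words.txt"))
#   drop_stop_words("the history of the zoo", stop_words)
```

With an in-memory list, `drop_stop_words("The history of the zoo", {"the", "of"})` returns `["history", "zoo"]`. Lowercasing on load keeps the lookup case-insensitive without altering the tokens that are kept.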