"If it is a search engine, then it can be parsed" - Some random guy
A package that queries popular search engines and scrapes the result titles, links and descriptions. It aims to support the widest possible range of search engines. View all supported engines.
Some of the popular search engines include:
- DuckDuckGo
- GitHub
- StackOverflow
- Baidu
- YouTube
# install only package dependencies
pip install search-engine-parser
# Installs `pysearch` cli tool
pip install "search-engine-parser[cli]"
Clone the repository
git clone git@github.com:bisoncorps/search-engine-parser.git
Create a virtual environment and install the requirements
mkvirtualenv search_engine_parser
pip install -r requirements/dev.txt
Documentation can be found on Read the Docs.
Run the tests with pytest:
pytest
Query results can be scraped from popular search engines, as shown in the example snippet below:
import pprint

from search_engine_parser import YahooSearch, GoogleSearch, BingSearch

# query string and page number, passed to every engine
search_args = ('preaching to the choir', 1)
gsearch = GoogleSearch()
ysearch = YahooSearch()
bsearch = BingSearch()
gresults = gsearch.search(*search_args)
yresults = ysearch.search(*search_args)
bresults = bsearch.search(*search_args)
results = {
    "Google": gresults,
    "Yahoo": yresults,
    "Bing": bresults,
}

# pretty print the result from each engine
for k, v in results.items():
    print(f"-------------{k}------------")
    pprint.pprint(v)

# print first title from google search
print(gresults["titles"][0])
# print 10th link from yahoo search
print(yresults["links"][9])
# print 6th description from bing search
print(bresults["descriptions"][5])
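The indexed lookups above raise an IndexError when an engine returns fewer results than expected. Below is a minimal sketch of a defensive accessor. It assumes, as in the snippet above, that a search returns a dict-like object mapping "titles", "links" and "descriptions" to lists. The helper name nth_result is hypothetical and not part of the package, and the sample data is a stand-in rather than real engine output.

```python
from typing import Any, Mapping, Optional, Sequence


def nth_result(results: Mapping[str, Sequence[Any]], key: str, index: int) -> Optional[Any]:
    """Return results[key][index], or None when the key or index is missing."""
    values = results.get(key, [])
    if 0 <= index < len(values):
        return values[index]
    return None


# Stand-in data shaped like the results in the snippet above (not real output).
sample = {
    "titles": ["First title", "Second title"],
    "links": ["https://example.com/1", "https://example.com/2"],
    "descriptions": ["First description", "Second description"],
}

print(nth_result(sample, "titles", 0))  # First title
print(nth_result(sample, "links", 9))   # None, only two links are available
```

With real results, a call such as nth_result(yresults, "links", 9) would return None instead of raising when Yahoo returns fewer than ten links.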
Search engine parser comes with a CLI tool known as pysearch.
For example:
pysearch --engine bing search --query "Preaching to the choir" --type descriptions
Result:
'Preaching to the choir' originated in the USA in the 1970s. It is a variant of the earlier 'preaching to the converted', which dates from England in the late 1800s and has the same meaning. Origin - the full story 'Preaching to the choir' (also sometimes spelled quire) is of US origin.
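The same invocation can be driven from a script. The sketch below only relies on the flags shown in the example command (--engine, --query, --type) and assumes pysearch was installed with the cli extra and is on PATH. The function names build_pysearch_command and run_pysearch are hypothetical helpers, not part of the package.

```python
import shlex
import shutil
import subprocess
from typing import List, Optional


def build_pysearch_command(engine: str, query: str, result_type: str = "descriptions") -> List[str]:
    """Assemble the argv list for the pysearch invocation shown above."""
    return ["pysearch", "--engine", engine, "search", "--query", query, "--type", result_type]


def run_pysearch(engine: str, query: str, result_type: str = "descriptions") -> Optional[str]:
    """Run pysearch and return its stdout, or None if the tool is not installed."""
    if shutil.which("pysearch") is None:
        return None
    completed = subprocess.run(
        build_pysearch_command(engine, query, result_type),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


command = build_pysearch_command("bing", "Preaching to the choir")
# Print the shell-quoted form of the command without running it.
print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting problems with multi-word queries.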
The CLI expects an engine argument (-e ENGINE), followed by one of two subcommands: search or summary.
SearchEngineParser

positional arguments:
  {search,summary}      help for subcommands
    search              search help
    summary             summary help

optional arguments:
  -h, --help            show this help message and exit
  -e ENGINE, --engine ENGINE
                        Engine to use for parsing the query e.g google, yahoo,
                        bing, duckduckgo (default: google)
The summary subcommand shows a summary of each search engine added, with a description of what it returns:
pysearch --engine google summary
The full arguments for the search subcommand are shown below:
usage: pysearch search [-h] -q QUERY [-p PAGE] [-t TYPE] [-r RANK]

optional arguments:
  -h, --help            show this help message and exit
  -q QUERY, --query QUERY
                        Query string to search engine for
  -p PAGE, --page PAGE  Page of the result to return details for (default: 1)
  -t TYPE, --type TYPE  Type of detail to return i.e full, links, descriptions
                        or titles (default: full)
  -r RANK, --rank RANK  ID of Detail to return e.g 5 (default: 0)
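To make the structure of the interface concrete, here is a sketch of an argparse parser that mirrors the two help listings above. It is a reconstruction from the help text only, not the package's actual source. The function name build_parser and the int types for PAGE and RANK are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser shaped like the documented pysearch interface."""
    parser = argparse.ArgumentParser(
        prog="pysearch",
        description="SearchEngineParser",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument(
        "-e", "--engine", default="google",
        help="Engine to use for parsing the query e.g google, yahoo, bing, duckduckgo",
    )
    subparsers = parser.add_subparsers(
        dest="command", required=True, help="help for subcommands"
    )

    # "search" carries the query and the result-selection options.
    search = subparsers.add_parser("search", help="search help")
    search.add_argument("-q", "--query", required=True,
                        help="Query string to search engine for")
    search.add_argument("-p", "--page", type=int, default=1,  # int is an assumption
                        help="Page of the result to return details for")
    search.add_argument("-t", "--type", default="full",
                        help="Type of detail to return i.e full, links, descriptions or titles")
    search.add_argument("-r", "--rank", type=int, default=0,  # int is an assumption
                        help="ID of Detail to return e.g 5")

    # "summary" takes no extra arguments.
    subparsers.add_parser("summary", help="summary help")
    return parser


args = build_parser().parse_args(
    ["--engine", "bing", "search", "--query", "Preaching to the choir", "--type", "descriptions"]
)
print(args.engine, args.command, args.type, args.page, args.rank)
# bing search descriptions 1 0
```

Note that the engine option belongs to the top-level parser, so it must appear before the subcommand name, as in the example invocation.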
All actions performed should adhere to the code of conduct.
Before making any contribution, please follow the contribution guide.
This project is licensed under the MIT License, which allows very broad use for both academic and commercial purposes.
Thanks go to these wonderful people:
- Ed Luff
- Diretnan Domnan
- MeNsaaH
- Aditya Pal
- Avinash Reddy
- David Onuh
This project follows the all-contributors specification. Contributions of any kind welcome!