fix: Add SASL dependency module #13608

Merged Mar 16, 2021 (1 commit)

@danielewood danielewood commented Mar 13, 2021

SUMMARY

Adds the OS dependencies that PyHive and Impyla (Impala) require for username/password authentication.

Adds only about 430 KB (428,322 bytes) to the final image size:

docker inspect -f "{{ .Size }}" superset:latest
1450281124
docker inspect -f "{{ .Size }}" superset:fix-sasl-module
1450709446
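
As a quick check of the size claim, the two `docker inspect` values above can be subtracted directly. This is only an illustrative sketch; the helper name `size_delta` is hypothetical and not part of the PR.

```python
def size_delta(before_bytes: int, after_bytes: int) -> tuple[int, float]:
    """Return the absolute growth in bytes and the relative growth in percent."""
    delta = after_bytes - before_bytes
    return delta, delta / before_bytes * 100


# Sizes reported by `docker inspect -f "{{ .Size }}"` for the two images above.
delta, pct = size_delta(1450281124, 1450709446)
print(f"{delta} bytes ({delta / 1000:.0f} kB), {pct:.3f}% larger")
```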

Motivation: Allows connecting to Ascend.io using Superset over TLS with username and password.
impala://<service_account_username>:<service_account_password>@<ascendEnvironment>.sql.ascend.io:10000/<dataService>?auth_mechanism=PLAIN&use_ssl=true
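
A minimal sketch of assembling that URI in Python, assuming the template above is followed exactly. The function name `ascend_impala_uri` and the sample values are hypothetical; percent-encoding the credentials with `quote_plus` is a general precaution for SQLAlchemy URLs with special characters in passwords, not something this PR requires.

```python
from urllib.parse import quote_plus, urlencode


def ascend_impala_uri(username: str, password: str,
                      environment: str, data_service: str) -> str:
    """Build an impala:// SQLAlchemy URI following the template above."""
    # Query parameters taken from the PR description.
    query = urlencode({"auth_mechanism": "PLAIN", "use_ssl": "true"})
    # Percent-encode credentials so characters such as "@" or "/" in a
    # password do not break URL parsing.
    return (
        f"impala://{quote_plus(username)}:{quote_plus(password)}"
        f"@{environment}.sql.ascend.io:10000/{data_service}?{query}"
    )


# Hypothetical values, for illustration only.
print(ascend_impala_uri("svc_user", "p@ss/word", "trial", "my_service"))
```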

TEST PLAN

  1. Build existing superset master:
    sudo docker build -t superset:latest https://github.com/apache/superset.git#master
  2. Build this PR branch:
    sudo docker build -t superset:fix-sasl-module https://github.com/danielewood/superset.git#fix-sasl-module
  3. With the current master image (superset:latest), attempt to use auth_mechanism=PLAIN with Impala or auth=CUSTOM with PyHive:
    sudo docker run --rm -it -u 0 --entrypoint bash superset:latest
  4. See error from both PyHive and Impyla:
    thrift.transport.TTransport.TTransportException: Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found'
  5. With this PR's image (superset:fix-sasl-module), attempt to use auth_mechanism=PLAIN with Impala or auth=CUSTOM with PyHive:
    sudo docker run --rm -it -u 0 --entrypoint bash superset:fix-sasl-module
  6. If you don't have a Hive cluster to test against and instead use Ascend with invalid credentials, you will see an authentication error; otherwise you will see no errors, because the connection worked.
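
The two outcomes in the test plan can be told apart programmatically. The sketch below is an assumption-laden illustration, not part of the PR: `classify_connect_error` and `try_connect` are hypothetical helpers, and the only string matched is the "No worthy mechs found" text from the error quoted in step 4.

```python
# Marker text copied from the SASL error shown in step 4 of the test plan.
SASL_MISSING_MARKER = "No worthy mechs found"


def classify_connect_error(exc: Exception) -> str:
    """Label a connection failure as a missing SASL mechanism or something else."""
    if SASL_MISSING_MARKER in str(exc):
        return "sasl-mechanism-missing"
    # Anything else, for example rejected credentials or an unreachable host.
    return "other"


def try_connect(uri: str) -> str:
    """Attempt a connection and report the outcome as a short label."""
    # Imported lazily so the classifier above works without SQLAlchemy installed.
    from sqlalchemy import create_engine

    try:
        create_engine(uri).connect().close()
    except Exception as exc:
        return classify_connect_error(exc)
    return "connected"
```

On the unpatched image, `try_connect` would be expected to return `"sasl-mechanism-missing"`; on the patched image with invalid credentials it should return `"other"`.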

Example:

sudo docker build -t superset:fix-sasl-module https://github.com/danielewood/superset.git#fix-sasl-module
sudo docker run --rm -it -u 0 --entrypoint bash superset:fix-sasl-module

# If you don't have a Hive instance to test against, install socat to tunnel TLS to Ascend.io
# replace hive2.example.local with localhost
# apt update \
#    && apt install -y socat \
#    && socat TCP-LISTEN:10000,fork,reuseaddr openssl:trial.sql.ascend.io:10000,verify=1 &

python3 - <<-"EOF"
from sqlalchemy import create_engine
create_engine("hive://username:password@hive2.example.local:10000/database?auth=CUSTOM",echo=True,echo_pool='debug').connect()
EOF

####

# If you don't have a Hive instance to test against,
#  replace hive2.example.local with trial.sql.ascend.io,
#  and add "&use_ssl=true" to the end of the connection string

pip install impyla
python3 - <<-"EOF"
from sqlalchemy import create_engine
create_engine("impala://username:password@hive2.example.local:10000/database?auth_mechanism=PLAIN",echo=True,echo_pool='debug').connect()
EOF
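
"No worthy mechs found" generally means Cyrus SASL could not load a plugin for the requested mechanism. One way to check from inside either container is to look for the PLAIN plugin on disk. This sketch assumes a Debian-style layout in which plugins live in a `sasl2` directory under `/usr/lib` (optionally inside an architecture subdirectory) and are named `lib<mechanism>.so*`; the helper name `find_sasl_plugins` is hypothetical, and the exact paths in the Superset image are not confirmed by this PR.

```python
from pathlib import Path


def find_sasl_plugins(lib_root: str = "/usr/lib", mechanism: str = "plain") -> list[Path]:
    """Return SASL plugin files for `mechanism` found under `lib_root`.

    Assumed layout: <lib_root>/sasl2/ or <lib_root>/<arch-triplet>/sasl2/.
    """
    root = Path(lib_root)
    pattern = f"lib{mechanism}.so*"
    found = list(root.glob(f"sasl2/{pattern}")) + list(root.glob(f"*/sasl2/{pattern}"))
    return sorted(found)


if __name__ == "__main__":
    plugins = find_sasl_plugins()
    if plugins:
        print("PLAIN plugin candidates:", ", ".join(str(p) for p in plugins))
    else:
        print("No PLAIN plugin found under the assumed directories")
```

An empty result on the master image and a non-empty one on the patched image would be consistent with the error above, though it does not by itself prove that the plugin loads.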

ADDITIONAL INFORMATION

@danielewood danielewood changed the title Fix: Add SASL dependency module fix: Add SASL dependency module Mar 13, 2021
@junlincc junlincc added the data:connect:hive Related to Hive label Mar 14, 2021
@junlincc junlincc requested a review from betodealmeida March 14, 2021 05:56

@betodealmeida betodealmeida left a comment

The description of this PR is beautiful, thanks for taking the time for such a detailed writeup! ❤️

@betodealmeida betodealmeida merged commit 65b4be7 into apache:master Mar 16, 2021
allanco91 pushed a commit to allanco91/superset that referenced this pull request May 21, 2021
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.2.0 labels Mar 12, 2024
Successfully merging this pull request may close these issues.

SQLAlchemy URI for hive problem Hive connection