Gives you Python3 and OpenJDK as a primer for PySpark (or anything else ...)


Python3 and OpenJDK on Ubuntu

Overview

This repository manages the customised Docker image build of Python3 on Ubuntu. You can target any Python 3 version against any Ubuntu release. Just follow the Makester settings below.

Bypassing the Docker Hub Official Image Python build is much more work, but it gives us more flexibility to address CVEs.

The image build process is based on the GitHub Python project's Docker build, with a switch to an Ubuntu base. It is not clear why an Ubuntu variant is not available in the Official Python images.

Quick links

Prerequisites

Getting started

Makester is used as the Integrated Developer Platform.

(macOS users only) upgrading GNU Make

Follow these notes to get GNU make.
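As a quick check before upgrading, the sketch below reports the major version of whichever make is first on your PATH. The helper name gnu_make_major is hypothetical and not part of Makester. The stock make on macOS reports GNU Make 3.81; which version Makester needs is covered in the linked notes, not here.

```shell
#!/bin/sh
# Hypothetical helper: extract the major version from the first line of
# `make --version`. Prints nothing if the line is not a GNU Make banner.
gnu_make_major() {
    printf '%s\n' "$1" | sed -n 's/^GNU Make \([0-9][0-9]*\)\..*/\1/p'
}

# Report the major version of the `make` found on PATH, if there is one.
if command -v make >/dev/null 2>&1; then
    banner=$(make --version 2>/dev/null | head -n 1)
    echo "make major version: $(gnu_make_major "$banner")"
fi
```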

Creating the local environment

Get the code and change into the top level git project directory:

git clone https://github.com/loum/pyjdk.git && cd pyjdk

NOTE: Run all commands from the top-level directory of the git repository.

For first-time setup, get the Makester project:

git submodule update --init

Initialise the environment:

make init

Local environment maintenance

Keep the Makester project up to date with:

git submodule update --remote --merge

Help

There should be a make target for most tasks. Check the help for more information:

make help

Docker image development and management

Building the image

NOTE: Ubuntu base image is jammy 22.04

Build the image with:

make image-buildx

Searching images

To list the available Docker images:

make image-search

Image tagging

By default, Makester tags the new Docker image with the current branch hash. This provides a degree of uniqueness but is not very intuitive. That is where the image-tag-version Makefile target can help. To apply a tag that follows the project tagging convention <ubuntu-code>-<python3-version>-<image-release-number>:

make image-tag-version

Sample output:

### Tagging container image "loum/pyjdk" as "python3.10-openjdk11-1"

To tag the image as latest:

make image-tag-latest

Sample output:

### Tagging container image "loum/pyjdk" as "latest"

To apply the main-line tag (without the <image-release-number>) that ensures the latest Ubuntu release:

make image-tag-main

Sample output:

### Tagging container image "loum/pyjdk" as "python3.10-openjdk11"
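To make the shape of these tags concrete, here is a minimal sketch that assembles strings in the form seen in the sample outputs above. The function compose_tag and its parameters are hypothetical; this is not how Makester derives the tag internally.

```shell
#!/bin/sh
# Hypothetical helper: build a tag shaped like the sample outputs,
# python<python-version>-openjdk<openjdk-version>[-<release-number>].
compose_tag() {
    python_version="$1"
    openjdk_version="$2"
    release="${3:-}"   # optional; omit for the main-line tag
    tag="python${python_version}-openjdk${openjdk_version}"
    if [ -n "$release" ]; then
        tag="${tag}-${release}"
    fi
    printf '%s\n' "$tag"
}

compose_tag 3.10 11 1   # versioned tag
compose_tag 3.10 11     # main-line tag
```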

Building the image with a different Python 3 version

During the image build, a fresh compile of the Python binaries is performed. In theory, any Python release under https://www.python.org/ftp/python/ can be used. You will need to supply the PYTHON_MAJOR_MINOR_VERSION to the image build target. For example, to build an image with the latest Python 3.9:

PYTHON_MAJOR_MINOR_VERSION=3.9 make image-buildx
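Supplying only a major.minor value implies picking a patch release somewhere along the way. The sketch below shows one way that selection could be done over a list of version strings. The function latest_patch is hypothetical and the version list is an illustrative sample; the actual build may resolve the release differently.

```shell
#!/bin/sh
# Hypothetical helper: read one version per line on stdin and print the
# highest patch release that belongs to the given major.minor series.
latest_patch() {
    # Keep lines starting with "<major.minor>.", then sort numerically on
    # the third dot-separated field and take the last entry.
    awk -v prefix="$1." 'index($0, prefix) == 1' | sort -t . -k 3,3n | tail -n 1
}

# Illustrative sample of release numbers, not a live directory listing.
printf '%s\n' 3.9.14 3.9.16 3.9.15 3.10.9 | latest_patch 3.9
```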

To validate the image runs as expected:

make container-run

On success, the container-run target will drop you into the Python REPL:

Python 3.9.16 (main, Jan 29 2023, 10:42:18)
[GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

Interact with the loum/pyjdk image

The container-run target is a convenience action that will drop into the Python REPL of the current image build:

make container-run

To get the container image Python version:

make container-run CMD=--version

NOTE: Override the CMD variable to pass any CLI options to the Python executable.

PySpark REPL

PySpark is not installed by default. This is to keep the image size as small as possible. However, the environment is ready to support a PySpark install. loum/pyjdk can serve as a base image for your larger project. If you only want a quick and simple PySpark REPL, then provide a PySpark version to the BUILD_PYSPARK_VERSION environment variable:

BUILD_PYSPARK_VERSION=3.3.1 make container-run

Without a CMD, this will drop you into the PySpark REPL:

Successfully installed py4j-0.10.9.5 pyspark-3.3.1
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.3.1
      /_/

Using Scala version 2.12.15, OpenJDK 64-Bit Server VM, 11.0.17
Branch HEAD
Compiled by user yumwang on 2022-10-15T09:47:01Z
Revision fbbcf9434ac070dd4ced4fb9efe32899c6db12a9
Url https://github.com/apache/spark
Type --help for more information.
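If you would rather use loum/pyjdk as a base image for a larger project, a starting point might look like the sketch below, which writes a small Dockerfile into a scratch directory. The helper write_dockerfile is hypothetical. The image tag is taken from the sample output above and assumes that tag is available to your Docker daemon. Invoking pip as python3 -m pip inside the image is also an assumption.

```shell
#!/bin/sh
# Hypothetical helper: write a Dockerfile that layers PySpark on loum/pyjdk.
# $1 is the target directory, $2 is the PySpark version to pin.
write_dockerfile() {
    target_dir="$1"
    pyspark_version="$2"
    cat > "${target_dir}/Dockerfile" <<EOF
# Assumption: this tag has been built or pulled locally.
FROM loum/pyjdk:python3.10-openjdk11

# Assumption: python3 and pip are available in the base image.
RUN python3 -m pip install --no-cache-dir pyspark==${pyspark_version}
EOF
}

# Generate the file in a throwaway directory and show the result.
workdir=$(mktemp -d)
write_dockerfile "$workdir" 3.3.1
cat "${workdir}/Dockerfile"
```

From there, docker build against that directory would produce the derived image.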
