BIP 1: Support Integration testing using docker

hellozepp edited this page Sep 21, 2022 · 25 revisions

Status: MERGED

Author: @hellozepp

Contributor: @hellozepp

Date: 2022.02.24

Pull Requests: byzer-lang #1702, byzer-build #23

The Chinese version follows the English text below.

Motivation

Byzer lacks a process for cluster testing on YARN and Ray, which leads to the following problems:

  1. Users often report that Byzer works in local mode but fails once it is submitted to YARN.

  2. After a pull request is submitted, the regular scalatest local tests can hardly uncover potential problems that only appear in a cluster environment.

  3. Manual testing before a release lacks a YARN test environment, and there is no automated release testing on YARN.

Goal

  • Improve the CI process by supporting PR-triggered integration tests that run automatically in a YARN environment

Change point

  1. Changed modules: byzer-build, streamingpro-it

  2. Changes:

  • The byzer-build project supports building Byzer-on-YARN images

  • Integration testing supports YARN cluster testing via Docker, both locally and on GitHub

Implementation

Preconditions:

  • Our packaging script supports only a Python 3 environment. If you need multiple Python versions, install conda first and use it.
  1. Install Docker Desktop: download the installation package for your operating system from the Docker official website, then install and start it.

  2. Test

  • Start the test class streamingpro-it/src/test/scala/tech/mlsql/it/ByzerScriptTestSuite in the IDE

  • Or run it via the maven command

sh -x dev/run-test.sh 3.0
  3. Effect demonstration

The it module starts the hadoop3 and byzer-lang containers in turn. You can see the hadoop startup log in the console:

[Screenshot: hadoop startup log]

byzer-lang startup log:

[Screenshot: byzer-lang startup log]

After startup, you will see 3 docker instances:

[Screenshot: the three running Docker instances]

This is a cluster simulated by several containers. Their logs are collected into the test task, so the YARN environment can be tested in the same way as a local test.

The test task ByzerScriptTestSuite submits a simulated HTTP job to the container. Once the container is running, the same web entry can also be accessed manually:

[Screenshot: Byzer web entry served from the container]

Note that the port number is randomly generated and must be obtained from the testcontainers API.
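Because the host port is only known after the container has started, the request URL has to be assembled at runtime. The following is a minimal sketch, assuming the host and mapped port have already been read from the container (with testcontainers these would typically come from `getHost` and `getMappedPort(9003)`, as in the test code later on this page). `ByzerEndpoint`, `scriptUrl`, and `formBody` are hypothetical names introduced for illustration; they are not part of the Byzer code base.

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

// Hypothetical helper, for illustration only.
object ByzerEndpoint {
  // Build the /run/script URL from the host and the randomly mapped host port.
  def scriptUrl(host: String, mappedPort: Int): String =
    s"http://$host:$mappedPort/run/script"

  // Encode request parameters as application/x-www-form-urlencoded.
  def formBody(params: Seq[(String, String)]): String = {
    val utf8 = StandardCharsets.UTF_8.name()
    params
      .map { case (key, value) =>
        URLEncoder.encode(key, utf8) + "=" + URLEncoder.encode(value, utf8)
      }
      .mkString("&")
  }
}
```

Keeping the URL construction in one place makes it harder to accidentally hard-code a port that only happened to work on one run.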

Finally, once the test has run successfully, the containers are destroyed automatically. The whole process, from starting the containers through testing to destruction, takes 8-10 minutes (assuming the image is already installed).

Technical Design

  1. GitHub Actions + testcontainers + Docker

Import maven dependencies as follows:

        <dependency>
            <groupId>org.scalatest</groupId>
            <artifactId>scalatest_${scala.binary.version}</artifactId>
            <version>${scalatest.version}</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.scalactic</groupId>
            <artifactId>scalactic_${scala.binary.version}</artifactId>
            <version>${scalatest.version}</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>com.dimafeng</groupId>
            <artifactId>testcontainers-scala_${scala.binary.version}</artifactId>
            <version>0.40.2</version>
            <scope>test</scope>
        </dependency>

Common methods are extracted and generic classes are abstracted for the three kinds of container instances: ByzerLangContainer, ChaosContainer, HadoopContainer

Generic class for unified management of containers: ByzerCluster

How to use the API:

  • Mix in the base class provided by scalatest together with logging and other traits: class <TestClass> extends FlatSpec with Suite with BeforeAndAfterAll with Logging

  • Initialize cluster parameters: cluster = ByzerCluster.forSpec()

  • Start the cluster and its environment: cluster.start()

  • After the startup is complete, perform script automation testing:

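 // jobName is not defined in the original snippet; the value below is illustrative.
 val jobName = "it-simple-select"

 "javaContainer" should "retrieve non-0 port for any of services" in {
     // The host port is mapped randomly, so resolve it from the container at runtime.
     val url = "http://" + javaContainer.getHost + ":" + javaContainer.getMappedPort(9003) + "/run/script"
     val sql = "select 1 as a,'jack' as b as bbc;"
     val owner = "admin"
     // Submit the script to the Byzer engine as a form-encoded POST request.
     val (status, result) = FunctionsUtils._http(url, "post", Map("sql" -> sql, "owner" -> owner, "jobName" -> jobName),
        Map("Content-Type" -> "application/x-www-form-urlencoded"), Map()
      )
 }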
  • Manual cleanup after tests are done: cluster.stop()
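The start, test, and stop sequence above can also be wrapped so that cleanup runs even when a test body throws. The sketch below uses a stand-in `Cluster` trait rather than the real `ByzerCluster`, whose API may differ; `Cluster`, `FakeCluster`, and `withCluster` are hypothetical names introduced for illustration. In a scalatest suite the same effect is usually achieved by calling `start()` in `beforeAll` and `stop()` in `afterAll`.

```scala
// Stand-in for a cluster handle; the real ByzerCluster API may differ.
trait Cluster {
  def start(): Unit
  def stop(): Unit
}

// In-memory fake that records lifecycle calls, so the sketch runs without Docker.
final class FakeCluster extends Cluster {
  var events: Vector[String] = Vector.empty
  def start(): Unit = events = events :+ "start"
  def stop(): Unit = events = events :+ "stop"
}

object ClusterLifecycle {
  // Loan pattern: start the cluster, run the body, and always stop afterwards.
  def withCluster[C <: Cluster, A](cluster: C)(body: C => A): A = {
    cluster.start()
    try body(cluster)
    finally cluster.stop()
  }
}
```

The `finally` branch is what prevents leftover containers when an assertion fails halfway through a test.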
  2. dockerClient log pulling and archiving

Reference implementation for log pulling: ChaosContainer.tailContainerLog(container: GenericContainer)

Reference implementation for log archiving: DockerUtils.runCommandAsyncWithLogging, which packages the logs and stores them under the target directory of the classpath
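As a rough illustration of the archiving step, the sketch below writes log lines that have already been collected to a file under a target directory. It does not use the Docker client or the project's `DockerUtils`; `LogArchive`, `save`, and the `container-logs` directory name are assumptions made for this example only.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical helper, for illustration only.
object LogArchive {
  // Write one log file per container under <targetDir>/container-logs and return its path.
  def save(targetDir: Path, containerName: String, lines: Seq[String]): Path = {
    val dir = Files.createDirectories(targetDir.resolve("container-logs"))
    val file = dir.resolve(s"$containerName.log")
    val content = lines.map(_ + "\n").mkString
    Files.write(file, content.getBytes(StandardCharsets.UTF_8))
  }
}
```

Writing one file per container keeps the hadoop3 and byzer-lang output separate when a failed run is investigated later.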

  3. Byzer automation tool

Reference implementation method: FunctionsUtils._http

Compatibility

Only Byzer-on-YARN tests for the Spark 3 version are supported.

Test Plan

  • Verify that the Spark 2 environment is skipped correctly

  • Verify that PR tests have enough resources

  • Verify the main path: whether the hadoop3 and byzer-lang containers start normally and whether HTTP requests are received normally

  • Verify that the startup log can be found under target after the test finishes

  • When the Byzer tar package does not exist, is the user prompted to build the package before running the test?
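Two of the checks above (skipping Spark 2 and detecting a missing tar package) can be expressed as small guards. This is a sketch under stated assumptions: the version string follows the `3.0` argument passed to `dev/run-test.sh`, and the `byzer-lang` prefix and `.tar.gz` suffix used to locate the package are guesses. The names `ItPreconditions`, `isSupportedSpark`, and `findTarball` are hypothetical.

```scala
import java.io.File

// Hypothetical helper, for illustration only.
object ItPreconditions {
  // Only Spark 3.x is supported; anything else should be skipped.
  def isSupportedSpark(version: String): Boolean =
    version.trim.split('.').headOption.contains("3")

  // Look for a Byzer tar package directly under rootDir (the naming pattern is an assumption).
  def findTarball(rootDir: File): Option[File] =
    Option(rootDir.listFiles())
      .map(_.toSeq)
      .getOrElse(Seq.empty)
      .find(f => f.isFile && f.getName.startsWith("byzer-lang") && f.getName.endsWith(".tar.gz"))
}
```

A test suite could call `findTarball` before starting any container and fail fast with a message asking the user to build the package first.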

The following is the Chinese translation of the BIP


Status: DISCUSSION

Author: @hellozepp

Contributor: @hellozepp

Date: 2022.02.24

Pull Requests: byzer-lang #1702, byzer-build #22

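Motivation

Byzer lacks a process for cluster testing, which leads to the following problems:

1) Users often report cases where Byzer works in local mode but fails once it is submitted to YARN

2) After a PR is submitted, the regular scalatest local tests can hardly uncover potential problems in the cluster environment

3) Manual testing before a release lacks a YARN test environment, and there is no automated release testing on YARN

Goal

  • Improve the CI process by supporting PR-triggered integration tests that run automatically in a YARN environment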

Change point

1) Changed modules: byzer-build, streamingpro-it

2) Changes:

  • byzer-build supports building Byzer-on-YARN images

  • Integration testing supports YARN cluster testing via Docker, both locally and on GitHub

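Usage

Preconditions:

  • Our packaging script supports only a Python 3 environment. If you need multiple Python versions, install conda first and use it.
  1. Install Docker Desktop: download the installation package for your operating system from the Docker official website, then install and start it.

  2. Test

  • Option 1: start the test class streamingpro-it/src/test/scala/tech/mlsql/it/ByzerScriptTestSuite in the IDE (before starting this class, the byzer-lang tar package, generated by the make-distribution.sh script, must be present in the root directory)

  • Option 2: run it via the maven command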

sh -x dev/run-test.sh 3.0
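  3. Effect demonstration

The it module starts the hadoop3 and byzer-lang containers in turn. You can see the hadoop startup log in the console:

[Screenshot: hadoop startup log]

byzer-lang startup log:

[Screenshot: byzer-lang startup log]

After startup, you will see 3 Docker instances:

[Screenshot: the three running Docker instances]

This is a cluster simulated by several containers. Their logs are collected into the test task, so the YARN environment can be tested in the same way as a local test.

The test task ByzerScriptTestSuite submits a simulated HTTP job to the container. Once the container is running, the same web entry can also be accessed manually:

[Screenshot: Byzer web entry served from the container]

Note that the port number is randomly generated and must be obtained from the testcontainers API.

Finally, once the test has run successfully, the containers are destroyed automatically. The whole process, from starting the containers through testing to destruction, takes 8-10 minutes (assuming the image is already installed).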

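New content:

Byzer code can now be written directly to test the Byzer-on-YARN environment. The code is located here:

[Screenshot: location of the Byzer test scripts]

Just add a script and verify the result with Byzer's assert command; no additional code is needed.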

Technical Design

  1. GitHub Actions + testcontainers + Docker

Import maven dependencies as follows:

        <dependency>
            <groupId>org.scalatest</groupId>
            <artifactId>scalatest_${scala.binary.version}</artifactId>
            <version>${scalatest.version}</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.scalactic</groupId>
            <artifactId>scalactic_${scala.binary.version}</artifactId>
            <version>${scalatest.version}</version>
            <scope>test</scope>
        </dependency>

        <dependency>
            <groupId>com.dimafeng</groupId>
            <artifactId>testcontainers-scala_${scala.binary.version}</artifactId>
            <version>0.40.2</version>
            <scope>test</scope>
        </dependency>

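Common methods are extracted and generic classes are abstracted for the three kinds of container instances: ByzerLangContainer, ChaosContainer, HadoopContainer

Generic class for unified management of containers: ByzerCluster

How to use the API:

  • Mix in the base class provided by scalatest together with logging and other traits: class <TestClass> extends FlatSpec with Suite with BeforeAndAfterAll with Logging

  • Initialize cluster parameters: cluster = ByzerCluster.forSpec()

  • Start the cluster and its environment: cluster.start()

  • After the startup is complete, perform script automation testing: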

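 // jobName is not defined in the original snippet; the value below is illustrative.
 val jobName = "it-simple-select"

 "javaContainer" should "retrieve non-0 port for any of services" in {
     // The host port is mapped randomly, so resolve it from the container at runtime.
     val url = "http://" + javaContainer.getHost + ":" + javaContainer.getMappedPort(9003) + "/run/script"
     val sql = "select 1 as a,'jack' as b as bbc;"
     val owner = "admin"
     // Submit the script to the Byzer engine as a form-encoded POST request.
     val (status, result) = FunctionsUtils._http(url, "post", Map("sql" -> sql, "owner" -> owner, "jobName" -> jobName),
        Map("Content-Type" -> "application/x-www-form-urlencoded"), Map()
      )
 }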
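  • Manual cleanup after tests are done: cluster.stop()
  2. dockerClient log pulling and archiving

Reference implementation for log pulling: ChaosContainer.tailContainerLog(container: GenericContainer)

Reference implementation for log archiving: DockerUtils.runCommandAsyncWithLogging, which packages the logs and stores them under the target directory of the classpath

  3. Byzer automation tool

Reference implementation: FunctionsUtils._http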

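Compatibility

Only Byzer-on-YARN tests for the Spark 3 version are supported.

Test Plan

  • Verify that the Spark 2 environment is skipped correctly

  • Verify that PR tests have enough resources

  • Verify the main path: whether the hadoop3 and byzer-lang containers start normally and whether HTTP requests are received normally

  • Verify that the startup log can be found under target after the test finishes

  • When the Byzer tar package does not exist, is automatic packaging supported?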

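Closing remarks

We are glad to have found testcontainers as the integration testing tool for all of Byzer. It lets us verify, with minimal effort and across different environments, how code changes affect the cluster environment. Combined with Docker, it raises our pull request testing to a new level and makes distributed testing in the YARN and Ray environments, which was previously impossible, possible. The community is very active and responsive: when we ran into problems mixing Scala and Java code during our exploration, we were pointed to the testcontainers-scala repository, which greatly reduces the compatibility cost of mixing the two languages.

Going forward, we will keep improving the integration testing framework and extend the test coverage of Byzer, raising the overall quality of the Byzer framework detail by detail.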