Spark supports query trace #12084

Open
@jinchengchenghh

Description

Query trace is a very useful feature, but I hit some problems when trying to enable it in Gluten.

  1. The Gluten QueryCtx queryId is the empty string "", so the generated directory lacks the queryId layer, which the query trace tooling requires. Because Gluten uses single-threaded execution and an auto-incrementing Velox task ID (vtId), the taskId alone is enough to identify the Velox plan.
// Task ID
static std::atomic<uint32_t> vtId{0}; // Velox task ID to distinguish from Spark task ID.
  task_ = velox::exec::Task::create(
      fmt::format(
          "Gluten_Stage_{}_TID_{}_VTID_{}",
          std::to_string(taskInfo_.stageId),
          std::to_string(taskInfo_.taskId),
          std::to_string(vtId++)),
      std::move(planFragment),
      0,
      std::move(queryCtx),
      velox::exec::Task::ExecutionMode::kSerial);
// queryId is ""
std::shared_ptr<velox::core::QueryCtx> ctx = velox::core::QueryCtx::create(
      nullptr,
      facebook::velox::core::QueryConfig{getQueryContextConf()},
      connectorConfigs,
      gluten::VeloxBackend::get()->getAsyncDataCache(),
      memoryManager_->getAggregateMemoryPool(),
      spillExecutor_.get(),
      "");
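
To make the uniqueness argument concrete, here is a standalone sketch of the same naming scheme that depends on neither Velox nor fmt. `makeVeloxTaskId` is a hypothetical helper written for illustration, not a Gluten or Velox API; the process-wide atomic counter plays the role of `vtId` in the snippet above.

```cpp
#include <atomic>
#include <cstdint>
#include <string>

// Process-wide counter, mirroring vtId in the Gluten snippet.
static std::atomic<uint32_t> vtId{0};

// Hypothetical helper (illustration only): builds a task ID in the
// "Gluten_Stage_{}_TID_{}_VTID_{}" shape using plain string concatenation.
std::string makeVeloxTaskId(int32_t stageId, int64_t taskId) {
  // Post-increment on std::atomic returns the previous value atomically.
  return "Gluten_Stage_" + std::to_string(stageId) + "_TID_" +
      std::to_string(taskId) + "_VTID_" + std::to_string(vtId++);
}
```

Two calls with the same stage ID and Spark task ID still yield different strings, because the counter advances on every call.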

The generated query trace directory:

/tmp/query_trace/
└── Gluten_Stage_0_TID_0_VTID_0
    ├── 7
    │   └── 0
    │       └── 0
    │           ├── op_input_trace.data
    │           └── op_trace_summary.json
    └── task_trace_meta.json

Running the replayer against this directory throws an exception:

/mnt/DP_disk1/code/velox/build/velox/tool/trace# ./velox_query_replayer  --root_dir /tmp/query_trace --task_id Gluten_Stage_0_TID_0_VTID_0 --summary
terminate called after throwing an instance of 'facebook::velox::VeloxUserError'
  what():  Exception: VeloxUserError
Error Source: USER
Error Code: INVALID_ARGUMENT
Reason: --query_id must be provided
Retriable: False
Expression: !FLAGS_query_id.empty()
Function: init
File: /mnt/DP_disk1/code/velox/velox/tool/trace/TraceReplayRunner.cpp
Line: 241
Stack trace:
Stack trace has been disabled. Use --velox_exception_user_stacktrace_enabled=true to enable it.

Aborted (core dumped)

Since QueryCtx does not require the queryId to be set, I think an empty queryId is reasonable, and QueryTrace should support it.
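
One possible direction, sketched under assumptions: treat the queryId layer as optional when resolving the task directory. `resolveTaskTraceDir` is a hypothetical helper, not an existing Velox function, and the `root/queryId/taskId` layout is inferred from the directory tree above rather than taken from the Velox sources.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: returns the directory expected to hold
// task_trace_meta.json. An empty queryId means the query layer is absent,
// as in the Gluten-generated output.
fs::path resolveTaskTraceDir(
    const std::string& rootDir,
    const std::string& queryId,
    const std::string& taskId) {
  fs::path dir{rootDir};
  if (!queryId.empty()) {
    dir /= queryId; // Only add the query layer when a query ID exists.
  }
  dir /= taskId;
  return dir;
}
```

With this shape, the replayer would need only `--task_id` for Gluten traces, while traces that do carry a query ID keep their existing layout.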

  2. Register the Spark functions and distinguish them from the Presto functions with a flag (FLAGS_xx). We cannot register both sets, because one overwriting the other may trigger unexpected behavior.
  3. The Spark ValueStreamNode is hard to serialize and deserialize. We may not need to serialize the whole plan; we could extract and serialize only the required node.
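
The function overwrite concern above can be illustrated with a toy registry. Everything here is hypothetical (`FunctionRegistry`, `Dialect`, `registerFor`, and the function name `f`): it is not Velox's function registry, and the enum merely stands in for whatever flag the replayer would expose. The sketch assumes a name-keyed registry in which a later registration replaces an earlier one.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

enum class Dialect { kPresto, kSpark };

using ScalarFn = std::function<std::string()>;

// Toy name-keyed registry: adding a name twice replaces the first entry.
class FunctionRegistry {
 public:
  void add(const std::string& name, ScalarFn fn) {
    fns_[name] = std::move(fn);
  }

  std::string call(const std::string& name) const {
    return fns_.at(name)();
  }

 private:
  std::unordered_map<std::string, ScalarFn> fns_;
};

// Registers exactly one dialect's implementation of "f", chosen by the caller.
void registerFor(Dialect dialect, FunctionRegistry& registry) {
  if (dialect == Dialect::kSpark) {
    registry.add("f", [] { return std::string("spark"); });
  } else {
    registry.add("f", [] { return std::string("presto"); });
  }
}
```

Registering both dialects into the same registry silently leaves only the last one callable under the shared name, which is the kind of surprise a single selecting flag avoids.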

Labels

enhancement
