With the original program on HBase 1.3.6, MySQL data could not be written to HBase; the error reported a missing class (ClassNotFound)!
After upgrading HBase to 2.1.0, Spark reads the MySQL data, but an error is thrown when writing to HBase!
The error log is as follows:
2022-02-24 12:08:09:174[INFO]: [data collection]:[HBASE]:checking whether table exists: t1
2022-02-24 12:08:09:179[INFO]: [data collection]:[HBASE]:table already exists; checking column family: cf1
2022-02-24 12:08:09:186[INFO]: [data collection]:[HBASE]:tableDescriptor:'t1', {NAME => 'cf1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
2022-02-24 12:08:09:190[INFO]: Got an error when resolving hostNames. Falling back to /default-rack for all
2022-02-24 12:08:09:189[INFO]: [data collection]:[HBASE]:[WRITE]:writeDS:=====start=======
2022-02-24 12:08:10:191[INFO]: Got an error when resolving hostNames. Falling back to /default-rack for all
2022-02-24 12:08:10:201[INFO]: Code generated in 308.11907 ms
2022-02-24 12:08:10:263[INFO]: [data collection]:[HBASE]:[WRITE]:DataFrame:=====MapPartitionsRDD[3] at rdd at HbaseDataSources.scala:214
2022-02-24 12:08:10:294[INFO]: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
2022-02-24 12:08:10:299[INFO]: Using output committer class org.apache.hadoop.mapred.FileOutputCommitter
2022-02-24 12:08:10:301[INFO]: File Output Committer Algorithm version is 2
2022-02-24 12:08:10:301[INFO]: FileOutputCommitter skip cleanup temporary folders under output directory:false, ignore cleanup failures: false
2022-02-24 12:08:10:301[WARN]: Output Path is null in setupJob()
2022-02-24 12:08:10:325[INFO]: Starting job: runJob at SparkHadoopWriter.scala:78
2022-02-24 12:08:10:341[INFO]: Got job 0 (runJob at SparkHadoopWriter.scala:78) with 1 output partitions
2022-02-24 12:08:10:342[INFO]: Final stage: ResultStage 0 (runJob at SparkHadoopWriter.scala:78)
2022-02-24 12:08:10:342[INFO]: Parents of final stage: List()
2022-02-24 12:08:10:344[INFO]: Missing parents: List()
spark.rdd.scope.noOverride===true
spark.jobGroup.id===946377927967121408
spark.rdd.scope==={"id":"6","name":"saveAsHadoopDataset"}
spark.job.description===mysql2hbase_2022-02-24 12:07:58_946377927967121408
spark.job.interruptOnCancel===false
=====jobStart.properties:{spark.rdd.scope.noOverride=true, spark.jobGroup.id=946377927967121408_, spark.rdd.scope={"id":"6","name":"saveAsHadoopDataset"}, spark.job.description=mysql2hbase_2022-02-24 12:07:58_946377927967121408, spark.job.interruptOnCancel=false}
Process:null
2022-02-24 12:08:10:348[INFO]: Submitting ResultStage 0 (MapPartitionsRDD[4] at map at HbaseDataSources.scala:215), which has no missing parents
2022-02-24 12:08:10:348[ERROR]: Listener ServerSparkListener threw an exception
scala.MatchError: null
at com.zyc.common.ServerSparkListener.onJobStart(ServerSparkListener.scala:32)
at org.apache.spark.scheduler.SparkListenerBus$class.doPostEvent(SparkListenerBus.scala:37)
at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
at org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:91)
at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$super$postToAll(AsyncEventQueue.scala:92)
at org.apache.spark.scheduler.AsyncEventQueue$$anonfun$org$apache$spark$scheduler$AsyncEventQueue$$dispatch$1.apply$mcJ$sp(AsyncEventQueue.scala:92)
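For context: scala.MatchError: null is thrown when a pattern match receives a value that none of its cases cover. The ServerSparkListener source is not shown in this thread, so the following is only a minimal sketch of the likely failure mode; the property name is taken from the jobStart.properties dump above, and the match structure is an assumption:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart}

// Hypothetical reconstruction of the failing code path, for illustration only.
class MatchErrorSketch extends SparkListener {
  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    // java.util.Properties.getProperty returns null for a missing key.
    val desc = jobStart.properties.getProperty("spark.job.description")
    desc match {
      // A type pattern never matches null, so without a `case null` or
      // `case _` branch, a null value throws scala.MatchError: null,
      // which is exactly what the log above shows.
      case s: String => println(s"[ZDH] tracking job: $s")
    }
  }
}
```

This would be consistent with the Process:null line printed just before the failure: some value the listener reads at job start is null, and the match has no branch for it.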
Hello, confirmed: this is a bug. It affects versions 4.7.18 and earlier, all of which fail to complete the HBase write. It will be fixed in version 5.0.0. Temporary workaround: modify the zdh_server source, changing the Spark listener as shown in the screenshot below:
If you are using the packaged release, you can download the 4.7.10 source, modify this file, compile it, and copy the compiled class into the jar, as shown in the screenshot below:
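The screenshots referenced above are not reproduced in this thread. As a rough sketch of the kind of null guard the workaround describes (only ServerSparkListener.onJobStart at line 32 is confirmed by the stack trace; the property name and the handling in each branch are assumptions):

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart}

class ServerSparkListener extends SparkListener {
  override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
    // Wrapping the possibly-null lookup in Option gives the match a
    // covering case for every input, so a missing property no longer
    // raises scala.MatchError: null on the listener bus.
    Option(jobStart.properties.getProperty("spark.job.description")) match {
      case Some(desc) => println(s"[ZDH] job started: $desc") // original tracking logic goes here
      case None       => () // property absent: skip this job instead of crashing
    }
  }
}
```

An equivalent minimal change is to add `case null =>` and a wildcard `case _ =>` branch to the existing match, so that no job-start event can crash the listener.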