pyspark mysql jdbc load: An error occurred while calling o23.load. No suitable driver

Date: 2023-08-22


Problem description

I use the Docker image sequenceiq/spark on my Mac to study these Spark examples. During the study process I upgraded the Spark inside that image to 1.6.1 according to this answer, and an error occurred when I started the Simple Data Operations example. Here is what happened:

When I run df = sqlContext.read.format("jdbc").option("url",url).option("dbtable","people").load(), it raises an error. The full stack trace from the pyspark console is as follows:

Python 2.6.6 (r266:84292, Jul 23 2015, 15:22:56)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-11)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
16/04/12 22:45:28 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/

Using Python version 2.6.6 (r266:84292, Jul 23 2015 15:22:56)
SparkContext available as sc, HiveContext available as sqlContext.
>>> url = "jdbc:mysql://localhost:3306/test?user=root;password=myPassWord"
>>> df = sqlContext.read.format("jdbc").option("url",url).option("dbtable","people").load()
16/04/12 22:46:05 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/04/12 22:46:06 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/04/12 22:46:11 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/04/12 22:46:11 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/04/12 22:46:16 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/04/12 22:46:17 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/spark/python/pyspark/sql/readwriter.py", line 139, in load
    return self._df(self._jreader.load())
  File "/usr/local/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
  File "/usr/local/spark/python/pyspark/sql/utils.py", line 45, in deco
    return f(*a, **kw)
  File "/usr/local/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o23.load.
: java.sql.SQLException: No suitable driver
    at java.sql.DriverManager.getDriver(DriverManager.java:278)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:50)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:50)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createConnectionFactory(JdbcUtils.scala:49)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:120)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
    at org.apache.spark.sql.execution.datasources.jdbc.DefaultSource.createRelation(DefaultSource.scala:57)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
    at py4j.Gateway.invoke(Gateway.java:259)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:209)
    at java.lang.Thread.run(Thread.java:744)

>>>
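
A "No suitable driver" exception from java.sql.DriverManager generally means that no registered JDBC driver accepted the URL. Two things that may be worth checking in the call above are the `driver` option of Spark's JDBC data source and the separator between URL properties: Connector/J separates properties with `&`, while the URL in the session uses `;`. The following is a minimal sketch under those assumptions, not a confirmed fix. The helper names `build_mysql_jdbc_url`, `jdbc_reader_options` and `load_table` are hypothetical, and `com.mysql.jdbc.Driver` is assumed to be the driver class inside the Connector/J 5.x jar.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2, as in the session above

# Assumption: driver class name used by Connector/J 5.x jars.
MYSQL_DRIVER_CLASS = "com.mysql.jdbc.Driver"


def build_mysql_jdbc_url(host, port, database, params):
    """Return a Connector/J style URL with '&'-separated properties."""
    # Sort the pairs so the resulting URL is deterministic.
    query = urlencode(sorted(params.items()))
    return "jdbc:mysql://{0}:{1}/{2}?{3}".format(host, port, database, query)


def jdbc_reader_options(url, table):
    """Options for sqlContext.read.format("jdbc"), driver class included."""
    return {"url": url, "dbtable": table, "driver": MYSQL_DRIVER_CLASS}


def load_table(sql_context, url, table):
    """Apply the options to a DataFrameReader and load the table."""
    reader = sql_context.read.format("jdbc")
    for key, value in jdbc_reader_options(url, table).items():
        reader = reader.option(key, value)
    return reader.load()
```

In the pyspark shell this would be used as `df = load_table(sqlContext, build_mysql_jdbc_url("localhost", 3306, "test", {"user": "root", "password": "myPassWord"}), "people")`. The connector jar still has to be visible on the driver classpath for the class to load.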

Here is what I have tried so far:

Downloaded mysql-connector-java-5.0.8-bin.jar and put it into /usr/local/spark/lib/. Still the same error.

Created t.py like this:

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName="PythonSQL")
sqlContext = SQLContext(sc)
df = sqlContext.read.format("jdbc").option("url",url).option("dbtable","people").load()

df.printSchema()
countsByAge = df.groupBy("age").count()
countsByAge.show()
countsByAge.write.format("json").save("file:///usr/local/mysql/mysql-connector-java-5.0.8/db.json")

Then I tried spark-submit --conf spark.executor.extraClassPath=mysql-connector-java-5.0.8-bin.jar --driver-class-path mysql-connector-java-5.0.8-bin.jar --jars mysql-connector-java-5.0.8-bin.jar --master local[4] t.py. The result was still the same.

Then I tried pyspark --conf spark.executor.extraClassPath=mysql-connector-java-5.0.8-bin.jar --driver-class-path mysql-connector-java-5.0.8-bin.jar --jars mysql-connector-java-5.0.8-bin.jar --master local[4] t.py, both with and without the trailing t.py argument. Still the same.
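
The commands above pass the jar by bare file name, so whether it is found may depend on the working directory. As a rough way to rule that out, the sketch below checks that a jar actually contains the driver class and builds the same spark-submit argument list with an absolute jar path. The function names `jar_has_class` and `spark_submit_args` are hypothetical helpers, and `com.mysql.jdbc.Driver` is assumed to be the class the connector jar provides.

```python
import os
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar (a zip archive) contains the given class."""
    entry = class_name.replace(".", "/") + ".class"
    archive = zipfile.ZipFile(jar_path)
    try:
        return entry in archive.namelist()
    finally:
        archive.close()


def spark_submit_args(jar_path, script, master="local[4]"):
    """Build the spark-submit argument list using an absolute jar path."""
    jar = os.path.abspath(jar_path)
    return [
        "spark-submit",
        "--conf", "spark.executor.extraClassPath=" + jar,
        "--driver-class-path", jar,
        "--jars", jar,
        "--master", master,
        script,
    ]


if __name__ == "__main__":
    jar = "mysql-connector-java-5.0.8-bin.jar"
    if os.path.isfile(jar):
        # Assumption: this is the driver class expected in the jar.
        print(jar_has_class(jar, "com.mysql.jdbc.Driver"))
        print(" ".join(spark_submit_args(jar, "t.py")))
    else:
        print("jar not found in " + os.getcwd())
```

The resulting list can be passed to `subprocess.call` or joined into a shell command. If `jar_has_class` returns False, the jar itself would be the first thing to look at.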

During all of this, MySQL was running. Here is my OS info:

# rpm --query centos-release  
centos-release-6-5.el6.centos.11.2.x86_64

The Hadoop version is 2.6.

Now I don't know where to go next, so I hope someone can give some advice. Thanks!
