apache spark - Tablename with spaces at JDBC connection gives error

Question

Welcome To Ask or Share your Answers For Others

apache spark - Tablename with spaces at JDBC connection gives error

asked Jan 29, 2021 in Technique[技术] by 深蓝 (71.8m points)

I'm trying to establish a connection in AWS Glue, using a pyspark script. The JDBC connection is pointing to a Microsoft SQL Server in Azure Cloud.

When I try to enter the connection string, it works until it gets to the table that it should read. That's mainly because of the whitespace inside the table name. Do you have any hint on how to write the syntax here?

source_df = sparksession.read.format("jdbc").option("url","jdbc:sqlserver://00.000.00.00:1433;databaseName=Sample").option("dbtable", "dbo.122 SampleCompany DE$Contract Header").option("user", "sampleuser").option("password", "sampL3p4ssw0rd").load()

When you execute this, it always throws the error:

py4j.protocol.Py4JJavaError: An error occurred while calling o69.load. : com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near '.122'

Do you have any idea how to solve this?

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

636 views

1 Answer

深蓝 · Answer 1 · 2021-01-29T04:20:06+0000

Given the presence of spaces (and probably the dollar sign, and the fact the identifier starts with numbers), you need to quote the object name. Quoting object names in SQL Server is done by enclosing it in brackets (or, though this may depend on the session config, double quotes).

Keep in mind that dbo is the schema, while 122 SampleCompany DE$Contract Header is the table name. Schema and table name need to be quoted separately, not as a unit.

So, try to pass "dbo.[122 SampleCompany DE$Contract Header]"

Categories

apache spark - Tablename with spaces at JDBC connection gives error

Please log in or register to add a comment.

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags