VirtualGraph query error

leon · September 9, 2021, 9:07pm

Hi, I'm trying to create a virtual graph that points to Databricks.

The virtual graph is created successfully using the following mapping:

PREFIX : <http://www.test.com/2021/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

MAPPING
FROM SQL {
  SELECT *
  FROM "db_name"."table_name"
}
TO {
  ?subject <http://www.test.com/2021/table_name#product_code> "?product_code"^^xsd:integer .
  ?subject <http://www.test.com/2021/table_name#db_schema> ?db_schema .
  ?subject rdf:type :table_name
} WHERE {
  BIND (template("http://www.test.com/2021/table_name/{product_code}") AS ?subject)
}

In the above mapping I have to put double quotes around the Databricks database and table names in the FROM clause; otherwise I can't create the virtual graph.

When I query the virtual graph with the following query, I get an "unexpected end of JSON input" error:

PREFIX : <http://www.test.com/2021/>
select * {
    GRAPH <virtual://client> {
        ?test_client a :sbs_client .
    }
}

Based on stardog.log, the generated SQL has double quotes around db_name and table_name, which Databricks does not accept. If I remove the double quotes from the original mapping file, I cannot create the virtual graph. Can someone help me figure out how to fix this?

Here is the error in stardog.log:

== SQL ==
SELECT "product_code"
FROM "db_name"."table_name"
-----^^^
WHERE "product_code" IS NOT NULL
FETCH NEXT 1000 ROWS ONLY

	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:863)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:637)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:144)
	at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:91)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:53)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:637)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:632)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
	at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:646)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.sql.catalyst.parser.ParseException: 
mismatched input '"datacloud_nonprod_dit_raw"' expecting {'(', 'CALLED', 'CLONE', 'COLLECT', 'CONTAINS', 'CONVERT', 'COPY', 'COPY_OPTIONS', 'CREDENTIALS', 'DEEP', 'DEFINER', 'DELTA', 'DETERMINISTIC', 'ENCRYPTION', 'FILES', 'FORMAT_OPTIONS', 'HISTORY', 'INPUT', 'INVOKER', 'LANGUAGE', 'MODIFIES', 'OPTIMIZE', 'PATTERN', 'READS', 'RESTORE', 'RETURN', 'RETURNS', 'SAMPLE', 'SECURITY', 'SHALLOW', 'SPECIFIC', 'SQL', 'TIMESTAMP', 'VERSION', 'ZORDER', 'ADD', 'AFTER', 'ALL', 'ALTER', 'ALWAYS', 'ANALYZE', 'AND', 'ANTI', 'ANY', 'ARCHIVE', 'ARRAY', 'AS', 'ASC', 'AT', 'AUTHORIZATION', 'BETWEEN', 'BOTH', 'BUCKET', 'BUCKETS', 'BY', 'CACHE', 'CASCADE', 'CASE', 'CAST', 'CHANGE', 'CHECK', 'CLEAR', 'CLUSTER', 'CLUSTERED', 'CODEGEN', 'COLLATE', 'COLLECTION', 'COLUMN', 'COLUMNS', 'COMMENT', 'COMMIT', 'COMPACT', 'COMPACTIONS', 'COMPUTE', 'CONCATENATE', 'CONSTRAINT', 'COST', 'CREATE', 'CROSS', 'CUBE', 'CURRENT', 'CURRENT_DATE', 'CURRENT_TIME', 'CURRENT_TIMESTAMP', 'CURRENT_USER', 'DATA', 'DATABASE', DATABASES, 'DBPROPERTIES', 'DEFINED', 'DELETE', 'DELIMITED', 'DESC', 'DESCRIBE', 'DFS', 'DIRECTORIES', 'DIRECTORY', 'DISTINCT', 'DISTRIBUTE', 'DIV', 'DROP', 'ELSE', 'END', 'ESCAPE', 'ESCAPED', 'EXCEPT', 'EXCHANGE', 'EXISTS', 'EXPLAIN', 'EXPORT', 'EXTENDED', 'EXTERNAL', 'EXTRACT', 'FALSE', 'FETCH', 'FIELDS', 'FILTER', 'FILEFORMAT', 'FIRST', 'FOLLOWING', 'FOR', 'FOREIGN', 'FORMAT', 'FORMATTED', 'FROM', 'FULL', 'FUNCTION', 'FUNCTIONS', 'GENERATED', 'GLOBAL', 'GRANT', 'GROUP', 'GROUPING', 'HAVING', 'IF', 'IGNORE', 'IMPORT', 'IN', 'INDEX', 'INDEXES', 'INNER', 'INPATH', 'INPUTFORMAT', 'INSERT', 'INTERSECT', 'INTERVAL', 'INTO', 'IS', 'ITEMS', 'JOIN', 'KEY', 'KEYS', 'LAST', 'LATERAL', 'LAZY', 'LEADING', 'LEFT', 'LIKE', 'LIMIT', 'LINES', 'LIST', 'LOAD', 'LOCAL', 'LOCATION', 'LOCK', 'LOCKS', 'LOGICAL', 'MACRO', 'MAP', 'MATCHED', 'MERGE', 'MSCK', 'NAMESPACE', 'NAMESPACES', 'NATURAL', 'NO', NOT, 'NULL', 'NULLS', 'OF', 'ON', 'ONLY', 'OPTION', 'OPTIONS', 'OR', 'ORDER', 'OUT', 'OUTER', 'OUTPUTFORMAT', 'OVER', 
'OVERLAPS', 'OVERLAY', 'OVERWRITE', 'PARTITION', 'PARTITIONED', 'PARTITIONS', 'PERCENT', 'PIVOT', 'PLACING', 'POSITION', 'PRECEDING', 'PRIMARY', 'PRINCIPALS', 'PROPERTIES', 'PURGE', 'QUERY', 'RANGE', 'RECORDREADER', 'RECORDWRITER', 'RECOVER', 'REDUCE', 'REFERENCES', 'REFRESH', 'RENAME', 'REPAIR', 'REPLACE', 'RESET', 'RESTRICT', 'REVOKE', 'RIGHT', RLIKE, 'ROLE', 'ROLES', 'ROLLBACK', 'ROLLUP', 'ROW', 'ROWS', 'SCHEMA', 'SELECT', 'SEMI', 'SEPARATED', 'SERDE', 'SERDEPROPERTIES', 'SESSION_USER', 'SET', 'MINUS', 'SETS', 'SHOW', 'SKEWED', 'SOME', 'SORT', 'SORTED', 'START', 'STATISTICS', 'STORED', 'STRATIFY', 'STRUCT', 'SUBSTR', 'SUBSTRING', 'TABLE', 'TABLES', 'TABLESAMPLE', 'TBLPROPERTIES', TEMPORARY, 'TERMINATED', 'THEN', 'TIME', 'TO', 'TOUCH', 'TRAILING', 'TRANSACTION', 'TRANSACTIONS', 'TRANSFORM', 'TRIM', 'TRUE', 'TRUNCATE', 'TYPE', 'UNARCHIVE', 'UNBOUNDED', 'UNCACHE', 'UNION', 'UNIQUE', 'UNKNOWN', 'UNLOCK', 'UNSET', 'UPDATE', 'USE', 'USER', 'USING', 'VALUES', 'VIEW', 'VIEWS', 'WHEN', 'WHERE', 'WINDOW', 'WITH', 'ZONE', IDENTIFIER, BACKQUOTED_IDENTIFIER}(line 2, pos 5)

jess · September 9, 2021, 9:11pm

Hi Leon,

Can you confirm that you chose the "Databricks and Spark SQL" option as the database platform when creating the virtual graph?

What was the error if you didn't use the quotes? Is your table name a reserved word or does it include special characters?

Jess

leon · September 9, 2021, 10:22pm

Yes, you can see it in the screen capture. The virtual graph is created with "Databricks and Spark SQL".

If I remove the double quotes around the db and table names, I get the following error (see the second screen capture):

From line 7, column 17 to line 7, column 57: Object 'DATACLOUD_NONPROD_DIT_RAW' not found; did you mean 'datacloud_nonprod_dit_raw'?
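
The message suggests that the unquoted name was folded to upper case before it was looked up, which is common in ANSI-style SQL parsers. A minimal Python sketch of that behavior follows. The helpers `fold_identifier` and `resolve` and the one-entry `catalog` are hypothetical and only illustrate the idea; this is not Stardog's actual code.

```python
def fold_identifier(token: str) -> str:
    """Mimic ANSI-style folding: quoted names keep their case, unquoted names are upper-cased."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]
    return token.upper()


def resolve(token: str, catalog: set) -> str:
    """Look up a (possibly quoted) identifier in a case-sensitive catalog."""
    name = fold_identifier(token)
    if name in catalog:
        return name
    # Offer a case-insensitive suggestion, similar to the error in the post.
    hints = [c for c in catalog if c.lower() == name.lower()]
    hint = f"; did you mean '{hints[0]}'?" if hints else ""
    raise LookupError(f"Object '{name}' not found{hint}")


# Hypothetical catalog holding the lower-case schema name from the error message.
catalog = {"datacloud_nonprod_dit_raw"}
```

Under these assumptions, `resolve('"datacloud_nonprod_dit_raw"', catalog)` succeeds, while the unquoted form raises a `LookupError` naming the upper-cased identifier. That would explain why the quotes are needed in the mapping even though the generated SQL then fails on the Databricks side.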

jess · September 9, 2021, 10:47pm

Can you try something? Can you try creating a new datasource and choosing Hive initially and then changing JDBC driver class to the Databricks class name? Also which driver jar are you using?

Jess

leon · September 9, 2021, 11:22pm

I'm using the Databricks JDBC driver downloaded from here: https://databricks.com/spark/jdbc-drivers-download

jess · September 9, 2021, 11:27pm

Can you try setting sql.dialect to HIVE in the "other options" on the data source?

leon · September 9, 2021, 11:36pm

Setting sql.dialect to HIVE works. Thanks a lot for the prompt help!
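
For anyone who finds this thread later, here is a rough Python sketch of why the dialect setting matters. The parse error above lists BACKQUOTED_IDENTIFIER among the expected tokens, so Spark SQL wants backticks around identifiers, while ANSI-style SQL uses double quotes. The `QUOTE_CHARS` table and the `quote_identifier` and `render_query` helpers are hypothetical names made up for illustration; they are not how Stardog generates SQL.

```python
# Assumed mapping from dialect name to identifier quote character (illustrative only).
QUOTE_CHARS = {"ANSI": '"', "HIVE": "`"}


def quote_identifier(name: str, dialect: str) -> str:
    """Wrap a name in the dialect's quote character, doubling any embedded quote."""
    q = QUOTE_CHARS[dialect]
    return q + name.replace(q, q + q) + q


def render_query(column: str, schema: str, table: str, dialect: str) -> str:
    """Render a query shaped like the one in stardog.log, without the row limit."""
    col = quote_identifier(column, dialect)
    src = f"{quote_identifier(schema, dialect)}.{quote_identifier(table, dialect)}"
    return f"SELECT {col}\nFROM {src}\nWHERE {col} IS NOT NULL"


print(render_query("product_code", "db_name", "table_name", "ANSI"))
print(render_query("product_code", "db_name", "table_name", "HIVE"))
```

With "ANSI" the FROM line comes out as `"db_name"."table_name"`, the form that failed in the log. With "HIVE" it becomes backtick-quoted, which the Spark parser accepts.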

jess · September 9, 2021, 11:38pm

Glad to hear. We'll figure this out and get it fixed in Stardog. Let us know if you have any other questions or issues.

system · September 23, 2021, 11:38pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
