
OPTIONAL MATCH issues + Continued Development and Support \ Community for Morpheus (Spark 3 support, bugfixes) #947

Open
@MarcianoAvihay

Description

Hi,
I have encountered what I believe is a bug in the OPTIONAL MATCH implementation, and I am hoping to get some guidance on how I could help fix it (and submit a pull request so that everyone gets the fix).

The issue is this:

I have run multiple tests that all lead to the same result: when an OPTIONAL MATCH expands over a relationship type that does not exist, any later OPTIONAL MATCH from the same variable to something that does exist never returns its match and yields only NULL.

Example to reproduce:

Test graph:

val test = morpheus.cypher(
"""
CONSTRUCT
CREATE (p1:Person {name: "Alice"})
CREATE (p2:Person {name: "Bob"})
CREATE (p3:Person {name: "Eve"})
CREATE (p4:Person {name: "Paul"})
CREATE (p1)-[:KNOWS]->(p3)
CREATE (p1)-[:KNOWS2]->(p2)
CREATE (p1)-[:KNOWS3]->(p3)
CREATE (p1)-[:KNOWS4]->(p4)
CREATE (p1)-[:KNOWS5]->(p2)

return GRAPH
""".stripMargin
).graph

Query:

val testres = test.cypher(
"""
match (p1:Person )
optional match (p1)-[:KNOWS]->(p2)
optional match (p1)-[:KNOWS15]->(p3)
optional match (p1)-[:KNOWS4]->(p4)
return p1.name, p2.name,p3.name,p4.name
""".stripMargin)

This yields the following result:

+-------+-------+-------+-------+
|p1_name|p2_name|p3_name|p4_name|
+-------+-------+-------+-------+
| Bob| null| null| null|
| Eve| null| null| null|
| Paul| null| null| null|
| Alice| Eve| null| null|
+-------+-------+-------+-------+

But if we change the order of the clauses so that the non-existent expansion (KNOWS15) comes last:

val testres = test.cypher(
"""
match (p1:Person )
optional match (p1)-[:KNOWS]->(p2)
optional match (p1)-[:KNOWS4]->(p4)
optional match (p1)-[:KNOWS15]->(p3)
return p1.name, p2.name,p3.name,p4.name
""".stripMargin)

We then get the correct result:
+-------+-------+-------+-------+
|p1_name|p2_name|p3_name|p4_name|
+-------+-------+-------+-------+
| Alice| Eve| null| Paul|
| Bob| null| null| null|
| Eve| null| null| null|
| Paul| null| null| null|
+-------+-------+-------+-------+

The first result is therefore not what I expected, since Alice is also connected to Paul via KNOWS4.
The same query and graph setup in Neo4j yields the following, regardless of the order of the clauses:

p1.name p2.name p3.name p4.name
"Alice" "Eve" null "Paul"
"Bob" null null null
"Eve" null null null
"Paul" null null null

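For reference, the behaviour I expect can be sketched in plain Scala collections, treating each OPTIONAL MATCH as a left outer join on p1. This is only a model of the semantics under that assumption, not of how Morpheus plans the query; `OptionalMatchSketch`, `optionalExpand` and `run` are hypothetical names made up for this illustration.

```scala
// Plain-Scala model of the example graph and query (no Morpheus or Spark involved).
object OptionalMatchSketch {
  // One result row: variable name -> bound person name (None stands for null).
  type Row = Map[String, Option[String]]

  val people: Seq[String] = Seq("Alice", "Bob", "Eve", "Paul")

  // (source, relationship type, target), same as the CONSTRUCT statement above.
  val rels: Seq[(String, String, String)] = Seq(
    ("Alice", "KNOWS", "Eve"),
    ("Alice", "KNOWS2", "Bob"),
    ("Alice", "KNOWS3", "Eve"),
    ("Alice", "KNOWS4", "Paul"),
    ("Alice", "KNOWS5", "Bob")
  )

  // OPTIONAL MATCH (p1)-[:relType]->(target) modelled as a left outer join:
  // every input row is kept, and rows without a match get None in the new column.
  def optionalExpand(rows: Seq[Row], relType: String, target: String): Seq[Row] =
    rows.flatMap { row =>
      val matches = rels.collect {
        case (src, t, dst) if t == relType && row("p1").contains(src) => dst
      }
      if (matches.isEmpty) Seq(row + (target -> None))
      else matches.map(dst => row + (target -> Some(dst)))
    }

  // Applies the optional expansions in the given order, starting from MATCH (p1:Person).
  def run(steps: Seq[(String, String)]): Set[Row] = {
    val start: Seq[Row] = people.map(p => Map("p1" -> Option(p)))
    val result: Seq[Row] = steps.foldLeft(start) { case (rows, (relType, target)) =>
      optionalExpand(rows, relType, target)
    }
    result.toSet
  }

  // Non-existent KNOWS15 in the middle (the ordering that fails in Morpheus).
  val nonExistentInMiddle: Set[Row] =
    run(Seq("KNOWS" -> "p2", "KNOWS15" -> "p3", "KNOWS4" -> "p4"))

  // Non-existent KNOWS15 last (the ordering that works in Morpheus).
  val nonExistentLast: Set[Row] =
    run(Seq("KNOWS" -> "p2", "KNOWS4" -> "p4", "KNOWS15" -> "p3"))
}
```

In this model both orderings produce the same four rows, with Alice bound to Eve (p2) and Paul (p4), which matches the Neo4j output above.
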
In OptionalMatchTests.scala in Morpheus I found the following test, which does not cover the case above, and nothing else that does (no test in Morpheus or the TCK covers a non-existent pattern followed by an existing one):

val g = initGraph(
  """
    |CREATE (:DoesExist {property: 42})
    |CREATE (:DoesExist {property: 43})
    |CREATE (:DoesExist {property: 44})
  """.stripMargin)

val res = g.cypher(
  """
    |OPTIONAL MATCH (f:DoesExist)
    |OPTIONAL MATCH (n:DoesNotExist)
    |RETURN collect(DISTINCT n.property) AS a, collect(DISTINCT f.property) AS b
  """.stripMargin)

Can any of the devs please share their thoughts on how hard it would be to get to the expected behaviour, and where I should start looking in the codebase to try to fix it?

Is it a matter of a complex modification to the relational and logical planners to get the required behaviour? If there is some small tweak that comes to mind, it would greatly help me (I am trying to use Morpheus to automate a combination of left outer joins and inner joins needed on very large datasets).

Thanks very much!
