Hi, Paul:
"One way to do that is to store each attribute in its own node with a fixed properties for the attribute names and values (AttrName=“Width”, AttrValue=“3 inches”) and index on AttrValue. Call that an Attribute (singular) node. Tie the group to an Attributes node which in turn connects to the product node. "
Do you mean this?
Let’s say I have at least length, width and height attributes, so I can have 3 node types:
Length(attrName:xxx; attrValue:xxx)
Width(attrName:xxx; attrValue:xxx)
Height(attrName:xxx; attrValue:xxx)
If this is the case, the 3 should simply be collapsed into one node type, since they all share the same properties.
Then what do you mean by “Call that an Attribute (singular) node”? I could use multiple labels, so that in addition to Length, Width and Height, each node also carries an Attribute label:
create (a:Length:Attribute …)
create (a:Width:Attribute …)
create (a:Height:Attribute …)
Even then, I don’t think I can use the index on the attribute nodes effectively, because $attrKey and $attrValue are variables that have to be used in the WHERE clause in Neo4j. I think that’s the fundamental problem.
I can create indexes on various properties of the ProductAttribute node in the current graph, and they speed up this query a lot, where the ‘resolution’ property is indexed, thanks to ‘USING INDEX’:
match (s:Product {type:'Phone'})-[r]->(o:Attributes {resolution:'2000'})
USING INDEX o:Attributes(resolution)
return s, o limit 2
However, since my query has to accept variables, it has to change to something like this:
match (s:Product {type:'Phone'})-[r]->(o:Attributes)
// USING INDEX o:Attributes(resolution)
WHERE any(key in keys(o) WHERE key=$attrName AND o[key] contains $attrValue)
return s, o limit 2
Now USING INDEX is invalid in the above query, and the query slows down significantly.
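One workaround I could try is to build the query text on the client, so that the property name appears literally in the query and only the value stays a parameter. Below is a minimal sketch in Python, not something I have benchmarked. The names INDEXED_KEYS and build_query are made up for illustration, and I am assuming every key in the whitelist has an index on :Attributes. Whether the planner actually uses the index for CONTAINS probably depends on the Neo4j version and the index type.

```python
# Hypothetical whitelist: property keys on :Attributes that are assumed to be indexed.
INDEXED_KEYS = {"resolution"}


def build_query(attr_name: str) -> str:
    """Return Cypher text with attr_name written in as a literal property key."""
    # The key is interpolated into the query text, so only accept known keys;
    # this also keeps arbitrary user input out of the query string.
    if attr_name not in INDEXED_KEYS:
        raise ValueError(f"unsupported attribute: {attr_name!r}")
    return (
        "MATCH (s:Product {type: 'Phone'})-[r]->(o:Attributes)\n"
        f"WHERE o.{attr_name} CONTAINS $attrValue\n"
        "RETURN s, o LIMIT 2"
    )


# The attribute value is still passed as an ordinary query parameter.
query = build_query("resolution")
params = {"attrValue": "2000"}
```

The resulting query and params would then be handed to whatever driver call is already in use. The cost is one distinct query text per attribute name instead of a single generic query.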
If Stardog can solve this problem in a straightforward way, that would be great. So I am not really sure whether changing the graph model in Neo4j would improve things.