I was building a graph for a recommendation system and added vertices for users, categories, and products, with edges representing the connections between them. A product can be connected to several categories, with a rating stored as a property on each of those edges. Users can also have a rating for each category. So it looks something like this:
-- User preferences.
SELECT * FROM cypher('RecommenderSystem', $$
MATCH (a:Person {name: 'Abigail'}), (A:Category), (C:Category), (H:Category)
WHERE A.name = 'A' AND C.name = 'C' AND H.name = 'H'
CREATE (a)-[:RATING {rating: 3}]->(C),
(a)-[:RATING {rating: 1}]->(A),
(a)-[:RATING {rating: 0}]->(H)
$$) AS (a agtype);
-- Products rating.
SELECT * FROM cypher('RecommenderSystem', $$
MATCH (product:Product {title: 'Product_Name'}), (A:Category), (C:Category), (H:Category)
WHERE A.name = 'A' AND C.name = 'C' AND H.name = 'H'
CREATE (product)-[:RATING {rating: 0}]->(C),
(product)-[:RATING {rating: 4}]->(A),
(product)-[:RATING {rating: 0}]->(H)
$$) AS (a agtype);
My recommendation system is based on content filtering, which uses information we know about people and products as connective tissue for recommendations. This requires a calculation like:
[(user_rating_C x product_rating_C) + (user_rating_A x product_rating_A) + (user_rating_H x product_rating_H)] / (num_categories x max_rating)
For example, the likelihood of Abigail liking the product from the Cypher query above would be:
[(3 x 0) + (1 x 4) + (0 x 0)] / (3 x 4) = 4 / 12 ≈ 0.33
On a range from 0 to 4, this means she is likely to hate the product. The closer the value is to 4, the more likely the user is to buy or consume the product.
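To make the arithmetic concrete, here is a minimal Python sketch of the formula, using the ratings from the two CREATE queries above. The dictionary layout and the function name affinity are illustrative assumptions, not part of the graph model, and a category the product has no rating for is assumed to contribute 0.

```python
# Ratings copied from the two CREATE queries above (category name -> rating).
user_ratings = {"C": 3, "A": 1, "H": 0}      # Abigail
product_ratings = {"C": 0, "A": 4, "H": 0}   # Product_Name

MAX_RATING = 4  # ratings range from 0 to 4


def affinity(user, product, max_rating=MAX_RATING):
    """Sum of per-category rating products, divided by (num_categories * max_rating)."""
    # Assumption: a category missing from the product counts as rating 0.
    total = sum(rating * product.get(category, 0) for category, rating in user.items())
    return total / (len(user) * max_rating)


print(affinity(user_ratings, product_ratings))  # 4 / 12
```

With every rating at its maximum of 4, the same function returns 4.0, which matches the stated 0 to 4 range.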
But then, how could I retrieve every edge rating that is connected to a person and a product and do this type of calculation with it?
3 Answers
The following query should work for this situation:
This outputs:
Explanation
You need to find the categories for which both the person and the product have set a rating, using the MATCH clause. Once you have these ratings, the sum of their products gives the numerator of the formula.
Now divide it by the product of num_categories and max_rating. You get num_categories using COUNT(DISTINCT c), and I assume that you already know the max_rating.
Hope it helps.
Edit
I assumed that by num_categories you meant the total number of categories in the system, not only the ones that the person and the product have in common. If num_categories is instead the count of categories that the product and the person share, then modify your WITH clause to count only those shared categories. Otherwise the query is fine as it is.
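The two readings of num_categories only differ when the person and the product do not share every category. A minimal Python sketch of both denominators follows; the data, the function names, and the choice to return 0.0 when nothing is shared are hypothetical and for illustration only.

```python
def affinity_total(user, product, all_categories, max_rating=4):
    """Denominator uses every category in the system."""
    shared = user.keys() & product.keys()
    numerator = sum(user[c] * product[c] for c in shared)
    return numerator / (len(all_categories) * max_rating)


def affinity_shared(user, product, max_rating=4):
    """Denominator uses only the categories rated by both sides."""
    shared = user.keys() & product.keys()
    if not shared:
        return 0.0  # assumption: no shared category means no affinity
    numerator = sum(user[c] * product[c] for c in shared)
    return numerator / (len(shared) * max_rating)


# Hypothetical data: the product is rated for only two of the three categories.
user = {"C": 3, "A": 1, "H": 0}
product = {"C": 2, "A": 4}

print(affinity_total(user, product, {"A", "C", "H"}))  # 10 / 12
print(affinity_shared(user, product))                  # 10 / 8
```

The numerator is the same in both cases. Only the normalisation changes, so the shared-category variant scores a sparsely rated product higher.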
Something like this may work for you:
perName is the person’s name, x is a list of the desired category/product name pairs, and affinity will be the calculated result.
NOTE: Even if not all desired pairs in x are found in the data, this query uses the size of x in the denominator. Adjust the query if this is not wanted.
[UPDATE]
Unfortunately, the ANY predicate function is not part of openCypher, so it is not supported by Apache AGE. Even more unfortunately, even though list comprehension is a part of openCypher, AGE does not yet support that either.
But, on an openCypher system that does support list comprehension, we could replace this:
with something like this (we don’t care about the generated list’s contents, so we just use arbitrary 1 elements):

If I understand correctly, you want to calculate the rating of each product for a user based on the given formula:
[(user_rating_C x product_rating_C) + (user_rating_A x product_rating_A) + (user_rating_H x product_rating_H)] / (num_categories x max_rating)
According to your model, max_rating is set to 4 (range from 0 to 4). To perform this calculation, you can use the following query:
I added another product (rating 0 with category C, rating 1 with category A, and rating 3 with category H) and this query gave me these results:
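Independently of the query, the values the formula should produce for these two products can be checked by hand. The Python sketch below does that for Abigail. It is only an arithmetic cross-check, not the query's output, and the title Other_Product is a placeholder, since the second product's name is not given.

```python
MAX_RATING = 4  # ratings range from 0 to 4

user = {"C": 3, "A": 1, "H": 0}  # Abigail's ratings from the question

products = {
    "Product_Name": {"C": 0, "A": 4, "H": 0},   # ratings from the question
    "Other_Product": {"C": 0, "A": 1, "H": 3},  # placeholder title; ratings from this answer
}


def score(user_ratings, product_ratings, max_rating=MAX_RATING):
    """Apply the formula: sum of rating products / (num_categories * max_rating)."""
    total = sum(r * product_ratings.get(c, 0) for c, r in user_ratings.items())
    return total / (len(user_ratings) * max_rating)


# Rank products from most to least likely to be liked.
ranking = sorted(
    ((title, score(user, ratings)) for title, ratings in products.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for title, value in ranking:
    print(f"{title}: {value:.3f}")
```

Under these assumptions Product_Name scores 4/12 and the added product scores 1/12, so both sit near the bottom of the 0 to 4 range for Abigail.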