Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

First of all, I didn't make a minimum example because i think that my problem can be understood without it.

Second, I didn't give you the data because i think that my problem can be solved without it. However, I'm open to give it to you if you ask.

This is my query:

select distinct (?x as ?likedItem) (?item as ?suggestedItem) ?similarity ?becauseOf  ((?similarity * ?importance * ?levelImportance) as ?finalSimilarity)

{
  values ?user {bo:ania}
  #the variable ?x is bound to the items the user :ania has liked.
  ?user rs:hasRated ?ratings.
  ?ratings a rs:Likes.
  ?ratings rs:aboutItem ?x.
  ?ratings rs:ratesBy   ?ratingValue.
  #level 0 class similarities
  {
    #extract all the items that are from the same class (type) as the liked items.
    #I assumed the being from the same class accounts for 50% of the similarities.
    #This value can be changed according to the test or the application domain.
    values ?classImportance {0.5} #class level
    bind (?classImportance as ?importance)
    bind( 4/7 as ?levelImportance)
  ?x  a ?class.
  ?class rdfs:subClassOf ?mainClass .
  ?mainClass rdfs:subClassOf rs:RecommendableClass .
  ?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
  ?similarityConfiguration rs:hasClassSimilarity ?classSimilarity .
  ?classSimilarity rs:appliedOnClass ?class .
  ?classSimilarity rs:hasClassSimilarityValue ?similarity .
  ?item a ?class.
  bind (concat("it shares the same class, which is ", strafter(str(?class), "#"), ", with ", strafter(str(?x), "#")) as ?becauseOf)
  }
  union
   #level 0 instance similarities
  {
  #extract the items that share the same value for important predicates with the already liked items..
  #I assumed that having the same instance for important predicates account for 100% of the similarities.
  #This value can be changed according to the test or the application domain.
   values ?instanceImportance {1} #instance level
   bind (?instanceImportance as ?importance)
   bind( 4/7 as ?levelImportance)
   ?x  a ?class.
  ?class rdfs:subClassOf ?mainClass .
  ?mainClass rdfs:subClassOf rs:RecommendableClass .
  ?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
  ?similarityConfiguration rs:hasPropertySimilarity ?propertySimilarity .
  ?propertySimilarity rs:appliedOnProperty ?property .
  ?propertySimilarity rs:hasPropertySimilarityValue ?similarity .
  ?x ?property ?value .
  ?item ?property ?value .
    bind (concat("it shares ", strafter(str(?value), "#"), " for predicate ", strafter(str(?property), "#"), " with ", strafter(str(?x), "#")) as ?becauseOf)
  }
  filter (?x != ?item)
}

This is the result: enter image description here

As you see, the result contains many values for the same suggestedItem, I want to make group according to the suggestedItem and sum the values of finalSimilarity

I tried this:

select   ?item (SUM(?similarity * ?importance * ?levelImportance ) as ?finalSimilarity)  (group_concat(distinct ?x) as ?likedItem) (group_concat(?becauseOf ; separator = " ,and ") as ?reason) where
{
  values ?user {bo:ania}
  #the variable ?x is bound to the items the user :ania has liked.
  ?user rs:hasRated ?ratings.
  ?ratings a rs:Likes.
  ?ratings rs:aboutItem ?x.
  ?ratings rs:ratesBy   ?ratingValue.
  #level 0 class similarities
  {
    #extract all the items that are from the same class (type) as the liked items.
    #I assumed the being from the same class accounts for 50% of the similarities.
    #This value can be changed according to the test or the application domain.
    values ?classImportance {0.5} #class level
    bind (?classImportance as ?importance)
    bind( 4/7 as ?levelImportance)
  ?x  a ?class.
  ?class rdfs:subClassOf ?mainClass .
  ?mainClass rdfs:subClassOf rs:RecommendableClass .
  ?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
  ?similarityConfiguration rs:hasClassSimilarity ?classSimilarity .
  ?classSimilarity rs:appliedOnClass ?class .
  ?classSimilarity rs:hasClassSimilarityValue ?similarity .
  ?item a ?class.
  bind (concat("it shares the same class, which is ", strafter(str(?class), "#"), ", with ", strafter(str(?x), "#")) as ?becauseOf)
  }
  union
   #level 0 instance similarities
  {
  #extract the items that share the same value for important predicates with the already liked items..
  #I assumed that having the same instance for important predicates account for 100% of the similarities.
  #This value can be changed according to the test or the application domain.
   values ?instanceImportance {1} #instance level
   bind (?instanceImportance as ?importance)
   bind( 4/7 as ?levelImportance)
   ?x  a ?class.
  ?class rdfs:subClassOf ?mainClass .
  ?mainClass rdfs:subClassOf rs:RecommendableClass .
  ?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
  ?similarityConfiguration rs:hasPropertySimilarity ?propertySimilarity .
  ?propertySimilarity rs:appliedOnProperty ?property .
  ?propertySimilarity rs:hasPropertySimilarityValue ?similarity .
  ?x ?property ?value .
  ?item ?property ?value .
    bind (concat("it shares ", strafter(str(?value), "#"), " for predicate ", strafter(str(?property), "#"), " with ", strafter(str(?x), "#")) as ?becauseOf)
  }
  filter (?x != ?item)
}
group by ?item
order by desc(?finalSimilarity)

but the result is:

enter image description here

this is something wrong in my way because if you look at the finalSimilarity in, the value is 1.7. However, if you sum that manually from the first query, you get 0.62 so I did something wrong,

could you help me discover it?

Please note that the two queries are the same, it is just the select statment are different

Hint

I am already able to solve it using two selects like this:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rs: <http://www.SemanticRecommender.com/rs#>
PREFIX bo: <http://www.BookOntology.com/bo#>
PREFIX :<http://www.SemanticBookOntology.com/sbo#>

select ?suggestedItem ( SUM (?finalSimilarity) as ?summedFinalSimilarity)  (group_concat(distinct strafter(str(?likedItem), "#")) as ?becauseYouHaveLikedThisItem) (group_concat(?becauseOf ; separator = " ,and ") as ?reason)
where {
select distinct (?x as ?likedItem) (?item as ?suggestedItem) ?similarity ?becauseOf  ((?similarity * ?importance * ?levelImportance) as ?finalSimilarity)
where
{
  values ?user {bo:ania}
  #the variable ?x is bound to the items the user :ania has liked.
  ?user rs:hasRated ?ratings.
  ?ratings a rs:Likes.
  ?ratings rs:aboutItem ?x.
  ?ratings rs:ratesBy   ?ratingValue.
  #level 0 class similarities
  {
    #extract all the items that are from the same class (type) as the liked items.
    #I assumed the being from the same class accounts for 50% of the similarities.
    #This value can be changed according to the test or the application domain.
    values ?classImportance {0.5} #class level
    bind (?classImportance as ?importance)
    bind( 4/7 as ?levelImportance)
  ?x  a ?class.
  ?class rdfs:subClassOf ?mainClass .
  ?mainClass rdfs:subClassOf rs:RecommendableClass .
  ?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
  ?similarityConfiguration rs:hasClassSimilarity ?classSimilarity .
  ?classSimilarity rs:appliedOnClass ?class .
  ?classSimilarity rs:hasClassSimilarityValue ?similarity .
  ?item a ?class.
  bind (concat("it shares the same class, which is ", strafter(str(?class), "#"), ", with ", strafter(str(?x), "#")) as ?becauseOf)
  }
  union
   #level 0 instance similarities
  {
  #extract the items that share the same value for important predicates with the already liked items..
  #I assumed that having the same instance for important predicates account for 100% of the similarities.
  #This value can be changed according to the test or the application domain.
   values ?instanceImportance {1} #instance level
   bind (?instanceImportance as ?importance)
   bind( 4/7 as ?levelImportance)
   ?x  a ?class.
  ?class rdfs:subClassOf ?mainClass .
  ?mainClass rdfs:subClassOf rs:RecommendableClass .
  ?mainClass rs:hasSimilarityConfiguration ?similarityConfiguration .
  ?similarityConfiguration rs:hasPropertySimilarity ?propertySimilarity .
  ?propertySimilarity rs:appliedOnProperty ?property .
  ?propertySimilarity rs:hasPropertySimilarityValue ?similarity .
  ?x ?property ?value .
  ?item ?property ?value .
    bind (concat("it shares ", strafter(str(?value), "#"), " for predicate ", strafter(str(?property), "#"), " with ", strafter(str(?x), "#")) as ?becauseOf)
  }
  filter (?x != ?item)
}
}
group by ?suggestedItem
order by desc(?summedFinalSimilarity)

but to me that is a stupid solution and there must be a more clever one where i can take the aggregated data using one select

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
139 views
Welcome To Ask or Share your Answers For Others

1 Answer

Without seeing your data, it's impossible to say, and with a query this big, it's probably not worth trying to debug the exact problem, but it's easy for this to happen if you can have duplicates (which would be easy to get, especially if you're using unions where some condition could match both parts). For instance, suppose you have data like this:

@prefix : <urn:ex:>

:x :similar [ :sim 0.10 ; :mult 2 ] ,
            [ :sim 0.12 ; :mult 1 ] ,
            [ :sim 0.12 ; :mult 1 ] ,  # yup, a duplicate
            [ :sim 0.15 ; :mult 4 ] .    

Then if you run this query, you'll get four result rows:

prefix : <urn:ex:>

select ?sim ((?sim * ?mult) as ?final) {
  :x :similar [ :sim ?sim ; :mult ?mult ] .
}
----------------
| sim  | final |
================
| 0.15 | 0.60  |
| 0.12 | 0.12  |
| 0.12 | 0.12  |
| 0.10 | 0.20  |
----------------

However, if you select distinct, you'll only see three:

select distinct ?sim ((?sim * ?mult) as ?final) {
  :x :similar [ :sim ?sim ; :mult ?mult ] .
}
----------------
| sim  | final |
================
| 0.15 | 0.60  |
| 0.12 | 0.12  |
| 0.10 | 0.20  |
----------------

Once you start to group by and sum, those non-distinct values will both get included:

select (sum(?sim * ?mult) as ?final) {
  :x :similar [ :sim ?sim ; :mult ?mult ] .
}
---------
| final |
=========
| 1.04  |
---------

That sum is the sum of all four terms, not the three distinct ones. Even if the data doesn't have the duplicate values, the union can introduce the duplicate results:

@prefix : <urn:ex:>

:x :similar [ :sim 0.10 ; :mult 2 ] ,
            [ :sim 0.12 ; :mult 1 ] ,
            [ :sim 0.15 ; :mult 4 ] .   
prefix : <urn:ex:>

select (sum(?sim * ?mult) as ?final) {
  { :x :similar [ :sim ?sim ; :mult ?mult ] }
  union
  { :x :similar [ :sim ?sim ; :mult ?mult ] }
}
---------
| final |
=========
| 1.84  |
---------

Since you found the need to use group_concat(distinct …), I wouldn't be surprised if there are duplicates of that nature.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

548k questions

547k answers

4 comments

86.3k users

...