Predicting a numerical value using a regression model

I trained a regression model on my data. The predicted results are as follows:

data  rating  predictedRating
A     30.45   29.770733
B     27.44   25.481995
C     26.84   22.389313
D     24      23.6501
E     20.65   19.64112
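
For reference, the mean absolute error on these five rows (which is what I take spa:mae to measure) can be checked by hand. A minimal Python sketch using only the values in the table; the names `rows` and `mean_absolute_error` are just for illustration:

```python
# Values copied from the table above: (actual rating, predicted rating).
rows = {
    "A": (30.45, 29.770733),
    "B": (27.44, 25.481995),
    "C": (26.84, 22.389313),
    "D": (24.0, 23.6501),
    "E": (20.65, 19.64112),
}


def mean_absolute_error(pairs):
    """Average of |actual - predicted| over all (actual, predicted) pairs."""
    pairs = list(pairs)
    return sum(abs(actual - predicted) for actual, predicted in pairs) / len(pairs)


mae = mean_absolute_error(rows.values())
print(f"MAE = {mae:.4f}")
```

On these rows the error comes out to roughly 1.69 rating points.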

The performance looks good. However, when I use new data as a test set, every predicted value is the same:

data  predictedRating
A     6.8025723
B     6.8025723
C     6.8025723
D     6.8025723
E     6.8025723

I use spa:set to aggregate my entities. May I know which part is wrong?
Thanks very much.

Can you include the commands you ran to train and produce the two tables you included?

Hi whitley,
My architecture is as follows:

:data :hasRating :rating;
         :has :item_1.
:item_1 :has :item_2.

My command for training is as follows:

prefix spa: <tag:stardog:api:analytics:>

    INSERT {
        graph spa:model {
            :r1 a spa:RegressionModel ;
                spa:arguments (?new_item_1 ?new_item_2) ;
                spa:predict ?rating ;
                spa:crossValidation 100 ;
                spa:evaluation true ;
                spa:evaluationMetric spa:mae ;
                spa:overwrite true .
        }
    }
    WHERE {
        SELECT (spa:set(?item_1) as ?new_item_1)
               (spa:set(?item_2) as ?new_item_2)
               ?rating
        WHERE {
            ?data :hasRating ?rating ;
                  :has ?item_1 .
            ?item_1 :has ?item_2 .
        }
        GROUP BY ?data ?rating
    }

I use the spa:set function to aggregate item_1 and item_2.
Sorry, I still don't understand what the spa:set function does after reading the manual. Could you please explain this function?

Thanks for your help.

There are many things to go over here, probably too many for a single reply. First, I'd suggest going back to the Machine Learning section of the Stardog documentation and reading it carefully. It's not too long.

I can see how the machine learning implementation can be a bit confusing. I think it might help to think of it less as a data model where triples are added to or removed from a graph and more as arguments to a stateful model. I'm not even sure what you'd get if you issued something like the following query. I might have to give it a try and see.

select * where {
    graph spa:model {
      ?s ?p ?o .
    }
}

I would also go back and reexamine the data you're modeling. I'm not quite sure what your data represents or what you're trying to predict.
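
On the spa:set question: as I read the documentation, it is an aggregate that collects every value bound to a variable within each GROUP BY group into one multi-valued argument, so each ?data contributes a single row instead of one row per item. The Python sketch below is only an analogy for that grouping step, not Stardog code; the bindings in `rows` and the helper `aggregate_sets` are made up for illustration:

```python
from collections import defaultdict

# Hypothetical (data, item) bindings, one per match of the WHERE pattern.
rows = [("A", "x"), ("A", "y"), ("B", "x"), ("B", "x")]


def aggregate_sets(bindings):
    """Group bindings by their first element and collect the second into a set."""
    grouped = defaultdict(set)
    for key, value in bindings:
        grouped[key].add(value)
    return dict(grouped)


features = aggregate_sets(rows)
print(features)
```

If that reading is right, it may be worth checking that the new test entities actually have item values to aggregate, since a group with nothing in it gives the model nothing to distinguish one entity from another.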