Hi sosata,
I could not find a contact email address for you and did not know the best way to reach you, so I created an issue. Not a great way to do things, I guess.
Very good discussion below:
dmlc/xgboost#1746
I could not post my comments there. You are right about how softmax is used. Also, to answer why the binary case uses 1/(1 + exp(value)): in general, for classification, the probability of a specific label is 1/(1 + exp(-value)). In the binary case you therefore have two probability values, one for each label. One is p = 1/(1 + exp(-value)), and so the other is 1 - p = exp(-value)/(1 + exp(-value)). You can choose either of the two terms for your label of interest. If you take the second one and multiply its numerator and denominator by exp(value), it becomes 1/(1 + exp(value)). Maybe you have already figured this out.

Sorry, my English is poor. Greetings from India, and best wishes. Please close this issue and, if possible, add this as a comment to the thread, since I am unable to add it myself.
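
In case it helps, here is a quick numerical check of the identity 1 - p = 1/(1 + exp(value)). It is only a minimal sketch in plain Python; the function names `sigmoid` and `complement` are my own and are not taken from xgboost.

```python
import math

def sigmoid(value):
    """Probability of one label: 1 / (1 + exp(-value))."""
    return 1.0 / (1.0 + math.exp(-value))

def complement(value):
    """Probability of the other label, written as 1 / (1 + exp(value))."""
    return 1.0 / (1.0 + math.exp(value))

for value in (-3.0, -0.5, 0.0, 0.5, 3.0):
    p = sigmoid(value)
    q = complement(value)
    # 1 - p and 1 / (1 + exp(value)) should agree up to floating-point rounding
    assert math.isclose(1.0 - p, q, rel_tol=1e-12, abs_tol=1e-12)
    print(f"value={value:+.1f}  p={p:.6f}  1-p={1.0 - p:.6f}  1/(1+exp(value))={q:.6f}")
```

The two forms differ only in the sign of value, so which one appears in the code depends on which label is treated as the label of interest.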