Check duplicate issues.
Description
this commit
removed the line
if (!hdata->valid()) continue;
in roofit/roofitcore/src/RooChi2Var.cxx which is supposed to skip bins outside the specified range when calling e.g.
pdf->createChi2(data_hist, Range("range_name"));
pdf->getVal()
As a result, the reported chi2 value is the same, no matter what range is provided.
Adding a printout like
std::cout << i << " " << nPdf << " " << nData << std::endl;
a few lines below where the valid line should be, indicated that all bins were considered for the chi2 calculation and thus helped to find the cause of the issue.
Since the valid() function was removed from RooDataHist in the same commit one cannot simply add the line back in. Instead createChi2(Range()) should probably make use of he RooAbsData::reduce() implementation as mentioned in the message of that commit:
"In a subrange fit, the RooAbsOptTestStatistic is creating a clone of the dataset with the subrange only using RooAbsData::reduce(). So all entries are valid by definition."
Apparently this was implemented for the other test statistics (e.g. createNLL) but not createChi2.
Using createNLL(data_hist, Range("range_name") appears to take the ranges into account correctly and there is an overload of createNLL taking RooAbsData (allowing for the intended RooAbsData::reduce() method) rather than taking RooDataHist. No such overload exists for createChi2().
tested on main (@4f17792277dd51dcf238c8a05f8d3f093de4ddcd) tag:v6-36-00
Reproducer
Simply run this macro. It does not require any input.
test.cpp:
{
using namespace RooFit;
RooRealVar x("x", "x", 0., 1.);
x.setRange("lo", 0., .5);
x.setRange("hi", .5, 1.);
x.setRange("full", 0., 1.);
RooUniform uniform("uniform", "", x);
RooRealVar nbkg("nbkg", "", 11.);
RooExtendPdf pdf("pdf", "", uniform, nbkg);
TH1D h("h", "h", 10, 0., 1.);
for (int i = 1; i <= 10; ++i)
h.SetBinContent(i, 1.);
// histogram integral: 10. (slightly off from nbkg = 11. so that chi2 is not 0.)
RooDataHist dh("dh", "dh", x, &h);
for (string range : {"", "full", "lo", "hi", "lo,hi"}) {
auto chi2 = unique_ptr<RooAbsReal> (
range == ""
? pdf.createChi2 (
dh,
Extended(true),
DataError(RooAbsData::Poisson)
)
: pdf.createChi2 (
dh,
Extended(true),
DataError(RooAbsData::Poisson),
Range(range.c_str())
)
);
cout << "chi2(" << range << ") = " << chi2->getVal() << endl;
}
}
A uniform pdf on the range 0. to 1. with integral 11. is created and compared with a uniform histogram with 10 bins, also on the range 0. to 1. but with integral 11.
chi2 is calculated using Poisson errors first without specifying a range
and then with the ranges
full (0. to 1.)
lo (0. to .5)
hi (.5 to 1.)
and the union lo,hi (0. to 1.)
First, here is the result with an older root version where the valid() line is still there and where the chi2 calculation appears to be working correctly:
output with root-v6-26-08:
> chi2() = 0.0189114
> chi2(full) = 0.0189114
> chi2(lo) = 1.36162
> chi2(hi) = 1.36162
> chi2(lo,hi) = 2.72324
Don't mind the normalisation difference between chi2(lo,hi) and chi2(full) which is not the focus of this issue.
It is understood to be related to the ndof (factor 144 = 12^2; 12 bins including under/overflow).
Now with the current main:
output with root-v6-36-04:
chi2() = 0.0189114
chi2(full) = 0.0189114
chi2(lo) = 0.0189114
chi2(hi) = 0.0189114
chi2(lo,hi) = 0.0189114
all results are as if no range is provided at all.
When exchanging createChi2 with e.g. createNLL in the example above, one gets different values for the different ranges, as intended.
ROOT version
ROOT v6.36.04
Built for linuxx8664gcc on Aug 25 2025, 09:43:06
From tags/v6-36-04@v6-36-04
With c++ (Ubuntu 11.4.0-1ubuntu1~22.04.2) 11.4.0
Binary directory: ~/install/root/bin
latest commit: 4f17792
Installation method
build from source
Operating system
Linux
Additional context
No response
Check duplicate issues.
Description
this commit
removed the line
in
roofit/roofitcore/src/RooChi2Var.cxxwhich is supposed to skip bins outside the specified range when calling e.g.As a result, the reported chi2 value is the same, no matter what range is provided.
Adding a printout like
a few lines below where the
validline should be, indicated that all bins were considered for the chi2 calculation and thus helped to find the cause of the issue.Since the
valid()function was removed fromRooDataHistin the same commit one cannot simply add the line back in. Instead createChi2(Range()) should probably make use of he RooAbsData::reduce() implementation as mentioned in the message of that commit:"In a subrange fit, the
RooAbsOptTestStatisticis creating a clone of the dataset with the subrange only usingRooAbsData::reduce(). So all entries are valid by definition."Apparently this was implemented for the other test statistics (e.g.
createNLL) but notcreateChi2.Using
createNLL(data_hist, Range("range_name")appears to take the ranges into account correctly and there is an overload ofcreateNLLtaking RooAbsData (allowing for the intended RooAbsData::reduce() method) rather than taking RooDataHist. No such overload exists forcreateChi2().tested on main (@4f17792277dd51dcf238c8a05f8d3f093de4ddcd) tag:v6-36-00
Reproducer
Simply run this macro. It does not require any input.
test.cpp:
A uniform pdf on the range 0. to 1. with integral 11. is created and compared with a uniform histogram with 10 bins, also on the range 0. to 1. but with integral 11.
chi2 is calculated using Poisson errors first without specifying a range
and then with the ranges
full (0. to 1.)
lo (0. to .5)
hi (.5 to 1.)
and the union lo,hi (0. to 1.)
First, here is the result with an older root version where the
valid()line is still there and where the chi2 calculation appears to be working correctly:output with root-v6-26-08:
Don't mind the normalisation difference between chi2(lo,hi) and chi2(full) which is not the focus of this issue.
It is understood to be related to the ndof (factor 144 = 12^2; 12 bins including under/overflow).
Now with the current main:
output with root-v6-36-04:
all results are as if no range is provided at all.
When exchanging
createChi2with e.g.createNLLin the example above, one gets different values for the different ranges, as intended.ROOT version
latest commit: 4f17792
Installation method
build from source
Operating system
Linux
Additional context
No response