Change storage backend to xray.DataArray by kynan · Pull Request #6 · firedrakeproject/pybench

kynan · 2015-06-02T22:51:59Z

Refactor the storage of benchmark results to use xray's DataArray, an N-dimensional array with labelled coordinate axes, like an N-dimensional pandas.Series.

DataArrays can be indexed very efficiently and saved to a netCDF file.

There are a few issues with, and differences from, the previous dict-based storage:

DataArray does not support hierarchical metadata.
The coordinate axes need to be known when initialising the DataArray, i.e. the regions to time need to be declared upfront.
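
To illustrate the second point, here is a minimal sketch (not the code in this PR) of a DataArray whose axes, including the regions, are declared before any timing is stored. It assumes the `xarray` package, the later name of xray; the parameter names `degree` and `size` and the region names are made up for illustration.

```python
import numpy as np
import xarray as xr  # xray was later renamed to xarray

# Hypothetical parameters and regions. All axes are fixed at creation time.
degrees = [1, 2, 3]
sizes = [32, 64]
regions = ["assemble", "solve"]

# Pre-allocate the full array, filled with NaN until timings arrive.
timings = xr.DataArray(
    np.full((len(degrees), len(sizes), len(regions)), np.nan),
    coords={"degree": degrees, "size": sizes, "region": regions},
    dims=("degree", "size", "region"),
)

# Store one timing by label rather than by integer position.
timings.loc[dict(degree=2, size=64, region="solve")] = 0.42

# Label-based queries: a whole region, or a single value.
solve_times = timings.sel(region="solve")
value = float(timings.sel(degree=2, size=64, region="solve"))
```

A region that was not listed in `regions` cannot be assigned with `.loc` afterwards, which is the upfront-declaration constraint described above. Saving would go through `timings.to_netcdf(...)`, which needs a netCDF backend installed.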

This is a WIP, not yet ready to merge, but I'd appreciate comments.

kynan · 2015-06-02T22:54:58Z

@mlange05 Could you test this with your storage benchmarks?

Note that some changes to the benchmarks are required. In particular, you need to specify the regions beforehand.

mlange05 · 2015-07-03T09:33:50Z

I just had a quick look and the new approach seems very nice in general. The a-priori allocation of the data array is somewhat annoying though, since I like to record all PyOP2 timer data at the end of a benchmark routine. Is there any way the allocation can be deferred?

kynan · 2015-07-03T21:40:00Z

We could potentially record each region as a separate DataArray and in the end concatenate them into one.
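
A rough sketch of that idea, assuming the `xarray` package (the later name of xray). The `record` helper, the `degree` parameter and the region names are hypothetical and not part of pybench.

```python
import numpy as np
import pandas as pd
import xarray as xr

degrees = [1, 2]   # hypothetical parameter values
per_region = {}    # one DataArray per region, created on demand


def record(region, values):
    """Store the timings of one region without pre-declaring the region."""
    per_region[region] = xr.DataArray(
        np.asarray(values, dtype=float),
        coords={"degree": degrees},
        dims=("degree",),
    )


# Regions can be registered in any order, e.g. at the end of a benchmark run.
record("assemble", [0.10, 0.25])
record("solve", [0.40, 0.90])

# Concatenate along a new, labelled "region" dimension.
names = list(per_region)
combined = xr.concat(
    [per_region[name] for name in names],
    dim=pd.Index(names, name="region"),
)
```

This defers allocation until all regions are known, but every region must share the same parameter coordinates for the concatenation to line up.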

Unfortunately we still have the problem of merging data from different runs, which is still messy and I don't yet have a good idea how to solve this in a nicer way.

The main advantage of the xray storage is really that it's much easier to query, so it should be possible to refactor and simplify the plotting code.

mlange05 · 2015-08-13T09:00:11Z

OK, I think we want the results stored in an xray.Dataset with regions as keys. This ensures that the dimensions/params for each region are the same, and allows us to use the labelled indexing feature to concatenate them when combining series. I think this also maps reasonably well to the series/params distinction in pybench.

I've pushed this to another branch for inspection. Please have a go and see if this works with the current workflow used in firedrake-bench.

Uses params as the coordinate dimension names and values.

Benchmark.data is now always an xray.Dataset keyed by region. For individual runs the array dimensions are just the params; for Datasets resulting from combine_series the dimensions are series.update(params). This allows registering new timings without pre-allocating DataArrays.
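
The layout described above might look roughly like the following sketch. It assumes the `xarray` package (the later name of xray) and is not the code on the branch; `run_dataset`, the `degree` param and the `nprocs` series are hypothetical names.

```python
import numpy as np
import pandas as pd
import xarray as xr


def run_dataset(offset):
    """Build the Dataset of one run: one variable per region, dims = params."""
    ds = xr.Dataset(coords={"degree": [1, 2]})
    # New regions are registered by assignment, with no pre-allocation.
    ds["assemble"] = (("degree",), np.array([0.125, 0.25]) + offset)
    ds["solve"] = (("degree",), np.array([0.5, 1.0]) + offset)
    return ds


run_a = run_dataset(0.0)
run_b = run_dataset(1.0)

# Combining series adds the series as a new labelled dimension,
# so the dimensions become the series plus the params.
combined = xr.concat([run_a, run_b], dim=pd.Index([1, 2], name="nprocs"))

solve_on_two = float(combined["solve"].sel(nprocs=2, degree=1))
```

Because every region is a variable of the same Dataset, all regions share the `degree` coordinate, and one `concat` call extends all of them along the series dimension at once.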

kynan · 2015-11-12T01:00:25Z

@mlange05 I've updated and rebased this branch, including your change, which I think makes perfect sense. I think this is ready for prime time. What do you think?

kynan force-pushed the feature/xray branch from 3dc49a0 to acdfcf6 · June 2, 2015 22:52

kynan and others added 22 commits November 11, 2015 23:16

Move pybench to its own package

427dad5

Remove sleep unit test

cca0a68

Add optional precision argument to table method

e0ee97b

Add regions as Benchmark attribute

b24802a

Only allow timing specified regions

645d30e

xray.DataArray as data attribute of Benchmark

83fb080

Uses params as the coordinate dimension names and values.

Add utils module with value_combinations function

4dfb8ed
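
The commit itself is not shown here. As a guess at what such a helper could do, the sketch below expands a mapping of parameter names to value lists into one dict per combination, using only the standard library. The signature and return type are assumptions; the real `value_combinations` in pybench may differ.

```python
from itertools import product


def value_combinations(params):
    """Return one dict per combination of parameter values (hypothetical sketch)."""
    names = list(params)
    value_lists = (params[name] for name in names)
    # product() yields the Cartesian product, last parameter varying fastest.
    return [dict(zip(names, values)) for values in product(*value_lists)]


combos = value_combinations({"degree": [1, 2], "size": [32, 64]})
```

Each resulting dict could then serve as the label set for one benchmark run, for example as the keyword arguments to a label-based lookup.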

Store benchmark timings in DataArray

1a40f17

Check that all params are defined at benchmark level

740ae05

Use value_combinations in profile method

a649af3

Add call method for lookup in DataArray

e7626d8

Load/save from/to netCDF file

6ab8442

Combine results from multiple files along new dimension

da54f94

Combine results from multiple series of benchmarks

54c859e

Support for relabelling coordinate axes when combining series

6b29ce1

Update table output to work with DataArray

2ae828f

Remove obsolete dataframe method

8924ce6

Update (sub)plot and lookup methods to work with DataArray

cdee6de

Update Benchmark unit tests

bcc9173

Only iterate data_vars in combine_series

8203bde

Add __getitem__ method

1256633

kynan force-pushed the feature/xray branch from acdfcf6 to 1256633 · November 12, 2015 00:59

kynan wants to merge 22 commits into master from feature/xray
