Commit e3e5c0f

Maxim Zhiltsov and Roman Donchenko authored
Add basic versioning (open-edge-platform#238)
* Add project versioning capabilities
* Update changelog

Co-authored-by: Roman Donchenko <roman.donchenko@intel.com>
1 parent 9dae78f commit e3e5c0f

66 files changed: +6127 -2383 lines changed

.github/workflows/health_check.yml

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ jobs:
     - name: Installing dependencies
       run: |
         pip install tensorflow pytest pytest-cov
-        pip install -e ./
+        pip install -e .[default]
     - name: Code instrumentation
       run: |
         pytest -v --cov --cov-report xml:coverage.xml

.github/workflows/pr_checks.yml

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ jobs:
     - name: Installing dependencies
       run: |
         pip install tensorflow pytest
-        pip install -e ./
+        pip install -e .[default]
     - name: Unit testing
       run: |
         pytest -v

CHANGELOG.md

Lines changed: 38 additions & 3 deletions
@@ -8,20 +8,55 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## \[Unreleased\]
 ### Added
-- TBD
+- A new installation target: `pip install datumaro[default]`, which should
+  be used by default. The plain `datumaro` target is intended for library users.
+  (<https://github.com/openvinotoolkit/datumaro/pull/238>)
+- Dataset and project versioning capabilities (Git-like)
+  (<https://github.com/openvinotoolkit/datumaro/pull/238>)
+- "dataset revpath" concept in CLI, which allows passing a dataset path with
+  the dataset format in `diff`, `merge`, `explain` and `info` CLI commands
+  (<https://github.com/openvinotoolkit/datumaro/pull/238>)
+- `add`, `remove`, `commit`, `checkout`, `log`, `status`, `info` CLI commands
+  (<https://github.com/openvinotoolkit/datumaro/pull/238>)
 
 ### Changed
+- A project can contain and manage multiple datasets instead of a single one.
+  CLI operations can be applied to the whole project, or to separate datasets.
+  Datasets are modified in place by default
+  (<https://github.com/openvinotoolkit/datumaro/issues/328>)
+- CLI help for builtin plugins doesn't require a project
+  (<https://github.com/openvinotoolkit/datumaro/issues/328>)
 - Annotation-related classes were moved into a new module,
   `datumaro.components.annotation`
   (<https://github.com/openvinotoolkit/datumaro/pull/439>)
 - Rollback utilities replaced with Scope utilities
   (<https://github.com/openvinotoolkit/datumaro/pull/444>)
+- The `Project` class from `datumaro.components` is changed completely
+  (<https://github.com/openvinotoolkit/datumaro/pull/238>)
+- `diff` and `ediff` are joined into a single `diff` CLI command
+  (<https://github.com/openvinotoolkit/datumaro/pull/238>)
+- Projects use a new file layout, incompatible with old projects.
+  An old project can be updated with `datum project migrate`
+  (<https://github.com/openvinotoolkit/datumaro/pull/238>)
+- Inheriting `CliPlugin` is not required in plugin classes
+  (<https://github.com/openvinotoolkit/datumaro/pull/238>)
+- `Importer`s do not create `Project`s anymore and just return a list of
+  extractor configurations
+  (<https://github.com/openvinotoolkit/datumaro/pull/238>)
 
 ### Deprecated
 - TBD
 
 ### Removed
-- TBD
+- `import`, `project merge` CLI commands
+  (<https://github.com/openvinotoolkit/datumaro/pull/238>)
+- Support for project hierarchies. A project cannot be a source anymore
+  (<https://github.com/openvinotoolkit/datumaro/pull/238>)
+- A project cannot have an independent internal dataset anymore. All the project
+  data must be stored in the project data sources
+  (<https://github.com/openvinotoolkit/datumaro/pull/238>)
+- `datumaro_project` format
+  (<https://github.com/openvinotoolkit/datumaro/pull/238>)
 
 ### Fixed
 - Deprecation warning in `open_images_format.py`
@@ -71,7 +106,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Calling `ProjectDataset.transform()` with a string argument (<https://github.com/openvinotoolkit/datumaro/issues/402>)
 - Attributes casting for CVAT format (<https://github.com/openvinotoolkit/datumaro/pull/403>)
 - Loading of custom project plugins (<https://github.com/openvinotoolkit/datumaro/issues/404>)
-- Reading, writing anno file and saving name of the subset for test subset
+- Reading, writing anno file and saving name of the subset for test subset
   (<https://github.com/openvinotoolkit/datumaro/pull/447>)
 
 ### Security

MANIFEST.in

Lines changed: 2 additions & 1 deletion
@@ -1,2 +1,3 @@
 include README.md
-include requirements-core.txt
+include requirements-core.txt
+include requirements-default.txt
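
The workflows now install `.[default]`, and `MANIFEST.in` ships a second requirements file, which suggests the `default` extra is assembled from `requirements-default.txt`. The commit's `setup.py` is not shown in this excerpt, so the following is only a minimal sketch of how such an extra could be wired up. The helper `parse_requirements`, the file contents, and the package names are all hypothetical.

```python
def parse_requirements(text):
    """Return requirement specifiers found in requirements-file text.

    Hypothetical helper: drops comments, blank lines and option lines
    such as '-r other-file.txt'.
    """
    requirements = []
    for line in text.splitlines():
        line = line.split('#', 1)[0].strip()
        if not line or line.startswith('-'):
            continue
        requirements.append(line)
    return requirements

# Illustrative file contents, not the project's real dependency lists
core_text = "numpy>=1.17\nPyYAML\n"
default_text = "-r requirements-core.txt\nopencv-python-headless  # image IO\n"

# Keyword arguments that could be passed to setuptools.setup()
setup_kwargs = {
    'install_requires': parse_requirements(core_text),
    'extras_require': {'default': parse_requirements(default_text)},
}
print(setup_kwargs)
```

With a layout like this, a plain install pulls in only the core list, while `pip install datumaro[default]` adds the packages listed under the `default` key.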

datumaro/cli/__main__.py

Lines changed: 26 additions & 12 deletions
@@ -9,7 +9,8 @@
 
 from ..version import VERSION
 from . import commands, contexts
-from .util import CliException, add_subparser
+from .util import add_subparser
+from .util.errors import CliException
 
 _log_levels = {
     'debug': log.DEBUG,
@@ -59,22 +60,31 @@ def make_parser():
     _LogManager._define_loglevel_option(parser)
 
     known_contexts = [
-        ('project', contexts.project, "Actions with project (deprecated)"),
+        ('project', contexts.project, "Actions with project"),
         ('source', contexts.source, "Actions with data sources"),
         ('model', contexts.model, "Actions with models"),
     ]
     known_commands = [
-        ('create', commands.create, "Create project"),
-        ('import', commands.import_, "Create project from existing dataset"),
+        ("Project modification:", None, ''),
+        ('create', commands.create, "Create empty project"),
         ('add', commands.add, "Add data source to project"),
         ('remove', commands.remove, "Remove data source from project"),
+
+        ("", None, ''),
+        ("Project versioning:", None, ''),
+        ('checkout', commands.checkout, "Switch to another branch or revision"),
+        ('commit', commands.commit, "Commit changes in tracked files"),
+        ('log', commands.log, "List history"),
+        ('status', commands.status, "Display current status"),
+
+        ("", None, ''),
+        ("Dataset and project operations:", None, ''),
         ('export', commands.export, "Export project in some format"),
-        ('filter', commands.filter, "Filter project"),
-        ('transform', commands.transform, "Transform project"),
-        ('merge', commands.merge, "Merge projects"),
-        ('convert', commands.convert, "Convert dataset into another format"),
-        ('diff', commands.diff, "Compare projects with intersection"),
-        ('ediff', commands.ediff, "Compare projects for equality"),
+        ('filter', commands.filter, "Filter project items"),
+        ('transform', commands.transform, "Modify project items"),
+        ('merge', commands.merge, "Merge datasets"),
+        ('convert', commands.convert, "Convert dataset between formats"),
+        ('diff', commands.diff, "Compare datasets"),
         ('stats', commands.stats, "Compute project statistics"),
         ('info', commands.info, "Print project info"),
         ('explain', commands.explain, "Run Explainable AI algorithm for model"),
@@ -105,7 +115,8 @@ def make_parser():
     subcommands = parser.add_subparsers(title=subcommands_desc,
         description="", help=argparse.SUPPRESS)
     for command_name, command, _ in known_contexts + known_commands:
-        add_subparser(subcommands, command_name, command.build_parser)
+        if command is not None:
+            add_subparser(subcommands, command_name, command.build_parser)
 
     return parser
 
@@ -121,7 +132,10 @@ def main(args=None):
         return 1
 
     try:
-        return args.command(args)
+        retcode = args.command(args)
+        if retcode is None:
+            retcode = 0
+        return retcode
     except CliException as e:
         log.error(e)
         return 1
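
Two changes in this file work together: rows of `known_commands` whose second element is `None` act as headings for the help listing and are skipped when subparsers are registered, and a handler that returns `None` is treated as success. A minimal, self-contained sketch of both ideas follows. The `demo` program, the `hello` and `fail` commands, and their builders are hypothetical, and subparsers are registered inline rather than through the project's `add_subparser` helper.

```python
import argparse

def build_hello_parser(parser_ctor=argparse.ArgumentParser):
    # Hypothetical command whose handler returns None (no explicit 'return 0')
    parser = parser_ctor(help="Say hello")
    parser.set_defaults(command=lambda args: None)
    return parser

def build_fail_parser(parser_ctor=argparse.ArgumentParser):
    # Hypothetical command whose handler reports a non-zero exit code
    parser = parser_ctor(help="Always fail")
    parser.set_defaults(command=lambda args: 3)
    return parser

# Rows with a None builder are headings or spacers for the help text only
known_commands = [
    ("Demo commands:", None, ''),
    ('hello', build_hello_parser, "Say hello"),
    ("", None, ''),
    ('fail', build_fail_parser, "Always fail"),
]

def make_parser():
    parser = argparse.ArgumentParser(prog='demo')
    subcommands = parser.add_subparsers()
    for name, builder, _ in known_commands:
        if builder is not None:  # skip heading and spacer rows
            builder(lambda **kwargs: subcommands.add_parser(name, **kwargs))
    return parser

def main(argv):
    args = make_parser().parse_args(argv)
    retcode = args.command(args)
    if retcode is None:  # a handler without a return value means success
        retcode = 0
    return retcode
```

Here `main(['hello'])` returns 0 even though the handler returns nothing, while `main(['fail'])` passes the handler's code 3 through unchanged.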

datumaro/cli/commands/__init__.py

Lines changed: 2 additions & 2 deletions
@@ -5,6 +5,6 @@
 # pylint: disable=redefined-builtin
 
 from . import (
-    add, convert, create, diff, ediff, explain, export, filter, import_, info,
-    merge, remove, stats, transform, validate,
+    add, checkout, commit, convert, create, diff, explain, export, filter, info,
+    log, merge, remove, stats, status, transform, validate,
 )

datumaro/cli/commands/checkout.py

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
+# Copyright (C) 2021 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+import argparse
+
+from datumaro.util.scope import scope_add, scoped
+
+from ..util import MultilineFormatter
+from ..util.project import load_project
+
+
+def build_parser(parser_ctor=argparse.ArgumentParser):
+    parser = parser_ctor(help="Navigate to a revision",
+        description="""
+        Command forms:|n
+        1) %(prog)s <revision>|n
+        2) %(prog)s [--] <source1> ...|n
+        3) %(prog)s <revision> [--] <source1> <source2> ...|n
+        |n
+        1 - Restores a revision and all the sources in the working directory.|n
+        2, 3 - Restores only specified sources from the specified revision.|n
+        |s|sThe current revision is used, when not set.|n
+        |s|s"--" is optionally used to separate source names and revisions.|n
+        |n
+        Examples:|n
+        - Restore the previous revision:|n
+        |s|s%(prog)s HEAD~1 |n |n
+        - Restore the saved version of a source in the working tree|n
+        |s|s%(prog)s -- source-1 |n |n
+        - Restore a previous version of a source|n
+        |s|s%(prog)s 33fbfbe my-source
+        """, formatter_class=MultilineFormatter)
+
+    parser.add_argument('_positionals', nargs=argparse.REMAINDER,
+        help=argparse.SUPPRESS) # workaround for -- eaten by positionals
+    parser.add_argument('rev', nargs='?',
+        help="Commit or tag (default: current)")
+    parser.add_argument('sources', nargs='*',
+        help="Sources to restore (default: all)")
+    parser.add_argument('-f', '--force', action='store_true',
+        help="Allows to overwrite unsaved changes in case of conflicts "
+            "(default: %(default)s)")
+    parser.add_argument('-p', '--project', dest='project_dir', default='.',
+        help="Directory of the project to operate on (default: current dir)")
+    parser.set_defaults(command=checkout_command)
+
+    return parser
+
+@scoped
+def checkout_command(args):
+    has_sep = '--' in args._positionals
+    if has_sep:
+        pos = args._positionals.index('--')
+        if 1 < pos:
+            raise argparse.ArgumentError(None,
+                message="Expected no more than 1 revision argument")
+    else:
+        pos = 1
+    args.rev = (args._positionals[:pos] or [None])[0]
+    args.sources = args._positionals[pos + has_sep:]
+    if has_sep and not args.sources:
+        raise argparse.ArgumentError('sources', message="When '--' is used, "
+            "at least 1 source name must be specified")
+
+    project = scope_add(load_project(args.project_dir))
+
+    project.checkout(rev=args.rev, sources=args.sources, force=args.force)
+
+    return 0
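
The positional handling in `checkout_command` is easier to follow in isolation. The sketch below repeats the same splitting rules on a plain list of strings. The function name `split_checkout_args` is hypothetical, and `ValueError` stands in for the CLI's own error type.

```python
def split_checkout_args(positionals):
    """Split raw positionals into (rev, sources) using the rules above."""
    has_sep = '--' in positionals
    if has_sep:
        pos = positionals.index('--')
        if 1 < pos:
            raise ValueError("Expected no more than 1 revision argument")
    else:
        # Without a separator, the first positional is taken as the revision
        pos = 1
    rev = (positionals[:pos] or [None])[0]
    # has_sep is a bool, so adding it skips the '--' token when present
    sources = positionals[pos + has_sep:]
    if has_sep and not sources:
        raise ValueError(
            "When '--' is used, at least 1 source name must be specified")
    return rev, sources

print(split_checkout_args(['HEAD~1']))                # ('HEAD~1', [])
print(split_checkout_args(['--', 'source-1']))        # (None, ['source-1'])
print(split_checkout_args(['33fbfbe', 'my-source']))  # ('33fbfbe', ['my-source'])
```

One consequence of these rules is that a single name given without `--` is always read as a revision, which is why the second example in the help text uses the separator.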

datumaro/cli/commands/commit.py

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+# Copyright (C) 2020-2021 Intel Corporation
+#
+# SPDX-License-Identifier: MIT
+
+import argparse
+
+from datumaro.util.scope import scope_add, scoped
+
+from ..util import MultilineFormatter
+from ..util.project import load_project
+
+
+def build_parser(parser_ctor=argparse.ArgumentParser):
+    parser = parser_ctor(help="Create a revision",
+        description="""
+        Creates a new revision from the current state of the working directory.|n
+        |n
+        Examples:|n
+        - Create a revision:|n
+        |s|s%(prog)s -m "Added data"
+        """, formatter_class=MultilineFormatter)
+
+    parser.add_argument('-m', '--message', required=True, help="Commit message")
+    parser.add_argument('--allow-empty', action='store_true',
+        help="Allow commits with no changes (default: %(default)s)")
+    parser.add_argument('--allow-foreign', action='store_true',
+        help="Allow commits with non-Datumaro changes (default: %(default)s)")
+    parser.add_argument('--no-cache', action='store_true',
+        help="Don't put committed datasets into cache, "
+            "save only metainfo (default: %(default)s)")
+    parser.add_argument('-p', '--project', dest='project_dir', default='.',
+        help="Directory of the project to operate on (default: current dir)")
+    parser.set_defaults(command=commit_command)
+
+    return parser
+
+@scoped
+def commit_command(args):
+    project = scope_add(load_project(args.project_dir))
+
+    old_tree = project.head
+
+    new_commit = project.commit(args.message, allow_empty=args.allow_empty,
+        allow_foreign=args.allow_foreign, no_cache=args.no_cache)
+
+    new_tree = project.working_tree
+    diff = project.diff(old_tree, new_tree)
+
+    print("Moved to commit '%s' %s" % (new_commit, args.message))
+    print(" %s targets changed" % len(diff))
+    for t, s in diff.items():
+        print(" %s %s" % (s.name, t))
+
+    return 0

datumaro/cli/commands/convert.py

Lines changed: 30 additions & 21 deletions
@@ -12,34 +12,41 @@
 from datumaro.util.os_util import make_file_name
 
 from ..contexts.project import FilterModes
-from ..util import CliException, MultilineFormatter
+from ..util import MultilineFormatter
+from ..util.errors import CliException
 from ..util.project import generate_next_file_name
 
 
 def build_parser(parser_ctor=argparse.ArgumentParser):
-    builtin_importers = sorted(Environment().importers.items)
-    builtin_converters = sorted(Environment().converters.items)
+    builtin_readers = sorted(
+        set(Environment().importers) | set(Environment().extractors))
+    builtin_writers = sorted(Environment().converters)
 
     parser = parser_ctor(help="Convert an existing dataset to another format",
         description="""
-            Converts a dataset from one format to another.
-            You can add your own formats using a project.|n
-            |n
-            Supported input formats: %s|n
-            |n
-            Supported output formats: %s|n
-            |n
-            Examples:|n
-            - Export a dataset as a PASCAL VOC dataset, include images:|n
-            |s|sconvert -i src/path -f voc -- --save-images|n
-            |n
-            - Export a dataset as a COCO dataset to a specific directory:|n
-            |s|sconvert -i src/path -f coco -o path/I/like/
-        """ % (', '.join(builtin_importers), ', '.join(builtin_converters)),
+        Converts a dataset from one format to another.
+        You can add your own formats and do many more by creating a
+        Datumaro project.|n
+        |n
+        This command serves as an alias for the "create", "add", and
+        "export" commands, allowing to obtain the same results simpler
+        and faster. Check descriptions of these commands for more info.|n
+        |n
+        Supported input formats: {}|n
+        |n
+        Supported output formats: {}|n
+        |n
+        Examples:|n
+        - Export a dataset as a PASCAL VOC dataset, include images:|n
+        |s|s%(prog)s -i src/path -f voc -- --save-images|n
+        |n
+        - Export a dataset as a COCO dataset to a specific directory:|n
+        |s|s%(prog)s -i src/path -f coco -o path/I/like/
+        """.format(', '.join(builtin_readers), ', '.join(builtin_writers)),
         formatter_class=MultilineFormatter)
 
     parser.add_argument('-i', '--input-path', default='.', dest='source',
-        help="Path to look for a dataset")
+        help="Input dataset path (default: current dir)")
     parser.add_argument('-if', '--input-format',
         help="Input dataset format. Will try to detect, if not specified.")
     parser.add_argument('-f', '--output-format', required=True,
@@ -49,13 +56,15 @@ def build_parser(parser_ctor=argparse.ArgumentParser):
     parser.add_argument('--overwrite', action='store_true',
         help="Overwrite existing files in the save directory")
     parser.add_argument('-e', '--filter',
-        help="Filter expression for dataset items")
+        help="XML XPath filter expression for dataset items. Read \"filter\" "
+            "command docs for more info")
     parser.add_argument('--filter-mode', default=FilterModes.i.name,
         type=FilterModes.parse,
-        help="Filter mode (options: %s; default: %s)" % \
+        help="Filter mode, one of %s (default: %s)" % \
             (', '.join(FilterModes.list_options()) , '%(default)s'))
     parser.add_argument('extra_args', nargs=argparse.REMAINDER,
-        help="Additional arguments for output format (pass '-- -h' for help). "
+        help="Additional arguments for output format (pass '-- -h' for help)")
+            "Must be specified after the main command arguments")
     parser.set_defaults(command=convert_command)
 
     return parser
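
The description string switches from `%` interpolation to `str.format`, and the examples switch from a hard-coded `convert` to `%(prog)s`. A likely reason is that `str.format` only touches `{}` fields, so the `%(prog)s` placeholder survives for argparse to expand later, whereas `%` interpolation with a tuple cannot be mixed with a named `%(prog)s` field. The sketch below illustrates this with the standard `RawDescriptionHelpFormatter` instead of the project's `MultilineFormatter`; the format names and the `datum convert` program name are placeholders.

```python
import argparse

readers = ['coco', 'voc']   # illustrative format names
writers = ['coco', 'yolo']

# str.format fills the {} fields and leaves %(prog)s untouched
description = """
Supported input formats: {}
Supported output formats: {}
Example: %(prog)s -i src/path -f voc
""".format(', '.join(readers), ', '.join(writers))

parser = argparse.ArgumentParser(prog='datum convert',
    description=description,
    formatter_class=argparse.RawDescriptionHelpFormatter)

# argparse expands %(prog)s when the help text is rendered
help_text = parser.format_help()
print(help_text)
```

The rendered help contains both the joined format lists and the expanded program name, so the examples stay correct regardless of how the command is invoked.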
