Conversation
|
@kirill-sizov, @yasakova-anastasia, please test the PR. To test, do: Look at What is done:
What is not done:
|
|
@zhiltsov-max
datum create
datum add -f yolo ../yolo_dataset
datum commit -m "add yolo dataset"
datum transform -t remap_labels -- -l label_0:apple
datum filter -e '/item[subset="train"]'
# ...
# PermissionError: [Errno 13] Permission denied: <path/to/datumaro/cache>My expectations: successful filtering
datum create
datum add -f voc tests/assets/voc_dataset
datum commit -m "add dataset"
datum filter -e '/item[subset="train"]'
# ...
# AttributeError: 'DatasetItemStorageDatasetView' object has no attribute 'select'My expectations: successful filtering 3.Commit with no changes Also, I have a question: how can I undo changes that I have not yet committed (in git I use |
|
I tested this PR and got the following notes:
|
|
@kirill-sizov, @yasakova-anastasia, thanks for the review and questions. 1 and 2 I'll check, as well as performance.
|
|
TBD:
|
- Fixed pytest problem with tests
|
Blocked by in-place saving problems: #257, #260
Found a problem with Pytest and logging, which can lead to an infinite loop in testing pytest-dev/pytest#5502. Can be resolved with |
Use Scope for VCS
[VCS] Make separate migration command
| "Project migration cannot be done inplace.") | ||
|
|
||
| old_aux_dir = osp.join(src_dir, '.datumaro') | ||
| old_config = Config.parse(osp.join(old_aux_dir, 'config.yaml')) |
There was a problem hiding this comment.
I don't see the old config being deleted anywhere. Won't that cause the new project code to still complain about the old config after migration?
There was a problem hiding this comment.
I changed the meaning of the operation. Now, it creates a new project in the destination directory, the old project is unaffected.
|
|
||
| old_config_path = osp.join(found_path, 'config.yaml') | ||
| if osp.isfile(old_config_path): | ||
| if Config.parse(old_config_path).format_version != 2: |
There was a problem hiding this comment.
This check shouldn't be here. New code can't handle old project files, so if one is found, the user should be told to run migration regardless of its content (the migration code will then tell them that 2 is not a valid version in old config files).
There was a problem hiding this comment.
I told this before - the difference in the config file names is occasional. It is very simple to forget about this when changing the code.
Summary
Related: #131
Fixed: #328
Closes #402 (not applicable anymore)
General:
Glossary:
- revision - corresponds to a Git commit
- working / head / revision tree - a project build tree and plugins at a specified revision
- object - a revision or a dataset
- source / dataset / revision / project paths - a path to a dataset in a special format
- local revpath - a path to a dataset within the current project
- full revpath - an absolute path to a dataset
- data source - basically, an URL + dataset format name
- stage - a modification of a data source. A transformation, filter or something else.
- build tree - a directed graph (tree) with leaf nodes at data sources and a single root node called "project"
- build target - a data source or a stage
- pipeline - a subgraph of a build target
CLI:
source add,commit,checkout,logCLI commandsimport,project mergeCLI commandsdiffandediffare joined into a singlediffcommanddiff,merge,explainnow accept dataset revpathsAPI:
Projectis completely rewritten and has a new interface.- Added v1->v2 migration. It can be done with
datum project migrateHow to test
Check this test file for examples: tests/cli/test_project.py
Checklist
developbranchLicense
Feel free to contact the maintainers if that's a concern.