Merged
2 changes: 1 addition & 1 deletion .github/workflows/issue-infra-project.yml
@@ -1,4 +1,4 @@
name: "add to `apps` GH project when a new issue is submitted"
name: "add to `infra` GH project when a new issue is submitted"

on:
issues:
4 changes: 0 additions & 4 deletions README.md
@@ -1,7 +1,3 @@
# MMIF for python

**NOTE** that this project is in pre-alpha and being actively developed. Nothing is guaranteed to reliably work for the moment and developer need to be very careful when using APIs implemented here. Please use [the issue track](https://github.com/clamsproject/mmif/issues) to report bugs and malfunctions.

## MultiMedia Interchange Format
[MMIF](https://mmif.clams.ai) is a JSON(-LD)-based data format designed for transferring annotation data between computational analysis applications in [CLAMS project](https://clams.ai).

2 changes: 1 addition & 1 deletion documentation/conf.py
@@ -95,7 +95,7 @@ def linkcode_resolve(domain, info):
# Determines whether remote or local git branches/tags are preferred if their output dirs conflict
smv_prefer_remote_refs = True

# TODO (krim @ 6/13/21): maybe there's a way to re-wrote what I wrote in the
# TODO (krim @ 6/13/21): maybe there's a way to re-write what I wrote in the
# fork of sphinx-multiversion here in conf.py. Issues I can think of as of now;
# 1. sphinx-mv/main.py know current version of the library by git tag,
# but conf.py has no way to know that...
1 change: 0 additions & 1 deletion documentation/index.rst
@@ -9,7 +9,6 @@ Welcome to mmif-python's documentation!

introduction
target-versions
consumer-tutorial

.. toctree::
:maxdepth: 2
29 changes: 20 additions & 9 deletions documentation/introduction.rst
@@ -7,14 +7,14 @@ Getting Started
Overview
---------

MultiMedia Interchange Format (MMIF) is a JSON(-LD)-based data format designed for transparency and interoperability for customized computational analysis application workflows.
This documentation focuses on Python implementation of the MMIF. To learn more about the data format specification, please visit the `MMIF wbesite <https://mmif.clams.ai>`_.
``mmif-python`` is a public, open source implementation of the MMIF data format. ``mmif-python`` supports de-/serialization of MMIF objects, as well as many navigation and manipulation helpers for MMIF objects.
MultiMedia Interchange Format (MMIF) is a JSON(-LD)-based data format designed for reproducibility, transparency and interoperability for customized computational analysis application workflows.
This documentation focuses on the Python implementation of MMIF. To learn more about the data format specification, please visit the `MMIF website <https://mmif.clams.ai>`_.
``mmif-python`` is a public, open source implementation of the MMIF data format. ``mmif-python`` supports serialization/deserialization of MMIF objects from/to Python objects, as well as many navigation and manipulation helpers for MMIF objects.

Prerequisites
-------------

* `Python <https://www.python.org>`_: ``mmif-python`` requires Python 3.6 or newer. We have no plan to support `Python 2.7 <https://pythonclock.org/>`_.
* `Python <https://www.python.org>`_: the latest ``mmif-python`` requires Python 3.8 or newer. We have no plan to support `Python 2.7 <https://pythonclock.org/>`_.

Installation
---------------
@@ -43,19 +43,19 @@ MMIF Serialization

mmif_str = """{
"metadata": {
"mmif": "http://mmif.clams.ai/0.2.0"
"mmif": "http://mmif.clams.ai/1.0.0"
},
"documents": [
{
"@type": "http://mmif.clams.ai/0.2.0/vocabulary/VideoDocument",
"@type": "http://mmif.clams.ai/vocabulary/VideoDocument/v1",
"properties": {
"id": "m1",
"mime": "video/mp4",
"location": "file:///var/archive/video-0012.mp4"
}
},
{
"@type": "http://mmif.clams.ai/0.2.0/vocabulary/TextDocument",
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"id": "m2",
"mime": "text/plain",
@@ -72,7 +72,18 @@ Few notes;
#. MMIF does not carry the primary source files in it.
#. MMIF encodes the specification version at the top. As not all MMIF versions are backward-compatible, a given version of ``mmif-python`` might not be able to load a MMIF string written in a specification version it does not support.

When serializing back to :class:`str`, call ``.serialize()`` (:meth:`mmif.serialize.model.MmifObject.serialize`) on the object.
When serializing back to :class:`str`, call :meth:`mmif.serialize.model.MmifObject.serialize` on the object.

To get subcomponents, you can use various getters implemented in subclasses. For more details, the API documentation (:ref:`apidoc`) will help.
To get subcomponents, you can use various getters implemented in subclasses. For example:

.. code-block:: python

    from mmif.vocabulary.document_types import DocumentTypes

    for video in mmif_obj.get_documents_by_type(DocumentTypes.VideoDocument):
        # 'rb' opens the file for binary reading; a bare 'b' is not a valid mode
        with open(video.location_path(), 'rb') as in_video:
            pass  # do something with the video file


For a full list of available helper methods, please refer to :ref:`the API documentation <apidoc>`.
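
As a rough illustration of what the notes and the getter example above rely on, the sketch below inspects the same MMIF structure with only the standard `json` module. It does not use `mmif-python`; `spec_version` and `documents_of_type` are hypothetical helpers written for this example, and the MMIF string is trimmed to the fields they read.

```python
import json

# Trimmed version of the example MMIF string (only the fields used below).
mmif_str = """{
  "metadata": {"mmif": "http://mmif.clams.ai/1.0.0"},
  "documents": [
    {"@type": "http://mmif.clams.ai/vocabulary/VideoDocument/v1",
     "properties": {"id": "m1", "mime": "video/mp4",
                    "location": "file:///var/archive/video-0012.mp4"}},
    {"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
     "properties": {"id": "m2", "mime": "text/plain"}}
  ]
}"""


def spec_version(mmif: dict) -> str:
    """Hypothetical helper: last path segment of the ``metadata.mmif`` URL."""
    return mmif["metadata"]["mmif"].rstrip("/").rsplit("/", 1)[-1]


def documents_of_type(mmif: dict, type_name: str) -> list:
    """Hypothetical helper: documents whose ``@type`` URL has ``type_name`` as a path segment."""
    return [d for d in mmif["documents"] if type_name in d["@type"].split("/")]


mmif = json.loads(mmif_str)
version = spec_version(mmif)
video_ids = [d["properties"]["id"] for d in documents_of_type(mmif, "VideoDocument")]
print(version, video_ids)
```

Only the location URL is stored for each document, which matches the note that MMIF does not carry the primary source files themselves.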

3 changes: 3 additions & 0 deletions mmif/serialize/annotation.py
@@ -113,7 +113,10 @@ class Document(Annotation):
:param document_obj: the JSON data that defines the document
"""
def __init__(self, doc_obj: Optional[Union[bytes, str, dict]] = None) -> None:
# to store the parent view ID
self._parent_view_id = ''
self.reserved_names.add('_parent_view_id')

self._type: Union[ThingTypesBase, DocumentTypesBase] = ThingTypesBase('')
self.properties: DocumentProperties = DocumentProperties()
self.disallow_additional_properties()
22 changes: 14 additions & 8 deletions mmif/serialize/model.py
@@ -37,7 +37,7 @@ class MmifObject(object):
an actual representation with a JSON formatted string or equivalent
`dict` object argument.

This superclass has three specially designed instance variables, and these
This superclass has four specially designed instance variables, and these
variable names cannot be used as attribute names for MMIF objects.

1. _unnamed_attributes:
@@ -54,12 +54,15 @@ class MmifObject(object):
When serializing, an object will skip its *empty* (e.g. zero-length, or None)
attributes unless they are in this list. Otherwise, the serialized JSON
string would have empty representations (e.g. ``""``, ``[]``).
4. _exclude_from_diff:
This is a simple list of names of attributes that should be excluded from
the diff calculation in ``__eq__``.

# TODO (krim @ 8/17/20): this dict is, however, a duplicate of the type hints in the class definition.
Maybe there is a better way to utilize type hints (e.g. getting them programmatically), but for now
developers should be careful to add types to hints as well as to this dict.

Also note that those two special attributes MUST be set in the __init__()
Also note that those special attributes MUST be set in the __init__()
before calling super method, otherwise deserialization will not work.

And also, a subclass that has one or more *named* attributes, it must
@@ -74,24 +77,24 @@ class MmifObject(object):
an ID value automatically generated, based on its parent object.
"""

reserved_names: Set = {
reserved_names: Set[str] = {
'reserved_names',
'_unnamed_attributes',
'_attribute_classes',
'_required_attributes',
# used in Document class to store parent view id
'_parent_view_id',
# used in View class to autogenerate annotation ids
'_id_counts'
'_exclude_from_diff'
}
_unnamed_attributes: Optional[dict]
_exclude_from_diff: Set[str]
_attribute_classes: Dict[str, Type] = {} # Mapping: str -> Type

def __init__(self, mmif_obj: Optional[Union[bytes, str, dict]] = None) -> None:
if isinstance(mmif_obj, bytes):
mmif_obj = mmif_obj.decode('utf8')
if not hasattr(self, '_required_attributes'):
self._required_attributes = []
if not hasattr(self, '_exclude_from_diff'):
self._exclude_from_diff = set()
if not hasattr(self, '_unnamed_attributes'):
self._unnamed_attributes = {}
if mmif_obj is not None:
@@ -253,7 +256,10 @@ def __str__(self) -> str:

def __eq__(self, other) -> bool:
return isinstance(other, type(self)) and \
len(DeepDiff(self, other, report_repetition=True, exclude_types=[datetime])) == 0
len(DeepDiff(self, other, report_repetition=True, exclude_types=[datetime],
# https://github.com/clamsproject/mmif-python/issues/214
exclude_paths=self._exclude_from_diff)
) == 0

def __len__(self) -> int:
"""
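
The `_exclude_from_diff` mechanism in this file can be sketched without DeepDiff. The example below is a simplified, standard-library-only stand-in: it compares instance attributes as plain dicts and skips the excluded names, whereas the change above passes the set to DeepDiff's `exclude_paths`. The `Comparable` and `Counter` classes are hypothetical and are not part of `mmif-python`.

```python
class Comparable:
    """Toy base class; not the mmif-python ``MmifObject``."""

    def __init__(self) -> None:
        # Subclasses may set this before calling super().__init__().
        if not hasattr(self, "_exclude_from_diff"):
            self._exclude_from_diff = set()

    def _diffable(self) -> dict:
        # Drop the excluded attributes, plus the exclusion set itself.
        skipped = self._exclude_from_diff | {"_exclude_from_diff"}
        return {k: v for k, v in vars(self).items() if k not in skipped}

    def __eq__(self, other) -> bool:
        return isinstance(other, type(self)) and self._diffable() == other._diffable()


class Counter(Comparable):
    def __init__(self, label: str, hits: int = 0) -> None:
        self._hits = hits                      # bookkeeping, ignored by __eq__
        self._exclude_from_diff = {"_hits"}
        super().__init__()
        self.label = label


a = Counter("x", hits=0)
b = Counter("x", hits=5)
c = Counter("y", hits=0)
same_despite_hits = (a == b)
differs_on_label = (a == c)
```

Setting the exclusion set before calling the superclass initializer mirrors the docstring's requirement that the special attributes be set before `super().__init__()`.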
4 changes: 4 additions & 0 deletions mmif/serialize/view.py
@@ -31,7 +31,11 @@ class View(MmifObject):
"""

def __init__(self, view_obj: Optional[Union[bytes, str, dict]] = None) -> None:
# used to autogenerate annotation ids
self._id_counts = {}
self.reserved_names.add("_id_counts")
self._exclude_from_diff = {"_id_counts"}

self.id: str = ''
self.metadata: ViewMetadata = ViewMetadata()
self.annotations: AnnotationsList = AnnotationsList()
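
One behavior that may be worth keeping in mind here: `reserved_names` is declared at class level in `MmifObject`, so `self.reserved_names.add(...)` mutates a set shared by the base class and all subclasses rather than a per-instance copy. The toy classes below (hypothetical names, plain Python, no `mmif-python` import) show that general Python behavior.

```python
class Base:
    # Class-level set: one object shared by Base and every subclass.
    reserved_names = {"reserved_names"}


class DocLike(Base):
    def __init__(self) -> None:
        # .add() mutates the shared set in place.
        self.reserved_names.add("_parent_view_id")


class ViewLike(Base):
    def __init__(self) -> None:
        self.reserved_names.add("_id_counts")


before = set(Base.reserved_names)   # snapshot before any instance exists
DocLike()
ViewLike()
after = set(Base.reserved_names)    # snapshot after both constructors ran
shared = DocLike.reserved_names is Base.reserved_names
```

In this sketch a name only becomes reserved once an instance of the class that adds it has been constructed, and from then on it is reserved for every class that shares the set.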
1 change: 1 addition & 0 deletions requirements.dev
@@ -9,6 +9,7 @@ lxml
sphinx
autodoc
sphinx-rtd-theme
sphinx-autobuild
git+https://github.com/clamsproject/sphinx-multiversion.git@master
twine
m2r2
4 changes: 3 additions & 1 deletion tests/test_serialize.py
@@ -570,14 +570,16 @@ def test_new_contain(self):
# empty at_type is not allowed
self.view_obj.new_contain("")


def test_add_annotation(self):
anno_obj = Annotation(self.mmif_examples_json['everything']['views'][6]['annotations'][2])
old_len = len(self.view_obj.annotations)
self.view_obj.add_annotation(anno_obj) # raise exception if this fails
self.assertEqual(old_len+1, len(self.view_obj.annotations))
self.assertIn('http://vocab.lappsgrid.org/NamedEntity', self.view_obj.metadata.contains)
_ = self.view_obj.serialize() # raise exception if this fails
self.view_obj.new_annotation(AnnotationTypes.TimePoint)
roundtrip = View(json.loads(self.view_obj.serialize()))
self.assertEqual(roundtrip, self.view_obj)

def test_new_annotation(self):
self.view_obj.new_annotation('Relation', 'relation1')
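
The round-trip assertion added to `test_add_annotation` can be illustrated with a toy model. In the sketch below, `ToyView` is a hypothetical stand-in for `View`, and the `<prefix>_<count>` id scheme is an assumption made for the example. The id counter is not serialized, so the deserialized copy starts with an empty counter, and equality has to ignore it for the round trip to compare equal.

```python
import json


class ToyView:
    """Hypothetical stand-in for a view; not the mmif-python ``View`` class."""

    def __init__(self, data=None) -> None:
        self._id_counts = {}          # bookkeeping only, never serialized
        self.annotations = [dict(a) for a in (data or {}).get("annotations", [])]

    def new_annotation(self, prefix: str) -> dict:
        # Assumed id scheme for this sketch: "<prefix>_<running count>".
        count = self._id_counts.get(prefix, 0) + 1
        self._id_counts[prefix] = count
        anno = {"id": f"{prefix}_{count}"}
        self.annotations.append(anno)
        return anno

    def serialize(self) -> str:
        return json.dumps({"annotations": self.annotations})

    def __eq__(self, other) -> bool:
        # Compare serialized content only; ``_id_counts`` is deliberately ignored.
        return isinstance(other, ToyView) and self.annotations == other.annotations


view = ToyView()
view.new_annotation("tp")
roundtrip = ToyView(json.loads(view.serialize()))
counters_differ = view._id_counts != roundtrip._id_counts
equal_after_roundtrip = (roundtrip == view)
```

If `__eq__` compared every attribute, the two objects would differ on `_id_counts` alone, which is the situation the exclusion set in this change addresses.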