Open
Labels
chore: Routine maintenance tasks that don't affect application behavior or functionality
Description
Our Gilda package requires some nltk datasets to be downloaded before it can execute properly. Our test suite fails unless the following has been run first:
import nltk; nltk.download("stopwords"); nltk.download("punkt_tab")
The block of code which fails is the following:
ontology-access-kit/src/oaklib/implementations/gilda.py
Lines 75 to 86 in aaede8e
    def _gilda_annotate(self, text: str) -> Iterator[TextAnnotation]:
        from gilda.ner import annotate
        for match_text, match, start, end in annotate(text, grounder=self.grounder):
            yield TextAnnotation(
                subject_start=start,
                subject_end=end,
                subject_label=match_text,
                object_id=match.term.get_curie(),
                object_label=match.term.entry_name,
                matches_whole_text=start == 0 and end == len(text),
            )
Having to know ahead of time that you need to run these downloads is a bit annoying. I expect it'd be better to have these nltk.download commands run whenever a GildaImplementation instance is created.
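A minimal sketch of what that could look like, assuming the two datasets live at the usual NLTK data paths (`corpora/stopwords` and `tokenizers/punkt_tab`). The helper name `ensure_nltk_resources` and the `find`/`download` parameters are hypothetical and only there to make the check easy to test. It looks each dataset up first and downloads only what is missing, so repeated instantiation would not hit the network every time.

```python
from typing import Callable, Dict, List, Optional

# Download name -> lookup path for nltk.data.find(). The names come from this
# issue; the "corpora/" and "tokenizers/" prefixes are assumed from the usual
# NLTK data layout.
REQUIRED_NLTK_RESOURCES: Dict[str, str] = {
    "stopwords": "corpora/stopwords",
    "punkt_tab": "tokenizers/punkt_tab",
}


def ensure_nltk_resources(
    resources: Optional[Dict[str, str]] = None,
    find: Optional[Callable[[str], object]] = None,
    download: Optional[Callable[..., object]] = None,
) -> List[str]:
    """Download any missing NLTK resources and return the names fetched.

    Hypothetical helper: `find` and `download` default to nltk.data.find and
    nltk.download, and can be swapped out in tests.
    """
    if resources is None:
        resources = REQUIRED_NLTK_RESOURCES
    if find is None or download is None:
        import nltk  # imported lazily, only when the real functions are needed

        find = find or nltk.data.find
        download = download or nltk.download

    fetched: List[str] = []
    for name, path in resources.items():
        try:
            find(path)  # raises LookupError when the dataset is not installed
        except LookupError:
            download(name, quiet=True)
            fetched.append(name)
    return fetched
```

This could be called once from wherever GildaImplementation sets up its grounder. Exactly where that hook belongs is an assumption on my part.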