latest spaCy tagger and parser returning unexpected results #535

@bdewilde

After downloading the latest version of spaCy and updating the models, I no longer get reasonable POS tagging or dependency parsing. Here's an example in Python 3.5 on macOS Sierra:

>>> import spacy
>>> en_nlp = spacy.load('en')
>>> en_doc = en_nlp('Hello, world. Here are two sentences.')
>>> [tok.text for tok in en_doc]
['Hello', ',', 'world', '.', 'Here', 'are', 'two', 'sentences', '.']
>>> [tok.pos_ for tok in en_doc]
['PUNCT', 'PUNCT', 'PUNCT', 'PUNCT', 'PUNCT', 'PUNCT', 'PUNCT', 'PUNCT', 'PUNCT']
>>> [tok.tag_ for tok in en_doc]
['""', '""', '""', '""', '""', '""', '""', '""', '""']
>>> [tok.dep_ for tok in en_doc]
['ROOT', 'ROOT', 'ROOT', 'ROOT', 'ROOT', 'ROOT', 'ROOT', 'ROOT', 'ROOT']

Is this code no longer correct for v1.0.1? If it is still correct, do you have any idea what's going wrong?
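A quick way to flag this symptom programmatically, as a minimal sketch: `is_degenerate` is a hypothetical helper (not part of spaCy) that reports whether every token in a sequence received the same label, which is what the output above shows. The label lists are copied from the session above.

```python
def is_degenerate(labels):
    """Return True when a sequence of two or more labels holds one repeated value.

    Hypothetical helper for illustration; it is not part of spaCy.
    """
    labels = list(labels)
    return len(labels) > 1 and len(set(labels)) == 1


# Label sequences as reported in the session above.
pos_labels = ['PUNCT'] * 9
dep_labels = ['ROOT'] * 9

print(is_degenerate(pos_labels), is_degenerate(dep_labels))
```

With a loaded pipeline, the same check could be applied to `[tok.pos_ for tok in en_doc]` or `[tok.dep_ for tok in en_doc]`. A uniform result on a multi-word sentence suggests the tagger or parser is not producing real predictions, though the check alone does not say why.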

Labels: bug (Bugs and behaviour differing from documentation)
