Scout Archives

spaCy is a Python library for natural language processing. It comes with pre-trained statistical models that allow it to perform detailed linguistic analysis of text in English, German, Greek, Spanish, French, Italian, Dutch, and Portuguese. In these languages, spaCy can tag each word's part of speech, identify syntactic relationships (subject, object, etc.), and generate sentence diagrams. It can also reduce words to their root forms (a process linguists call lemmatization) in these languages, handling both suffixed forms (e.g., "color," "colors," "coloring") and irregular verb forms (e.g., "is," "was," "be"). In addition, spaCy performs Named Entity Recognition (NER) to recognize "named 'real-world' objects like persons, companies, or locations." For around 40 other languages that lack a statistical model, spaCy can still accomplish simpler tasks like text tokenization and similarity testing. On spaCy's Usage page, readers can find installation instructions for Windows, macOS, and Unix/Linux systems. The Usage page also provides a number of guides for getting started and a series of in-depth code examples. spaCy is free software distributed under the MIT license, with source code available on GitHub.
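As a rough illustration of the features described above, the following minimal sketch tokenizes a sentence and reports each token's part of speech, lemma, and syntactic relation, plus any named entities. It assumes spaCy is installed and uses the model name `en_core_web_sm` as an example; that model is a separate download, so the sketch falls back to a blank English pipeline, which only tokenizes. The helper names `load_pipeline` and `analyze` and the sample sentence are hypothetical and chosen for this example only.

```python
import spacy


def load_pipeline(model_name="en_core_web_sm"):
    """Load a trained pipeline, or a tokenizer-only pipeline if it is missing.

    Assumption: the model was installed separately, for example with
    `python -m spacy download en_core_web_sm`.
    """
    try:
        return spacy.load(model_name)
    except OSError:
        # spacy.load raises OSError when the model package cannot be found.
        # A blank pipeline still tokenizes, but leaves tags and lemmas empty.
        return spacy.blank("en")


def analyze(nlp, text):
    """Return per-token annotations and named entities for one text."""
    doc = nlp(text)
    tokens = [
        {"text": tok.text, "pos": tok.pos_, "lemma": tok.lemma_, "dep": tok.dep_}
        for tok in doc
    ]
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return tokens, entities


if __name__ == "__main__":
    nlp = load_pipeline()
    tokens, entities = analyze(nlp, "Apple is opening an office in Berlin.")
    for row in tokens:
        print(row)
    print(entities)
```

With a trained model loaded, the `pos`, `lemma`, and `dep` fields and the entity list are filled in by the model's predictions; with the blank fallback they stay empty, which mirrors the tokenization-only support described for languages without a statistical model.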
Archived Scout Publication URL
Scout Publication
GEM Subject
Date of Scout Publication
June 21st, 2019
Date Of Record Creation
June 19th, 2019 at 10:02am
Date Of Record Release
June 19th, 2019 at 11:07am