Looking at stop words: why you shouldn’t blindly trust model defaults

talk

This talk will focus on the importance of checking assumptions and defaults in the software you use.

Published

September 26, 2020

Invited talk at the Salt Lake City R Users Group

Removing stop words is a fairly common step in natural language processing, and NLP packages often supply a default list. However, most documentation and tutorials don’t explore the nuances of selecting an appropriate list. Defaults for machine learning and modeling can be helpful but may be misleading or wrong. This talk will focus on the importance of checking assumptions and defaults in the software you use.
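As a rough illustration of why the choice of list matters, here is a minimal Python sketch. Both stop word lists are made up for this example and are not taken from any real package, and the helper names `remove_stop_words` and `compare_lists` are hypothetical. It shows one way to check what a default list actually contains before relying on it.

```python
# Two small, made-up stop word lists standing in for package defaults.
# They are illustrative only and are not copied from any real library.
LIST_A = {"the", "a", "is", "of", "not", "she", "he"}
LIST_B = {"the", "a", "is", "of"}


def remove_stop_words(text, stop_words):
    """Lowercase the text, split on whitespace, and drop listed tokens."""
    return [tok for tok in text.lower().split() if tok not in stop_words]


def compare_lists(first, second):
    """Return the words unique to each list and the words they share."""
    return {
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
        "shared": sorted(first & second),
    }


sentence = "She is not a fan of the film"

# The same sentence keeps different tokens depending on the list used.
print(remove_stop_words(sentence, LIST_A))
print(remove_stop_words(sentence, LIST_B))

# Inspecting the difference shows which words one list silently removes.
print(compare_lists(LIST_A, LIST_B)["only_first"])
```

With the first list, the negation "not" and the pronoun "she" disappear from the sentence; with the second they survive. Whether that matters depends on the task, which is why it can be worth looking at the list itself before accepting a default.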