I'm aware that there is no single universal list of stop words. But the current stopwords-id.txt (currently 1309 sloc) contains too many words that aren't stop words (e.g. words that aren't function words). It's more an "Indonesian word list" than an "Indonesian stop words" list.
I think it's safer to start from another existing stop word list and then improve it incrementally.
References: