Full working examples in Python, with accompanying datasets, for Text Mining & NLP. Includes: Gensim Word2Vec, phrase embeddings, keyword extraction with TF-IDF, word count with PySpark, simple text preprocessing, accessing pre-trained embeddings, and more.

Text Classification with Logistic Regression

Learn how to build your first text classifier using Logistic Regression in Python. The challenge is to assign each news article to its appropriate category, chosen from a set of 31 categories.
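As a rough illustration of the idea, here is a minimal sketch of a text classifier built with Logistic Regression. It assumes scikit-learn is installed and uses TF-IDF features; neither assumption is confirmed by this README, and the notebook may do things differently. The four training sentences, the two labels, and the `predict_category` helper are hypothetical stand-ins for the real news dataset and its 31 categories.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the news articles in the notebook.
train_texts = [
    "football team wins championship match",
    "striker scores late goal in final match",
    "new phone ships with faster processor",
    "software update fixes critical security bug",
]
train_labels = ["sports", "sports", "tech", "tech"]

# Turn raw text into TF-IDF vectors, then fit a logistic regression on them.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(train_texts, train_labels)


def predict_category(text):
    """Return the predicted label for a single piece of text."""
    return model.predict([text])[0]


print(predict_category("team scores goal in match"))
print(predict_category("phone software update"))
```

Wrapping the vectorizer and the classifier in one pipeline means the same preprocessing is applied at training and prediction time, which is usually the easiest way to avoid mismatched features.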

Running the Notebook

  1. From the command line, first clone this repo.
     ```
     git clone <this repo url>
     ```
  2. Next, switch to the text-classification directory of this repo.
     ```
     cd nlp-in-practice/text-classification
     ```
  3. Then, run Jupyter Notebook.
     ```
     jupyter notebook
     ```
  4. Now, go to the notebooks directory, select the notebook you would like to run, and re-run the cells.