Article Summarizer

I used LangChain to summarize my written thoughts for the Website v4 project. This guide introduces a script that uses a MapReduceChain to summarize each article individually and then reduce those summaries into a single final summary.


Python Notebook Setup

I used this workflow to summarize roughly twenty posts from my old blog.

Info

Responses from OpenAI were slow because summarizing this much text is a lot of work. The default timeout set by LangChain was ten minutes, so I ended up running the script on batches of five articles at a time.
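The batching described above can be done with a few lines of plain Python. This is a minimal sketch, not the code from my notebook: the helper name `batches` and the `post_NN.txt` file names are placeholders made up for illustration.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int = 5) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical file names, used only to show the shape of the result.
article_files = [f"post_{n:02d}.txt" for n in range(1, 13)]
article_batches = list(batches(article_files))
```

Each entry in `article_batches` can then be summarized in its own run, which keeps every request well under the timeout.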

Pro Tip

Grouping similar articles into the same batch improved the quality of the responses. The output text was good, but the intermediate steps were the pleasant surprise: they contained a level of detail I didn't expect.

Prerequisites

  1. LangChain and its dependencies installed. See the LangChain installation docs.
  2. A folder named articles containing the .txt files you want to summarize.
  3. An API key from OpenAI.
  4. A file named constants.py to store the API key.

Load the API key and documents.
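A minimal sketch of this step using only the standard library. Two assumptions to flag: the variable name `APIKEY` inside constants.py is my placeholder, and the helper names `load_articles` and `set_api_key` are hypothetical. LangChain ships its own document loaders, which this sketch does not use.

```python
import os
from pathlib import Path
from typing import Dict


def load_articles(folder: str = "articles") -> Dict[str, str]:
    """Read every .txt file in `folder` and return {file name: text}."""
    texts = {}
    for path in sorted(Path(folder).glob("*.txt")):
        texts[path.name] = path.read_text(encoding="utf-8")
    return texts


def set_api_key() -> None:
    """Copy the key from constants.py into the OPENAI_API_KEY environment variable."""
    # Assumption: constants.py defines a variable named APIKEY.
    import constants  # imported lazily so load_articles works without the file

    os.environ["OPENAI_API_KEY"] = constants.APIKEY
```

Keeping the key in constants.py (and out of version control) means the notebook itself never contains the secret.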

The articles are stored as .txt files in the articles folder.

Define the map_chain so we can run it later.
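To show what the map step does conceptually, here is a plain-Python stand-in rather than the LangChain chain itself. The prompt wording is hypothetical, and `llm` is assumed to be any callable that takes a prompt string and returns the model's reply.

```python
from typing import Callable, List

# Hypothetical prompt wording, written for this sketch only.
MAP_TEMPLATE = (
    "The following is an article or part of one:\n"
    "{text}\n"
    "Summarize the main ideas in a short paragraph."
)


def run_map_step(chunks: List[str], llm: Callable[[str], str]) -> List[str]:
    """Send each chunk through the map prompt and collect one summary per chunk."""
    return [llm(MAP_TEMPLATE.format(text=chunk)) for chunk in chunks]
```

The map step treats every chunk independently, which is why its per-chunk summaries can be inspected later as intermediate steps.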

Do the same for the reduce_chain.
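The reduce step can be sketched the same way, again as a plain-Python stand-in and not LangChain's own API. The prompt wording and the `llm` callable are assumptions.

```python
from typing import Callable, List

# Hypothetical prompt wording, written for this sketch only.
REDUCE_TEMPLATE = (
    "The following is a set of summaries:\n"
    "{summaries}\n"
    "Combine them into one final summary of the main themes."
)


def run_reduce_step(summaries: List[str], llm: Callable[[str], str]) -> str:
    """Join the per-chunk summaries and ask the model for one combined summary."""
    joined = "\n\n".join(summaries)
    return llm(REDUCE_TEMPLATE.format(summaries=joined))
```

Unlike the map step, the reduce step makes a single model call, so the joined summaries must fit in one prompt.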

This last block does the following:

  • Initializes the chains for combining, mapping, and reducing documents using LangChain modules.
  • Splits the input documents into manageable chunks using a character-based text splitter.
  • Executes the map-reduce chain on the split documents and retrieves the output text and intermediate steps.
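The three steps above can be sketched end to end in plain Python. This is an illustration under stated assumptions, not the LangChain code: `split_text` is a naive fixed-width splitter (LangChain's character splitter works on separators), `fake_llm` is a stand-in for the OpenAI call, and the result keys `output_text` and `intermediate_steps` are chosen to mirror the two outputs mentioned above.

```python
from typing import Callable, Dict, List, Union


def split_text(text: str, chunk_size: int = 1000) -> List[str]:
    """Cut `text` into consecutive pieces of at most `chunk_size` characters."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def map_reduce_summarize(
    documents: List[str],
    llm: Callable[[str], str],
    chunk_size: int = 1000,
) -> Dict[str, Union[str, List[str]]]:
    """Split every document, summarize each chunk, then combine the summaries."""
    chunks = [c for doc in documents for c in split_text(doc, chunk_size)]
    # Map: one model call per chunk.
    intermediate = [llm(f"Summarize this passage:\n{c}") for c in chunks]
    # Reduce: one model call over all the chunk summaries.
    final = llm("Combine these summaries into one:\n" + "\n".join(intermediate))
    return {"output_text": final, "intermediate_steps": intermediate}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call: reports the prompt length instead of summarizing."""
    return f"[{len(prompt)} chars summarized]"


# A 2500-character document yields 3 chunks and a 300-character one yields 1,
# so this run produces 4 intermediate summaries.
result = map_reduce_summarize(["a" * 2500, "b" * 300], fake_llm)
```

Swapping `fake_llm` for a function that calls a real model is the only change needed to try the sketch on actual articles.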

Example Outputs

For brevity, here are the intermediate steps for the first document only.

The workflow is modular, so you can adapt and extend the code to suit your own research needs. To explore other document types, adjust the document loaders.