Adding Concurrency to speed up our Golang Web Application




Welcome to part 24 of the Go programming tutorial series. In this tutorial, we're going to apply goroutines and channels in an effort to add concurrency and efficiency to our web app. Up to this point, our web app takes ~5000ms (5 seconds) to load, which is atrocious. Using Google Chrome's developer tools, we can see where that time is being spent. You can do this for yourself by pressing F12 in Chrome and going to the Performance tab, where you can record and inspect page-load times. In our case, we're loading a massive table, which is going to take a while, but the "idle" time is the time spent on our Go server: the browser is just sitting there waiting for a response. We're currently showing 2800ms of "idle" time, meaning our program is taking 2.8 seconds to respond. This is not good.

If you don't have our code up to this point, there are basically three files you should be familiar with:

First, the code from the previous tutorial, showing a simple example of goroutines, channels, and iterating through returned data:

package main

import (
    "fmt"
    "sync"
)

var wg sync.WaitGroup

func foo(c chan int, someValue int) {
    defer wg.Done()
    c <- someValue * 5
}

func main() {
    fooVal := make(chan int, 10)
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go foo(fooVal, i)
    }
    wg.Wait()
    close(fooVal)
    for item := range fooVal {
        fmt.Println(item)
    }
}

Then you should also have the latest web app code:

package main

import (
    "fmt"
    "net/http"
    "html/template"
    "encoding/xml"
    "io/ioutil"
)

type NewsMap struct {
    Keyword string
    Location string
}

type NewsAggPage struct {
    Title string
    News map[string]NewsMap
}

type Sitemapindex struct {
    Locations []string `xml:"sitemap>loc"`
}

type News struct {
    Titles []string `xml:"url>news>title"`
    Keywords []string `xml:"url>news>keywords"`
    Locations []string `xml:"url>loc"`
}


func indexHandler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "<h1>Whoa, Go is neat!</h1>")
}

func newsAggHandler(w http.ResponseWriter, r *http.Request) {

    var s Sitemapindex
    var n News
    resp, _ := http.Get("https://www.washingtonpost.com/news-sitemap-index.xml")
    bytes, _ := ioutil.ReadAll(resp.Body)
    xml.Unmarshal(bytes, &s)
    news_map := make(map[string]NewsMap)

    for _, Location := range s.Locations {
        resp, _ := http.Get(Location)
        bytes, _ := ioutil.ReadAll(resp.Body)
        xml.Unmarshal(bytes, &n)

        for idx := range n.Keywords {
            news_map[n.Titles[idx]] = NewsMap{n.Keywords[idx], n.Locations[idx]}
        }
    }

    p := NewsAggPage{Title: "Amazing News Aggregator", News: news_map}

    t, _ := template.ParseFiles("aggregatorfinish.html")
    t.Execute(w, p)
}


func main() {
    http.HandleFunc("/", indexHandler)
    http.HandleFunc("/agg/", newsAggHandler)
    http.ListenAndServe(":8000", nil) 
}

And aggregatorfinish.html:

<head>
    <script type="text/javascript" charset="utf8" src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
    <link rel="stylesheet" type="text/css" href="//cdn.datatables.net/1.10.16/css/jquery.dataTables.css">
    <script type="text/javascript" charset="utf8" src="//cdn.datatables.net/1.10.16/js/jquery.dataTables.js"></script>
</head>

<h1>{{.Title}}</h1>

<table id="example" class="display" cellspacing="0" width="100%">
    <col width="35%">
    <col width="65%">
    <thead>
        <tr>
            <th>Title</th>
            <th>Keywords</th>
        </tr>
    </thead>
    <tbody>

        {{ range $key, $value := .News }}
         <tr>
            <td><a href="{{ $value.Location }}" target='_blank'>{{ $key }}</a></td>
            <td>{{ $value.Keyword }}</td>
        </tr>
        {{ end }}
    </tbody>
</table>



<script>$(document).ready(function() {
    $('#example').DataTable();
} );</script> 

Okay, so this is the code that was our "proof of concept" for the site aggregator, but it's horribly slow. Let's modify the web app code and see if we can get some improvement out of this thing:

First, let's import sync, and then define our wait group: var wg sync.WaitGroup.

Next, let's head into our newsAggHandler, and make some modifications. First off, we (I) have been forgetting to close the response body, so let's fix that:

func newsAggHandler(w http.ResponseWriter, r *http.Request) {

    var s Sitemapindex
    var n News
    resp, _ := http.Get("https://www.washingtonpost.com/news-sitemap-index.xml")
    bytes, _ := ioutil.ReadAll(resp.Body)
    xml.Unmarshal(bytes, &s)
    news_map := make(map[string]NewsMap)
    resp.Body.Close()

Next, we can still visit the main sitemap as usual, but then, once we iterate over and start visiting all the actual sitemaps, we want that part to be concurrent with goroutines and channels.

Let's think about this for a moment. What is our channel going to contain? We know it needs a type; in our case, that should be the News type. The parsing is going to happen inside each goroutine now, so we could either pass n into the goroutines, or just create a local News value per goroutine. To me, the latter seems both safer and more reasonable (each goroutine gets its own value, so there's nothing shared to race on), so we'll remove the var n News from newsAggHandler. Next, a goroutine is just a function, so we can replace the code in the main for loop with a function instead. I'll call it newsRoutine:

func newsRoutine(Location string) {
    var n News
    resp, _ := http.Get(Location)
    bytes, _ := ioutil.ReadAll(resp.Body)
    xml.Unmarshal(bytes, &n)
    resp.Body.Close()
}

Then we modify the newsAggHandler to be:

func newsAggHandler(w http.ResponseWriter, r *http.Request) {

    var s Sitemapindex
    resp, _ := http.Get("https://www.washingtonpost.com/news-sitemap-index.xml")
    bytes, _ := ioutil.ReadAll(resp.Body)
    xml.Unmarshal(bytes, &s)
    news_map := make(map[string]NewsMap)
    resp.Body.Close()

    for _, Location := range s.Locations {
        newsRoutine(Location)
    }

    p := NewsAggPage{Title: "Amazing News Aggregator", News: news_map}

    t, _ := template.ParseFiles("aggregatorfinish.html")
    t.Execute(w, p)
}

Notice here we're also closing the body. At this point, of course, we've NOT yet replicated the population of news_map. To do that, we want to use a channel to pass data back from the running goroutines. Let's create one with queue := make(chan News, 30) before the for loop (a buffer of 30 is comfortably more than the number of sitemaps, which matters later when we close the channel), and then modify newsRoutine to accept a channel as a parameter and send a value over it:

func newsRoutine(c chan News, Location string) {
    var n News
    resp, _ := http.Get(Location)
    bytes, _ := ioutil.ReadAll(resp.Body)
    xml.Unmarshal(bytes, &n)
    resp.Body.Close()
    c <- n
}

We then need to pass the channel through the newsRoutine call, and make newsRoutine a goroutine:

    queue := make(chan News, 30)

    for _, Location := range s.Locations {
        go newsRoutine(queue, Location)
    }

Now, like before, we don't want to range over the channel until all of the goroutines are done, so we want to make sure we close the channel, but not before every goroutine has finished running. We've done this before, and we already imported sync and created the wait group variable. We need to call wg.Add(1) before starting each newsRoutine, defer wg.Done() inside of newsRoutine, then call wg.Wait() after the main for loop, followed by close(queue). With that, once all of the goroutines have run and the channel is closed, we're ready to range over it:

    for elem := range queue {

    }

In here, elem is going to be a News type, so we can actually use the exact same code with it as we had before, just replacing n with elem:

    for elem := range queue {
        for idx := range elem.Keywords {
            news_map[elem.Titles[idx]] = NewsMap{elem.Keywords[idx], elem.Locations[idx]}
        }
    }

With this, we're done! Full web application Golang code is:

package main

import (
    "fmt"
    "net/http"
    "html/template"
    "encoding/xml"
    "io/ioutil"
    "sync"
)

var wg sync.WaitGroup

type NewsMap struct {
    Keyword string
    Location string
}

type NewsAggPage struct {
    Title string
    News map[string]NewsMap
}

type Sitemapindex struct {
    Locations []string `xml:"sitemap>loc"`
}

type News struct {
    Titles []string `xml:"url>news>title"`
    Keywords []string `xml:"url>news>keywords"`
    Locations []string `xml:"url>loc"`
}

func indexHandler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "<h1>Whoa, Go is neat!</h1>")
}

func newsRoutine(c chan News, Location string) {
    defer wg.Done()
    var n News
    resp, _ := http.Get(Location)
    bytes, _ := ioutil.ReadAll(resp.Body)
    xml.Unmarshal(bytes, &n)
    resp.Body.Close()
    c <- n
}

func newsAggHandler(w http.ResponseWriter, r *http.Request) {

    var s Sitemapindex
    resp, _ := http.Get("https://www.washingtonpost.com/news-sitemap-index.xml")
    bytes, _ := ioutil.ReadAll(resp.Body)
    xml.Unmarshal(bytes, &s)
    news_map := make(map[string]NewsMap)
    resp.Body.Close()
    queue := make(chan News, 30)

    for _, Location := range s.Locations {
        wg.Add(1)
        go newsRoutine(queue, Location)
    }
    wg.Wait()
    close(queue)

    for elem := range queue {
        for idx := range elem.Keywords {
            news_map[elem.Titles[idx]] = NewsMap{elem.Keywords[idx], elem.Locations[idx]}
        }
    }

    p := NewsAggPage{Title: "Amazing News Aggregator", News: news_map}

    t, _ := template.ParseFiles("aggregatorfinish.html")
    t.Execute(w, p)
}

func main() {
    http.HandleFunc("/", indexHandler)
    http.HandleFunc("/agg/", newsAggHandler)
    http.ListenAndServe(":8000", nil) 
}

We can run this, and head to http://127.0.0.1:8000/agg/. Immediately, you should notice it's much faster. The page itself loads fairly quickly, with most of the time being spent waiting for the table to fully populate and get styled.

We can use the Chrome developer tools to compare the previous code to now:

Previous code:

[screenshot: Chrome performance timeline for the previous code]

With goroutines and channels:

[screenshot: Chrome performance timeline with goroutines and channels]

Comparing these, we can see the main change is in the "idle" time: time where the browser was simply waiting for a response from our web server. Our program pulls the data from the sitemaps fresh on every load, so our actual Golang code is now ~560% faster, simply by bringing in goroutines and channels.

From here, our next steps would be to address the remaining ~500ms, which means also timing the Washington Post's sitemap responses. For example, if we modify our app to grab the Sitemapindex data at server start, rather than on page load, we save ~75ms, which we can confirm using the same method as before. We could easily make this change, since we can assume the sitemap index itself doesn't change very often. The other sitemaps each show ~75ms response times as well, so of the 500ms, about 350ms is actually ours to work with. At this point, though, I would probably focus on the larger issue, which is populating the table. I would bring in real pagination with a more functional search bar: we could use jQuery to keep a live search, then paginate (also loading via jQuery or similar), anything to avoid sending all 1000+ rows of titles, URLs, and keywords at once. I'm sure there are also ways to shave down our Golang code's run time. If you have any suggestions, feel free to share them!

That's all for now. For more tutorials, see the rest of the series:





  • Introduction to the Go Programming Language
  • Go Language Syntax
  • Go Language Types
  • Pointers in Go Programming
  • Simple Web App in Go Programming
  • Structs in the Go Programming Language
  • Methods in Go Programming
  • Pointer Receivers in Go Programming
  • More Web Dev in Go Language
  • Accessing the Internet in Go
  • Parsing XML with Go Programming
  • Looping in Go Programming
  • Continuing our Go Web application
  • Mapping in Golang
  • Mapping Golang sitemap data
  • Golang Web App HTML Templating
  • Applying templating to our Golang web app
  • Goroutines - Concurrency in Go programming
  • Synchronizing Goroutines - Concurrency in Golang
  • Defer - Golang
  • Panic and Recover in Go Programming
  • Go Channels - Concurrency in Go
  • Go Channels buffering, iteration, and synchronization
  • Adding Concurrency to speed up our Golang Web Application