Concurrency in Go: goroutines explained for backend developers
Goroutines, concurrency vs parallelism, real backend patterns and common mistakes. Simple and effective concurrency in Go.

In Python you have asyncio. In Java, threads and executors. In Go, you write go in front of a function call. And that simplicity is, I think, both one of the best design decisions of the language and one of the most dangerous traps for those who adopt it without understanding what lies beneath.
I’ve spent years building backends in Kotlin and Python. I’ve worked with Kotlin coroutines, Java’s CompletableFuture, Python’s asyncio. And I can say, at least from my experience, that concurrency in Go doesn’t eliminate complexity. What it does is let you work with it without feeling like you’re fighting the language. The syntax gets out of the way, the abstractions are few, and the mental model is surprisingly straightforward. But straightforward doesn’t mean trivial, and that’s where a lot of people get lost.
This article covers goroutines, the real difference between concurrency and parallelism, the synchronization mechanisms in the sync package, real backend patterns and the mistakes you’re going to make. If you already know how to program and want to understand concurrency in Go from the perspective of someone who builds real services, this is for you. If you’re just starting with Go, you may want to first read learning Go.
Concurrency vs parallelism: the distinction that matters
Before writing a single goroutine, I think you need to be clear on a concept that most developers mix up (I mixed it up for years): concurrency and parallelism are not the same thing.
Concurrency is the ability to manage multiple tasks at once. It doesn’t mean they run at the same time. It means your program is structured to be able to make progress on several tasks without one blocking the others.
Parallelism is real simultaneous execution. Two things literally running at the same time on two different CPUs.
Rob Pike, one of Go’s creators, sums it up in a phrase I repeat whenever someone confuses the two concepts:
“Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once.”
In practice, for backend work, concurrency matters much more than parallelism. Most of our services spend the vast majority of their time waiting: waiting for database responses, waiting for HTTP calls to other services, waiting for I/O. If your program can do something useful while it waits, you gain performance. Not because you’re executing things in parallel, but because you’re not wasting time blocked.
Go gives you both. Goroutines allow structural concurrency. The Go runtime, if it has multiple CPUs available, can execute them in parallel. But the fundamental value is in concurrency: your code can be making ten HTTP requests at once without needing ten operating system threads.
A concrete example. Your service receives a request and needs to:
- Query the database to get a user
- Call an external service to get their permissions
- Query a cache to get their preferences
Sequentially, if each operation takes 100ms, you need 300ms. Concurrently, you can launch all three at once and wait for them to finish: 100ms. Not because you’re executing in parallel (you might be, but that’s not what matters), but because you’re not waiting for one to finish before starting the next.
What is a goroutine
A goroutine is a function that executes concurrently with the rest of the program. Technically it’s a lightweight thread managed by the Go runtime, not by the operating system.
This is important, and it’s where Go makes a difference. In Java, a traditional (platform) thread is an OS thread; virtual threads only became a standard feature in Java 21. Creating OS threads is expensive (typically 1-2 MB of stack per thread), context switching between them is expensive, and having thousands at once is problematic. In Go, a goroutine starts with a stack of about 2 KB (the initial size since Go 1.4) that grows dynamically. You can have hundreds of thousands running without issues. The Go runtime multiplexes them over a much smaller number of OS threads using its own scheduler.
The model is M:N. M goroutines mapped onto N OS threads. The Go runtime decides when and how to distribute them. You don’t have to manage thread pools, configure the number of threads, or think about context switches. You launch goroutines and the runtime handles it.
func process(id int) {
fmt.Printf("processing task %d\n", id)
time.Sleep(100 * time.Millisecond) // simulates work
fmt.Printf("task %d completed\n", id)
}
This function has nothing special about it. It’s a normal function. The magic is in how you call it.
Launching goroutines: the go keyword
Launching a goroutine is the simplest operation in Go’s concurrency model:
go process(1)
That’s it. The process function executes in a new goroutine. Execution of the code that made the call continues immediately without waiting for process to finish.
func main() {
fmt.Println("start")
go process(1)
go process(2)
go process(3)
fmt.Println("goroutines launched")
time.Sleep(1 * time.Second) // inelegant wait
}
This example launches three goroutines. All three execute concurrently. The time.Sleep at the end is necessary because if main finishes, the program ends, and the goroutines die with it. Obviously, time.Sleep is not the right way to synchronize goroutines. It’s a hack you’ll see in tutorials that you shouldn’t use in production. But it serves to illustrate the point: launching goroutines is trivial.
You can also launch anonymous functions as goroutines:
go func() {
fmt.Println("running in a goroutine")
}()
go func(msg string) {
fmt.Println(msg)
}("hello from goroutine")
The pattern of passing arguments to the anonymous function is important. If you capture variables from the outer scope directly instead of passing them as arguments, you can end up with race conditions. More on this in the common errors section.
The problem: goroutines without synchronization
That’s where the simplicity of go becomes a trap. It’s so easy to launch goroutines that many developers — myself included at the start — launch them without thinking about how they synchronize. And then the problems begin.
func main() {
counter := 0
for i := 0; i < 1000; i++ {
go func() {
counter++
}()
}
time.Sleep(1 * time.Second)
fmt.Println(counter) // 1000? Not necessarily.
}
This code has a race condition. A thousand goroutines try to increment the same variable at the same time. counter++ is not an atomic operation: it reads the value, increments it, and writes it. If two goroutines read the same value before either has written, one of the writes is lost.
The result might be 1000, or 987, or 953. It depends on the runtime’s scheduling, the system load, the number of CPUs. It’s non-deterministic, and honestly, that’s the worst kind of bug that exists: the one that works on your machine and fails in production.
Without synchronization, goroutines are a bug generator. The fundamental rule of concurrency in Go is simple: if two goroutines access the same variable and at least one modifies it, you need synchronization.
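For the specific case of a single integer counter, one possible form of synchronization is the standard library’s sync/atomic package. The article itself does not cover it, so take this as a minimal sketch only. The function name countAtomically is a hypothetical helper, and the example borrows sync.WaitGroup from the next section to wait for the goroutines.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countAtomically increments a shared counter from n goroutines.
// atomic.AddInt64 performs the read-modify-write as one indivisible step,
// so no increment is lost.
func countAtomically(n int) int64 {
	var counter int64
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&counter, 1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&counter)
}

func main() {
	fmt.Println(countAtomically(1000)) // prints 1000
}
```

Atomics only fit simple cases like this one. As soon as several fields must change together, a mutex is usually the clearer choice.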
sync.WaitGroup: waiting for goroutines to finish
The first synchronization mechanism you need is sync.WaitGroup. It solves the most basic problem: knowing when a group of goroutines has finished.
func main() {
var wg sync.WaitGroup
for i := 0; i < 5; i++ {
wg.Add(1)
go func(id int) {
defer wg.Done()
fmt.Printf("goroutine %d working\n", id)
time.Sleep(100 * time.Millisecond)
}(i)
}
wg.Wait() // blocks until all goroutines call Done()
fmt.Println("all goroutines finished")
}
WaitGroup has three methods:
- Add(n): increments the counter by n. Call it before launching the goroutine.
- Done(): decrements the counter by 1. Call it when the goroutine finishes. Using defer is the convention.
- Wait(): blocks until the counter reaches 0.
Common mistakes with WaitGroup:
- Calling Add inside the goroutine instead of before launching it. If the goroutine hasn’t been scheduled yet when you call Wait, the counter might be at 0 and Wait returns prematurely.
// BAD
go func() {
wg.Add(1) // may execute after wg.Wait()
defer wg.Done()
// work
}()
// GOOD
wg.Add(1)
go func() {
defer wg.Done()
// work
}()
- Forgetting Done(). If a goroutine doesn’t call Done, Wait blocks forever. Always use defer wg.Done() as the first line of the goroutine.
- Passing WaitGroup by value. WaitGroup must not be copied. If you pass it to a function, pass it as a pointer.
// BAD: wg is copied, Done() doesn't affect the original
func worker(wg sync.WaitGroup) {
defer wg.Done()
// work
}
// GOOD: pass the pointer
func worker(wg *sync.WaitGroup) {
defer wg.Done()
// work
}
A real pattern I use constantly: launching N concurrent HTTP calls and waiting for all of them to finish.
func fetchAll(urls []string) []Response {
var wg sync.WaitGroup
results := make([]Response, len(urls))
for i, url := range urls {
wg.Add(1)
go func(idx int, u string) {
defer wg.Done()
results[idx] = fetch(u)
}(i, url)
}
wg.Wait()
return results
}
Notice that each goroutine writes to a different position in the results slice. There’s no race condition because they don’t share positions. If they all wrote to the same variable, you’d need a mutex.
sync.Mutex: protecting shared state
When two or more goroutines need to read and write the same variable, you need a sync.Mutex. A mutex (mutual exclusion) guarantees that only one goroutine accesses the critical section at a time.
type SafeCounter struct {
mu sync.Mutex
count int
}
func (c *SafeCounter) Increment() {
c.mu.Lock()
c.count++
c.mu.Unlock()
}
func (c *SafeCounter) Value() int {
c.mu.Lock()
defer c.mu.Unlock()
return c.count
}
Now the counter example works correctly:
func main() {
counter := &SafeCounter{}
var wg sync.WaitGroup
for i := 0; i < 1000; i++ {
wg.Add(1)
go func() {
defer wg.Done()
counter.Increment()
}()
}
wg.Wait()
fmt.Println(counter.Value()) // always 1000
}
sync.RWMutex: concurrent reads
If your use case has many reads and few writes, sync.RWMutex is more efficient. It allows multiple concurrent readers but only one exclusive writer.
type Cache struct {
mu sync.RWMutex
items map[string]string
}
func (c *Cache) Get(key string) (string, bool) {
c.mu.RLock()
defer c.mu.RUnlock()
val, ok := c.items[key]
return val, ok
}
func (c *Cache) Set(key, value string) {
c.mu.Lock()
defer c.mu.Unlock()
c.items[key] = value
}
Multiple goroutines can call Get simultaneously. But when one calls Set, it blocks all others until it finishes.
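One detail the Cache type leaves open is how the map gets initialized: writing to a nil map panics, so Set would fail on a zero-value Cache. A minimal sketch of how it could be used follows, assuming a hypothetical NewCache constructor that is not part of the original code.

```go
package main

import (
	"fmt"
	"sync"
)

type Cache struct {
	mu    sync.RWMutex
	items map[string]string
}

// NewCache is a hypothetical constructor. It initializes the map because
// writing to a nil map panics.
func NewCache() *Cache {
	return &Cache{items: make(map[string]string)}
}

func (c *Cache) Get(key string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	val, ok := c.items[key]
	return val, ok
}

func (c *Cache) Set(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[key] = value
}

func main() {
	cache := NewCache()
	cache.Set("lang", "go")

	var wg sync.WaitGroup
	// Many readers may hold the read lock at the same time.
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			cache.Get("lang")
		}()
	}
	// A single writer takes the exclusive lock.
	wg.Add(1)
	go func() {
		defer wg.Done()
		cache.Set("lang", "golang")
	}()
	wg.Wait()

	v, ok := cache.Get("lang")
	fmt.Println(v, ok) // prints "golang true"
}
```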
Practical rules for mutexes
- Keep the critical section as small as possible. Don’t put I/O inside a Lock. Do the Lock, modify the variable, do the Unlock. If you need to make an HTTP call, copy the data you need inside the Lock and make the call outside.
- Use defer for Unlock. It’s the idiomatic form and protects you from forgetting the Unlock if there’s an early return or a panic.
- Never copy a Mutex. As with WaitGroup, always pass pointers.
- Don’t Lock inside a Lock of the same mutex. It’s an immediate deadlock. Go doesn’t have reentrant mutexes by design.
// DEADLOCK: Lock inside Lock
func (c *SafeCounter) Bad() {
c.mu.Lock()
// ...
c.mu.Lock() // blocks here forever
}
When to use mutex vs channels
Go has two main synchronization mechanisms: mutexes and channels. The question of when to use each generates endless debate, and I don’t think there’s a single answer. But my rule is pragmatic:
Use mutex when you’re protecting shared state. If you have a variable that multiple goroutines need to read and write, a mutex is the most direct solution. An in-memory cache, a counter, a map of active sessions.
Use channels when you’re coordinating workflows. If you need to pass data from one goroutine to another, signal that something has finished, or implement a producer-consumer pattern, channels are the right abstraction.
The famous phrase is “Don’t communicate by sharing memory; share memory by communicating.” It’s a good principle, but taken to the extreme it produces artificially complex code. I’ve seen — and confess I’ve written — implementations of a simple counter with channels instead of a mutex, and the result was unreadable.
// Mutex: simple, direct, correct
var mu sync.Mutex
var count int
func increment() {
mu.Lock()
count++
mu.Unlock()
}
// Channels for a counter: overengineering
func counterManager(inc <-chan struct{}, get <-chan chan int) {
count := 0
for {
select {
case <-inc:
count++
case reply := <-get:
reply <- count
}
}
}
The second example is technically correct, and in some contexts may make sense. But for the vast majority of cases, I think the mutex is more readable, faster and easier to debug.
A quick guide:
| Situation | Use |
|---|---|
| Protect read/write of a variable | sync.Mutex or sync.RWMutex |
| Pass results between goroutines | Channels |
| Signal that something has finished | sync.WaitGroup or a channel |
| Fan-out / fan-in | Channels |
| Worker pool | Channels |
| In-memory cache | sync.RWMutex |
| Limit concurrency | Buffered channel as semaphore |
If you’re starting with Go, master WaitGroup and Mutex first. Then move on to channels in Go and worker pools in Go. Don’t try to learn everything at once.
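As a rough illustration of the “Pass results between goroutines” row in the table, here is a minimal sketch in which each goroutine sends its result through a channel and the caller collects them. The function squareAll is a hypothetical example, not something from a real service.

```go
package main

import (
	"fmt"
	"sort"
)

// squareAll computes the square of each input in its own goroutine and
// collects the results through a channel.
func squareAll(nums []int) []int {
	// Buffered with one slot per sender, so no goroutine blocks on send.
	results := make(chan int, len(nums))
	for _, n := range nums {
		go func(v int) {
			results <- v * v
		}(n)
	}

	out := make([]int, 0, len(nums))
	for range nums {
		out = append(out, <-results) // receive exactly one result per input
	}
	sort.Ints(out) // arrival order is not deterministic, so sort for stable output
	return out
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3, 4})) // prints [1 4 9 16]
}
```

No mutex is needed here because the goroutines never touch shared state: the only thing they share is the channel, which is safe for concurrent use.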
Real backend patterns: concurrent calls and parallel queries
Let’s get to what matters: patterns you’ll use in real services.
Concurrent HTTP calls
Your service needs to call three external APIs to compose a response. Doing it sequentially is wasting time.
type UserProfile struct {
User User
Permissions []Permission
Preferences Preferences
}
func GetUserProfile(ctx context.Context, userID string) (*UserProfile, error) {
var (
wg sync.WaitGroup
user User
permissions []Permission
preferences Preferences
userErr error
permErr error
prefErr error
)
wg.Add(3)
go func() {
defer wg.Done()
user, userErr = fetchUser(ctx, userID)
}()
go func() {
defer wg.Done()
permissions, permErr = fetchPermissions(ctx, userID)
}()
go func() {
defer wg.Done()
preferences, prefErr = fetchPreferences(ctx, userID)
}()
wg.Wait()
if userErr != nil {
return nil, fmt.Errorf("fetching user: %w", userErr)
}
if permErr != nil {
return nil, fmt.Errorf("fetching permissions: %w", permErr)
}
if prefErr != nil {
return nil, fmt.Errorf("fetching preferences: %w", prefErr)
}
return &UserProfile{
User: user,
Permissions: permissions,
Preferences: preferences,
}, nil
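}
This pattern is the bread and butter of backend with Go. Three calls that would take about 300ms sequentially (at 100ms each) now take about as long as the slowest one, roughly 100ms. The error variables are separate because each goroutine writes to its own, and wg.Wait() guarantees those writes are visible before the function reads them. There’s no race condition.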
For a more robust pattern with cancellation, you can use errgroup from the golang.org/x/sync package:
func GetUserProfile(ctx context.Context, userID string) (*UserProfile, error) {
g, ctx := errgroup.WithContext(ctx)
var user User
var permissions []Permission
var preferences Preferences
g.Go(func() error {
var err error
user, err = fetchUser(ctx, userID)
return err
})
g.Go(func() error {
var err error
permissions, err = fetchPermissions(ctx, userID)
return err
})
g.Go(func() error {
var err error
preferences, err = fetchPreferences(ctx, userID)
return err
})
if err := g.Wait(); err != nil {
return nil, err
}
return &UserProfile{
User: user,
Permissions: permissions,
Preferences: preferences,
}, nil
}
errgroup is better than WaitGroup for this case because it cancels the context if any of the goroutines fails. If the permissions call fails, the other goroutines receive the cancellation signal through the context and can stop instead of continuing to work pointlessly.
Concurrent database queries
Same pattern, but for PostgreSQL queries:
func GetDashboardData(ctx context.Context, db *sql.DB, userID int64) (*Dashboard, error) {
g, ctx := errgroup.WithContext(ctx)
var orders []Order
var stats Stats
var notifications []Notification
g.Go(func() error {
var err error
orders, err = getRecentOrders(ctx, db, userID)
return err
})
g.Go(func() error {
var err error
stats, err = getUserStats(ctx, db, userID)
return err
})
g.Go(func() error {
var err error
notifications, err = getUnreadNotifications(ctx, db, userID)
return err
})
if err := g.Wait(); err != nil {
return nil, fmt.Errorf("loading dashboard: %w", err)
}
return &Dashboard{
Orders: orders,
Stats: stats,
Notifications: notifications,
}, nil
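}
Three queries that were previously executed sequentially now execute concurrently. If your PostgreSQL connection pool has enough free connections, total latency drops to roughly that of the slowest query. If it doesn’t, the queries queue up waiting for a connection and you gain little.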
Processing a batch with limited concurrency
Sometimes you need to process thousands of items but can’t launch thousands of goroutines at once (for example, because the database or external API has a connection limit). A buffered channel acts as a semaphore:
func processBatch(ctx context.Context, items []Item) error {
const maxConcurrency = 10
sem := make(chan struct{}, maxConcurrency)
g, ctx := errgroup.WithContext(ctx)
for _, item := range items {
item := item // capture the loop variable
g.Go(func() error {
sem <- struct{}{} // acquire slot
defer func() { <-sem }() // release slot
return processItem(ctx, item)
})
}
return g.Wait()
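}
The sem channel has a buffer of 10. When 10 goroutines have written to it, the next one blocks until one of the previous ones reads from the channel (releasing a slot). Note that this version still creates one goroutine per item; what it limits is how many of them run processItem at the same time, which is what matters for a connection limit. It’s a simple and effective semaphore pattern. For more advanced patterns of this type, you can read worker pools in Go.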
Common mistakes: goroutine leaks, race conditions and oversights
After working with Go in production, I can say that 80% of concurrency bugs fall into a few categories. I list them here because knowing them in advance can save you weeks of debugging.
Goroutine leaks
A goroutine leak occurs when you launch a goroutine that never terminates. It stays there, consuming memory, waiting for something that will never arrive.
// LEAK: if nobody reads from the channel, the goroutine blocks forever
func leakyFunction() {
ch := make(chan int)
go func() {
result := expensiveComputation()
ch <- result // blocks if nobody reads
}()
// the function returns without reading from ch
// the goroutine remains blocked forever
}
The fix depends on the case. Sometimes you need a buffered channel. Sometimes you need a select with a ctx.Done(). Sometimes you need to make sure someone reads from the channel.
// FIX: buffered channel of size 1
func fixedFunction() {
ch := make(chan int, 1) // buffer of 1: the goroutine can write and finish
go func() {
result := expensiveComputation()
ch <- result
}()
// even if nobody reads, the goroutine doesn't block
}
Another common cause of leaks: goroutines listening on a channel that nobody closes.
// LEAK: if nobody closes ch, the goroutine never finishes
go func() {
for item := range ch {
process(item)
}
}()
Rule: if you launch a goroutine, be clear about what its termination condition is. If you can’t explain when and why it’s going to finish, you have a potential leak.
Loop variable capture
This is a classic that has bitten every Go developer at some point. Since Go 1.22, each loop iteration gets its own copy of the loop variable (for modules whose go.mod declares go 1.22 or later), but it’s important to understand the problem because there’s still legacy code that has it.
// PROBLEM (Go < 1.22): all goroutines print the same value
for _, url := range urls {
go func() {
fetch(url) // url is the loop variable, not a copy
}()
}
// SOLUTION: pass as argument
for _, url := range urls {
go func(u string) {
fetch(u)
}(url)
}
Before Go 1.22, the url variable of the loop was the same in each iteration. The goroutines captured a reference to that variable, and by the time they executed, url already had the last value. Since Go 1.22, each iteration creates a new variable, but passing as an argument remains the most explicit and safe pattern.
Unsynchronized access to maps
Maps in Go are not thread-safe. And this can be surprising: when the runtime detects two goroutines writing to the same map simultaneously, it doesn’t produce a silent incorrect result. It aborts the program with “fatal error: concurrent map writes”, which, unlike an ordinary panic, cannot be caught with recover. Detection is best-effort, so the crash is likely but not guaranteed on every run.
// CRASH: concurrent map writes
m := make(map[string]int)
for i := 0; i < 100; i++ {
	go func(n int) {
		m[fmt.Sprintf("key-%d", n)] = n // unsynchronized write
	}(i)
}
Solutions:
- sync.Mutex to protect map access.
- sync.Map if you have a use case with many reads and few writes and the keys are relatively stable.
- Redesign so each goroutine has its own map and you combine them at the end.
// Option 1: Mutex
type SafeMap struct {
mu sync.Mutex
m map[string]int
}
func (sm *SafeMap) Set(key string, val int) {
sm.mu.Lock()
sm.m[key] = val
sm.mu.Unlock()
}
Forgetting to pass the context
In backend services, the context (context.Context) is your cancellation mechanism. If you launch a goroutine that makes an HTTP call or a database query and you don’t pass the context, that goroutine won’t know that the original request was cancelled.
// BAD: the goroutine keeps working even if the client has cancelled
go func() {
result, err := http.Get(url) // without context
// ...
}()
// GOOD: use the request context
go func() {
req, _ := http.NewRequestWithContext(ctx, "GET", url, nil)
result, err := http.DefaultClient.Do(req)
// ...
}()
More on this in context in Go.
The race detector: go test -race
Go has a built-in tool that detects race conditions at runtime. It’s one of the best things about Go’s tooling and you should always use it.
go test -race ./...
go run -race main.go
go build -race -o myapp
The race detector instruments your code to detect unsynchronized concurrent accesses to the same variable. When it detects a race condition, it prints a detailed report with the stacks of the goroutines involved:
WARNING: DATA RACE
Read at 0x00c0000a4000 by goroutine 7:
main.main.func1()
/home/roger/app/main.go:15 +0x3c
Previous write at 0x00c0000a4000 by goroutine 6:
main.main.func1()
/home/roger/app/main.go:15 +0x52
Goroutine 7 (running) created at:
main.main()
/home/roger/app/main.go:13 +0x84
Goroutine 6 (finished) created at:
main.main()
/home/roger/app/main.go:13 +0x84
It tells you exactly which variable, which goroutines, and at which line of code. It is, I think, one of the best tools in the entire Go ecosystem.
When and how to use it
In tests, always. Your CI should run go test -race ./... on every commit. There’s no excuse not to. The performance penalty exists (the Go documentation cites 2-20x slower execution and 5-10x more memory for a typical program), but in tests that doesn’t matter.
In development, frequently. Compile with -race when you’re working on concurrent code. The race detector only detects races that actually occur during execution, not potential ones, so you need the conflicting code to actually run.
In production, no. The performance and memory penalty is too high for production. But if you have a staging environment, consider running with -race there.
An important detail: the race detector finds races that occur during execution. If a race condition only manifests under high load and your tests don’t generate that load, the detector won’t find it. That’s why it’s important to have tests that exercise the concurrent paths of your code.
func TestConcurrentAccess(t *testing.T) {
counter := &SafeCounter{}
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
wg.Add(1)
go func() {
defer wg.Done()
for j := 0; j < 100; j++ {
counter.Increment()
_ = counter.Value()
}
}()
}
wg.Wait()
if counter.Value() != 10000 {
t.Errorf("expected 10000, got %d", counter.Value())
}
}
This test doesn’t just verify the result: when run with -race, it also verifies there are no unsynchronized accesses.
Goroutines and HTTP servers: one goroutine per request
If you use net/http (or frameworks like Gin, Chi, Echo), each HTTP request is handled in its own goroutine. You don’t have to do anything for this to happen. The standard Go server launches a goroutine for each incoming connection.
func main() {
http.HandleFunc("/api/users", handleUsers)
http.ListenAndServe(":8080", nil)
}
func handleUsers(w http.ResponseWriter, r *http.Request) {
// this function is already running in its own goroutine
// you don't need to launch an additional goroutine to handle the request
}
This has practical implications:
- Your handler is already concurrent. A thousand simultaneous requests means a thousand goroutines executing your handler. If your handler accesses mutable global state (a package variable, a shared map), you need synchronization.
- The request context is cancelled when the client disconnects. r.Context() gives you a context that is cancelled if the client closes the connection. Pass it to all your downstream operations (queries, HTTP calls, etc.).
- You can launch goroutines inside the handler, but be careful. If you launch a goroutine that outlives the handler, you need to make sure it doesn’t use the http.ResponseWriter or *http.Request after the handler returns, because they will be recycled.
func handleUsers(w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
// GOOD: goroutines that finish before the handler returns
g, ctx := errgroup.WithContext(ctx)
var users []User
var count int
g.Go(func() error {
var err error
users, err = getUsers(ctx)
return err
})
g.Go(func() error {
var err error
count, err = getUserCount(ctx)
return err
})
if err := g.Wait(); err != nil {
http.Error(w, "internal error", http.StatusInternalServerError)
return
}
json.NewEncoder(w).Encode(map[string]any{
"users": users,
"count": count,
})
}
A mistake I often see in beginner code:
func handleUsers(w http.ResponseWriter, r *http.Request) {
// BAD: goroutine that writes to w after the handler returns
go func() {
users, _ := getUsers(r.Context())
json.NewEncoder(w).Encode(users) // w may have been recycled
}()
// handler returns immediately, the ResponseWriter is no longer valid
}
If you need to do background work that outlives the request (sending an email, updating a cache), don’t use the ResponseWriter or the Request. Copy the data you need and use an independent context.
func handleOrder(w http.ResponseWriter, r *http.Request) {
order := processOrder(r)
// Respond to the client immediately
json.NewEncoder(w).Encode(order)
// Background work: use context.Background(), not r.Context()
go func(orderID string) {
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
sendConfirmationEmail(ctx, orderID)
}(order.ID)
}
Performance: how many goroutines are too many
The short answer, and probably not the one you expect: more than you think you can have.
I’ve seen benchmarks with millions of goroutines running simultaneously. Each goroutine starts with a stack of about 2 KB (the initial size since Go 1.4), so a million idle goroutines consume on the order of 2 GB of memory just in stacks, and more if those stacks have grown. In practice, for a typical backend service, having tens of thousands of active goroutines is completely normal and shouldn’t worry you.
What should worry you is not the number of goroutines but what they do:
- Goroutines waiting for I/O: they’re cheap. A goroutine blocked on a network read consumes almost no CPU. You can have thousands.
- Goroutines doing CPU work: they’re expensive. If you have 8 cores and 10,000 goroutines doing calculations, only 8 can run simultaneously. The rest wait. Scheduling overhead starts to matter.
- Goroutines that create more goroutines without limit: dangerous. If each request launches N goroutines and you receive M requests, you have M*N goroutines. If N or M are large, you can run out of memory.
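For CPU-bound work, one common way to avoid the last two problems is to use a fixed number of workers instead of one goroutine per item. Here is a minimal sketch, with a hypothetical sumSquares function standing in for real computation and the worker count taken from runtime.GOMAXPROCS(0).

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumSquares splits CPU-bound work across a fixed number of workers
// instead of launching one goroutine per item.
func sumSquares(nums []int, workers int) int {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan int)
	partial := make([]int, workers) // one slot per worker: no shared writes

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for n := range jobs {
				partial[id] += n * n
			}
		}(w)
	}

	for _, n := range nums {
		jobs <- n
	}
	close(jobs) // lets the workers' range loops end
	wg.Wait()

	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	nums := make([]int, 1000)
	for i := range nums {
		nums[i] = i + 1
	}
	fmt.Println(sumSquares(nums, runtime.GOMAXPROCS(0))) // prints 333833500
}
```

The number of goroutines stays constant no matter how many items arrive, so memory use and scheduling overhead stay bounded.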
Monitoring goroutines
In production, monitor the number of active goroutines:
import "runtime"
func metricsHandler(w http.ResponseWriter, r *http.Request) {
fmt.Fprintf(w, "goroutines: %d\n", runtime.NumGoroutine())
}
If you see the number growing continuously without dropping, you have a goroutine leak. It’s one of the first metrics I configure in any Go service.
With pprof, you can inspect exactly what your goroutines are doing:
import _ "net/http/pprof"
func main() {
go http.ListenAndServe(":6060", nil)
// your main server on another port
}
Then you can access http://localhost:6060/debug/pprof/goroutine?debug=1 to see a dump of all active goroutines, grouped by stack trace. It’s invaluable for diagnosing leaks.
GOMAXPROCS
GOMAXPROCS controls how many OS threads can execute Go code simultaneously (threads blocked in system calls don’t count against the limit). By default, it’s the number of available CPUs. You rarely need to change it, but it’s good to know it exists.
import "runtime"
func main() {
fmt.Println("CPUs:", runtime.NumCPU())
fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0)) // 0 = query only, don't change
}
In Docker containers, older Go versions took the number of CPUs from the host instead of the container’s CPU limit. If your container was limited to 2 CPUs but the host had 64, GOMAXPROCS defaulted to 64. Uber’s automaxprocs library was the standard solution. Since Go 1.25, the runtime on Linux takes the container’s cgroup CPU limit into account when choosing the default.
Next steps: channels, context, worker pools
Concurrency in Go doesn’t end with goroutines and mutexes. In fact, I’ve just covered the basics. But things change considerably when you incorporate the mechanisms that make concurrency in Go truly powerful:
Channels in Go: the communication mechanism between goroutines. Buffered vs unbuffered, directionality, the select pattern, and how to close channels safely. If sync.WaitGroup and sync.Mutex are level 1 of concurrency in Go, channels are level 2.
Context in Go: cancellation, timeouts and value propagation. In a backend service, the context is what allows all downstream operations to stop when a client disconnects. It’s the invisible glue that keeps concurrency under control.
Worker pools in Go: when you need to process thousands of tasks with limited concurrency. Workers that read from one channel, results sent through another channel, and clean cancellation. It’s the most important pattern for batch processing.
What I recommend: practice first with WaitGroup and Mutex until they come naturally. Write tests with -race for everything. When that’s natural, move on to channels. And when you master channels, worker pools and context fall into place.
Concurrency in Go is not magic. It’s a simple model with simple tools that, combined correctly, let you write backend that makes the most of your hardware resources. The complexity doesn’t disappear — it would be naive to think so — but for the first time in a long time, I feel like the language is on my side instead of against me. And not because Go is better than everything else. But because its concurrency model fits naturally with what I need to build most days.


