I Used Global Variables for “Convenience” (And Created Bugs I Couldn’t Reproduce)
The code needed configuration values.
# config.py
DATABASE_URL = "postgresql://localhost/mydb"
API_KEY = "sk_live_abc123"
DEBUG = True

# app.py
import psycopg2
import requests

import config

def connect_database():
    return psycopg2.connect(config.DATABASE_URL)

def call_api(endpoint):
    return requests.get(f"https://api.example.com/{endpoint}",
                        headers={'X-API-Key': config.API_KEY})
Clean. Simple. Every module could access configuration through import config. No passing parameters everywhere. Convenient.
Then the bug reports started.
“The app works fine on my machine but not in production.” “Tests pass locally but fail in CI.” “Feature works when run alone but breaks when run after other tests.”
I couldn’t reproduce any of them. The code worked perfectly for me. But users were experiencing random, intermittent failures.
Then I discovered the problem: global variables.
# test_api.py
import config

def test_with_mock_api():
    # Override config for testing
    config.API_KEY = "test_key_123"
    result = call_api('users')
    assert result.status_code == 200

# test_database.py
import config

def test_with_test_database():
    # Override config for testing
    config.DATABASE_URL = "postgresql://localhost/testdb"
    result = query_users()
    assert len(result) > 0
Tests were modifying global state. When run in sequence, they interfered with each other. The order of test execution determined which tests passed.
# Run test_api first, then test_database
pytest test_api.py test_database.py # Both pass
# Run test_database first, then test_api
pytest test_database.py test_api.py # test_api fails!
# Why?
# test_database changed DATABASE_URL globally
# test_api used the changed value
# Results were unpredictable
That month of hunting down race conditions and weird test failures taught me the most painful lessons about global state.
Global variables aren’t convenient. They’re invisible dependencies that make code impossible to reason about. And they create bugs that only appear in specific circumstances.
Let me show you all the ways global variables destroyed my code.
The Race Condition That Only Happened in Production
I had a counter as a global variable.
# counter.py
request_count = 0

def track_request():
    global request_count
    request_count += 1
    return request_count

# app.py
import logging

from counter import track_request

@app.route('/api/endpoint')
def handle_request():
    count = track_request()
    logging.info(f"Request #{count}")
    return process_request()
Worked perfectly in development. Every request got a unique sequential number.
In production, with multiple threads, request counts were wrong.
# Thread 1 and Thread 2 run simultaneously
# Thread 1: Read request_count (0)
# Thread 2: Read request_count (0)
# Thread 1: Add 1, get 1
# Thread 2: Add 1, get 1
# Both threads see count = 1
# Later
# Thread 3: Read request_count (1)
# Thread 4: Read request_count (1)
# Thread 3: Add 1, get 2
# Thread 4: Add 1, get 2
# Counts are duplicated!
# Logs show
# Request #1
# Request #1 ← Duplicate!
# Request #2
# Request #2 ← Duplicate!
Global variables are not thread-safe. Multiple threads accessing the same global create race conditions.
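Here is a minimal, runnable sketch of that lost-update race. The sleep is not realistic; it just holds the read-modify-write window open long enough that the bad interleaving happens every time:

```python
import threading
import time

request_count = 0

def track_request():
    global request_count
    current = request_count       # read
    time.sleep(0.05)              # other threads sneak in here
    request_count = current + 1   # write back a stale value

threads = [threading.Thread(target=track_request) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(request_count)  # far less than 10: most updates were lost
```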
import threading

request_count = 0
count_lock = threading.Lock()

def track_request():
    global request_count
    with count_lock:  # Only one thread at a time
        request_count += 1
        return request_count  # Return while still holding the lock
But this introduced another problem: lock contention. Every request had to wait for the lock. Under load, performance tanked.
# Better: thread-local storage
import threading

thread_local = threading.local()

def track_request():
    if not hasattr(thread_local, 'request_count'):
        thread_local.request_count = 0
    thread_local.request_count += 1
    return thread_local.request_count

# Each thread has its own counter
# No race conditions
# No lock contention
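A quick way to convince yourself that each thread really does get its own counter (the worker and iteration counts here are arbitrary):

```python
import threading

thread_local = threading.local()
results = {}

def track_request():
    if not hasattr(thread_local, 'request_count'):
        thread_local.request_count = 0
    thread_local.request_count += 1
    return thread_local.request_count

def worker(name):
    count = 0
    for _ in range(5):
        count = track_request()
    results[name] = count  # each thread writes its own key

threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # every value is 5: no thread saw another's increments
```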
Or don't use globals at all:
import threading

class RequestTracker:
    def __init__(self):
        self.count = 0
        self.lock = threading.Lock()

    def track(self):
        with self.lock:
            self.count += 1
            return self.count

# Create one instance per application
tracker = RequestTracker()

@app.route('/api/endpoint')
def handle_request():
    count = tracker.track()
    logging.info(f"Request #{count}")
    return process_request()
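And a quick stress test (thread and iteration counts are arbitrary) showing the locked tracker never loses an update:

```python
import threading

class RequestTracker:
    def __init__(self):
        self.count = 0
        self.lock = threading.Lock()

    def track(self):
        with self.lock:
            self.count += 1
            return self.count

tracker = RequestTracker()

def hammer():
    for _ in range(10_000):
        tracker.track()

threads = [threading.Thread(target=hammer) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(tracker.count)  # exactly 40000: no lost updates
```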
The Cache That Leaked Memory
I created a global cache for performance.

# cache.py
_cache = {}

def cache_set(key, value):
    _cache[key] = value

def cache_get(key):
    return _cache.get(key)

# user_service.py
from cache import cache_set, cache_get

def get_user(user_id):
    cached = cache_get(f"user:{user_id}")
    if cached:
        return cached
    user = database.query("SELECT * FROM users WHERE id = %s", user_id)
    cache_set(f"user:{user_id}", user)
    return user
Worked great. Cached users, fast lookups.
In production, memory usage grew continuously. The process used 8GB, then 16GB, then 32GB. Eventually it crashed.
# After 1 hour
_cache = {
    'user:1': {...},
    'user:2': {...},
    'user:3': {...},
    # ... 10,000 users cached
}

# After 1 day
_cache = {
    'user:1': {...},
    'user:2': {...},
    # ... 240,000 users cached
    # 5GB of memory
}

# After 1 week
# 1,680,000 cached users
# 32GB of memory
# Server crashes
The cache never cleared itself. Every unique user_id was cached forever. Global state accumulated infinitely.
from functools import lru_cache

@lru_cache(maxsize=1000)  # Limit cache size
def get_user(user_id):
    return database.query("SELECT * FROM users WHERE id = %s", user_id)
# Or use time-based expiration
from datetime import datetime, timedelta

class ExpiringCache:
    def __init__(self, ttl_seconds=300):
        self._cache = {}
        self._expiry = {}
        self.ttl = timedelta(seconds=ttl_seconds)

    def set(self, key, value):
        self._cache[key] = value
        self._expiry[key] = datetime.now() + self.ttl

    def get(self, key):
        if key not in self._cache:
            return None
        if datetime.now() > self._expiry[key]:
            # Expired entries are purged lazily, on lookup
            del self._cache[key]
            del self._expiry[key]
            return None
        return self._cache[key]

# One cache instance
cache = ExpiringCache(ttl_seconds=300)

def get_user(user_id):
    cached = cache.get(f"user:{user_id}")
    if cached:
        return cached
    user = database.query("SELECT * FROM users WHERE id = %s", user_id)
    cache.set(f"user:{user_id}", user)
    return user
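functools.lru_cache also exposes hit/miss statistics, which makes the bounded behaviour easy to verify. A sketch with a deliberately tiny maxsize (the fetch function is a stand-in for a real lookup):

```python
from functools import lru_cache

@lru_cache(maxsize=2)  # deliberately tiny to show eviction
def fetch(key):
    return f"value-for-{key}"

fetch("a")   # miss: computed and cached
fetch("b")   # miss
fetch("a")   # hit: still cached
fetch("c")   # miss: evicts "b", the least recently used entry
fetch("b")   # miss again: it was evicted

info = fetch.cache_info()
print(info)  # CacheInfo(hits=1, misses=4, maxsize=2, currsize=2)
```

Memory stays bounded no matter how many distinct keys are requested.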
The Test That Changed Global State
I wrote a test that modified a global.
# settings.py
DEBUG = False
MAX_RETRIES = 3
TIMEOUT = 30

# test_api.py
import settings

def test_api_with_debug():
    # Enable debug mode for this test
    settings.DEBUG = True
    result = call_api_endpoint()
    assert 'debug_info' in result

# test_retry.py
import settings

def test_retry_logic():
    # This test expects DEBUG = False
    result = call_api_endpoint()
    assert 'debug_info' not in result

# If test_api runs first,
# settings.DEBUG is now True
# and test_retry fails!
Tests modified global state. The modifications persisted across tests. Test order mattered.
# Test isolation broken
pytest test_api.py test_retry.py # test_retry fails
pytest test_retry.py test_api.py # Both pass
pytest test_retry.py # Passes when run alone
pytest test_api.py test_retry.py # Fails when run after test_api
I needed to save and restore state.
import settings
import pytest

@pytest.fixture
def restore_settings():
    # Save original values
    original_debug = settings.DEBUG
    original_retries = settings.MAX_RETRIES
    original_timeout = settings.TIMEOUT
    yield  # Test runs here
    # Restore original values
    settings.DEBUG = original_debug
    settings.MAX_RETRIES = original_retries
    settings.TIMEOUT = original_timeout

def test_api_with_debug(restore_settings):
    settings.DEBUG = True
    result = call_api_endpoint()
    assert 'debug_info' in result

def test_retry_logic(restore_settings):
    result = call_api_endpoint()
    assert 'debug_info' not in result
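The save-and-restore pattern can be packaged as a small context manager so you write it once. pytest's built-in monkeypatch fixture does the same job automatically; this sketch uses a SimpleNamespace as a stand-in for the settings module:

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Stand-in for the settings module
settings = SimpleNamespace(DEBUG=False, MAX_RETRIES=3, TIMEOUT=30)

@contextmanager
def override_settings(**overrides):
    # Save current values, apply the overrides,
    # and restore on exit even if the test raises
    saved = {name: getattr(settings, name) for name in overrides}
    try:
        for name, value in overrides.items():
            setattr(settings, name, value)
        yield settings
    finally:
        for name, value in saved.items():
            setattr(settings, name, value)

with override_settings(DEBUG=True, MAX_RETRIES=5):
    assert settings.DEBUG is True
    assert settings.MAX_RETRIES == 5

print(settings.DEBUG, settings.MAX_RETRIES)  # False 3: restored
```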
Or better, don't modify globals in tests at all.
# Instead of modifying globals
def test_api_with_debug():
    settings.DEBUG = True
    result = call_api_endpoint()

# Pass configuration explicitly
def test_api_with_debug():
    config = {'DEBUG': True}
    result = call_api_endpoint(config)

# Or use dependency injection
def call_api_endpoint(debug=False):
    if debug:
        return {'data': '...', 'debug_info': '...'}
    return {'data': '...'}

def test_api_with_debug():
    result = call_api_endpoint(debug=True)
    assert 'debug_info' in result
The Singleton That Wasn’t
I created a singleton pattern with a global.
# database.py
_db_instance = None

def get_database():
    global _db_instance
    if _db_instance is None:
        _db_instance = DatabaseConnection()
    return _db_instance

# Multiple modules use it
from database import get_database

def save_user(user):
    db = get_database()
    db.execute("INSERT INTO users ...")

def get_orders():
    db = get_database()
    return db.query("SELECT * FROM orders")
I thought I had one database connection. In reality, I had one per process.
# With multiprocessing
from multiprocessing import Process

def worker():
    db = get_database()
    # Each process creates its own connection
    # _db_instance is not shared between processes

# In process 1
get_database()  # Creates connection A

# In process 2
get_database()  # Creates connection B (different instance!)

# Two separate connections
# Not a singleton across processes
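This is easy to demonstrate: a child process gets its own copy of every module-level global, so nothing it changes ever propagates back to the parent. A minimal sketch:

```python
import multiprocessing as mp

request_count = 0

def bump():
    global request_count
    request_count += 100  # modifies only the child's copy

if __name__ == "__main__":
    p = mp.Process(target=bump)
    p.start()
    p.join()
    print(request_count)  # still 0 in the parent
```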
And there were thread-safety issues:
# Thread 1 and Thread 2 run simultaneously
def get_database():
    global _db_instance
    if _db_instance is None:  # Thread 1 checks: None
                              # Thread 2 checks: None
        _db_instance = DatabaseConnection()  # Thread 1 creates a connection
        _db_instance = DatabaseConnection()  # Thread 2 creates another (overwrites)
    return _db_instance

# Two connections created
# One is immediately orphaned and never closed
A thread-safe singleton:
import threading

_db_instance = None
_db_lock = threading.Lock()

def get_database():
    global _db_instance
    if _db_instance is None:
        with _db_lock:
            # Double-check inside the lock
            if _db_instance is None:
                _db_instance = DatabaseConnection()
    return _db_instance
But better still: don't use singletons or globals.
class Application:
    def __init__(self):
        self.db = DatabaseConnection()

    def save_user(self, user):
        self.db.execute("INSERT INTO users ...")

    def get_orders(self):
        return self.db.query("SELECT * FROM orders")

# Create one instance
services = Application()

# Pass it where needed
@app.route('/orders')
def orders_endpoint():
    return services.get_orders()
The Import Side Effect That Broke Everything
I had initialization code at module level.
# config.py
import os
DATABASE_URL = os.environ['DATABASE_URL'] # Read from environment
API_KEY = os.environ['API_KEY']
# Initialize connection at import time
db_connection = connect_to_database(DATABASE_URL)
Every import of config executed this code. Even if I just wanted to look at the module.
# test_config.py
import config # This crashes!
# Why?
# Environment variables aren't set
# KeyError: 'DATABASE_URL'
# Even if I'm just testing something else
import other_module
# other_module imports config
# config tries to read environment variables
# Crash!
Import side effects made the code impossible to test or even import without full environment setup.
# Can't do this in tests
import config
config.DATABASE_URL = "postgresql://localhost/testdb"
# Because it already tried to connect during import
# The connection was made with os.environ['DATABASE_URL']
# Setting it after import does nothing
I had to set up the entire environment just to import the module.
# test_config.py
import os
# Must set environment before importing
os.environ['DATABASE_URL'] = 'postgresql://localhost/testdb'
os.environ['API_KEY'] = 'test_key'
import config # Now it works
# But this affects all other tests
# Global environment pollution
The fix: lazy initialization.
# config.py
import os

_db_connection = None

def get_database_url():
    return os.environ.get('DATABASE_URL', 'postgresql://localhost/defaultdb')

def get_api_key():
    return os.environ.get('API_KEY', 'default_key')

def get_db_connection():
    global _db_connection
    if _db_connection is None:
        _db_connection = connect_to_database(get_database_url())
    return _db_connection

# Now importing doesn't crash
# The connection is only created when actually needed
Or use a class:
class Config:
    def __init__(self, database_url=None):
        self._database_url = database_url  # Optional override
        self._db_connection = None

    @property
    def database_url(self):
        return self._database_url or os.environ.get(
            'DATABASE_URL', 'postgresql://localhost/defaultdb')

    @property
    def db_connection(self):
        if self._db_connection is None:
            self._db_connection = connect_to_database(self.database_url)
        return self._db_connection

# Create config when needed
config = Config()

# Tests can override explicitly
config_test = Config(database_url='postgresql://localhost/testdb')
The Hidden Dependency That Made Testing Impossible
My function used a global without declaring it.
# config.py
API_ENDPOINT = "https://api.production.com"

# api.py
def call_api(path):
    # Uses the global config without declaring the dependency
    import config
    url = f"{config.API_ENDPOINT}/{path}"
    return requests.get(url)

# Looks fine
result = call_api('users')
Then testing turned up a surprise.

# test_api.py
import config
config.API_ENDPOINT = "https://api.test.com"

from api import call_api
result = call_api('users')
# This version actually picks up the override:
# `import config` inside the function looks up
# config.API_ENDPOINT at call time.

The dangerous variant looks almost identical:

# api.py
from config import API_ENDPOINT  # Copies the binding at import time

def call_api(path):
    # Implicitly uses the module-level name
    url = f"{API_ENDPOINT}/{path}"
    return requests.get(url)

# test_api.py
import config
config.API_ENDPOINT = "https://api.test.com"

from api import call_api
result = call_api('users')
# STILL hits the production API!
# `from config import API_ENDPOINT` bound the name inside api.py
# when api was first imported. Rebinding config.API_ENDPOINT
# afterwards doesn't touch api's copy; you'd have to set
# api.API_ENDPOINT instead.
Hidden dependencies are invisible in the function signature.
# This function looks like it only depends on 'path'
def call_api(path):
    url = f"{API_ENDPOINT}/{path}"
    return requests.get(url)

# But it secretly depends on the API_ENDPOINT global
# Impossible to see without reading the implementation
Make dependencies explicit:
# Dependency is visible in the signature
def call_api(path, endpoint="https://api.production.com"):
    url = f"{endpoint}/{path}"
    return requests.get(url)

# Testing is easy
def test_api():
    result = call_api('users', endpoint='https://api.test.com')
    assert result.status_code == 200

# Production uses the default
result = call_api('users')

# Tests override it
result = call_api('users', endpoint='https://mock.api')
Or use dependency injection:
class APIClient:
    def __init__(self, endpoint="https://api.production.com"):
        self.endpoint = endpoint

    def call_api(self, path):
        url = f"{self.endpoint}/{path}"
        return requests.get(url)

# Production
client = APIClient()
result = client.call_api('users')

# Testing
test_client = APIClient(endpoint='https://api.test.com')
result = test_client.call_api('users')
The Monkey Patch That Had Permanent Effects
I monkey-patched a global for testing.
# production_module.py
import datetime

def get_current_time():
    return datetime.datetime.now()

# test_module.py
import datetime
import production_module

class FakeDatetime(datetime.datetime):
    @classmethod
    def now(cls, tz=None):
        return cls(2024, 1, 1, 12, 0, 0)

def test_time_dependent_function():
    # Mock the current time by replacing the class on the datetime
    # module (datetime.datetime.now itself can't be reassigned:
    # it's an attribute of a C extension type)
    original_datetime = datetime.datetime
    datetime.datetime = FakeDatetime
    result = production_module.get_current_time()
    assert result.day == 1
    # Forgot to restore!
    # datetime.datetime = original_datetime

# All subsequent tests use the mocked time!
def test_something_else():
    now = datetime.datetime.now()
    print(now)  # 2024-01-01 12:00:00
    # Forever stuck in the past!
Monkey patching modifies global state. If not restored, it affects everything.
import datetime
import pytest

@pytest.fixture
def freeze_time():
    original_datetime = datetime.datetime

    class FakeDatetime(datetime.datetime):
        @classmethod
        def now(cls, tz=None):
            return cls(2024, 1, 1, 12, 0, 0)

    datetime.datetime = FakeDatetime
    yield
    datetime.datetime = original_datetime  # Always restore

def test_time_dependent_function(freeze_time):
    result = get_current_time()
    assert result.day == 1
Or use a library designed for this:
from freezegun import freeze_time

@freeze_time("2024-01-01 12:00:00")
def test_time_dependent_function():
    result = get_current_time()
    assert result.day == 1

# Automatically restored after the test
What I Learned About Global State
After debugging impossible-to-reproduce bugs for weeks, I learned that global variables are not about convenience. They're about hidden dependencies and invisible coupling.
Problems with globals:
- Not thread-safe (race conditions)
- Not process-safe (each process has its own)
- Make testing hard (shared state between tests)
- Make code hard to understand (hidden dependencies)
- Cause memory leaks (accumulate forever)
- Create order dependencies (initialization order matters)
When globals are actually okay:
- True constants (NEVER modified)
- Module-level configuration (if immutable)
- Cache with proper size limits and eviction
- Logging configuration
- Read-only lookups (if thread-safe)
The checklist I use now:
Is this actually a global?
☐ Can it be a parameter instead?
☐ Can it be a class attribute instead?
☐ Can it be dependency-injected instead?
If it must be global:
☐ Is it truly constant? (Never modified)
☐ Does it need thread safety?
☐ Does it need process safety?
☐ Does it need size limits?
☐ How will tests handle it?
☐ Are initialization side effects safe?
The golden rule: if you're reaching for the global keyword, you're probably doing something wrong.
What global variable created bugs you couldn't reproduce? Share it below. We've all learned that globals make everything harder.
I Used Global Variables for “Convenience” (And Created Bugs I Couldn’t Reproduce) was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.