Python's dict.get Method Is Not a Mechanism for Fault Tolerance
Scenarios where you shouldn't use dict.get()
Python’s dict.get method is syntactic sugar that saves you from having to check whether a key exists.
# This
if "key" in d:
    return d["key"]
else:
    return "fallback"
# is equivalent to this
return d.get("key", "fallback")
But it is not a mechanism for fault tolerance.
If your program relies on the existence of some value to work properly, using .get alone is not sufficient for handling the missing value. This sounds obvious. “My program relies on some value to function, therefore I can’t just continue executing my program as if nothing happened when that value is unexpectedly missing.” But unfortunately, I come across code that does this all the time, particularly in web development. It usually looks like:
import requests

response = requests.get('http://your-domain.com/latest_message/')
# Check if the request was successful (status code 200)
if response.status_code == 200:
    # Extract the "author" and "content" fields from the response JSON
    json_data = response.json()
    author = json_data.get('author')
    content = json_data.get('content')
    # Call the process_message function with author and content
    process_message(author, content)
else:
    print("Failed to retrieve the latest message.")
The problem with this code is that:
1. The code runs in an invalid state for longer than it needs to. The moment the expected values are missing, the code is no longer functioning as we expect. When the program eventually crashes, it will do so further from the source of the error than necessary, making the issue harder to diagnose.
2. It introduces the possibility that API contract violations by the /latest_message/ endpoint will fail silently. If the process_message function in the above code also doesn’t check the type or value of author or content, we might be processing empty messages and sending them into the abyss without knowing.
3. Using .get to retrieve a required value may give other programmers in the codebase the impression that:
   - The value is optional.
   - .get has a purpose in this context, like preventing the program from crashing in the case where the values don’t exist, even though this should be done by an entirely different mechanism, like response validation or automated unit tests on the source service.
   As a result, your fellow collaborators copy this pattern and propagate the above issues further throughout the codebase.
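The silent failure described above can be reproduced without any network call. The following is a minimal sketch: the body of process_message and the payload are hypothetical stand-ins, since the original only shows the call site.

```python
def process_message(author, content):
    # Hypothetical stand-in for the real function: it does no validation.
    return f"{author}: {content}"


# A payload that violates the contract: "content" is missing.
json_data = {"author": "alice"}

author = json_data.get("author")
content = json_data.get("content")  # silently becomes None

# No exception is raised, so a malformed message flows downstream.
result = process_message(author, content)
print(result)  # alice: None
```

Nothing in this run signals that anything went wrong. The missing field only shows up later, as the string "None" inside an otherwise normal-looking message.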
To avoid these issues, you should fail fast. The moment an expected piece of data is missing, the program should scream that something has gone terribly wrong, alert all parties involved, and die. It’s not the responsibility of your code to handle an API contract violation by some external system, nor is it possible.
If you sign a contract with a plumber to replace your sink but instead they install a sauna, you’re not the one responsible for the mistake. It’s the responsibility of the plumber to make sure these types of mistakes don’t happen in the first place by carefully reviewing the contract before starting the job. Similarly, it’s the responsibility of the external system providing the data to have automated tests that guarantee the expected values are present.
Save yourself the endless philosophical debates about whether a program running with incorrect behavior is better than a program crashing entirely. Use default dictionary access (e.g. json_data["author"]) for retrieving data that’s required for the program to work. Fail fast.
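As a rough illustration of the fail-fast version, here is a sketch that reads the required fields with plain indexing. The helper name extract_message and the sample payloads are assumptions made for the example. The try/except is there only to show what gets raised; in real code you would normally let the exception propagate.

```python
def extract_message(json_data):
    """Return (author, content); raise KeyError if either is missing."""
    # Plain indexing fails immediately, at the source of the problem.
    author = json_data["author"]
    content = json_data["content"]
    return author, content


# A well-formed payload passes through unchanged.
ok = extract_message({"author": "alice", "content": "hello"})

# A payload missing "content" fails at the access and names the missing key.
try:
    extract_message({"author": "alice"})
except KeyError as exc:
    missing_key = exc.args[0]
    print(f"missing required field: {missing_key}")
```

The error is raised on the exact line that reads the missing field, so the traceback points directly at the contract violation instead of at some later step that happened to choke on None.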