Here are some takeaway lessons, learnt the hard way, from automating scripts and programs:
Plan to Fail Elegantly
Things may flow perfectly for now, but when something does mess up later on (e.g. a third-party service becoming unavailable), the code should be able to respond and/or fail gracefully.
The effort required can range from a simple if-else statement (e.g. a null check) to a full-fledged rewrite of the code (e.g. changing the logic flow). It is therefore worth tackling the what-ifs right at the start of planning.
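As a minimal sketch of failing gracefully, here is a fetch that degrades to `None` instead of crashing the whole run when a service is down. The URL, timeout value, and function name are illustrative choices, not a prescription:

```python
# Sketch: degrade gracefully when a third-party service is unavailable.
import urllib.request
import urllib.error

def fetch_or_none(url, timeout=5):
    """Return the response body, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except (urllib.error.URLError, TimeoutError):
        # The caller decides what to do with a missing result,
        # instead of the whole script dying here.
        return None
```

The caller can then null-check the result and skip, retry, or alert as appropriate.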
Develop Built-in Debugging
Being able to debug without affecting “production” is important too. One of my methods is a simple flag passed at the command line: if the flag is present, debugging messages are turned on; otherwise they are hidden.
This is especially useful when returning to the code 3 months later, and trying to solve a bug that has crippled the system.
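A minimal sketch of such a flag, using only the standard library; the flag name (`--debug`) and log format are illustrative:

```python
import argparse
import logging

def parse_cli(argv=None):
    """Parse command-line options; --debug turns on verbose output."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--debug", action="store_true",
                        help="print debugging messages")
    return parser.parse_args(argv)

# e.g. invoked as `python script.py --debug`
args = parse_cli(["--debug"])
logging.basicConfig(
    level=logging.DEBUG if args.debug else logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
logging.debug("visible only when --debug is given")
logging.info("always visible")
```

Without the flag, the `logging.debug` calls stay silent, so they can be left in the code permanently.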
Record and Make Available Execution Logs
It is a good idea to record the output of code execution so that bugs/errors can be traced back to when it first started. Being able to know (roughly) when certain events happened can help significantly narrow down the possible causes.
For example, in one of my scripts, the target system had changed its string format, causing the regex matching to break. I noticed the anomaly only 2 weeks later when the script output had dropped significantly. Being able to trace back allowed me to understand the error, and to design a solution that was more robust and resilient to similar errors.
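A sketch of what that kind of logging might look like for a regex-based step; the log file name and price pattern are hypothetical stand-ins for the real target format:

```python
import logging
import re

logging.basicConfig(
    filename="script.log",            # hypothetical log file
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

PRICE_RE = re.compile(r"\$(\d+\.\d{2})")   # hypothetical target format

def extract_price(text):
    """Extract a price string, logging a timestamped warning on mismatch."""
    match = PRICE_RE.search(text)
    if match is None:
        # Record the unexpected input instead of failing silently, so a
        # format change can be traced back to when it first appeared.
        logging.warning("regex did not match: %r", text[:80])
        return None
    return match.group(1)
```

When the target changes its format, the log then shows the first timestamp at which the warnings began, rather than leaving you to notice a drop in output weeks later.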
Allow Manual Runs
Automation is great, but being able to run the script/program on demand is also a great feature to have, especially for debugging outside the script’s intended operating hours, or when a special request has been made.
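One way to sketch this is an override flag that bypasses the schedule check; the flag name (`--now`) and the 9-to-5 operating hours are assumptions for illustration:

```python
import argparse
import datetime

def should_run(argv=None, now=None):
    """Run during scheduled hours, or immediately when --now is given."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--now", action="store_true",
                        help="run immediately, ignoring the schedule")
    args = parser.parse_args(argv)
    if args.now:
        return True                      # manual on-demand run
    now = now or datetime.datetime.now()
    return 9 <= now.hour < 17            # hypothetical operating hours
```

The scheduled job calls the script with no flag; a human debugging at 3 a.m. adds `--now`.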
Use Semaphores/Locks to Handle Concurrency
If the run time of the script(s) is not guaranteed, use semaphores or locks to ensure mutual exclusion over shared resources, code sections, storage, etc. Otherwise, a new scheduled run may start before the previous one has finished.
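For separate script invocations (as opposed to threads in one process), a file-based lock is one way to do this. A sketch using `fcntl.flock`, which is POSIX-only; the lock-file path is an illustrative choice:

```python
# Sketch: a file lock so overlapping scheduled runs don't clash.
import fcntl

def run_exclusively(task, lock_path="/tmp/myscript.lock"):
    """Run task() only if no other instance holds the lock."""
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking: a second concurrent run exits immediately
            # instead of queueing up behind the first.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            print("another instance is already running; exiting")
            return False
        try:
            task()
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

The non-blocking choice matters: for a scheduled job, skipping a run is usually safer than letting delayed runs pile up waiting on the lock.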
Automation allows us to run scripts and programs at shorter, regular intervals without requiring manual initiation/intervention. However, we should also be mindful that having too many concurrent sessions and/or a high request rate may amount to a small-scale DoS attack (or similar) on the target system. You may inconvenience other users, and/or get your IP banned.
To get around this, use timer delays between operations to spread out the traffic/load sent to the target over time. Of course, this also means the same task will take a little longer to complete.
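A minimal sketch of this pattern; the delay value and the injected `fetch` function are illustrative assumptions:

```python
import time

def fetch_all(urls, fetch, delay_seconds=2.0):
    """Call fetch(url) for each URL, pausing between requests
    so the load on the target is spread out over time."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(delay_seconds)   # throttle between requests
        results.append(fetch(url))
    return results
```

For n requests this adds roughly (n - 1) × delay to the total run time, which is the trade-off mentioned above.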
These lessons are born out of my own experiences and mistakes. Do feel free to share your own mistakes and lessons if you have some!