Scaling is not an algorithm but a mechanism

In 2025 many patterns and paradigms are still not in the heads of developers. In many cases developers tend to keep competency for making algorithms faster inside the software. In some cases it makes sense but in the area of a classical server application with the 3 tier model frontend, middleware, database there is normally no place for parallelism, locks, mutexes, semaphores and such to make algorithms faster. In todays world we have x-scaling, y-scaling, z-scaling which is done by the cloud infrastructure. And thats why we have it, right?

Whenever you as developer do not implement the desired algorithm but start thinking about how to make algorithms faster there are a few questions which give answers where the scaling must happen:

  1. Is the algorithm I am writing reducable to an optimization problem which is already solved. This is e.g. sorting of elements, searching data, check for existence, etc. If it is reducable then use the already known and established algorithms.
  2. If (1) does not fit and you are really implementing something custom think about doing it in the right place. E.g. when you are going to import a big json file into a database you could for sure open that file in the middleware, process each and every record in there and send it to the database which then gets stored. From my personal experience i can tell that when the JSON is passed over to the database and processed there the amount of time is reduced by up to 99%. Yes, you read correct. I had a project where a JSON file in the end was processed in a stored procedures in a DB and did not take 10 minutes but only a few seconds.
  3. If (2) also does not hold and there is really something that is custom implementation which cannot be relocated but must be really scaled then do it via infrastructure x-scaling / horizontal scaling. I am talking here about queues which must be processed where ich message there is an email or something like that. Do not do the sending of the mails via some parallelism in your middleware. Let the infrastructure scale that for you.
  4. If (3) does not hold or could be done more efficient then think about probabilistic data structures. What I mean is e.g. that you want to check very fast if a value in a data structure exists. There are wonderful algorithms like HyperLogLog, BloomFilter, SlimHash, KineticHanger, Cuckoo Filter and such which already do a great job.
  5. If all of the above do not hold, feel free to do your own algorithm but always estimation runtimes with the big O notation to raise the awareness of time and space of your algorithm.