Skip to the content.

🚀 The Full-Time Evolution

July 2023 - Present • Sections 6-10 from the original journey

6

🏦 Transition to Full-Time: Entering the Banking Client Universe

When my internship ended and I transitioned to a full-time SDE role, I thought I understood what I was getting into. I was wrong. The banking client stack was a completely different beast – simpler in some ways, but carrying the weight of being one of our most critical clients. This was a bank's critical infrastructure, and there was no room for error.

The banking client's world revolved around two fundamental flows: promotional campaigns that arrived as batch files via SFTP, and real-time transactional notifications that came through socket connections. Each flow had its own personality, its own challenges, and its own requirements for reliability and security.

📁
Promotional Flow

The promotional flow felt like watching a well-orchestrated machine. Files would arrive on our SFTP server, get picked up by cron jobs, transferred to processing VMs where Flume applications would transform and validate the data before feeding it into Kafka. It was batch processing at scale, requiring careful attention to error handling and data integrity.

Transactional Flow

The transactional flow was the complete opposite – real-time, high-stakes, and demanding instant responses. Notifications would arrive through our Node.js socket applications, get authenticated, validated, and immediately pushed into Kafka topics for processing. The margin for error was essentially zero.

🎯 Mission Critical: Banking infrastructure with zero tolerance for errors
7

🔒 The Security Challenge: Understanding Encryption

Working with financial services clients meant that security couldn't be an afterthought. The transactional flow required a sophisticated multi-layered encryption approach that I had to master completely. Each user device maintained its own cryptographic identity through our SDK, with secure key management handled at the backend level. For every notification, we implemented a hybrid encryption strategy that combined the strengths of both symmetric and asymmetric cryptography, ensuring that only the intended device could decrypt and display the message.

The system also needed to track user consent at a granular level – both promotional and transactional notifications required explicit user permission. Our SDK would capture consent from users in the client app and send it to our backend, where it was stored at the user level in Aerospike for validation during notification processing. It was an elegant and secure approach that ensured only authorized devices could decrypt and display notifications.

🔑
RSA Encryption
🛡️
AES Security
User Consent
🔐 Security Architecture: Two-step encryption with device-specific RSA keys and fresh AES keys for each message
8

🚧 The Node.js Performance Wall

As the banking client's usage grew, we started hitting the fundamental limitations of our Node.js push sender. Despite our best optimization efforts – tuning Kafka configurations, optimizing consumer handlers, and addressing rebalancing issues – we could barely push through 5,000 events per minute. For an I/O intensive application, JavaScript's single-threaded nature was becoming a bottleneck we couldn't engineer around.

The Kafka issues were particularly frustrating. Consumer groups would constantly rebalance, disrupting our throughput and causing delivery delays. We tried everything – adjusting timeouts, tuning heartbeat intervals, optimizing our consumer code – but the problems persisted. Eventually, we discovered that the real issue was our Azure HDInsight Kafka setup. Migrating to a standalone Kafka cluster resolved the rebalancing issues, but the fundamental performance limitations remained.

📊
Throughput Limit 5,000 events/min
⚠️
Bottleneck Single-threaded JS
🔍 Root Cause: JavaScript's single-threaded nature became an insurmountable bottleneck for high-throughput processing
9

🚀 Building the Future: The Java Push Sender Revolution

Faced with these limitations, our team made a bold decision – we would rewrite the entire push sender from scratch in Java Spring Boot. This wasn't just about changing languages; it was about reimagining how a high-performance notification system should work.

I took on this challenge with enthusiasm, using AI tools like Windsurf not as a crutch, but as a collaboration partner. I would design the architecture, write the code, and then use AI to review and suggest improvements. Every suggestion was carefully evaluated, modified to match my coding style, and integrated only if it truly improved the solution.

The architecture I built was something I'm genuinely proud of. Using SOLID principles and carefully chosen design patterns, I created a system of abstractions and interfaces that could handle not just the banking client's specific requirements, but also the varied payload types from our main Marketing Automation flow used by all other clients. The results were staggering – on the exact same hardware that struggled with 5,000 events per minute in Node.js, our Java application processed 150,000 events per minute. We rolled it out across all regions and all clients, replacing years of accumulated technical debt with a single, unified solution.

📈
Before 5,000 events/min
🚀
After 150,000 events/min
🎯 Revolutionary Result: 30x performance improvement on the same hardware - replaced years of technical debt with a unified solution
10

🎨 The Art of Performance Optimization

One of the most satisfying moments of my career came from what seemed like a trivial change. While performance testing our Android notification flow with Lambda batching, I noticed we were hitting a throughput ceiling around 33,000 events per minute. Something was creating a bottleneck, but it wasn't obvious what.

Diving into the code, I found the culprit in our batching logic. Every time we added a payload to a batch, we were converting the entire JSON array to a string and calculating its byte length. For a batch of 50 items, this meant the first addition took microseconds, but by the 49th addition, we were spending nearly 3 milliseconds just calculating sizes.

The fix was embarrassingly simple – instead of recalculating the entire batch size each time, just add the new payload's size to a running total. One line of code changed, and our throughput improved dramatically. It was a perfect reminder that performance optimization is often about understanding exactly what your code is doing, not just making it more complex.

🔍
Problem Inefficient batching
Solution One line fix
💡 Key Insight: Performance optimization is about understanding exactly what your code is doing, not making it more complex

🎯 Full-Time Journey Impact

Transforming from Intern to System Architect

🏛️ Enterprise Ready

Successfully transitioned to working with banking clients, understanding enterprise-grade security and compliance requirements.

🏗️ System Architect

Led the complete architectural overhaul from Node.js to Java Spring Boot, achieving 30x performance improvement.


Explore More of the Journey

Discover the technical innovations and problem-solving achievements

← Back to Home Back to Overview Technical Innovations and Problem-Solving →