In Part 1 of this series we installed Hadoop 3.3.6 natively on Ubuntu 24.04 and got HDFS running in pseudo-distributed mode. That gave us a working distributed file system, but Hadoop is much more than storage — its true power lies in processing large datasets in parallel using the MapReduce programming model. In this...
Hadoop 3.3.6 on Ubuntu: Native Installation without Virtual Machine
When I started working with Hadoop in a learning environment, the course guide recommended running Linux Mint in a virtual machine. However, I already had Ubuntu 24.04 installed natively on my Dell Vostro with 32 GB of RAM, and it seemed smarter to use it directly. In this post I...
PersonaPlex: Mastering Conversational English with NVIDIA and RunPod
Improving conversational English requires real-time interaction, but finding a native speaker available 24/7 can be a challenge. In this post, I will show you how I deployed NVIDIA’s latest PersonaPlex-7B-v1 model on RunPod to create a private, high-performance English tutor. What is...
Maximizing Value: How I Optimize GitHub Copilot Pro and Anthropic Subscriptions for Coding and Research
Context: Why Model and Platform Matter
As a data scientist and developer, I rely on advanced LLMs (Large Language Models) like Claude Opus, Sonnet, GPT-4.1, and GPT-4o for both architectural planning and daily coding. But I quickly learned that the same model behaves differently depending on the platform, and that maximizing value...
Time Series Forecasting with Exogenous Variables: SARIMAX vs Prophet
Last quarter, I needed to forecast SMS volume for budget planning. The catch? Our SMS volume directly depends on how many locations we operate—and that number is also growing. I couldn’t just extrapolate historical trends; I needed a model that accounts for this external driver. This is a common scenario...