Hazem Elbaz

Academic and Professional Achievements – 2025


As 2025 comes to a close, it is important to document a year marked by exceptional academic and professional commitment, unfolding within highly non-traditional and constrained circumstances. The year was shaped by severe humanitarian, institutional, and technical challenges resulting from the ongoing war and its direct impact on the academic and research environment in Gaza.

Despite these conditions, 2025 became a year of academic continuity, strategic adaptation, and a deliberate shift toward more applied and impactful research directions.


1. Academic and Teaching Activities

Throughout 2025, I continued my work as a faculty member and researcher in the areas of:

This included university teaching, academic supervision of graduation projects and research, and the development of instructional content that bridges theoretical foundations with practical, industry-relevant applications.


2. Research and Development

On the research front, my work during 2025 focused on:

A significant portion of this work was conducted under severe operational constraints, including power outages, limited connectivity, and restricted access to computational resources, necessitating highly adaptive research planning and execution strategies.


3. Transition Toward Applied and Open Research

A key milestone in 2025 was a methodological shift toward applied, open, and reproducible research, with emphasis on:

This transition was driven by a strategic objective to maximize scientific, educational, and societal impact in resource-constrained environments.


4. Academic Leadership and Capacity Building

In parallel with research and teaching responsibilities, I continued to contribute to:


Skills Strengthened During 2025

Academic and Professional Skills

Technical Skills


Year Summary

While 2025 was far from a conventional academic year, it proved to be a pivotal period for consolidating academic resilience, advancing applied research with tangible impact, and redefining the notion of achievement under fragile and constrained conditions.


Looking Ahead

As I move into 2026, my focus remains on:


Before You Build Detection… Make Sure You Have Collection


“There’s no detection without collection.”
This simple truth is one of the most overlooked principles in modern SOC operations.


🧠 Introduction

In every SOC I’ve seen, teams are eager to start writing use cases, mapping them to MITRE ATT&CK, creating SIEM rules, and claiming, “We’re ready to detect any attack.”

But too often, they skip the step that makes all of this possible: data collection.

Before your detection rules can work, your SOC must have a solid foundation of telemetry — without it, even the best detection logic will fail silently.


⚙️ Collection Comes Before Detection

A Security Operations Center is an engineered system. The detection layer can’t function if the foundation layer (Telemetry & Collection) is unstable.

You can have the most advanced correlation engine in the world, but if your critical systems aren’t generating or forwarding enough logs, your SOC will see nothing.

Every detection depends on observable facts from your sensors, agents, and integrations.
Missing even one essential source — such as Windows Security Event Logs, EDR process telemetry, or DNS traffic — can create massive blind spots.

This becomes critical during lateral movement or privilege escalation attempts, where visibility gaps can completely hide attacker activity.


🔍 Step One: Log Source Review

Before writing any detection rules, conduct a comprehensive log source review — not a superficial checklist, but a technical validation that answers:

  1. Is the source actually enabled and sending logs to the SIEM?
  2. Are the events complete, or are they truncated?
  3. Do the logs cover all necessary audit categories (authentication, file access, process creation, etc.)?
  4. Are the fields properly parsed and normalized?

This gives you a true view of data coverage, not an assumed one.
Only then can you safely connect your detection use cases to the log sources that actually support them.
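
As a rough illustration, questions 1 and 3 above can be partly automated. The sketch below assumes events have already been exported from the SIEM as dictionaries; the source names, category names, and the one-hour silence threshold are hypothetical placeholders, not values from any particular product. Truncation and parsing checks (questions 2 and 4) are not covered here.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical expectations: which sources should report, and which audit
# categories each one should cover. Adjust to your own environment.
EXPECTED_SOURCES = {
    "windows_security": {"authentication", "process_creation"},
    "edr": {"process_creation"},
    "dns": {"dns_query"},
}


def review_log_sources(events, now, max_silence=timedelta(hours=1)):
    """Report, per expected source, whether it is reporting and what is missing."""
    last_seen = {}
    categories = {}
    for event in events:
        source = event["source"]
        ts = event["timestamp"]
        if source not in last_seen or ts > last_seen[source]:
            last_seen[source] = ts
        categories.setdefault(source, set()).add(event["category"])

    report = {}
    for source, required in EXPECTED_SOURCES.items():
        seen = last_seen.get(source)
        report[source] = {
            # A source counts as reporting only if it sent something recently.
            "reporting": seen is not None and now - seen <= max_silence,
            "last_seen": seen,
            "missing_categories": sorted(required - categories.get(source, set())),
        }
    return report


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
sample_events = [
    {"source": "windows_security", "category": "authentication",
     "timestamp": now - timedelta(minutes=5)},
    {"source": "edr", "category": "process_creation",
     "timestamp": now - timedelta(hours=6)},
]
report = review_log_sources(sample_events, now)
```

In this sample, the Windows source is reporting but lacks process creation events, the EDR source has gone silent, and DNS never reported at all. Those are three different kinds of gap that a simple checklist would likely miss.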


🗺️ When There’s No Asset Inventory or Network Diagram

This is one of the most common challenges for new SOC teams.
You enter an environment with thousands of devices and servers, but no up-to-date CMDB and no clear network map.

In that case, use a Bottom-Up Visibility Mapping approach: build visibility from the telemetry you already have.

Start from your existing logs in the SIEM or EDR and gradually reconstruct the environment:

  1. Identify active devices from endpoint data (DeviceName, Hostname, or AgentID).
  2. Map communication patterns from firewall or proxy logs.
  3. Extract user-to-device associations from Active Directory sign-ins.
  4. Analyze inbound and outbound connections to spot systems that are exposed to, or talking to, the internet.

By doing this, you build a real-world inventory based on evidence — not assumptions — which becomes the backbone of your detection strategy.
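
The four steps above can be sketched in a few lines. This is a minimal illustration, assuming the three log types have already been exported as lists of dictionaries. `DeviceName` comes from the list above, while `dst_ip`, `user`, and the function name `build_inventory` are hypothetical.

```python
import ipaddress
from collections import defaultdict


def build_inventory(endpoint_events, firewall_events, signin_events):
    """Reconstruct a rough asset inventory from telemetry already in hand."""
    inventory = defaultdict(
        lambda: {"users": set(), "talks_to": set(), "reaches_internet": False}
    )

    # Step 1: every device that produced endpoint data is an active asset.
    for event in endpoint_events:
        inventory[event["DeviceName"]]  # touching the key registers the device

    # Step 2: communication patterns from firewall or proxy logs.
    for event in firewall_events:
        device = inventory[event["DeviceName"]]
        device["talks_to"].add(event["dst_ip"])
        # Step 4: a non-private destination means the device reaches the internet.
        if not ipaddress.ip_address(event["dst_ip"]).is_private:
            device["reaches_internet"] = True

    # Step 3: user-to-device associations from sign-in records.
    for event in signin_events:
        inventory[event["DeviceName"]]["users"].add(event["user"])

    return dict(inventory)


inventory = build_inventory(
    endpoint_events=[{"DeviceName": "ws-01"}, {"DeviceName": "srv-db"}],
    firewall_events=[
        {"DeviceName": "ws-01", "dst_ip": "10.0.0.5"},
        {"DeviceName": "ws-01", "dst_ip": "8.8.8.8"},
    ],
    signin_events=[{"DeviceName": "ws-01", "user": "alice"}],
)
```

Devices that only ever appear in firewall or sign-in data, and never in endpoint data, are worth a second look: they may be assets with no agent installed.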


📡 Are You Collecting the Right Data?

The more diverse your telemetry, the stronger your detection capabilities.

Examples:

Regularly review your schema coverage:
Make sure critical fields like UserPrincipalName, DeviceId, IPAddress, and Timestamp exist and are normalized.

Such consistency lets your correlation logic connect the dots accurately, which can be the difference between catching an incident and missing it entirely.
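
A schema coverage review can start as a simple count of how often each critical field is present and well-formed. The sketch below uses the four field names mentioned above. Treating "normalized" as "a valid IP address" and "an ISO 8601 timestamp with an explicit offset" is an assumption made for illustration, and `field_coverage` is a hypothetical helper.

```python
import ipaddress
from datetime import datetime

CRITICAL_FIELDS = ("UserPrincipalName", "DeviceId", "IPAddress", "Timestamp")


def _is_valid(field, value):
    """True if the field is present and, where checkable, well-formed."""
    if value in (None, ""):
        return False
    try:
        if field == "IPAddress":
            ipaddress.ip_address(value)
        elif field == "Timestamp":
            # Assumed convention: ISO 8601 with an explicit UTC offset.
            return datetime.fromisoformat(value).tzinfo is not None
    except (ValueError, TypeError):
        return False
    return True


def field_coverage(records):
    """Fraction of records in which each critical field is usable."""
    total = len(records)
    return {
        field: (sum(_is_valid(field, r.get(field)) for r in records) / total
                if total else 0.0)
        for field in CRITICAL_FIELDS
    }


records = [
    {"UserPrincipalName": "alice@example.com", "DeviceId": "d-1",
     "IPAddress": "10.0.0.5", "Timestamp": "2025-01-01T12:00:00+00:00"},
    {"UserPrincipalName": "", "DeviceId": "d-2",
     "IPAddress": "not-an-ip", "Timestamp": "2025-01-01 12:00:00"},
]
coverage = field_coverage(records)
```

A field with low coverage is a warning that any rule joining on it will silently match fewer events than expected.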


🧩 Key Takeaways

No Collection, No Detection.
Every SOC’s power begins with what it can see.


If you’re building your own SOC or starting your journey into SIEM and detection engineering, check out:
👉 Microsoft Sentinel Home Lab Setup | Step-by-Step Guide
A complete hands-on tutorial to deploy Microsoft Sentinel, connect data sources, and simulate real detections.


#CyberSecurity #SIEM #SOC #ThreatDetection #SecurityOperations
#MicrosoftSentinel #IncidentResponse #DetectionEngineering #CloudSecurity #SOCAnalysis


New Milestone Achieved


I’m pleased to share that I’ve successfully completed the “Business Model Canvas: A Tool for Entrepreneurs and Innovators” course from Kennesaw State University via Coursera.

Certificate: https://coursera.org/share/4bb3fb3a6f2e000cbeff4a6bbc0ea618
Course reference: https://www.coursera.org/learn/business-model-canvas/


Key Skills I Gained from this Training

These skills are extremely relevant to my current direction in AI-Driven SOC Automation, helping me bridge cybersecurity research and business execution.


This course was a strong addition to my learning pipeline, especially as I continue shaping my upcoming technology venture and refining the business logic behind productizing AI-SOC solutions.


Question to my network: What course or recent learning experience gave you a major “perspective shift” in connecting technical work to business value?

#ContinuousLearning #BusinessModelCanvas #CyberSecurity #AISOC #Entrepreneurship #ProfessionalGrowth



Building a SOC Home Lab from Zero — Catching Real Attackers on Azure


“Every attack is a lesson — the key is building systems that learn faster than attackers do.”
Dr. Hazem A. Elbaz


🚀 Introduction

In this post, I’ll walk you through one of my most exciting hands-on projects — building a Security Operations Center (SOC) from scratch using Microsoft Azure’s free tier and Microsoft Sentinel.
This project is not just theoretical; it captures real-world cyberattacks and transforms them into actionable intelligence through dashboards and live maps.

Whether you’re a cybersecurity student, SOC analyst, or researcher, this lab is an ideal starting point to explore how professional SOC environments collect telemetry, detect threats, and analyze them in real time.


🏗️ Why I Built This Project

After years of teaching and researching cybersecurity, I wanted to design a lab that:

By using Azure’s free resources, anyone can replicate this setup safely and affordably.


🔍 Project Overview

Here’s what the home SOC includes:

| Component | Description |
|---|---|
| Azure Subscription (Free Tier) | Deploys all resources at zero cost. |
| Honeypot VM | A Windows 10 machine deliberately exposed to attackers. |
| Log Analytics Workspace (LAW) | Centralized log storage and analysis engine. |
| Microsoft Sentinel | SIEM platform for correlation, alerting, and visualization. |
| Live Attack Map | Displays attack origins in real time. |

⚙️ Step-by-Step Highlights

1️⃣ Setting up Azure

Create a free Azure subscription and configure:

2️⃣ Deploying the Honeypot

Expose the VM intentionally:

⚠️ This should be done only in an isolated lab environment.

3️⃣ Observing Attacks

Within minutes, automated bots start brute-forcing your VM.

Monitor Event ID 4625 (Failed Login) using Windows Event Viewer:

4️⃣ Integrating with Sentinel

Use the Azure Monitor Agent to forward logs to Log Analytics Workspace.
Then, connect Sentinel to the workspace for correlation and visualization.

Sample KQL Query:

// Failed logons (Event ID 4625), newest first
SecurityEvent
| where EventID == 4625
| project TimeGenerated, Account, IpAddress  // IpAddress is a built-in SecurityEvent column
| sort by TimeGenerated desc

5️⃣ Enriching Data with GeoIP

Import geoip-summarized.csv as a Sentinel Watchlist to map attacks to their geographic origins.
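
To make the enrichment step concrete, here is a small offline sketch of what a GeoIP lookup does: match an attacker IP to the most specific network range that contains it. The column names `network` and `countryname` and the sample rows are assumptions for illustration. Check the header of your own CSV before relying on them.

```python
import csv
import io
import ipaddress


def load_geoip(csv_text):
    """Parse watchlist-style rows into (network, country) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(ipaddress.ip_network(row["network"]), row["countryname"])
            for row in reader]


def lookup_country(ip, geoip_rows):
    """Return the country of the most specific network containing ip, or None."""
    address = ipaddress.ip_address(ip)
    matches = [(net, country) for net, country in geoip_rows if address in net]
    if not matches:
        return None
    # Longest prefix wins when ranges overlap.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Illustrative rows only (documentation address ranges, made-up labels).
SAMPLE_CSV = """network,countryname
198.51.100.0/24,Country A
198.51.0.0/16,Country B
"""
geoip_rows = load_geoip(SAMPLE_CSV)
origin = lookup_country("198.51.100.7", geoip_rows)
```

A linear scan like this is fine for a lab. For large range tables, a sorted structure or the SIEM's own lookup operators would be the more practical choice.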

6️⃣ Visualizing Attacks

Create a custom Sentinel Workbook using map.json to generate a live global attack map. You’ll see where attackers are coming from — in real time.


📊 Results and Insights

Within hours of exposure, the honeypot began receiving:

These logs reflect the global nature of cyber threats and demonstrate how SOCs continuously analyze suspicious activities to safeguard systems.


🧠 Lessons Learned


🧩 Next Steps

Future enhancements:


📖 Full Documentation

All setup instructions, queries, and diagrams are available in the public repository: 👉 GitHub: SOC Home Lab from Zero

For a detailed tutorial and reflections, read the Medium article: 👉 Building a SOC Home Lab from Zero — Catching Real Attackers on Azure


🌐 About the Author

Dr. Hazem A. Elbaz
Assistant Professor of Cybersecurity | SOC Automation Researcher | AI-SOC Founder
Website | LinkedIn | GitHub



Unveiling LLM-SOC-Agent: Revolutionizing Security Operations with AI


In the ever-evolving landscape of cybersecurity, Security Operations Centers (SOCs) are constantly battling an increasing volume and sophistication of threats. The manual burden on analysts is immense, leading to alert fatigue and a struggle to keep pace. This is precisely where the LLM-SOC-Agent project steps in, aiming to transform traditional SOC operations through the power of Large Language Models (LLMs) and intelligent automation.

The LLM-SOC-Agent, an integral part of the broader AI-SOC-Automation initiative, is an open-source endeavor focused on building a multi-agent security framework. This project envisions a future where LLMs act as intelligent assistants, capable of analyzing vast amounts of security data, generating comprehensive insights, and even executing response actions autonomously.

What is LLM-SOC-Agent?

At its core, LLM-SOC-Agent leverages multiple LLMs to analyze security data and generate security briefs, effectively acting as an AI-driven SOC analyst. The project’s goal is to go beyond simple text generation, enabling LLMs to understand context, reason through security scenarios, and make informed decisions.

Key features and functionalities being developed within LLM-SOC-Agent include:

The project emphasizes a modular design, allowing for individual agents to handle specific tasks and then collaborate to achieve complex security objectives. This agentic approach is crucial for breaking down intricate security problems into manageable, AI-addressable components.

Diving into the Code Repository

The LLM-SOC-Agent GitHub repository (https://github.com/ai-soc-automation/LLM-SOC-Agent) is where the magic happens. While the specifics of the code structure can evolve, you’ll typically find:

The development often involves leveraging LLM frameworks to simplify the process of building intelligent agents, managing their memory, decision-making processes, and tool integrations. This allows the project to focus on the security-specific logic rather than reinventing the wheel for LLM interactions.

Contributing to the Future of SOC Automation

The LLM-SOC-Agent project is a fantastic opportunity for anyone passionate about cybersecurity, AI, and open-source development. Contributions are welcomed from individuals with diverse skill sets, including:

If you’re looking to make a tangible impact on the future of security operations and work with cutting-edge AI technologies, the LLM-SOC-Agent project offers a collaborative environment to learn, build, and innovate. Check out the GitHub repository, explore the existing code, and don’t hesitate to engage with the community to find out how you can contribute!

This is more than just a coding project; it’s about building the next generation of intelligent SOCs, empowering security professionals, and strengthening our defenses against evolving cyber threats.


From Alert Fatigue to Smart Triage: Building an LLM-Powered SOC Agent


“Security teams drown in tens of thousands of alerts every day. What if a lightweight language model could triage them for you in real time?”

1. The Pain: Alert Overload & MTTR

Security Operations Centers (SOCs) rely on SIEM and SOAR tools, but rule‑based playbooks often miss context, generating floods of false positives. Analysts spend hours weeding out noise, and Mean‑Time‑To‑Respond (MTTR) balloons.

2. Our Idea: Context‑Aware Enrichment With LLMs

We fine‑tuned DistilRoBERTa using LoRA adapters on a blended corpus of CIC‑IDS 2018 logs and our synthetic SOC‑Sim stream. The agent:

  1. Enriches each alert with entity context (IP reputation, MITRE ATT&CK techniques).
  2. Clusters alerts that share root cause, shrinking queue length.
  3. Prioritises by assigning a risk score using chain‑of‑thought prompting.
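
The three steps above can be illustrated with a toy pipeline. In this sketch the model calls are replaced by hypothetical lookup tables so the flow runs on its own. The reputation values, the rule names, the clustering key, and the scoring formula are all assumptions for illustration, not the agent's actual logic.

```python
from collections import defaultdict

# Hypothetical enrichment data standing in for reputation feeds and the model.
IP_REPUTATION = {"203.0.113.9": 0.9, "198.51.100.4": 0.2}
TECHNIQUE_BY_RULE = {"many_failed_logons": "T1110", "new_service": "T1543"}


def enrich(alert):
    """Step 1: attach entity context to a raw alert."""
    return {
        **alert,
        "reputation": IP_REPUTATION.get(alert["src_ip"], 0.0),
        "technique": TECHNIQUE_BY_RULE.get(alert["rule"], "unknown"),
    }


def cluster(alerts):
    """Step 2: group alerts that plausibly share a root cause."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[(alert["src_ip"], alert["technique"])].append(alert)
    return groups


def prioritise(groups):
    """Step 3: score each cluster and return the queue, highest risk first."""
    scored = []
    for key, members in groups.items():
        # Toy score: worst reputation in the cluster, scaled by volume (capped).
        risk = max(a["reputation"] for a in members) * min(len(members), 10) / 10
        scored.append({"key": key, "size": len(members), "risk": round(risk, 2)})
    return sorted(scored, key=lambda c: c["risk"], reverse=True)


raw_alerts = [
    {"src_ip": "203.0.113.9", "rule": "many_failed_logons"},
    {"src_ip": "203.0.113.9", "rule": "many_failed_logons"},
    {"src_ip": "203.0.113.9", "rule": "many_failed_logons"},
    {"src_ip": "198.51.100.4", "rule": "new_service"},
]
queue = prioritise(cluster([enrich(a) for a in raw_alerts]))
```

Four raw alerts collapse into a queue of two clusters, which is the queue-shrinking effect described in step 2.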

3. Architecture

┌──────────┐      ┌────────────┐      ┌─────────────┐
│  Logs    │─────▶│ Preprocess │─────▶│ LLM Enrich  │
└──────────┘      └────────────┘      └─────┬───────┘
                                            │  Clusters
                                            ▼
                                      ┌─────────────┐
                                      │  Prioritise │
                                      └─────┬───────┘
                                            ▼
                                      Analyst Dashboard

(A detailed diagram with component icons will be released in the repo’s /docs.)

4. Dataset & Training Pipeline

| Dataset | Records | Label Strategy | Notes |
|---|---|---|---|
| CIC‑IDS 2018 | 2.9 M | Original attack labels | Cleaned & deduped |
| SOC‑Sim | 1.2 M | Synthetic MITRE mapping | Covers phish, ransomware |

Training took 4 h on a single RTX 4090. LoRA kept GPU memory usage under 12 GB.
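
To give a feel for why LoRA is light on memory, the sketch below counts trainable parameters for adapters on a DistilRoBERTa-sized encoder (6 layers, hidden size 768). The rank of 8 and the choice to adapt only the query and value projections are assumptions. The post does not state the actual configuration.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters added by one LoRA adapter: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


HIDDEN = 768            # DistilRoBERTa hidden size
LAYERS = 6              # DistilRoBERTa transformer layers
RANK = 8                # assumed LoRA rank (not stated in the post)
ADAPTED_PER_LAYER = 2   # assumed: query and value projections only

# Weights in the adapted projection matrices if they were trained directly.
full = HIDDEN * HIDDEN * ADAPTED_PER_LAYER * LAYERS
# Weights actually trained once those matrices are frozen and adapters are added.
lora = lora_trainable_params(HIDDEN, HIDDEN, RANK) * ADAPTED_PER_LAYER * LAYERS
ratio = lora / full

print(f"{lora:,} trainable vs {full:,} frozen ({ratio:.1%})")
```

Under these assumptions the adapters train roughly 2 % of the weights in the matrices they modify. Most of the memory saving comes from not keeping gradients and optimizer state for the frozen weights.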

5. Early Results

| Metric | Rule‑based SOAR | LLM‑SOC‑Agent | Δ |
|---|---|---|---|
| MTTR (median) | 47 min | 32 min | ↓ 32 % |
| False positives | 18 % | 11 % | ↓ 7 pp |
| Analyst effort (alerts/day) | 1 200 | 820 | ↓ 32 % |
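
The Δ column mixes two kinds of change: relative reductions (MTTR, analyst effort) and a percentage-point difference (false positives). A short check of the arithmetic, using only the before and after values reported above:

```python
def relative_drop(before, after):
    """Relative reduction, as a percentage of the baseline value."""
    return (before - after) / before * 100


mttr_drop = relative_drop(47, 32)        # minutes
effort_drop = relative_drop(1200, 820)   # alerts per day
fp_drop_pp = 18 - 11                     # percentage points, not percent
fp_drop_relative = relative_drop(18, 11)

print(f"MTTR: {mttr_drop:.1f} %, effort: {effort_drop:.1f} %, "
      f"false positives: {fp_drop_pp} pp ({fp_drop_relative:.1f} % relative)")
```

Keeping percentage points and relative percentages apart matters when comparing results across papers or tools, because a 7-point drop from an 18 % baseline is a much larger relative change than the raw number suggests.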

6. What’s Next

7. Call to Action


This post is part of my ongoing series on AI‑Driven SOC Automation. Browse the entire journey on the AI‑SOC page.


Detecting Network Anomalies with XGBoost and SMOTE


Detecting Network Anomalies with XGBoost and SMOTE: From Cybersecurity Logs to AI Models


🧠 Introduction

As someone transitioning from a cybersecurity background into AI, I recently challenged myself to turn raw network traffic into intelligent insights. The result? A complete machine learning pipeline that detects DoS (Denial-of-Service) attacks with 99.9%+ accuracy and AUC, built on top of real-world IoT traffic.

This project marks a key milestone in my journey — transforming my hands-on experience with logs and network security into a practical AI application.


🔍 What Problem Are We Solving?

Traditional intrusion detection systems (IDS) often fail to detect sophisticated or low-rate DoS attacks. Moreover, the volume of network logs and the class imbalance between normal and malicious traffic make this task even harder.

So I asked myself:

Can we use modern machine learning to detect anomalies directly from network logs?


💾 Dataset: IoTID20-Extended (2024)

We used the IoTID20-Extended dataset, a recent and comprehensive collection of real IoT network traffic. It includes labeled flows representing normal and various attack types — including DoS and DDoS.

📌 Dataset link: Kaggle – IoTID20 Dataset


🛠️ Approach Overview

We designed an end-to-end pipeline with the following stages:

  1. Data Preprocessing
    • Handled missing values, encoded categorical features, and scaled numerical ones.
  2. Feature Selection
    • Used SelectKBest to keep the most predictive features.
  3. Class Balancing
    • Applied SMOTE to synthetically oversample the underrepresented attack traffic.
  4. Model Training
    • Trained XGBoost, which is known for strong performance on tabular datasets.
  5. Evaluation
    • Ran 10-fold cross-validation, scoring with F1-score and ROC-AUC.
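
For readers new to SMOTE, the core idea in stage 3 fits in a few lines: pick a minority sample, pick one of its nearest minority neighbours, and create a new point somewhere on the line between them. The sketch below is a simplified standard-library version written for illustration. It is not the project's code, which presumably relies on a library implementation, and the sample feature values are made up.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating towards near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen sample (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Made-up two-feature minority samples, e.g. scaled flow duration and packet rate.
minority_flows = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2], [0.9, 1.9]]
new_flows = smote(minority_flows, n_synthetic=6, k=2)
```

One design point worth keeping in mind: oversampling is usually applied only to the training portion of each cross-validation fold. If synthetic points are generated before the split, near-copies of test samples can leak into training and inflate the scores.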

📈 Results

The model achieved:

These results are exceptional, but they reflect a balanced, clean dataset. In real-world deployments, we’d expect slightly lower but still strong performance.

📊 Confusion Matrix and ROC Curve plots were also generated (see GitHub).


💡 Why This Matters

This project shows that AI can effectively augment traditional network security, not just by detecting anomalies but by learning from raw or semi-structured data such as logs. It’s a step toward AI-driven intrusion detection systems.

As a cybersecurity expert now stepping into AI, this fusion of domains is exactly where I plan to build next.


📂 Try It Yourself

Full project code, notebook, and results are available on GitHub:

🔗 GitHub Repo – Log Anomaly Detection

Includes:


🚀 Next Steps

This is just the beginning. My roadmap includes:


👨‍💻 About Me

I’m Hazem Elbaz, a cybersecurity researcher shifting toward applied AI and intelligent automation in network defense.

🧭 Follow my journey of building real-world AI from the ground up at: 🔗 elbazhazem.github.io


❓Question for You

Have you tried using ML or AI in log analysis or cybersecurity? What tools or datasets worked for you?

👇 Let’s discuss in the comments.


My Roadmap: From Cybersecurity to Applied AI


I’ve spent most of my career in cybersecurity.
In 2024, I decided to pivot — not away from cyber, but toward AI-powered security.

Here’s my roadmap for the transition.


🎯 Step 1: Define a Use Case

I didn’t start with models — I started with a problem:

“How can I make logs easier to understand and analyze?”

That became my first AI-for-cyber project.


📚 Step 2: Learn the Basics of AI/ML

I focused on:


🔬 Step 3: Build Something Small, Fast

Log Analyzer LLM
This was my MVP to apply what I learned.


📈 Step 4: Go Deeper

I’m now:


🧠 Step 5: Share, Reflect, and Publish

This blog is part of that effort.

I’m also:


Lessons So Far


🔍 Curious about my work?
Check out my projects or connect on LinkedIn.


Why Cybersecurity Needs AI More Than Ever


Today’s cybersecurity teams are overloaded:

In a modern SOC, the real challenge isn’t detection — it’s prioritization and interpretation.


Where AI Can Help

1. Intelligent Summarization

LLMs can:

2. Threat Contextualization

Instead of just “block port 443”, LLMs can explain:

“This appears to be a reverse shell attempt based on behavior and timing.”

3. Automation of Repetitive Work

All these can be supported by fine-tuned models or simple LLM prompts.


This is Not a Future Vision — It’s Now

Tools like:


But Beware the Hype

AI ≠ Magic.


Final Thought

Cybersecurity needs more than automation.
It needs intelligent augmentation — and that’s where AI shines.

The future analyst is part human, part machine.


🤖 I’m exploring this space deeply in my own projects — see Log Analyzer LLM
🔁 Let’s co-build the next-gen SOC tools.


Lessons from Building My First Log Analyzer with GPT-4


When I started building Log Analyzer LLM, I had one goal:

“Make logs readable, fast.”

Logs are noisy, verbose, and contextless. I wanted an AI assistant that could summarize logs and highlight meaningful events — something traditional SIEMs don’t do well.

Here’s what I learned along the way.


1. Structure Matters More Than You Think

The biggest challenge?
Logs are not standardized.

Some logs are JSON. Others are multiline strings, or worse — key-value chaos.

Solution:
I started by building simple pre-processing steps to:
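
A pre-processing step of this kind might look like the sketch below, which turns a JSON line or a key=value line into a dictionary and falls back to raw text otherwise. It is an illustrative guess at the approach, not the project's actual code. The function name and regular expression are hypothetical, and multiline entries are not handled.

```python
import json
import re

# key=value pairs, where the value is either "quoted text" or a bare token.
KV_PATTERN = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_log_line(line):
    """Turn a JSON or key=value log line into a dict; fall back to raw text."""
    line = line.strip()
    if line.startswith("{"):
        try:
            parsed = json.loads(line)
            if isinstance(parsed, dict):
                return parsed
        except json.JSONDecodeError:
            pass  # not valid JSON after all; try the other formats
    pairs = KV_PATTERN.findall(line)
    if pairs:
        return {key: value.strip('"') for key, value in pairs}
    return {"message": line}


json_event = parse_log_line('{"level": "error", "msg": "disk full"}')
kv_event = parse_log_line('level=warn user="alice smith" code=42')
raw_event = parse_log_line("plain text line")
```

Giving the model one consistent shape to read, whatever the source format, tends to matter more than the exact parser used.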


2. Prompt Engineering Is Critical

LLMs are powerful — but they need guidance.

💡 I tested several prompts:

Best results came from combining:


3. Don’t Trust the AI Blindly

LLMs can hallucinate, so treat every output as unverified.

Sometimes it summarized error logs as “successful operations”.
Other times it guessed at causes.

🚨 Lesson: Always cross-check with known events or ground truth.
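
One cheap form of cross-checking is to compute simple ground truth directly from the logs and compare it with what the summary claims. The sketch below only catches the "errors described as success" case mentioned above. The keyword lists and function names are hypothetical, and a real check would need to be much broader.

```python
import re


def count_levels(log_lines):
    """Ground truth: count log lines per severity keyword."""
    counts = {"ERROR": 0, "WARN": 0, "INFO": 0}
    for line in log_lines:
        for level in counts:
            if re.search(rf"\b{level}\b", line):
                counts[level] += 1
                break
    return counts


def summary_contradicts_logs(summary, log_lines):
    """Flag a summary that claims success while the logs contain errors."""
    claims_success = re.search(r"\b(success\w*|no errors)\b", summary, re.IGNORECASE)
    return bool(claims_success) and count_levels(log_lines)["ERROR"] > 0


logs = [
    "2025-01-01 12:00:01 INFO service started",
    "2025-01-01 12:00:05 ERROR database connection refused",
]
flagged = summary_contradicts_logs("All operations were successful.", logs)
```

A flagged summary does not tell you what the right summary is. It only tells you that this one should not reach an analyst without review.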


4. Python + OpenAI = Fast Prototyping

I used:

Within hours, I had a working proof of concept.


What’s Next?


Building with GPT-4 taught me one thing:
AI isn’t perfect, but it’s incredibly useful if used right.

If you’re in cybersecurity, you should start experimenting.


🧠 Repo: Log Analyzer LLM
💬 DM me if you’re building something similar — let’s connect.


From SIEMs to LLMs: Why I’m Building AI Tools for Cybersecurity

A personal reflection on why I moved from traditional log analysis tools to LLM-powered log insight engines.


For years, I worked with SIEM platforms, firewalls, EDRs, and an endless stream of logs.

Like most security professionals, I knew the routine:

At some point, I asked myself: What if the system could “understand” what the logs are saying, not just parse them?


The Limits of Traditional SIEMs

Traditional SIEMs are excellent at:

But they lack context.
They don’t understand language.
They can’t summarize, explain, or infer like a human analyst.

That’s where Large Language Models (LLMs) come in.


Why LLMs?

LLMs, like GPT-4, bring something new to the table:

✅ They can summarize complex log entries
✅ Extract anomalies or outliers
✅ Translate logs into natural language
✅ Work across diverse sources without custom parsers

It’s not magic — it’s structured prompting, validation, and iteration.


My First Attempt: Log Analyzer LLM

That’s why I built my first prototype:
Log Analyzer with LLM

It:

It’s still early, but the potential is clear:

“I’m not replacing the SOC analyst — I’m giving them a second brain.”


What’s Next?


If you’ve ever been overwhelmed by thousands of logs and dozens of dashboards —
LLMs might be the tool you’ve been waiting for.


📬 Have thoughts? Want to collaborate?
Find me on LinkedIn or explore my next AI-for-cybersecurity projects.
