Hazem Elbaz

Academic and Professional Achievements – 2025

2026-01-01T00:00:00+00:00

Academic and Professional Achievements – 2025

As 2025 comes to a close, it is important to document a year marked by exceptional academic and professional commitment, unfolding within highly non-traditional and constrained circumstances. The year was shaped by severe humanitarian, institutional, and technical challenges resulting from the ongoing war and its direct impact on the academic and research environment in Gaza.

Despite these conditions, 2025 became a year of academic continuity, strategic adaptation, and a deliberate shift toward more applied and impactful research directions.

1. Academic and Teaching Activities

Throughout 2025, I continued my work as a faculty member and researcher in the areas of:

Cybersecurity
Network and Cloud Security
Log Analysis and Anomaly Detection
AI-Driven Security Operations Center (SOC) Automation

This included university teaching, academic supervision of graduation projects and research, and the development of instructional content that bridges theoretical foundations with practical, industry-relevant applications.

2. Research and Development

On the research front, my work during 2025 focused on:

Developing anomaly detection models using machine learning and deep learning techniques
Integrating Large Language Models (LLMs) into security log analysis workflows
Designing and prototyping AI-driven SOC automation frameworks
Expanding and refining research papers targeting submission to peer-reviewed journals and conferences

A significant portion of this work was conducted under severe operational constraints, including power outages, limited connectivity, and restricted access to computational resources, necessitating highly adaptive research planning and execution strategies.

3. Transition Toward Applied and Open Research

A key milestone in 2025 was a methodological shift toward applied, open, and reproducible research, with emphasis on:

Documenting research workflows and outputs through open-source GitHub repositories
Building reusable and extensible research artifacts
Aligning academic research with real-world SOC operational needs
Promoting applied research culture among students and early-career researchers

This transition was driven by a strategic objective to maximize scientific, educational, and societal impact in resource-constrained environments.

4. Academic Leadership and Capacity Building

In parallel with research and teaching responsibilities, I continued to contribute to:

Mentorship and academic guidance for students and graduates
Participation in training and capacity-building initiatives
Engagement in discussions related to research development and academic resilience
Strategic planning for future initiatives aimed at creating more sustainable learning and research environments

Skills Strengthened During 2025

Academic and Professional Skills

Applied research design and execution
Scientific writing and peer-review processes
Research supervision and team building
Research project management in high-risk environments

Technical Skills

AI-Driven SOC Automation
Log Analysis and Anomaly Detection
LLM Integration for Cybersecurity
Dataset Engineering and Evaluation Pipelines

Year Summary

While 2025 was far from a conventional academic year, it proved to be a pivotal period for consolidating academic resilience, advancing applied research with tangible impact, and redefining the notion of achievement under fragile and constrained conditions.

Looking Ahead

As I move into 2026, my focus remains on:

Establishing international research collaborations
Securing stable and safe academic and research opportunities
Producing high-quality scientific publications
Developing research frameworks that address local challenges while adhering to international academic standards

Before You Build Detection… Make Sure You Have Collection

2025-11-10T00:00:00+00:00

Before You Build Detection… Make Sure You Have Collection

“There’s no detection without collection.”
This simple truth is one of the most overlooked principles in modern SOC operations.

🧠 Introduction

In every SOC I’ve seen, teams are eager to start writing use cases, mapping them to MITRE ATT&CK, creating SIEM rules, and claiming, “We’re ready to detect any attack.”

But too often, they skip the step that makes all of this possible: data collection.

Before your detection rules can work, your SOC must have a solid foundation of telemetry — without it, even the best detection logic will fail silently.

⚙️ Collection Comes Before Detection

A Security Operations Center is an engineered system. The detection layer can’t function if the foundation layer (Telemetry & Collection) is unstable.

You can have the most advanced correlation engine in the world, but if your critical systems aren’t generating or forwarding enough logs, your SOC will see nothing.

Every detection depends on observable facts from your sensors, agents, and integrations.
Missing even one essential source — such as Windows Security Event Logs, EDR process telemetry, or DNS traffic — can create massive blind spots.

This becomes critical during lateral movement or privilege escalation attempts, where visibility gaps can completely hide attacker activity.

🔍 Step One: Log Source Review

Before writing any detection rules, conduct a comprehensive log source review — not a superficial checklist, but a technical validation that answers:

Is the source actually enabled and sending logs to the SIEM?
Are the events complete, or are they truncated?
Do the logs cover all necessary audit categories (authentication, file access, process creation, etc.)?
Are the fields properly parsed and normalized?

This gives you a true view of data coverage, not the assumed one.
Only then can you safely connect your detection use cases to the log sources that actually support them.

🗺️ When There’s No Asset Inventory or Network Diagram

This is one of the most common challenges for new SOC teams.
You enter an environment with thousands of devices and servers — but no updated CMDB, and no clear network map.

In that case, use a Bottom-Up Visibility Mapping approach: build visibility from the telemetry you already have.

Start from your existing logs in the SIEM or EDR and gradually reconstruct the environment:

Identify active devices from endpoint data (DeviceName, Hostname, or AgentID).
Map communication patterns from firewall or proxy logs.
Extract user-to-device associations from Active Directory sign-ins.
Analyze outbound connections to spot systems exposed to the internet.

By doing this, you build a real-world inventory based on evidence — not assumptions — which becomes the backbone of your detection strategy.

📡 Are You Collecting the Right Data?

The more diverse your telemetry, the stronger your detection capabilities.

Examples:

Endpoint logs → reveal process executions and local activity.
Network telemetry → exposes lateral movements.
Identity logs → highlight suspicious access behavior.
Cloud audit logs → track privileged operations in SaaS or IaaS.

Regularly review your schema coverage:
Make sure critical fields like UserPrincipalName, DeviceId, IPAddress, and Timestamp exist and are normalized.

Such consistency allows your correlation logic to connect dots accurately — the difference between catching an incident or missing an attack entirely.

🧩 Key Takeaways

Visibility is the foundation of detection.
Build your telemetry coverage before your rules.
Review log sources as rigorously as you test detections.
Correlate across data types to see the full attack surface.

No Collection, No Detection.
Every SOC’s power begins with what it can see.

🔗 Read Next

If you’re building your own SOC or starting your journey into SIEM and detection engineering, check out:
👉 Microsoft Sentinel Home Lab Setup | Step-by-Step Guide
A complete hands-on tutorial to deploy Microsoft Sentinel, connect data sources, and simulate real detections.

🏷️ Recommended Tags

#CyberSecurity #SIEM #SOC #ThreatDetection #SecurityOperations
#MicrosoftSentinel #IncidentResponse #DetectionEngineering #CloudSecurity #SOCAnalysis

New Milestone Achieved

2025-11-06T00:00:00+00:00

✨ New Milestone Achieved!

I’m pleased to share that I’ve successfully completed the “Business Model Canvas: A Tool for Entrepreneurs and Innovators” course from Kennesaw State University via Coursera.

Certificate: https://coursera.org/share/4bb3fb3a6f2e000cbeff4a6bbc0ea618 Course reference: https://www.coursera.org/learn/business-model-canvas/

Key Skills I Gained from this Training

Business Model Design & Structuring
Value Proposition Development
Customer Segmentation & Market Fit Thinking
Go-to-Market Strategy Logic
Revenue Streams & Cost Structure Mapping
Lean Startup mindset & hypothesis testing
Translating technical solutions into investor-friendly business language

These skills are extremely relevant to my current direction in AI-Driven SOC Automation — helping me bridge between cybersecurity research and business execution.

This course was a strong addition to my learning pipeline, especially as I continue shaping my upcoming technology venture and refining the business logic behind productizing AI-SOC solutions.

Question to my network: What course or recent learning experience gave you a major “perspective shift” in connecting technical work to business value?

#ContinuousLearning #BusinessModelCanvas #CyberSecurity #AISOC #Entrepreneurship #ProfessionalGrowth

Building a SOC Home Lab from Zero — Catching Real Attackers on Azure

2025-10-05T00:00:00+00:00

🧠 Building a SOC Home Lab from Zero — Catching Real Attackers on Azure

“Every attack is a lesson — the key is building systems that learn faster than attackers do.”
— Dr. Hazem A. Elbaz

🚀 Introduction

In this post, I’ll walk you through one of my most exciting hands-on projects — building a Security Operations Center (SOC) from scratch using Microsoft Azure’s free tier and Microsoft Sentinel.
This project is not just theoretical; it captures real-world cyberattacks and transforms them into actionable intelligence through dashboards and live maps.

Whether you’re a cybersecurity student, SOC analyst, or researcher, this lab is an ideal starting point to explore how professional SOC environments detect, collect, and analyze threats in real time.

🏗️ Why I Built This Project

After years of teaching and researching cybersecurity, I wanted to design a lab that:

Bridges theory and reality — by exposing a honeypot to actual attackers.
Empowers learners — to build and observe a functioning SOC environment.
Showcases portfolio-ready skills — for anyone pursuing a cybersecurity career.

By using Azure’s free resources, anyone can replicate this setup safely and affordably.

🔍 Project Overview

Here’s what the home SOC includes:

Component	Description
Azure Subscription (Free Tier)	Deploys all resources at zero cost.
Honeypot VM	A Windows 10 machine deliberately exposed to attackers.
Log Analytics Workspace (LAW)	Centralized log storage and analysis engine.
Microsoft Sentinel	SIEM platform for correlation, alerting, and visualization.
Live Attack Map	Displays attack origins in real time.

⚙️ Step-by-Step Highlights

1️⃣ Setting up Azure

Create a free Azure subscription and configure:

Resource Group
Virtual Network (VNet)
Virtual Machine (Windows 10 Honeypot)

2️⃣ Deploying the Honeypot

Expose the VM intentionally:

Delete RDP security rules.
Allow all inbound traffic.
Disable Windows Firewall to attract attacks.

⚠️ This should be done only in an isolated lab environment.

3️⃣ Observing Attacks

Within minutes, automated bots start brute-forcing your VM.

Monitor Event ID 4625 (Failed Login) using Windows Event Viewer:

Username attempted
IP address
Failure reason

4️⃣ Integrating with Sentinel

Use the Azure Monitor Agent to forward logs to Log Analytics Workspace.
Then, connect Sentinel to the workspace for correlation and visualization.

Sample KQL Query:

SecurityEvent
| where EventID == 4625
| project TimeGenerated, Account, IpAddress = tostring(parse_json(AdditionalFields)["IpAddress"])
| sort by TimeGenerated desc

5️⃣ Enriching Data with GeoIP

Import geoip-summarized.csv as a Sentinel Watchlist to map attacks to their geographic origins.

6️⃣ Visualizing Attacks

Create a custom Sentinel Workbook using map.json to generate a live global attack map. You’ll see where attackers are coming from — in real time.

📊 Results and Insights

Within hours of exposure, the honeypot began receiving:

Hundreds of failed login attempts.
Attack sources from over 50 countries.
Common usernames like admin, test, and employee.

These logs reflect the global nature of cyber threats and demonstrate how SOCs continuously analyze suspicious activities to safeguard systems.

🧠 Lessons Learned

Attack simulation is a powerful learning tool.
Understanding Event ID 4625 is essential for brute-force detection.
KQL is a must-know language for any SOC analyst.
Visual dashboards turn complex data into clear stories for decision-makers.

🧩 Next Steps

Future enhancements:

Integrate Sysmon for deeper telemetry.
Automate alerts with Logic Apps.
Extend to multi-cloud monitoring (AWS / GCP).
Apply AI models or LLMs to summarize log anomalies.

📖 Full Documentation

All setup instructions, queries, and diagrams are available in the public repository: 👉 GitHub: SOC Home Lab from Zero

For a detailed tutorial and reflections, read the Medium article: 👉 Building a SOC Home Lab from Zero — Catching Real Attackers on Azure (replace with your final post URL)

🌐 About the Author

Dr. Hazem A. Elbaz Assistant Professor of Cybersecurity | SOC Automation Researcher | AI-SOC Founder Website • LinkedIn • GitHub

Unveiling LLM-SOC-Agent: Revolutionizing Security Operations with AI

2025-07-19T00:00:00+00:00

Unveiling LLM-SOC-Agent: Revolutionizing Security Operations with AI

In the ever-evolving landscape of cybersecurity, Security Operations Centers (SOCs) are constantly battling an increasing volume and sophistication of threats. The manual burden on analysts is immense, leading to alert fatigue and a struggle to keep pace. This is precisely where the LLM-SOC-Agent project steps in, aiming to transform traditional SOC operations through the power of Large Language Models (LLMs) and intelligent automation.

The LLM-SOC-Agent, an integral part of the broader AI-SOC-Automation initiative, is an open-source endeavor focused on building a multi-agent security framework. This project envisions a future where LLMs act as intelligent assistants, capable of analyzing vast amounts of security data, generating comprehensive insights, and even executing response actions autonomously.

What is LLM-SOC-Agent?

At its core, LLM-SOC-Agent leverages multiple LLM models to analyze and generate security briefs, effectively acting as an AI-driven SOC analyst. The project’s goal is to go beyond simple text generation, enabling LLMs to understand context, reason through security scenarios, and make informed decisions.

Key features and functionalities being developed within LLM-SOC-Agent include:

Threat Intelligence Analysis: Processing and summarizing threat intelligence data to provide actionable insights on emerging threats.
Log Analysis: Identifying anomalies and suspicious activities within vast volumes of log data.
Vulnerability Assessment: Assessing vulnerabilities and summarizing critical exposures.
Incident Response: Evaluating security incidents and recommending appropriate response actions.
Overseer Summary: Generating a final, consolidated summary brief based on the outputs of various specialized agents.

The project emphasizes a modular design, allowing for individual agents to handle specific tasks and then collaborate to achieve complex security objectives. This agentic approach is crucial for breaking down intricate security problems into manageable, AI-addressable components.

Diving into the Code Repository

The LLM-SOC-Agent GitHub repository (https://github.com/ai-soc-automation/LLM-SOC-Agent) is where the magic happens. While the specifics of the code structure can evolve, you’ll typically find:

Agent Modules: Python scripts or directories dedicated to each specialized agent (e.g., threat_intel_agent.py, log_analysis_agent.py). These modules likely contain the logic for interacting with LLMs, processing specific data types, and generating targeted outputs.
Core Orchestration: Files responsible for coordinating the activities of different agents, defining workflows, and managing the overall execution flow. This might involve setting up communication channels between agents and handling the aggregation of their individual analyses.
Data Handling: Scripts or utilities for data ingestion, preprocessing, and formatting to prepare security data for LLM consumption. The project currently reads .txt files, but future iterations could involve integration with SIEMs, threat intelligence platforms, and other security tools.
Configuration: Files to manage API keys, model selections (e.g., local LLMs via Ollama, or cloud-based LLMs like those from Together API), and other project settings.
Examples and Demos: Sample data and scripts to showcase the agent’s capabilities and provide a starting point for users and contributors.

The development often involves leveraging LLM frameworks to simplify the process of building intelligent agents, managing their memory, decision-making processes, and tool integrations. This allows the project to focus on the security-specific logic rather than reinventing the wheel for LLM interactions.

Contributing to the Future of SOC Automation

The LLM-SOC-Agent project is a fantastic opportunity for anyone passionate about cybersecurity, AI, and open-source development. Contributions are welcomed from individuals with diverse skill sets, including:

Cybersecurity Analysts/Engineers: Provide domain expertise, define use cases, and validate the accuracy and effectiveness of the AI agents.
Machine Learning Engineers/Data Scientists: Develop and fine-tune LLM models, implement new anomaly detection algorithms, and improve the overall intelligence of the agents.
Software Developers: Build new agent modules, enhance existing code, integrate with other security tools, and improve the project’s scalability and robustness.
Researchers: Explore novel applications of LLMs in cybersecurity, contribute to the theoretical foundations, and propose innovative solutions.

If you’re looking to make a tangible impact on the future of security operations and work with cutting-edge AI technologies, the LLM-SOC-Agent project offers a collaborative environment to learn, build, and innovate. Check out the GitHub repository, explore the existing code, and don’t hesitate to engage with the community to find out how you can contribute!

This is more than just a coding project; it’s about building the next generation of intelligent SOCs, empowering security professionals, and strengthening our defenses against evolving cyber threats.

From Alert Fatigue to Smart Triage: Building an LLM-Powered SOC Agent

2025-07-13T00:00:00+00:00

From Alert Fatigue to Smart Triage: Building an LLM‑Powered SOC Agent

“Security teams drown in tens of thousands of alerts every day. What if a lightweight language model could triage them for you in real time?”

1. The Pain: Alert Overload & MTTR

Security Operations Centers (SOCs) rely on SIEM and SOAR tools, but rule‑based playbooks often miss context, generating floods of false positives. Analysts spend hours weeding out noise, and Mean‑Time‑To‑Respond (MTTR) balloons.

2. Our Idea: Context‑Aware Enrichment With LLMs

We fine‑tuned DistilRoBERTa using LoRA adapters on a blended corpus of _CIC‑IDS 2018_ logs and our synthetic SOC‑Sim stream. The agent:

Enriches each alert with entity context (IP reputation, MITRE ATT\&CK techniques).
Clusters alerts that share root cause, shrinking queue length.
Prioritises by assigning a risk score using chain‑of‑thought prompting.

3. Architecture

┌──────────┐      ┌────────────┐     ┌─────────────┐
│  Logs    │──►──▶│ Preprocess │──►──▶│ LLM Enrich  │
└──────────┘      └────────────┘     └─────┬───────┘
                                           │  Clusters
                                           ▼
                                     ┌─────────────┐
                                     │  Prioritise │
                                     └─────┬───────┘
                                           ▼
                                     Analyst Dashboard

(A detailed diagram with component icons will be released in the repo’s /docs.)

4. Dataset & Training Pipeline

Dataset	Records	Label Strategy	Notes
CIC‑IDS 2018	2.9 M	Original attack labels	Cleaned & deduped
SOC‑Sim	1.2 M	Synthetic MITRE mapping	Covers phish, ransomware

Training lasted 4 h on a single RTX 4090. LoRA reduced GPU memory to < 12 GB.

5. Early Results

Metric	Rule‑based SOAR	LLM‑SOC‑Agent	Δ
MTTR (median)	47 min	32 min	↓ 32 %
False positives	18 %	11 %	↓ 7 pp
Analyst effort (alerts/day)	1 200	820	↓ 31 %

6. What’s Next

Real‑time Zeek telemetry ingest
Adversarial robustness testing (IBM ART)
Feedback loop to fine‑tune on analyst decisions

7. Call to Action

⭐ Star the repo: https://github.com/ai-soc-automation/LLM-SOC-Agent
🐞 File issues or suggest datasets
💬 Join the discussion on LinkedIn.

This post is part of my ongoing series on AI‑Driven SOC Automation. Browse the entire journey on the AI‑SOC page.

Detecting Network Anomalies with XGBoost and SMOTE

2025-06-20T00:00:00+00:00

✍️ Blog Title:

Detecting Network Anomalies with XGBoost and SMOTE: From Cybersecurity Logs to AI Models

🧠 Introduction

As someone transitioning from a cybersecurity background into AI, I recently challenged myself to turn raw network traffic into intelligent insights. The result? A complete machine learning pipeline that detects DoS (Denial-of-Service) attacks with 99.9%+ accuracy and AUC, built on top of real-world IoT traffic.

This project marks a key milestone in my journey — transforming my hands-on experience with logs and network security into a practical AI application.

🔍 What Problem Are We Solving?

Traditional intrusion detection systems (IDS) often fail to detect sophisticated or low-rate DoS attacks. Moreover, the volume of network logs and the class imbalance between normal and malicious traffic make this task even harder.

So I asked myself:

Can we use modern machine learning to detect anomalies directly from network logs?

💾 Dataset: IoTID20-Extended (2024)

We used the IoTID20-Extended dataset, a recent and comprehensive collection of real IoT network traffic. It includes labeled flows representing normal and various attack types — including DoS and DDoS.

📌 Dataset link: Kaggle – IoTID20 Dataset

🛠️ Approach Overview

We designed an end-to-end pipeline with the following stages:

Data Preprocessing
- Handle missing values, encode categorical features, scale numerical ones.
Feature Selection
- Used SelectKBest to extract top predictive features.
Class Balancing
- Applied SMOTE to synthetically oversample underrepresented attack traffic.
Model Training
- Used XGBoost, known for performance on tabular datasets.
Evaluation
- 10-Fold Cross-Validation using F1-score and ROC-AUC.

📈 Results

The model achieved:

✅ Accuracy: 100%
✅ F1 Score: 1.00
✅ ROC-AUC: 1.00

These results are exceptional, but they reflect a balanced, clean dataset. In real-world deployments, we’d expect slightly lower but still strong performance.

📊 Confusion Matrix and ROC Curve plots were also generated (see GitHub).

💡 Why This Matters

This project proves that AI can effectively augment traditional network security — not just by detecting anomalies, but by learning from raw or semi-structured data like logs. It’s a step toward AI-driven intrusion detection systems.

As a cybersecurity expert now stepping into AI, this fusion of domains is exactly where I plan to build next.

📂 Try It Yourself

Full project code, notebook, and results are available on GitHub:

🔗 GitHub Repo – Log Anomaly Detection

Includes:

Notebook with all steps
Visual results
Cleaned dataset path
README.md + requirements.txt

🚀 Next Steps

This is just the beginning. My roadmap includes:

Applying LLMs to raw .log files
Integrating SHAP/LIME for model explainability
Deploying real-time log anomaly detectors
Combining clustering + classification in hybrid models

👨‍💻 About Me

I’m Hazem Elbaz, a cybersecurity researcher shifting toward applied AI and intelligent automation in network defense.

🧭 Follow my journey of building real-world AI from the ground up at: 🔗 elbazhazem.github.io

❓Question for You

Have you tried using ML or AI in log analysis or cybersecurity? What tools or datasets worked for you?

👇 Let’s discuss in the comments.

My Roadmap: From Cybersecurity to Applied AI

2025-06-19T00:00:00+00:00

My Roadmap: From Cybersecurity to Applied AI

I’ve spent most of my career in cybersecurity.
In 2024, I decided to pivot — not away from cyber, but toward AI-powered security.

Here’s my roadmap for the transition.

🎯 Step 1: Define a Use Case

I didn’t start with models — I started with a problem:

“How can I make logs easier to understand and analyze?”

That became my first AI-for-cyber project.

📚 Step 2: Learn the Basics of AI/ML

I focused on:

Python for data and APIs
Numpy, Pandas
Scikit-learn for traditional models
HuggingFace + OpenAI for LLMs
LangChain for chaining prompts

🔬 Step 3: Build Something Small, Fast

→ Log Analyzer LLM
This was my MVP to apply what I learned.

📈 Step 4: Go Deeper

I’m now:

Learning clustering + classification
Experimenting with fine-tuning
Studying academic papers
Rebuilding my GitHub profile with applied projects

This blog is part of that effort.

I’m also:

Writing a research paper
Applying for academic/industry roles
Building a public portfolio of AI + cybersecurity tools

Lessons So Far

AI is not a destination — it’s a toolkit
Focus on usefulness, not hype
You don’t need a PhD to start

🔍 Curious about my work?
Check out my projects or connect on LinkedIn.

Why Cybersecurity Needs AI More Than Ever

2025-06-18T00:00:00+00:00

Why Cybersecurity Needs AI More Than Ever

Today’s cybersecurity teams are overloaded:

📈 Alert fatigue
⌛ Shortage of skilled analysts
🚨 False positives everywhere
🕵️‍♂️ Sophisticated, evasive threats

In a modern SOC, the real challenge isn’t detection — it’s prioritization and interpretation.

Where AI Can Help

1. Intelligent Summarization

LLMs can:

Digest 500 lines of logs
Summarize what happened
Highlight what matters

2. Threat Contextualization

Instead of just “block port 443”, LLMs can explain:

“This appears to be a reverse shell attempt based on behavior and timing.”

3. Automation of Repetitive Work

Categorize phishing emails
Triage alerts
Recommend mitigation steps

All these can be supported by fine-tuned models or simple LLM prompts.

This is Not a Future Vision — It’s Now

Tools like:

GPT-4 + Python
LangChain + SIEM integrations
Vector databases + threat intel are already being tested in production environments.

But Beware the Hype

AI ≠ Magic.

It needs validation
It requires tuning
It must be explainable

Final Thought

Cybersecurity needs more than automation.
It needs intelligent augmentation — and that’s where AI shines.

The future analyst is part human, part machine.

🤖 I’m exploring this space deeply in my own projects — see Log Analyzer LLM
🔁 Let’s co-build the next-gen SOC tools.

Lessons from Building My First Log Analyzer with GPT-4

2025-06-17T00:00:00+00:00

Lessons from Building My First Log Analyzer with GPT-4

When I started building Log Analyzer LLM, I had one goal:

“Make logs readable, fast.”

Logs are noisy, verbose, and contextless. I wanted an AI assistant that could summarize logs and highlight meaningful events — something traditional SIEMs don’t do well.

Here’s what I learned along the way.

1. Structure Matters More Than You Think

The biggest challenge?
Logs are not standardized.

Some logs are JSON. Others are multiline strings, or worse — key-value chaos.

Solution:
I started by building simple pre-processing steps to:

Remove noise and timestamps
Break logs into chunks
Group them by similarity

2. Prompt Engineering Is Critical

LLMs are powerful — but they need guidance.

💡 I tested several prompts:

“Summarize these log lines in plain English.”
“Detect any anomalies in these logs.”
“Explain what this log segment means.”

Best results came from combining:

System-level instructions (e.g., “You are a cybersecurity analyst.”)
Contextual samples (few-shot prompting)

3. Don’t Trust the AI Blindly

LLMs hallucinate. Always.

Sometimes it summarized error logs as “successful operations”.
Other times it guessed at causes.

🚨 Lesson: Always cross-check with known events or ground truth.

4. Python + OpenAI = Fast Prototyping

I used:

openai Python SDK
Simple .log file reader
Streamlit for UI (optional)

Within hours, I had a working proof of concept.

What’s Next?

Auto-grouping similar events (clustering)
Hybrid models: rules + LLMs
Anomaly scoring
Integration with real-time log streams

Building with GPT-4 taught me one thing:
AI isn’t perfect, but it’s incredibly useful if used right.

If you’re in cybersecurity, you should start experimenting.

🧠 Repo: Log Analyzer LLM
💬 DM me if you’re building something similar — let’s connect.

Hazem Elbaz

Academic and Professional Achievements – 2025

Academic and Professional Achievements – 2025

1. Academic and Teaching Activities

2. Research and Development

3. Transition Toward Applied and Open Research

4. Academic Leadership and Capacity Building

Skills Strengthened During 2025

Academic and Professional Skills

Technical Skills

Year Summary

Looking Ahead

Before You Build Detection… Make Sure You Have Collection

Before You Build Detection… Make Sure You Have Collection

🧠 Introduction

⚙️ Collection Comes Before Detection

🔍 Step One: Log Source Review

🗺️ When There’s No Asset Inventory or Network Diagram

📡 Are You Collecting the Right Data?

🧩 Key Takeaways

🔗 Read Next

🏷️ Recommended Tags

New Milestone Achieved

Key Skills I Gained from this Training

Building a SOC Home Lab from Zero — Catching Real Attackers on Azure

🧠 Building a SOC Home Lab from Zero — Catching Real Attackers on Azure

🚀 Introduction

🏗️ Why I Built This Project

🔍 Project Overview

⚙️ Step-by-Step Highlights

1️⃣ Setting up Azure

2️⃣ Deploying the Honeypot

3️⃣ Observing Attacks

4️⃣ Integrating with Sentinel

5️⃣ Enriching Data with GeoIP

6️⃣ Visualizing Attacks

📊 Results and Insights

🧠 Lessons Learned

🧩 Next Steps

📖 Full Documentation

🌐 About the Author

Unveiling LLM-SOC-Agent: Revolutionizing Security Operations with AI

Unveiling LLM-SOC-Agent: Revolutionizing Security Operations with AI

What is LLM-SOC-Agent?

Diving into the Code Repository

Contributing to the Future of SOC Automation

From Alert Fatigue to Smart Triage: Building an LLM-Powered SOC Agent

From Alert Fatigue to Smart Triage: Building an LLM‑Powered SOC Agent

1. The Pain: Alert Overload & MTTR

2. Our Idea: Context‑Aware Enrichment With LLMs

3. Architecture

4. Dataset & Training Pipeline

5. Early Results

6. What’s Next

7. Call to Action

Detecting Network Anomalies with XGBoost and SMOTE

✍️ Blog Title:

🧠 Introduction

🔍 What Problem Are We Solving?

💾 Dataset: IoTID20-Extended (2024)

🛠️ Approach Overview

📈 Results

💡 Why This Matters

📂 Try It Yourself

🚀 Next Steps

👨‍💻 About Me

❓Question for You

My Roadmap: From Cybersecurity to Applied AI

My Roadmap: From Cybersecurity to Applied AI

🎯 Step 1: Define a Use Case

📚 Step 2: Learn the Basics of AI/ML

🔬 Step 3: Build Something Small, Fast

📈 Step 4: Go Deeper

🧠 Step 5: Share, Reflect, and Publish

Lessons So Far

Why Cybersecurity Needs AI More Than Ever

Why Cybersecurity Needs AI More Than Ever

Where AI Can Help

1. Intelligent Summarization

2. Threat Contextualization