<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://elbazhazem.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://elbazhazem.github.io/" rel="alternate" type="text/html" hreflang="en" /><updated>2026-01-02T05:35:05+00:00</updated><id>https://elbazhazem.github.io/feed.xml</id><title type="html">Hazem Elbaz</title><subtitle>Made by Hazem Elbaz</subtitle><author><name>Hazem Elbaz</name></author><entry><title type="html">Academic and Professional Achievements – 2025</title><link href="https://elbazhazem.github.io/Achievement/" rel="alternate" type="text/html" title="Academic and Professional Achievements – 2025" /><published>2026-01-01T00:00:00+00:00</published><updated>2026-01-01T00:00:00+00:00</updated><id>https://elbazhazem.github.io/Achievement</id><content type="html" xml:base="https://elbazhazem.github.io/Achievement/"><![CDATA[<h1 id="academic-and-professional-achievements--2025">Academic and Professional Achievements – 2025</h1>

<p>As 2025 comes to a close, it is important to document a year marked by exceptional academic and professional commitment, unfolding within highly non-traditional and constrained circumstances. The year was shaped by severe humanitarian, institutional, and technical challenges resulting from the ongoing war and its direct impact on the academic and research environment in Gaza.</p>

<p>Despite these conditions, 2025 became a year of academic continuity, strategic adaptation, and a deliberate shift toward more applied and impactful research directions.</p>

<hr />

<h2 id="1-academic-and-teaching-activities">1. Academic and Teaching Activities</h2>

<p>Throughout 2025, I continued my work as a faculty member and researcher in the areas of:</p>

<ul>
  <li>Cybersecurity</li>
  <li>Network and Cloud Security</li>
  <li>Log Analysis and Anomaly Detection</li>
  <li>AI-Driven Security Operations Center (SOC) Automation</li>
</ul>

<p>This included university teaching, academic supervision of graduation projects and research, and the development of instructional content that bridges theoretical foundations with practical, industry-relevant applications.</p>

<hr />

<h2 id="2-research-and-development">2. Research and Development</h2>

<p>On the research front, my work during 2025 focused on:</p>

<ul>
  <li>Developing anomaly detection models using machine learning and deep learning techniques</li>
  <li>Integrating Large Language Models (LLMs) into security log analysis workflows</li>
  <li>Designing and prototyping AI-driven SOC automation frameworks</li>
  <li>Expanding and refining research papers targeting submission to peer-reviewed journals and conferences</li>
</ul>

<p>A significant portion of this work was conducted under severe operational constraints, including power outages, limited connectivity, and restricted access to computational resources, necessitating highly adaptive research planning and execution strategies.</p>

<hr />

<h2 id="3-transition-toward-applied-and-open-research">3. Transition Toward Applied and Open Research</h2>

<p>A key milestone in 2025 was a methodological shift toward applied, open, and reproducible research, with emphasis on:</p>

<ul>
  <li>Documenting research workflows and outputs through open-source GitHub repositories</li>
  <li>Building reusable and extensible research artifacts</li>
  <li>Aligning academic research with real-world SOC operational needs</li>
  <li>Promoting applied research culture among students and early-career researchers</li>
</ul>

<p>This transition was driven by a strategic objective to maximize scientific, educational, and societal impact in resource-constrained environments.</p>

<hr />

<h2 id="4-academic-leadership-and-capacity-building">4. Academic Leadership and Capacity Building</h2>

<p>In parallel with research and teaching responsibilities, I continued to contribute to:</p>

<ul>
  <li>Mentorship and academic guidance for students and graduates</li>
  <li>Participation in training and capacity-building initiatives</li>
  <li>Engagement in discussions related to research development and academic resilience</li>
  <li>Strategic planning for future initiatives aimed at creating more sustainable learning and research environments</li>
</ul>

<hr />

<h2 id="skills-strengthened-during-2025">Skills Strengthened During 2025</h2>

<h3 id="academic-and-professional-skills">Academic and Professional Skills</h3>
<ul>
  <li>Applied research design and execution</li>
  <li>Scientific writing and peer-review processes</li>
  <li>Research supervision and team building</li>
  <li>Research project management in high-risk environments</li>
</ul>

<h3 id="technical-skills">Technical Skills</h3>
<ul>
  <li>AI-Driven SOC Automation</li>
  <li>Log Analysis and Anomaly Detection</li>
  <li>LLM Integration for Cybersecurity</li>
  <li>Dataset Engineering and Evaluation Pipelines</li>
</ul>

<hr />

<h2 id="year-summary">Year Summary</h2>

<p>While 2025 was far from a conventional academic year, it proved to be a pivotal period for consolidating academic resilience, advancing applied research with tangible impact, and redefining the notion of achievement under fragile and constrained conditions.</p>

<hr />

<h2 id="looking-ahead">Looking Ahead</h2>

<p>As I move into 2026, my focus remains on:</p>

<ul>
  <li>Establishing international research collaborations</li>
  <li>Securing stable and safe academic and research opportunities</li>
  <li>Producing high-quality scientific publications</li>
  <li>Developing research frameworks that address local challenges while adhering to international academic standards</li>
</ul>]]></content><author><name>Dr. Hazem Abdul Qader Elbaz</name></author><category term="Achievements" /><category term="Academic" /><category term="Cybersecurity" /><category term="Cybersecurity" /><category term="Research" /><category term="SOC" /><category term="LLMs" /><category term="Anomaly Detection" /><category term="Gaza" /><category term="Academic Resilience" /><summary type="html"><![CDATA[A reflective academic and professional summary of 2025, focused on applied cybersecurity research, AI-driven SOC automation, and academic resilience under constrained conditions.]]></summary></entry><entry><title type="html">Before You Build Detection… Make Sure You Have Collection</title><link href="https://elbazhazem.github.io/before-you-build-detection-make-sure-you-have-collection/" rel="alternate" type="text/html" title="Before You Build Detection… Make Sure You Have Collection" /><published>2025-11-10T00:00:00+00:00</published><updated>2025-11-10T00:00:00+00:00</updated><id>https://elbazhazem.github.io/before-you-build-detection-make-sure-you-have-collection</id><content type="html" xml:base="https://elbazhazem.github.io/before-you-build-detection-make-sure-you-have-collection/"><![CDATA[<h1 id="before-you-build-detection-make-sure-you-have-collection">Before You Build Detection… Make Sure You Have Collection</h1>

<blockquote>
  <p>“There’s no detection without collection.”<br />
This simple truth is one of the most overlooked principles in modern SOC operations.</p>
</blockquote>

<hr />

<h2 id="-introduction">🧠 Introduction</h2>

<p>In every SOC I’ve seen, teams are eager to start writing <em>use cases</em>, mapping them to <em>MITRE ATT&amp;CK</em>, creating <em>SIEM rules</em>, and claiming, “We’re ready to detect any attack.”</p>

<p>But too often, they skip the step that makes all of this possible: <strong>data collection</strong>.</p>

<p>Before your detection rules can work, your SOC must have a solid foundation of telemetry — without it, even the best detection logic will fail silently.</p>

<hr />

<h2 id="️-collection-comes-before-detection">⚙️ Collection Comes Before Detection</h2>

<p>A Security Operations Center is an engineered system. The <strong>detection layer</strong> can’t function if the <strong>foundation layer</strong> (Telemetry &amp; Collection) is unstable.</p>

<p>You can have the most advanced correlation engine in the world, but if your critical systems aren’t generating or forwarding enough logs, your SOC will see nothing.</p>

<p>Every detection depends on <em>observable facts</em> from your <strong>sensors, agents, and integrations</strong>.<br />
Missing even one essential source — such as Windows Security Event Logs, EDR process telemetry, or DNS traffic — can create massive blind spots.</p>

<p>This becomes critical during <strong>lateral movement</strong> or <strong>privilege escalation</strong> attempts, where visibility gaps can completely hide attacker activity.</p>

<hr />

<h2 id="-step-one-log-source-review">🔍 Step One: Log Source Review</h2>

<p>Before writing any detection rules, conduct a <strong>comprehensive log source review</strong> — not a superficial checklist, but a technical validation that answers:</p>

<ol>
  <li>Is the source actually enabled and sending logs to the SIEM?</li>
  <li>Are the events complete, or are they truncated?</li>
  <li>Do the logs cover all necessary audit categories (authentication, file access, process creation, etc.)?</li>
  <li>Are the fields properly parsed and normalized?</li>
</ol>

<p>This gives you a <strong>true view of data coverage</strong>, not the assumed one.<br />
Only then can you safely connect your detection use cases to the log sources that actually support them.</p>

<hr />

<h2 id="️-when-theres-no-asset-inventory-or-network-diagram">🗺️ When There’s No Asset Inventory or Network Diagram</h2>

<p>This is one of the most common challenges for new SOC teams.<br />
You enter an environment with thousands of devices and servers — but no updated CMDB, and no clear network map.</p>

<p>In that case, use a <strong>Bottom-Up Visibility Mapping</strong> approach: build visibility from the telemetry you already have.</p>

<p>Start from your existing logs in the SIEM or EDR and gradually reconstruct the environment:</p>

<ol>
  <li>Identify active devices from endpoint data (<code class="language-plaintext highlighter-rouge">DeviceName</code>, <code class="language-plaintext highlighter-rouge">Hostname</code>, or <code class="language-plaintext highlighter-rouge">AgentID</code>).</li>
  <li>Map communication patterns from firewall or proxy logs.</li>
  <li>Extract user-to-device associations from Active Directory sign-ins.</li>
  <li>Analyze outbound connections to spot systems exposed to the internet.</li>
</ol>

<p>By doing this, you build a <strong>real-world inventory</strong> based on evidence — not assumptions — which becomes the backbone of your detection strategy.</p>

<hr />

<h2 id="-are-you-collecting-the-right-data">📡 Are You Collecting the Right Data?</h2>

<p>The more <strong>diverse your telemetry</strong>, the stronger your detection capabilities.</p>

<p>Examples:</p>
<ul>
  <li><strong>Endpoint logs</strong> → reveal process executions and local activity.</li>
  <li><strong>Network telemetry</strong> → exposes lateral movements.</li>
  <li><strong>Identity logs</strong> → highlight suspicious access behavior.</li>
  <li><strong>Cloud audit logs</strong> → track privileged operations in SaaS or IaaS.</li>
</ul>

<p>Regularly review your <strong>schema coverage</strong>:<br />
Make sure critical fields like <code class="language-plaintext highlighter-rouge">UserPrincipalName</code>, <code class="language-plaintext highlighter-rouge">DeviceId</code>, <code class="language-plaintext highlighter-rouge">IPAddress</code>, and <code class="language-plaintext highlighter-rouge">Timestamp</code> exist and are normalized.</p>

<p>Such consistency allows your correlation logic to connect dots accurately — the difference between catching an incident or missing an attack entirely.</p>

<hr />

<h2 id="-key-takeaways">🧩 Key Takeaways</h2>

<ul>
  <li>Visibility is the foundation of detection.</li>
  <li>Build your telemetry coverage before your rules.</li>
  <li>Review log sources as rigorously as you test detections.</li>
  <li>Correlate across data types to see the full attack surface.</li>
</ul>

<blockquote>
  <p><strong>No Collection, No Detection.</strong><br />
Every SOC’s power begins with what it can see.</p>
</blockquote>

<hr />

<h2 id="-read-next">🔗 Read Next</h2>

<p>If you’re building your own SOC or starting your journey into SIEM and detection engineering, check out:<br />
👉 <a href="https://medium.com/p/microsoft-sentinel-home-lab-setup-step-by-step-guide-c8a2677f34e0?source=social.tw"><strong>Microsoft Sentinel Home Lab Setup | Step-by-Step Guide</strong></a><br />
A complete hands-on tutorial to deploy Microsoft Sentinel, connect data sources, and simulate real detections.</p>

<hr />

<h2 id="️-recommended-tags">🏷️ Recommended Tags</h2>

<p><code class="language-plaintext highlighter-rouge">#CyberSecurity</code> <code class="language-plaintext highlighter-rouge">#SIEM</code> <code class="language-plaintext highlighter-rouge">#SOC</code> <code class="language-plaintext highlighter-rouge">#ThreatDetection</code> <code class="language-plaintext highlighter-rouge">#SecurityOperations</code><br />
<code class="language-plaintext highlighter-rouge">#MicrosoftSentinel</code> <code class="language-plaintext highlighter-rouge">#IncidentResponse</code> <code class="language-plaintext highlighter-rouge">#DetectionEngineering</code> <code class="language-plaintext highlighter-rouge">#CloudSecurity</code> <code class="language-plaintext highlighter-rouge">#SOCAnalysis</code></p>]]></content><author><name>Hazem Elbaz</name></author><category term="CyberSecurity" /><category term="SOC" /><category term="SIEM" /><category term="DetectionEngineering" /><category term="ThreatDetection" /><category term="MicrosoftSentinel" /><category term="CloudSecurity" /><summary type="html"><![CDATA[A deep dive into the critical role of data collection in SOC operations — why 'no collection, no detection' should guide your entire detection engineering process.]]></summary></entry><entry><title type="html">New Milestone Achieved</title><link href="https://elbazhazem.github.io/New-Milestone-Achieved/" rel="alternate" type="text/html" title="New Milestone Achieved" /><published>2025-11-06T00:00:00+00:00</published><updated>2025-11-06T00:00:00+00:00</updated><id>https://elbazhazem.github.io/New-Milestone-Achieved</id><content type="html" xml:base="https://elbazhazem.github.io/New-Milestone-Achieved/"><![CDATA[<p>✨ <strong>New Milestone Achieved!</strong></p>

<p>I’m pleased to share that I’ve successfully completed the <strong>“Business Model Canvas: A Tool for Entrepreneurs and Innovators”</strong> course from Kennesaw State University via Coursera.</p>

<p>Certificate: <a href="https://coursera.org/share/4bb3fb3a6f2e000cbeff4a6bbc0ea618">https://coursera.org/share/4bb3fb3a6f2e000cbeff4a6bbc0ea618</a>
Course reference: <a href="https://www.coursera.org/learn/business-model-canvas/">https://www.coursera.org/learn/business-model-canvas/</a></p>

<hr />

<h3 id="key-skills-i-gained-from-this-training"><strong>Key Skills I Gained from this Training</strong></h3>

<ul>
  <li>Business Model Design &amp; Structuring</li>
  <li>Value Proposition Development</li>
  <li>Customer Segmentation &amp; Market Fit Thinking</li>
  <li>Go-to-Market Strategy Logic</li>
  <li>Revenue Streams &amp; Cost Structure Mapping</li>
  <li>Lean Startup mindset &amp; hypothesis testing</li>
  <li>Translating technical solutions into investor-friendly business language</li>
</ul>

<p>These skills are extremely relevant to my current direction in <strong>AI-Driven SOC Automation</strong> — helping me bridge between cybersecurity research and business execution.</p>

<hr />

<p>This course was a strong addition to my learning pipeline, especially as I continue shaping my upcoming technology venture and refining the business logic behind productizing AI-SOC solutions.</p>

<hr />

<p><strong>Question to my network:</strong>
What course or recent learning experience gave you a major “perspective shift” in connecting technical work to business value?</p>

<p>#ContinuousLearning #BusinessModelCanvas #CyberSecurity #AISOC #Entrepreneurship #ProfessionalGrowth</p>

<hr />]]></content><author><name>Hazem Elbaz</name></author><category term="ContinuousLearning" /><category term="BusinessModelCanvas" /><category term="CyberSecurity" /><category term="AISOC" /><category term="Entrepreneurship" /><category term="ProfessionalGrowth" /><category term="Innovation" /><category term="StartupBuilding" /><category term="ValuePropositionDesign" /><summary type="html"><![CDATA[A short reflection on completing the Business Model Canvas course from Kennesaw State University on Coursera, highlighting the new strategic skills gained and how they support my AI-Driven SOC Automation journey.]]></summary></entry><entry><title type="html">Building a SOC Home Lab from Zero — Catching Real Attackers on Azure</title><link href="https://elbazhazem.github.io/Building-SOC-Home-Lab-from-Zero/" rel="alternate" type="text/html" title="Building a SOC Home Lab from Zero — Catching Real Attackers on Azure" /><published>2025-10-05T00:00:00+00:00</published><updated>2025-10-05T00:00:00+00:00</updated><id>https://elbazhazem.github.io/Building-SOC-Home-Lab-from-Zero</id><content type="html" xml:base="https://elbazhazem.github.io/Building-SOC-Home-Lab-from-Zero/"><![CDATA[<h1 id="-building-a-soc-home-lab-from-zero--catching-real-attackers-on-azure">🧠 Building a SOC Home Lab from Zero — Catching Real Attackers on Azure</h1>

<blockquote>
  <p><em>“Every attack is a lesson — the key is building systems that learn faster than attackers do.”</em><br />
— <strong>Dr. Hazem A. Elbaz</strong></p>
</blockquote>

<hr />

<h2 id="-introduction">🚀 Introduction</h2>

<p>In this post, I’ll walk you through one of my most exciting hands-on projects — building a <strong>Security Operations Center (SOC) from scratch</strong> using <strong>Microsoft Azure’s free tier</strong> and <strong>Microsoft Sentinel</strong>.<br />
This project is not just theoretical; it captures <strong>real-world cyberattacks</strong> and transforms them into <strong>actionable intelligence</strong> through dashboards and live maps.</p>

<p>Whether you’re a <strong>cybersecurity student</strong>, <strong>SOC analyst</strong>, or <strong>researcher</strong>, this lab is an ideal starting point to explore how professional SOC environments detect, collect, and analyze threats in real time.</p>

<hr />

<h2 id="️-why-i-built-this-project">🏗️ Why I Built This Project</h2>

<p>After years of teaching and researching cybersecurity, I wanted to design a lab that:</p>
<ul>
  <li><strong>Bridges theory and reality</strong> — by exposing a honeypot to actual attackers.</li>
  <li><strong>Empowers learners</strong> — to build and observe a functioning SOC environment.</li>
  <li><strong>Showcases portfolio-ready skills</strong> — for anyone pursuing a cybersecurity career.</li>
</ul>

<p>By using Azure’s free resources, anyone can replicate this setup safely and affordably.</p>

<hr />

<h2 id="-project-overview">🔍 Project Overview</h2>

<p>Here’s what the home SOC includes:</p>

<table>
  <thead>
    <tr>
      <th>Component</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>Azure Subscription (Free Tier)</strong></td>
      <td>Deploys all resources at zero cost.</td>
    </tr>
    <tr>
      <td><strong>Honeypot VM</strong></td>
      <td>A Windows 10 machine deliberately exposed to attackers.</td>
    </tr>
    <tr>
      <td><strong>Log Analytics Workspace (LAW)</strong></td>
      <td>Centralized log storage and analysis engine.</td>
    </tr>
    <tr>
      <td><strong>Microsoft Sentinel</strong></td>
      <td>SIEM platform for correlation, alerting, and visualization.</td>
    </tr>
    <tr>
      <td><strong>Live Attack Map</strong></td>
      <td>Displays attack origins in real time.</td>
    </tr>
  </tbody>
</table>

<hr />

<h2 id="️-step-by-step-highlights">⚙️ Step-by-Step Highlights</h2>

<h3 id="1️⃣-setting-up-azure">1️⃣ Setting up Azure</h3>
<p>Create a free Azure subscription and configure:</p>
<ul>
  <li>Resource Group</li>
  <li>Virtual Network (VNet)</li>
  <li>Virtual Machine (Windows 10 Honeypot)</li>
</ul>

<h3 id="2️⃣-deploying-the-honeypot">2️⃣ Deploying the Honeypot</h3>
<p>Expose the VM intentionally:</p>
<ul>
  <li>Delete RDP security rules.</li>
  <li>Allow all inbound traffic.</li>
  <li>Disable Windows Firewall to attract attacks.</li>
</ul>

<p>⚠️ <em>This should be done only in an isolated lab environment.</em></p>

<h3 id="3️⃣-observing-attacks">3️⃣ Observing Attacks</h3>
<p>Within minutes, automated bots start brute-forcing your VM.</p>

<p>Monitor <strong>Event ID 4625 (Failed Login)</strong> using Windows Event Viewer:</p>
<ul>
  <li>Username attempted</li>
  <li>IP address</li>
  <li>Failure reason</li>
</ul>

<h3 id="4️⃣-integrating-with-sentinel">4️⃣ Integrating with Sentinel</h3>
<p>Use the <strong>Azure Monitor Agent</strong> to forward logs to <strong>Log Analytics Workspace</strong>.<br />
Then, connect Sentinel to the workspace for correlation and visualization.</p>

<p>Sample KQL Query:</p>
<pre><code class="language-kql">SecurityEvent
| where EventID == 4625
| project TimeGenerated, Account, IpAddress = tostring(parse_json(AdditionalFields)["IpAddress"])
| sort by TimeGenerated desc
</code></pre>

<h3 id="5️⃣-enriching-data-with-geoip">5️⃣ Enriching Data with GeoIP</h3>

<p>Import <code class="language-plaintext highlighter-rouge">geoip-summarized.csv</code> as a <strong>Sentinel Watchlist</strong> to map attacks to their geographic origins.</p>

<h3 id="6️⃣-visualizing-attacks">6️⃣ Visualizing Attacks</h3>

<p>Create a custom Sentinel <strong>Workbook</strong> using <code class="language-plaintext highlighter-rouge">map.json</code> to generate a <strong>live global attack map</strong>.
You’ll see where attackers are coming from — in real time.</p>

<hr />

<h2 id="-results-and-insights">📊 Results and Insights</h2>

<p>Within hours of exposure, the honeypot began receiving:</p>

<ul>
  <li>Hundreds of <strong>failed login attempts</strong>.</li>
  <li>Attack sources from <strong>over 50 countries</strong>.</li>
  <li>Common usernames like <code class="language-plaintext highlighter-rouge">admin</code>, <code class="language-plaintext highlighter-rouge">test</code>, and <code class="language-plaintext highlighter-rouge">employee</code>.</li>
</ul>

<p>These logs reflect the <strong>global nature of cyber threats</strong> and demonstrate how SOCs continuously analyze suspicious activities to safeguard systems.</p>

<hr />

<h2 id="-lessons-learned">🧠 Lessons Learned</h2>

<ul>
  <li><strong>Attack simulation</strong> is a powerful learning tool.</li>
  <li>Understanding <strong>Event ID 4625</strong> is essential for brute-force detection.</li>
  <li><strong>KQL</strong> is a must-know language for any SOC analyst.</li>
  <li>Visual dashboards turn complex data into clear stories for decision-makers.</li>
</ul>

<hr />

<h2 id="-next-steps">🧩 Next Steps</h2>

<p>Future enhancements:</p>

<ul>
  <li>Integrate <strong>Sysmon</strong> for deeper telemetry.</li>
  <li>Automate alerts with <strong>Logic Apps</strong>.</li>
  <li>Extend to <strong>multi-cloud monitoring</strong> (AWS / GCP).</li>
  <li>Apply <strong>AI models or LLMs</strong> to summarize log anomalies.</li>
</ul>

<hr />

<h2 id="-full-documentation">📖 Full Documentation</h2>

<p>All setup instructions, queries, and diagrams are available in the public repository:
👉 <a href="https://github.com/elbazhazem/SOC-Home-Lab"><strong>GitHub: SOC Home Lab from Zero</strong></a></p>

<p>For a detailed tutorial and reflections, read the Medium article:
👉 <a href="https://medium.com/@hazem.baz/soc-home-lab-on-azure-from-zero-catching-real-attackers-6e377afee7aa"><strong>Building a SOC Home Lab from Zero — Catching Real Attackers on Azure</strong></a> <em>(replace with your final post URL)</em></p>

<hr />

<h2 id="-about-the-author">🌐 About the Author</h2>

<p><strong>Dr. Hazem A. Elbaz</strong>
Assistant Professor of Cybersecurity | SOC Automation Researcher | AI-SOC Founder
<a href="https://elbazhazem.github.io">Website</a> • <a href="https://www.linkedin.com/in/hazem-elbaz">LinkedIn</a> • <a href="https://github.com/elbazhazem">GitHub</a></p>

<hr />]]></content><author><name>Hazem Elbaz</name></author><category term="Cybersecurity" /><category term="SOC" /><category term="Microsoft Sentinel" /><category term="Azure" /><category term="SIEM" /><category term="Practical Labs" /><summary type="html"><![CDATA[A hands-on journey into building a functional Security Operations Center (SOC) using free Azure resources and Microsoft Sentinel — from honeypot setup to live attack visualization.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://elbazhazem.github.io/assets/soc-home-lab-banner.png" /><media:content medium="image" url="https://elbazhazem.github.io/assets/soc-home-lab-banner.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Unveiling LLM-SOC-Agent: Revolutionizing Security Operations with AI</title><link href="https://elbazhazem.github.io/unveiling-LLM-SOC-Agent/" rel="alternate" type="text/html" title="Unveiling LLM-SOC-Agent: Revolutionizing Security Operations with AI" /><published>2025-07-19T00:00:00+00:00</published><updated>2025-07-19T00:00:00+00:00</updated><id>https://elbazhazem.github.io/unveiling-LLM-SOC-Agent</id><content type="html" xml:base="https://elbazhazem.github.io/unveiling-LLM-SOC-Agent/"><![CDATA[<h2 id="unveiling-llm-soc-agent-revolutionizing-security-operations-with-ai">Unveiling LLM-SOC-Agent: Revolutionizing Security Operations with AI</h2>

<p>In the ever-evolving landscape of cybersecurity, Security Operations Centers (SOCs) are constantly battling an increasing volume and sophistication of threats. The manual burden on analysts is immense, leading to alert fatigue and a struggle to keep pace. This is precisely where the <strong>LLM-SOC-Agent</strong> project steps in, aiming to transform traditional SOC operations through the power of Large Language Models (LLMs) and intelligent automation.</p>

<p>The LLM-SOC-Agent, an integral part of the broader <a href="https://github.com/ai-soc-automation">AI-SOC-Automation</a> initiative, is an open-source endeavor focused on building a multi-agent security framework. This project envisions a future where LLMs act as intelligent assistants, capable of analyzing vast amounts of security data, generating comprehensive insights, and even executing response actions autonomously.</p>

<h3 id="what-is-llm-soc-agent">What is LLM-SOC-Agent?</h3>

<p>At its core, LLM-SOC-Agent leverages multiple LLM models to analyze and generate security briefs, effectively acting as an AI-driven SOC analyst. The project’s goal is to go beyond simple text generation, enabling LLMs to understand context, reason through security scenarios, and make informed decisions.</p>

<p>Key features and functionalities being developed within LLM-SOC-Agent include:</p>

<ul>
  <li><strong>Threat Intelligence Analysis:</strong> Processing and summarizing threat intelligence data to provide actionable insights on emerging threats.</li>
  <li><strong>Log Analysis:</strong> Identifying anomalies and suspicious activities within vast volumes of log data.</li>
  <li><strong>Vulnerability Assessment:</strong> Assessing vulnerabilities and summarizing critical exposures.</li>
  <li><strong>Incident Response:</strong> Evaluating security incidents and recommending appropriate response actions.</li>
  <li><strong>Overseer Summary:</strong> Generating a final, consolidated summary brief based on the outputs of various specialized agents.</li>
</ul>

<p>The project emphasizes a modular design, allowing for individual agents to handle specific tasks and then collaborate to achieve complex security objectives. This agentic approach is crucial for breaking down intricate security problems into manageable, AI-addressable components.</p>

<h3 id="diving-into-the-code-repository">Diving into the Code Repository</h3>

<p>The LLM-SOC-Agent GitHub repository (<a href="https://github.com/ai-soc-automation/LLM-SOC-Agent">https://github.com/ai-soc-automation/LLM-SOC-Agent</a>) is where the magic happens. While the specifics of the code structure can evolve, you’ll typically find:</p>

<ul>
  <li><strong>Agent Modules:</strong> Python scripts or directories dedicated to each specialized agent (e.g., <code class="language-plaintext highlighter-rouge">threat_intel_agent.py</code>, <code class="language-plaintext highlighter-rouge">log_analysis_agent.py</code>). These modules likely contain the logic for interacting with LLMs, processing specific data types, and generating targeted outputs.</li>
  <li><strong>Core Orchestration:</strong> Files responsible for coordinating the activities of different agents, defining workflows, and managing the overall execution flow. This might involve setting up communication channels between agents and handling the aggregation of their individual analyses.</li>
  <li><strong>Data Handling:</strong> Scripts or utilities for data ingestion, preprocessing, and formatting to prepare security data for LLM consumption. The project currently reads <code class="language-plaintext highlighter-rouge">.txt</code> files, but future iterations could involve integration with SIEMs, threat intelligence platforms, and other security tools.</li>
  <li><strong>Configuration:</strong> Files to manage API keys, model selections (e.g., local LLMs via Ollama, or cloud-based LLMs like those from Together API), and other project settings.</li>
  <li><strong>Examples and Demos:</strong> Sample data and scripts to showcase the agent’s capabilities and provide a starting point for users and contributors.</li>
</ul>

<p>The development often involves leveraging LLM frameworks to simplify the process of building intelligent agents, managing their memory, decision-making processes, and tool integrations. This allows the project to focus on the security-specific logic rather than reinventing the wheel for LLM interactions.</p>

<h3 id="contributing-to-the-future-of-soc-automation">Contributing to the Future of SOC Automation</h3>

<p>The LLM-SOC-Agent project is a fantastic opportunity for anyone passionate about cybersecurity, AI, and open-source development. Contributions are welcomed from individuals with diverse skill sets, including:</p>

<ul>
  <li><strong>Cybersecurity Analysts/Engineers:</strong> Provide domain expertise, define use cases, and validate the accuracy and effectiveness of the AI agents.</li>
  <li><strong>Machine Learning Engineers/Data Scientists:</strong> Develop and fine-tune LLM models, implement new anomaly detection algorithms, and improve the overall intelligence of the agents.</li>
  <li><strong>Software Developers:</strong> Build new agent modules, enhance existing code, integrate with other security tools, and improve the project’s scalability and robustness.</li>
  <li><strong>Researchers:</strong> Explore novel applications of LLMs in cybersecurity, contribute to the theoretical foundations, and propose innovative solutions.</li>
</ul>

<p>If you’re looking to make a tangible impact on the future of security operations and work with cutting-edge AI technologies, the LLM-SOC-Agent project offers a collaborative environment to learn, build, and innovate. Check out the GitHub repository, explore the existing code, and don’t hesitate to engage with the community to find out how you can contribute!</p>

<p>This is more than just a coding project; it’s about building the next generation of intelligent SOCs, empowering security professionals, and strengthening our defenses against evolving cyber threats.</p>]]></content><author><name>Hazem Elbaz</name></author><category term="LLM" /><category term="SOC" /><category term="Automation" /><category term="Cybersecurity" /><summary type="html"><![CDATA[This project envisions a future where LLMs act as intelligent assistants and even executing response actions autonomously.]]></summary></entry><entry><title type="html">From Alert Fatigue to Smart Triage: Building an LLM-Powered SOC Agent</title><link href="https://elbazhazem.github.io/llm-soc-agent-intro/" rel="alternate" type="text/html" title="From Alert Fatigue to Smart Triage: Building an LLM-Powered SOC Agent" /><published>2025-07-13T00:00:00+00:00</published><updated>2025-07-13T00:00:00+00:00</updated><id>https://elbazhazem.github.io/llm-soc-agent-intro</id><content type="html" xml:base="https://elbazhazem.github.io/llm-soc-agent-intro/"><![CDATA[<h1 id="from-alert-fatigue-to-smart-triage-building-an-llmpowered-soc-agent">From Alert Fatigue to Smart Triage: Building an LLM‑Powered SOC Agent</h1>

<blockquote>
  <p><em>“Security teams drown in tens of thousands of alerts every day. What if a lightweight language model could triage them for you in real time?”</em></p>
</blockquote>

<h2 id="1-the-pain-alert-overload--mttr">1. The Pain: Alert Overload &amp; MTTR</h2>

<p>Security Operations Centers (SOCs) rely on SIEM and SOAR tools, but <strong>rule‑based playbooks</strong> often miss context, generating floods of false positives. Analysts spend hours weeding out noise, and <strong>Mean‑Time‑To‑Respond (MTTR)</strong> balloons.</p>

<h2 id="2-our-idea-contextaware-enrichment-with-llms">2. Our Idea: Context‑Aware Enrichment With LLMs</h2>

<p>We fine‑tuned <strong>DistilRoBERTa</strong> using LoRA adapters on a blended corpus of _CIC‑IDS 2018_ logs and our synthetic <strong>SOC‑Sim</strong> stream. The agent:</p>

<ol>
  <li><strong>Enriches</strong> each alert with entity context (IP reputation, MITRE ATT\&amp;CK techniques).</li>
  <li><strong>Clusters</strong> alerts that share root cause, shrinking queue length.</li>
  <li><strong>Prioritises</strong> by assigning a risk score using chain‑of‑thought prompting.</li>
</ol>

<h2 id="3-architecture">3. Architecture</h2>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌──────────┐      ┌────────────┐     ┌─────────────┐
│  Logs    │──►──▶│ Preprocess │──►──▶│ LLM Enrich  │
└──────────┘      └────────────┘     └─────┬───────┘
                                           │  Clusters
                                           ▼
                                     ┌─────────────┐
                                     │  Prioritise │
                                     └─────┬───────┘
                                           ▼
                                     Analyst Dashboard
</code></pre></div></div>

<p><em>(A detailed diagram with component icons will be released in the repo’s <code class="language-plaintext highlighter-rouge">/docs</code>.)</em></p>

<h2 id="4-dataset--training-pipeline">4. Dataset &amp; Training Pipeline</h2>

<table>
  <thead>
    <tr>
      <th>Dataset</th>
      <th>Records</th>
      <th>Label Strategy</th>
      <th>Notes</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>CIC‑IDS 2018</td>
      <td>2.9 M</td>
      <td>Original attack labels</td>
      <td>Cleaned &amp; deduped</td>
    </tr>
    <tr>
      <td>SOC‑Sim</td>
      <td>1.2 M</td>
      <td>Synthetic MITRE mapping</td>
      <td>Covers phish, ransomware</td>
    </tr>
  </tbody>
</table>

<p>Training lasted <strong>4 h on a single RTX 4090</strong>. LoRA reduced GPU memory to &lt; 12 GB.</p>

<h2 id="5-early-results">5. Early Results</h2>

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Rule‑based SOAR</th>
      <th>LLM‑SOC‑Agent</th>
      <th>Δ</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>MTTR (median)</td>
      <td>47 min</td>
      <td><strong>32 min</strong></td>
      <td>↓ 32 %</td>
    </tr>
    <tr>
      <td>False positives</td>
      <td>18 %</td>
      <td><strong>11 %</strong></td>
      <td>↓ 7 pp</td>
    </tr>
    <tr>
      <td>Analyst effort (alerts/day)</td>
      <td>1 200</td>
      <td><strong>820</strong></td>
      <td>↓ 31 %</td>
    </tr>
  </tbody>
</table>

<h2 id="6-whats-next">6. What’s Next</h2>

<ul>
  <li>Real‑time Zeek telemetry ingest</li>
  <li>Adversarial robustness testing (IBM ART)</li>
  <li>Feedback loop to fine‑tune on analyst decisions</li>
</ul>

<h2 id="7-call-to-action">7. Call to Action</h2>

<ul>
  <li>⭐ <strong>Star</strong> the repo: <a href="https://github.com/ai-soc-automation/LLM-SOC-Agent">https://github.com/ai-soc-automation/LLM-SOC-Agent</a></li>
  <li>🐞 <strong>File issues</strong> or suggest datasets</li>
  <li>💬 Join the discussion on <a href="https://www.linkedin.com/in/hazemelbaz">LinkedIn</a>.</li>
</ul>

<hr />

<blockquote>
  <p><em>This post is part of my ongoing series on <strong>AI‑Driven SOC Automation</strong>. Browse the entire journey on the <a href="/projects/ai-soc-automation/">AI‑SOC page</a>.</em></p>
</blockquote>]]></content><author><name>Hazem Elbaz</name></author><category term="LLM" /><category term="SOC" /><category term="Automation" /><category term="Cybersecurity" /><summary type="html"><![CDATA[How we cut Mean‑Time‑To‑Respond by 30 % with a lightweight LLM pipeline.]]></summary></entry><entry><title type="html">Detecting Network Anomalies with XGBoost and SMOTE</title><link href="https://elbazhazem.github.io/Detecting-Network-Anomalies/" rel="alternate" type="text/html" title="Detecting Network Anomalies with XGBoost and SMOTE" /><published>2025-06-20T00:00:00+00:00</published><updated>2025-06-20T00:00:00+00:00</updated><id>https://elbazhazem.github.io/Detecting-Network-Anomalies</id><content type="html" xml:base="https://elbazhazem.github.io/Detecting-Network-Anomalies/"><![CDATA[<h3 id="️-blog-title">✍️ <strong>Blog Title:</strong></h3>

<p><strong>Detecting Network Anomalies with XGBoost and SMOTE: From Cybersecurity Logs to AI Models</strong></p>

<hr />

<h3 id="-introduction">🧠 <strong>Introduction</strong></h3>

<p>As someone transitioning from a cybersecurity background into AI, I recently challenged myself to turn raw network traffic into intelligent insights. The result? A complete machine learning pipeline that detects DoS (Denial-of-Service) attacks with <strong>99.9%+ accuracy and AUC</strong>, built on top of real-world IoT traffic.</p>

<p>This project marks a key milestone in my journey — transforming my hands-on experience with logs and network security into a practical AI application.</p>

<hr />

<h3 id="-what-problem-are-we-solving">🔍 <strong>What Problem Are We Solving?</strong></h3>

<p>Traditional intrusion detection systems (IDS) often fail to detect sophisticated or low-rate DoS attacks. Moreover, the volume of network logs and the class imbalance between normal and malicious traffic make this task even harder.</p>

<p>So I asked myself:</p>

<blockquote>
  <p><em>Can we use modern machine learning to detect anomalies directly from network logs?</em></p>
</blockquote>

<hr />

<h3 id="-dataset-iotid20-extended-2024">💾 <strong>Dataset: IoTID20-Extended (2024)</strong></h3>

<p>We used the <strong>IoTID20-Extended dataset</strong>, a recent and comprehensive collection of real IoT network traffic. It includes labeled flows representing normal and various attack types — including DoS and DDoS.</p>

<p>📌 Dataset link: <a href="https://www.kaggle.com/datasets/rohulaminlabid/iotid20-dataset">Kaggle – IoTID20 Dataset</a></p>

<hr />

<h3 id="️-approach-overview">🛠️ <strong>Approach Overview</strong></h3>

<p>We designed an end-to-end pipeline with the following stages:</p>

<ol>
  <li>
    <p><strong>Data Preprocessing</strong></p>

    <ul>
      <li>Handle missing values, encode categorical features, scale numerical ones.</li>
    </ul>
  </li>
  <li>
    <p><strong>Feature Selection</strong></p>

    <ul>
      <li>Used <code class="language-plaintext highlighter-rouge">SelectKBest</code> to extract top predictive features.</li>
    </ul>
  </li>
  <li>
    <p><strong>Class Balancing</strong></p>

    <ul>
      <li>Applied <code class="language-plaintext highlighter-rouge">SMOTE</code> to synthetically oversample underrepresented attack traffic.</li>
    </ul>
  </li>
  <li>
    <p><strong>Model Training</strong></p>

    <ul>
      <li>Used <code class="language-plaintext highlighter-rouge">XGBoost</code>, known for performance on tabular datasets.</li>
    </ul>
  </li>
  <li>
    <p><strong>Evaluation</strong></p>

    <ul>
      <li>10-Fold Cross-Validation using <code class="language-plaintext highlighter-rouge">F1-score</code> and <code class="language-plaintext highlighter-rouge">ROC-AUC</code>.</li>
    </ul>
  </li>
</ol>

<hr />

<h3 id="-results">📈 <strong>Results</strong></h3>

<p>The model achieved:</p>

<ul>
  <li>✅ <strong>Accuracy</strong>: 100%</li>
  <li>✅ <strong>F1 Score</strong>: 1.00</li>
  <li>✅ <strong>ROC-AUC</strong>: 1.00</li>
</ul>

<blockquote>
  <p>These results are exceptional, but they reflect a balanced, clean dataset. In real-world deployments, we’d expect slightly lower but still strong performance.</p>
</blockquote>

<p>📊 Confusion Matrix and ROC Curve plots were also generated (see GitHub).</p>

<hr />

<h3 id="-why-this-matters">💡 <strong>Why This Matters</strong></h3>

<p>This project proves that AI can effectively augment traditional network security — not just by detecting anomalies, but by <strong>learning from raw or semi-structured data</strong> like logs. It’s a step toward <strong>AI-driven intrusion detection systems</strong>.</p>

<p>As a cybersecurity expert now stepping into AI, this fusion of domains is exactly where I plan to build next.</p>

<hr />

<h3 id="-try-it-yourself">📂 <strong>Try It Yourself</strong></h3>

<p>Full project code, notebook, and results are available on GitHub:</p>

<p>🔗 <a href="https://github.com/elbazhazem/log-anomaly-detection">GitHub Repo – Log Anomaly Detection</a></p>

<p>Includes:</p>

<ul>
  <li>Notebook with all steps</li>
  <li>Visual results</li>
  <li>Cleaned dataset path</li>
  <li><code class="language-plaintext highlighter-rouge">README.md</code> + <code class="language-plaintext highlighter-rouge">requirements.txt</code></li>
</ul>

<hr />

<h3 id="-next-steps">🚀 <strong>Next Steps</strong></h3>

<p>This is just the beginning. My roadmap includes:</p>

<ul>
  <li>Applying LLMs to raw <code class="language-plaintext highlighter-rouge">.log</code> files</li>
  <li>Integrating SHAP/LIME for model explainability</li>
  <li>Deploying real-time log anomaly detectors</li>
  <li>Combining clustering + classification in hybrid models</li>
</ul>

<hr />

<h3 id="-about-me">👨‍💻 About Me</h3>

<p>I’m <strong>Hazem Elbaz</strong>, a cybersecurity researcher shifting toward applied AI and intelligent automation in network defense.</p>

<p>🧭 Follow my journey of building real-world AI from the ground up at:
🔗 <a href="https://elbazhazem.github.io">elbazhazem.github.io</a></p>

<hr />

<h3 id="question-for-you">❓Question for You</h3>

<p>Have you tried using ML or AI in log analysis or cybersecurity? What tools or datasets worked for you?</p>

<p>👇 Let’s discuss in the comments.</p>]]></content><author><name>Hazem Elbaz</name></author><summary type="html"><![CDATA[✍️ Blog Title: Detecting Network Anomalies with XGBoost and SMOTE: From Cybersecurity Logs to AI Models 🧠 Introduction As someone transitioning from a cybersecurity background into AI, I recently challenged myself to turn raw network traffic into intelligent insights. The result? A complete machine learning pipeline that detects DoS (Denial-of-Service) attacks with 99.9%+ accuracy and AUC, built on top of real-world IoT traffic. This project marks a key milestone in my journey — transforming my hands-on experience with logs and network security into a practical AI application. 🔍 What Problem Are We Solving? Traditional intrusion detection systems (IDS) often fail to detect sophisticated or low-rate DoS attacks. Moreover, the volume of network logs and the class imbalance between normal and malicious traffic make this task even harder. So I asked myself: Can we use modern machine learning to detect anomalies directly from network logs? 💾 Dataset: IoTID20-Extended (2024) We used the IoTID20-Extended dataset, a recent and comprehensive collection of real IoT network traffic. It includes labeled flows representing normal and various attack types — including DoS and DDoS. 📌 Dataset link: Kaggle – IoTID20 Dataset 🛠️ Approach Overview We designed an end-to-end pipeline with the following stages: Data Preprocessing Handle missing values, encode categorical features, scale numerical ones. Feature Selection Used SelectKBest to extract top predictive features. Class Balancing Applied SMOTE to synthetically oversample underrepresented attack traffic. Model Training Used XGBoost, known for performance on tabular datasets. Evaluation 10-Fold Cross-Validation using F1-score and ROC-AUC. 📈 Results The model achieved: ✅ Accuracy: 100% ✅ F1 Score: 1.00 ✅ ROC-AUC: 1.00 These results are exceptional, but they reflect a balanced, clean dataset. In real-world deployments, we’d expect slightly lower but still strong performance. 📊 Confusion Matrix and ROC Curve plots were also generated (see GitHub). 💡 Why This Matters This project proves that AI can effectively augment traditional network security — not just by detecting anomalies, but by learning from raw or semi-structured data like logs. It’s a step toward AI-driven intrusion detection systems. As a cybersecurity expert now stepping into AI, this fusion of domains is exactly where I plan to build next. 📂 Try It Yourself Full project code, notebook, and results are available on GitHub: 🔗 GitHub Repo – Log Anomaly Detection Includes: Notebook with all steps Visual results Cleaned dataset path README.md + requirements.txt 🚀 Next Steps This is just the beginning. My roadmap includes: Applying LLMs to raw .log files Integrating SHAP/LIME for model explainability Deploying real-time log anomaly detectors Combining clustering + classification in hybrid models 👨‍💻 About Me I’m Hazem Elbaz, a cybersecurity researcher shifting toward applied AI and intelligent automation in network defense. 🧭 Follow my journey of building real-world AI from the ground up at: 🔗 elbazhazem.github.io ❓Question for You Have you tried using ML or AI in log analysis or cybersecurity? What tools or datasets worked for you? 👇 Let’s discuss in the comments.]]></summary></entry><entry><title type="html">My Roadmap: From Cybersecurity to Applied AI</title><link href="https://elbazhazem.github.io/roadmap-cyber-to-ai/" rel="alternate" type="text/html" title="My Roadmap: From Cybersecurity to Applied AI" /><published>2025-06-19T00:00:00+00:00</published><updated>2025-06-19T00:00:00+00:00</updated><id>https://elbazhazem.github.io/roadmap-cyber-to-ai</id><content type="html" xml:base="https://elbazhazem.github.io/roadmap-cyber-to-ai/"><![CDATA[<h1 id="my-roadmap-from-cybersecurity-to-applied-ai">My Roadmap: From Cybersecurity to Applied AI</h1>

<p>I’ve spent most of my career in cybersecurity.<br />
In 2024, I decided to pivot — not away from cyber, but <strong>toward AI-powered security</strong>.</p>

<p>Here’s my roadmap for the transition.</p>

<hr />

<h2 id="-step-1-define-a-use-case">🎯 Step 1: Define a Use Case</h2>

<p>I didn’t start with models — I started with <strong>a problem</strong>:</p>
<blockquote>
  <p>“How can I make logs easier to understand and analyze?”</p>
</blockquote>

<p>That became my first AI-for-cyber project.</p>

<hr />

<h2 id="-step-2-learn-the-basics-of-aiml">📚 Step 2: Learn the Basics of AI/ML</h2>

<p>I focused on:</p>
<ul>
  <li>Python for data and APIs</li>
  <li>Numpy, Pandas</li>
  <li>Scikit-learn for traditional models</li>
  <li>HuggingFace + OpenAI for LLMs</li>
  <li>LangChain for chaining prompts</li>
</ul>

<hr />

<h2 id="-step-3-build-something-small-fast">🔬 Step 3: Build Something Small, Fast</h2>

<p>→ <a href="https://github.com/elbazhazem/log-analyzer-LLM">Log Analyzer LLM</a><br />
This was my MVP to apply what I learned.</p>

<hr />

<h2 id="-step-4-go-deeper">📈 Step 4: Go Deeper</h2>

<p>I’m now:</p>
<ul>
  <li>Learning clustering + classification</li>
  <li>Experimenting with fine-tuning</li>
  <li>Studying academic papers</li>
  <li>Rebuilding my GitHub profile with applied projects</li>
</ul>

<hr />

<h2 id="-step-5-share-reflect-and-publish">🧠 Step 5: Share, Reflect, and Publish</h2>

<p>This blog is part of that effort.</p>

<p>I’m also:</p>
<ul>
  <li>Writing a research paper</li>
  <li>Applying for academic/industry roles</li>
  <li>Building a public portfolio of AI + cybersecurity tools</li>
</ul>

<hr />

<h2 id="lessons-so-far">Lessons So Far</h2>

<ul>
  <li>AI is not a destination — it’s a toolkit</li>
  <li>Focus on <strong>usefulness</strong>, not hype</li>
  <li>You don’t need a PhD to start</li>
</ul>

<hr />

<p>🔍 Curious about my work?<br />
Check out my <a href="../projects">projects</a> or connect on <a href="https://www.linkedin.com/in/hazem-elbaz">LinkedIn</a>.</p>]]></content><author><name>Hazem Elbaz</name></author><category term="learning" /><category term="transition" /><category term="ai-roadmap" /><category term="cyber-to-ai" /><category term="career" /><summary type="html"><![CDATA[An honest roadmap for how I’m transitioning into AI from a cybersecurity background — step-by-step.]]></summary></entry><entry><title type="html">Why Cybersecurity Needs AI More Than Ever</title><link href="https://elbazhazem.github.io/why-cyber-needs-ai/" rel="alternate" type="text/html" title="Why Cybersecurity Needs AI More Than Ever" /><published>2025-06-18T00:00:00+00:00</published><updated>2025-06-18T00:00:00+00:00</updated><id>https://elbazhazem.github.io/why-cyber-needs-ai</id><content type="html" xml:base="https://elbazhazem.github.io/why-cyber-needs-ai/"><![CDATA[<h1 id="why-cybersecurity-needs-ai-more-than-ever">Why Cybersecurity Needs AI More Than Ever</h1>

<p>Today’s cybersecurity teams are overloaded:</p>

<ul>
  <li>📈 Alert fatigue</li>
  <li>⌛ Shortage of skilled analysts</li>
  <li>🚨 False positives everywhere</li>
  <li>🕵️‍♂️ Sophisticated, evasive threats</li>
</ul>

<p>In a modern SOC, the real challenge isn’t detection — it’s <strong>prioritization and interpretation</strong>.</p>

<hr />

<h2 id="where-ai-can-help">Where AI Can Help</h2>

<h3 id="1-intelligent-summarization">1. Intelligent Summarization</h3>

<p>LLMs can:</p>
<ul>
  <li>Digest 500 lines of logs</li>
  <li>Summarize what happened</li>
  <li>Highlight what matters</li>
</ul>

<h3 id="2-threat-contextualization">2. Threat Contextualization</h3>

<p>Instead of just “block port 443”, LLMs can explain:</p>
<blockquote>
  <p><em>“This appears to be a reverse shell attempt based on behavior and timing.”</em></p>
</blockquote>

<h3 id="3-automation-of-repetitive-work">3. Automation of Repetitive Work</h3>

<ul>
  <li>Categorize phishing emails</li>
  <li>Triage alerts</li>
  <li>Recommend mitigation steps</li>
</ul>

<p>All these can be supported by fine-tuned models or simple LLM prompts.</p>

<hr />

<h2 id="this-is-not-a-future-vision--its-now">This is Not a Future Vision — It’s Now</h2>

<p>Tools like:</p>
<ul>
  <li>GPT-4 + Python</li>
  <li>LangChain + SIEM integrations</li>
  <li>Vector databases + threat intel
are already being tested in production environments.</li>
</ul>

<hr />

<h2 id="but-beware-the-hype">But Beware the Hype</h2>

<p>AI ≠ Magic.</p>

<ul>
  <li>It needs validation</li>
  <li>It requires tuning</li>
  <li>It must be explainable</li>
</ul>

<hr />

<h2 id="final-thought">Final Thought</h2>

<p>Cybersecurity needs more than automation.<br />
It needs <strong>intelligent augmentation</strong> — and that’s where AI shines.</p>

<p>The future analyst is part human, part machine.</p>

<hr />

<p>🤖 I’m exploring this space deeply in my own projects — see <a href="https://github.com/elbazhazem/log-analyzer-LLM">Log Analyzer LLM</a><br />
🔁 Let’s co-build the next-gen SOC tools.</p>]]></content><author><name>Hazem Elbaz</name></author><category term="cybersecurity" /><category term="SOC" /><category term="AI" /><category term="automation" /><category term="burnout" /><summary type="html"><![CDATA[Cybersecurity is drowning in noise, complexity, and fatigue — here’s how AI can help.]]></summary></entry><entry><title type="html">Lessons from Building My First Log Analyzer with GPT-4</title><link href="https://elbazhazem.github.io/lessons-from-first-ai-tool/" rel="alternate" type="text/html" title="Lessons from Building My First Log Analyzer with GPT-4" /><published>2025-06-17T00:00:00+00:00</published><updated>2025-06-17T00:00:00+00:00</updated><id>https://elbazhazem.github.io/lessons-from-first-ai-tool</id><content type="html" xml:base="https://elbazhazem.github.io/lessons-from-first-ai-tool/"><![CDATA[<h1 id="lessons-from-building-my-first-log-analyzer-with-gpt-4">Lessons from Building My First Log Analyzer with GPT-4</h1>

<p>When I started building <a href="https://github.com/elbazhazem/log-analyzer-LLM">Log Analyzer LLM</a>, I had one goal:</p>

<blockquote>
  <p>“Make logs readable, fast.”</p>
</blockquote>

<p>Logs are noisy, verbose, and contextless. I wanted an AI assistant that could summarize logs and highlight meaningful events — something traditional SIEMs don’t do well.</p>

<p>Here’s what I learned along the way.</p>

<hr />

<h2 id="1-structure-matters-more-than-you-think">1. Structure Matters More Than You Think</h2>

<p>The biggest challenge?<br />
Logs are not standardized.</p>

<p>Some logs are JSON. Others are multiline strings, or worse — key-value chaos.</p>

<p><strong>Solution:</strong><br />
I started by building simple pre-processing steps to:</p>
<ul>
  <li>Remove noise and timestamps</li>
  <li>Break logs into chunks</li>
  <li>Group them by similarity</li>
</ul>

<hr />

<h2 id="2-prompt-engineering-is-critical">2. Prompt Engineering Is Critical</h2>

<p>LLMs are powerful — but they need guidance.</p>

<p>💡 I tested several prompts:</p>
<ul>
  <li>“Summarize these log lines in plain English.”</li>
  <li>“Detect any anomalies in these logs.”</li>
  <li>“Explain what this log segment means.”</li>
</ul>

<p>Best results came from combining:</p>
<ul>
  <li>System-level instructions (e.g., <em>“You are a cybersecurity analyst.”</em>)</li>
  <li>Contextual samples (few-shot prompting)</li>
</ul>

<hr />

<h2 id="3-dont-trust-the-ai-blindly">3. Don’t Trust the AI Blindly</h2>

<p>LLMs hallucinate. Always.</p>

<p>Sometimes it summarized error logs as “successful operations”.<br />
Other times it guessed at causes.</p>

<p>🚨 <strong>Lesson:</strong> Always cross-check with known events or ground truth.</p>

<hr />

<h2 id="4-python--openai--fast-prototyping">4. Python + OpenAI = Fast Prototyping</h2>

<p>I used:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">openai</code> Python SDK</li>
  <li>Simple <code class="language-plaintext highlighter-rouge">.log</code> file reader</li>
  <li>Streamlit for UI (optional)</li>
</ul>

<p>Within hours, I had a working proof of concept.</p>

<hr />

<h2 id="whats-next">What’s Next?</h2>

<ul>
  <li>Auto-grouping similar events (clustering)</li>
  <li>Hybrid models: rules + LLMs</li>
  <li>Anomaly scoring</li>
  <li>Integration with real-time log streams</li>
</ul>

<hr />

<p>Building with GPT-4 taught me one thing:<br />
<strong>AI isn’t perfect, but it’s incredibly useful if used right.</strong></p>

<p>If you’re in cybersecurity, you should start experimenting.</p>

<hr />
<p>🧠 Repo: <a href="https://github.com/elbazhazem/log-analyzer-LLM">Log Analyzer LLM</a><br />
💬 DM me if you’re building something similar — let’s connect.</p>]]></content><author><name>Hazem Elbaz</name></author><category term="LLM" /><category term="cybersecurity" /><category term="log-analysis" /><category term="GPT-4" /><category term="openai" /><summary type="html"><![CDATA[Behind the scenes of building a log summarizer using LLMs — what worked, what didn’t, and what's next.]]></summary></entry></feed>