AI-Powered Log Analyzer

Your Mission

You are a junior security analyst. Your first task is to build a Python tool that can read a server's log file, find suspicious entries, and use AI to help you understand what they mean.

You will practice:

📄 Reading from files (`open`, `read`)
✍️ Writing to files (`open`, `write`)
📦 Using Python libraries (to make web requests)
🤖 Calling an AI (Gemini) to get insights

Lesson Resources

Here are the starter files you'll need. Save the content of each one to your own local files (e.g., `log_analyzer.py` and `web_logs.txt`).

`log_analyzer.py` (Starter Script)

import json
import http.client # Using a built-in library

# --- AI Function (Pre-written for you) ---

async def get_ai_analysis(user_prompt):
    """
    Sends a prompt to the Gemini AI and returns its response.
    """
    print(f"Asking AI: {user_prompt}...")
    try:
        # Note: In a real app, you'd use a library like 'requests'
        # but we'll use built-in ones to keep it simple.
        
        # This is the Gemini Flash model
        model = "gemini-2.5-flash-preview-09-2025"
        
        # Leave this blank! The browser will handle the API key.
        api_key = ""
        
        url = f"/v1beta/models/{model}:generateContent?key={api_key}"
        
        headers = {'Content-Type': 'application/json'}
        
        payload = {
            "contents": [{
                "parts": [{
                    "text": user_prompt
                }]
            }]
        }

        # This part makes the web request
        conn = http.client.HTTPSConnection("generativelanguage.googleapis.com")
        conn.request("POST", url, json.dumps(payload), headers)
        
        response = conn.getresponse()
        response_text = response.read().decode('utf-8')
        
        if response.status != 200:
            conn.close()  # Release the connection before returning early
            print(f"Error from AI: {response.status} {response_text}")
            return f"Error: Could not get AI analysis. Status: {response.status}"

        result = json.loads(response_text)
        
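        # Extract the text from the AI's response. Fall back to a placeholder
        # if 'candidates' or 'parts' is missing or empty.
        candidates = result.get('candidates') or [{}]
        parts = candidates[0].get('content', {}).get('parts') or [{}]
        text = parts[0].get('text', 'AI response not found.')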
        
        conn.close()
        print("...AI replied.")
        return text

    except Exception as e:
        print(f"An exception occurred: {e}")
        return f"Error during AI call: {e}"

# --- Main Project ---

async def main():
    """
    This is the main function where your code will go.
    """
    
    # Define file names
    log_file = "web_logs.txt"
    report_file = "report.txt"

    # We'll clear the old report file first
    with open(report_file, 'w') as f:
        f.write("--- SUSPICIOUS LOG REPORT ---\n\n")

    print(f"Starting log analysis of {log_file}...")

    # --- TODO: PART 1 ---
    # 1. Open 'log_file' for reading ('r').
    # 2. Use a 'for' loop to read each 'line' in the file.
    # 3. Check if the 'line' contains the string "ERROR" or "404".
    # 4. If it does, open 'report_file' for *appending* ('a').
    # 5. Write the 'line' to the 'report_file'.
    #
    # (Delete this comment block and add your code here)
    #
    # Example (to get you started):
    # with open(log_file, 'r') as f_in:
    #   for line in f_in:
    #     if "ERROR" in line or "404" in line:
    #       print(f"Found suspicious line: {line.strip()}")
    #       # Now, add the code to write this to 'report_file'
    
    
    # --- TODO: PART 2 (Optional: Modify your Part 1 code) ---
    # 1. Inside your `if` statement, instead of writing the whole line,
    #    try to .split() the line.
    # 2. Extract the timestamp, type, and message.
    # 3. Write these to 'report_file' in a nice format (see README).
    #
    # (Delete this comment block and add your code here)
    

    # --- TODO: PART 3 (Optional: Modify your Part 2 code) ---
    # 1. After you extract the 'message' from the log line:
    # 2. Create a 'prompt' string for the AI.
    #    e.g., prompt = f"Explain this server error simply: {message}"
    # 3. Call the AI function:
    #    ai_comment = await get_ai_analysis(prompt)
    # 4. Write the 'ai_comment' to your 'report_file'.
    #
    # (Delete this comment block and add your code here)


    # --- End of your code ---
    
    print(f"Log analysis complete. Check {report_file} for results.")


# This code runs the 'main' function
if __name__ == "__main__":
    import asyncio
    # asyncio.run() is needed to run 'async' functions
    try:
        asyncio.run(main())
    except Exception as e:
        print(f"An error occurred in main: {e}")

`web_logs.txt` (Sample Data)

[2024-10-24 10:50:01] INFO: 192.168.1.1 - GET /index.html
[2024-10-24 10:50:02] INFO: 192.168.1.1 - GET /style.css
[2024-10-24 10:51:15] INFO: 77.12.55.1 - GET /login
[2024-10-24 10:51:16] WARN: 77.12.55.1 - POST /login - No CSRF token
[2024-10-24 10:52:30] INFO: 192.168.1.1 - GET /images/logo.png
[2024-10-24 10:53:01] ERROR: 50.112.0.1 - Failed login attempt for user 'admin'
[2024-10-24 10:53:05] ERROR: 50.112.0.1 - Failed login attempt for user 'admin'
[2024-10-24 10:53:10] ERROR: 50.112.0.1 - Failed login attempt for user 'admin'
[2024-10-24 10:53:11] WARN: 50.112.0.1 - Account 'admin' locked due to too many failed attempts.
[2024-10-24 10:54:00] INFO: 201.22.8.5 - GET /products
[2024-10-24 10:55:01] INFO: 201.22.8.5 - GET /products/item?id=123
[2024-10-24 10:55:02] INFO: 201.22.8.5 - GET /products/item?id=124
[2024-10-24 10:55:10] ERROR: 104.25.11.2 - 404 Not Found - /admin/panel.php
[2024-10-24 10:55:11] WARN: 104.25.11.2 - User attempted to access non-existent admin panel.
[2024-10-24 10:56:00] INFO: 192.168.1.1 - GET /about-us
[2024-10-24 10:57:30] ERROR: 15.115.10.9 - 404 Not Found - /wp-login.php
[2024-10-24 10:58:01] ERROR: 99.12.13.14 - Possible SQL_INJECTION attempt: ' OR 1=1; --
[2024-10-24 10:59:00] INFO: 77.12.55.1 - GET /dashboard
[2024-10-24 10:59:05] INFO: 77.12.55.1 - POST /logout

Part 1: The Basic Scan (File I/O)

Goal: Read from `web_logs.txt` (which you saved from the Lesson Resources above) and write all "suspicious" lines to a new file called `report.txt`.

For now, a "suspicious" line is any line that contains the word `ERROR` or the code `404`.

Your Tasks:

Open the `log_analyzer.py` file you created.
Find the `TODO` comment for Part 1.
Write Python code to:
- Open and read `web_logs.txt` line by line.
- Check if a line contains `ERROR` or `404`.
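- If it does, open (or create) `report.txt` and append that suspicious line to it, so that each entry sits on its own line. Lines read from a file usually already end with `\n`, so avoid adding a second one.
Run your script! You should see a new file, `report.txt`, appear. It should contain the report header followed by only the suspicious lines (six of them with the sample data).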
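
If you get stuck, the Part 1 steps can be sketched roughly as follows. This is one possible approach, not the official solution: the helper names `is_suspicious` and `scan_log` are made up for illustration, and the sketch opens `report.txt` once in append mode rather than reopening it for every match, which produces the same file.

```python
def is_suspicious(line):
    """Return True if the line contains either marker named in Part 1."""
    return "ERROR" in line or "404" in line


def scan_log(log_path, report_path):
    """Append every suspicious line from log_path to report_path.

    Hypothetical helper: returns the number of lines written.
    """
    count = 0
    # 'r' reads the log; 'a' appends, so the report header is kept.
    with open(log_path, "r") as f_in, open(report_path, "a") as f_out:
        for line in f_in:
            if is_suspicious(line):
                # Drop any trailing newline, then add exactly one back,
                # so the last line of the log is handled the same way.
                f_out.write(line.rstrip("\n") + "\n")
                count += 1
    return count
```

Inside the starter script you would call something like `scan_log(log_file, report_file)` from `main()`, after the header has been written.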

Part 2: The "Pretty" Report (String Manipulation)

Goal: Make the report more readable. Instead of writing the *whole line*, just extract the key info.

A log line looks like this: `[2024-10-24 10:53:01] ERROR: 50.112.0.1 - Failed login attempt`

Your Tasks:

Go to the `TODO` for Part 2 (you can modify your Part 1 code).
When you find a suspicious line, use string methods (like `.split()`) to break it apart.
Try to extract:
- The timestamp (e.g., `[2024-10-24 10:53:01]`)
- The type (e.g., `ERROR` or `INFO`)
- The message (e.g., `Failed login attempt`)
Write this *structured* information to `report.txt` in a nice format.

Example for `report.txt`:

SUSPICIOUS LOG ENTRY:
  Timestamp: [2024-10-24 10:53:01]
  Type: ERROR
  Message: Failed login attempt
------------------------------------
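
One way to approach the parsing, assuming every log line follows the `[timestamp] TYPE: ip - message` layout of the sample data, is sketched below. The function names `parse_log_line` and `format_entry` are hypothetical, and a line that does not match the layout will raise a `ValueError` in this sketch.

```python
def parse_log_line(line):
    """Split one log line into (timestamp, log_type, message).

    Assumes the layout '[timestamp] TYPE: ip - message'.
    """
    line = line.strip()
    # The timestamp ends at the first "] "; put the bracket back afterwards.
    timestamp, rest = line.split("] ", 1)
    timestamp += "]"
    # The type (ERROR, INFO, ...) ends at the first ": ".
    log_type, rest = rest.split(": ", 1)
    # What remains is "ip - message"; split only once so that any
    # later " - " stays inside the message.
    _ip, message = rest.split(" - ", 1)
    return timestamp, log_type, message


def format_entry(timestamp, log_type, message):
    """Build the report block shown in the example above."""
    return (
        "SUSPICIOUS LOG ENTRY:\n"
        f"  Timestamp: {timestamp}\n"
        f"  Type: {log_type}\n"
        f"  Message: {message}\n"
        + "-" * 36 + "\n"
    )
```

Passing `1` as the second argument to `.split()` limits it to a single split, which keeps messages such as `404 Not Found - /admin/panel.php` in one piece.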

Part 3: The AI Analyst (AI & Libraries)

Goal: Use AI to explain *why* a log entry is suspicious.

We will use the pre-written `get_ai_analysis()` function in the script. It takes a text prompt and returns the AI's answer.

Your Tasks:

Find the `TODO` for Part 3.
When your code finds a suspicious line and extracts the message (from Part 2), create a `prompt`.
- Example Prompt: `Explain this server error in one simple sentence: "Failed login attempt"`
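Call the function with `await get_ai_analysis(prompt)` and store the AI's response in a variable (e.g., `ai_comment`). The function is `async`, so it must be awaited inside `main()`; without `await` you get a coroutine object instead of the answer.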
Write the AI's response to your `report.txt` file under the log entry.

Example for `report.txt`:

SUSPICIOUS LOG ENTRY:
  Timestamp: [2024-10-24 10:53:01]
  Type: ERROR
  Message: Failed login attempt
------------------------------------
  >> AI ANALYSIS: This error indicates a user or bot tried to access an account with the wrong password.
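
The prompt-and-await flow might look like the sketch below. To keep it runnable without a network call or API key, it takes the AI function as a parameter and demonstrates it with a stand-in called `fake_ai`. The names `build_prompt`, `analyze_entry`, and `fake_ai` are all hypothetical; in the real script you would pass `get_ai_analysis` instead.

```python
import asyncio


def build_prompt(message):
    """Wrap the log message in the example prompt from the task list."""
    return f'Explain this server error in one simple sentence: "{message}"'


async def analyze_entry(message, ask_ai):
    """Ask the AI about one message and return a line for the report.

    ask_ai is any async function that takes a prompt string,
    for example get_ai_analysis from the starter script.
    """
    prompt = build_prompt(message)
    ai_comment = await ask_ai(prompt)  # 'await' is required for async functions
    return f"  >> AI ANALYSIS: {ai_comment}\n"


async def fake_ai(prompt):
    """Stand-in for the real AI call, used only in this demo."""
    return f"(stub reply to: {prompt})"


# asyncio.run() drives the coroutine to completion from ordinary code.
demo_line = asyncio.run(analyze_entry("Failed login attempt", fake_ai))
```

In `main()` you would write the returned line to `report_file` in append mode, right after the entry it belongs to.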

🚀 Stretch Goals (If you finish early)

Finished the main mission? Challenge yourself with these bonus tasks!

More Keywords: Add more suspicious keywords (e.g., `403`, `DENIED`, `SQL_INJECTION`).
Better Parsing: Use the `re` (regular expressions) library to parse the log lines more accurately.
AI Prompting: Improve your prompt! Ask the AI to rate the "threat level" from 1 to 10.
Meme Generator: Try the *other* project idea. Use the `Pillow` library to write text on an image. Use the AI to generate a "funny caption for a cat picture" and save it as `meme.jpg`.
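
For the "More Keywords" and "Better Parsing" goals, a possible starting point is sketched below. The pattern assumes the same `[timestamp] TYPE: ip - message` layout as the sample data, and the names `LOG_PATTERN`, `SUSPICIOUS_KEYWORDS`, `parse_with_regex`, and `is_suspicious` are illustrative rather than part of the starter script.

```python
import re

# One named group per field; built from the layout of the sample log lines.
LOG_PATTERN = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"            # [2024-10-24 10:53:01]
    r"(?P<type>[A-Z]+):\s+"                     # ERROR:
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+-\s+"   # 50.112.0.1 -
    r"(?P<message>.*)$"                         # everything else
)

# Keywords from Part 1 plus the extra ones suggested in the stretch goal.
SUSPICIOUS_KEYWORDS = ("ERROR", "404", "403", "DENIED", "SQL_INJECTION")


def parse_with_regex(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def is_suspicious(line):
    """True if any keyword appears anywhere in the line."""
    return any(keyword in line for keyword in SUSPICIOUS_KEYWORDS)
```

Unlike the `.split()` approach, a non-matching line yields `None` here instead of raising an error, so malformed lines can be skipped. Note that plain substring checks can also match unrelated text, such as an ID that happens to contain `404`; checking the parsed `type` and `message` fields separately is one way to tighten that.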