Your Mission
You are a junior security analyst. Your first task is to build a Python tool that can read a server's log file, find suspicious entries, and use AI to help you understand what they mean.
You will practice:
- 📄 Reading from files (`open`, `read`)
- ✍️ Writing to files (`open`, `write`)
- 📦 Using Python libraries (to make web requests)
- 🤖 Calling an AI (Gemini) to get insights
Lesson Resources
Here are the starter files you'll need. Save the content of each one to your own local files (e.g., `log_analyzer.py` and `web_logs.txt`).
`log_analyzer.py` (Starter Script)
import json
import http.client  # Using a built-in library

# --- AI Function (Pre-written for you) ---
async def get_ai_analysis(user_prompt):
    """
    Sends a prompt to the Gemini AI and returns its response.
    """
    print(f"Asking AI: {user_prompt}...")
    try:
        # Note: In a real app, you'd use a library like 'requests'
        # but we'll use built-in ones to keep it simple.
        # This is the Gemini Flash model
        model = "gemini-2.5-flash-preview-09-2025"
        # Leave this blank! The browser will handle the API key.
        api_key = ""
        url = f"/v1beta/models/{model}:generateContent?key={api_key}"
        headers = {'Content-Type': 'application/json'}
        payload = {
            "contents": [{
                "parts": [{
                    "text": user_prompt
                }]
            }]
        }
        # This part makes the web request
        conn = http.client.HTTPSConnection("generativelanguage.googleapis.com")
        conn.request("POST", url, json.dumps(payload), headers)
        response = conn.getresponse()
        response_text = response.read().decode('utf-8')
        # Close right after reading so the connection is also released
        # when the status check below returns early.
        conn.close()
        if response.status != 200:
            print(f"Error from AI: {response.status} {response_text}")
            return f"Error: Could not get AI analysis. Status: {response.status}"
        result = json.loads(response_text)
        # Extract the text from the AI's response
        text = result.get('candidates', [{}])[0] \
            .get('content', {}) \
            .get('parts', [{}])[0] \
            .get('text', 'AI response not found.')
        print("...AI replied.")
        return text
    except Exception as e:
        print(f"An exception occurred: {e}")
        return f"Error during AI call: {e}"

# --- Main Project ---
async def main():
    """
    This is the main function where your code will go.
    """
    # Define file names
    log_file = "web_logs.txt"
    report_file = "report.txt"

    # We'll clear the old report file first
    with open(report_file, 'w') as f:
        f.write("--- SUSPICIOUS LOG REPORT ---\n\n")
    print(f"Starting log analysis of {log_file}...")

    # --- TODO: PART 1 ---
    # 1. Open 'log_file' for reading ('r').
    # 2. Use a 'for' loop to read each 'line' in the file.
    # 3. Check if the 'line' contains the string "ERROR" or "404".
    # 4. If it does, open 'report_file' for *appending* ('a').
    # 5. Write the 'line' to the 'report_file'.
    #
    # (Delete this comment block and add your code here)
    #
    # Example (to get you started):
    # with open(log_file, 'r') as f_in:
    #     for line in f_in:
    #         if "ERROR" in line or "404" in line:
    #             print(f"Found suspicious line: {line.strip()}")
    #             # Now, add the code to write this to 'report_file'

    # --- TODO: PART 2 (Optional: Modify your Part 1 code) ---
    # 1. Inside your `if` statement, instead of writing the whole line,
    #    try to .split() the line.
    # 2. Extract the timestamp, type, and message.
    # 3. Write these to 'report_file' in a nice format (see README).
    #
    # (Delete this comment block and add your code here)

    # --- TODO: PART 3 (Optional: Modify your Part 2 code) ---
    # 1. After you extract the 'message' from the log line:
    # 2. Create a 'prompt' string for the AI.
    #    e.g., prompt = f"Explain this server error simply: {message}"
    # 3. Call the AI function:
    #    ai_comment = await get_ai_analysis(prompt)
    # 4. Write the 'ai_comment' to your 'report_file'.
    #
    # (Delete this comment block and add your code here)

    # --- End of your code ---
    print(f"Log analysis complete. Check {report_file} for results.")

# This code runs the 'main' function
if __name__ == "__main__":
    import asyncio
    # asyncio.run() is needed to run 'async' functions
    try:
        asyncio.run(main())
    except Exception as e:
        print(f"An error occurred in main: {e}")
`web_logs.txt` (Sample Data)
[2024-10-24 10:50:01] INFO: 192.168.1.1 - GET /index.html
[2024-10-24 10:50:02] INFO: 192.168.1.1 - GET /style.css
[2024-10-24 10:51:15] INFO: 77.12.55.1 - GET /login
[2024-10-24 10:51:16] WARN: 77.12.55.1 - POST /login - No CSRF token
[2024-10-24 10:52:30] INFO: 192.168.1.1 - GET /images/logo.png
[2024-10-24 10:53:01] ERROR: 50.112.0.1 - Failed login attempt for user 'admin'
[2024-10-24 10:53:05] ERROR: 50.112.0.1 - Failed login attempt for user 'admin'
[2024-10-24 10:53:10] ERROR: 50.112.0.1 - Failed login attempt for user 'admin'
[2024-10-24 10:53:11] WARN: 50.112.0.1 - Account 'admin' locked due to too many failed attempts.
[2024-10-24 10:54:00] INFO: 201.22.8.5 - GET /products
[2024-10-24 10:55:01] INFO: 201.22.8.5 - GET /products/item?id=123
[2024-10-24 10:55:02] INFO: 201.22.8.5 - GET /products/item?id=124
[2024-10-24 10:55:10] ERROR: 104.25.11.2 - 404 Not Found - /admin/panel.php
[2024-10-24 10:55:11] WARN: 104.25.11.2 - User attempted to access non-existent admin panel.
[2024-10-24 10:56:00] INFO: 192.168.1.1 - GET /about-us
[2024-10-24 10:57:30] ERROR: 15.115.10.9 - 404 Not Found - /wp-login.php
[2024-10-24 10:58:01] ERROR: 99.12.13.14 - Possible SQL_INJECTION attempt: ' OR 1=1; --
[2024-10-24 10:59:00] INFO: 77.12.55.1 - GET /dashboard
[2024-10-24 10:59:05] INFO: 77.12.55.1 - POST /logout
Part 1: The Basic Scan (File I/O)
Goal: Read from `web_logs.txt` (which you saved from the Lesson Resources) and write all "suspicious" lines to a new file called `report.txt`.
For now, a "suspicious" line is any line that contains the word `ERROR` or the code `404`.
Your Tasks:
- Open the `log_analyzer.py` file you created.
- Find the `TODO` comment for Part 1.
- Write Python code to:
- Open and read `web_logs.txt` line by line.
- Check if a line contains `ERROR` or `404`.
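- If it does, open (or create) `report.txt` in append mode and write that suspicious line to it. Lines read from a file already end with a newline, so add one only if it is missing.
- Run your script! You should see a new file, `report.txt`, appear, containing the report header followed by only the suspicious lines.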
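One way the Part 1 steps could fit together is sketched below. This is a minimal sketch rather than the official solution: the helper names `is_suspicious` and `scan_log` and the `KEYWORDS` tuple are assumptions made for illustration, and the file names are passed in as parameters instead of being read from the starter script.

```python
# Hypothetical helpers (not part of the starter script).
KEYWORDS = ("ERROR", "404")  # assumed keyword list, taken from the Part 1 rule


def is_suspicious(line):
    """Return True if the line contains any of the keywords."""
    return any(keyword in line for keyword in KEYWORDS)


def scan_log(log_path, report_path):
    """Append every suspicious line in log_path to report_path.

    Returns the number of lines that were written.
    """
    count = 0
    # 'r' reads the log; 'a' appends, so an existing report header is kept.
    with open(log_path, "r") as f_in, open(report_path, "a") as f_out:
        for line in f_in:
            if is_suspicious(line):
                # Lines read from a file usually keep their trailing "\n";
                # normalise so each entry ends with exactly one newline.
                f_out.write(line.rstrip("\n") + "\n")
                count += 1
    return count
```

Inside `main()` this could be called as `scan_log(log_file, report_file)`. The sketch opens the report once instead of re-opening it for every match; both approaches work. Note that a plain substring check such as `"404" in line` would also match those characters inside an IP address or URL, which is one reason the stretch goals suggest stricter parsing.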
Part 2: The "Pretty" Report (String Manipulation)
Goal: Make the report more readable. Instead of writing the *whole line*, just extract the key info.
A log line looks like this: `[2024-10-24 10:53:01] ERROR: 50.112.0.1 - Failed login attempt`
Your Tasks:
- Go to the `TODO` for Part 2 (you can modify your Part 1 code).
- When you find a suspicious line, use string methods (like `.split()`) to break it apart.
- Try to extract:
- The timestamp (e.g., `[2024-10-24 10:53:01]`)
- The type (e.g., `ERROR` or `INFO`)
- The message (e.g., `Failed login attempt`)
- Write this *structured* information to `report.txt` in a nice format.
Example for `report.txt`:
SUSPICIOUS LOG ENTRY:
Timestamp: [2024-10-24 10:53:01]
Type: ERROR
Message: Failed login attempt
------------------------------------
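The splitting described in Part 2 can be sketched as follows. This is one possible approach, assuming every line follows the `[timestamp] TYPE: ip - message` layout seen in `web_logs.txt`; the function names `parse_log_line` and `format_entry` are hypothetical and do not appear in the starter script.

```python
def parse_log_line(line):
    """Split one log line into (timestamp, log_type, message).

    Assumes the layout used in web_logs.txt:
    [timestamp] TYPE: ip - message
    """
    # "] " separates the bracketed timestamp from the rest of the line.
    timestamp, rest = line.strip().split("] ", 1)
    timestamp += "]"  # put back the bracket that split() consumed

    # The first ": " separates the type (ERROR, WARN, INFO) from the details.
    log_type, details = rest.split(": ", 1)

    # The first " - " separates the IP address from the message text.
    _ip, message = details.split(" - ", 1)
    return timestamp, log_type, message


def format_entry(timestamp, log_type, message):
    """Build a report block in the format shown in the example."""
    return (
        "SUSPICIOUS LOG ENTRY:\n"
        f"Timestamp: {timestamp}\n"
        f"Type: {log_type}\n"
        f"Message: {message}\n"
        "------------------------------------\n"
    )
```

Passing `1` as the second argument to `.split()` limits it to a single split, so messages that contain their own ` - ` or `: ` (such as the 404 and SQL injection entries) stay in one piece. A line that does not follow the assumed layout raises `ValueError`, which you may want to catch.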
Part 3: The AI Analyst (AI & Libraries)
Goal: Use AI to explain *why* a log entry is suspicious.
We will use the pre-written `get_ai_analysis()` function in the script. It takes a text prompt and returns the AI's answer.
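To see how `get_ai_analysis()` finds the answer inside the reply, the sketch below repeats the starter's lookup chain on a hand-written dictionary. The `sample_response` data is made up for illustration and only mirrors the keys the starter code reads; it is not a captured API reply, and `extract_text` is a hypothetical name.

```python
def extract_text(result):
    """Mirror the lookup chain used in the starter's get_ai_analysis()."""
    return (
        result.get("candidates", [{}])[0]
        .get("content", {})
        .get("parts", [{}])[0]
        .get("text", "AI response not found.")
    )


# Hand-written sample, shaped like the structure the starter code expects.
sample_response = {
    "candidates": [
        {"content": {"parts": [{"text": "Someone typed the wrong password."}]}}
    ]
}

print(extract_text(sample_response))  # the nested "text" value
print(extract_text({}))               # missing keys fall back to the default
```

Each `.get()` supplies a default, so a missing key yields the fallback message instead of a `KeyError`. An empty list (for example `{"candidates": []}`) would still raise `IndexError`; in the starter script that case is caught by the surrounding `except` block.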
Your Tasks:
- Find the `TODO` for Part 3.
- When your code finds a suspicious line and extracts the message (from Part 2), create a `prompt`.
- Example Prompt: `Explain this server error in one simple sentence: "Failed login attempt"`
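- Call the `get_ai_analysis(prompt)` function with `await` (it is an `async` function, so write `ai_comment = await get_ai_analysis(prompt)` inside `main`) and store the AI's response in a variable (e.g., `ai_comment`).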
- Write the AI's response to your `report.txt` file under the log entry.
Example for `report.txt`:
SUSPICIOUS LOG ENTRY:
Timestamp: [2024-10-24 10:53:01]
Type: ERROR
Message: Failed login attempt
------------------------------------
>> AI ANALYSIS: This error indicates a user or bot tried to access an account with the wrong password.
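The Part 3 flow (build a prompt, `await` the AI call, format the reply) can be tried without any network access, as in the sketch below. It assumes a stand-in function, `fake_ai_analysis`, that only echoes the prompt; that stub and the helpers `build_prompt` and `analyse` are hypothetical names, not part of the starter script.

```python
import asyncio


async def fake_ai_analysis(user_prompt):
    """Hypothetical stand-in for get_ai_analysis(); makes no web request."""
    return f"(stub reply to: {user_prompt})"


def build_prompt(message):
    """Wrap the log message in the example prompt from the task list."""
    return f'Explain this server error in one simple sentence: "{message}"'


async def analyse(message, ask_ai=fake_ai_analysis):
    """Return the report line for one log message."""
    prompt = build_prompt(message)
    # 'await' is required because ask_ai is an async function.
    ai_comment = await ask_ai(prompt)
    return f">> AI ANALYSIS: {ai_comment}\n"


if __name__ == "__main__":
    print(asyncio.run(analyse("Failed login attempt")))
```

In your own script you would pass the real function instead, for example `await analyse(message, ask_ai=get_ai_analysis)` inside `main()`, and write the returned string to `report.txt`.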
🚀 Stretch Goals (If you finish early)
Finished the main mission? Challenge yourself with these bonus tasks!
- More Keywords: Add more suspicious keywords (e.g., `403`, `DENIED`, `SQL_INJECTION`).
- Better Parsing: Use the `re` (regular expressions) library to parse the log lines more accurately.
- AI Prompting: Improve your prompt! Ask the AI to rate the "threat level" from 1 to 10.
- Meme Generator: Try the *other* project idea. Use the `Pillow` library to write text on an image. Use the AI to generate a "funny caption for a cat picture" and save it as `meme.jpg`.
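For the first two stretch goals, a regular-expression parser with a longer keyword list might look like the sketch below. The pattern is an assumption based on the sample `web_logs.txt` layout (`[timestamp] TYPE: ip - message`), and `LOG_PATTERN`, `KEYWORDS`, `parse_with_regex`, and `is_suspicious` are hypothetical names.

```python
import re

# Assumed pattern for the sample log layout: [timestamp] TYPE: ip - message
LOG_PATTERN = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"            # text between the brackets
    r"(?P<type>[A-Z]+):\s+"                     # ERROR, WARN, INFO, ...
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+-\s+"   # dotted IPv4 address
    r"(?P<message>.*)$"                         # everything after " - "
)

# Extended keyword list from the first stretch goal.
KEYWORDS = ("ERROR", "404", "403", "DENIED", "SQL_INJECTION")


def parse_with_regex(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def is_suspicious(line):
    """Return True if the line contains any keyword in KEYWORDS."""
    return any(keyword in line for keyword in KEYWORDS)
```

Unlike the `.split()` approach, this returns `None` for lines that do not fit the pattern instead of raising an error, and it captures the IP address as its own field. In this sketch the `timestamp` field excludes the surrounding brackets.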