5 Beginner AI Projects That Will Get You Your First AI Solutions Architect Job
Hiring managers for AI architect roles are not impressed by certificates alone. They are impressed by GitHub repositories with working code, clear architecture diagrams, and READMEs that explain the problem being solved. Here are five projects that will impress them, with full instructions.
Why Projects Beat Certificates Every Time
Every candidate for an AI architect role has done tutorials. Many have certifications. What most do not have is evidence that they can build something real: something that solves an actual problem, handles edge cases, and could actually be deployed.
The job interview question that determines outcomes is: “Walk me through something you have built.” Candidates who have a strong answer to this question (something specific, technical, and real) consistently outperform candidates who can only discuss what they have studied.
These five projects are designed to be your answer to that question. They are ordered by complexity. Build them in sequence. Put them all on GitHub.
Project 1: The Serverless AI API
What you are building: A serverless REST API that takes a user prompt, sends it to Claude on AWS Bedrock, and returns a structured JSON response.
Why it matters: This is the foundational pattern of almost every AI application. Demonstrating that you can wire AWS services together correctly (Lambda, API Gateway, Bedrock, IAM), with proper error handling and cost tracking, signals real architectural competence.
The architecture:
Client → API Gateway (POST /generate) → Lambda → Bedrock (Claude) → JSON Response
Step 1: Enable Bedrock
AWS Console → Amazon Bedrock → Model access → Enable Anthropic Claude 3 Haiku → Submit.
Step 2: Create the Lambda function
python
import boto3
import json
import time

bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

# Per-token prices in USD used for the cost estimate
INPUT_TOKEN_PRICE = 0.00000025
OUTPUT_TOKEN_PRICE = 0.00000125

def lambda_handler(event, context):
    start_time = time.time()
    try:
        # 'body' may be present but None, so fall back to an empty JSON object
        body = json.loads(event.get('body') or '{}')
        prompt = body.get('prompt', '').strip()
        if not prompt:
            return build_response(400, {'error': 'prompt is required'})
        if len(prompt) > 2000:
            return build_response(400, {'error': 'prompt exceeds maximum length of 2000 characters'})
        bedrock_body = {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1000,
            "system": "You are a helpful, concise assistant. Respond in plain text.",
            "messages": [{"role": "user", "content": prompt}]
        }
        response = bedrock.invoke_model(
            modelId='anthropic.claude-3-haiku-20240307-v1:0',
            body=json.dumps(bedrock_body)
        )
        result = json.loads(response['body'].read())
        content = result['content'][0]['text']
        input_tokens = result['usage']['input_tokens']
        output_tokens = result['usage']['output_tokens']
        latency_ms = int((time.time() - start_time) * 1000)
        # Compute the estimate once and reuse it in the log and the response
        estimated_cost = round(
            (input_tokens * INPUT_TOKEN_PRICE) + (output_tokens * OUTPUT_TOKEN_PRICE), 8
        )
        # Log for cost tracking
        print(json.dumps({
            "event": "bedrock_call",
            "inputTokens": input_tokens,
            "outputTokens": output_tokens,
            "totalTokens": input_tokens + output_tokens,
            "estimatedCostUsd": estimated_cost,
            "latencyMs": latency_ms
        }))
        return build_response(200, {
            'response': content,
            'usage': {
                'inputTokens': input_tokens,
                'outputTokens': output_tokens,
                'estimatedCostUsd': estimated_cost
            }
        })
    except json.JSONDecodeError:
        # Malformed request bodies are a client error, not a server error
        return build_response(400, {'error': 'request body must be valid JSON'})
    except bedrock.exceptions.ThrottlingException:
        return build_response(429, {'error': 'Rate limit exceeded. Please retry in a moment.'})
    except Exception as e:
        print(f"Error: {str(e)}")
        return build_response(500, {'error': 'Internal server error'})

def build_response(status_code, body):
    return {
        'statusCode': status_code,
        'headers': {
            'Content-Type': 'application/json',
            'Access-Control-Allow-Origin': '*',
            'Access-Control-Allow-Headers': 'Content-Type',
            'Access-Control-Allow-Methods': 'POST, OPTIONS'
        },
        'body': json.dumps(body)
    }
Step 3: IAM permissions
Lambda execution role → Attach: AmazonBedrockReadOnly + inline policy allowing bedrock:InvokeModel for your specific model ARN.
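The inline policy from Step 3 could be generated with a sketch like the one below. It assumes the model ARN follows the `arn:aws:bedrock:REGION::foundation-model/MODEL_ID` form; the helper name `build_invoke_policy` is illustrative, not part of any AWS SDK. Paste the printed JSON into the IAM console as the inline policy.

```python
import json

def build_invoke_policy(region: str, model_id: str) -> dict:
    """Return an IAM policy document allowing InvokeModel on one Bedrock model."""
    # Assumed ARN layout for foundation models (no account ID segment)
    model_arn = f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "bedrock:InvokeModel",
                "Resource": model_arn,
            }
        ],
    }

if __name__ == "__main__":
    policy = build_invoke_policy("us-east-1", "anthropic.claude-3-haiku-20240307-v1:0")
    print(json.dumps(policy, indent=2))
```

Scoping `Resource` to a single model rather than `*` is the design decision worth calling out in your README.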
Step 4: Create HTTP API Gateway
API Gateway → HTTP API → Lambda integration → POST /generate → Deploy to prod stage.
Step 5: Test
bash
curl -X POST https://YOUR_API_ID.execute-api.us-east-1.amazonaws.com/prod/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is the CAP theorem in distributed systems?"}'
What makes this impressive: Add API key authentication. Add CloudWatch alarms on error rate and cost. Write a README with the architecture diagram and an explanation of each design decision.
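One way to add API key authentication is a check inside the Lambda itself. The sketch below assumes the client sends the key in an `x-api-key` header and that the expected key is handed to the function by the caller (for example from an environment variable); `is_authorized` is a hypothetical helper name, not an AWS API.

```python
import hmac

def is_authorized(event: dict, expected_key: str) -> bool:
    """Compare the x-api-key request header with the expected key."""
    headers = event.get("headers") or {}
    # Header names are case-insensitive, so normalise before the lookup
    normalised = {str(name).lower(): value for name, value in headers.items()}
    supplied = normalised.get("x-api-key") or ""
    if not expected_key or not supplied:
        return False
    # compare_digest avoids leaking key contents through timing differences
    return hmac.compare_digest(supplied.encode(), expected_key.encode())
```

In `lambda_handler` you could call this first and return `build_response(401, {'error': 'Unauthorized'})` when it fails. If browsers call the API, `x-api-key` would also need to be listed in `Access-Control-Allow-Headers`.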
Project 2: The Document Intelligence App (RAG)
What you are building: Upload PDFs, ask questions, get accurate answers with citations, using AWS Bedrock Knowledge Bases.
Why it matters: RAG is the most commercially deployed AI pattern in enterprise today. Every organisation has documents. Every organisation wants them to be searchable. Demonstrating this architecture is immediately commercially relevant.
The architecture:
PDFs → S3 → Bedrock Knowledge Base → OpenSearch Serverless (vectors)
Client → API Gateway → Lambda (query) → Knowledge Base → Response with citations
Setup via AWS Console:
- Create S3 bucket: my-kb-documents-[unique-suffix]
- Upload 5–10 PDF documents (technical docs, company policies, anything relevant)
- Bedrock → Knowledge Bases → Create knowledge base:
  - Name: document-intelligence-kb
  - IAM role: create new
  - Data source: your S3 bucket
  - Embeddings: amazon.titan-embed-text-v2:0
  - Vector store: OpenSearch Serverless (auto-created)
- Sync the knowledge base (10–15 minutes for first sync)
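The sync step can also be scripted, which is useful once you start adding documents regularly. This is a minimal sketch: it assumes `client` is `boto3.client('bedrock-agent')`, that you have looked up your knowledge base ID and data source ID in the console, and that the job status values are the ones listed in `TERMINAL_STATES` (verify these against the current API reference). The function name and polling defaults are illustrative.

```python
import time

# Assumed terminal status values for an ingestion job
TERMINAL_STATES = {"COMPLETE", "FAILED", "STOPPED"}

def sync_knowledge_base(client, knowledge_base_id, data_source_id,
                        poll_seconds=30, max_polls=60):
    """Start an ingestion job and poll until it finishes or max_polls is reached."""
    job = client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
    )["ingestionJob"]
    job_id = job["ingestionJobId"]
    status = job["status"]
    polls = 0
    while status not in TERMINAL_STATES and polls < max_polls:
        time.sleep(poll_seconds)
        status = client.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job_id,
        )["ingestionJob"]["status"]
        polls += 1
    # Returns the last status seen, which may be non-terminal on timeout
    return status
```

Passing the client in as an argument keeps the function easy to test with a fake object, without touching AWS.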
Lambda query handler:
python
import boto3
import json

bedrock_agent = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

KNOWLEDGE_BASE_ID = 'YOUR_KB_ID_HERE'
MODEL_ARN = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0'

def lambda_handler(event, context):
    # 'body' may be present but None, so fall back to an empty JSON object
    body = json.loads(event.get('body') or '{}')
    question = body.get('question', '').strip()
    if not question:
        return response(400, {'error': 'question is required'})
    try:
        result = bedrock_agent.retrieve_and_generate(
            input={'text': question},
            retrieveAndGenerateConfiguration={
                'type': 'KNOWLEDGE_BASE',
                'knowledgeBaseConfiguration': {
                    'knowledgeBaseId': KNOWLEDGE_BASE_ID,
                    'modelArn': MODEL_ARN,
                    'retrievalConfiguration': {
                        'vectorSearchConfiguration': {'numberOfResults': 5}
                    }
                }
            }
        )
        answer = result['output']['text']
        # Flatten every retrieved reference into a source + excerpt pair
        citations = []
        for citation in result.get('citations', []):
            for ref in citation.get('retrievedReferences', []):
                source = ref.get('location', {}).get('s3Location', {}).get('uri', 'Unknown')
                excerpt = ref.get('content', {}).get('text', '')[:200]
                citations.append({'source': source, 'excerpt': excerpt})
        return response(200, {
            'answer': answer,
            'citations': citations,
            'citationCount': len(citations)
        })
    except Exception as e:
        # Log the detail, but do not leak internal error messages to the caller
        print(f"Error: {str(e)}")
        return response(500, {'error': 'Internal server error'})

def response(status_code, body):
    return {
        'statusCode': status_code,
        'headers': {'Content-Type': 'application/json', 'Access-Control-Allow-Origin': '*'},
        'body': json.dumps(body)
    }
What makes this impressive: Include citations in your response showing which document and passage the answer came from. This demonstrates you understand verifiability in AI output, a critical enterprise concern.
Project 3: AI Security Scanner for GitHub PRs
What you are building: A GitHub Actions pipeline that scans PRs with Semgrep and Trivy, uses an LLM to synthesise findings, and posts a structured Slack report.
Why it matters: DevSecOps + AI is one of the most in-demand skill combinations. This project lives at the intersection of platform engineering, security, and AI, a rare and valued combination.
Full implementation: This is covered in Part 2 of the CI/CD series in this publication. The key components are:
- GitHub Actions workflow triggering on PR
- Semgrep scanning source code (runs in container)
- Trivy scanning the built Docker image
- Node.js script that loads scanner output, sends to GPT-4o or Bedrock, and receives a structured risk assessment
- Slack Block Kit message with risk score, top findings, and remediation steps
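To give a feel for the last component, here is a rough Python sketch of assembling the Slack Block Kit payload (the series implementation uses a Node.js script; this is only an illustration of the message shape). The `findings` structure with `title`, `severity`, and `fix` keys is an assumption for this example, not a Semgrep or Trivy output format, and `build_slack_report` is a hypothetical helper.

```python
def build_slack_report(repo, pr_number, risk_score, findings, max_findings=3):
    """Build a Block Kit payload with a header, risk score, and top findings."""
    blocks = [
        {
            "type": "header",
            "text": {"type": "plain_text", "text": f"Security scan: {repo} PR #{pr_number}"},
        },
        {
            "type": "section",
            "text": {"type": "mrkdwn", "text": f"*Risk score:* {risk_score}"},
        },
        {"type": "divider"},
    ]
    # One section per finding, capped so the message stays readable
    for finding in findings[:max_findings]:
        text = f"*[{finding['severity']}]* {finding['title']}\n_Fix:_ {finding['fix']}"
        blocks.append({"type": "section", "text": {"type": "mrkdwn", "text": text}})
    return {"blocks": blocks}
```

The returned dictionary can be serialised with `json.dumps` and posted to a Slack incoming webhook URL.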
What makes this impressive: Add historical tracking: store scan results in DynamoDB and add a simple dashboard showing trend lines. Does the security posture of this repository improve over time? Being able to show that improvement transforms a scanner into a governance tool.
Project 4: Multi-Cloud Cost Intelligence Dashboard
What you are building: A tool that queries cloud billing APIs, uses an LLM to generate natural-language cost analysis, and surfaces anomalies in a simple dashboard.
Why it matters: Cloud cost waste averages 28% across enterprises [1]. FinOps + AI is a commercially urgent problem. Building even a basic solution demonstrates commercial awareness and architectural creativity.
AWS Cost Explorer integration:
python
import boto3
import json
from datetime import datetime, timedelta

ce = boto3.client('ce', region_name='us-east-1')
bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

def get_cost_data(days=30):
    end = datetime.now().strftime('%Y-%m-%d')
    start = (datetime.now() - timedelta(days=days)).strftime('%Y-%m-%d')
    response = ce.get_cost_and_usage(
        TimePeriod={'Start': start, 'End': end},
        Granularity='DAILY',
        Metrics=['BlendedCost'],
        GroupBy=[
            {'Type': 'DIMENSION', 'Key': 'SERVICE'},
            {'Type': 'DIMENSION', 'Key': 'REGION'}
        ]
    )
    return response['ResultsByTime']

def analyse_costs_with_ai(cost_data):
    # Summarise cost data
    service_totals = {}
    for day in cost_data:
        for group in day['Groups']:
            service = group['Keys'][0]
            cost = float(group['Metrics']['BlendedCost']['Amount'])
            service_totals[service] = service_totals.get(service, 0) + cost
    top_services = sorted(service_totals.items(), key=lambda x: x[1], reverse=True)[:10]
    prompt = f"""You are a FinOps engineer. Analyse this AWS cost data and provide:
1. A 2-sentence executive summary of the cost picture
2. Top 3 cost concerns or anomalies
3. Three specific, actionable optimisation recommendations with estimated savings
Top 10 services by spend (last 30 days):
{json.dumps(top_services, indent=2)}
Total spend: ${sum(service_totals.values()):.2f}
Return only valid JSON with keys: summary, concerns, recommendations"""
    response = bedrock.invoke_model(
        modelId='anthropic.claude-3-haiku-20240307-v1:0',
        body=json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 800,
            "messages": [{"role": "user", "content": prompt}]
        })
    )
    result = json.loads(response['body'].read())
    content = result['content'][0]['text'].strip()
    content = content.replace('```json', '').replace('```', '').strip()
    return json.loads(content)
What makes this impressive: Add a simple React dashboard that visualises the cost trends and AI analysis side by side. Add anomaly detection: flag any service with week-over-week cost increase greater than 20%.
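The week-over-week rule can be sketched in a few lines. This assumes you have already reshaped the Cost Explorer results into a dictionary mapping each service name to a list of daily costs, oldest first; that shape and the name `find_cost_anomalies` are choices made for this example.

```python
def find_cost_anomalies(daily_costs, threshold=0.20):
    """Flag services whose last 7 days cost more than `threshold` above the prior 7 days."""
    anomalies = []
    for service, costs in daily_costs.items():
        if len(costs) < 14:
            continue  # not enough history for two full weeks
        previous = sum(costs[-14:-7])
        current = sum(costs[-7:])
        if previous <= 0:
            continue  # skip new services to avoid dividing by zero
        change = (current - previous) / previous
        if change > threshold:
            anomalies.append({
                "service": service,
                "previousWeek": round(previous, 2),
                "currentWeek": round(current, 2),
                "changePct": round(change * 100, 1),
            })
    # Largest increases first
    return sorted(anomalies, key=lambda a: a["changePct"], reverse=True)
```

A deterministic rule like this pairs well with the LLM summary: the code decides what counts as an anomaly, and the model only explains it.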
Project 5: Bedrock Agent with Tool Use
What you are building: An AI agent powered by Bedrock Agents that can answer questions AND take multi-step actions: query a database, retrieve from a knowledge base, and return structured answers.
Why it matters: Agentic AI is the frontier of enterprise AI deployment. Demonstrating you understand action groups, tool use, and human-in-the-loop design is leading-edge architectural knowledge.
Setup:
- Create a Bedrock Agent in the console
- Define an action group with a Lambda executor:
python
import json

# get_order_from_dynamodb and calculate_delivery are your own helpers,
# assumed to be defined elsewhere in this module.

def lambda_handler(event, context):
    action_group = event['actionGroup']
    function = event['function']
    parameters = {p['name']: p['value'] for p in event.get('parameters', [])}
    if function == 'get_order_status':
        order_id = parameters.get('order_id')
        # Query DynamoDB for order
        result = get_order_from_dynamodb(order_id)
    elif function == 'calculate_delivery_estimate':
        postcode = parameters.get('postcode')
        service_level = parameters.get('service_level', 'standard')
        result = calculate_delivery(postcode, service_level)
    else:
        result = {'error': f'Unknown function: {function}'}
    # Function-based action groups expect the body under the TEXT content type,
    # wrapped in a 'response' object alongside the message version
    return {
        'messageVersion': event.get('messageVersion', '1.0'),
        'response': {
            'actionGroup': action_group,
            'function': function,
            'functionResponse': {
                'responseBody': {'TEXT': {'body': json.dumps(result)}}
            }
        }
    }
- Test: “What is the status of order ORD-12345 and when will it be delivered to SW1A 1AA?”
- The agent decides to call get_order_status then calculate_delivery_estimate, synthesises the results, and responds naturally.
What makes this impressive: Add a human approval step for actions with side effects (updating orders, processing refunds). This demonstrates responsible AI design: an understanding that some AI-initiated actions need human confirmation before execution.
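A human approval step could be sketched as a gate in front of the action group functions. Everything here is an assumption for illustration: the function names in `SIDE_EFFECT_FUNCTIONS`, the in-memory `PENDING_APPROVALS` dictionary (a stand-in for a durable store such as a DynamoDB table), and the `gate_action` and `approve` helpers.

```python
import uuid

# Hypothetical function names that change state; adjust to your action group
SIDE_EFFECT_FUNCTIONS = {"update_order", "process_refund"}

# In-memory stand-in for a durable store such as a DynamoDB table
PENDING_APPROVALS = {}

def gate_action(function_name, parameters, execute):
    """Run read-only functions immediately; park side-effecting ones for review."""
    if function_name not in SIDE_EFFECT_FUNCTIONS:
        return {"status": "executed", "result": execute(parameters)}
    approval_id = str(uuid.uuid4())
    PENDING_APPROVALS[approval_id] = {"function": function_name, "parameters": parameters}
    return {"status": "pending_approval", "approvalId": approval_id}

def approve(approval_id, execute):
    """Intended to be called from a reviewer's tooling, never by the agent itself."""
    request = PENDING_APPROVALS.pop(approval_id)
    return {"status": "executed", "result": execute(request["parameters"])}
```

The key design choice is that the agent only ever receives a `pending_approval` result for risky actions; the actual execution happens on a separate path that a person triggers.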
How to Present These Projects
Every project needs:
README.md structure:
- Problem statement (what real problem does this solve?)
- Architecture diagram (ASCII or linked image)
- Tech stack (services used and why each was chosen)
- Getting started (how to deploy it)
- What I learned / what I would do differently
In interviews: “Walk me through something you’ve built.” Pick the project most relevant to the role. Explain: the problem, the architecture decision, one tradeoff you made, and one thing you would do differently now.
The specificity of your answer (the tradeoff you made, the mistake you corrected, the edge case that surprised you) is what distinguishes a project you actually built from one you read about.
References
[1] Flexera. 2026 State of the Cloud Report: Organisations waste 28% of cloud spend. https://info.flexera.com/CM-REPORT-State-of-the-Cloud
[2] AWS. Amazon Bedrock Documentation. https://docs.aws.amazon.com/bedrock/
[3] Iternal.ai. AI Skills Gap 2026: 1.6 million open AI-related positions globally. https://iternal.ai/ai-skills-gap
[4] GitHub / TechCrunch. GitHub Copilot crosses 20 million users. July 2025. https://techcrunch.com/2025/07/30/github-copilot-crosses-20-million-all-time-users/
[5] DORA. Accelerate State of DevOps Report 2024. https://dora.dev/research/2024/dora-report/
Emmanuela Opurum is a Solutions Architect and Cloud Engineer building AI-native cloud tooling and contributing to open source projects in the CNCF ecosystem. She writes about platform engineering, cloud architecture, and developer experience.
GitHub: Cloud-Architect-Emma