Initial InfraPulse scaffold
@@ -0,0 +1,29 @@
# Alerting Design

Alerting is built around alert rules, incidents, notification policies, and notification history.

## Alert Rules

An alert rule turns monitor status or metric data into an incident. Initial rule behavior should support:

- Failure thresholds
- Recovery notifications
- Cooldown
- Severity
- Acknowledge
- Silence
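
As a rough illustration of how failure thresholds, recovery notifications, and cooldown might interact, here is a minimal sketch. The `AlertRule` and `RuleState` names, their fields, and the default values are assumptions made for this example; none of them exist in the scaffold yet.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class AlertRule:
    # Hypothetical fields; names and defaults are illustrative only.
    failure_threshold: int = 3                   # consecutive failures before an incident opens
    cooldown: timedelta = timedelta(minutes=10)  # minimum gap between "opened" notifications
    notify_on_recovery: bool = True
    severity: str = "warning"


@dataclass
class RuleState:
    consecutive_failures: int = 0
    incident_open: bool = False
    last_notified_at: Optional[datetime] = None


def evaluate(rule: AlertRule, state: RuleState, check_ok: bool, now: datetime) -> Optional[str]:
    """Apply one check result and return the notification to send, or None."""
    if check_ok:
        state.consecutive_failures = 0
        if state.incident_open:
            state.incident_open = False
            return "recovered" if rule.notify_on_recovery else None
        return None

    state.consecutive_failures += 1
    if state.incident_open or state.consecutive_failures < rule.failure_threshold:
        return None

    # Threshold reached: the incident opens even if the notification is suppressed.
    state.incident_open = True
    in_cooldown = (
        state.last_notified_at is not None
        and now - state.last_notified_at < rule.cooldown
    )
    if in_cooldown:
        return None
    state.last_notified_at = now
    return "opened"
```

In this sketch a flapping monitor still opens and resolves incidents, but the cooldown limits how often an "opened" notification goes out. Acknowledge and silence are left out because they act on incidents rather than on rule evaluation.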

## Incidents

Incidents represent active or historical alert events. They include opened time, resolved time, current status, severity, related asset, related monitor, related alert rule, notification history, acknowledgement, and silence state.
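
The fields listed above could be modeled roughly as follows. This is a plain dataclass sketch rather than the eventual SQLAlchemy model; the field names, the `IncidentStatus` values, and the helper methods are assumptions for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import List, Optional


class IncidentStatus(str, Enum):
    # Hypothetical status values.
    OPEN = "open"
    ACKNOWLEDGED = "acknowledged"
    RESOLVED = "resolved"


@dataclass
class Incident:
    asset_id: int
    monitor_id: int
    alert_rule_id: int
    severity: str
    opened_at: datetime
    status: IncidentStatus = IncidentStatus.OPEN
    resolved_at: Optional[datetime] = None
    acknowledged_by: Optional[str] = None
    silenced_until: Optional[datetime] = None
    notification_history: List[str] = field(default_factory=list)

    def acknowledge(self, username: str) -> None:
        """Record who acknowledged the incident; resolved incidents keep their status."""
        self.acknowledged_by = username
        if self.status is IncidentStatus.OPEN:
            self.status = IncidentStatus.ACKNOWLEDGED

    def resolve(self, now: datetime) -> None:
        """Mark the incident as historical."""
        self.status = IncidentStatus.RESOLVED
        self.resolved_at = now

    def is_silenced(self, now: datetime) -> bool:
        """True while a silence window is active."""
        return self.silenced_until is not None and now < self.silenced_until
```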

## Notifications

Initial channels:

- Email / SMTP
- Mattermost incoming webhook
- Zoom Team Chat incoming webhook
- Generic webhook

Alert messages should be human-readable and include asset, check, status, duration, timestamps, and a link back to InfraPulse.
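
A message with those fields might be assembled like this. The function name, the line layout, and the `/incidents/<id>` link path are hypothetical; the scaffold does not define a message format yet.

```python
from datetime import datetime, timezone


def format_alert_message(
    asset: str,
    check: str,
    status: str,
    opened_at: datetime,
    now: datetime,
    base_url: str,
    incident_id: int,
) -> str:
    """Build a plain-text alert body that any channel can send."""
    minutes = int((now - opened_at).total_seconds() // 60)
    return "\n".join(
        [
            f"[{status.upper()}] {asset}: {check}",
            f"Duration: {minutes} min",
            f"Opened: {opened_at.isoformat()}",
            f"Updated: {now.isoformat()}",
            # Assumed URL layout; the real route may differ.
            f"Details: {base_url.rstrip('/')}/incidents/{incident_id}",
        ]
    )


if __name__ == "__main__":
    opened = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    current = datetime(2024, 1, 1, 12, 15, tzinfo=timezone.utc)
    print(
        format_alert_message(
            "web-01", "HTTP status", "down", opened, current,
            "https://infrapulse.example", 42,
        )
    )
```

Keeping the body as plain text means the same string can go into an email, a chat webhook payload, or a generic webhook field without per-channel templates.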
@@ -0,0 +1,28 @@
# Architecture

InfraPulse is a monorepo with four main areas:

- `backend`: FastAPI service exposing REST endpoints and owning database access.
- `worker`: Background scheduler and collectors for checks and alert evaluation.
- `frontend`: React application for authenticated operations.
- `docs`: Product, security, alerting, discovery, and planning documents.

## Backend

The backend uses FastAPI, SQLAlchemy, Alembic, Pydantic, PostgreSQL, and JWT authentication. It owns core domain models: users, assets, credentials, monitors, check results, metrics, alert rules, incidents, notification channels, and audit events.

## Worker

The worker is a separate Python process. It will poll due monitors, run collectors, write check results and metrics, evaluate alert rules, open or resolve incidents, and enqueue notification delivery.
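
The first two steps of that loop, polling due monitors and running collectors, can be sketched as a single scheduler pass. The `Monitor` fields and the `run_due_monitors` function are assumptions for illustration; a real pass would also persist results, evaluate alert rules, and enqueue notifications.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Any, Callable, Dict, List


@dataclass
class Monitor:
    # Hypothetical in-memory stand-in for a monitor row.
    id: int
    interval: timedelta
    next_run_at: datetime


def run_due_monitors(
    monitors: List[Monitor],
    collect: Callable[[Monitor], Any],
    now: datetime,
) -> Dict[int, Any]:
    """Run every monitor whose next_run_at has passed and reschedule it."""
    results: Dict[int, Any] = {}
    for monitor in monitors:
        if monitor.next_run_at > now:
            continue  # not due yet
        results[monitor.id] = collect(monitor)
        monitor.next_run_at = now + monitor.interval
    return results
```

Passing `collect` and `now` in as arguments keeps the pass easy to test without a database, a network, or a real clock.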

## Frontend

The frontend uses React, TypeScript, Vite, and Tailwind CSS. It starts with protected routes, a login flow, and dashboard/inventory shells.

## Queue and Scheduling

Redis is included from the beginning so background work can move from a simple scheduler to a real queue without changing the deployment shape.

## Plugin Direction

Plugins will eventually implement connection tests, discovery, collection, and default alert rule suggestions. Initial collectors can be simpler, but they should not block future plugin extraction.
@@ -0,0 +1,39 @@
# Development

## Local Docker Stack

```bash
cp .env.example .env
docker compose -f docker-compose.dev.yml up --build
```

The dev stack runs PostgreSQL, Redis, backend, worker, and frontend.

## Backend

Backend source lives in `backend/app`. Migrations live in `backend/alembic`.

Useful commands from `backend/`:

```bash
alembic upgrade head
uvicorn app.main:app --reload
```

## Frontend

Frontend source lives in `frontend/src`.

Useful commands from `frontend/`:

```bash
npm install
npm run dev
```

## Tests and Checks

```bash
./scripts/lint.sh
./scripts/test.sh
```
@@ -0,0 +1,30 @@
# Discovery Design

Guided discovery is a core InfraPulse workflow.

```text
Add target
Choose target type
Enter address and credentials
Test connection
Discover available items
Show friendly list of discovered items
User selects what to monitor
User selects what should alert
Create monitors and optional alert rules
```

## Monitor vs Alert Separation

InfraPulse must allow monitoring without alerting. Every discovered item should eventually support separate choices:

- Collect metric
- Graph metric
- Show on dashboard
- Alert on condition

This prevents every monitor from automatically becoming an alert source.
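
One way to picture the separation is a set of independent flags per discovered item. The `DiscoveredItemChoices` class and `plan_creation` function below are hypothetical, and the sketch assumes an item can only alert if its metric is also collected.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DiscoveredItemChoices:
    # Each choice is independent; nothing is implied by another flag.
    label: str
    collect_metric: bool = False
    graph_metric: bool = False
    show_on_dashboard: bool = False
    alert_on_condition: bool = False


def plan_creation(choices: List[DiscoveredItemChoices]) -> Dict[str, List[str]]:
    """Decide which monitors and alert rules the wizard should create."""
    monitors = [c.label for c in choices if c.collect_metric]
    # Assumption: alerting needs collected data, so both flags must be set.
    alert_rules = [
        c.label for c in choices if c.collect_metric and c.alert_on_condition
    ]
    return {"monitors": monitors, "alert_rules": alert_rules}
```

With these flags, selecting an item for collection creates a monitor only; an alert rule appears only when the user also opts in to alerting.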

## Friendly SNMP

The normal UI must not show raw OIDs. SNMP profiles should translate implementation details into friendly labels such as interface names, traffic counters, status, errors, uptime, CPU, and memory.
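
A profile could be as simple as a lookup from OID prefixes to labels. The OIDs below are standard MIB-II and IF-MIB objects (`sysUpTime`, `ifDescr`, `ifOperStatus`, `ifInOctets`, `ifInErrors`, `ifOutOctets`); the `FRIENDLY_LABELS` table, the `friendly_label` function, and the "Port N" wording are assumptions for illustration. Treating the interface index as the port number is a simplification that does not hold on every device.

```python
# Prefix shared by all per-interface columns in the IF-MIB interface table.
IF_ENTRY_PREFIX = "1.3.6.1.2.1.2.2.1."

# Hypothetical profile: numeric OID -> label shown in the UI.
FRIENDLY_LABELS = {
    "1.3.6.1.2.1.1.3": "Uptime",                # sysUpTime
    "1.3.6.1.2.1.2.2.1.2": "Interface name",    # ifDescr
    "1.3.6.1.2.1.2.2.1.8": "Status",            # ifOperStatus
    "1.3.6.1.2.1.2.2.1.10": "Inbound traffic",  # ifInOctets
    "1.3.6.1.2.1.2.2.1.14": "Inbound errors",   # ifInErrors
    "1.3.6.1.2.1.2.2.1.16": "Outbound traffic", # ifOutOctets
}


def friendly_label(oid: str) -> str:
    """Translate a numeric OID, with optional instance suffix, into a UI label."""
    oid = oid.lstrip(".")
    for base, label in FRIENDLY_LABELS.items():
        if oid == base or oid.startswith(base + "."):
            instance = oid[len(base) + 1:]
            if base.startswith(IF_ENTRY_PREFIX) and instance:
                # Simplification: use the interface index as the port number.
                return f"Port {instance} {label.lower()}"
            return label
    return "Unknown item"
```

For example, `friendly_label("1.3.6.1.2.1.2.2.1.16.5")` yields "Port 5 outbound traffic", so the raw OID never has to reach the normal UI.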
@@ -0,0 +1,49 @@
# Gitea Issue Plan

## Milestones

- Milestone 1: Project Foundation
- Milestone 2: Authentication and Security
- Milestone 3: Inventory and Monitor Core
- Milestone 4: First Checks and Metrics
- Milestone 5: Alerting and Notifications
- Milestone 6: Guided Discovery
- Milestone 7: MVP Polish

## Suggested Initial Issues

1. Create repository structure
2. Add Docker Compose development environment
3. Create FastAPI backend skeleton
4. Create React frontend skeleton
5. Add PostgreSQL and Alembic migrations
6. Add user model and authentication
7. Add role-based access control
8. Add asset data model
9. Add credential vault model with encrypted secrets
10. Add monitor data model
11. Add check result and metric models
12. Add alert rule and incident models
13. Add notification channel model
14. Implement ping monitor
15. Implement TCP port monitor
16. Implement HTTP status monitor
17. Implement website content monitor
18. Implement TLS expiry monitor
19. Implement basic alert evaluation
20. Implement email notifications
21. Implement Mattermost webhook notifications
22. Implement Zoom webhook notifications
23. Implement generic webhook notifications
24. Add login page
25. Add dashboard shell
26. Add asset list page
27. Add asset detail page
28. Add alert center page
29. Add notification settings page
30. Add credential vault page
31. Add first guided monitor creation wizard
32. Add audit log foundation
33. Add README setup instructions
34. Add architecture documentation
35. Add security documentation
@@ -0,0 +1,36 @@
# Plugin Design

Plugins will let InfraPulse add collectors and discovery logic without hard-coding every integration into the core API.

Target shape:

```python
class InfraPulsePlugin:
    """Target shape for a plugin; every method is a placeholder."""

    name: str
    display_name: str

    def test_connection(self, target, credentials):
        # Verify the target is reachable with the given credentials.
        pass

    def discover(self, target, credentials):
        # Find the items on the target that can be monitored.
        pass

    def collect(self, monitor):
        # Gather check results and metrics for one monitor.
        pass

    def default_alert_rules(self, discovered_item):
        # Suggest alert rules for a discovered item.
        pass
```

The first implementation can use simple internal collectors, but the interfaces should preserve this path.
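
One way to keep that path open is for an internal collector to expose the same four methods as the target shape. The sketch below is hypothetical: `WebsiteCollector`, the injected `fetch_status` callable, and the dictionary shapes are assumptions for illustration, not existing InfraPulse code.

```python
from typing import Any, Callable, Dict, List, Optional


class WebsiteCollector:
    """Hypothetical internal collector that mirrors the plugin target shape."""

    name = "website"
    display_name = "Website checks"

    def __init__(self, fetch_status: Callable[[str], int]) -> None:
        # fetch_status returns an HTTP status code for a URL. It is injected
        # so the collector can be exercised without real network access.
        self._fetch_status = fetch_status

    def test_connection(self, target: str, credentials: Optional[Any] = None) -> bool:
        # Treat any non-5xx response as "reachable" (an assumption).
        return self._fetch_status(target) < 500

    def discover(self, target: str, credentials: Optional[Any] = None) -> List[Dict[str, str]]:
        # A website exposes a single monitorable item in this sketch.
        return [{"kind": "http_status", "url": target}]

    def collect(self, monitor: Dict[str, str]) -> Dict[str, Any]:
        status = self._fetch_status(monitor["url"])
        return {"status_code": status, "ok": 200 <= status < 400}

    def default_alert_rules(self, discovered_item: Dict[str, str]) -> List[Dict[str, Any]]:
        # Suggested default only; the threshold value is illustrative.
        return [{"item": discovered_item["kind"], "failure_threshold": 3}]
```

Because the method names and arguments match the target shape, a collector like this could later move into a plugin without changing its callers.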

Planned plugin areas:

- Website checks
- Generic SNMP
- Proxmox VE
- Docker
- UniFi
- TrueNAS
- Technitium DNS
- Active Directory
@@ -0,0 +1,48 @@
# Roadmap

## v0.1

- Login system, local users, roles, and protected routes
- PostgreSQL, Alembic, API service, worker service, and frontend app
- Assets, credentials, monitors, alert rules, incidents, and notification channels
- HTTP/HTTPS status checks, expected text checks, TLS expiry checks
- Alert evaluation, incident acknowledgement, silence, and notification history
- Email, Mattermost, Zoom Team Chat, and generic webhook notification foundations
- Basic dashboard, website monitor creation, alert center, credential vault, and admin pages

## v0.2

- Proxmox VE plugin
- Docker plugin
- Linux and Windows exporter support
- Better graphing
- Maintenance windows
- Notification routing

## v0.3

- UniFi plugin
- TrueNAS plugin
- Technitium DNS plugin
- Active Directory health checks
- LDAP/AD login
- Audit log expansion

## v0.4

- Distributed collectors
- Subnet discovery
- Device templates
- Custom dashboards
- Public/internal status pages

## Future NMS Expansion

- SNMP traps
- Syslog
- NetFlow/sFlow
- Topology maps
- Config backups
- Multi-site support
- Escalation policies
- On-call schedules
@@ -0,0 +1,30 @@
# Security

InfraPulse must be secure from the beginning because it will store infrastructure credentials.

## Authentication

The initial implementation supports local username/password login with hashed passwords and JWT bearer tokens. Dashboard and API access must not be available anonymously.

Initial roles:

- Viewer: can view dashboards, assets, monitors, graphs, and alerts.
- Operator: can acknowledge alerts, silence alerts, and manage incidents.
- Admin: can manage assets, monitors, credentials, notification channels, and alert rules.
- Owner: can manage users, roles, global settings, and authentication settings.
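
A permission check over these roles could look like the sketch below. It assumes the roles are cumulative (each role includes the permissions of the ones listed before it), which the list above does not state explicitly; the action names and the `is_allowed` helper are also hypothetical.

```python
from enum import IntEnum


class Role(IntEnum):
    # Ordered so that a higher value includes the lower roles (assumption).
    VIEWER = 1
    OPERATOR = 2
    ADMIN = 3
    OWNER = 4


# Hypothetical action names mapped to the lowest role allowed to perform them.
REQUIRED_ROLE = {
    "view_dashboard": Role.VIEWER,
    "acknowledge_alert": Role.OPERATOR,
    "silence_alert": Role.OPERATOR,
    "manage_credentials": Role.ADMIN,
    "manage_alert_rules": Role.ADMIN,
    "manage_users": Role.OWNER,
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if the role may perform the action; unknown actions are denied."""
    required = REQUIRED_ROLE.get(action)
    return required is not None and role >= required
```

Denying unknown actions by default means a new endpoint stays inaccessible until someone assigns it a required role.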

## Credential Storage

Credential records are modeled separately from monitors and assets. Secret fields must be encrypted at rest before real credential storage is enabled. Stored secret values must never be returned to the frontend after creation.

Rules:

- Use `INFRAPULSE_SECRET_KEY` from the environment.
- Never log secrets.
- Mask saved secrets in the UI.
- Audit credential create, update, and delete events.
- Prefer read-only API tokens and least-privileged credentials.
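
Two of these rules, reading the key from the environment and masking saved secrets, can be sketched without committing to an encryption library. The `Credential` fields, the `to_api_response` and `load_secret_key` helpers, and the mask string are assumptions for illustration; the encryption step itself is deliberately left out.

```python
import os
from dataclasses import dataclass
from typing import Any, Dict, Mapping

MASK = "********"


@dataclass
class Credential:
    # Hypothetical record; only ciphertext is ever stored.
    id: int
    name: str
    username: str
    encrypted_secret: bytes


def load_secret_key(env: Mapping[str, str] = os.environ) -> str:
    """Read the encryption key from the environment and fail fast if it is missing."""
    key = env.get("INFRAPULSE_SECRET_KEY")
    if not key:
        # The message names the variable but never includes a secret value.
        raise RuntimeError("INFRAPULSE_SECRET_KEY is not set")
    return key


def to_api_response(credential: Credential) -> Dict[str, Any]:
    """Serialize a credential for the frontend without exposing the secret."""
    return {
        "id": credential.id,
        "name": credential.name,
        "username": credential.username,
        "secret": MASK,
        "has_secret": bool(credential.encrypted_secret),
    }
```

Building the response from an explicit field list, rather than dumping the whole record, makes it harder to leak the ciphertext by accident when new fields are added.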

## Future Authentication

Planned future options include LDAP/Active Directory login, OIDC, SAML if needed, and API tokens.
@@ -0,0 +1,24 @@
# InfraPulse Vision

InfraPulse is a secure, self-hosted monitoring platform for homelabs, small businesses, and internal IT teams.

The v0.1 product should feel like a polished appliance, not a pile of raw monitoring configuration. Users should be guided through adding targets, testing connections, discovering useful items, choosing what to monitor, and separately choosing what should alert.

## Design Philosophy

InfraPulse exposes intent, not implementation details.

Raw SNMP OIDs, probe internals, and collector details belong behind friendly profiles and advanced tools. The normal UI should say things like "Port 5 outbound traffic", "Graph this port", and "Alert if port goes down".

## Initial Scope

The initial release targets:

- Authentication and roles
- Assets, monitors, alert rules, incidents, credentials, and notification channels
- Website checks for HTTP status, expected text, and TLS expiry
- Dashboard status views
- Mattermost, Zoom Team Chat, email, and generic webhook notifications
- Foundations for guided discovery and plugins

Advanced NMS features such as topology, traps, syslog, NetFlow, distributed pollers, and config backup are future work.