Informate - News Article Scraper and Summarizer
A full-stack application that scrapes news articles from websites, automatically summarizes them using AI, and stores them locally for reference.
Features
- 🔍 Web Scraping: Extract article content from news websites
- 🤖 AI Summarization: Generate concise summaries using OpenAI GPT
- 🔑 Keyword Extraction: Automatically identify key topics
- 🖼️ Image Download: Save relevant images from articles
- 👤 User Authentication: Secure login and registration system
- 💾 Local Storage: SQLite database for article persistence
- 🌐 REST API: Backend API for frontend integration
- ⚛️ React Frontend: Modern, responsive web interface with dark mode
- 🎨 Modern UI: Clean, minimalist design with Tailwind CSS
- 📱 Responsive Design: Optimized for desktop, tablet, and mobile
- 🌙 Dark Mode: System-aware theme with manual toggle
- 🐳 Docker Ready: One-command deployment with Docker Compose
Tech Stack
Backend
- Java 17+ with Maven
- Spark Framework for REST API
- SQLite for data storage
- JSoup for web scraping
- OpenAI API for AI summarization
Frontend
- React 19 with modern hooks
- React Router for navigation
- Axios for API communication
- Context API for state management
- Tailwind CSS for modern styling
- Dark mode support
Deployment
- Docker & Docker Compose for containerization
- Nginx for serving frontend in production
- Multi-stage builds for optimized images
Getting Started
You can run Informate either locally or using Docker. Docker is recommended for easier setup and deployment.
Option 1: Docker (Recommended)
Prerequisites
- Docker 20.10+
- Docker Compose 2.0+
- OpenAI API key
Quick Start
- Clone the repository
    git clone https://github.com/yourusername/informate.git
    cd informate
- Create the proxy network
    docker network create proxy
- Set up environment variables
    cp .env.example .env
    # Edit .env and add your OpenAI API key
- Run with Docker Compose
    docker-compose up -d
- Access the application
  - Frontend: http://localhost:3001
  - Backend API: http://localhost:8080
For detailed Docker documentation, see DOCKER.md.
Option 2: Local Installation
Prerequisites
- Java Development Kit (JDK) 17 or later
- Maven 3.6+
- Node.js 16+ and npm
- OpenAI API key
Installation
- Clone the repository
    git clone https://github.com/yourusername/informate.git
    cd informate
- Set up environment variables
    # Copy template files
    cp env.template .env
    cp Backend/informate/env.template Backend/informate/.env
    # Edit the .env files and add your OpenAI API key
    OPENAI_API_KEY=your_openai_api_key_here
- Build and run the backend
    cd Backend/informate
    mvn clean install
    mvn exec:java -Dexec.mainClass="com.example.informate.main"
- Install and run the frontend
    cd Frontend/informate-frontend
    npm install
    npm start
Getting an OpenAI API Key
- Visit OpenAI API Keys
- Create an account or log in
- Generate a new API key
- Copy the key (starts with sk-...)
- Add it to your .env files
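Before starting the backend, it can help to confirm that the key actually landed in the .env file. The following is a minimal sketch in JavaScript, not the logic of the project's own EnvLoader.java. The helper names `parseEnv` and `checkOpenAiKey` are hypothetical, and the only checks it makes are the ones this README mentions: the `OPENAI_API_KEY` variable, the template placeholder value, and the `sk-` prefix.

```javascript
// Parse KEY=value lines from the text of a .env file.
// Blank lines and lines starting with "#" are skipped.
function parseEnv(text) {
  const vars = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    if (value.length >= 2 && /^(["']).*\1$/.test(value)) {
      value = value.slice(1, -1);
    }
    vars[key] = value;
  }
  return vars;
}

// Return a description of the problem, or null if the key looks plausible.
// This only inspects the text; it does not contact OpenAI.
function checkOpenAiKey(vars) {
  const key = vars.OPENAI_API_KEY;
  if (!key) return "OPENAI_API_KEY is missing or empty";
  if (key === "your_openai_api_key_here") {
    return "OPENAI_API_KEY still has the placeholder value";
  }
  if (!key.startsWith("sk-")) return "OPENAI_API_KEY does not start with sk-";
  return null;
}
```

Under Node.js, one way to use it is `checkOpenAiKey(parseEnv(require("fs").readFileSync(".env", "utf8")))`, which returns `null` when nothing looks wrong.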
Usage
Web Interface
- Open your browser to http://localhost:3000 (or http://localhost:3001 when running with Docker)
- Register a new account or log in
- Use the dashboard to:
  - Add new articles by URL
  - View article summaries
  - Search and filter articles
  - View full article details with images
API Endpoints
Authentication
- POST /api/auth/register - Register new user
- POST /api/auth/login - User login
- GET /api/auth/validate - Validate token
Articles
- POST /api/articles/add - Add new article
- GET /api/articles/all - Get all articles
- GET /api/articles/:title - Get article by title
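Because articles are looked up by title, a title containing spaces or punctuation has to be percent-encoded before it is placed in the URL path. The sketch below shows one way a client might build requests for the endpoints listed above. The names `API_BASE`, `articleUrl`, and `buildRequest` are hypothetical. This README does not document request or response formats, so the JSON body and the `Authorization: Bearer` header are assumptions to check against the backend source.

```javascript
// Base URL of the backend; port 8080 is the one given in this README.
const API_BASE = "http://localhost:8080";

// Build the URL for GET /api/articles/:title.
// encodeURIComponent escapes spaces, "/", "?", "&" and similar characters
// so the title stays a single path segment.
function articleUrl(title, base = API_BASE) {
  return `${base}/api/articles/${encodeURIComponent(title)}`;
}

// Build the { url, options } pair for a fetch()-style call.
function buildRequest(method, path, { body, token, base = API_BASE } = {}) {
  const headers = {};
  const options = { method, headers };
  if (body !== undefined) {
    // ASSUMPTION: the backend accepts JSON request bodies.
    headers["Content-Type"] = "application/json";
    options.body = JSON.stringify(body);
  }
  if (token) {
    // ASSUMPTION: the token is sent as a Bearer token; the real header
    // name and scheme are not specified in this README.
    headers["Authorization"] = `Bearer ${token}`;
  }
  return { url: `${base}${path}`, options };
}
```

For example, `articleUrl("AI & the News?")` yields `http://localhost:8080/api/articles/AI%20%26%20the%20News%3F`, and the result of `buildRequest("GET", "/api/articles/all", { token })` can be passed to `fetch(url, options)`.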
Project Structure
Informate/
├── Backend/
│   └── informate/
│       ├── src/main/java/com/example/informate/
│       │   ├── main.java        # Main application entry point
│       │   ├── auth.java        # Authentication logic
│       │   ├── articles.java    # Article management
│       │   ├── scraper.java     # Web scraping functionality
│       │   ├── AI.java          # OpenAI integration
│       │   └── EnvLoader.java   # Environment variable loader
│       ├── pom.xml              # Maven dependencies
│       └── env.template         # Environment template
├── Frontend/
│   └── informate-frontend/
│       ├── src/
│       │   ├── components/      # React components
│       │   ├── contexts/        # Context providers
│       │   └── App.js           # Main app component
│       └── package.json         # Node.js dependencies
├── .gitignore                   # Git ignore rules
├── env.template                 # Environment template
└── README.md                    # This file
Security Notes
- Environment variables (.env files) are excluded from version control
- Database files are not committed to prevent data leaks
- API keys and sensitive data should never be hardcoded
Development
Adding New Features
- Backend changes go in Backend/informate/src/main/java/com/example/informate/
- Frontend changes go in Frontend/informate-frontend/src/
- Update API documentation when adding new endpoints
Database Schema
The application uses SQLite with two main tables:
- user - User authentication data
- articles - Article content and metadata
Troubleshooting
Common Issues
- "OpenAI API key not found"
  - Ensure the .env files are created with a valid API key
  - Check that the key starts with sk-
- "Database connection failed"
  - Ensure SQLite is available
  - Check file permissions in the project directory
- "Article scraping failed"
  - Verify the URL is accessible
  - Some websites may block automated access
- Frontend not connecting to backend
  - Ensure the backend is running on port 8080
  - Check CORS configuration
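For the last issue, a small script can separate "backend not reachable" from "backend reachable but CORS not allowing the frontend origin". This is a rough sketch meant to run under Node.js 18 or later, which ships a global `fetch`. The function name `checkBackend` is hypothetical, and whether /api/articles/all answers without a login token is not stated in this README, so any HTTP response at all is treated as "reachable".

```javascript
// Probe the backend and report whether it answered and whether its
// Access-Control-Allow-Origin header covers the given frontend origin.
// fetchImpl is injectable so the function can be exercised without a server.
async function checkBackend(baseUrl, frontendOrigin, fetchImpl = globalThis.fetch) {
  try {
    const res = await fetchImpl(`${baseUrl}/api/articles/all`, {
      // Many servers only emit CORS headers when an Origin header is present.
      headers: { Origin: frontendOrigin },
    });
    const allowed = res.headers.get("access-control-allow-origin");
    return {
      reachable: true,
      status: res.status,
      corsOk: allowed === "*" || allowed === frontendOrigin,
    };
  } catch (err) {
    // Connection refused, DNS failure, or no fetch implementation available.
    return { reachable: false, status: null, corsOk: false, error: String(err) };
  }
}
```

A call such as `checkBackend("http://localhost:8080", "http://localhost:3000").then(console.log)` prints the result. `reachable: false` points at the port or the backend process, while `reachable: true` with `corsOk: false` points at the CORS configuration.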
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
License
This project is for educational purposes. Please respect website terms of service when scraping content.
Contact
For questions or support, please contact: AGI105@student.aru.ac.uk