Portuguese Legal Document PDF Metadata Extractor

Created By

geek2geeks, a year ago

MCP server for extracting metadata from Portuguese legal documents using advanced PDF processing and database architecture

# mcp-portuguese-legal-extractor

# pdf-metadata

Overview

What is the Portuguese Legal Document PDF Metadata Extractor?

The Portuguese Legal Document PDF Metadata Extractor is a Python tool that extracts structured metadata from Portuguese legal document PDFs, specifically those that follow the European Case Law Identifier (ECLI) standard.

How to use the Portuguese Legal Document PDF Metadata Extractor?

To use the extractor, clone the project repository, install the required dependencies, and place your PDF files in the designated directory. Then use the PortugueseLegalPDFExtractor class to extract metadata from a single PDF or to batch-process multiple documents.
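The interface of the PortugueseLegalPDFExtractor class is not documented on this page, so the following is only a minimal sketch of the underlying idea, not the project's implementation. It assumes the ECLI appears in the text of the first page and uses pdfplumber (the one dependency named in the FAQ) to read that text. The names `parse_ecli`, `read_first_page`, `batch_extract`, and `EcliParts` are hypothetical, and the regular expression is a simplified reading of the ECLI layout (country, court, year, ordinal number).

```python
import re
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Optional

# Simplified ECLI layout: ECLI:<country>:<court>:<year>:<ordinal>
# (assumption: uppercase letters, digits, and dots only in the ordinal part)
ECLI_PATTERN = re.compile(
    r"ECLI:(?P<country>[A-Z]{2}):(?P<court>[A-Z][A-Z0-9]{0,6}):"
    r"(?P<year>\d{4}):(?P<ordinal>[A-Z0-9.]{1,25})"
)


@dataclass
class EcliParts:
    """Hypothetical container for the four ECLI components."""
    country: str
    court: str
    year: int
    ordinal: str


def parse_ecli(text: str) -> Optional[EcliParts]:
    """Return the first ECLI found in text, or None if there is none."""
    match = ECLI_PATTERN.search(text)
    if match is None:
        return None
    return EcliParts(
        country=match["country"],
        court=match["court"],
        year=int(match["year"]),
        # Drop a sentence-ending period that the pattern may have swallowed.
        ordinal=match["ordinal"].rstrip("."),
    )


def read_first_page(pdf_path: Path) -> str:
    """Return the text of the first page, or an empty string if unavailable."""
    import pdfplumber  # third-party; imported lazily so parse_ecli works without it

    with pdfplumber.open(str(pdf_path)) as pdf:
        if not pdf.pages:
            return ""
        return pdf.pages[0].extract_text() or ""


def batch_extract(directory: Path) -> Dict[str, Optional[EcliParts]]:
    """Map each PDF file name in directory to its parsed ECLI (or None)."""
    return {
        path.name: parse_ecli(read_first_page(path))
        for path in sorted(directory.glob("*.pdf"))
    }
```

Calling `batch_extract(Path("pdfs"))` on a hypothetical `pdfs` directory would return one entry per PDF file. Documents without a recognizable ECLI map to `None`, which makes them easy to collect for manual review.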

Key features of the Portuguese Legal Document PDF Metadata Extractor

High accuracy: a reported 100% confidence score and a 96.84% exact match rate.
Production-ready with two extractor variants for different use cases.
Robust error handling and comprehensive validation.
Flexible confidence scoring options.
User-friendly interface with clear progress reporting.

Use cases of the Portuguese Legal Document PDF Metadata Extractor

Extracting metadata from legal documents for research purposes.
Automating the processing of large volumes of legal PDFs.
Validating the accuracy of extracted data against ground truth.
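This page does not say how the project computes its exact match rate, so the sketch below shows one common definition as an assumption: the fraction of ground-truth fields whose extracted value is identical to the expected value. The function name `exact_match_rate` and the field names in the usage note are hypothetical.

```python
from typing import Mapping


def exact_match_rate(
    extracted: Mapping[str, Mapping[str, str]],
    ground_truth: Mapping[str, Mapping[str, str]],
) -> float:
    """Fraction of ground-truth fields reproduced exactly by the extractor.

    Both arguments map a document id to a {field name: value} mapping.
    A field missing from the extracted data counts as a mismatch.
    """
    total = 0
    matches = 0
    for doc_id, expected_fields in ground_truth.items():
        actual_fields = extracted.get(doc_id, {})
        for field, expected_value in expected_fields.items():
            total += 1
            if actual_fields.get(field) == expected_value:
                matches += 1
    # Avoid division by zero when there is nothing to compare.
    return matches / total if total else 0.0
```

For example, with two documents that each have a `court` and a `year` field, one wrong year gives 3 matching fields out of 4, a rate of 0.75.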

FAQ for the Portuguese Legal Document PDF Metadata Extractor

What types of documents can be processed?

The extractor is designed for Portuguese legal documents that follow the ECLI standard.

Is there a command line interface available?

Yes, the production extractor includes a full CLI.

What are the prerequisites for installation?

You need Python 3.8+ and the pdfplumber package installed.

Project Info

Created At

a year ago

Updated At

a year ago

Author Name

geek2geeks

Language

Python
