Portuguese Legal Document PDF Metadata Extractor

Created By

geek2geeks, a year ago

MCP server for extracting metadata from Portuguese legal documents using advanced PDF processing and database architecture

# mcp-portuguese-legal-extractor

# pdf-metadata

Overview

What is the Portuguese Legal Document PDF Metadata Extractor?

The Portuguese Legal Document PDF Metadata Extractor is a Python tool that extracts structured metadata from Portuguese legal document PDFs, specifically those that carry a European Case Law Identifier (ECLI).

How to use the Portuguese Legal Document PDF Metadata Extractor?

To use the extractor, clone the project repository, install the required dependencies, and place your PDF files in the designated directory. You can then utilize the PortugueseLegalPDFExtractor class to extract metadata from individual PDFs or batch process multiple documents.
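The batch step described above can be sketched as follows. The method names of the PortugueseLegalPDFExtractor class are not given here, so this sketch does not guess them: `batch_extract` and its `extract_one` parameter are hypothetical names, and the per-file extractor is passed in as a plain callable that you would wire to the real class yourself.

```python
from pathlib import Path
from typing import Callable, Dict, List, Union


def batch_extract(
    pdf_dir: Union[str, Path],
    extract_one: Callable[[Path], Dict[str, str]],
) -> List[Dict[str, str]]:
    """Run extract_one over every *.pdf file in pdf_dir.

    extract_one is a stand-in for whatever single-file method the real
    extractor exposes (an assumption; its actual name is not documented here).
    """
    results: List[Dict[str, str]] = []
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        try:
            record = dict(extract_one(pdf_path))
            record["status"] = "ok"
        except Exception as exc:
            # Record the failure and continue, so one bad file
            # does not stop the whole batch.
            record = {"status": "error", "error": str(exc)}
        record["file"] = pdf_path.name
        results.append(record)
    return results
```

Collecting errors per file rather than raising keeps a long batch run going and leaves a record of which documents need a second look.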

Key features of the Portuguese Legal Document PDF Metadata Extractor

High accuracy: a reported 100% confidence score and a 96.84% exact-match rate.
Production-ready with two extractor variants for different use cases.
Robust error handling and comprehensive validation.
Flexible confidence scoring options.
User-friendly interface with clear progress reporting.

Use cases of the Portuguese Legal Document PDF Metadata Extractor

Extracting metadata from legal documents for research purposes.
Automating the processing of large volumes of legal PDFs.
Validating the accuracy of extracted data against ground truth.
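Validation against ground truth can be illustrated with a small exact-match calculation. This is only one common definition (the share of document and field pairs whose extracted value equals the reference value exactly); the page does not say how the project computes its own rate. The function name and the field names in the usage note are hypothetical.

```python
from typing import Dict, List


def exact_match_rate(
    extracted: List[Dict[str, str]],
    ground_truth: List[Dict[str, str]],
    fields: List[str],
) -> float:
    """Fraction of (document, field) pairs where the values match exactly."""
    if len(extracted) != len(ground_truth):
        raise ValueError("extracted and ground_truth must have the same length")
    total = len(extracted) * len(fields)
    if total == 0:
        return 0.0
    matches = 0
    for got, want in zip(extracted, ground_truth):
        for field in fields:
            # A field missing from the extracted record counts as a mismatch.
            if field in got and got[field] == want.get(field):
                matches += 1
    return matches / total
```

For example, with two documents and the hypothetical fields `court` and `date`, three matching values out of four give a rate of 0.75.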

FAQ about the Portuguese Legal Document PDF Metadata Extractor

What types of documents can be processed?

The extractor is designed for Portuguese legal documents that carry a European Case Law Identifier (ECLI).

Is there a command line interface available?

Yes, the production extractor includes a full CLI.

What are the prerequisites for installation?

You need Python 3.8+ and the pdfplumber package installed.
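As a rough illustration of how pdfplumber and the ECLI layout fit together, the sketch below reads the first page of a PDF and searches its text for an identifier of the form `ECLI:country:court:year:ordinal`. It is not the project's own code: `find_ecli`, `first_page_text`, and the regular expression are assumptions written for this example, and the pattern is a simplification that does not enforce every rule of the ECLI standard.

```python
import re
from typing import Dict, Optional

# Simplified ECLI layout: ECLI:<country>:<court>:<year>:<ordinal>.
# The ordinal is matched as dot-separated alphanumeric groups so that a
# sentence-ending period after the identifier is not swallowed.
ECLI_PATTERN = re.compile(
    r"ECLI:(?P<country>[A-Z]{2}):(?P<court>[A-Z][A-Z0-9]{0,6})"
    r":(?P<year>\d{4}):(?P<ordinal>[A-Z0-9]+(?:\.[A-Z0-9]+)*)"
)


def find_ecli(text: str) -> Optional[Dict[str, str]]:
    """Return the first ECLI found in text, split into its parts, or None."""
    match = ECLI_PATTERN.search(text)
    if match is None:
        return None
    parts = match.groupdict()
    parts["ecli"] = match.group(0)
    return parts


def first_page_text(pdf_path: str) -> str:
    """Extract the text of the first page using pdfplumber."""
    import pdfplumber  # third-party; imported here so find_ecli works without it

    with pdfplumber.open(pdf_path) as pdf:
        if not pdf.pages:
            return ""
        # extract_text() may yield nothing for image-only pages.
        return pdf.pages[0].extract_text() or ""
```

A typical call would be `find_ecli(first_page_text("decision.pdf"))`, where `decision.pdf` is a placeholder file name.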

Project Info

Created At

a year ago

Updated At

a year ago

Author Name

geek2geeks

Language

Python
