Offshorly

AI Developer

Remote | Philippines | Full-time
₱70,000 - ₱100,000 monthly
About the Job
Role Overview:
We are looking for an AI Developer who can build intelligent document processing pipelines — primarily focused on extracting structured and unstructured data from PDFs, with principles that extend to image-based inputs. You will design and ship OCR-powered solutions that turn raw documents (contracts, forms, reports) into clean, queryable data, and integrate LLM reasoning layers on top using OpenAI and Anthropic APIs.
This is a hands-on engineering role. You will own the full pipeline: ingestion, OCR, parsing, prompt engineering, and API delivery via FastAPI.

What You'll Do:
Document Intelligence & OCR:
  • Design and build end-to-end PDF and document extraction pipelines (flat text and structured output).
  • Select and implement the right OCR strategy per document type — native PDF text layer, layout-aware parsing, or image-based OCR.
  • Parse complex layouts: multi-column text, tables, headers/footers, embedded figures, form fields.
  • Output clean structured JSON or relational data from raw document inputs.

Backend API Development:
  • Build and maintain FastAPI services that expose document processing capabilities.
  • Design async endpoints for large document batches; handle timeouts, retries, and partial failures gracefully.
  • Write clean, testable Python code and follow REST best practices.
  • Integrate with storage layers (S3 / GCS), queues (Google Pub/Sub), and downstream systems as needed.
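To give a sense of what "handle timeouts, retries, and partial failures gracefully" can look like in practice, here is a minimal sketch using only the Python standard library. The names `process_with_retry`, `process_batch`, and `worker` are hypothetical, and the retry count, timeout, and backoff values are assumptions chosen for illustration, not a description of our production code.

```python
import asyncio


async def process_with_retry(doc_id, worker, attempts=3, timeout=5.0, base_delay=0.1):
    """Run worker(doc_id) with a per-attempt timeout and exponential backoff."""
    last_error = None
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(worker(doc_id), timeout=timeout)
        except Exception as exc:  # also catches the timeout raised by wait_for
            last_error = exc
            if attempt < attempts - 1:
                await asyncio.sleep(base_delay * 2 ** attempt)
    raise last_error


async def process_batch(doc_ids, worker, **retry_options):
    """Process documents concurrently; one failure never aborts the batch."""
    results = await asyncio.gather(
        *(process_with_retry(d, worker, **retry_options) for d in doc_ids),
        return_exceptions=True,  # failures come back as values, not raised
    )
    succeeded, failed = {}, {}
    for doc_id, result in zip(doc_ids, results):
        if isinstance(result, Exception):
            failed[doc_id] = repr(result)
        else:
            succeeded[doc_id] = result
    return {"succeeded": succeeded, "failed": failed}
```

An async FastAPI endpoint could await `process_batch` and return both maps, so the caller sees which documents need to be resubmitted rather than receiving a single error for the whole batch.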

Must-Have Requirements:
Core (these are non-negotiable):
  • Hands-on experience building OCR or document extraction pipelines in production.
  • Strong Python skills: clean, maintainable code with proper error handling.
  • Practical experience with FastAPI (routing, dependency injection, async, middleware).
  • Prompt engineering experience with OpenAI or Anthropic APIs: not just calling the API, but designing reliable extraction chains.
  • Familiarity with PDF internals: text layers, bounding boxes, embedded fonts, page structure.

OCR & Document Processing Skills:
We work primarily with PDFs, but the underlying principles apply equally to scanned images. You should know when and how to apply each approach:

Approach | When to Use / What to Know
Native PDF text extraction | pdfplumber, PyMuPDF, pdfminer — fast, accurate when the text layer exists; must detect and fall back when it doesn't
Layout-aware parsing | Preserve reading order across columns, tables, and mixed content blocks
Image-based OCR | Tesseract, EasyOCR, or cloud OCR (AWS Textract, Google Document AI, Azure Form Recognizer) for scanned inputs
Table extraction | Structured output from tabular data — row/column alignment, merged cells, nested tables
Output formats | Flat text, structured JSON, markdown — output type driven by downstream use case
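The "detect and fall back" point in the table can be sketched roughly as follows. This assumes the per-page text has already been pulled from the PDF with one of the native extraction libraries listed above; the function name `pages_needing_ocr` and the 20-character threshold are hypothetical and would need tuning against real documents.

```python
MIN_CHARS = 20  # assumed threshold; tune per document type


def pages_needing_ocr(page_texts, min_chars=MIN_CHARS):
    """Return indices of pages whose text layer is missing or too thin.

    page_texts holds one entry per page: the extracted text, or None / ""
    when the page has no usable text layer (for example, a scanned image).
    """
    return [
        index
        for index, text in enumerate(page_texts)
        if len((text or "").strip()) < min_chars
    ]


# Example: page 0 has a real text layer, the others should go to image-based OCR.
pages = ["This agreement is entered into by the parties below.", "", None, "  12 "]
ocr_queue = pages_needing_ocr(pages)
```

Routing per page rather than per document keeps the fast native path for mixed files, where only a few scanned pages are embedded in an otherwise digital PDF.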


Nice to Have:
  • Experience with vision-language models (GPT-4V, Claude 3 vision) for image-heavy documents.
  • Comfortable with AI-driven development (fully developer-in-the-loop).
  • Cloud OCR: AWS Textract, Google Document AI, or Azure Form Recognizer.
  • LangChain, LlamaIndex, or similar orchestration frameworks.
  • Vector search / RAG pipelines for document Q&A.
  • Docker, basic CI/CD, and cloud deployment (AWS / GCP / Azure).
  • Experience with agentic workflows (tool use, multi-step LLM chains).

LLM Integration & Prompt Engineering:
  • Write, test, and iterate prompts for OpenAI (GPT-4o, GPT-4 Turbo) and Anthropic (Claude) models.
  • Apply prompt engineering techniques: chain-of-thought, few-shot, structured output forcing, tool use/function calling.
  • Evaluate and benchmark prompt strategies; document what works and why.

Agents & Orchestration:
  • Build extraction agents that combine OCR output with LLM reasoning for ambiguous or complex documents.
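As one illustration of structured output forcing, a model reply is usually checked before it is trusted. The sketch below validates a JSON reply against a small schema using only the standard library; the field names in `REQUIRED_FIELDS` and the function `parse_extraction` are hypothetical examples, not a schema we actually use.

```python
import json

# Hypothetical schema: field name -> accepted Python type(s).
REQUIRED_FIELDS = {"invoice_number": str, "total": (int, float)}


def parse_extraction(reply, required=REQUIRED_FIELDS):
    """Parse a model reply and return (data, errors); data is None on failure."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["expected a JSON object"]

    errors = []
    for name, expected in required.items():
        if name not in data:
            errors.append(f"missing field: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif not isinstance(data[name], expected) or isinstance(data[name], bool):
            errors.append(f"wrong type for field: {name}")
    return (data if not errors else None), errors
```

The returned error list can be fed back into a retry prompt or logged for prompt benchmarking, which makes failures measurable instead of silent.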

You'll Thrive Here If:
  • You care about output quality — you're not happy until the extraction is clean and reliable.
  • You test your prompts like you test your code — systematically, with real data.
  • You know when to use an LLM and when a regex is the better tool.
  • You can communicate tradeoffs clearly to non-technical stakeholders.
  • You're comfortable in a fast-moving, remote-first environment.