Data Scientist (NLP/LLM)

Europe (remote), Remote (Ukraine) | Data Science, Engineering, Python

Required skills

Python / expert
GenAI / expert
SQL / expert
NLP / strong
LLM / strong

Join Sigma Software’s Data Competency Center as a Data Scientist focused on Generative AI & Agent Systems. You’ll be a key player in implementing, building, and helping to ship production-ready GenAI systems – spanning text, vision, and structured data – that tackle real-world business challenges while adhering to enterprise standards for quality, security, and compliance.

You’ll work closely with senior Data Scientists and client teams from solution design through to deployment. Your primary focus will be on the hands-on implementation of the GenAI stack, including building agentic workflows and advanced Retrieval-Augmented Generation (RAG) systems. This role blends hands-on model work, system implementation, and technical exposure to the full solution lifecycle.

Requirements

  • Professional Experience: 1.5 to 3 years of hands-on professional experience in a Data Science, Machine Learning, or AI/Software Engineering role, with a significant focus on Generative AI or related NLP/Search technologies in the last 12 to 18 months
  • Programming Proficiency: Python
  • GenAI Stack Experience: Hands-on experience with at least one major orchestration framework (e.g., LangChain, LlamaIndex)
  • Search & Retrieval Expertise: Practical experience implementing vector databases, creating embeddings, and designing indexing and chunking strategies for RAG systems
  • Data Fluency: Solid understanding of SQL and data modeling; practical experience handling and transforming unstructured data
  • LLMOps Fundamentals: Familiarity with the principles and tooling for experiment tracking, version control for prompts/pipelines, and tracing/observability in an LLM context
  • Demonstrated Ability: A strong portfolio or track record showing the successful implementation (in production or advanced project settings) of Generative AI components
  • Upper-Intermediate level of English

WOULD BE A PLUS:

  • Experience or deep interest in agentic design patterns and multi-step reasoning
  • Familiarity with foundational MLOps tools for CI/CD and production deployment
  • Familiarity with coding assistants like Cursor, Copilot, Claude Code, Windsurf, etc.
  • Exposure to multimodal applications (text/image/video/audio)
  • Academic background or practical experience in core Machine Learning and Deep Learning concepts

Responsibilities

  • Translate business requirements into functional AI systems (e.g., intelligent assistants, copilots, simple autonomous agents) with defined quality and performance metrics
  • Build and implement cutting-edge RAG systems (“RAG 2.0”), focusing on:
    • Implementing hybrid retrieval (vector, keyword) and structured data retrieval methods
    • Designing and testing effective chunking strategies and embedding models
    • Implementing memory and conversational history management for agents
  • Develop effective prompt engineering techniques and data pipelines to efficiently utilize long-context models for document-heavy use cases
  • Contribute to the evaluation and selection of models (proprietary APIs vs. open-weight models) based on performance, cost, and deployment requirements
  • Implement production elements of the GenAI stack: data handling, prompt orchestration (e.g., using frameworks such as LangChain or LlamaIndex), tracing, and caching for performance

WHY US

  • Diversity of Domains & Businesses
  • Variety of technology
  • Health & Legal support
  • Active professional community
  • Continuous education and growth
  • Flexible schedule
  • Remote work
  • Outstanding offices (if you choose to work from one)
  • Sports and community activities

REF3683T
