Job Summary: Develop algorithms for data cleaning, standardization, and processing tasks in an NLP project. The employee will work in a multidisciplinary team with Data Scientists, Deep Learning Engineers, and Oil and Gas industry Engineers to develop an LLM-based AI agent.

Responsibilities: 

  • Develop Data Cleaning, Preparation, Standardization, and Text Processing scripts 
  • Develop Tabular Data Processing algorithms (including OCR)
  • Perform Effective Database Management  
  • Perform Data Exploration and Analysis 
  • Collaborate with NLP Engineers, Oil & Gas Engineers, and Web Developers
  • Handle ad hoc data processing requests for NLP algorithms

Requirements: 

  1. Education: 
  • Currently pursuing a degree in Computer Science, Data Science, Engineering, or a related field. 
  • Exposure to courses related to machine learning, data analysis, or natural language processing is preferred.

  2. Technical Skills:
  • Knowledge of Python for data manipulation and text processing.
  • Familiarity with NLP libraries such as NLTK and spaCy.
  • Knowledge of data formats such as CSV, JSON, and XML.
  • Understanding of regular expressions (regex) for text search and manipulation.
  • Familiarity with OCR technologies.

  3. Eligibility:
  • Time availability: at least 4 hours per day