Job Summary: Development of algorithms for data cleaning, standardization, and processing tasks in an NLP project. The employee will work in a multidisciplinary team with Data Scientists, Deep Learning Engineers, and Oil and Gas industry Engineers to develop an LLM-based AI Agent.
Responsibilities:
- Develop Data Cleaning, Preparation, Standardization, and Text Processing scripts
- Develop Tabular Data Processing algorithms (including OCR)
- Perform Effective Database Management
- Perform Data Exploration and Analysis
- Collaborate with NLP Engineers, Oil & Gas Engineers, and Web Developers
- Handle ad hoc requests related to data processing for NLP algorithms
Requirements:
- Education:
  - Currently pursuing a degree in Computer Science, Data Science, Engineering, or a related field.
  - Coursework in machine learning, data analysis, or natural language processing is preferred.
- Technical Skills:
  - Knowledge of Python for data manipulation and text processing.
  - Familiarity with NLP libraries such as NLTK and spaCy.
  - Knowledge of data formats such as CSV, JSON, and XML.
  - Understanding of regular expressions (regex) for text search and manipulation.
  - Familiarity with OCR technologies.
- Eligibility:
  - Time availability: at least 4 hours per day.