JobNob

Your Career. Our Passion.

Platform Engineer - Robotics


Billow People Services Private Limited


Location

Bangalore | India


Job description

Job Summary:

As a Platform Engineer at Ati Motors, you will be responsible for designing, implementing, and maintaining the platform infrastructure that supports our robotic systems. Your role will involve leveraging your expertise in Linux systems and Nvidia GPUs to ensure the secure, efficient, and reliable operation of our robotic platform. You will collaborate closely with cross-functional teams, including software developers, hardware engineers, and system architects, to build a high-performance, scalable, GPU-accelerated platform that powers our cutting-edge robotics applications.

Responsibilities:

1. System Design and Configuration:
- Design and configure Linux-based systems and environments to support the robotics platform, considering factors such as performance, scalability, and security.
- Install, configure, and maintain Linux distributions, kernel modules, drivers, and system-level components.
- Implement and enforce system-level configuration standards and best practices.

2. Nvidia GPU Integration and Optimization:
- Integrate and optimize Nvidia GPU technologies, including CUDA, cuDNN, and TensorRT, within the platform infrastructure to accelerate robotic applications.
- Configure and manage GPU drivers, libraries, and frameworks to ensure optimal performance and compatibility.
- Collaborate with software developers to leverage GPU capabilities in developing efficient, parallelized algorithms for robotics tasks.

3. System Monitoring and Performance Optimization:
- Develop and implement monitoring tools and methodologies to track the health, performance, and utilization of the platform infrastructure.
- Analyze system logs, performance metrics, and GPU utilization to identify and resolve bottlenecks and performance issues.
- Optimize GPU resource allocation and workload distribution to maximize performance and efficiency.

4. System Security and Maintenance:
- Perform routine system administration tasks, including user management, permissions management, and system backups.
- Incorporate security best practices, including intrusion detection systems and access controls.
- Apply patches, updates, and security fixes to keep the platform infrastructure secure and up to date.

5. Scripting and Automation:
- Develop and maintain scripts and automation tools to streamline system administration tasks, deployment processes, and configuration management.
- Automate system monitoring, log analysis, and performance reporting to enable proactive identification and resolution of issues.

6. Collaboration and Troubleshooting:
- Collaborate with software developers, hardware engineers, and system architects to troubleshoot and resolve platform-related issues.
- Provide technical guidance and support to cross-functional teams on Linux and Nvidia GPU requirements and optimizations.
- Participate in root cause analysis and problem-solving activities to improve the stability and reliability of the platform infrastructure.

7. Documentation and Knowledge Sharing:
- Create and maintain documentation, including system architecture diagrams, configuration guides, troubleshooting procedures, and GPU utilization guidelines.
- Share knowledge and provide training to team members on Linux administration, Nvidia GPU integration, and best practices.

Requirements:

1. Education and Experience:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Proven experience (3+ years) as a Platform Engineer or in a similar role in a robotics or high-performance computing environment.
- Solid understanding of Nvidia GPU technologies, including CUDA programming, cuDNN, and TensorRT.

2. Technical Skills:
- Strong expertise in Linux system administration, including Linux distributions (e.g., Ubuntu, Debian, CentOS), shell scripting (e.g., Bash), and system-level configuration.
- Proficiency in integrating and optimizing Nvidia GPU drivers, libraries, and frameworks.
- Familiarity with containerization technologies (e.g., Docker) and container orchestration (e.g., Kubernetes) is a plus.

3. System Monitoring and Troubleshooting:
- Experience with monitoring tools (e.g., Nagios, Zabbix, Prometheus) and log analysis tools (e.g., ELK Stack) is a plus.
- Ability to diagnose and resolve system-level issues using system logs, performance metrics, and debugging tools.

4. Collaboration and Communication:
- Strong collaboration skills to work effectively with cross-functional teams, including software developers, hardware engineers, and system architects.
- Excellent verbal and written communication skills to convey complex technical concepts to team members and stakeholders.

5. GPU Performance Optimization:
- Experience in optimizing GPU resource allocation, workload distribution, and parallel computing techniques.
- Knowledge of profiling and debugging tools to identify performance bottlenecks and optimize GPU-accelerated applications.

(ref:hirist.tech)

