Engineer

Role: Production Engineering
- Monitor and analyze application and system logs to identify issues and anomalies.
- Troubleshoot and resolve incidents related to system performance, application errors, and infrastructure issues.
- Work closely with cross-functional teams to diagnose and resolve complex technical problems.
- Implement proactive measures to prevent incidents and improve system reliability.
- Respond to major incidents promptly, keep stakeholders updated, and coordinate resolution efforts.
- Document incident details, troubleshooting steps, and resolutions for future reference.
- Continuously monitor system health and performance metrics to identify potential issues and areas for optimization.
- Participate in the on-call rotation to provide 24/7 support for incident management.
- Proficiency in Unix/Linux operating systems and command-line utilities.
- Experience troubleshooting application and system logs using tools like grep, awk, and sed.
- Familiarity with monitoring and alerting tools such as Datadog, SignalFx, and Splunk.
- Strong understanding of networking concepts, including TCP/IP, DNS, and routing.
- Familiarity with scripting languages such as Bash, Python, or Perl for task automation.
- Knowledge of configuration management tools like Puppet, Chef, or Ansible.
- Strong problem-solving and analytical skills, with the ability to quickly diagnose and resolve complex technical issues.
Location: Omaha
