Company – Appnomic Systems
Location – Mumbai
Status – Full Time, Employee
Job Category – Senior Analyst for Incident Analysis and Alert management
Relevant Work Experience – 4 – 8 Years
Career Level – Experienced (Non Manager)
Education Level – Bachelor’s degree or higher in Computer Science, Engineering
About Appnomic:
Appnomic is a leading provider of innovative IT operations management solutions, specializing in Application Performance Monitoring (APM) and Artificial Intelligence for IT Operations (AIOps). Founded with a mission to transform how businesses manage and optimize their IT infrastructure, Appnomic leverages advanced analytics, machine learning, and automation to deliver unparalleled insights and proactive solutions. Our comprehensive platform empowers organizations to predict, detect, and resolve IT issues before they impact end-users, ensuring seamless operations and enhanced user experiences.
Read more at appnomic.com and follow us on Twitter and on LinkedIn.
Responsibilities:
- Identify areas of opportunity for stabilization of application environments.
- Minimize application outages and achieve cost savings through better management of HW/SW resources, with the help of other infrastructure teams.
- Enable app owners to identify new ways to improve resilience, performance levels and supportability.
- Perform weekly/monthly alert analysis to review the benefits delivered by HEAL, including:
o Process optimization
o Application Hygiene activity
o Application Capacity Upgrade
o Incident integration with existing ITSM tool
o Alert management: checking that all alerts are being captured, and fine-tuning to reduce false and incorrect alerts.
o Ongoing adjustment of alert thresholds in all applications.
o Integrating HEAL cluster data with other existing tools in the customer environment, such as ITSM and other APM tools
Requirements:
- Software development experience in Java: minimum 3 years of hands-on programming, development and debugging experience in Java.
- Experience with monitoring tools like Prometheus, dashboarding tools like Grafana and log monitoring tools like ELK stack.
- Proficiency in using SIEM tools like Splunk, IBM QRadar, ArcSight, or LogRhythm for monitoring and analyzing security events.
- Understanding how to identify, investigate, and respond to security incidents such as malware attacks, phishing, unauthorized access, and other anomalies.
- Proficiency in analyzing logs from various sources (e.g., firewalls, servers, SIEM) to detect suspicious activity.
- Experience in developing and following incident response plans, including containment, eradication, and recovery strategies.
- Proficiency in scripting languages like Python, PowerShell, or Bash for automating alert triage, response tasks, and other repetitive activities.
- Proficiency in using ticketing systems (e.g., ServiceNow, JIRA) for tracking incidents, assigning tasks, and documenting responses.