AI Agent for DevOps

From Frustration to Confidence: Trust-Centered UX for AI in DevOps

Duration: Q4 2024 – Present

Team: PMs, ML Engineers, ML Researchers

Tools: Figma, Hotjar, OpenAI API, Anthropic API, Harness

Responsibility: Led end-to-end UX—from defining AI trust principles to designing, prototyping, and validating conversational experiences for YAML generation and editing.

Goal: Help users generate and modify complex pipeline YAMLs using natural language—while building trust and ensuring clarity, transparency, and control in every interaction.

Problem

DevOps engineers are not always YAML experts, and users often struggle to configure pipeline YAMLs, especially when they are new to CI/CD or unfamiliar with YAML syntax.

Traditional tools fall short in supporting:
  • Edge cases and custom configurations
  • Error recovery and misunderstood inputs
  • Building confidence in AI-generated outputs
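
For context, even a modest pipeline carries a lot of nested structure that has to be exactly right. The snippet below is a simplified, hypothetical example (field names are illustrative, not the exact Harness schema) of the kind of YAML users are expected to hand-edit:

  # Hypothetical, simplified pipeline YAML (illustrative only)
  pipeline:
    name: Build and Deploy
    identifier: build_and_deploy
    stages:
      - stage:
          name: Build
          type: CI
          spec:
            steps:
              - step:
                  name: Run Unit Tests
                  type: Run
                  spec:
                    command: ./gradlew test
      - stage:
          name: Deploy to Staging
          type: Deployment
          spec:
            environment: staging
            failureStrategy: rollback   # one wrong key or indent level breaks the run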

Design Challenges & Success Criteria

  • Design for Understandability and Control: Make AI decisions transparent and ensure users feel informed and empowered, not overruled.
  • Handle Ambiguity and Errors Gracefully: Manage incomplete, technical, or misunderstood inputs while minimizing user frustration.
  • Build Scalable Trust for Critical Workflows: Create a trust framework that supports reliability and user confidence in high-stakes DevOps environments.

Research & Insights

Methods:

  • Developer interviews and usability testing
  • Observing real YAML editing workflows in Harness
  • Comparative analysis with YAML linters, Copilot-style tools, and schema validators

Key Insights:

  • Users don’t trust what they don’t understand
  • Confidence increases with clear explanations and status updates
  • Giving users the final say in edits builds long-term adoption

Design Principles & Key Decisions

To build a trustworthy AI experience for YAML pipeline generation, I grounded my design approach in three core principles: Explainability, User Control, and Error Transparency.

These principles guided every interaction decision and ensured that users, regardless of their technical proficiency, could confidently work with the AI assistant.

1. Principle: Make AI Thinking Visible

Design Decision: Introduced “Stream of Thoughts” and YAML Summaries

To mitigate the black-box nature of AI, I replaced passive loading states with a dynamic “stream of thoughts” that shows what the agent is doing in real time—e.g., “checking syntax” or “resolving dependencies.” This reassures users and offers insight into the agent’s logic. Once generation completes, the assistant also provides a plain-language summary of the YAML output. This step helps users validate the configuration’s intent before applying it, especially if they’re less fluent in YAML.
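
As a rough illustration (hypothetical wording, not the shipped copy), the in-progress state and post-generation summary might read something like this:

  Stream of thoughts (while generating):
    Understanding your request…
    Checking YAML syntax…
    Resolving stage dependencies…
    Validating against the pipeline schema…

  Summary (after generation):
    This pipeline builds your service on every push to main, runs unit tests,
    and deploys the build to the staging environment. Review the Deploy stage
    timeout before applying.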

2. Principle: Preserve Human Oversight

Design Decision: Editable Output, Inline Feedback, and Versioning Tools

Users need to feel in control of automation. I designed a flexible canvas interface that allows inline edits of generated YAML, with plans for a diff view and version rollback to support iterative work. Quick feedback mechanisms (like thumbs up/down) also make the assistant feel collaborative and responsive, not prescriptive.
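
As a rough sketch (hypothetical values, not the shipped UI), the planned diff view would surface exactly what changed between the AI-generated YAML and the user’s inline edit before anything is applied:

  # Illustrative diff of an inline edit, shown before the user applies it
  --- ai-generated.yaml
  +++ user-edited.yaml
  @@ Deploy to Staging @@
  -        timeout: 10m
  -        failureStrategy: abort
  +        timeout: 30m
  +        failureStrategy: rollback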

Prototyping & Testing

  • Created mid-to-high fidelity prototypes of the chat interface and inline YAML previews
  • Tested agent behavior and fallback states with internal users simulating real-world pipeline setup tasks
  • Iterated on interaction patterns based on feedback around trust, error recovery, and language tone

Results & Impact

💬 “This is my new life saver” — direct customer feedback highlighting strong user satisfaction and trust.

📈 Achieved a 10% increase in conversion rate over the previous AI support chat, based on post-launch usage (Week 2–3 data).

🚀 Improved user adoption beyond novelty use, with sustained engagement after initial trial week.