You will join a 4-person founding team to build high-speed AI agents that automate legacy enterprise workflows. Moving beyond slow screenshot-to-VLM loops, you will develop agents that pre-train on interface navigation to achieve 5x faster execution. This role blends cutting-edge research in agentic orchestration with rapid production deployment.
This job is no longer actively hiring.
AI/ML Research Engineer at General Catalyst
Join a 4-person founding team at a YC W26 startup in San Francisco building the world's fastest computer-use agents. While others rely on slow screenshot loops, this team is pre-training agents to understand interfaces upfront, resulting in 5x faster automation for enterprise legacy systems. If you're a PhD or Master's grad with deep experience in VLMs and agentic orchestration who wants to ship production code rather than just papers, this $150k-$350k role offers a massive equity stake and the chance to define the future of AI-driven work.
Location: San Francisco, United States
Compensation: $150k-$350k + Equity
Company: General Catalyst
Company overview
General Catalyst is a global venture capital and investment firm that partners with entrepreneurs from seed stage to growth, specializing in transformational investments across sectors including technology, healthcare, fintech, and applied artificial intelligence. Founded in 2000, the firm manages more than $40 billion in assets as of June 2025 and has a portfolio of more than 800 companies, including Airbnb, Stripe, HubSpot, and Snap. With offices in San Francisco, New York City, Boston, Berlin, Bangalore, and London, General Catalyst collaborates with founders to drive innovation, global resilience, and technology transformation.
What you will do
- Research and implement novel agentic architectures for GUI automation using multi-agent coordination, memory, and context management.
- Build and evaluate reasoning pipelines, including chain-of-thought and reflexion loops, that stay reliable under distribution shifts in enterprise environments.
- Develop interface pre-training methods and VLM-based screen understanding to enable deterministic execution and self-healing for automated enterprise agents.
Who this is a fit for
- Early-career researcher (0-4 years of experience) with a Master's or PhD in CS/AI from a top-tier program or a track record at a premier research lab.
- Strong engineering skills in Python, PyTorch, and agentic frameworks like LangGraph or AutoGen, with the ability to move from paper to prototype rapidly.
- Deep curiosity for computer-use agents and GUI understanding, evidenced by top-tier publications (NeurIPS, ICLR, CVPR) or significant production-grade AI projects.
Why this role is remarkable
- Join a Y Combinator W26 company at the ground floor, working directly with founders on the core technology that defines the product's intelligence.
- Solve a massive enterprise bottleneck by building deterministic, self-healing agents that operate complex legacy software without APIs or structured data interfaces.
- High-impact environment where your research in reasoning models and vision-language architectures is shipped to production for real enterprise customers immediately.
How Jack & Jill work together
Jack gets to know what you're great at and what you want next, then searches 15 million jobs daily and helps you discover roles at companies like this.
What happens next?
Jack’s an AI agent for job searching and career coaching. He works for you.
Jill is the AI recruiter working for the company. She recruits from Jack’s network.
If your profile’s a match and General Catalyst wants to meet, Jill will make the intro. In the meantime, Jack will send you excellent alternatives.